pandas.to_datetime#
- pandas.to_datetime(arg, errors='raise', dayfirst=False, yearfirst=False, utc=False, format=None, exact=<no_default>, unit=None, infer_datetime_format=<no_default>, origin='unix', cache=True)[source]#
- Convert argument to datetime. - This function converts a scalar, array-like, - Seriesor- DataFrame/dict-like to a pandas datetime object.- Parameters:
- argint, float, str, datetime, list, tuple, 1-d array, Series, DataFrame/dict-like
- The object to convert to a datetime. If a - DataFrameis provided, the method expects minimally the following columns:- "year",- "month",- "day". The column “year” must be specified in 4-digit format.
- errors{‘ignore’, ‘raise’, ‘coerce’}, default ‘raise’
- If - 'raise', then invalid parsing will raise an exception.
- If - 'coerce', then invalid parsing will be set as- NaT.
- If - 'ignore', then invalid parsing will return the input.
 
- dayfirstbool, default False
- Specify a date parse order if arg is str or is list-like. If - True, parses dates with the day first, e.g.- "10/11/12"is parsed as- 2012-11-10.- Warning - dayfirst=Trueis not strict, but will prefer to parse with day first.
- yearfirstbool, default False
- Specify a date parse order if arg is str or is list-like. - If - Trueparses dates with the year first, e.g.- "10/11/12"is parsed as- 2010-11-12.
- If both dayfirst and yearfirst are - True, yearfirst is preceded (same as- dateutil).
 - Warning - yearfirst=Trueis not strict, but will prefer to parse with year first.
- utcbool, default False
- Control timezone-related parsing, localization and conversion. - If - True, the function always returns a timezone-aware UTC-localized- Timestamp,- Seriesor- DatetimeIndex. To do this, timezone-naive inputs are localized as UTC, while timezone-aware inputs are converted to UTC.
- If - False(default), inputs will not be coerced to UTC. Timezone-naive inputs will remain naive, while timezone-aware ones will keep their time offsets. Limitations exist for mixed offsets (typically, daylight savings), see Examples section for details.
 - Warning - In a future version of pandas, parsing datetimes with mixed time zones will raise an error unless utc=True. Please specify utc=True to opt in to the new behaviour and silence this warning. To create a Series with mixed offsets and object dtype, please use apply and datetime.datetime.strptime. - See also: pandas general documentation about timezone conversion and localization. 
- formatstr, default None
- The strftime to parse time, e.g. - "%d/%m/%Y". See strftime documentation for more information on choices, though note that- "%f"will parse all the way up to nanoseconds. You can also pass:- “ISO8601”, to parse any ISO8601 time string (not necessarily in exactly the same format); 
- “mixed”, to infer the format for each element individually. This is risky, and you should probably use it along with dayfirst. 
 - Note - If a - DataFrameis passed, then format has no effect.
- exactbool, default True
- Control how format is used: - If - True, require an exact format match.
- If - False, allow the format to match anywhere in the target string.
 - Cannot be used alongside - format='ISO8601'or- format='mixed'.
- unitstr, default ‘ns’
- The unit of the arg (D,s,ms,us,ns) denote the unit, which is an integer or float number. This will be based off the origin. Example, with - unit='ms'and- origin='unix', this would calculate the number of milliseconds to the unix epoch start.
- infer_datetime_formatbool, default False
- If - Trueand no format is given, attempt to infer the format of the datetime strings based on the first non-NaN element, and if it can be inferred, switch to a faster method of parsing them. In some cases this can increase the parsing speed by ~5-10x.- Deprecated since version 2.0.0: A strict version of this argument is now the default, passing it has no effect. 
- originscalar, default ‘unix’
- Define the reference date. The numeric values would be parsed as number of units (defined by unit) since this reference date. - If - 'unix'(or POSIX) time; origin is set to 1970-01-01.
- If - 'julian', unit must be- 'D', and origin is set to beginning of Julian Calendar. Julian day number- 0is assigned to the day starting at noon on January 1, 4713 BC.
- If Timestamp convertible (Timestamp, dt.datetime, np.datetimt64 or date string), origin is set to Timestamp identified by origin. 
- If a float or integer, origin is the difference (in units determined by the - unitargument) relative to 1970-01-01.
 
- cachebool, default True
- If - True, use a cache of unique, converted dates to apply the datetime conversion. May produce significant speed-up when parsing duplicate date strings, especially ones with timezone offsets. The cache is only used when there are at least 50 values. The presence of out-of-bounds values will render the cache unusable and may slow down parsing.
 
- Returns:
- datetime
- If parsing succeeded. Return type depends on input (types in parenthesis correspond to fallback in case of unsuccessful timezone or out-of-range timestamp parsing): - scalar: - Timestamp(or- datetime.datetime)
- array-like: - DatetimeIndex(or- Serieswith- objectdtype containing- datetime.datetime)
- Series: - Seriesof- datetime64dtype (or- Seriesof- objectdtype containing- datetime.datetime)
- DataFrame: - Seriesof- datetime64dtype (or- Seriesof- objectdtype containing- datetime.datetime)
 
 
- Raises:
- ParserError
- When parsing a date from string fails. 
- ValueError
- When another datetime conversion error happens. For example when one of ‘year’, ‘month’, day’ columns is missing in a - DataFrame, or when a Timezone-aware- datetime.datetimeis found in an array-like of mixed time offsets, and- utc=False.
 
 - See also - DataFrame.astype
- Cast argument to a specified dtype. 
- to_timedelta
- Convert argument to timedelta. 
- convert_dtypes
- Convert dtypes. 
 - Notes - Many input types are supported, and lead to different output types: - scalars can be int, float, str, datetime object (from stdlib - datetimemodule or- numpy). They are converted to- Timestampwhen possible, otherwise they are converted to- datetime.datetime. None/NaN/null scalars are converted to- NaT.
- array-like can contain int, float, str, datetime objects. They are converted to - DatetimeIndexwhen possible, otherwise they are converted to- Indexwith- objectdtype, containing- datetime.datetime. None/NaN/null entries are converted to- NaTin both cases.
- Series are converted to - Serieswith- datetime64dtype when possible, otherwise they are converted to- Serieswith- objectdtype, containing- datetime.datetime. None/NaN/null entries are converted to- NaTin both cases.
- DataFrame/dict-like are converted to - Serieswith- datetime64dtype. For each row a datetime is created from assembling the various dataframe columns. Column keys can be common abbreviations like [‘year’, ‘month’, ‘day’, ‘minute’, ‘second’, ‘ms’, ‘us’, ‘ns’]) or plurals of the same.
 - The following causes are responsible for - datetime.datetimeobjects being returned (possibly inside an- Indexor a- Serieswith- objectdtype) instead of a proper pandas designated type (- Timestamp,- DatetimeIndexor- Serieswith- datetime64dtype):- when any input element is before - Timestamp.minor after- Timestamp.max, see timestamp limitations.
- when - utc=False(default) and the input is an array-like or- Seriescontaining mixed naive/aware datetime, or aware with mixed time offsets. Note that this happens in the (quite frequent) situation when the timezone has a daylight savings policy. In that case you may wish to use- utc=True.
 - Examples - Handling various input formats - Assembling a datetime from multiple columns of a - DataFrame. The keys can be common abbreviations like [‘year’, ‘month’, ‘day’, ‘minute’, ‘second’, ‘ms’, ‘us’, ‘ns’]) or plurals of the same- >>> df = pd.DataFrame({'year': [2015, 2016], ... 'month': [2, 3], ... 'day': [4, 5]}) >>> pd.to_datetime(df) 0 2015-02-04 1 2016-03-05 dtype: datetime64[ns] - Using a unix epoch time - >>> pd.to_datetime(1490195805, unit='s') Timestamp('2017-03-22 15:16:45') >>> pd.to_datetime(1490195805433502912, unit='ns') Timestamp('2017-03-22 15:16:45.433502912') - Warning - For float arg, precision rounding might happen. To prevent unexpected behavior use a fixed-width exact type. - Using a non-unix epoch origin - >>> pd.to_datetime([1, 2, 3], unit='D', ... origin=pd.Timestamp('1960-01-01')) DatetimeIndex(['1960-01-02', '1960-01-03', '1960-01-04'], dtype='datetime64[ns]', freq=None) - Differences with strptime behavior - "%f"will parse all the way up to nanoseconds.- >>> pd.to_datetime('2018-10-26 12:00:00.0000000011', ... format='%Y-%m-%d %H:%M:%S.%f') Timestamp('2018-10-26 12:00:00.000000001') - Non-convertible date/times - Passing - errors='coerce'will force an out-of-bounds date to- NaT, in addition to forcing non-dates (or non-parseable dates) to- NaT.- >>> pd.to_datetime('13000101', format='%Y%m%d', errors='coerce') NaT - Timezones and time offsets - The default behaviour ( - utc=False) is as follows:- Timezone-naive inputs are converted to timezone-naive - DatetimeIndex:
 - >>> pd.to_datetime(['2018-10-26 12:00:00', '2018-10-26 13:00:15']) DatetimeIndex(['2018-10-26 12:00:00', '2018-10-26 13:00:15'], dtype='datetime64[ns]', freq=None) - Timezone-aware inputs with constant time offset are converted to timezone-aware - DatetimeIndex:
 - >>> pd.to_datetime(['2018-10-26 12:00 -0500', '2018-10-26 13:00 -0500']) DatetimeIndex(['2018-10-26 12:00:00-05:00', '2018-10-26 13:00:00-05:00'], dtype='datetime64[ns, UTC-05:00]', freq=None) - However, timezone-aware inputs with mixed time offsets (for example issued from a timezone with daylight savings, such as Europe/Paris) are not successfully converted to a - DatetimeIndex. Parsing datetimes with mixed time zones will show a warning unless utc=True. If you specify utc=False the warning below will be shown and a simple- Indexcontaining- datetime.datetimeobjects will be returned:
 - >>> pd.to_datetime(['2020-10-25 02:00 +0200', ... '2020-10-25 04:00 +0100']) FutureWarning: In a future version of pandas, parsing datetimes with mixed time zones will raise an error unless `utc=True`. Please specify `utc=True` to opt in to the new behaviour and silence this warning. To create a `Series` with mixed offsets and `object` dtype, please use `apply` and `datetime.datetime.strptime`. Index([2020-10-25 02:00:00+02:00, 2020-10-25 04:00:00+01:00], dtype='object') - A mix of timezone-aware and timezone-naive inputs is also converted to a simple - Indexcontaining- datetime.datetimeobjects:
 - >>> from datetime import datetime >>> pd.to_datetime(["2020-01-01 01:00:00-01:00", ... datetime(2020, 1, 1, 3, 0)]) FutureWarning: In a future version of pandas, parsing datetimes with mixed time zones will raise an error unless `utc=True`. Please specify `utc=True` to opt in to the new behaviour and silence this warning. To create a `Series` with mixed offsets and `object` dtype, please use `apply` and `datetime.datetime.strptime`. Index([2020-01-01 01:00:00-01:00, 2020-01-01 03:00:00], dtype='object') - Setting - utc=Truesolves most of the above issues:- Timezone-naive inputs are localized as UTC 
 - >>> pd.to_datetime(['2018-10-26 12:00', '2018-10-26 13:00'], utc=True) DatetimeIndex(['2018-10-26 12:00:00+00:00', '2018-10-26 13:00:00+00:00'], dtype='datetime64[ns, UTC]', freq=None) - Timezone-aware inputs are converted to UTC (the output represents the exact same datetime, but viewed from the UTC time offset +00:00). 
 - >>> pd.to_datetime(['2018-10-26 12:00 -0530', '2018-10-26 12:00 -0500'], ... utc=True) DatetimeIndex(['2018-10-26 17:30:00+00:00', '2018-10-26 17:00:00+00:00'], dtype='datetime64[ns, UTC]', freq=None) - Inputs can contain both string or datetime, the above rules still apply 
 - >>> pd.to_datetime(['2018-10-26 12:00', datetime(2020, 1, 1, 18)], utc=True) DatetimeIndex(['2018-10-26 12:00:00+00:00', '2020-01-01 18:00:00+00:00'], dtype='datetime64[ns, UTC]', freq=None)