pandas.api.types.infer_dtype

pandas.api.types.infer_dtype(value, skipna=True)

Return a string label of the type of the elements in a list-like input.

This function inspects the elements of the input and returns a label classifying their common type. It is particularly useful for heterogeneous inputs, such as object-dtype data, where explicit dtype conversion is not possible or not necessary.

Parameters:

value (list, ndarray, or pandas type): The input data whose dtype is to be inferred.
skipna (bool, default True): Ignore NaN values when inferring the type.

Returns:

str: A label describing the common type of the input data.
Results can include:

string
bytes
floating
integer
integer-na
mixed-integer
mixed-integer-float
decimal
complex
categorical
boolean
datetime64
datetime
date
timedelta64
timedelta
time
period
interval
mixed
empty
unknown-array

Raises:

TypeError: If the input is ndarray-like but its dtype cannot be inferred.

See also

api.types.is_scalar: Check if the input is a scalar.
api.types.is_list_like: Check if the input is list-like.
api.types.is_integer: Check if the input is an integer.
api.types.is_float: Check if the input is a float.
api.types.is_bool: Check if the input is a boolean.

Notes

‘mixed’ is the catchall for anything that is not otherwise specialized
‘mixed-integer-float’ are floats and integers
‘mixed-integer’ are integers mixed with non-integers
‘integer-na’ are integers mixed with NaN, returned only when skipna=False
‘empty’ is returned for inputs with no inferable values (e.g. an empty input, or all-NA with skipna=True)
‘unknown-array’ is the catchall for something that is an array (has a dtype attribute), but has a dtype unknown to pandas (e.g. external extension array)
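
The distinctions drawn in these notes can be checked directly. The following is a minimal sketch, assuming a recent pandas release; the variable names are illustrative only and the expected labels in the comments follow the notes above.

```python
import numpy as np
from pandas.api.types import infer_dtype

# Integers with a missing value: the label depends on skipna.
ints_with_nan = [1, np.nan, 2]
label_skip = infer_dtype(ints_with_nan, skipna=True)    # 'integer'
label_keep = infer_dtype(ints_with_nan, skipna=False)   # 'integer-na'

# Integers mixed with floats versus integers mixed with other objects.
label_int_float = infer_dtype([1, 2.5])   # 'mixed-integer-float'
label_int_str = infer_dtype([1, 'a'])     # 'mixed-integer'

# No integers and no common type: falls through to the catchall.
label_mixed = infer_dtype(['a', 1.5])     # 'mixed'

# Nothing to inspect at all.
label_empty = infer_dtype([])             # 'empty'

print(label_skip, label_keep, label_int_float, label_int_str, label_mixed, label_empty)
```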

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from pandas.api.types import infer_dtype
>>> infer_dtype(['foo', 'bar'])
'string'

>>> infer_dtype(['a', np.nan, 'b'], skipna=True)
'string'

>>> infer_dtype(['a', np.nan, 'b'], skipna=False)
'mixed'

>>> infer_dtype([b'foo', b'bar'])
'bytes'

>>> infer_dtype([1, 2, 3])
'integer'

>>> infer_dtype([1, 2, 3.5])
'mixed-integer-float'

>>> infer_dtype([1.0, 2.0, 3.5])
'floating'

>>> infer_dtype(['a', 1])
'mixed-integer'

>>> from decimal import Decimal
>>> infer_dtype([Decimal(1), Decimal(2.0)])
'decimal'

>>> infer_dtype([True, False])
'boolean'

>>> infer_dtype([True, False, np.nan])
'boolean'

>>> infer_dtype([pd.Timestamp('20130101')])
'datetime'

>>> import datetime
>>> infer_dtype([datetime.date(2013, 1, 1)])
'date'

>>> infer_dtype([np.datetime64('2013-01-01')])
'datetime64'

>>> infer_dtype([datetime.timedelta(0, 1, 1)])
'timedelta'

>>> infer_dtype(pd.Series(list('aabc')).astype('category'))
'categorical'
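
One way the returned labels might be used on heterogeneous data is to flag inputs that need cleaning before conversion. This is only a sketch: `label_columns` and the `raw` data are hypothetical names introduced for illustration and are not part of pandas.

```python
import numpy as np
from pandas.api.types import infer_dtype


def label_columns(columns):
    """Map each column name to the label infer_dtype reports for its values.

    Hypothetical helper for illustration; not a pandas API.
    """
    return {name: infer_dtype(values, skipna=True) for name, values in columns.items()}


# Hypothetical raw data, one list per column.
raw = {
    "id": [1, 2, 3],
    "price": [9.99, np.nan, 12.5],
    "tag": ["a", np.nan, "b"],
    "flag": [True, False, True],
    "misc": ["x", 1, 2.5],
}

labels = label_columns(raw)

# Any label starting with 'mixed' signals values without a single common type.
needs_cleaning = sorted(name for name, label in labels.items() if label.startswith("mixed"))
print(labels)
print(needs_cleaning)
```

Filtering on the `mixed` prefix catches 'mixed', 'mixed-integer', and 'mixed-integer-float' alike; a stricter check could compare against specific labels instead.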