pandas.DataFrame.convert_dtypes¶
- DataFrame.convert_dtypes(infer_objects=True, convert_string=True, convert_integer=True, convert_boolean=True, convert_floating=True)[source]¶
Convert columns to best possible dtypes using dtypes supporting
pd.NA.New in version 1.0.0.
- Parameters
- infer_objectsbool, default True
Whether object dtypes should be converted to the best possible types.
- convert_stringbool, default True
Whether object dtypes should be converted to
StringDtype().- convert_integerbool, default True
Whether, if possible, conversion can be done to integer extension types.
- convert_booleanbool, defaults True
Whether object dtypes should be converted to
BooleanDtypes().- convert_floatingbool, defaults True
Whether, if possible, conversion can be done to floating extension types. If convert_integer is also True, preference will be give to integer dtypes if the floats can be faithfully casted to integers.
New in version 1.2.0.
- Returns
- Series or DataFrame
Copy of input object with new dtype.
See also
infer_objectsInfer dtypes of objects.
to_datetimeConvert argument to datetime.
to_timedeltaConvert argument to timedelta.
to_numericConvert argument to a numeric type.
Notes
By default,
convert_dtypeswill attempt to convert a Series (or each Series in a DataFrame) to dtypes that supportpd.NA. By using the optionsconvert_string,convert_integer,convert_booleanandconvert_boolean, it is possible to turn off individual conversions toStringDtype, the integer extension types,BooleanDtypeor floating extension types, respectively.For object-dtyped columns, if
infer_objectsisTrue, use the inference rules as during normal Series/DataFrame construction. Then, if possible, convert toStringDtype,BooleanDtypeor an appropriate integer or floating extension type, otherwise leave asobject.If the dtype is integer, convert to an appropriate integer extension type.
If the dtype is numeric, and consists of all integers, convert to an appropriate integer extension type. Otherwise, convert to an appropriate floating extension type.
Changed in version 1.2: Starting with pandas 1.2, this method also converts float columns to the nullable floating extension type.
In the future, as new dtypes are added that support
pd.NA, the results of this method will change to support those new dtypes.Examples
>>> df = pd.DataFrame( ... { ... "a": pd.Series([1, 2, 3], dtype=np.dtype("int32")), ... "b": pd.Series(["x", "y", "z"], dtype=np.dtype("O")), ... "c": pd.Series([True, False, np.nan], dtype=np.dtype("O")), ... "d": pd.Series(["h", "i", np.nan], dtype=np.dtype("O")), ... "e": pd.Series([10, np.nan, 20], dtype=np.dtype("float")), ... "f": pd.Series([np.nan, 100.5, 200], dtype=np.dtype("float")), ... } ... )
Start with a DataFrame with default dtypes.
>>> df a b c d e f 0 1 x True h 10.0 NaN 1 2 y False i NaN 100.5 2 3 z NaN NaN 20.0 200.0
>>> df.dtypes a int32 b object c object d object e float64 f float64 dtype: object
Convert the DataFrame to use best possible dtypes.
>>> dfn = df.convert_dtypes() >>> dfn a b c d e f 0 1 x True h 10 <NA> 1 2 y False i <NA> 100.5 2 3 z <NA> <NA> 20 200.0
>>> dfn.dtypes a Int32 b string c boolean d string e Int64 f Float64 dtype: object
Start with a Series of strings and missing data represented by
np.nan.>>> s = pd.Series(["a", "b", np.nan]) >>> s 0 a 1 b 2 NaN dtype: object
Obtain a Series with dtype
StringDtype.>>> s.convert_dtypes() 0 a 1 b 2 <NA> dtype: string