pandas.DataFrame.select_dtypes¶
-
DataFrame.
select_dtypes
(include=None, exclude=None)[source]¶ Return a subset of the DataFrame’s columns based on the column dtypes.
Parameters: include, exclude : scalar or list-like
A selection of dtypes or strings to be included/excluded. At least one of these parameters must be supplied.
Returns: subset : DataFrame
The subset of the frame including the dtypes in
include
and excluding the dtypes inexclude
.Raises: ValueError
- If both of
include
andexclude
are empty - If
include
andexclude
have overlapping elements - If any kind of string dtype is passed in.
Notes
- To select all numeric types, use
np.number
or'number'
- To select strings you must use the
object
dtype, but note that this will return all object dtype columns - See the numpy dtype hierarchy
- To select datetimes, use
np.datetime64
,'datetime'
or'datetime64'
- To select timedeltas, use
np.timedelta64
,'timedelta'
or'timedelta64'
- To select Pandas categorical dtypes, use
'category'
- To select Pandas datetimetz dtypes, use
'datetimetz'
(new in 0.20.0) or'datetime64[ns, tz]'
Examples
>>> df = pd.DataFrame({'a': [1, 2] * 3, ... 'b': [True, False] * 3, ... 'c': [1.0, 2.0] * 3}) >>> df a b c 0 1 True 1.0 1 2 False 2.0 2 1 True 1.0 3 2 False 2.0 4 1 True 1.0 5 2 False 2.0
>>> df.select_dtypes(include='bool') b 0 True 1 False 2 True 3 False 4 True 5 False
>>> df.select_dtypes(include=['float64']) c 0 1.0 1 2.0 2 1.0 3 2.0 4 1.0 5 2.0
>>> df.select_dtypes(exclude=['int']) b c 0 True 1.0 1 False 2.0 2 True 1.0 3 False 2.0 4 True 1.0 5 False 2.0
- If both of