pandas.DataFrame.select_dtypes
DataFrame.select_dtypes(include=None, exclude=None)
Return a subset of a DataFrame including/excluding columns based on their dtype.
Parameters: include, exclude : list-like
A list of dtypes or strings to be included/excluded. You must pass in a non-empty sequence for at least one of these.
Returns: subset : DataFrame
The subset of the frame including the dtypes in include and excluding the dtypes in exclude.
Raises: ValueError
- If both of include and exclude are empty
- If include and exclude have overlapping elements
- If any kind of string dtype is passed in.
TypeError
- If either of include or exclude is not a sequence
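The first two ValueError cases can be reproduced with a short sketch. The frame and its column names are illustrative assumptions, not part of the original page, and only the exception type is recorded because the message text may differ between pandas versions.

```python
import pandas as pd

# Illustrative frame; the column names are assumptions made for this sketch.
df = pd.DataFrame({'a': [1.0, 2.0], 'b': [True, False]})

errors = []

# Neither include nor exclude is given, so both are empty.
try:
    df.select_dtypes()
except ValueError as exc:
    errors.append(('both empty', type(exc).__name__))

# The same dtype appears in both include and exclude.
try:
    df.select_dtypes(include=['float64'], exclude=['float64'])
except ValueError as exc:
    errors.append(('overlap', type(exc).__name__))

print(errors)
```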
Notes
- To select all numeric types use the numpy dtype numpy.number
- To select strings you must use the object dtype, but note that this will return all object dtype columns
- See the numpy dtype hierarchy
- To select datetimes, use np.datetime64, 'datetime' or 'datetime64'
- To select timedeltas, use np.timedelta64, 'timedelta' or 'timedelta64'
- To select Pandas categorical dtypes, use 'category'
- To select Pandas datetimetz dtypes, use 'datetimetz' (new in 0.20.0), or a 'datetime64[ns, tz]' string
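The selectors listed in these notes can be tried out on two small frames. This is a minimal sketch: the frames, column names, and variable names are all assumptions made for illustration, and the object column is created with an explicit dtype=object so that it does not depend on how a given pandas version infers string columns.

```python
import numpy as np
import pandas as pd

# Illustrative frame; every column name is an assumption for this sketch.
basic = pd.DataFrame({
    'ints': [1, 2, 3],
    'floats': [0.5, 1.5, 2.5],
    'flags': [True, False, True],
    'objs': pd.Series(['x', 'y', 'z'], dtype=object),
})

# numpy.number matches the integer and floating columns, not bool or object.
numeric_cols = list(basic.select_dtypes(include=[np.number]).columns)

# 'object' returns every object-dtype column, whatever values it holds.
object_cols = list(basic.select_dtypes(include=['object']).columns)

stamps = pd.to_datetime(['2017-01-01', '2017-01-02', '2017-01-03'])
temporal = pd.DataFrame({
    'when': stamps,                                    # timezone-naive datetimes
    'when_tz': stamps.tz_localize('UTC'),              # timezone-aware datetimes
    'elapsed': pd.to_timedelta([1, 2, 3], unit='D'),   # timedeltas
    'grade': pd.Categorical(['lo', 'hi', 'lo']),       # categorical
})

datetime_cols = list(temporal.select_dtypes(include=['datetime']).columns)
timedelta_cols = list(temporal.select_dtypes(include=['timedelta']).columns)
category_cols = list(temporal.select_dtypes(include=['category']).columns)
datetimetz_cols = list(temporal.select_dtypes(include=['datetimetz']).columns)

print(numeric_cols, object_cols)
print(datetime_cols, timedelta_cols, category_cols, datetimetz_cols)
```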
Examples
>>> df = pd.DataFrame({'a': np.random.randn(6).astype('f4'),
...                    'b': [True, False] * 3,
...                    'c': [1.0, 2.0] * 3})
>>> df
        a      b  c
0  0.3962   True  1
1  0.1459  False  2
2  0.2623   True  1
3  0.0764  False  2
4 -0.9703   True  1
5 -1.2094  False  2
>>> df.select_dtypes(include=['float64'])
   c
0  1
1  2
2  1
3  2
4  1
5  2
>>> df.select_dtypes(exclude=['floating'])
       b
0   True
1  False
2   True
3  False
4   True
5  False
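include and exclude can also be passed together, as long as they do not overlap. The sketch below uses a hypothetical frame with fixed values (not the random data above) and keeps every numeric column except the 32-bit float one.

```python
import numpy as np
import pandas as pd

# Hypothetical frame for this sketch; fixed values keep it reproducible.
df = pd.DataFrame({'a': np.array([0.5, 1.5, 2.5], dtype='f4'),  # float32
                   'b': [True, False, True],                    # bool
                   'c': [1.0, 2.0, 3.0],                        # float64
                   'd': [1, 2, 3]})                             # integer

# Keep numeric columns, then drop the float32 column from that selection.
subset = df.select_dtypes(include=[np.number], exclude=['float32'])
print(list(subset.columns))
```

Column 'b' is dropped because bool is not a numeric dtype, and 'a' is dropped because it matches the excluded dtype.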