pandas.core.groupby.GroupBy.apply¶
-
GroupBy.
apply
(func, *args, **kwargs)[source]¶ Apply function
func
group-wise and combine the results together.The function passed to
apply
must take a dataframe as its first argument and return a DataFrame, Series or scalar.apply
will then take care of combining the results back together into a single dataframe or series.apply
is therefore a highly flexible grouping method.While
apply
is a very flexible method, its downside is that using it can be quite a bit slower than using more specific methods likeagg
ortransform
. Pandas offers a wide range of method that will be much faster than usingapply
for their specific purposes, so try to use them before reaching forapply
.- Parameters
- funccallable
A callable that takes a dataframe as its first argument, and returns a dataframe, a series or a scalar. In addition the callable may take positional and keyword arguments.
- args, kwargstuple and dict
Optional positional and keyword arguments to pass to
func
.
- Returns
- appliedSeries or DataFrame
See also
pipe
Apply function to the full GroupBy object instead of to each group.
aggregate
Apply aggregate function to the GroupBy object.
transform
Apply function column-by-column to the GroupBy object.
Series.apply
Apply a function to a Series.
DataFrame.apply
Apply a function to each row or column of a DataFrame.
Notes
In the current implementation
apply
callsfunc
twice on the first group to decide whether it can take a fast or slow code path. This can lead to unexpected behavior iffunc
has side-effects, as they will take effect twice for the first group.Changed in version 1.3.0: The resulting dtype will reflect the return value of the passed
func
, see the examples below.Examples
>>> df = pd.DataFrame({'A': 'a a b'.split(), ... 'B': [1,2,3], ... 'C': [4,6,5]}) >>> g = df.groupby('A')
Notice that
g
has two groups,a
andb
. Calling apply in various ways, we can get different grouping results:Example 1: below the function passed to apply takes a DataFrame as its argument and returns a DataFrame. apply combines the result for each group together into a new DataFrame:
>>> g[['B', 'C']].apply(lambda x: x / x.sum()) B C 0 0.333333 0.4 1 0.666667 0.6 2 1.000000 1.0
Example 2: The function passed to apply takes a DataFrame as its argument and returns a Series. apply combines the result for each group together into a new DataFrame.
Changed in version 1.3.0: The resulting dtype will reflect the return value of the passed
func
.>>> g[['B', 'C']].apply(lambda x: x.astype(float).max() - x.min()) B C A a 1.0 2.0 b 0.0 0.0
Example 3: The function passed to apply takes a DataFrame as its argument and returns a scalar. apply combines the result for each group together into a Series, including setting the index as appropriate:
>>> g.apply(lambda x: x.C.max() - x.B.min()) A a 5 b 2 dtype: int64