pandas.core.groupby.GroupBy.apply

GroupBy.apply(func, *args, **kwargs)

Apply function func group-wise and combine the results together.

The function passed to apply must take a DataFrame as its first argument and return a DataFrame, Series or scalar. apply then combines the results into a single DataFrame or Series, which makes it a highly flexible grouping method.

While apply is very flexible, it can be considerably slower than more specific methods such as agg or transform. pandas offers a wide range of methods that are much faster than apply for their specific purposes, so try those before reaching for apply.
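As a rough illustration of this trade-off, the sketch below computes the same per-group range in two ways: once with apply and once with built-in reductions. The small frame and the names via_apply and via_builtin are assumptions made for this example only.

```python
import pandas as pd

df = pd.DataFrame({"A": ["a", "a", "b"], "B": [1, 2, 3], "C": [4, 6, 5]})
g = df.groupby("A")

# Flexible: the lambda is called once per group with that group's rows.
via_apply = g[["B", "C"]].apply(lambda x: x.max() - x.min())

# More specific: built-in reductions over all groups, then one subtraction.
via_builtin = g[["B", "C"]].max() - g[["B", "C"]].min()

print(via_apply)
print(via_builtin)
```

Both expressions yield one row per group with the same values. On large inputs the built-in form is usually the faster of the two, though the actual difference depends on the data and the pandas version.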

Parameters
func : callable

A callable that takes a DataFrame as its first argument and returns a DataFrame, a Series or a scalar. In addition, the callable may take positional and keyword arguments.

args, kwargs : tuple and dict

Optional positional and keyword arguments to pass to func.
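A minimal sketch of how extra arguments reach func. The helper scaled_range and its parameters factor and column are hypothetical names invented for this illustration; they are not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({"A": ["a", "a", "b"], "B": [1, 2, 3], "C": [4, 6, 5]})


def scaled_range(group, factor, column="B"):
    """Hypothetical helper: range of one column within the group, times factor."""
    return factor * (group[column].max() - group[column].min())


# 10 is forwarded positionally as `factor`, column="C" as a keyword argument.
result = df.groupby("A")[["B", "C"]].apply(scaled_range, 10, column="C")
print(result)
```

Each group is passed as the first argument, followed by the forwarded positional and keyword arguments. Because scaled_range returns a scalar, the combined result is a Series indexed by group key.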

Returns
applied : Series or DataFrame

See also

pipe

Apply function to the full GroupBy object instead of to each group.

aggregate

Apply aggregate function to the GroupBy object.

transform

Apply function column-by-column to the GroupBy object.

Series.apply

Apply a function to a Series.

DataFrame.apply

Apply a function to each row or column of a DataFrame.

Notes

In the current implementation, apply calls func twice on the first group to decide whether it can take a fast or slow code path. This can lead to unexpected behavior if func has side effects, as they will take effect twice for the first group.
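The sketch below makes such a side effect visible by recording every call to func. The list seen and the function total_with_log are hypothetical names used only for this illustration. Whether the first group is actually evaluated more than once depends on the installed pandas version, so no particular call count is assumed here.

```python
import pandas as pd

df = pd.DataFrame({"A": ["a", "a", "b"], "B": [1, 2, 3], "C": [4, 6, 5]})

seen = []  # side effect: one entry is appended per call to the function


def total_with_log(group):
    seen.append(len(group))  # records the number of rows in the group passed in
    return group["B"].sum()


result = df.groupby("A")[["B", "C"]].apply(total_with_log)

# There are two groups, but len(seen) may be larger if a group was evaluated twice.
print(len(seen), result.to_dict())
```

Keeping func free of side effects, or making them safe to repeat, avoids depending on how many times it is called.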

Changed in version 1.3.0: The resulting dtype will reflect the return value of the passed func, see the examples below.

Examples

>>> df = pd.DataFrame({'A': 'a a b'.split(),
...                    'B': [1, 2, 3],
...                    'C': [4, 6, 5]})
>>> g = df.groupby('A')

Notice that g has two groups, a and b. Calling apply in different ways produces different grouping results:

Example 1: The function passed to apply takes a DataFrame as its argument and returns a DataFrame. apply combines the result for each group together into a new DataFrame:

>>> g[['B', 'C']].apply(lambda x: x / x.sum())
          B    C
0  0.333333  0.4
1  0.666667  0.6
2  1.000000  1.0

Example 2: The function passed to apply takes a DataFrame as its argument and returns a Series. apply combines the result for each group together into a new DataFrame.

Changed in version 1.3.0: The resulting dtype will reflect the return value of the passed func.

>>> g[['B', 'C']].apply(lambda x: x.astype(float).max() - x.min())
     B    C
A
a  1.0  2.0
b  0.0  0.0

Example 3: The function passed to apply takes a DataFrame as its argument and returns a scalar. apply combines the result for each group together into a Series, including setting the index as appropriate:

>>> g.apply(lambda x: x.C.max() - x.B.min())
A
a    5
b    2
dtype: int64