pandas.core.groupby.GroupBy.apply
GroupBy.apply(func, *args, **kwargs)

Apply function func group-wise and combine the results together.

The function passed to apply must take a dataframe as its first argument and return a dataframe, a series or a scalar. apply will then take care of combining the results back together into a single dataframe or series. apply is therefore a highly flexible grouping method.

While apply is a very flexible method, its downside is that using it can be quite a bit slower than using more specific methods. Pandas offers a wide range of methods that will be much faster than using apply for their specific purposes, so try to use them before reaching for apply.

Parameters:
func : function
    A callable that takes a dataframe as its first argument, and returns a dataframe, a series or a scalar. In addition, the callable may take positional and keyword arguments.
args, kwargs : tuple and dict
    Optional positional and keyword arguments to pass to func.

Returns:
applied : Series or DataFrame
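
A minimal sketch of how the optional positional and keyword arguments are forwarded to func. The helper scaled_range and its parameters column and factor are hypothetical names chosen for illustration. Columns B and C are selected explicitly so that the grouping column is not passed to func, which is an assumption made to keep the sketch working across pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'A': 'a a b'.split(), 'B': [1, 2, 3], 'C': [4, 6, 5]})


def scaled_range(group, column, factor=1):
    """Hypothetical helper: range of one column, multiplied by factor."""
    return (group[column].max() - group[column].min()) * factor


# 'C' is forwarded as a positional argument, factor=10 as a keyword argument.
result = df.groupby('A')[['B', 'C']].apply(scaled_range, 'C', factor=10)
print(result)
```

Each group is passed as the first argument of scaled_range; everything after func in the apply call is handed on unchanged.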
See also
pipe : Apply function to the full GroupBy object instead of to each group.
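
The difference between the two can be sketched as follows: with apply the function receives one group's dataframe at a time, while with pipe it receives the GroupBy object once. The variable names per_group and via_pipe are illustrative, and the explicit column selection in the apply call is an assumption made so the sketch behaves the same across pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'A': 'a a b'.split(), 'B': [1, 2, 3], 'C': [4, 6, 5]})

# apply: the lambda is called per group, each time with that group's dataframe.
per_group = df.groupby('A')[['B', 'C']].apply(
    lambda x: x['C'].max() - x['B'].min()
)

# pipe: the lambda is called once, with the whole GroupBy object.
via_pipe = df.groupby('A').pipe(lambda gb: gb['C'].max() - gb['B'].min())

print(per_group)
print(via_pipe)
```

In this sketch both calls compute the same per-group values; the pipe version does so through built-in group-wise reductions.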
Notes
In the current implementation, apply calls func twice on the first group to decide whether it can take a fast or slow code path. This can lead to unexpected behavior if func has side effects, as they will take effect twice for the first group.

Examples
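
As a first illustration, the note about side effects can be observed by recording every call to func. This is a sketch only: the helper record_and_sum and the list calls are hypothetical, and the number of recorded calls depends on the pandas version in use, so it is printed instead of assumed.

```python
import pandas as pd

df = pd.DataFrame({'A': 'a a b'.split(), 'B': [1, 2, 3], 'C': [4, 6, 5]})

calls = []  # side effect: one entry is appended per call to func


def record_and_sum(group):
    """Hypothetical helper that records each call, then sums column B."""
    calls.append(len(group))
    return group['B'].sum()


result = df.groupby('A')[['B', 'C']].apply(record_and_sum)

# There are two groups; any extra entry comes from a repeated call.
print(len(calls), calls)
print(result)
```

Keeping func free of side effects, or making them safe to repeat, avoids depending on how many times it is called.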
>>> df = pd.DataFrame({'A': 'a a b'.split(), 'B': [1, 2, 3], 'C': [4, 6, 5]})
>>> g = df.groupby('A')

From df above we can see that g has two groups, a and b. Calling apply in various ways, we can get different grouping results:

Example 1: The function passed to apply takes a dataframe as its argument and returns a dataframe. apply combines the result for each group together into a new dataframe:

>>> g.apply(lambda x: x / x.sum())
          B    C
0  0.333333  0.4
1  0.666667  0.6
2  1.000000  1.0

Example 2: The function passed to apply takes a dataframe as its argument and returns a series. apply combines the result for each group together into a new dataframe:

>>> g.apply(lambda x: x.max() - x.min())
   B  C
A
a  1  2
b  0  0

Example 3: The function passed to apply takes a dataframe as its argument and returns a scalar. apply combines the result for each group together into a series, including setting the index as appropriate:

>>> g.apply(lambda x: x.C.max() - x.B.min())
A
a    5
b    2
dtype: int64
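
The advice to prefer more specific methods over apply can be illustrated by rewriting Examples 1 and 2 with built-in group-wise operations. This is a minimal sketch, assuming the same df as above; the variable names share and spread are illustrative, and columns B and C are selected explicitly so the grouping column is left out of the arithmetic.

```python
import pandas as pd

df = pd.DataFrame({'A': 'a a b'.split(), 'B': [1, 2, 3], 'C': [4, 6, 5]})
g = df.groupby('A')[['B', 'C']]

# Counterpart of Example 1: divide each value by its group's sum.
# transform('sum') returns the group sums aligned to the original rows.
share = df[['B', 'C']] / g.transform('sum')

# Counterpart of Example 2: per-group range from two built-in reductions.
spread = g.max() - g.min()

print(share)
print(spread)
```

Here transform, max and min replace the user-defined lambdas, so no Python function has to be called for each group.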