pandas.core.groupby.DataFrameGroupBy.agg¶
-
DataFrameGroupBy.
agg
(arg, *args, **kwargs)[source]¶ Aggregate using callable, string, dict, or list of string/callables
Parameters: func : callable, string, dictionary, or list of string/callables
Function to use for aggregating the data. If a function, must either work when passed a DataFrame or when passed to DataFrame.apply. For a DataFrame, can pass a dict, if the keys are DataFrame column names.
Accepted Combinations are:
- string function name
- function
- list of functions
- dict of column names -> functions (or list of functions)
Returns: aggregated : DataFrame
See also
pandas.DataFrame.groupby.apply
,pandas.DataFrame.groupby.transform
,pandas.DataFrame.aggregate
Notes
Numpy functions mean/median/prod/sum/std/var are special cased so the default behavior is applying the function along axis=0 (e.g., np.mean(arr_2d, axis=0)) as opposed to mimicking the default Numpy behavior (e.g., np.mean(arr_2d)).
agg is an alias for aggregate. Use it.
Examples
>>> df = pd.DataFrame({'A': [1, 1, 2, 2], ... 'B': [1, 2, 3, 4], ... 'C': np.random.randn(4)})
>>> df A B C 0 1 1 0.362838 1 1 2 0.227877 2 2 3 1.267767 3 2 4 -0.562860
The aggregation is for each column.
>>> df.groupby('A').agg('min') B C A 1 1 0.227877 2 3 -0.562860
Multiple aggregations
>>> df.groupby('A').agg(['min', 'max']) B C min max min max A 1 1 2 0.227877 0.362838 2 3 4 -0.562860 1.267767
Select a column for aggregation
>>> df.groupby('A').B.agg(['min', 'max']) min max A 1 1 2 2 3 4
Different aggregations per column
>>> df.groupby('A').agg({'B': ['min', 'max'], 'C': 'sum'}) B C min max sum A 1 1 2 0.590716 2 3 4 0.704907