GroupBy#

Note

For an overview, see Group by: split-apply-combine.

pandas.api.typing.DataFrameGroupBy and pandas.api.typing.SeriesGroupBy instances are returned by pandas.DataFrame.groupby() and pandas.Series.groupby(), respectively.

Indexing, iteration#

`DataFrameGroupBy.__iter__`()	Groupby iterator.
`SeriesGroupBy.__iter__`()	Groupby iterator.
`DataFrameGroupBy.groups`	Dict {group name -> group labels}.
`SeriesGroupBy.groups`	Dict {group name -> group labels}.
`DataFrameGroupBy.indices`	Dict {group name -> group indices}.
`SeriesGroupBy.indices`	Dict {group name -> group indices}.
`DataFrameGroupBy.get_group`(name)	Construct DataFrame from group with provided name.
`SeriesGroupBy.get_group`(name)	Construct Series from group with provided name.
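The entries above might be exercised as in the following sketch. The frame, its index labels, and the variable names are illustrative assumptions; grouping by a single column name is assumed so that group names are plain scalars.

```python
import pandas as pd

# Hypothetical sample data with string index labels.
df = pd.DataFrame(
    {"key": ["a", "b", "a", "b"], "value": [10, 20, 30, 40]},
    index=["w", "x", "y", "z"],
)
grouped = df.groupby("key")

# Iterating yields (group name, sub-DataFrame) pairs.
names = [name for name, group in grouped]

# .groups maps each group name to index labels;
# .indices maps each group name to integer positions.
labels_a = list(grouped.groups["a"])
positions_a = [int(p) for p in grouped.indices["a"]]

# get_group returns the rows that belong to one group.
group_b = grouped.get_group("b")
```

Here `labels_a` holds the labels `"w"` and `"y"`, while `positions_a` holds the positions 0 and 2 of the same rows.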

Grouper(*args, **kwargs)

A Grouper allows the user to specify a groupby instruction for an object.
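One common use of a Grouper is to group on a datetime column at a given frequency. The sketch below assumes a made-up frame with a `when` column and a 7-day bin width; the `key` and `freq` parameters belong to `pandas.Grouper`, everything else is illustrative.

```python
import pandas as pd

# Hypothetical sample data: three observations over two weeks.
df = pd.DataFrame(
    {
        "when": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-08"]),
        "value": [1, 2, 3],
    }
)

# Group on the "when" column in 7-day bins, then sum each bin.
weekly = df.groupby(pd.Grouper(key="when", freq="7D"))["value"].sum()
print(weekly)
```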

Function application helper#

NamedAgg(column, aggfunc, *args, **kwargs)

Helper for column-specific aggregation with control over output column names.
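A short sketch of how NamedAgg might be passed to `agg`: each keyword becomes an output column name, and each NamedAgg names the input column and the aggregation. The frame and the output names (`min_height`, `total_weight`) are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {"key": ["a", "a", "b"], "height": [1.0, 3.0, 5.0], "weight": [10, 20, 30]}
)

summary = df.groupby("key").agg(
    min_height=pd.NamedAgg(column="height", aggfunc="min"),
    total_weight=pd.NamedAgg(column="weight", aggfunc="sum"),
)
print(summary)
```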

Function application#

`SeriesGroupBy.apply`(func, *args, **kwargs)	Apply function `func` group-wise and combine the results together.
`DataFrameGroupBy.apply`(func, *args[, ...])	Apply function `func` group-wise and combine the results together.
`SeriesGroupBy.agg`([func, engine, engine_kwargs])	Aggregate using one or more operations.
`DataFrameGroupBy.agg`([func, engine, ...])	Aggregate using one or more operations.
`SeriesGroupBy.aggregate`([func, engine, ...])	Aggregate using one or more operations.
`DataFrameGroupBy.aggregate`([func, engine, ...])	Aggregate using one or more operations.
`SeriesGroupBy.transform`(func, *args[, ...])	Call function producing a same-indexed Series on each group.
`DataFrameGroupBy.transform`(func, *args[, ...])	Call function producing a same-indexed DataFrame on each group.
`SeriesGroupBy.pipe`(func, *args, **kwargs)	Apply a `func` with arguments to this GroupBy object and return its result.
`DataFrameGroupBy.pipe`(func, *args, **kwargs)	Apply a `func` with arguments to this GroupBy object and return its result.
`DataFrameGroupBy.filter`(func[, dropna])	Filter elements from groups that don't satisfy a criterion.
`SeriesGroupBy.filter`(func[, dropna])	Filter elements from groups that don't satisfy a criterion.
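The methods in this table differ mainly in the shape of what they return. The sketch below contrasts `agg`, `transform`, `filter`, and `pipe` on a SeriesGroupBy; the data and the lambda functions are illustrative assumptions, not part of the reference.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"key": ["a", "a", "b"], "value": [1, 3, 10]})
grouped = df.groupby("key")["value"]  # a SeriesGroupBy

# agg reduces each group, giving one row per group.
stats = grouped.agg(["min", "max"])

# transform returns a result indexed like the original Series.
centered = grouped.transform(lambda s: s - s.mean())

# filter keeps the elements of groups for which func returns True.
multi_row = grouped.filter(lambda s: len(s) > 1)

# pipe passes the GroupBy object itself to func.
spread = grouped.pipe(lambda g: g.max() - g.min())
```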

`DataFrameGroupBy` computations / descriptive stats#

`DataFrameGroupBy.all`([skipna])	Return True if all values in the group are truthy, else False.
`DataFrameGroupBy.any`([skipna])	Return True if any value in the group is truthy, else False.
`DataFrameGroupBy.bfill`([limit])	Backward fill the values.
`DataFrameGroupBy.corr`([method, min_periods, ...])	Compute pairwise correlation of columns, excluding NA/null values.
`DataFrameGroupBy.corrwith`(other[, drop, ...])	(DEPRECATED) Compute pairwise correlation.
`DataFrameGroupBy.count`()	Compute count of group, excluding missing values.
`DataFrameGroupBy.cov`([min_periods, ddof, ...])	Compute pairwise covariance of columns, excluding NA/null values.
`DataFrameGroupBy.cumcount`([ascending])	Number each item in each group from 0 to the length of that group - 1.
`DataFrameGroupBy.cummax`([numeric_only, skipna])	Cumulative max for each group.
`DataFrameGroupBy.cummin`([numeric_only, skipna])	Cumulative min for each group.
`DataFrameGroupBy.cumprod`([numeric_only, skipna])	Cumulative product for each group.
`DataFrameGroupBy.cumsum`([numeric_only, skipna])	Cumulative sum for each group.
`DataFrameGroupBy.describe`([percentiles, ...])	Generate descriptive statistics for each group.
`DataFrameGroupBy.diff`([periods])	First discrete difference of element.
`DataFrameGroupBy.ewm`([com, span, halflife, ...])	Return an ewm grouper, providing ewm functionality per group.
`DataFrameGroupBy.expanding`([min_periods, method])	Return an expanding grouper, providing expanding functionality per group.
`DataFrameGroupBy.ffill`([limit])	Forward fill the values.
`DataFrameGroupBy.first`([numeric_only, ...])	Compute the first entry of each column within each group.
`DataFrameGroupBy.head`([n])	Return first n rows of each group.
`DataFrameGroupBy.idxmax`([skipna, numeric_only])	Return index of first occurrence of maximum in each group.
`DataFrameGroupBy.idxmin`([skipna, numeric_only])	Return index of first occurrence of minimum in each group.
`DataFrameGroupBy.last`([numeric_only, ...])	Compute the last entry of each column within each group.
`DataFrameGroupBy.max`([numeric_only, ...])	Compute max of group values.
`DataFrameGroupBy.mean`([numeric_only, ...])	Compute mean of groups, excluding missing values.
`DataFrameGroupBy.median`([numeric_only, skipna])	Compute median of groups, excluding missing values.
`DataFrameGroupBy.min`([numeric_only, ...])	Compute min of group values.
`DataFrameGroupBy.ngroup`([ascending])	Number each group from 0 to the number of groups - 1.
`DataFrameGroupBy.nth`	Take the nth row from each group if n is an int, otherwise a subset of rows.
`DataFrameGroupBy.nunique`([dropna])	Return DataFrame with counts of unique elements in each position.
`DataFrameGroupBy.ohlc`()	Compute open, high, low and close values of a group, excluding missing values.
`DataFrameGroupBy.pct_change`([periods, ...])	Calculate pct_change of each value to previous entry in group.
`DataFrameGroupBy.prod`([numeric_only, ...])	Compute prod of group values.
`DataFrameGroupBy.quantile`([q, ...])	Return group values at the given quantile, a la numpy.percentile.
`DataFrameGroupBy.rank`([method, ascending, ...])	Provide the rank of values within each group.
`DataFrameGroupBy.resample`(rule[, closed, ...])	Provide resampling within each group of a groupby.
`DataFrameGroupBy.rolling`(window[, ...])	Return a rolling grouper, providing rolling functionality per group.
`DataFrameGroupBy.sample`([n, frac, replace, ...])	Return a random sample of items from each group.
`DataFrameGroupBy.sem`([ddof, numeric_only, ...])	Compute standard error of the mean of groups, excluding missing values.
`DataFrameGroupBy.shift`([periods, freq, ...])	Shift each group by periods observations.
`DataFrameGroupBy.size`()	Compute group sizes.
`DataFrameGroupBy.skew`([skipna, numeric_only])	Return unbiased skew within groups.
`DataFrameGroupBy.kurt`([skipna, numeric_only])	Return unbiased kurtosis within groups.
`DataFrameGroupBy.std`([ddof, engine, ...])	Compute standard deviation of groups, excluding missing values.
`DataFrameGroupBy.sum`([numeric_only, ...])	Compute sum of group values.
`DataFrameGroupBy.var`([ddof, engine, ...])	Compute variance of groups, excluding missing values.
`DataFrameGroupBy.tail`([n])	Return last n rows of each group.
`DataFrameGroupBy.take`(indices, **kwargs)	Return the elements in the given positional indices in each group.
`DataFrameGroupBy.value_counts`([subset, ...])	Return a Series or DataFrame containing counts of unique rows.
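To illustrate how a few of these methods differ in output shape, the following sketch compares a reduction (`sum`), a cumulative operation (`cumsum`), the numbering helpers (`cumcount`, `ngroup`), `size`, and `head`. The frame is a made-up example.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"key": ["a", "b", "a", "b"], "x": [1, 2, 3, 4]})
grouped = df.groupby("key")

totals = grouped.sum()        # one row per group
running = grouped.cumsum()    # one row per original row
order = grouped.cumcount()    # position of each row within its group
group_id = grouped.ngroup()   # number of the group each row belongs to
sizes = grouped.size()        # number of rows in each group
first_rows = grouped.head(1)  # first row of each group, original index kept
```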

`SeriesGroupBy` computations / descriptive stats#

`SeriesGroupBy.all`([skipna])	Return True if all values in the group are truthy, else False.
`SeriesGroupBy.any`([skipna])	Return True if any value in the group is truthy, else False.
`SeriesGroupBy.bfill`([limit])	Backward fill the values.
`SeriesGroupBy.corr`(other[, method, min_periods])	Compute correlation between each group and another Series.
`SeriesGroupBy.count`()	Compute count of group, excluding missing values.
`SeriesGroupBy.cov`(other[, min_periods, ddof])	Compute covariance between each group and another Series.
`SeriesGroupBy.cumcount`([ascending])	Number each item in each group from 0 to the length of that group - 1.
`SeriesGroupBy.cummax`([numeric_only, skipna])	Cumulative max for each group.
`SeriesGroupBy.cummin`([numeric_only, skipna])	Cumulative min for each group.
`SeriesGroupBy.cumprod`([numeric_only, skipna])	Cumulative product for each group.
`SeriesGroupBy.cumsum`([numeric_only, skipna])	Cumulative sum for each group.
`SeriesGroupBy.describe`([percentiles, ...])	Generate descriptive statistics for each group.
`SeriesGroupBy.diff`([periods])	First discrete difference of element.
`SeriesGroupBy.ewm`([com, span, halflife, ...])	Return an ewm grouper, providing ewm functionality per group.
`SeriesGroupBy.expanding`([min_periods, method])	Return an expanding grouper, providing expanding functionality per group.
`SeriesGroupBy.ffill`([limit])	Forward fill the values.
`SeriesGroupBy.first`([numeric_only, ...])	Compute the first entry of each column within each group.
`SeriesGroupBy.head`([n])	Return first n rows of each group.
`SeriesGroupBy.last`([numeric_only, ...])	Compute the last entry of each column within each group.
`SeriesGroupBy.idxmax`([skipna])	Return the row label of the maximum value.
`SeriesGroupBy.idxmin`([skipna])	Return the row label of the minimum value.
`SeriesGroupBy.is_monotonic_increasing`	Return whether each group's values are monotonically increasing.
`SeriesGroupBy.is_monotonic_decreasing`	Return whether each group's values are monotonically decreasing.
`SeriesGroupBy.max`([numeric_only, min_count, ...])	Compute max of group values.
`SeriesGroupBy.mean`([numeric_only, skipna, ...])	Compute mean of groups, excluding missing values.
`SeriesGroupBy.median`([numeric_only, skipna])	Compute median of groups, excluding missing values.
`SeriesGroupBy.min`([numeric_only, min_count, ...])	Compute min of group values.
`SeriesGroupBy.ngroup`([ascending])	Number each group from 0 to the number of groups - 1.
`SeriesGroupBy.nlargest`([n, keep])	Return the largest n elements.
`SeriesGroupBy.nsmallest`([n, keep])	Return the smallest n elements.
`SeriesGroupBy.nth`	Take the nth row from each group if n is an int, otherwise a subset of rows.
`SeriesGroupBy.nunique`([dropna])	Return number of unique elements in the group.
`SeriesGroupBy.unique`()	Return unique values for each group.
`SeriesGroupBy.ohlc`()	Compute open, high, low and close values of a group, excluding missing values.
`SeriesGroupBy.pct_change`([periods, ...])	Calculate pct_change of each value to previous entry in group.
`SeriesGroupBy.prod`([numeric_only, ...])	Compute prod of group values.
`SeriesGroupBy.quantile`([q, interpolation, ...])	Return group values at the given quantile, a la numpy.percentile.
`SeriesGroupBy.rank`([method, ascending, ...])	Provide the rank of values within each group.
`SeriesGroupBy.resample`(rule[, closed, ...])	Provide resampling within each group of a groupby.
`SeriesGroupBy.rolling`(window[, min_periods, ...])	Return a rolling grouper, providing rolling functionality per group.
`SeriesGroupBy.sample`([n, frac, replace, ...])	Return a random sample of items from each group.
`SeriesGroupBy.sem`([ddof, numeric_only, skipna])	Compute standard error of the mean of groups, excluding missing values.
`SeriesGroupBy.shift`([periods, freq, ...])	Shift each group by periods observations.
`SeriesGroupBy.size`()	Compute group sizes.
`SeriesGroupBy.skew`([skipna, numeric_only])	Return unbiased skew within groups.
`SeriesGroupBy.kurt`([skipna, numeric_only])	Return unbiased kurtosis within groups.
`SeriesGroupBy.std`([ddof, engine, ...])	Compute standard deviation of groups, excluding missing values.
`SeriesGroupBy.sum`([numeric_only, min_count, ...])	Compute sum of group values.
`SeriesGroupBy.var`([ddof, engine, ...])	Compute variance of groups, excluding missing values.
`SeriesGroupBy.tail`([n])	Return last n rows of each group.
`SeriesGroupBy.take`(indices, **kwargs)	Return the elements in the given positional indices in each group.
`SeriesGroupBy.value_counts`([normalize, ...])	Return a Series or DataFrame containing counts of unique rows.
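Some entries in this table exist only on SeriesGroupBy. A minimal sketch of three of them (`nlargest`, `nunique`, `is_monotonic_increasing`) follows; the Series values and the grouping keys are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data: a Series grouped by a list of keys of the same length.
s = pd.Series([3, 1, 2, 5, 5], index=list("vwxyz"))
keys = ["a", "a", "a", "b", "b"]
grouped = s.groupby(keys)

# Largest element of each group, indexed by (group name, original label).
top = grouped.nlargest(1)

# Number of distinct values in each group.
distinct = grouped.nunique()

# Whether each group's values never decrease.
increasing = grouped.is_monotonic_increasing
```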

Plotting and visualization#

`DataFrameGroupBy.boxplot`([subplots, column, ...])	Make box plots from DataFrameGroupBy data.
`DataFrameGroupBy.hist`([column, by, grid, ...])	Draw histogram of the DataFrame's columns for each group.
`SeriesGroupBy.hist`([by, ax, grid, ...])	Draw histogram for each group's values using `Series.hist()` API.
`DataFrameGroupBy.plot`	Make plots of groups from a DataFrame.
`SeriesGroupBy.plot`	Make plots of groups from a Series.