GroupBy#

GroupBy objects are returned by groupby calls: pandas.DataFrame.groupby(), pandas.Series.groupby(), etc.

Indexing, iteration#

GroupBy.__iter__()

Groupby iterator.

GroupBy.groups

Dict {group name -> group labels}.

GroupBy.indices

Dict {group name -> group indices}.

GroupBy.get_group(name[, obj])

Construct DataFrame from group with provided name.

Grouper(*args, **kwargs)

A Grouper allows the user to specify a groupby instruction for an object.

Function application#

GroupBy.apply(func, *args, **kwargs)

Apply function func group-wise and combine the results together.

GroupBy.agg(func, *args, **kwargs)

SeriesGroupBy.aggregate([func, engine, ...])

Aggregate using one or more operations over the specified axis.

DataFrameGroupBy.aggregate([func, engine, ...])

Aggregate using one or more operations over the specified axis.

SeriesGroupBy.transform(func, *args[, ...])

Call function producing a same-indexed Series on each group.

DataFrameGroupBy.transform(func, *args[, ...])

Call function producing a same-indexed DataFrame on each group.

GroupBy.pipe(func, *args, **kwargs)

Apply a func with arguments to this GroupBy object and return its result.

Computations / descriptive stats#

GroupBy.all([skipna])

Return True if all values in the group are truthful, else False.

GroupBy.any([skipna])

Return True if any value in the group is truthful, else False.

GroupBy.bfill([limit])

Backward fill the values.

GroupBy.backfill([limit])

(DEPRECATED) Backward fill the values.

GroupBy.count()

Compute count of group, excluding missing values.

GroupBy.cumcount([ascending])

Number each item in each group from 0 to the length of that group - 1.

GroupBy.cummax([axis, numeric_only])

Cumulative max for each group.

GroupBy.cummin([axis, numeric_only])

Cumulative min for each group.

GroupBy.cumprod([axis])

Cumulative product for each group.

GroupBy.cumsum([axis])

Cumulative sum for each group.

GroupBy.ffill([limit])

Forward fill the values.

GroupBy.first([numeric_only, min_count])

Compute the first non-null entry of each column.

GroupBy.head([n])

Return first n rows of each group.

GroupBy.last([numeric_only, min_count])

Compute the last non-null entry of each column.

GroupBy.max([numeric_only, min_count, ...])

Compute max of group values.

GroupBy.mean([numeric_only, engine, ...])

Compute mean of groups, excluding missing values.

GroupBy.median([numeric_only])

Compute median of groups, excluding missing values.

GroupBy.min([numeric_only, min_count, ...])

Compute min of group values.

GroupBy.ngroup([ascending])

Number each group from 0 to the number of groups - 1.

GroupBy.nth(n[, dropna])

Take the nth row from each group if n is an int, otherwise a subset of rows.

GroupBy.ohlc()

Compute open, high, low and close values of a group, excluding missing values.

GroupBy.pad([limit])

(DEPRECATED) Forward fill the values.

GroupBy.prod([numeric_only, min_count])

Compute prod of group values.

GroupBy.rank([method, ascending, na_option, ...])

Provide the rank of values within each group.

GroupBy.pct_change([periods, fill_method, ...])

Calculate pct_change of each value to previous entry in group.

GroupBy.size()

Compute group sizes.

GroupBy.sem([ddof, numeric_only])

Compute standard error of the mean of groups, excluding missing values.

GroupBy.std([ddof, engine, engine_kwargs, ...])

Compute standard deviation of groups, excluding missing values.

GroupBy.sum([numeric_only, min_count, ...])

Compute sum of group values.

GroupBy.var([ddof, engine, engine_kwargs, ...])

Compute variance of groups, excluding missing values.

GroupBy.tail([n])

Return last n rows of each group.

The following methods are available in both SeriesGroupBy and DataFrameGroupBy objects, but may differ slightly, usually in that the DataFrameGroupBy version usually permits the specification of an axis argument, and often an argument indicating whether to restrict application to columns of a specific data type.

DataFrameGroupBy.all([skipna])

Return True if all values in the group are truthful, else False.

DataFrameGroupBy.any([skipna])

Return True if any value in the group is truthful, else False.

DataFrameGroupBy.backfill([limit])

(DEPRECATED) Backward fill the values.

DataFrameGroupBy.bfill([limit])

Backward fill the values.

DataFrameGroupBy.corr

Compute pairwise correlation of columns, excluding NA/null values.

DataFrameGroupBy.count()

Compute count of group, excluding missing values.

DataFrameGroupBy.cov

Compute pairwise covariance of columns, excluding NA/null values.

DataFrameGroupBy.cumcount([ascending])

Number each item in each group from 0 to the length of that group - 1.

DataFrameGroupBy.cummax([axis, numeric_only])

Cumulative max for each group.

DataFrameGroupBy.cummin([axis, numeric_only])

Cumulative min for each group.

DataFrameGroupBy.cumprod([axis])

Cumulative product for each group.

DataFrameGroupBy.cumsum([axis])

Cumulative sum for each group.

DataFrameGroupBy.describe(**kwargs)

Generate descriptive statistics.

DataFrameGroupBy.diff([periods, axis])

First discrete difference of element.

DataFrameGroupBy.ffill([limit])

Forward fill the values.

DataFrameGroupBy.fillna

Fill NA/NaN values using the specified method.

DataFrameGroupBy.filter(func[, dropna])

Return a copy of a DataFrame excluding filtered elements.

DataFrameGroupBy.hist

Make a histogram of the DataFrame's columns.

DataFrameGroupBy.idxmax([axis, skipna, ...])

Return index of first occurrence of maximum over requested axis.

DataFrameGroupBy.idxmin([axis, skipna, ...])

Return index of first occurrence of minimum over requested axis.

DataFrameGroupBy.mad

(DEPRECATED) Return the mean absolute deviation of the values over the requested axis.

DataFrameGroupBy.nunique([dropna])

Return DataFrame with counts of unique elements in each position.

DataFrameGroupBy.pad([limit])

(DEPRECATED) Forward fill the values.

DataFrameGroupBy.pct_change([periods, ...])

Calculate pct_change of each value to previous entry in group.

DataFrameGroupBy.plot

Class implementing the .plot attribute for groupby objects.

DataFrameGroupBy.quantile([q, ...])

Return group values at the given quantile, a la numpy.percentile.

DataFrameGroupBy.rank([method, ascending, ...])

Provide the rank of values within each group.

DataFrameGroupBy.resample(rule, *args, **kwargs)

Provide resampling when using a TimeGrouper.

DataFrameGroupBy.sample([n, frac, replace, ...])

Return a random sample of items from each group.

DataFrameGroupBy.shift([periods, freq, ...])

Shift each group by periods observations.

DataFrameGroupBy.size()

Compute group sizes.

DataFrameGroupBy.skew

Return unbiased skew over requested axis.

DataFrameGroupBy.take

Return the elements in the given positional indices along an axis.

DataFrameGroupBy.tshift

(DEPRECATED) Shift the time index, using the index's frequency if available.

DataFrameGroupBy.value_counts([subset, ...])

Return a Series or DataFrame containing counts of unique rows.

The following methods are available only for SeriesGroupBy objects.

SeriesGroupBy.hist

Draw histogram of the input series using matplotlib.

SeriesGroupBy.nlargest([n, keep])

Return the largest n elements.

SeriesGroupBy.nsmallest([n, keep])

Return the smallest n elements.

SeriesGroupBy.unique

Return unique values of Series object.

SeriesGroupBy.is_monotonic_increasing

Return boolean if values in the object are monotonically increasing.

SeriesGroupBy.is_monotonic_decreasing

Return boolean if values in the object are monotonically decreasing.

The following methods are available only for DataFrameGroupBy objects.

DataFrameGroupBy.corrwith

Compute pairwise correlation.

DataFrameGroupBy.boxplot([subplots, column, ...])

Make box plots from DataFrameGroupBy data.