pandas.core.groupby.DataFrameGroupBy.value_counts

DataFrameGroupBy.value_counts(subset=None, normalize=False, sort=True, ascending=False, dropna=True)

Return a Series or DataFrame containing counts of unique rows.

Added in version 1.4.0.

Parameters:

subset (list-like, optional): Columns to use when counting unique combinations.
normalize (bool, default False): Return proportions rather than frequencies.
sort (bool, default True): Sort by frequencies when True. When False, non-grouping columns appear in the order they occur within groups.
    Changed in version 3.0.0: In prior versions, sort=False would sort the non-grouping columns by label.
ascending (bool, default False): Sort in ascending order.
dropna (bool, default True): Don’t include counts of rows that contain NA values.
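
The examples further down do not exercise subset, so here is a minimal sketch of how it might be used. It assumes the same three-column frame as the Examples section; the variable name counts is only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "gender": ["male", "male", "female", "male", "female", "male"],
        "education": ["low", "medium", "high", "low", "high", "low"],
        "country": ["US", "FR", "US", "FR", "FR", "FR"],
    }
)

# Count only the "education" values within each gender; "country" is ignored.
counts = df.groupby("gender").value_counts(subset=["education"])

# The index levels are the grouping column followed by the subset columns.
print(list(counts.index.names))  # ['gender', 'education']
print(counts.loc[("male", "low")])  # 3
print(counts.loc[("female", "high")])  # 2
```

Restricting the count to a subset collapses rows that differ only in the ignored columns, which is why ("male", "low") is counted three times here.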

Returns:

Series or DataFrame: Series if the groupby as_index is True, otherwise DataFrame.

See also

Series.value_counts: Equivalent method on Series.
DataFrame.value_counts: Equivalent method on DataFrame.
SeriesGroupBy.value_counts: Equivalent method on SeriesGroupBy.

Notes

If the groupby as_index is True then the returned Series will have a MultiIndex with one level per input column.
If the groupby as_index is False then the returned DataFrame will have an additional column with the value_counts. The column is labelled ‘count’ or ‘proportion’, depending on the normalize parameter.
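
As a rough illustration of the two return shapes described above, the following sketch compares the result types and the label of the added column. The variable names are hypothetical, and the frame is the one used in the Examples section.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "gender": ["male", "male", "female", "male", "female", "male"],
        "education": ["low", "medium", "high", "low", "high", "low"],
        "country": ["US", "FR", "US", "FR", "FR", "FR"],
    }
)

# as_index=True (the groupby default): a Series with one index level per column.
as_series = df.groupby("gender").value_counts()

# as_index=False: a DataFrame with the counts in an extra column.
as_frame = df.groupby("gender", as_index=False).value_counts()
as_frame_norm = df.groupby("gender", as_index=False).value_counts(normalize=True)

print(type(as_series).__name__, as_series.index.nlevels)  # Series 3
print(list(as_frame.columns))  # ['gender', 'education', 'country', 'count']
print(list(as_frame_norm.columns))  # ['gender', 'education', 'country', 'proportion']
```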

By default, rows that contain any NA values are omitted from the result.
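
A small sketch of the NA handling, assuming a made-up two-column frame (team and color are illustrative names, not part of the original examples):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "team": ["a", "a", "a", "b"],
        "color": ["red", np.nan, "red", "blue"],
    }
)

# Default dropna=True: the row with a missing color is left out of the counts.
with_default = df.groupby("team").value_counts()

# dropna=False: the missing value is counted as its own combination.
with_na = df.groupby("team").value_counts(dropna=False)

print(len(with_default), len(with_na))  # 2 3
print(with_default.sum(), with_na.sum())  # 3 4
```

With the default, the counts can sum to fewer rows than the frame contains, which is worth keeping in mind when normalize=True is also used.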

By default, the result will be in descending order so that the first element of each group is the most frequently-occurring row.

Examples

>>> df = pd.DataFrame(
...     {
...         "gender": ["male", "male", "female", "male", "female", "male"],
...         "education": ["low", "medium", "high", "low", "high", "low"],
...         "country": ["US", "FR", "US", "FR", "FR", "FR"],
...     }
... )

>>> df
   gender education country
0    male       low      US
1    male    medium      FR
2  female      high      US
3    male       low      FR
4  female      high      FR
5    male       low      FR

>>> df.groupby("gender").value_counts()
gender  education  country
female  high       US         1
                   FR         1
male    low        FR         2
                   US         1
        medium     FR         1
Name: count, dtype: int64

>>> df.groupby("gender").value_counts(ascending=True)
gender  education  country
female  high       US         1
                   FR         1
male    low        US         1
        medium     FR         1
        low        FR         2
Name: count, dtype: int64

>>> df.groupby("gender").value_counts(normalize=True)
gender  education  country
female  high       US         0.50
                   FR         0.50
male    low        FR         0.50
                   US         0.25
        medium     FR         0.25
Name: proportion, dtype: float64

>>> df.groupby("gender", as_index=False).value_counts()
   gender education country  count
0  female      high      US      1
1  female      high      FR      1
2    male       low      FR      2
3    male       low      US      1
4    male    medium      FR      1

>>> df.groupby("gender", as_index=False).value_counts(normalize=True)
   gender education country  proportion
0  female      high      US        0.50
1  female      high      FR        0.50
2    male       low      FR        0.50
3    male       low      US        0.25
4    male    medium      FR        0.25
