pandas.core.groupby.DataFrameGroupBy.pipe#

DataFrameGroupBy.pipe(func, *args, **kwargs)[source]#

Apply a func with arguments to this GroupBy object and return its result.

Use .pipe when you want to improve readability by chaining together functions that expect Series, DataFrames, GroupBy or Resampler objects. Instead of writing

>>> h = lambda x, arg2, arg3: x + 1 - arg2 * arg3
>>> g = lambda x, arg1: x * 5 / arg1
>>> f = lambda x: x ** 4
>>> df = pd.DataFrame([["a", 4], ["b", 5]], columns=["group", "value"])
>>> h(g(f(df.groupby('group')), arg1=1), arg2=2, arg3=3)  

You can write

>>> (df.groupby('group')
...    .pipe(f)
...    .pipe(g, arg1=1)
...    .pipe(h, arg2=2, arg3=3))  

which is much more readable.

Parameters:
funccallable or tuple of (callable, str)

Function to apply to this GroupBy object or, alternatively, a (callable, data_keyword) tuple where data_keyword is a string indicating the keyword of callable that expects the GroupBy object.

*argsiterable, optional

Positional arguments passed into func.

**kwargsdict, optional

A dictionary of keyword arguments passed into func.

Returns:
GroupBy

The original object with the function func applied.

See also

Series.pipe

Apply a function with arguments to a series.

DataFrame.pipe

Apply a function with arguments to a dataframe.

apply

Apply function to each group instead of to the full GroupBy object.

Notes

See more here

Examples

>>> df = pd.DataFrame({'A': 'a b a b'.split(), 'B': [1, 2, 3, 4]})
>>> df
   A  B
0  a  1
1  b  2
2  a  3
3  b  4

To get the difference between each groups maximum and minimum value in one pass, you can do

>>> df.groupby('A').pipe(lambda x: x.max() - x.min())
   B
A
a  2
b  2