pandas.DataFrame.rolling

DataFrame.rolling(window, min_periods=None, center=False, win_type=None, on=None, closed=None, step=None, method='single')

Provide rolling window calculations.

This method returns a rolling window object, enabling aggregation, transformation, and other operations over a sliding window of a specified size.

Parameters:

window : int, timedelta, str, offset, or BaseIndexer subclass

Interval of the moving window.

If an integer, the delta between the start and end of each window. The number of points in the window depends on the closed argument.

If a timedelta, str, or offset, the time period of each window. Each window is variable-sized, based on the observations that fall within the time period. This is only valid for datetimelike indexes. To learn more about offsets and frequency strings, see the time series section of the pandas user guide.

If a BaseIndexer subclass, the window boundaries based on the defined get_window_bounds method. Additional rolling keyword arguments, namely min_periods, center, closed and step will be passed to get_window_bounds.

min_periods : int, default None

Minimum number of observations in window required to have a value; otherwise, result is np.nan.

For a window that is specified by an offset, min_periods will default to 1.

For a window that is specified by an integer, min_periods will default to the size of the window.

center : bool, default False

If False, set the window labels as the right edge of the window index.

If True, set the window labels as the center of the window index.

win_type : str, default None

If None, all points are evenly weighted.

If a string, it must be a valid scipy.signal window function.

Certain Scipy window types require additional parameters to be passed in the aggregation function. The additional parameters must match the keywords specified in the Scipy window type method signature.

on : str, optional

For a DataFrame, a column label or Index level on which to calculate the rolling window, rather than the DataFrame’s index.

If an integer column is provided, it is ignored and excluded from the result, since an integer index is not used to calculate the rolling window.

When on is specified, the values of that column also become the index of the Series passed to Rolling.apply() when raw=False, in place of the original DataFrame index.

closed : str, default None

Determines the inclusivity of points in the window.

If 'right', uses the window (first, last] meaning the last point is included in the calculations.

If 'left', uses the window [first, last) meaning the first point is included in the calculations.

If 'both', uses the window [first, last] meaning all points in the window are included in the calculations.

If 'neither', uses the window (first, last) meaning the first and last points in the window are excluded from calculations.

() and [] refer to open and closed set notation, respectively.

Default None ('right').

step : int, default None

Evaluate the window only at every step-th row, which is equivalent to slicing the full result as [::step]. window must be an integer. Using a step argument other than None or 1 will produce a result with a different shape than the input.

method : str {‘single’, ‘table’}, default ‘single’

Execute the rolling operation per single column or row ('single') or over the entire object ('table').

This argument is only implemented when specifying engine='numba' in the method call.

Returns:

pandas.api.typing.Window or pandas.api.typing.Rolling: An instance of Window is returned if win_type is passed. Otherwise, an instance of Rolling is returned.

See also

expanding: Provides expanding transformations.
ewm: Provides exponential weighted functions.

Notes

See Windowing Operations for further usage details and examples.

Examples

>>> df = pd.DataFrame({"B": [0, 1, 2, np.nan, 4]})
>>> df
     B
0  0.0
1  1.0
2  2.0
3  NaN
4  4.0

window

Rolling sum with a window length of 2 observations.

>>> df.rolling(2).sum()
     B
0  NaN
1  1.0
2  3.0
3  NaN
4  NaN

Rolling sum with a window span of 2 seconds.

>>> df_time = pd.DataFrame(
...     {"B": [0, 1, 2, np.nan, 4]},
...     index=[
...         pd.Timestamp("20130101 09:00:00"),
...         pd.Timestamp("20130101 09:00:02"),
...         pd.Timestamp("20130101 09:00:03"),
...         pd.Timestamp("20130101 09:00:05"),
...         pd.Timestamp("20130101 09:00:06"),
...     ],
... )

>>> df_time
                       B
2013-01-01 09:00:00  0.0
2013-01-01 09:00:02  1.0
2013-01-01 09:00:03  2.0
2013-01-01 09:00:05  NaN
2013-01-01 09:00:06  4.0

>>> df_time.rolling("2s").sum()
                       B
2013-01-01 09:00:00  0.0
2013-01-01 09:00:02  1.0
2013-01-01 09:00:03  3.0
2013-01-01 09:00:05  NaN
2013-01-01 09:00:06  4.0

Rolling sum with forward looking windows with 2 observations.

>>> indexer = pd.api.indexers.FixedForwardWindowIndexer(window_size=2)
>>> df.rolling(window=indexer, min_periods=1).sum()
     B
0  1.0
1  3.0
2  2.0
3  4.0
4  4.0
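
The get_window_bounds contract for BaseIndexer subclasses can be sketched with a small custom indexer. TrailingIndexer below is a hypothetical class, not part of pandas. It assumes the five-argument get_window_bounds signature (num_values, min_periods, center, closed, step) and, for illustration only, reproduces a trailing window of window_size rows.

```python
import numpy as np
import pandas as pd
from pandas.api.indexers import BaseIndexer


class TrailingIndexer(BaseIndexer):
    """Hypothetical indexer: each window ends at the current row (inclusive)."""

    def get_window_bounds(
        self, num_values=0, min_periods=None, center=None, closed=None, step=None
    ):
        # end[i] is exclusive, so row i itself is the last row of window i.
        end = np.arange(1, num_values + 1, dtype=np.int64)
        # Reach back window_size rows without going before the first row.
        start = np.clip(end - self.window_size, 0, None)
        return start, end


df = pd.DataFrame({"B": [0, 1, 2, np.nan, 4]})
custom = df.rolling(TrailingIndexer(window_size=2), min_periods=1).sum()
builtin = df.rolling(2, min_periods=1).sum()
print(custom.equals(builtin))
```

Under these assumptions the custom indexer should give the same sums as the integer window of length 2, since both describe the same start and end positions.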

min_periods

Rolling sum with a window length of 2 observations, but only needs a minimum of 1 observation to calculate a value.

>>> df.rolling(2, min_periods=1).sum()
     B
0  0.0
1  1.0
2  3.0
3  2.0
4  4.0

center

Rolling sum with the result assigned to the center of the window index.

>>> df.rolling(3, min_periods=1, center=True).sum()
     B
0  1.0
1  3.0
2  3.0
3  6.0
4  4.0

>>> df.rolling(3, min_periods=1, center=False).sum()
     B
0  0.0
1  1.0
2  3.0
3  3.0
4  6.0
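
closed

The four closed options are easiest to compare side by side. The sketch below is illustrative only: the one-second index and the values 1 to 4 are made up for this example, and each column of sums holds the rolling sum over a 2-second window for one option.

```python
import pandas as pd

df = pd.DataFrame(
    {"B": [1.0, 2.0, 3.0, 4.0]},
    index=pd.date_range("2013-01-01 09:00:00", periods=4, freq="s"),
)

# One rolling sum per closed option, collected as columns of a single frame.
options = ("right", "both", "left", "neither")
sums = pd.DataFrame(
    {c: df.rolling("2s", closed=c).sum()["B"] for c in options}
)
print(sums)
```

With this data, 'right' gives 1, 3, 5, 7 and 'both' gives 1, 3, 6, 9. 'left' and 'neither' exclude the current row, so their first value is NaN, followed by 1, 3, 5 and 1, 2, 3 respectively.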

step

Rolling sum with a window length of 2 observations, minimum of 1 observation to calculate a value, and a step of 2.

>>> df.rolling(2, min_periods=1, step=2).sum()
     B
0  0.0
2  3.0
4  4.0
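
The slicing equivalence stated for step can be checked directly. This is a minimal sketch that reuses the example data above and assumes a pandas version in which the step argument is available.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"B": [0, 1, 2, np.nan, 4]})

# Evaluate only every second window.
stepped = df.rolling(2, min_periods=1, step=2).sum()
# Evaluate every window, then keep every second row of the result.
sliced = df.rolling(2, min_periods=1).sum().iloc[::2]

print(stepped.equals(sliced))
```

Both results keep the original labels 0, 2 and 4, which is why the shape differs from the input.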

win_type

Rolling sum with a window length of 2, using the Scipy 'gaussian' window type. std is required in the aggregation function.

>>> df.rolling(2, win_type="gaussian").sum(std=3)
          B
0       NaN
1  0.986207
2  2.958621
3       NaN
4       NaN
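
Passing win_type also changes the type of the returned object, as described under Returns. The sketch below only inspects the class names; it assumes SciPy is installed, since the weighted window is validated when the object is created.

```python
import pandas as pd

df = pd.DataFrame({"B": [0.0, 1.0, 2.0]})

plain = df.rolling(2)
weighted = df.rolling(2, win_type="gaussian")  # requires SciPy

print(type(plain).__name__)     # Rolling
print(type(weighted).__name__)  # Window
```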

on

Rolling sum with a window length of 2 days.

>>> df = pd.DataFrame(
...     {
...         "A": [
...             pd.to_datetime("2020-01-01"),
...             pd.to_datetime("2020-01-01"),
...             pd.to_datetime("2020-01-02"),
...         ],
...         "B": [1, 2, 3],
...     },
...     index=pd.date_range("2020", periods=3),
... )

>>> df
                    A  B
2020-01-01 2020-01-01  1
2020-01-02 2020-01-01  2
2020-01-03 2020-01-02  3

>>> df.rolling("2D", on="A").sum()
                    A    B
2020-01-01 2020-01-01  1.0
2020-01-02 2020-01-01  3.0
2020-01-03 2020-01-02  6.0
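
The effect of on with Rolling.apply() and raw=False can be sketched as follows. span_in_days is a hypothetical helper written for this illustration; it relies on the behavior described for the on parameter, namely that each window Series is indexed by the values of column "A" rather than by the DataFrame's integer index, so it may not work on pandas versions that predate that behavior.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
        "B": [1.0, 2.0, 3.0],
    }
)


def span_in_days(window):
    # Hypothetical helper: with on="A", window.index holds timestamps from "A",
    # so subtracting the first label from the last yields a Timedelta.
    return (window.index[-1] - window.index[0]).days


result = df.rolling("2D", on="A").apply(span_in_days, raw=False)
print(result)
```

If the windows were indexed by the default integer index instead, the subtraction would produce a plain integer and the .days lookup would fail.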