pandas.DataFrame.ewm
DataFrame.ewm(com=None, span=None, halflife=None, alpha=None, min_periods=0, adjust=True, ignore_na=False, axis=0, times=None, method='single')
Provide exponentially weighted (EW) calculations.
Exactly one of com, span, halflife, or alpha must be provided.
- Parameters
- com : float, optional
Specify decay in terms of center of mass: \(\alpha = 1 / (1 + com)\), for \(com \geq 0\).
- span : float, optional
Specify decay in terms of span: \(\alpha = 2 / (span + 1)\), for \(span \geq 1\).
- halflife : float, str, timedelta, optional
Specify decay in terms of half-life: \(\alpha = 1 - \exp\left(-\ln(2) / halflife\right)\), for \(halflife > 0\).
If times is specified, halflife is the time unit (str or timedelta) over which an observation decays to half its value. This form is only applicable to mean(); the halflife value will not apply to the other functions.
New in version 1.1.0.
- alpha : float, optional
Specify the smoothing factor \(\alpha\) directly, with \(0 < \alpha \leq 1\).
- min_periods : int, default 0
Minimum number of observations in the window required to have a value; otherwise, the result is np.nan.
- adjust : bool, default True
Divide by a decaying adjustment factor in the beginning periods to account for the imbalance in relative weightings (viewing the EWMA as a moving average).
When adjust=True (default), the EW function is calculated using weights \(w_i = (1 - \alpha)^i\). For example, the EW moving average of the series [\(x_0, x_1, ..., x_t\)] would be:
\[y_t = \frac{x_t + (1 - \alpha)x_{t-1} + (1 - \alpha)^2 x_{t-2} + ... + (1 - \alpha)^t x_0}{1 + (1 - \alpha) + (1 - \alpha)^2 + ... + (1 - \alpha)^t}\]
When adjust=False, the exponentially weighted function is calculated recursively:
\[\begin{split} y_0 &= x_0\\ y_t &= (1 - \alpha) y_{t-1} + \alpha x_t, \end{split}\]
- ignore_na : bool, default False
Ignore missing values when calculating weights.
When ignore_na=False (default), weights are based on absolute positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \((1-\alpha)^2\) and \(1\) if adjust=True, and \((1-\alpha)^2\) and \(\alpha\) if adjust=False.
When ignore_na=True, weights are based on relative positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \(1-\alpha\) and \(1\) if adjust=True, and \(1-\alpha\) and \(\alpha\) if adjust=False.
- axis : {0, 1}, default 0
If 0 or 'index', calculate across the rows.
If 1 or 'columns', calculate across the columns.
- times : str, np.ndarray, Series, default None
New in version 1.1.0.
Only applicable to mean().
Times corresponding to the observations. Must be monotonically increasing and of datetime64[ns] dtype.
If 1-D array-like, a sequence with the same shape as the observations.
Deprecated since version 1.4.0: If str, the name of the column in the DataFrame representing the times.
- method : str {'single', 'table'}, default 'single'
New in version 1.4.0.
Execute the rolling operation per single column or row ('single') or over the entire object ('table').
This argument is only implemented when specifying engine='numba' in the method call.
Only applicable to mean().
- Returns
ExponentialMovingWindow subclass
Notes
See Windowing Operations for further usage details and examples.
Examples
>>> df = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]})
>>> df
     B
0  0.0
1  1.0
2  2.0
3  NaN
4  4.0

>>> df.ewm(com=0.5).mean()
          B
0  0.000000
1  0.750000
2  1.615385
3  1.615385
4  3.670213
>>> df.ewm(alpha=2 / 3).mean()
          B
0  0.000000
1  0.750000
2  1.615385
3  1.615385
4  3.670213
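The two calls above agree because com=0.5 and alpha=2/3 describe the same decay. The following is a minimal sketch of that equivalence, using the conversion formulas from the parameter descriptions. The helper names (alpha_from_com, alpha_from_span, alpha_from_halflife) are hypothetical and are not part of pandas.

```python
import numpy as np
import pandas as pd


def alpha_from_com(com):
    # alpha = 1 / (1 + com), for com >= 0
    return 1.0 / (1.0 + com)


def alpha_from_span(span):
    # alpha = 2 / (span + 1), for span >= 1
    return 2.0 / (span + 1.0)


def alpha_from_halflife(halflife):
    # alpha = 1 - exp(-ln(2) / halflife), for halflife > 0
    return 1.0 - np.exp(-np.log(2.0) / halflife)


df = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]})

# Start from com=0.5 and derive the equivalent values of the other parameters
# by inverting the formulas above.
alpha = alpha_from_com(0.5)
span = 2.0 / alpha - 1.0
halflife = -np.log(2.0) / np.log(1.0 - alpha)

results = {
    "com": df.ewm(com=0.5).mean(),
    "span": df.ewm(span=span).mean(),
    "halflife": df.ewm(halflife=halflife).mean(),
    "alpha": df.ewm(alpha=alpha).mean(),
}

# Each parameterization should reproduce the com=0.5 result (NaN-aware comparison).
for name, result in results.items():
    print(name, np.allclose(result, results["com"], equal_nan=True))
```

Only one of the four decay parameters may be passed per call, so the conversion has to happen before calling ewm.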
adjust
>>> df.ewm(com=0.5, adjust=True).mean()
          B
0  0.000000
1  0.750000
2  1.615385
3  1.615385
4  3.670213
>>> df.ewm(com=0.5, adjust=False).mean()
          B
0  0.000000
1  0.666667
2  1.555556
3  1.555556
4  3.650794
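To make the difference between the two modes concrete, the sketch below evaluates the adjust=True weighted-average formula and the adjust=False recursion by hand and compares each with pandas. It assumes a series without missing values. The function names are hypothetical, and this is a direct transcription of the documented formulas, not the pandas implementation.

```python
import numpy as np
import pandas as pd


def ewm_mean_adjust_true(x, alpha):
    # y_t = sum_i (1 - alpha)^i * x_{t-i} / sum_i (1 - alpha)^i
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    for t in range(len(x)):
        # Weights for x[0], ..., x[t]; the newest observation has weight 1.
        weights = (1.0 - alpha) ** np.arange(t, -1, -1)
        out[t] = np.dot(weights, x[: t + 1]) / weights.sum()
    return out


def ewm_mean_adjust_false(x, alpha):
    # y_0 = x_0;  y_t = (1 - alpha) * y_{t-1} + alpha * x_t
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    out[0] = x[0]
    for t in range(1, len(x)):
        out[t] = (1.0 - alpha) * out[t - 1] + alpha * x[t]
    return out


values = [0.0, 1.0, 2.0, 4.0]  # illustrative data with no NaN
alpha = 2.0 / 3.0              # same decay as com=0.5
s = pd.Series(values)

manual_adjusted = ewm_mean_adjust_true(values, alpha)
manual_recursive = ewm_mean_adjust_false(values, alpha)

print(np.allclose(manual_adjusted, s.ewm(alpha=alpha, adjust=True).mean()))
print(np.allclose(manual_recursive, s.ewm(alpha=alpha, adjust=False).mean()))
```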
ignore_na
>>> df.ewm(com=0.5, ignore_na=True).mean()
          B
0  0.000000
1  0.750000
2  1.615385
3  1.615385
4  3.225000
>>> df.ewm(com=0.5, ignore_na=False).mean()
          B
0  0.000000
1  0.750000
2  1.615385
3  1.615385
4  3.670213
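The weights listed in the ignore_na parameter description can be checked on the three-element case [\(x_0\), None, \(x_2\)]. The sketch below uses illustrative values (x0=1, x2=3, alpha=0.5, all chosen here and not taken from the page), builds the final weighted average from the documented weights, and compares it with pandas for all four combinations of adjust and ignore_na.

```python
import numpy as np
import pandas as pd

x0, x2 = 1.0, 3.0   # illustrative values
alpha = 0.5         # illustrative smoothing factor
s = pd.Series([x0, np.nan, x2])


def final_value(w0, w2):
    # Weighted average of x0 and x2 with the given (unnormalized) weights.
    return (w0 * x0 + w2 * x2) / (w0 + w2)


# Keys are (adjust, ignore_na); weights follow the parameter description.
expected = {
    (True, False): final_value((1 - alpha) ** 2, 1.0),
    (False, False): final_value((1 - alpha) ** 2, alpha),
    (True, True): final_value(1 - alpha, 1.0),
    (False, True): final_value(1 - alpha, alpha),
}

observed = {}
for (adjust, ignore_na), value in expected.items():
    result = s.ewm(alpha=alpha, adjust=adjust, ignore_na=ignore_na).mean()
    observed[(adjust, ignore_na)] = result.iloc[-1]
    print(adjust, ignore_na, value, np.isclose(result.iloc[-1], value))
```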
times
Exponentially weighted mean with weights calculated with a timedelta halflife relative to times.
>>> times = ['2020-01-01', '2020-01-03', '2020-01-10', '2020-01-15', '2020-01-17']
>>> df.ewm(halflife='4 days', times=pd.DatetimeIndex(times)).mean()
          B
0  0.000000
1  0.585786
2  1.523889
3  1.523889
4  3.233686
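One way to read the output above: with the default adjust=True, each earlier observation appears to be weighted by 0.5 raised to its age measured in half-lives, and missing values are skipped. The sketch below works under that assumption. The function time_weighted_ewm_mean is hypothetical and is not the pandas implementation, but it reproduces the values shown in this example.

```python
import numpy as np
import pandas as pd


def time_weighted_ewm_mean(values, times, halflife):
    # Assumed weighting: w_i = 0.5 ** ((t_now - t_i) / halflife), NaN skipped.
    values = np.asarray(values, dtype=float)
    times = pd.DatetimeIndex(times)
    halflife = pd.Timedelta(halflife)
    out = np.full(len(values), np.nan)
    for t in range(len(values)):
        # Age of each observation up to position t, in units of halflife.
        age = np.asarray((times[t] - times[: t + 1]) / halflife, dtype=float)
        weights = 0.5 ** age
        window = values[: t + 1]
        valid = ~np.isnan(window)
        if valid.any():
            out[t] = np.dot(weights[valid], window[valid]) / weights[valid].sum()
    return out


df = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]})
times = ['2020-01-01', '2020-01-03', '2020-01-10', '2020-01-15', '2020-01-17']

manual = time_weighted_ewm_mean(df['B'], times, '4 days')
from_pandas = df.ewm(halflife='4 days', times=pd.DatetimeIndex(times)).mean()

print(manual)
print(np.allclose(manual, from_pandas['B'].to_numpy()))
```

Because the weights depend on elapsed time and not on row position, unevenly spaced observations decay by different amounts between rows.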