pandas.core.resample.Resampler.bfill
Resampler.bfill(self, limit=None)
Backward fill the new missing values in the resampled data.
In statistics, imputation is the process of replacing missing data with substituted values [1]. When resampling data, missing values may appear (e.g., when the resampling frequency is higher than the original frequency). The backward fill will replace NaN values that appeared in the resampled data with the next value in the original sequence. Missing values that existed in the original data will not be modified.
Parameters: - limit : integer, optional
Maximum number of consecutive missing values to fill in each gap. By default (None) there is no limit.
Returns: - Series, DataFrame
An upsampled Series or DataFrame with backward filled NaN values.
See also
bfill
- Alias of backfill.
fillna
- Fill NaN values using the specified method, which can be ‘backfill’.
nearest
- Fill NaN values with nearest neighbor starting from center.
pad
- Forward fill NaN values.
Series.fillna
- Fill NaN values in the Series using the specified method, which can be ‘backfill’.
DataFrame.fillna
- Fill NaN values in the DataFrame using the specified method, which can be ‘backfill’.
References
[1] https://en.wikipedia.org/wiki/Imputation_(statistics)
Examples
Resampling a Series:
>>> s = pd.Series([1, 2, 3],
...               index=pd.date_range('20180101', periods=3, freq='h'))
>>> s
2018-01-01 00:00:00    1
2018-01-01 01:00:00    2
2018-01-01 02:00:00    3
Freq: H, dtype: int64
>>> s.resample('30min').backfill()
2018-01-01 00:00:00    1
2018-01-01 00:30:00    2
2018-01-01 01:00:00    2
2018-01-01 01:30:00    3
2018-01-01 02:00:00    3
Freq: 30T, dtype: int64
>>> s.resample('15min').backfill(limit=2)
2018-01-01 00:00:00    1.0
2018-01-01 00:15:00    NaN
2018-01-01 00:30:00    2.0
2018-01-01 00:45:00    2.0
2018-01-01 01:00:00    2.0
2018-01-01 01:15:00    NaN
2018-01-01 01:30:00    3.0
2018-01-01 01:45:00    3.0
2018-01-01 02:00:00    3.0
Freq: 15T, dtype: float64
Resampling a DataFrame that has missing values:
>>> df = pd.DataFrame({'a': [2, np.nan, 6], 'b': [1, 3, 5]},
...                   index=pd.date_range('20180101', periods=3,
...                                       freq='h'))
>>> df
                       a  b
2018-01-01 00:00:00  2.0  1
2018-01-01 01:00:00  NaN  3
2018-01-01 02:00:00  6.0  5
>>> df.resample('30min').backfill()
                       a  b
2018-01-01 00:00:00  2.0  1
2018-01-01 00:30:00  NaN  3
2018-01-01 01:00:00  NaN  3
2018-01-01 01:30:00  6.0  5
2018-01-01 02:00:00  6.0  5
>>> df.resample('15min').backfill(limit=2)
                       a    b
2018-01-01 00:00:00  2.0  1.0
2018-01-01 00:15:00  NaN  NaN
2018-01-01 00:30:00  NaN  3.0
2018-01-01 00:45:00  NaN  3.0
2018-01-01 01:00:00  NaN  3.0
2018-01-01 01:15:00  NaN  NaN
2018-01-01 01:30:00  6.0  5.0
2018-01-01 01:45:00  6.0  5.0
2018-01-01 02:00:00  6.0  5.0
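The fill rule behind these outputs can be sketched in plain Python. This is only an illustration of the behaviour described on this page, not pandas' implementation: the helper name backward_fill, the use of integer minutes as timestamps, and None as a stand-in for NaN are all assumptions made for the sketch.

```python
from bisect import bisect_left


def backward_fill(orig_times, orig_values, new_times, limit=None):
    """Backward fill orig_values onto new_times (illustrative, not pandas' code).

    Both time lists must be sorted; None stands in for NaN.
    """
    filled = []
    for pos, t in enumerate(new_times):
        i = bisect_left(orig_times, t)  # first original timestamp >= t
        if i == len(orig_times):
            filled.append(None)  # nothing later to fill from
            continue
        if orig_times[i] != t and limit is not None:
            # Count the new slots in [t, next original timestamp).
            gap = bisect_left(new_times, orig_times[i]) - pos
            if gap > limit:
                filled.append(None)  # too far from the next original value
                continue
        # Copy the next original value as-is, even if it is itself missing.
        filled.append(orig_values[i])
    return filled


# Hourly data at minutes 0, 60 and 120, upsampled to 15 and 30 minutes.
minutes_15 = list(range(0, 121, 15))
minutes_30 = list(range(0, 121, 30))
series_limited = backward_fill([0, 60, 120], [1, 2, 3], minutes_15, limit=2)
column_a = backward_fill([0, 60, 120], [2.0, None, 6.0], minutes_30)
print(series_limited)
print(column_a)
```

With these inputs, series_limited follows the same pattern as the limit=2 Series example (the slot three steps before each original value stays missing), and column_a mirrors column a of the 30-minute DataFrame example, where the missing value at 01:00 is copied rather than replaced.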