pandas.core.groupby.DataFrameGroupBy.resample
DataFrameGroupBy.resample(rule, how=None, axis=0, fill_method=None, closed=None, label=None, convention='start', kind=None, loffset=None, limit=None, base=0)
Convenience method for frequency conversion and resampling of regular time-series data.

Parameters:
- rule : string
    The offset string or object representing the target conversion.
- how : string
    Method for down- or re-sampling; defaults to 'mean' for downsampling.
- axis : int, optional, default 0
- fill_method : string, default None
    Fill method for upsampling.
- closed : {'right', 'left'}
    Which side of the bin interval is closed.
- label : {'right', 'left'}
    Which bin edge to label the bucket with.
- convention : {'start', 'end', 's', 'e'}
- kind : "period"/"timestamp"
- loffset : timedelta
    Adjust the resampled time labels.
- limit : int, default None
    Maximum size gap when reindexing with fill_method.
- base : int, default 0
    For frequencies that evenly subdivide 1 day, the "origin" of the aggregated intervals. For example, for '5min' frequency, base could range from 0 through 4. Defaults to 0.

Examples

Start by creating a series with 9 one minute timestamps.

>>> index = pd.date_range('1/1/2000', periods=9, freq='T')
>>> series = pd.Series(range(9), index=index)
>>> series
2000-01-01 00:00:00    0
2000-01-01 00:01:00    1
2000-01-01 00:02:00    2
2000-01-01 00:03:00    3
2000-01-01 00:04:00    4
2000-01-01 00:05:00    5
2000-01-01 00:06:00    6
2000-01-01 00:07:00    7
2000-01-01 00:08:00    8
Freq: T, dtype: int64

Downsample the series into 3 minute bins and sum the values of the timestamps falling into a bin.

>>> series.resample('3T', how='sum')
2000-01-01 00:00:00     3
2000-01-01 00:03:00    12
2000-01-01 00:06:00    21
Freq: 3T, dtype: int64

Downsample the series into 3 minute bins as above, but label each bin using the right edge instead of the left. Note that the value at the timestamp used as the label is not included in the bucket that it labels. For example, in the original series the bucket 2000-01-01 00:03:00 contains the value 3, but the summed value in the resampled bucket with the label 2000-01-01 00:03:00 does not include 3 (if it did, the summed value would be 6, not 3). To include this value, close the right side of the bin interval as illustrated in the example below this one.

>>> series.resample('3T', how='sum', label='right')
2000-01-01 00:03:00     3
2000-01-01 00:06:00    12
2000-01-01 00:09:00    21
Freq: 3T, dtype: int64

Downsample the series into 3 minute bins as above, but close the right side of the bin interval.

>>> series.resample('3T', how='sum', label='right', closed='right')
2000-01-01 00:00:00     0
2000-01-01 00:03:00     6
2000-01-01 00:06:00    15
2000-01-01 00:09:00    15
Freq: 3T, dtype: int64

Upsample the series into 30 second bins.

>>> series.resample('30S')[0:5]  # select first 5 rows
2000-01-01 00:00:00     0
2000-01-01 00:00:30   NaN
2000-01-01 00:01:00     1
2000-01-01 00:01:30   NaN
2000-01-01 00:02:00     2
Freq: 30S, dtype: float64

Upsample the series into 30 second bins and fill the NaN values using the pad method.

>>> series.resample('30S', fill_method='pad')[0:5]
2000-01-01 00:00:00    0
2000-01-01 00:00:30    0
2000-01-01 00:01:00    1
2000-01-01 00:01:30    1
2000-01-01 00:02:00    2
Freq: 30S, dtype: int64

Upsample the series into 30 second bins and fill the NaN values using the bfill method.

>>> series.resample('30S', fill_method='bfill')[0:5]
2000-01-01 00:00:00    0
2000-01-01 00:00:30    1
2000-01-01 00:01:00    1
2000-01-01 00:01:30    2
2000-01-01 00:02:00    2
Freq: 30S, dtype: int64

Pass a custom function to how.

>>> def custom_resampler(array_like):
...     return np.sum(array_like) + 5

>>> series.resample('3T', how=custom_resampler)
2000-01-01 00:00:00     8
2000-01-01 00:03:00    17
2000-01-01 00:06:00    26
Freq: 3T, dtype: int64
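
The examples on this page rely on the `how=` keyword and all operate on a plain Series. The following is a minimal sketch, not part of the original docstring, that assumes a later pandas release in which the aggregation is chained onto the resampler object (`.sum()`, `.apply(...)`) instead of being passed as `how=`. The frequency alias `'min'` stands in for `'T'` on the assumption that newer releases prefer it, and the `key`/`value` frame is hypothetical, made up only to illustrate a grouped resample.

```python
import pandas as pd

# Same data as the Series examples: nine one-minute timestamps.
# Assumption: "min" is used in place of the "T" alias shown on this page.
index = pd.date_range("2000-01-01", periods=9, freq="min")
series = pd.Series(range(9), index=index)

# Assumed counterpart of resample('3T', how='sum'):
# bins are closed on the left and labelled with the left edge.
left_sums = series.resample("3min").sum()

# Assumed counterpart of how='sum', label='right', closed='right':
# each bin now includes its right edge and is labelled with it.
right_sums = series.resample("3min", label="right", closed="right").sum()

# Assumed counterpart of how=custom_resampler: the function is
# called once per bin with the values that fall into that bin.
def custom_resampler(values):
    return values.sum() + 5

custom = series.resample("3min").apply(custom_resampler)

# Hypothetical grouped case: resample the "value" column
# separately within each "key" group.
frame = pd.DataFrame(
    {"key": list("aaabbb"), "value": range(6)},
    index=index[:6],
)
grouped_sums = frame.groupby("key")["value"].resample("3min").sum()

print(left_sums)
print(right_sums)
print(custom)
print(grouped_sums)
```

Under these assumptions the first three results should reproduce the sums shown in the corresponding `how=` examples, and the grouped result is indexed by both the group key and the bin label.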