Time Series / Date functionality
pandas has proven very successful as a tool for working with time series data,
especially in the financial data analysis space. Using the NumPy datetime64 and timedelta64 dtypes,
we have consolidated a large number of features from other Python libraries like scikits.timeseries as well as created
a tremendous amount of new functionality for manipulating time series data.
In working with time series data, we will frequently seek to:
- generate sequences of fixed-frequency dates and time spans
- conform or convert time series to a particular frequency
- compute “relative” dates based on various non-standard time increments (e.g. 5 business days before the last business day of the year), or “roll” dates forward or backward
pandas provides a relatively compact and self-contained set of tools for performing the above tasks.
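The "relative date" task from the list above can be sketched with pandas date offsets. This is a minimal illustration, not the only way to do it: the starting date is an arbitrary choice made for the example, and `BDay` only skips weekends, so public holidays are not accounted for.

```python
import pandas as pd

# Arbitrary starting point, chosen only for illustration.
start = pd.Timestamp('2011-06-15')

# Move forward to the last business day of the year.
# 2011-12-31 is a Saturday, so this lands on Friday 2011-12-30.
last_bday = start + pd.offsets.BYearEnd()

# Step back five business days (weekends skipped, holidays ignored).
five_before = last_bday - 5 * pd.offsets.BDay()

# "Roll" a non-business date to the nearest business day in either direction.
saturday = pd.Timestamp('2011-01-01')
rolled_forward = pd.offsets.BDay().rollforward(saturday)  # following Monday
rolled_back = pd.offsets.BDay().rollback(saturday)        # preceding Friday

print(last_bday, five_before, rolled_forward, rolled_back)
```

If holidays matter, a custom business-day calendar would be needed instead of the plain `BDay` offset used here.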
Create a range of dates:
# Assumes: import numpy as np; import pandas as pd; from datetime import datetime
# 72 hours starting at midnight on Jan 1st, 2011
In [1]: rng = pd.date_range('1/1/2011', periods=72, freq='H')
In [2]: rng[:5]
Out[2]: 
DatetimeIndex(['2011-01-01 00:00:00', '2011-01-01 01:00:00',
               '2011-01-01 02:00:00', '2011-01-01 03:00:00',
               '2011-01-01 04:00:00'],
              dtype='datetime64[ns]', freq='H')
Index pandas objects with dates:
In [3]: ts = pd.Series(np.random.randn(len(rng)), index=rng)
In [4]: ts.head()
Out[4]: 
2011-01-01 00:00:00    0.469112
2011-01-01 01:00:00   -0.282863
2011-01-01 02:00:00   -1.509059
2011-01-01 03:00:00   -1.135632
2011-01-01 04:00:00    1.212112
Freq: H, dtype: float64
Change frequency and fill gaps:
# to 45 minute frequency and forward fill
In [5]: converted = ts.asfreq('45Min', method='pad')
In [6]: converted.head()
Out[6]: 
2011-01-01 00:00:00    0.469112
2011-01-01 00:45:00    0.469112
2011-01-01 01:30:00   -0.282863
2011-01-01 02:15:00   -1.509059
2011-01-01 03:00:00   -1.135632
Freq: 45T, dtype: float64
Resample:
# Daily means
In [7]: ts.resample('D').mean()
Out[7]: 
2011-01-01   -0.319569
2011-01-02   -0.337703
2011-01-03    0.117258
Freq: D, dtype: float64
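Because the examples above use random values, the results cannot be checked by hand. The following sketch repeats the frequency conversion and resampling steps on a deterministic series. The data (the integers 0 to 47) and the 30-minute target frequency are assumptions made purely for illustration, and offset objects are used instead of string aliases so the snippet does not depend on which alias spellings a given pandas version accepts.

```python
import numpy as np
import pandas as pd

# 48 hourly observations: 0.0, 1.0, ..., 47.0 (two full days).
rng = pd.date_range('2011-01-01', periods=48, freq=pd.offsets.Hour())
ts = pd.Series(np.arange(48, dtype=float), index=rng)

# Upsample to 30 minutes: asfreq only reindexes, so new slots are NaN ...
upsampled = ts.asfreq(pd.offsets.Minute(30))

# ... unless a fill method is given (forward fill carries the last value).
filled = ts.asfreq(pd.offsets.Minute(30), method='ffill')

# Downsample to daily means: resample groups values into bins and aggregates.
daily = ts.resample('D').mean()

print(upsampled.head(3))
print(filled.head(3))
print(daily)
```

Here `asfreq` selects or inserts points at the new frequency, while `resample` aggregates every observation that falls into each bin.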
Overview
The following table shows the time-related classes pandas can handle and how to create them.
| Class | Remarks | How to create | 
|---|---|---|
| Timestamp | Represents a single time stamp | to_datetime, Timestamp |
| DatetimeIndex | Index of Timestamp | to_datetime, date_range, DatetimeIndex |
| Period | Represents a single time span | Period |
| PeriodIndex | Index of Period | period_range, PeriodIndex |
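As a rough illustration of the table, the sketch below builds one object of each class using the listed constructors. The specific dates and the daily and monthly frequencies are arbitrary choices for the example.

```python
import pandas as pd

# A scalar input to to_datetime gives a single Timestamp.
stamp = pd.to_datetime('2012-05-01')

# A list-like input to to_datetime, or date_range, gives a DatetimeIndex.
stamps = pd.to_datetime(['2012-05-01', '2012-05-02'])
days = pd.date_range('2012-05-01', periods=3, freq='D')

# A Period represents a span (here, the whole month of May 2012).
month = pd.Period('2012-05', freq='M')

# period_range gives a PeriodIndex of consecutive spans.
months = pd.period_range('2012-01', periods=3, freq='M')

for obj in (stamp, stamps, days, month, months):
    print(type(obj).__name__)
```

A `Period` exposes the boundaries of its span through `start_time` and `end_time`, which is one way to see the difference between a span and a single point in time.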
Time Stamps vs. Time Spans
Time-stamped data is the most basic type of time series data: it associates values with points in time. In pandas, such a point in time is represented by a Timestamp.
In [8]: pd.Timestamp(datetime(2012, 5, 1))
Out[8]: Timestamp('2012-05-01 00:00:00')
In [9]: pd.Timestamp('2012-05-01')