v0.17.0 (October 9, 2015)
This is a major release from 0.16.2 and includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. We recommend that all users upgrade to this version.
Warning
pandas >= 0.17.0 no longer supports Python 3.2 (GH9118)
Warning
The pandas.io.data package is deprecated and will be replaced by the pandas-datareader package. This will allow the data modules to be updated independently of your pandas installation. The API for pandas-datareader v0.1.1 is exactly the same as in pandas v0.17.0 (GH8961, GH10861).
After installing pandas-datareader, you can easily change your imports:
from pandas.io import data, wb
becomes
from pandas_datareader import data, wb
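For code that has to run on installations both with and without pandas-datareader, a guarded import is one option. The sketch below is an assumption about how a project might handle the transition, not something these notes prescribe; load_data_module is a hypothetical helper name.

```python
import importlib


def load_data_module():
    """Return the remote-data module, preferring pandas-datareader.

    Hypothetical helper: tries the new pandas_datareader.data location
    first, then the deprecated pandas.io.data location, and returns None
    when neither can be imported.
    """
    for name in ("pandas_datareader.data", "pandas.io.data"):
        try:
            return importlib.import_module(name)
        except ImportError:
            # Not installed (or removed in this pandas version); try the next.
            continue
    return None


data = load_data_module()
```

Callers would then check whether data is None before using it, which keeps the choice of package in one place.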
Highlights include:
- Release the Global Interpreter Lock (GIL) on some cython operations, see here
- Plotting methods are now available as attributes of the .plot accessor, see here
- The sorting API has been revamped to remove some long-time inconsistencies, see here
- Support for a datetime64[ns] with timezones as a first-class dtype, see here
- The default for to_datetime will now be to raise when presented with unparseable formats; previously this would return the original input. Also, date parse functions now return consistent results. See here
- The default for dropna in HDFStore has changed to False, to store by default all rows even if they are all NaN, see here
- Datetime accessor (dt) now supports Series.dt.strftime to generate formatted strings for datetime-likes, and Series.dt.total_seconds to generate the duration of each timedelta in seconds. See here
- Period and PeriodIndex can handle multiplied freq like 3D, which corresponds to a span of 3 days. See here
- Development installed versions of pandas will now have PEP440 compliant version strings (GH9518)
- Development support for benchmarking with the Air Speed Velocity library (GH8361)
- Support for reading SAS xport files, see here
- Documentation comparing SAS to pandas, see here
- Removal of the automatic TimeSeries broadcasting, deprecated since 0.8.0, see here
- Display format with plain text can optionally align with Unicode East Asian Width, see here
- Compatibility with Python 3.5 (GH11097)
- Compatibility with matplotlib 1.5.0 (GH11111)
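A few of the highlighted behaviours can be tried directly. The following is a minimal sketch, assuming pandas 0.17.0 or later is installed; the variable names are illustrative, and the errors="coerce" option is not mentioned in the list above but is included to show one way of keeping the non-raising behaviour.

```python
import pandas as pd

# Series.dt.strftime formats each datetime as a string.
dates = pd.Series(pd.date_range("20130101", periods=3))
formatted = dates.dt.strftime("%Y/%m/%d")

# Series.dt.total_seconds returns each timedelta's duration in seconds.
deltas = pd.Series(pd.to_timedelta(["1 days", "12 hours"]))
seconds = deltas.dt.total_seconds()

# A Period with a multiplied frequency: adding 1 moves forward by 3 days.
period = pd.Period("2015-08-01", freq="3D")
next_period = period + 1

# to_datetime raises on unparseable input by default ...
try:
    pd.to_datetime(["2009-07-31", "asd"])
    raised = False
except ValueError:
    raised = True

# ... while errors="coerce" converts unparseable entries to NaT instead.
coerced = pd.to_datetime(["2009-07-31", "asd"], errors="coerce")
```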
Check the API Changes and deprecations before updating.
What’s new in v0.17.0
- New features
- Datetime with TZ
- Releasing the GIL
- Plot submethods
- Additional methods for dt accessor
- Period Frequency Enhancement
- Support for SAS XPORT files
- Support for Math Functions in .eval()
- Changes to Excel with MultiIndex
- Google BigQuery Enhancements
- Display Alignment with Unicode East Asian Width
- Other enhancements
- Backwards incompatible API changes
- Changes to sorting API
- Changes to to_datetime and to_timedelta
- Changes to Index Comparisons
- Changes to Boolean Comparisons vs. None
- HDFStore dropna behavior
- Changes to display.precision option
- Changes to Categorical.unique
- Changes to bool passed as header in Parsers
- Other API Changes
- Deprecations
- Removal of prior version deprecations/changes
- Performance Improvements
- Bug Fixes
- Contributors
New features
Datetime with TZ
We are adding an implementation that natively supports datetime with timezones. Previously, a Series or a DataFrame column could be assigned a datetime with timezones, but it would be stored as an object dtype. This had performance issues with a large number of rows. See the docs for more details. (GH8260, GH10763, GH11034).
The new implementation stores a single timezone across all rows of a column, so that operations on it run efficiently.
In [1]: df = pd.DataFrame({'A': pd.date_range('20130101', periods=3),
   ...:                    'B': pd.date_range('20130101', periods=3, tz='US/Eastern'),
   ...:                    'C': pd.date_range('20130101', periods=3, tz='CET')})
   ...:

In [2]: df
Out[2]:
           A                         B                         C
0 2013-01-01 2013-01-01 00:00:00-05:00 2013-01-01 00:00:00+01:00
1 2013-01-02 2013-01-02 00:00:00-05:00 2013-01-02 00:00:00+01:00
2 2013-01-03 2013-01-03 00:00:00-05:00 2013-01-03 00:00:00+01:00

[3 rows x 3 columns]

In [3]: df.dtypes