v0.10.0 (December 17, 2012)
This is a major release from 0.9.1 and includes many new features and enhancements along with a large number of bug fixes. There are also a number of important API changes that long-time pandas users should pay close attention to.
File parsing new features
The delimited file parsing engine (the guts of read_csv and read_table) has
been rewritten from the ground up and now uses a fraction of the memory while
parsing, while being 40% or more faster in most use cases (in some cases much
faster).
There are also many new features:
- Much-improved Unicode handling via the encoding option
- Column filtering (usecols)
- Dtype specification (dtype argument)
- Ability to specify strings to be recognized as True/False
- Ability to yield NumPy record arrays (as_recarray)
- High performance delim_whitespace option
- Decimal format (e.g. European format) specification
- Easier CSV dialect options: escapechar, lineterminator, quotechar, etc.
- More robust handling of many exceptional kinds of files observed in the wild
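As a rough illustration of how several of these options combine, here is a minimal sketch. The sample data and column names are made up for the example, and the true_values / false_values keyword names are taken from current pandas as an assumption about how the True/False string feature is spelled. as_recarray and delim_whitespace are left out because later pandas versions removed or deprecated them.

```python
import io

import pandas as pd

# Hypothetical sample: semicolon-delimited, Latin-1 encoded, with
# European-style decimal commas and yes/no flags.
raw = (
    "id;name;price;in_stock;note\n"
    "1;Käse;2,50;yes;ignored\n"
    "2;Brot;1,25;no;ignored\n"
).encode("latin-1")

df = pd.read_csv(
    io.BytesIO(raw),
    sep=";",
    encoding="latin-1",                           # Unicode handling
    usecols=["id", "name", "price", "in_stock"],  # column filtering
    dtype={"id": "int32"},                        # dtype specification
    true_values=["yes"],                          # strings parsed as True
    false_values=["no"],                          # strings parsed as False
    decimal=",",                                  # European decimal format
)

print(df)
print(df.dtypes)
```

The note column is dropped by usecols, id is read as a 32-bit integer, price is parsed as a float despite the decimal comma, and in_stock becomes a boolean column.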
API changes
Deprecated DataFrame BINOP TimeSeries special case behavior
The default behavior of binary operations between a DataFrame and a Series has always been to align on the DataFrame’s columns and broadcast down the rows, except in the special case that the DataFrame contains time series. Since there are now methods for each binary operator that let you specify how you want to broadcast, we are phasing out this special case (Zen of Python: Special cases aren’t special enough to break the rules). Here’s what I’m talking about:
In [1]: import numpy as np
...: import pandas as pd
...:
In [2]: df = pd.DataFrame(np.random.randn(6, 4),
...: index=pd.date_range('1/1/2000', periods=6))
...:
In [3]: df
Out[3]:
0 1 2 3
2000-01-01 0.469112 -0.282863 -1.509059 -1.135632
2000-01-02 1.212112 -0.173215 0.119209 -1.044236
2000-01-03 -0.861849 -2.104569 -0.494929 1.071804
2000-01-04 0.721555 -0.706771 -1.039575 0.271860
2000-01-05 -0.424972 0.567020 0.276232 -1.087401
2000-01-06 -0.673690 0.113648 -1.478427 0.524988
# deprecated now
In [4]: df - df[0]