v0.9.0 (October 7, 2012)
This is a major release from 0.8.1 and includes several new features and enhancements along with a large number of bug fixes. New features include vectorized unicode encoding/decoding for Series.str, a to_latex method for DataFrame, more flexible parsing of boolean values, and support for downloading options data from Yahoo! Finance.
New features
- Add encode and decode for unicode handling to vectorized string processing methods in Series.str (GH1706)
- Add DataFrame.to_latex method (GH1735)
- Add convenient expanding window equivalents of all rolling_* ops (GH1785)
- Add Options class to pandas.io.data for fetching options data from Yahoo! Finance (GH1748, GH1739)
- More flexible parsing of boolean values (Yes, No, TRUE, FALSE, etc) (GH1691, GH1295)
- Add level parameter to Series.reset_index
- TimeSeries.between_time can now select times across midnight (GH1871)
- Series constructor can now handle generator as input (GH1679)
- DataFrame.dropna can now take multiple axes (tuple/list) as input (GH924)
- Enable skip_footer parameter in ExcelFile.parse (GH1843)
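A few of the additions listed above can be sketched briefly. This is a minimal illustration written against a recent pandas release rather than 0.9.0 itself, so the sample data and every variable name here are assumptions, and reprs or dtypes may differ between versions.

```python
import pandas as pd

# Vectorized encode/decode on Series.str (GH1706).
words = pd.Series(["caf\u00e9", "na\u00efve"])
encoded = words.str.encode("utf-8")    # each element becomes a bytes object
decoded = encoded.str.decode("utf-8")  # and back to text

# The Series constructor accepts a generator (GH1679).
squares = pd.Series(x * x for x in range(4))

# reset_index takes a level argument on a Series with a MultiIndex.
idx = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)], names=["letter", "number"]
)
s = pd.Series([10, 20, 30], index=idx, name="value")
by_letter = s.reset_index(level="number")  # "number" moves into a column

# between_time can select a window that wraps past midnight (GH1871).
stamps = pd.date_range("2012-10-07 22:00", periods=6, freq="60min")
ts = pd.Series(range(6), index=stamps)
late = ts.between_time("23:00", "01:00")  # keeps 23:00, 00:00 and 01:00
```

The between_time call keeps the rows stamped 23:00, 00:00 and 01:00, since both endpoints are included by default in the version assumed here.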
API changes
- The default column names used when header=None and no column names are passed to functions like read_csv have changed to be more Pythonic and amenable to attribute access:
In [1]: import io
In [2]: data = ('0,0,1\n'
...: '1,1,0\n'
...: '0,1,0')
...:
In [3]: df = pd.read_csv(io.StringIO(data), header=None)
In [4]: df
Out[4]:
   0  1  2
0  0  0  1
1  1  1  0
2  0  1  0
[3 rows x 3 columns]
- Creating a Series from another Series while passing an index now causes reindexing to happen internally, rather than treating the input Series like an ndarray. Technically improper usages like Series(df[col1], index=df[col2]) that worked before “by accident” (this was never intended) will lead to an all-NA Series in some cases. To be perfectly clear:
In [5]: s1 = pd.Series([1, 2, 3])
In [6]: s1
Out[6]:
0    1
1    2
2    3
Length: 3, dtype: int64
In [7]: s2 = pd.Series(s1, index=['foo', 'bar', 'baz'])
In [8]: s2
Out[8]:
foo   NaN
bar   NaN
baz   NaN
Length: 3, dtype: float64
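When the intent is to relabel the values by position rather than to reindex, one option is to hand the constructor the underlying array instead of the Series. The following is a minimal sketch, assuming a reasonably recent pandas; the names relabeled and subset are illustrative only.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3])

# Passing the raw values skips reindexing, so labels are assigned by position.
relabeled = pd.Series(s1.values, index=["foo", "bar", "baz"])

# Passing the Series itself reindexes: only labels already present in s1
# carry data over, and any other label (5 here) becomes NaN.
subset = pd.Series(s1, index=[2, 0, 5])
```

Here relabeled keeps all three values under the new labels, while subset holds data only for the labels 2 and 0 that exist in s1.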
- Deprecated day_of_year API removed from PeriodIndex, use dayofyear (GH1723)
- Don’t modify NumPy suppress printoption to True at import time
- The internal HDF5 data arrangement for DataFrames has been transposed. Legacy files will still be readable by HDFStore (GH1834, GH1824)
- Legacy cruft removed: pandas.stats.misc.quantileTS
- Use ISO8601 format for Period repr: monthly, daily, and on down (GH1776)
- Empty DataFrame columns are now created as object dtype. This will prevent a class of TypeErrors that was occurring in code where the dtype of a column would depend on the presence of data or not (e.g. a SQL query having results) (GH1783)
- Setting parts of DataFrame/Panel using ix now aligns input Series/DataFrame (GH1630)
- first and last methods in GroupBy no longer drop non-numeric columns (GH1809)
- Resolved inconsistencies in specifying custom NA values in text parser. na_values of type dict no longer override default NAs unless keep_default_na is set to false explicitly (GH1657)
- DataFrame.dot will not do data alignment, and also work with Series (GH1915)
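Several of the changes listed above can be illustrated with a short sketch. It is written against a recent pandas release rather than 0.9.0, and the sample data and variable names are assumptions made for illustration. The dot example deliberately uses matching labels so that the result does not depend on how alignment is handled.

```python
import io

import pandas as pd

# Columns of an empty DataFrame are created with object dtype (GH1783).
empty = pd.DataFrame(columns=["a", "b"])

# GroupBy.first / GroupBy.last keep non-numeric columns (GH1809).
df = pd.DataFrame(
    {"key": ["x", "x", "y"], "label": ["p", "q", "r"], "value": [1, 2, 3]}
)
firsts = df.groupby("key").first()

# Dict-valued na_values add to the default NA strings unless
# keep_default_na=False is passed explicitly (GH1657).
data = "a,b\nNA,1\nmissing,2\n3,4"
with_defaults = pd.read_csv(io.StringIO(data), na_values={"a": ["missing"]})
custom_only = pd.read_csv(
    io.StringIO(data), na_values={"a": ["missing"]}, keep_default_na=False
)

# DataFrame.dot accepts a Series (GH1915); the labels match here.
m = pd.DataFrame([[1, 2], [3, 4]], columns=["u", "v"])
w = pd.Series([10, 1], index=["u", "v"])
product = m.dot(w)
```

In with_defaults both "NA" and "missing" in column a are read as missing, whereas in custom_only the string "NA" survives as ordinary text and only "missing" is treated as NA.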
See the full release notes or issue tracker on GitHub for a complete list.
Contributors
A total of 24 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.
- Chang She
- Christopher Whelan +
- Dan Miller +
- Daniel Shapiro +
- Dieter Vandenbussche
- Doug Coleman +
- John-Colvin +
- Johnny +
- Joshua Leahy +
- Lars Buitinck +
- Mark O’Leary +
- Martin Blais
- MinRK +
- Paul Ivanov +
- Skipper Seabold
- Spencer Lyon +
- Taavi Burns +
- Wes McKinney
- Wouter Overmeire
- Yaroslav Halchenko
- lenolib +
- tshauck +
- y-p +
- Øystein S. Haaland +