v0.9.0 (October 7, 2012)

This is a major release from 0.8.1 and includes several new features and enhancements along with a large number of bug fixes. New features include vectorized unicode encoding/decoding for Series.str, a to_latex method for DataFrame, more flexible parsing of boolean values, and the ability to download options data from Yahoo! Finance.

New features

  • Add encode and decode for unicode handling to vectorized string processing methods in Series.str (GH1706)
  • Add DataFrame.to_latex method (GH1735)
  • Add convenient expanding window equivalents of all rolling_* ops (GH1785)
  • Add Options class to pandas.io.data for fetching options data from Yahoo! Finance (GH1748, GH1739)
  • More flexible parsing of boolean values (Yes, No, TRUE, FALSE, etc) (GH1691, GH1295)
  • Add level parameter to Series.reset_index
  • TimeSeries.between_time can now select times across midnight (GH1871)
  • Series constructor can now handle generator as input (GH1679)
  • DataFrame.dropna can now take multiple axes (tuple/list) as input (GH924)
  • Enable skip_footer parameter in ExcelFile.parse (GH1843)
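A few of the features listed above can be sketched as follows. This is a minimal illustration written against a recent pandas release rather than 0.9.0 itself, so some spellings are assumptions: the expanding-window operation uses the modern Series.expanding() accessor (0.9.0 exposed module-level expanding_* functions), and "Yes"/"No" are passed explicitly through true_values and false_values. All data values are made up for the example.

```python
import io

import pandas as pd

# Vectorized encode/decode on Series.str (GH1706).
words = pd.Series(["café", "naïve"])
encoded = words.str.encode("utf-8")    # each element becomes a bytes object
decoded = encoded.str.decode("utf-8")  # and back to text

# Flexible boolean parsing in read_csv (GH1691, GH1295).
# Assumption: "Yes"/"No" are supplied via true_values/false_values, while
# spellings such as TRUE/FALSE/true need no extra arguments.
csv_text = "answer,flag\nYes,TRUE\nNo,FALSE\nYes,true\n"
flags = pd.read_csv(
    io.StringIO(csv_text), true_values=["Yes"], false_values=["No"]
)

# Series constructed from a generator (GH1679).
squares = pd.Series(x * x for x in range(4))

# between_time with a window that wraps past midnight (GH1871).
stamps = pd.to_datetime(
    [
        "2012-10-07 22:00",
        "2012-10-07 23:30",
        "2012-10-08 00:30",
        "2012-10-08 02:00",
    ]
)
ts = pd.Series(range(4), index=stamps)
overnight = ts.between_time("23:00", "01:00")

# Expanding-window sum, written with the modern accessor (GH1785).
running_total = pd.Series([1, 2, 3, 4]).expanding().sum()

print(decoded.tolist())
print(flags.dtypes)
print(squares.tolist())
print(overnight)
print(running_total.tolist())
```

Here between_time keeps the 23:30 and 00:30 observations, since the start time is later than the end time and the window is therefore read as crossing midnight.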

API changes

  • The default column names used when header=None and no column names are passed to functions like read_csv have changed to be more Pythonic and amenable to attribute access:
In [1]: import io

In [2]: data = ('0,0,1\n'
   ...:         '1,1,0\n'
   ...:         '0,1,0')

In [3]: df = pd.read_csv(io.StringIO(data), header=None)

In [4]: df
   0  1  2
0  0  0  1
1  1  1  0
2  0  1  0

[3 rows x 3 columns]
  • Creating a Series from another Series while passing an index now causes reindexing to happen internally, rather than treating the input Series like an ndarray. Technically improper usages like Series(df[col1], index=df[col2]) that worked before “by accident” (this was never intended) will produce an all-NA Series in some cases. To be perfectly clear:
In [5]: s1 = pd.Series([1, 2, 3])

In [6]: s1
0    1
1    2
2    3
Length: 3, dtype: int64

In [7]: s2 = pd.Series(s1, index=['foo', 'bar', 'baz'])

In [8]: s2
foo   NaN
bar   NaN
baz   NaN
Length: 3, dtype: float64
  • Deprecated day_of_year API removed from PeriodIndex, use dayofyear (GH1723)
  • Don’t modify NumPy suppress printoption to True at import time
  • The internal HDF5 data arrangement for DataFrames has been transposed. Legacy files will still be readable by HDFStore (GH1834, GH1824)
  • Legacy cruft removed: pandas.stats.misc.quantileTS
  • Use ISO8601 format for Period repr: monthly, daily, and on down (GH1776)
  • Empty DataFrame columns are now created as object dtype. This prevents a class of TypeErrors that occurred in code where the dtype of a column depended on whether any data was present (e.g. a SQL query that may or may not return results) (GH1783)
  • Setting parts of DataFrame/Panel using ix now aligns input Series/DataFrame (GH1630)
  • first and last methods in GroupBy no longer drop non-numeric columns (GH1809)
  • Resolved inconsistencies in specifying custom NA values in the text parser. na_values of type dict no longer override the default NAs unless keep_default_na is explicitly set to False (GH1657)
  • DataFrame.dot now performs data alignment and also works with Series (GH1915)
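Some of the API changes above can be illustrated with a short sketch. It assumes a recent pandas release (for instance, Series.to_numpy() is used where 0.9.0-era code would have used .values), and the CSV text, column names, and weights are invented for the example.

```python
import io

import pandas as pd

s1 = pd.Series([1, 2, 3])

# Passing a Series together with an index reindexes by label, so labels
# that do not exist in s1 come back as NaN.
reindexed = pd.Series(s1, index=["foo", "bar", "baz"])

# Passing the underlying array keeps the values and only attaches new labels.
relabeled = pd.Series(s1.to_numpy(), index=["foo", "bar", "baz"])

# Dict-valued na_values extend the default NA strings...
csv_text = "a,b\nmissing,NA\nx,1\n"
with_defaults = pd.read_csv(
    io.StringIO(csv_text), na_values={"a": ["missing"]}
)

# ...unless keep_default_na=False is passed explicitly, in which case the
# literal string "NA" in column b is kept as text.
without_defaults = pd.read_csv(
    io.StringIO(csv_text),
    na_values={"a": ["missing"]},
    keep_default_na=False,
)

# DataFrame.dot with a Series whose index matches the frame's columns.
df = pd.DataFrame([[1, 2], [3, 4]], columns=["x", "y"])
weights = pd.Series([10, 1], index=["x", "y"])
weighted = df.dot(weights)

print(reindexed)
print(relabeled)
print(with_defaults)
print(without_defaults)
print(weighted)
```

When the intent is to relabel existing values rather than look them up by label, passing the raw array as in relabeled is one way to avoid the all-NA result.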

See the full release notes or issue tracker on GitHub for a complete list.


A total of 24 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.

  • Chang She
  • Christopher Whelan +
  • Dan Miller +
  • Daniel Shapiro +
  • Dieter Vandenbussche
  • Doug Coleman +
  • John-Colvin +
  • Johnny +
  • Joshua Leahy +
  • Lars Buitinck +
  • Mark O’Leary +
  • Martin Blais
  • MinRK +
  • Paul Ivanov +
  • Skipper Seabold
  • Spencer Lyon +
  • Taavi Burns +
  • Wes McKinney
  • Wouter Overmeire
  • Yaroslav Halchenko
  • lenolib +
  • tshauck +
  • y-p +
  • Øystein S. Haaland +