Version 0.9.0 (October 7, 2012)

This is a major release from 0.8.1 and includes several new features and enhancements along with a large number of bug fixes. New features include vectorized unicode encoding/decoding for Series.str, a to_latex method for DataFrame, more flexible parsing of boolean values, and the ability to download options data from Yahoo! Finance.

New features

  • Add encode and decode for unicode handling to vectorized string processing methods in Series.str (GH1706)

  • Add DataFrame.to_latex method (GH1735)

  • Add convenient expanding window equivalents of all rolling_* ops (GH1785)

  • Add Options class to pandas.io.data for fetching options data from Yahoo! Finance (GH1748, GH1739)

  • More flexible parsing of boolean values (Yes, No, TRUE, FALSE, etc.) (GH1691, GH1295)

  • Add level parameter to Series.reset_index

  • TimeSeries.between_time can now select times across midnight (GH1871)

  • Series constructor can now handle generator as input (GH1679)

  • DataFrame.dropna can now take multiple axes (tuple/list) as input (GH924)

  • Enable skip_footer parameter in ExcelFile.parse (GH1843)
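
A few of the features listed above can be sketched as follows. This is a minimal illustration written against a recent pandas, not the 0.9.0 code itself; the sample data and variable names are made up, and the Yes/No spellings are passed explicitly through true_values and false_values rather than relying on parser defaults.

```python
import io

import pandas as pd

# Vectorized encode/decode through the Series.str accessor (GH1706).
names = pd.Series(["a", "b", "ü"])
encoded = names.str.encode("utf-8")    # Series of bytes objects
decoded = encoded.str.decode("utf-8")  # back to text

# Boolean parsing in read_csv. The accepted spellings are listed
# explicitly here (an assumption made to keep the example unambiguous).
csv_text = "flag\nYes\nNo\nYes\n"
flags = pd.read_csv(
    io.StringIO(csv_text),
    true_values=["Yes"],
    false_values=["No"],
)

# The Series constructor consumes a generator (GH1679).
squares = pd.Series(x * x for x in range(4))

print(decoded.tolist())
print(flags["flag"].tolist())
print(squares.tolist())
```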

API changes

  • The default column names used when header=None and no column names are passed to functions like read_csv have changed to be more Pythonic and amenable to attribute access:

In [1]: import io

In [2]: data = """
   ...: 0,0,1
   ...: 1,1,0
   ...: 0,1,0
   ...: """
   ...: 

In [3]: df = pd.read_csv(io.StringIO(data), header=None)

In [4]: df
Out[4]: 
   0  1  2
0  0  0  1
1  1  1  0
2  0  1  0

[3 rows x 3 columns]

  • Creating a Series from another Series while passing an index now causes reindexing to happen inside the constructor, rather than treating the input Series like an ndarray. Technically improper usages like Series(df[col1], index=df[col2]) that worked before “by accident” (this was never intended) will lead to an all-NA Series in some cases. To be perfectly clear:

In [5]: s1 = pd.Series([1, 2, 3])

In [6]: s1
Out[6]: 
0    1
1    2
2    3
Length: 3, dtype: int64

In [7]: s2 = pd.Series(s1, index=["foo", "bar", "baz"])

In [8]: s2
Out[8]: 
foo   NaN
bar   NaN
baz   NaN
Length: 3, dtype: float64

  • Deprecated day_of_year API removed from PeriodIndex; use dayofyear instead (GH1723)

  • pandas no longer sets the NumPy suppress print option to True at import time

  • The internal HDF5 data arrangement for DataFrames has been transposed. Legacy files will still be readable by HDFStore (GH1834, GH1824)

  • Legacy cruft removed: pandas.stats.misc.quantileTS

  • Use ISO8601 format for Period repr: monthly, daily, and on down (GH1776)

  • Empty DataFrame columns are now created as object dtype. This prevents a class of TypeErrors that occurred in code where the dtype of a column depended on whether any data was present (e.g. a SQL query that may or may not return results) (GH1783)

  • Setting parts of DataFrame/Panel using ix now aligns input Series/DataFrame (GH1630)

  • first and last methods in GroupBy no longer drop non-numeric columns (GH1809)

  • Resolved inconsistencies in specifying custom NA values in the text parser. na_values of type dict no longer override the default NAs unless keep_default_na is explicitly set to False (GH1657)

  • DataFrame.dot will now do data alignment, and also works with Series (GH1915)
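
Two of the API changes above, sketched under the assumption of a recent pandas (the to_numpy call and the sample frame are illustrative and not part of the 0.9.0 notes): passing the underlying array instead of the Series keeps the old positional behaviour, and GroupBy.first retains non-numeric columns.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3])
labels = ["foo", "bar", "baz"]

# Passing a Series plus an index reindexes on labels; none of the
# new labels exist in s1, so every value comes back missing.
reindexed = pd.Series(s1, index=labels)

# Passing the raw values instead assigns them by position.
positional = pd.Series(s1.to_numpy(), index=labels)

# GroupBy.first keeps the non-numeric "name" column (GH1809).
people = pd.DataFrame(
    {
        "team": ["a", "a", "b"],
        "name": ["x", "y", "z"],
        "score": [1, 2, 3],
    }
)
first_rows = people.groupby("team").first()

print(reindexed.isna().all())
print(positional.tolist())
print(list(first_rows.columns))
```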

See the full release notes or issue tracker on GitHub for a complete list.

Contributors

A total of 24 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.

  • Chang She

  • Christopher Whelan +

  • Dan Miller +

  • Daniel Shapiro +

  • Dieter Vandenbussche

  • Doug Coleman +

  • John-Colvin +

  • Johnny +

  • Joshua Leahy +

  • Lars Buitinck +

  • Mark O’Leary +

  • Martin Blais

  • MinRK +

  • Paul Ivanov +

  • Skipper Seabold

  • Spencer Lyon +

  • Taavi Burns +

  • Wes McKinney

  • Wouter Overmeire

  • Yaroslav Halchenko

  • lenolib +

  • tshauck +

  • y-p +

  • Øystein S. Haaland +