Version 0.20.2 (June 4, 2017)

This is a minor bug-fix release in the 0.20.x series and includes some small regression fixes, bug fixes and performance improvements. We recommend that all users upgrade to this version.

Enhancements

  • Unblocked access to additional compression types supported in pytables: ‘blosc:blosclz’, ‘blosc:lz4’, ‘blosc:lz4hc’, ‘blosc:snappy’, ‘blosc:zlib’, ‘blosc:zstd’ (GH14478)

  • Series provides a to_latex method (GH16180)

  • A new groupby method ngroup(), parallel to the existing cumcount(), has been added to return the group order (GH11642)
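
As a minimal sketch of how ngroup() relates to cumcount(), the snippet below numbers the groups of a small frame and, for comparison, numbers the rows within each group. The frame and the column name key are made up for illustration.

```python
import pandas as pd

# Hypothetical example data: six rows falling into two groups.
df = pd.DataFrame({"key": list("aaabba")})
grouped = df.groupby("key")

# ngroup() labels each row with the number of the group it belongs to.
group_numbers = grouped.ngroup()

# cumcount() labels each row with its position inside its own group.
row_positions = grouped.cumcount()

print(pd.DataFrame({"key": df["key"],
                    "ngroup": group_numbers,
                    "cumcount": row_positions}))
```

With the default sorted grouping, groups are numbered in the sorted order of their keys, so every "a" row gets 0 and every "b" row gets 1.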

Performance improvements

  • Performance regression fix when indexing with a list-like (GH16285)

  • Performance regression fix for MultiIndexes (GH16319, GH16346)

  • Improved performance of .clip() with scalar arguments (GH15400)

  • Improved performance of groupby with categorical groupers (GH16413)

  • Improved performance of MultiIndex.remove_unused_levels() (GH16556)

Bug fixes

  • Silenced a warning on some Windows environments about “tput: terminal attributes: No such device or address” when detecting the terminal size. This fix only applies to Python 3 (GH16496)

  • Bug in using pathlib.Path or py.path.local objects with io functions (GH16291)

  • Bug in Index.symmetric_difference() on two equal MultiIndexes, which resulted in a TypeError (GH13490)

  • Bug in DataFrame.update() with overwrite=False and NaN values (GH15593)

  • Passing an invalid engine to read_csv() now raises an informative ValueError rather than an UnboundLocalError (GH16511)

  • Bug in unique() on an array of tuples (GH16519)

  • Bug in cut() when labels are set, resulting in incorrect label ordering (GH16459)

  • Fixed a compatibility issue with IPython 6.0’s tab completion showing deprecation warnings on Categoricals (GH16409)
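
The read_csv() engine fix in the list above can be checked with a short sketch like the following. The engine name no-such-engine is deliberately invalid, and the exact wording of the error message is not assumed here.

```python
import io

import pandas as pd

csv_text = "a,b\n1,2\n"

try:
    # "no-such-engine" is a deliberately invalid, made-up engine name.
    pd.read_csv(io.StringIO(csv_text), engine="no-such-engine")
    error = None
except ValueError as exc:
    error = exc

# Show the exception type and the message that pandas produced.
print(type(error).__name__, error)
```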

Conversion

  • Bug in to_numeric() in which empty data inputs were causing a segfault of the interpreter (GH16302)

  • Silence numpy warnings when broadcasting DataFrame to Series with comparison ops (GH16378, GH16306)
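
For the to_numeric() fix, a minimal check is to pass an empty input and confirm that an empty result comes back instead of the interpreter crashing. This sketch assumes a pandas version that includes the fix.

```python
import pandas as pd

# An empty list is the simplest "empty data input".
result = pd.to_numeric([])

# The call should return an empty container rather than crash.
print(len(result))
```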

Indexing

  • Bug in DataFrame.reset_index(level=) with single level index (GH16263)

  • Bug in partial string indexing with a monotonic, but not strictly-monotonic, index incorrectly reversing the slice bounds (GH16515)

  • Bug in MultiIndex.remove_unused_levels() that would not return a MultiIndex equal to the original (GH16556)
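
The remove_unused_levels() behaviour can be illustrated roughly as follows: slicing a MultiIndex keeps the original levels, and removing the unused ones should leave an index that still compares equal to the sliced one. The index values are made up for illustration.

```python
import pandas as pd

mi = pd.MultiIndex.from_product([[0, 1], ["a", "b"]])

# Slicing keeps the full set of levels, even those no longer referenced.
sliced = mi[2:]

# Drop levels that no remaining entry refers to.
trimmed = sliced.remove_unused_levels()

print(list(sliced.levels[0]))
print(list(trimmed.levels[0]))

# The trimmed index should still describe the same entries.
print(trimmed.equals(sliced))
```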

IO

  • Bug in read_csv() when comment is passed in a space delimited text file (GH16472)

  • Bug in read_csv() not raising an exception with nonexistent columns in usecols when it had the correct length (GH14671)

  • Bug that would force importing of the clipboard routines unnecessarily, potentially causing an import error on startup (GH16288)

  • Bug that raised IndexError when HTML-rendering an empty DataFrame (GH15953)

  • Bug in read_csv() in which tarfile object inputs were raising an error in Python 2.x for the C engine (GH16530)

  • Bug where DataFrame.to_html() ignored the index_names parameter (GH16493)

  • Bug where pd.read_hdf() returns numpy strings for index names (GH13492)

  • Bug in HDFStore.select_as_multiple() where start/stop arguments were not respected (GH16209)
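
As a sketch of the usecols fix listed above, the call below asks for two columns (the same number the file has), one of which does not exist. With the fix in place this is expected to raise rather than silently succeed. The column name z is hypothetical.

```python
import io

import pandas as pd

csv_text = "a,b\n1,2\n"

try:
    # Two names are requested and the file has two columns,
    # but "z" is not one of them.
    pd.read_csv(io.StringIO(csv_text), usecols=["a", "z"])
    error = None
except ValueError as exc:
    error = exc

print(type(error).__name__, error)
```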

Plotting

  • Bug in DataFrame.plot with a single column and a list-like color (GH3486)

  • Bug in plot where NaT in DatetimeIndex results in Timestamp.min (GH12405)

  • Bug in DataFrame.boxplot where figsize keyword was not respected for non-grouped boxplots (GH11959)

GroupBy/resample/rolling

  • Bug in creating a time-based rolling window on an empty DataFrame (GH15819)

  • Bug in rolling.cov() with offset window (GH16058)

  • Bug in .resample() and .groupby() when aggregating on integers (GH16361)

Sparse

  • Bug in construction of SparseDataFrame from scipy.sparse.dok_matrix (GH16179)

Reshaping

  • Bug in DataFrame.stack with unsorted levels in MultiIndex columns (GH16323)

  • Bug in pd.wide_to_long() where no error was raised when i was not a unique identifier (GH16382)

  • Bug in Series.isin() with a list of tuples (GH16394)

  • Bug in construction of a DataFrame with mixed dtypes including an all-NaT column (GH16395)

  • Bug in DataFrame.agg() and Series.agg() with aggregating on non-callable attributes (GH16405)
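
The pd.wide_to_long() change in the list above can be sketched as below: when the i column does not uniquely identify each row, the call is expected to raise. The frame and its column names are invented for illustration.

```python
import pandas as pd

# Hypothetical wide data in which "id" is duplicated,
# so it cannot serve as a unique row identifier.
wide = pd.DataFrame({"id": [1, 1],
                     "A2016": [10, 20],
                     "A2017": [30, 40]})

try:
    pd.wide_to_long(wide, stubnames="A", i="id", j="year")
    error = None
except ValueError as exc:
    error = exc

print(type(error).__name__, error)
```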

Numeric

  • Bug in .interpolate(), where limit_direction was not respected when limit=None (default) was passed (GH16282)
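
A small sketch of the limit_direction behaviour with the default limit=None, assuming the default linear method: the direction decides whether leading NaN values, trailing ones, or both are filled. The series is made up for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan])

# No limit is passed, so limit=None applies in each call.
forward = s.interpolate(limit_direction="forward")
backward = s.interpolate(limit_direction="backward")
both = s.interpolate(limit_direction="both")

print(pd.DataFrame({"forward": forward,
                    "backward": backward,
                    "both": both}))
```

Filling forward leaves the leading NaN in place, filling backward leaves the trailing one, and "both" fills every gap.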

Categorical

  • Fixed comparison operations considering the order of the categories when both categoricals are unordered (GH16014)
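
To illustrate the categorical fix, the sketch below compares two unordered categoricals whose categories are the same set listed in a different order. With the fix, the comparison should depend on the values only. The data is invented for illustration.

```python
import pandas as pd

# Same category set, listed in a different order; both are unordered.
left = pd.Categorical(["a", "b", "a"], categories=["a", "b"])
right = pd.Categorical(["a", "b", "b"], categories=["b", "a"])

# Element-wise comparison by value, ignoring category order.
result = left == right
print(list(result))
```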

Other

  • Bug in DataFrame.drop() with an empty list and non-unique indices (GH16270)
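
A minimal sketch of the DataFrame.drop() case, assuming a version with the fix: dropping an empty list of labels from a frame with a repeated index label should return the frame unchanged. The frame is made up for illustration.

```python
import pandas as pd

# The label "a" appears twice, so the index is non-unique.
df = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "a", "b"])

# Dropping nothing should leave every row in place.
result = df.drop([])
print(result)
```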

Contributors

A total of 34 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.

  • Aaron Barber +

  • Andrew 亮 +

  • Becky Sweger +

  • Christian Prinoth +

  • Christian Stade-Schuldt +

  • DSM

  • Erik Fredriksen +

  • Hugues Valois +

  • Jeff Reback

  • Jeff Tratner

  • JimStearns206 +

  • John W. O’Brien

  • Joris Van den Bossche

  • JosephWagner +

  • Keith Webber +

  • Mehmet Ali “Mali” Akmanalp +

  • Pankaj Pandey

  • Patrick Luo +

  • Patrick O’Melveny +

  • Pietro Battiston

  • RobinFiveWords +

  • Ryan Hendrickson +

  • SimonBaron +

  • Tom Augspurger

  • WBare +

  • bpraggastis +

  • chernrick +

  • chris-b1

  • economy +

  • gfyoung

  • jaredsnyder +

  • keitakurita +

  • linebp

  • lloydkirk +