Version 0.20.2 (June 4, 2017)

This is a minor bug-fix release in the 0.20.x series and includes some small regression fixes, bug fixes and performance improvements. We recommend that all users upgrade to this version.

Enhancements

  • Unblocked access to additional compression types supported in pytables: ‘blosc:blosclz’, ‘blosc:lz4’, ‘blosc:lz4hc’, ‘blosc:snappy’, ‘blosc:zlib’, ‘blosc:zstd’ (GH 14478)

  • Series now provides a to_latex() method (GH 16180)

  • A new groupby method ngroup(), parallel to the existing cumcount(), has been added to return the group order (GH 11642)
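
As a rough illustration of how ngroup() relates to cumcount(), the sketch below numbers the groups and the rows within each group for a small, made-up DataFrame. The column name "key" and the sample values are assumptions chosen for the example.

```python
import pandas as pd

# Illustrative data: the column name and values are made up for this sketch.
df = pd.DataFrame({"key": ["a", "b", "a", "c", "b"]})
grouped = df.groupby("key")

# ngroup() labels every row with the number of the group it belongs to
# (groups are numbered in sorted key order by default).
group_number = grouped.ngroup()

# cumcount() labels every row with its position inside its own group.
row_within_group = grouped.cumcount()

print(pd.DataFrame({"key": df["key"],
                    "ngroup": group_number,
                    "cumcount": row_within_group}))
```

Used together, the two methods give each row a (group, position) pair without materializing the groups.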

Performance improvements

  • Performance regression fix when indexing with a list-like (GH 16285)

  • Performance regression fix for MultiIndexes (GH 16319, GH 16346)

  • Improved performance of .clip() with scalar arguments (GH 15400)

  • Improved performance of groupby with categorical groupers (GH 16413)

  • Improved performance of MultiIndex.remove_unused_levels() (GH 16556)

Bug fixes

  • Silenced a warning on some Windows environments about “tput: terminal attributes: No such device or address” when detecting the terminal size. This fix only applies to Python 3 (GH 16496)

  • Bug in using pathlib.Path or py.path.local objects with io functions (GH 16291)

  • Bug in Index.symmetric_difference() on two equal MultiIndexes, resulting in a TypeError (GH 13490)

  • Bug in DataFrame.update() with overwrite=False and NaN values (GH 15593)

  • Passing an invalid engine to read_csv() now raises an informative ValueError rather than UnboundLocalError. (GH 16511)

  • Bug in unique() on an array of tuples (GH 16519)

  • Bug in cut() when labels are set, resulting in incorrect label ordering (GH 16459)

  • Fixed a compatibility issue with IPython 6.0’s tab completion showing deprecation warnings on Categoricals (GH 16409)
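
The read_csv() engine fix above can be checked with a short sketch like the following. The engine name "not-an-engine" and the inline CSV text are placeholders invented for the example; the exact wording of the error message may differ between pandas versions, so it is only printed, not relied upon.

```python
import io

import pandas as pd

csv_text = "a,b\n1,2\n"  # tiny in-memory CSV, made up for this sketch

try:
    # "not-an-engine" is deliberately invalid.
    pd.read_csv(io.StringIO(csv_text), engine="not-an-engine")
except ValueError as exc:
    # The invalid engine is reported as a ValueError.
    error_message = str(exc)
else:
    error_message = None

print(error_message)
```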

Conversion

  • Bug in to_numeric() in which empty data inputs were causing a segfault of the interpreter (GH 16302)

  • Silenced NumPy warnings when broadcasting DataFrame to Series with comparison ops (GH 16378, GH 16306)
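
A minimal sketch of the to_numeric() case, assuming a pandas version that includes the fix: empty inputs should simply come back as empty results. The resulting dtype is not asserted here because it may vary by version.

```python
import pandas as pd

# Empty list input: expected to return an empty array rather than crash.
empty_from_list = pd.to_numeric([])

# Empty object-dtype Series input: expected to return an empty Series.
empty_from_series = pd.to_numeric(pd.Series([], dtype=object))

print(len(empty_from_list), len(empty_from_series))
```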

Indexing

  • Bug in DataFrame.reset_index(level=) with single level index (GH 16263)

  • Bug in partial string indexing with a monotonic, but not strictly-monotonic, index incorrectly reversing the slice bounds (GH 16515)

  • Bug in MultiIndex.remove_unused_levels() that would not return a MultiIndex equal to the original. (GH 16556)
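
To make the remove_unused_levels() item concrete, here is a small sketch with made-up level values. It shows that slicing a MultiIndex keeps the original levels, and that trimming them should yield an index equal to the sliced one.

```python
import pandas as pd

# Illustrative index: the level values are made up for this sketch.
mi = pd.MultiIndex.from_product([[0, 1], ["a", "b"]])

# Slicing keeps only entries whose first level is 1,
# but the level values [0, 1] are still stored on the index.
sliced = mi[2:]

# remove_unused_levels() drops level values that no entry refers to.
trimmed = sliced.remove_unused_levels()

print(list(sliced.levels[0]))   # levels before trimming
print(list(trimmed.levels[0]))  # levels after trimming
print(trimmed.equals(sliced))   # the entries themselves are unchanged
```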

IO

  • Bug in read_csv() when comment is passed in a space-delimited text file (GH 16472)

  • Bug in read_csv() not raising an exception with nonexistent columns in usecols when it had the correct length (GH 14671)

  • Bug that would force importing of the clipboard routines unnecessarily, potentially causing an import error on startup (GH 16288)

  • Bug that raised IndexError when HTML-rendering an empty DataFrame (GH 15953)

  • Bug in read_csv() in which tarfile object inputs were raising an error in Python 2.x for the C engine (GH 16530)

  • Bug where DataFrame.to_html() ignored the index_names parameter (GH 16493)

  • Bug where pd.read_hdf() returned NumPy strings for index names (GH 13492)

  • Bug in HDFStore.select_as_multiple() where start/stop arguments were not respected (GH 16209)
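
The usecols item above concerns a list that has the same length as the file's columns but names a column that does not exist. A hedged sketch with an invented in-memory CSV and an invented column name "missing":

```python
import io

import pandas as pd

csv_text = "a,b,c\n1,2,3\n"  # three columns, made up for this sketch

try:
    # Same number of names as columns, but "missing" is not in the file.
    pd.read_csv(io.StringIO(csv_text), usecols=["a", "b", "missing"])
except ValueError as exc:
    usecols_error = str(exc)
else:
    usecols_error = None

print(usecols_error)
```

The message text is printed rather than compared, since its wording is version dependent.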

Plotting

  • Bug in DataFrame.plot with a single column and a list-like color (GH 3486)

  • Bug in plot where NaT in DatetimeIndex results in Timestamp.min (GH 12405)

  • Bug in DataFrame.boxplot where figsize keyword was not respected for non-grouped boxplots (GH 11959)

GroupBy/resample/rolling

  • Bug in creating a time-based rolling window on an empty DataFrame (GH 15819)

  • Bug in rolling.cov() with offset window (GH 16058)

  • Bug in .resample() and .groupby() when aggregating on integers (GH 16361)
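
As a sketch of the first item in this list, a time-based rolling window can be applied to an empty DataFrame that has a DatetimeIndex. The "1s" window is an arbitrary choice for illustration.

```python
import pandas as pd

# An empty frame whose index is datetime-like, so an offset window is valid.
empty = pd.DataFrame(index=pd.DatetimeIndex([]))

# With the fix, this is expected to return an empty result instead of failing.
result = empty.rolling("1s").sum()

print(len(result))
```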

Sparse

  • Bug in construction of SparseDataFrame from scipy.sparse.dok_matrix (GH 16179)

Reshaping

  • Bug in DataFrame.stack with unsorted levels in MultiIndex columns (GH 16323)

  • Bug in pd.wide_to_long() where no error was raised when i was not a unique identifier (GH 16382)

  • Bug in Series.isin() with a list of tuples (GH 16394)

  • Bug in construction of a DataFrame with mixed dtypes including an all-NaT column. (GH 16395)

  • Bug in DataFrame.agg() and Series.agg() with aggregating on non-callable attributes (GH 16405)
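
The Series.isin() item can be illustrated with a Series whose elements are tuples. The column names and values below are made up for the sketch.

```python
import pandas as pd

# Illustrative data: build a column of (A, B) tuples.
df = pd.DataFrame({"A": [1, 2, 3], "B": ["a", "b", "f"]})
df["C"] = list(zip(df["A"], df["B"]))

# Each element of the list passed to isin() is treated as one tuple value.
mask = df["C"].isin([(1, "a")])

print(mask.tolist())
```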

Numeric

  • Bug in .interpolate(), where limit_direction was not respected when limit=None (default) was passed (GH 16282)
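
A short sketch of how limit_direction is expected to behave when limit is left at its default of None, using the default linear method on a made-up Series with missing values at both ends:

```python
import numpy as np
import pandas as pd

# Illustrative data with NaN at the start, in the middle, and at the end.
s = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan])

# No limit is passed, so only the direction controls which gaps are filled.
forward = s.interpolate(limit_direction="forward")    # leading NaN is kept
backward = s.interpolate(limit_direction="backward")  # trailing NaN is kept
both = s.interpolate(limit_direction="both")          # every gap is filled

print(forward.tolist())
print(backward.tolist())
print(both.tolist())
```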

Categorical

  • Fixed comparison operations considering the order of the categories when both categoricals are unordered (GH 16014)
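
To illustrate the fix above: two unordered categoricals that hold the same values but list their categories in a different order should compare equal element by element. The values are made up for the sketch.

```python
import pandas as pd

# Same values, same set of categories, different category order.
left = pd.Categorical(["a", "b", "a"], categories=["a", "b"])
right = pd.Categorical(["a", "b", "a"], categories=["b", "a"])

# Both are unordered, so the category order should not affect the result.
equal_mask = left == right

print(equal_mask.tolist())
```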

Other

  • Bug in DataFrame.drop() with an empty list and non-unique indices (GH 16270)
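
A minimal sketch of the drop() case, assuming a pandas version with the fix: dropping an empty list of labels from a frame with a non-unique index should leave the frame unchanged. The data are made up for illustration.

```python
import pandas as pd

# Illustrative frame whose index contains a repeated label.
df = pd.DataFrame({"a": [1, 2, 3]}, index=[0, 0, 1])

# Nothing is requested for removal, so the result should match the input.
result = df.drop([])

print(result.equals(df))
```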

Contributors

A total of 34 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.

  • Aaron Barber +

  • Andrew 亮 +

  • Becky Sweger +

  • Christian Prinoth +

  • Christian Stade-Schuldt +

  • DSM

  • Erik Fredriksen +

  • Hugues Valois +

  • Jeff Reback

  • Jeff Tratner

  • JimStearns206 +

  • John W. O’Brien

  • Joris Van den Bossche

  • JosephWagner +

  • Keith Webber +

  • Mehmet Ali “Mali” Akmanalp +

  • Pankaj Pandey

  • Patrick Luo +

  • Patrick O’Melveny +

  • Pietro Battiston

  • RobinFiveWords +

  • Ryan Hendrickson +

  • SimonBaron +

  • Tom Augspurger

  • WBare +

  • bpraggastis +

  • chernrick +

  • chris-b1

  • economy +

  • gfyoung

  • jaredsnyder +

  • keitakurita +

  • linebp

  • lloydkirk +