What’s new in 0.25.1 (August 21, 2019)#

These are the changes in pandas 0.25.1. See Release notes for a full changelog including other versions of pandas.

IO and LZMA#

Some users may unknowingly have an incomplete Python installation that lacks the lzma module from the standard library. In that case, import pandas failed with an ImportError (GH 27575). pandas now warns instead of raising an ImportError when the lzma module is not present. Any subsequent attempt to use lzma methods will raise a RuntimeError.

A possible fix is to install the necessary libraries and then re-install Python. For example, on macOS, installing Python with pyenv may produce an incomplete installation if system dependencies such as xz are unmet at compilation time. Compilation will succeed, but Python might fail at run time.
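To check whether a given interpreter is affected, one option is to try importing lzma directly. The following is a minimal sketch using only the standard library; the helper name lzma_available is hypothetical and not part of pandas.

```python
def lzma_available() -> bool:
    """Return True if the standard-library lzma module can be imported.

    Hypothetical helper for illustration; pandas performs its own check.
    """
    try:
        import lzma  # noqa: F401  (fails if Python was built without xz/liblzma)
    except ImportError:
        return False
    return True


has_lzma = lzma_available()
if not has_lzma:
    # Reading or writing .xz files through pandas would fail on this interpreter.
    print("lzma is missing; install the xz libraries and re-install Python")
```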

Bug fixes#

Categorical#

  • Bug in Categorical.fillna() that would replace all values, not just those that are NaN (GH 26215)
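As an illustration of the intended behavior, the sketch below fills only the missing entries of a small Categorical. The sample values are made up for this example.

```python
import numpy as np
import pandas as pd

# Two missing entries among otherwise valid categories (illustrative data).
cat = pd.Categorical(["a", np.nan, "b", np.nan], categories=["a", "b"])

# Only the NaN positions should be replaced; "b" must be left untouched.
filled = cat.fillna("a")
filled_values = list(filled)
```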

Datetimelike#

  • Bug in to_datetime() where passing a timezone-naive DatetimeArray or DatetimeIndex and utc=True would incorrectly return a timezone-naive result (GH 27733)

  • Bug in Period.to_timestamp() where a Period outside the Timestamp implementation bounds (roughly 1677-09-21 to 2262-04-11) would return an incorrect Timestamp instead of raising OutOfBoundsDatetime (GH 19643)

  • Bug in iterating over DatetimeIndex when the underlying data is read-only (GH 28055)
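The to_datetime() fix above can be checked with a short sketch like the following, which passes a timezone-naive DatetimeIndex together with utc=True. The timestamp is arbitrary sample data.

```python
import pandas as pd

# A timezone-naive index (illustrative value).
naive = pd.DatetimeIndex(["2019-08-21 12:00"])

# With utc=True the result is expected to be timezone-aware (UTC).
aware = pd.to_datetime(naive, utc=True)
tz_name = str(aware.tz)
```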

Timezones#

  • Bug in Index where a NumPy object array with a timezone-aware Timestamp and np.nan would not return a DatetimeIndex (GH 27011)

Numeric#

  • Bug in Series.interpolate() when using a timezone-aware DatetimeIndex (GH 27548)

  • Bug where printing negative floating-point complex numbers would raise an IndexError (GH 27484)

  • Bug where DataFrame arithmetic operators such as DataFrame.mul() with a Series with axis=1 would raise an AttributeError on a DataFrame larger than the minimum threshold to invoke numexpr (GH 27636)

  • Bug in DataFrame arithmetic where missing values in results were incorrectly masked with NaN instead of Inf (GH 27464)

Conversion#

  • Improved the warnings for the deprecated methods Series.real() and Series.imag() (GH 27610)

Interval#

  • Bug in IntervalIndex where dir(obj) would raise ValueError (GH 27571)
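A quick way to confirm that introspection works is to call dir() on an IntervalIndex, as in this small sketch. The interval range is arbitrary.

```python
import pandas as pd

# Three unit-width intervals: (0, 1], (1, 2], (2, 3].
idx = pd.interval_range(start=0, end=3)

# dir() should return the attribute names without raising.
names = dir(idx)
has_left = "left" in names
```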

Indexing#

  • Bug in partial-string indexing returning a NumPy array rather than a Series when indexing with a scalar like .loc['2015'] (GH 27516)

  • Break reference cycle involving Index and other index classes to allow garbage collection of index objects without running the GC. (GH 27585, GH 27840)

  • Fix regression in assigning values to a single column of a DataFrame with MultiIndex columns (GH 27841).

  • Fix regression in .ix fallback with an IntervalIndex (GH 27865).
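For the partial-string indexing fix, the sketch below shows the kind of lookup involved: selecting a whole year from a Series with a DatetimeIndex. The dates and values are sample data, and the example does not claim to reproduce the exact case from the issue.

```python
import pandas as pd

index = pd.to_datetime(["2014-12-30", "2014-12-31", "2015-01-01", "2015-01-02"])
s = pd.Series(range(4), index=index)

# Partial-string indexing: "2015" selects every row that falls in that year.
subset = s.loc["2015"]
subset_values = subset.tolist()
```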

Missing#

IO#

  • Avoid calling S3File.s3 when reading parquet, as this was removed in s3fs version 0.3.0 (GH 27756)

  • Better error message when a negative header is passed in pandas.read_csv() (GH 27779)

  • Follow the min_rows display option (introduced in v0.25.0) correctly in the HTML repr in the notebook (GH 27991).
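The read_csv() change can be seen by passing a negative header, as sketched below. The CSV text is made up, and the exact wording of the error message is not assumed here; the sketch only relies on a ValueError being raised.

```python
import io

import pandas as pd

csv_text = "a,b\n1,2\n"

error_message = None
try:
    # A negative header is invalid and is rejected with a ValueError.
    pd.read_csv(io.StringIO(csv_text), header=-1)
except ValueError as exc:
    error_message = str(exc)

# To read a file without a header row, pass header=None instead.
no_header = pd.read_csv(io.StringIO(csv_text), header=None)
```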

Plotting#

GroupBy/resample/rolling#

  • Fixed regression in pandas.core.groupby.DataFrameGroupBy.quantile() raising when multiple quantiles are given (GH 27526)

  • Bug in DataFrameGroupBy.transform() where applying a timezone conversion lambda function would drop timezone information (GH 27496)

  • Bug in GroupBy.nth() where observed=False was being ignored for Categorical groupers (GH 26385)

  • Bug in windowing over read-only arrays (GH 27766)

  • Fixed segfault in DataFrameGroupBy.quantile() when an invalid quantile was passed (GH 27470)
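As an illustration of the multiple-quantile case, the following sketch requests two quantiles per group. The column names key and val and the data are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "key": ["a", "a", "a", "b", "b", "b"],
        "val": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
    }
)

# Passing a list of quantiles yields one row per (group, quantile) pair.
result = df.groupby("key").quantile([0.25, 0.75])

a_low = result.loc[("a", 0.25), "val"]
b_high = result.loc[("b", 0.75), "val"]
```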

Reshaping#

  • A KeyError is now raised if .unstack() is called on a Series or DataFrame with a flat Index and is passed a name that does not match the name of that Index (GH 18303)

  • Bug where merge_asof() could not merge Timedelta objects when passing the tolerance kwarg (GH 27642)

  • Bug in pandas.crosstab() where an error was raised when margins was set to True and normalize was not False (GH 27500)

  • DataFrame.join() now suppresses the FutureWarning when the sort parameter is specified (GH 21952)

  • Bug in DataFrame.join() raising with readonly arrays (GH 27943)
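The crosstab fix in this section concerns combining margins=True with normalization. A minimal sketch with made-up labels follows; it uses the top-level pandas.crosstab function.

```python
import pandas as pd

rows = pd.Series(["x", "x", "y"], name="row")
cols = pd.Series(["p", "q", "p"], name="col")

# Normalize over all cells and append the "All" margins row and column.
table = pd.crosstab(rows, cols, margins=True, normalize=True)

# The grand total in the bottom-right corner is the sum of all proportions.
grand_total = table.loc["All", "All"]
```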

Sparse#

  • Bug in reductions for Series with Sparse dtypes (GH 27080)
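A reduction over a Series with a Sparse dtype might look like the sketch below. The values are illustrative, and the example does not claim to reproduce the specific failure from the issue.

```python
import pandas as pd

# Integer sparse data; zeros are the default fill value and are not stored.
sparse_series = pd.Series(pd.arrays.SparseArray([1, 0, 0, 2]))

# Reductions should account for both stored values and the fill value.
total = sparse_series.sum()
```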

Other#

  • Bug in Series.replace() and DataFrame.replace() when replacing timezone-aware timestamps using a dict-like replacer (GH 27720)

  • Bug in Series.rename() when using a custom type indexer. Now any value that isn’t callable or dict-like is treated as a scalar. (GH 27814)
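To illustrate the Series.rename() rule, the sketch below passes an instance of a custom class that is neither callable nor dict-like, so it is used as the new name of the Series. The Label class is hypothetical and exists only for this example.

```python
import pandas as pd


class Label:
    """Hypothetical custom type: not callable and not dict-like."""

    def __init__(self, text: str) -> None:
        self.text = text

    def __repr__(self) -> str:
        return f"Label({self.text!r})"


s = pd.Series([1, 2, 3])
label = Label("measurement")

# Treated as a scalar: the Series name changes, the index labels do not.
renamed = s.rename(label)
```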

Contributors#

A total of 5 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.

  • Jeff Reback

  • Joris Van den Bossche

  • MeeseeksMachine +

  • Tom Augspurger

  • jbrockmendel