This is a minor bug-fix release in the 0.23.x series and includes some small regression fixes and bug fixes. We recommend that all users upgrade to this version.
Warning
Starting January 1, 2019, pandas feature releases will support Python 3 only. See Dropping Python 2.7 for more.
What’s new in v0.23.1
Fixed regressions
Performance improvements
Bug fixes
Contributors
Fixed regressions
Comparing Series with datetime.date
We’ve reverted a 0.23.0 change to comparing a Series holding datetimes and a datetime.date object (GH21152). In pandas 0.22 and earlier, comparing a Series holding datetimes and datetime.date objects would coerce the datetime.date to a datetime before comparing. This was inconsistent with Python, NumPy, and DatetimeIndex, which never consider a datetime and datetime.date equal.
In 0.23.0, we unified operations between DatetimeIndex and Series, and in the process changed comparisons between a Series of datetimes and datetime.date without warning.
We’ve temporarily restored the 0.22.0 behavior, so datetimes and dates may again compare equal, but we will restore the 0.23.0 behavior in a future release.
To summarize, here’s the behavior in 0.22.0, 0.23.0, and 0.23.1:
    # 0.22.0... Silently coerce the datetime.date
    >>> import datetime
    >>> import pandas as pd
    >>> pd.Series(pd.date_range('2017', periods=2)) == datetime.date(2017, 1, 1)
    0     True
    1    False
    dtype: bool

    # 0.23.0... Do not coerce the datetime.date
    >>> pd.Series(pd.date_range('2017', periods=2)) == datetime.date(2017, 1, 1)
    0    False
    1    False
    dtype: bool

    # 0.23.1... Coerce the datetime.date with a warning
    >>> pd.Series(pd.date_range('2017', periods=2)) == datetime.date(2017, 1, 1)
    /bin/python:1: FutureWarning: Comparing Series of datetimes with 'datetime.date'.  Currently, the
    'datetime.date' is coerced to a datetime. In the future pandas will
    not coerce, and the values not compare equal to the 'datetime.date'.
    To retain the current behavior, convert the 'datetime.date' to a
    datetime with 'pd.Timestamp'.
      #!/bin/python3
    0     True
    1    False
    dtype: bool
In addition, ordering comparisons will raise a TypeError in the future.
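The conversion suggested by the warning can be sketched as follows. This is a minimal illustration rather than an official recipe: the variable names are ours, and the second comparison through Series.dt.date is just one possible way to compare calendar dates on both sides.

```python
import datetime

import pandas as pd

s = pd.Series(pd.date_range('2017', periods=2))
d = datetime.date(2017, 1, 1)

# Convert the date to a Timestamp first, so the comparison is
# datetime-to-datetime and does not rely on implicit coercion.
as_timestamp = s == pd.Timestamp(d)

# Alternatively, reduce the Series to calendar dates and compare
# date-to-date.
as_dates = s.dt.date == d

print(as_timestamp.tolist())
print(as_dates.tolist())
```

Both forms state the intended comparison explicitly, so they should not depend on which coercion behavior a given pandas version applies.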
Other fixes
Reverted the ability of to_sql() to perform multivalue inserts as this caused a regression in certain cases (GH21103). In the future this will be made configurable.
Fixed regression in the DatetimeIndex.date and DatetimeIndex.time attributes in case of timezone-aware data: DatetimeIndex.time returned a tz-aware time instead of tz-naive (GH21267) and DatetimeIndex.date returned incorrect date when the input date has a non-UTC timezone (GH21230).
Fixed regression in pandas.io.json.json_normalize() when called with None values in nested levels of the JSON; keys whose value is None are no longer dropped (GH21158, GH21356).
Bug in to_csv() which caused an encoding error when both compression and encoding were specified (GH21241, GH21118)
Bug preventing pandas from being importable with -OO optimization (GH21071)
Bug in Categorical.fillna() incorrectly raising a TypeError when the individual categories are iterable and value is an iterable (GH21097, GH19788)
Fixed regression in constructors coercing NA values like None to strings when passing dtype=str (GH21083)
Regression in pivot_table() where an ordered Categorical with missing values for the pivot’s index would give a misaligned result (GH21133)
Fixed regression in merging on boolean index/columns (GH21119).
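To make the DatetimeIndex.date and DatetimeIndex.time item in the list above more concrete, here is a small sketch of the behavior the fix is meant to guarantee for timezone-aware data. The sample timestamp and variable names are hypothetical, chosen so that the local date differs from the UTC date.

```python
import datetime

import pandas as pd

# Hypothetical sample: 23:30 on June 1 in US/Eastern is already
# June 2 in UTC, so a UTC-based date would be off by one day.
idx = pd.DatetimeIndex(['2018-06-01 23:30'], tz='US/Eastern')

local_date = idx.date[0]   # date in the index's own timezone
local_time = idx.time[0]   # wall-clock time, expected to be tz-naive

print(local_date)
print(local_time, local_time.tzinfo)
```

With the fix in place, the date should be the local one (June 1) and the time should carry no tzinfo.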
Performance improvements
Improved performance of CategoricalIndex.is_monotonic_increasing(), CategoricalIndex.is_monotonic_decreasing() and CategoricalIndex.is_monotonic() (GH21025)
Improved performance of CategoricalIndex.is_unique() (GH21107)
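For orientation, the checks whose performance improved can be exercised as in the sketch below. It assumes, as in the pandas versions we are aware of, that they are exposed as properties and accessed without parentheses; the sample index is hypothetical.

```python
import pandas as pd

# Hypothetical index with a repeated value.
ci = pd.CategoricalIndex(['a', 'b', 'b', 'c'])

increasing = ci.is_monotonic_increasing  # non-strict: ties are allowed
decreasing = ci.is_monotonic_decreasing
unique = ci.is_unique                    # 'b' appears twice

print(increasing, decreasing, unique)
```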
Bug fixes
Groupby/resample/rolling
Bug in DataFrame.agg() where applying multiple aggregation functions to a DataFrame with duplicated column names would cause a stack overflow (GH21063)
Bug in pandas.core.groupby.GroupBy.ffill() and pandas.core.groupby.GroupBy.bfill() where the fill within a grouping would not always be applied as intended due to the implementations’ use of a non-stable sort (GH21207)
Bug in pandas.core.groupby.GroupBy.rank() where results did not scale to 100% when specifying method='dense' and pct=True
Bug in pandas.DataFrame.rolling() and pandas.Series.rolling() which incorrectly accepted a 0 window size rather than raising (GH21286)
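Two of the groupby fixes above can be illustrated with a short sketch: a forward fill that stays within each group, and a dense percentile rank whose top value in each group reaches 1.0. The frame, its column names and its values are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'key': ['a', 'b', 'a', 'b'],
    'val': [1.0, 2.0, np.nan, np.nan],
    'score': [10, 10, 20, 30],
})

# Each missing value should be filled from the previous row of the
# same group, not from the previous row of the frame.
filled = df.groupby('key')['val'].ffill()

# With method='dense' and pct=True, the highest rank in each group
# should scale to 1.0 (100%).
pct_rank = df.groupby('key')['score'].rank(method='dense', pct=True)

print(filled.tolist())
print(pct_rank.tolist())
```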
Data-type specific
Bug in Series.str.replace() where the method threw a TypeError on Python 3.5.2 (GH21078)
Bug in Timedelta where passing a float with a unit would prematurely round the float precision (GH14156)
Bug in pandas.testing.assert_index_equal() which incorrectly raised AssertionError when comparing two CategoricalIndex objects with check_categorical=False (GH19776)
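As a quick check of the Timedelta item above, the following sketch constructs a Timedelta from a float with a unit and confirms that the fractional part survives. The values are hypothetical and were picked to be exactly representable, so they do not reproduce the original report.

```python
import pandas as pd

# A float with a unit should keep its fractional part instead of
# being rounded before the unit conversion.
td_seconds = pd.Timedelta(1.5, unit='s')
td_millis = pd.Timedelta(2.5, unit='ms')

print(td_seconds)
print(td_millis)
```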
Sparse
Bug in SparseArray.shape which previously only returned the shape of SparseArray.sp_values (GH21126)
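The difference between the two shapes can be seen in a small sketch. It assumes a recent pandas, where the class is available as pd.arrays.SparseArray (in the 0.23 series it was reachable as pd.SparseArray); the sample data is hypothetical.

```python
import pandas as pd

# Four logical elements, two of which equal the fill value and are
# therefore not stored explicitly.
arr = pd.arrays.SparseArray([0, 0, 1, 2], fill_value=0)

full_shape = arr.shape              # shape of the logical array
stored_shape = arr.sp_values.shape  # shape of the stored values only

print(full_shape, stored_shape)
```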
Indexing
Bug in Series.reset_index() where an appropriate error was not raised with an invalid level name (GH20925)
Bug in interval_range() when start/periods or end/periods are specified with float start or end (GH21161)
Bug in MultiIndex.set_names() where an error was raised for a MultiIndex with nlevels == 1 (GH21149)
Bug in IntervalIndex constructors where creating an IntervalIndex from categorical data was not fully supported (GH21243, GH21253)
Bug in MultiIndex.sort_index() which was not guaranteed to sort correctly with level=1; this was also causing data misalignment in particular DataFrame.stack() operations (GH20994, GH20945, GH21052)
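Two of the indexing fixes above lend themselves to a short sketch: interval_range() with a float start or end combined with periods, and set_names() on a single-level MultiIndex. All values and names below are hypothetical.

```python
import pandas as pd

# Float endpoints combined with periods; both calls should describe
# the same three unit-width intervals.
by_start = pd.interval_range(start=0.5, periods=3)
by_end = pd.interval_range(end=3.5, periods=3)

# A MultiIndex with a single level should accept a scalar name.
mi = pd.MultiIndex.from_arrays([[1, 2]], names=['a'])
renamed = mi.set_names('b')

print(by_start)
print(list(renamed.names))
```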
Plotting
New keywords (sharex, sharey) to turn on/off sharing of x/y-axis by subplots generated with pandas.DataFrame().groupby().boxplot() (GH20968)
I/O
Bug in IO methods specifying compression='zip' which produced uncompressed zip archives (GH17778, GH21144)
Bug in DataFrame.to_stata() which prevented exporting DataFrames to buffers and most file-like objects (GH21041)
Bug in read_stata() and StataReader which did not correctly decode utf-8 strings on Python 3 from Stata 14 files (dta version 118) (GH21244)
Bug in read_json() where reading an empty JSON schema with orient='table' back to a DataFrame caused an error (GH21287)
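One way to check the compression='zip' fix above is to write a small frame and inspect the archive with the standard-library zipfile module. This is a sketch under our own assumptions: the file name, the sample data, and the expectation that the archive member uses DEFLATE are ours, not taken from the pandas documentation.

```python
import os
import tempfile
import zipfile

import pandas as pd

# Hypothetical, highly compressible sample data.
df = pd.DataFrame({'a': range(1000), 'b': ['x'] * 1000})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'data.zip')
    df.to_csv(path, index=False, compression='zip')

    # Inspect the single archive member that pandas wrote.
    with zipfile.ZipFile(path) as zf:
        info = zf.infolist()[0]
        is_deflated = info.compress_type == zipfile.ZIP_DEFLATED
        is_smaller = info.compress_size < info.file_size

    # Read the archive back while the temporary directory still exists.
    roundtrip = pd.read_csv(path, compression='zip')

print(is_deflated, is_smaller, roundtrip.shape)
```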
Reshaping
Bug in concat() where an error was raised when concatenating Series with numpy scalar and tuple names (GH21015)
Bug in concat() warning message providing the wrong guidance for future behavior (GH21101)
Other
Tab completion on Index in IPython no longer outputs deprecation warnings (GH21125)
Bug preventing pandas from being used on Windows without the C++ redistributable installed (GH21106)
Contributors
A total of 30 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.
Adam J. Stewart
Adam Kim +
Aly Sivji
Chalmer Lowe +
Damini Satya +
Dr. Irv
Gabe Fernando +
Giftlin Rajaiah
Jeff Reback
Jeremy Schendel +
Joris Van den Bossche
Kalyan Gokhale +
Kevin Sheppard
Matthew Roeschke
Max Kanter +
Ming Li
Pyry Kovanen +
Stefano Cianciulli
Tom Augspurger
Uddeshya Singh +
Wenhuan
William Ayd
chris-b1
gfyoung
h-vetinari
nprad +
ssikdar1 +
tmnhat2001
topper-123
zertrin +