What’s new in 2.0.2 (May 29, 2023)

These are the changes in pandas 2.0.2. See Release notes for a full changelog including other versions of pandas.

Fixed regressions

Fixed performance regression in GroupBy.apply() (GH 53195)
Fixed regression in merge() on Windows when dtype is np.intc (GH 52451)
Fixed regression in read_sql() dropping columns with duplicated column names (GH 53117)
Fixed regression in DataFrame.loc losing MultiIndex name when enlarging object (GH 53053)
Fixed regression in DataFrame.to_string() printing a backslash at the end of the first row of data, instead of headers, when the DataFrame doesn’t fit the line width (GH 53054)
Fixed regression in MultiIndex.join() returning levels in wrong order (GH 53093)
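
As an illustration of the DataFrame.loc entry above (GH 53053), the following is a minimal sketch of enlarging a frame through a missing MultiIndex key. The frame contents and the level names "letter" and "number" are made up for this example and do not come from the original issue.

```python
import pandas as pd

# Hypothetical two-level index with named levels.
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("b", 2)], names=["letter", "number"]
)
df = pd.DataFrame({"x": [1.0, 2.0]}, index=index)

# Assigning to a key that is not in the index enlarges the frame by one row.
df.loc[("c", 3), :] = 3.0

# With the regression fixed, the level names should survive the enlargement.
names_after = list(df.index.names)
print(names_after)
```

If the printed names differ from the ones set on the original index, the installed pandas version is likely affected by the regression.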

Bug fixes

Bug in arrays.ArrowExtensionArray incorrectly assigning dict instead of list for .type with pyarrow.map_ and raising a NotImplementedError with pyarrow.struct (GH 53328)
Bug in api.interchange.from_dataframe() was raising IndexError on empty categorical data (GH 53077)
Bug in api.interchange.from_dataframe() was returning DataFrames of incorrect sizes when called on slices (GH 52824)
Bug in api.interchange.from_dataframe() was unnecessarily raising on bitmasks (GH 49888)
Bug in merge() when merging on datetime columns with different resolutions (GH 53200)
Bug in read_csv() raising OverflowError for engine="pyarrow" and parse_dates set (GH 53295)
Bug in to_datetime() was inferring format to contain "%H" instead of "%I" if date contained “AM” / “PM” tokens (GH 53147)
Bug in to_timedelta() was raising ValueError with pandas.NA (GH 52909)
Bug in DataFrame.__getitem__() not preserving dtypes for MultiIndex partial keys (GH 51895)
Bug in DataFrame.convert_dtypes() ignoring convert_* keywords set to False when dtype_backend="pyarrow" (GH 52872)
Bug in DataFrame.convert_dtypes() losing timezone for tz-aware dtypes and dtype_backend="pyarrow" (GH 53382)
Bug in DataFrame.sort_values() raising for PyArrow dictionary dtype (GH 53232)
Bug in Series.describe() treating pyarrow-backed timestamps and timedeltas as categorical data (GH 53001)
Bug in Series.rename() not making a lazy copy when a scalar is passed to it and Copy-on-Write is enabled (GH 52450)
Bug in pd.array() raising for NumPy array and pa.large_string or pa.large_binary (GH 52590)
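
To make the to_datetime() entry above (GH 53147) concrete, here is a small sketch that compares format inference against an explicit 12-hour format. The sample strings are invented for this example; the explicit format string is an assumption about what correct inference should be equivalent to for these inputs.

```python
import pandas as pd

# Hypothetical 12-hour timestamps carrying AM/PM tokens.
values = ["01/02/2023 01:30 PM", "01/02/2023 02:45 AM"]

# Let pandas infer the format from the strings.
inferred = pd.to_datetime(values)

# Parse again with an explicit 12-hour format ("%I" together with "%p").
explicit = pd.to_datetime(values, format="%m/%d/%Y %I:%M %p")

# "01:30 PM" should land in the afternoon, so the parsed hour should be 13.
hours = [int(h) for h in inferred.hour]
print(hours)
```

Passing format explicitly, as in the second call, is a reasonable safeguard when the input layout is known in advance, since it does not depend on inference at all.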

Other

Raised a better error message when calling Series.dt.to_pydatetime() with ArrowDtype with pyarrow.date32 or pyarrow.date64 type (GH 52812)

Contributors

A total of 18 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.

Gianluca Ficarelli +
Guillaume Lemaitre
Joris Van den Bossche
Julian Badillo +
Luke Manley
Lumberbot (aka Jack) +
Marc Garcia
Marco Edward Gorelli
MarcoGorelli
Matt Richards
Matthew Roeschke
MeeseeksMachine
Pandas Development Team
Patrick Hoefler
Simon Høxbro Hansen +
Thomas Li
Yao Xiao +
dependabot[bot]