What’s new in 2.2.1 (February 22, 2024)
These are the changes in pandas 2.2.1. See Release notes for a full changelog including other versions of pandas.
Enhancements
Added pyarrow pip extra so users can install pandas and pyarrow together by running pip install pandas[pyarrow] (GH 54466)
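As a small illustration of the new extra, the sketch below checks whether the optional pyarrow package is importable after installation. The helper name has_pyarrow is hypothetical and not part of pandas; it only uses the standard library.

```python
import importlib.util

# Shell command from the release note. Quoting the requirement avoids the
# square brackets being treated as a glob pattern in shells such as zsh:
#   pip install "pandas[pyarrow]"


def has_pyarrow() -> bool:
    """Hypothetical helper: report whether the optional pyarrow package is importable."""
    return importlib.util.find_spec("pyarrow") is not None


pyarrow_available = has_pyarrow()
print(f"pyarrow available: {pyarrow_available}")
```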
Fixed regressions
Fixed memory leak in read_csv() (GH 57039)
Fixed performance regression in Series.combine_first() (GH 55845)
Fixed regression causing overflow for near-minimum timestamps (GH 57150)
Fixed regression in concat() changing long-standing behavior that always sorted the non-concatenation axis when the axis was a DatetimeIndex (GH 57006)
Fixed regression in merge_ordered() raising TypeError for fill_method="ffill" and how="left" (GH 57010)
Fixed regression in pandas.testing.assert_series_equal() defaulting to check_exact=True when checking the Index (GH 57067)
Fixed regression in read_json() where an Index would be returned instead of a RangeIndex (GH 57429)
Fixed regression in wide_to_long() raising an AttributeError for string columns (GH 57066)
Fixed regression in DataFrameGroupBy.idxmin(), DataFrameGroupBy.idxmax(), SeriesGroupBy.idxmin(), SeriesGroupBy.idxmax() ignoring the skipna argument (GH 57040)
Fixed regression in DataFrameGroupBy.idxmin(), DataFrameGroupBy.idxmax(), SeriesGroupBy.idxmin(), SeriesGroupBy.idxmax() where values containing the minimum or maximum value for the dtype could produce incorrect results (GH 57040)
Fixed regression in CategoricalIndex.difference() raising KeyError when other contains null values other than NaN (GH 57318)
Fixed regression in DataFrame.groupby() raising ValueError when grouping by a Series in some cases (GH 57276)
Fixed regression in DataFrame.loc() raising IndexError for non-unique, masked dtype indexes where result has more than 10,000 rows (GH 57027)
Fixed regression in DataFrame.loc() which was unnecessarily throwing “incompatible dtype warning” when expanding with partial row indexer and multiple columns (see PDEP6) (GH 56503)
Fixed regression in DataFrame.map() with na_action="ignore" not being respected for NumPy nullable and ArrowDtypes (GH 57316)
Fixed regression in DataFrame.merge() raising ValueError for certain types of 3rd-party extension arrays (GH 57316)
Fixed regression in DataFrame.query() with all NaT column with object dtype (GH 57068)
Fixed regression in DataFrame.shift() raising AssertionError for axis=1 and empty DataFrame (GH 57301)
Fixed regression in DataFrame.sort_index() not producing a stable sort for an index with duplicates (GH 57151)
Fixed regression in DataFrame.to_dict() with orient='list' and datetime or timedelta types returning integers (GH 54824)
Fixed regression in DataFrame.to_json() converting nullable integers to floats (GH 57224)
Fixed regression in DataFrame.to_sql() when method="multi" is passed and the dialect type is not Oracle (GH 57310)
Fixed regression in DataFrame.transpose() with nullable extension dtypes not having F-contiguous data potentially causing exceptions when used (GH 57315)
Fixed regression in DataFrame.update() emitting incorrect warnings about downcasting (GH 57124)
Fixed regression in ExtensionArray.to_numpy() raising for non-numeric masked dtypes (GH 56991)
Fixed regression in Index.join() raising TypeError when joining an empty index to a non-empty index containing mixed dtype values (GH 57048)
Fixed regression in Series.astype() introducing decimals when converting from integer with missing values to string dtype (GH 57418)
Fixed regression in Series.pct_change() raising a ValueError for an empty Series (GH 57056)
Fixed regression in Series.to_numpy() when dtype is given as float and the data contains NaNs (GH 57121)
Fixed regression in addition or subtraction of DateOffset objects with millisecond components to datetime64 Index, Series, or DataFrame (GH 57529)
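A few of the regressions listed above can be exercised with short calls. The following is a minimal sketch, assuming pandas 2.2.1 or later is installed; the sample data and variable names are illustrative only and are not taken from the linked GitHub issues.

```python
import json

import pandas as pd

# DataFrame.to_json() should keep nullable integers as integers (GH 57224).
nullable = pd.DataFrame({"a": pd.array([1, None], dtype="Int64")})
as_json = json.loads(nullable.to_json())

# Series.astype() to string dtype should not introduce decimals (GH 57418).
as_string = pd.Series([1, None], dtype="Int64").astype("string")

# DataFrame.to_dict(orient="list") should return Timestamp objects for
# datetime columns rather than integers (GH 54824).
stamps = pd.DataFrame({"ts": pd.to_datetime(["2024-02-22"])})
as_list = stamps.to_dict(orient="list")

# Series.pct_change() should accept an empty Series (GH 57056).
empty_change = pd.Series([], dtype="float64").pct_change()

# merge_ordered() should accept fill_method="ffill" with how="left" (GH 57010).
left = pd.DataFrame({"key": [1, 2, 3], "a": ["x", "y", "z"]})
right = pd.DataFrame({"key": [1, 3], "b": [10, 30]})
merged = pd.merge_ordered(left, right, on="key", fill_method="ffill", how="left")

print(as_json)
print(as_string.tolist())
print(as_list)
print(len(empty_change))
print(merged)
```

On pandas 2.2.0 some of these calls raised or returned altered values, so a snippet like this can serve as a quick check after upgrading.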
Bug fixes
Fixed bug in pandas.api.interchange.from_dataframe() which was raising for nullable integers (GH 55069)
Fixed bug in pandas.api.interchange.from_dataframe() which was raising for empty inputs (GH 56700)
Fixed bug in pandas.api.interchange.from_dataframe() which wasn’t converting column names to strings (GH 55069)
Fixed bug in DataFrame.__getitem__() for empty DataFrame with Copy-on-Write enabled (GH 57130)
Fixed bug in PeriodIndex.asfreq() which was silently converting frequencies which are not supported as period frequencies instead of raising an error (GH 56945)
Other
Note
The DeprecationWarning that was raised when pandas was imported without PyArrow being
installed has been removed. This decision was made because the warning was too noisy for too
many users and a lot of feedback was collected about the decision to make PyArrow a required
dependency. The pandas team is still considering whether PyArrow should be added
as a hard dependency in 3.0. Interested users can follow the discussion
here.
Added the argument skipna to DataFrameGroupBy.first(), DataFrameGroupBy.last(), SeriesGroupBy.first(), and SeriesGroupBy.last(); achieving skipna=False used to be available via DataFrameGroupBy.nth(), but the behavior was changed in pandas 2.0.0 (GH 57019)
Added the argument skipna to Resampler.first(), Resampler.last() (GH 57019)
Contributors
A total of 14 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.
Albert Villanova del Moral
Luke Manley
Lumberbot (aka Jack)
Marco Edward Gorelli
Matthew Roeschke
Natalia Mokeeva
Pandas Development Team
Patrick Hoefler
Richard Shadrach
Robert Schmidtke +
Samuel Chai +
Thomas Li
William Ayd
dependabot[bot]