What’s new in 1.2.1 (January 20, 2021)#

These are the changes in pandas 1.2.1. See Release notes for a full changelog including other versions of pandas.

Fixed regressions#

We have reverted a commit that resulted in several plotting-related regressions in pandas 1.2.0 (GH 38969, GH 38736, GH 38865, GH 38947 and GH 39126). As a result, bugs reported as fixed in pandas 1.2.0 related to inconsistent tick labeling in bar plots are again present (GH 26186 and GH 11465).

Calling NumPy ufuncs on non-aligned DataFrames#

Before pandas 1.2.0, calling a NumPy ufunc on non-aligned DataFrames (or DataFrame / Series combination) would ignore the indices, only match the inputs by shape, and use the index/columns of the first DataFrame for the result:

In [1]: df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=[0, 1])
In [2]: df2 = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=[1, 2])
In [3]: df1
Out[3]:
   a  b
0  1  3
1  2  4
In [4]: df2
Out[4]:
   a  b
1  1  3
2  2  4

In [5]: np.add(df1, df2)
Out[5]:
   a  b
0  2  6
1  4  8

This contrasts with how other pandas operations work, which first align the inputs:

In [6]: df1 + df2
Out[6]:
     a    b
0  NaN  NaN
1  3.0  7.0
2  NaN  NaN

In pandas 1.2.0, we refactored how NumPy ufuncs are called on DataFrames. As a side effect, the inputs were now aligned first (GH 39184), as happens in other pandas operations and for ufuncs called on Series objects.

For pandas 1.2.1, we restored the previous behaviour to avoid a breaking change, but the above example of np.add(df1, df2) with non-aligned inputs will now raise a warning, and a future pandas 2.0 release will start aligning the inputs first (GH 39184). Calling a NumPy ufunc on Series objects (e.g. np.add(s1, s2)) already aligns and continues to do so.
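For comparison, the Series case mentioned above can be sketched as follows. The values and index labels of s1 and s2 are illustrative assumptions chosen to mirror df1 and df2; they are not taken from the original notes.

```python
import numpy as np
import pandas as pd

# Two Series whose indexes only partially overlap (illustrative values)
s1 = pd.Series([1, 2], index=[0, 1])
s2 = pd.Series([1, 2], index=[1, 2])

# Series inputs are aligned on their index before the ufunc is applied,
# so labels present in only one input produce NaN in the result.
result = np.add(s1, s2)
print(result)
```

Only label 1 is present in both inputs, so it is the only position with a non-missing value in the result.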

To avoid the warning and keep the current behaviour of ignoring the indices, convert one of the arguments to a NumPy array:

In [7]: np.add(df1, np.asarray(df2))
Out[7]:
   a  b
0  2  6
1  4  8

To obtain the future behaviour and silence the warning, you can align manually before passing the arguments to the ufunc:

In [8]: df1, df2 = df1.align(df2)
In [9]: np.add(df1, df2)
Out[9]:
     a    b
0  NaN  NaN
1  3.0  7.0
2  NaN  NaN
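The two options above can also be wrapped in small helpers so that the intended semantics are explicit at the call site. This is a minimal sketch: the function names ufunc_by_position and ufunc_aligned are hypothetical and not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd


def ufunc_by_position(ufunc, left, right):
    """Apply ``ufunc`` ignoring the labels of ``right`` (hypothetical helper).

    ``right`` is converted to a NumPy array, so inputs are matched by shape
    and the result keeps the index/columns of ``left``.
    """
    return ufunc(left, np.asarray(right))


def ufunc_aligned(ufunc, left, right):
    """Align both inputs on index and columns first (hypothetical helper)."""
    left, right = left.align(right)
    return ufunc(left, right)


df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=[0, 1])
df2 = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=[1, 2])

positional = ufunc_by_position(np.add, df1, df2)  # index [0, 1], no NaN
aligned = ufunc_aligned(np.add, df1, df2)         # index [0, 1, 2], NaN where unmatched
print(positional)
print(aligned)
```

Because both helpers hand the ufunc either a plain array or already-aligned inputs, they should give the same results before and after the planned change in default behaviour.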

Bug fixes#

  • Bug in read_csv() with float_precision="high" causing a segfault or wrong parsing of long exponent strings. This resulted in a regression in some cases as the default for float_precision was changed in pandas 1.2.0 (GH 38753)

  • Bug in read_csv() not closing an opened file handle when a csv.Error or UnicodeDecodeError occurred while initializing (GH 39024)

  • Bug in pandas.testing.assert_index_equal() raising TypeError with check_order=False when Index has mixed dtype (GH 39168)
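Since the read_csv() regression above stems from a changed default, the float converter can also be selected explicitly. The following is a minimal sketch, assuming the default C parser engine and an in-memory CSV with made-up values; it does not reproduce the fixed bug.

```python
import io

import pandas as pd

# Illustrative data: one float column written in exponent notation
csv_text = "x\n1.5e3\n2.5e-3\n"

# Parse the same text with each documented float_precision option
parsed = {
    mode: pd.read_csv(io.StringIO(csv_text), float_precision=mode)["x"]
    for mode in ("high", "round_trip", "legacy")
}

for mode, column in parsed.items():
    print(mode, column.tolist())
```

For simple inputs like these the converters are expected to agree; they can differ in the last digits for values with many significant digits.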

Other#

Contributors#

A total of 20 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.

  • Ada Draginda +

  • Andrew Wieteska

  • Bryan Cutler

  • Fangchen Li

  • Joris Van den Bossche

  • Matthew Roeschke

  • Matthew Zeitlin +

  • MeeseeksMachine

  • Micael Jarniac

  • Omar Afifi +

  • Pandas Development Team

  • Richard Shadrach

  • Simon Hawkins

  • Terji Petersen

  • Torsten Wörtwein

  • WANG Aiyong

  • jbrockmendel

  • kylekeppler

  • mzeitlin11

  • patrick