What’s new in 1.1.0 (July 28, 2020)#
These are the changes in pandas 1.1.0. See Release notes for a full changelog including other versions of pandas.
Enhancements#
KeyErrors raised by loc specify missing labels#
Previously, if labels were missing for a .loc call, a KeyError was raised stating that this was no longer supported.
Now the error message also includes a list of the missing labels (max 10 items, display width 80 characters). See GH34272.
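As a rough illustration of the behavior described above, the sketch below requests one existing and one missing label and captures the resulting error. The Series contents and label names are invented for this example, and the exact wording of the message differs between pandas versions.

```python
import pandas as pd

# Hypothetical data: only "a" and "b" exist in the index.
ser = pd.Series([1, 2], index=["a", "b"])

try:
    ser.loc[["a", "z"]]
except KeyError as err:
    # The message names the missing label "z".
    message = str(err)
    print(message)
```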
All dtypes can now be converted to StringDtype#
Previously, declaring or converting to StringDtype was in general only possible if the data was already only str or nan-like (GH31204).
StringDtype now works in all situations where astype(str) or dtype=str work:
For example, the below now works:
In [1]: ser = pd.Series([1, "abc", np.nan], dtype="string")
In [2]: ser
Out[2]: 
0       1
1     abc
2    <NA>
dtype: string
In [3]: ser[0]
Out[3]: '1'
In [4]: pd.Series([1, 2, np.nan], dtype="Int64").astype("string")
Out[4]: 
0       1
1       2
2    <NA>
dtype: string
Non-monotonic PeriodIndex partial string slicing#
PeriodIndex now supports partial string slicing for non-monotonic indexes, mirroring DatetimeIndex behavior (GH31096).
For example:
In [5]: dti = pd.date_range("2014-01-01", periods=30, freq="30D")
In [6]: pi = dti.to_period("D")
In [7]: ser_monotonic = pd.Series(np.arange(30), index=pi)
In [8]: shuffler = list(range(0, 30, 2)) + list(range(1, 31, 2))
In [9]: ser = ser_monotonic[shuffler]
In [10]: ser
Out[10]: 
2014-01-01     0
2014-03-02     2
2014-05-01     4
2014-06-30     6
2014-08-29     8
              ..
2015-09-23    21
2015-11-22    23
2016-01-21    25
2016-03-21    27
2016-05-20    29
Freq: D, Length: 30, dtype: int64
In [11]: ser["2014"]
Out[11]: 
2014-01-01     0
2014-03-02     2
2014-05-01     4
2014-06-30     6
2014-08-29     8
2014-10-28    10
2014-12-27    12
2014-01-31     1
2014-04-01     3
2014-05-31     5
2014-07-30     7
2014-09-28     9
2014-11-27    11
Freq: D, dtype: int64
In [12]: ser.loc["May 2015"]
Out[12]: 
2015-05-26    17
Freq: D, dtype: int64
Comparing two DataFrame or two Series and summarizing the differences#
We’ve added DataFrame.compare() and Series.compare() for comparing two DataFrame or two Series objects and summarizing their differences (GH30429).
In [13]: df = pd.DataFrame(
   ....:     {
   ....:         "col1": ["a", "a", "b", "b", "a"],
   ....:         "col2": [1.0, 2.0, 3.0, np.nan, 5.0],
   ....:         "col3": [1.0, 2.0, 3.0, 4.0, 5.0]
   ....:     },
   ....:     columns=["col1", "col2", "col3"],
   ....: )
   ....: 
In [14]: df
Out[14]: 
  col1  col2  col3
0    a   1.0   1.0
1    a   2.0   2.0
2    b   3.0   3.0
3    b   NaN   4.0
4    a   5.0   5.0
In [15]: df2 = df.copy()
In [16]: df2.loc[0, 'col1'] = 'c'
In [17]: df2.loc[2, 'col3'] = 4.0
In [18]: df2
Out[18]: 
  col1  col2  col3
0    c   1.0   1.0
1    a   2.0   2.0
2    b   3.0   4.0
3    b   NaN   4.0
4    a   5.0   5.0
In [19]: df.compare(df2)
Out[19]: 
  col1       col3      
  self other self other
0    a     c  NaN   NaN
2  NaN   NaN  3.0   4.0
See User Guide for more details.
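Series.compare() can be used in the same way. The following is a minimal sketch with made-up values; it assumes the default column labels "self" and "other" shown in the DataFrame output above.

```python
import pandas as pd

# Two hypothetical Series that differ at positions 1 and 3.
s1 = pd.Series(["a", "b", "c", "d"])
s2 = pd.Series(["a", "x", "c", "y"])

# Only the positions that differ are kept in the result.
diff = s1.compare(s2)
print(diff)
```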
Allow NA in groupby key#
We’ve added a dropna keyword to DataFrame.groupby() and Series.groupby() in order to
allow NA values in group keys. Users can set dropna to False if they want to include
NA values in groupby keys. The default for dropna is True, which keeps backwards
compatibility (GH3729).
In [20]: df_list = [[1, 2, 3], [1, None, 4], [2, 1, 3], [1, 2, 2]]
In [21]: df_dropna = pd.DataFrame(df_list, columns=["a", "b", "c"])
In [22]: df_dropna
Out[22]: 
   a    b  c
0  1  2.0  3
1  1  NaN  4
2  2  1.0  3
3  1  2.0  2
# Default ``dropna`` is set to True, which will exclude NaNs in keys
In [23]: df_dropna.groupby(by=["b"], dropna=True).sum()
Out[23]: 
     a  c
b        
1.0  2  3
2.0  2  5
# In order to allow NaN in keys, set ``dropna`` to False
In [24]: df_dropna.groupby(by=["b"], dropna=False).sum()
Out[24]: 
     a  c
b        
1.0  2  3
2.0  2  5
NaN  1  4
By default, dropna is True, which means NA values are not included in group keys.
Sorting with keys#
We’ve added a key argument to the DataFrame and Series sorting methods, including
DataFrame.sort_values(), DataFrame.sort_index(), Series.sort_values(),
and Series.sort_index(). The key can be any callable function which is applied
column-by-column to each column used for sorting, before sorting is performed (GH27237).
See sort_values with keys and sort_index with keys for more information.
In [25]: s = pd.Series(['C', 'a', 'B'])
In [26]: s
Out[26]: 
0    C
1    a
2    B
dtype: object
In [27]: s.sort_values()
Out[27]: 
2    B
0    C
1    a
dtype: object
Note how this is sorted with capital letters first. If we apply the Series.str.lower()
method, we get
In [28]: s.sort_values(key=lambda x: x.str.lower())
Out[28]: 
1    a
2    B
0    C
dtype: object
When applied to a DataFrame, the key is applied per-column to all columns or a subset if
by is specified, e.g.
In [29]: df = pd.DataFrame({'a': ['C', 'C', 'a', 'a', 'B', 'B'],
   ....:                    'b': [1, 2, 3, 4, 5, 6]})
   ....: 
In [30]: df
Out[30]: 
   a  b
0  C  1
1  C  2
2  a  3
3  a  4
4  B  5
5  B  6
In [31]: df.sort_values(by=['a'], key=lambda col: col.str.lower())
Out[31]: 
   a  b
2  a  3
3  a  4
4  B  5
5  B  6
0  C  1
1  C  2
For more details, see examples and documentation in DataFrame.sort_values(),
Series.sort_values(), and sort_index().
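The key argument works similarly for sort_index(). The sketch below uses invented labels; the key callable receives the Index and is assumed to return an object of the same length.

```python
import pandas as pd

# Hypothetical Series with mixed-case string labels.
s = pd.Series([1, 2, 3], index=["a", "B", "c"])

# The default ordering puts the capitalized label first.
default_order = s.sort_index()

# Lower-casing the labels before sorting gives a case-insensitive order.
case_insensitive = s.sort_index(key=lambda idx: idx.str.lower())

print(default_order)
print(case_insensitive)
```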
Fold argument support in Timestamp constructor#
Timestamp now supports the keyword-only fold argument according to PEP 495, similar to the parent datetime.datetime class. It supports both accepting fold as an initialization argument and inferring fold from other constructor arguments (GH25057, GH31338). Support is limited to dateutil timezones as pytz doesn’t support fold.
For example:
In [32]: ts = pd.Timestamp("2019-10-27 01:30:00+00:00")
In [33]: ts.fold
Out[33]: 0
In [34]: ts = pd.Timestamp(year=2019, month=10, day=27, hour=1, minute=30,
   ....:                   tz="dateutil/Europe/London", fold=1)
   ....: 
In [35]: ts
Out[35]: Timestamp('2019-10-27 01:30:00+0000', tz='dateutil//usr/share/zoneinfo/Europe/London')
For more on working with fold, see Fold subsection in the user guide.
Parsing timezone-aware format with different timezones in to_datetime#
to_datetime() now supports parsing formats containing timezone names (%Z) and UTC offsets (%z) from different timezones and then converting them to UTC by setting utc=True. This returns a DatetimeIndex with timezone UTC, as opposed to an Index with object dtype when utc=True is not set (GH32792).
For example:
In [36]: tz_strs = ["2010-01-01 12:00:00 +0100", "2010-01-01 12:00:00 -0100",
   ....:            "2010-01-01 12:00:00 +0300", "2010-01-01 12:00:00 +0400"]
   ....: 
In [37]: pd.to_datetime(tz_strs, format='%Y-%m-%d %H:%M:%S %z', utc=True)
Out[37]: 
DatetimeIndex(['2010-01-01 11:00:00+00:00', '2010-01-01 13:00:00+00:00',
               '2010-01-01 09:00:00+00:00', '2010-01-01 08:00:00+00:00'],
              dtype='datetime64[ns, UTC]', freq=None)
In [38]: pd.to_datetime(tz_strs, format='%Y-%m-%d %H:%M:%S %z')
Out[38]: 
Index([2010-01-01 12:00:00+01:00, 2010-01-01 12:00:00-01:00,
       2010-01-01 12:00:00+03:00, 2010-01-01 12:00:00+04:00],
      dtype='object')
Grouper and resample now support the arguments origin and offset#
Grouper and DataFrame.resample() now support the arguments origin and offset. They let the user control the timestamp on which to adjust the grouping. (GH31809)
The bins of the grouping are adjusted based on the beginning of the day of the time series starting point. This works well with frequencies that are multiples of a day (like 30D) or that divide a day (like 90s or 1min). But it can create inconsistencies with some frequencies that do not meet this criterion. To change this behavior you can now specify a fixed timestamp with the argument origin.
Two arguments are now deprecated (more information in the documentation of DataFrame.resample()):
- base should be replaced by offset.
- loffset should be replaced by directly adding an offset to the index of the DataFrame after being resampled.
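The two replacements listed above might look roughly like the following sketch. The data, the 3-minute frequency and the 30-second label shift are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical one-minute series with values 0..5.
rng = pd.date_range("2000-01-01", periods=6, freq="1min")
ts = pd.Series(np.arange(6), index=rng)

# Where base=1 was used with a 3-minute frequency, shift the bin edges
# by one minute with offset instead.
shifted_bins = ts.resample("3min", offset="1min").sum()

# Where loffset was used, resample first and then move the labels.
relabelled = ts.resample("3min").sum()
relabelled.index = relabelled.index + pd.Timedelta("30s")

print(shifted_bins)
print(relabelled)
```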
Small example of the use of origin:
In [39]: start, end = '2000-10-01 23:30:00', '2000-10-02 00:30:00'
In [40]: middle = '2000-10-02 00:00:00'
In [41]: rng = pd.date_range(start, end, freq='7min')
In [42]: ts = pd.Series(np.arange(len(rng)) * 3, index=rng)
In [43]: ts
Out[43]: 
2000-10-01 23:30:00     0
2000-10-01 23:37:00     3
2000-10-01 23:44:00     6
2000-10-01 23:51:00     9
2000-10-01 23:58:00    12
2000-10-02 00:05:00    15
2000-10-02 00:12:00    18
2000-10-02 00:19:00    21
2000-10-02 00:26:00    24
Freq: 7T, dtype: int64
Resample with the default behavior 'start_day' (origin is 2000-10-01 00:00:00):
In [44]: ts.resample('17min').sum()
Out[44]: 
2000-10-01 23:14:00     0
2000-10-01 23:31:00     9
2000-10-01 23:48:00    21
2000-10-02 00:05:00    54
2000-10-02 00:22:00    24
Freq: 17T, dtype: int64
In [45]: ts.resample('17min', origin='start_day').sum()
Out[45]: 
2000-10-01 23:14:00     0
2000-10-01 23:31:00     9
2000-10-01 23:48:00    21
2000-10-02 00:05:00    54
2000-10-02 00:22:00    24
Freq: 17T, dtype: int64
Resample using a fixed origin:
In [46]: ts.resample('17min', origin='epoch').sum()
Out[46]: 
2000-10-01 23:18:00     0
2000-10-01 23:35:00    18
2000-10-01 23:52:00    27
2000-10-02 00:09:00    39
2000-10-02 00:26:00    24
Freq: 17T, dtype: int64
In [47]: ts.resample('17min', origin='2000-01-01').sum()
Out[47]: 
2000-10-01 23:24:00     3
2000-10-01 23:41:00    15
2000-10-01 23:58:00    45
2000-10-02 00:15:00    45
Freq: 17T, dtype: int64
If needed you can adjust the bins with the argument offset (a Timedelta) that would be added to the default origin.
For a full example, see: Use origin or offset to adjust the start of the bins.
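As a short sketch of the offset argument, the snippet below rebuilds the series from the origin example and shifts the bins by 23 hours and 30 minutes, so that the first bin starts at the first observation. The offset value is chosen only for this illustration.

```python
import numpy as np
import pandas as pd

# Same series as in the origin example above.
rng = pd.date_range("2000-10-01 23:30:00", "2000-10-02 00:30:00", freq="7min")
ts = pd.Series(np.arange(len(rng)) * 3, index=rng)

# The offset is added to the default origin (the start of the first day).
result = ts.resample("17min", offset="23h30min").sum()
print(result)
```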
fsspec now used for filesystem handling#
For reading and writing to filesystems other than local and reading from HTTP(S),
the optional dependency fsspec will be used to dispatch operations (GH33452).
This will give unchanged
functionality for S3 and GCS storage, which were already supported, but also add
support for several other storage implementations such as Azure Datalake and Blob,
SSH, FTP, Dropbox and GitHub. For docs and capabilities, see the fsspec docs.
The existing capability to interface with S3 and GCS will be unaffected by this
change, as fsspec will still bring in the same packages as before.
Other enhancements#
- Compatibility with matplotlib 3.3.0 (GH34850) 
- IntegerArray.astype() now supports datetime64 dtype (GH32538)
- IntegerArray now implements the sum operation (GH33172)
- Added pandas.errors.InvalidIndexError (GH34570).
- Added DataFrame.value_counts() (GH5377)
- Added a pandas.api.indexers.FixedForwardWindowIndexer() class to support forward-looking windows during rolling operations.
- Added a pandas.api.indexers.VariableOffsetWindowIndexer() class to support rolling operations with non-fixed offsets (GH34994)
- describe() now includes a datetime_is_numeric keyword to control how datetime columns are summarized (GH30164, GH34798)
- Styler may now render CSS more efficiently where multiple cells have the same styling (GH30876)
- highlight_null() now accepts a subset argument (GH31345)
- When writing directly to a sqlite connection, DataFrame.to_sql() now supports the multi method (GH29921)
- pandas.errors.OptionError is now exposed in pandas.errors (GH27553)
- Added api.extensions.ExtensionArray.argmax() and api.extensions.ExtensionArray.argmin() (GH24382)
- timedelta_range() will now infer a frequency when passed start, stop, and periods (GH32377)
- Positional slicing on an IntervalIndex now supports slices with step > 1 (GH31658)
- Series.str now has a fullmatch method that matches a regular expression against the entire string in each row of the Series, similar to re.fullmatch (GH32806).
- DataFrame.sample() will now also allow array-like and BitGenerator objects to be passed to random_state as seeds (GH32503)
- Index.union() will now raise RuntimeWarning for MultiIndex objects if the objects inside are unsortable. Pass sort=False to suppress this warning (GH33015)
- Added Series.dt.isocalendar() and DatetimeIndex.isocalendar() that return a DataFrame with year, week, and day calculated according to the ISO 8601 calendar (GH33206, GH34392).
- The DataFrame.to_feather() method now supports additional keyword arguments (e.g. to set the compression) that are added in pyarrow 0.17 (GH33422).
- cut() will now accept the parameter ordered with default ordered=True. If ordered=False and no labels are provided, an error will be raised (GH33141)
- DataFrame.to_csv(), DataFrame.to_pickle(), and DataFrame.to_json() now support passing a dict of compression arguments when using the gzip and bz2 protocols. This can be used to set a custom compression level, e.g., df.to_csv(path, compression={'method': 'gzip', 'compresslevel': 1}) (GH33196)
- melt() has gained an ignore_index (default True) argument that, if set to False, prevents the method from dropping the index (GH17440).
- Series.update() now accepts objects that can be coerced to a Series, such as dict and list, mirroring the behavior of DataFrame.update() (GH33215)
- transform() and aggregate() have gained engine and engine_kwargs arguments that support executing functions with Numba (GH32854, GH33388)
- interpolate() now supports SciPy interpolation method scipy.interpolate.CubicSpline as method cubicspline (GH33670)
- DataFrameGroupBy and SeriesGroupBy now implement the sample method for doing random sampling within groups (GH31775)
- DataFrame.to_numpy() now supports the na_value keyword to control the NA sentinel in the output array (GH33820)
- Added api.extensions.ExtensionArray.equals to the extension array interface, similar to Series.equals() (GH27081)
- The minimum supported dta version has increased to 105 in read_stata() and StataReader (GH26667).
- to_stata() supports compression using the compression keyword argument. Compression can either be inferred or explicitly set using a string or a dictionary containing both the method and any additional arguments that are passed to the compression library. Compression was also added to the low-level Stata-file writers StataWriter, StataWriter117, and StataWriterUTF8 (GH26599).
- HDFStore.put() now accepts a track_times parameter. This parameter is passed to the create_table method of PyTables (GH32682).
- Series.plot() and DataFrame.plot() now accept xlabel and ylabel parameters to present labels on x and y axis (GH9093).
- Made pandas.core.window.rolling.Rolling and pandas.core.window.expanding.Expanding iterable (GH11704)
- Made option_context a contextlib.ContextDecorator, which allows it to be used as a decorator over an entire function (GH34253).
- DataFrame.to_csv() and Series.to_csv() now accept an errors argument (GH22610)
- transform() now allows func to be pad, backfill and cumcount (GH31269).
- read_json() now accepts an nrows parameter (GH33916).
- DataFrame.hist(), Series.hist(), core.groupby.DataFrameGroupBy.hist(), and core.groupby.SeriesGroupBy.hist() have gained the legend argument. Set to True to show a legend in the histogram. (GH6279)
- concat() and append() now preserve extension dtypes, for example combining a nullable integer column with a numpy integer column will no longer result in object dtype but preserve the integer dtype (GH33607, GH34339, GH34095).
- read_gbq() now allows disabling the progress bar (GH33360).
- read_gbq() now supports the max_results kwarg from pandas-gbq (GH34639).
- DataFrame.cov() and Series.cov() now support a new parameter ddof to support delta degrees of freedom as in the corresponding numpy methods (GH34611).
- DataFrame.to_html() and DataFrame.to_string()’s col_space parameter now accepts a list or dict to change only some specific columns’ width (GH28917).
- DataFrame.to_excel() can now also write OpenOffice spreadsheet (.ods) files (GH27222)
- explode() now accepts ignore_index to reset the index, similar to pd.concat() or DataFrame.sort_values() (GH34932).
- DataFrame.to_markdown() and Series.to_markdown() now accept an index argument as an alias for tabulate’s showindex (GH32667)
- read_csv() now accepts string values like “0”, “0.0”, “1”, “1.0” as convertible to the nullable Boolean dtype (GH34859)
- pandas.core.window.ExponentialMovingWindow now supports a times argument that allows mean to be calculated with observations spaced by the timestamps in times (GH34839)
- DataFrame.agg() and Series.agg() now accept named aggregation for renaming the output columns/indexes. (GH26513)
- compute.use_numba now exists as a configuration option that utilizes the numba engine when available (GH33966, GH35374)
- Series.plot() now supports asymmetric error bars. Previously, if Series.plot() received a “2xN” array with error values for yerr and/or xerr, the left/lower values (first row) were mirrored, while the right/upper values (second row) were ignored. Now, the first row represents the left/lower error values and the second row the right/upper error values. (GH9536)
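A few of the additions listed above can be tried out with the following sketch. All data values and the regular expression are invented for illustration, and only a small selection of the new features is shown.

```python
import pandas as pd

# DataFrame.value_counts(): count unique rows.
df = pd.DataFrame({"legs": [2, 4, 4], "wings": [2, 0, 0]})
row_counts = df.value_counts()
print(row_counts)

# Series.str.fullmatch(): the pattern must match the entire string.
codes = pd.Series(["ab12", "ab", "12ab"])
full = codes.str.fullmatch(r"[a-z]+\d+")
print(full)

# Series.dt.isocalendar(): ISO 8601 year, week and day as a DataFrame.
dates = pd.Series(pd.to_datetime(["2019-12-30", "2020-07-28"]))
iso = dates.dt.isocalendar()
print(iso)

# FixedForwardWindowIndexer: each window covers the current row and the next one.
indexer = pd.api.indexers.FixedForwardWindowIndexer(window_size=2)
forward_sum = pd.Series([1, 2, 3, 4]).rolling(indexer, min_periods=1).sum()
print(forward_sum)
```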
Notable bug fixes#
These are bug fixes that might have notable behavior changes.
MultiIndex.get_indexer interprets method argument correctly#
This restores the behavior of MultiIndex.get_indexer() with method='backfill' or method='pad' to the behavior before pandas 0.23.0. In particular, MultiIndexes are treated as a list of tuples and padding or backfilling is done with respect to the ordering of these lists of tuples (GH29896).
As an example of this, given:
In [48]: df = pd.DataFrame({
   ....:     'a': [0, 0, 0, 0],
   ....:     'b': [0, 2, 3, 4],
   ....:     'c': ['A', 'B', 'C', 'D'],
   ....: }).set_index(['a', 'b'])
   ....: 
In [49]: mi_2 = pd.MultiIndex.from_product([[0], [-1, 0, 1, 3, 4, 5]])
The differences in reindexing df with mi_2 and using method='backfill' can be seen here:
pandas >= 0.23, < 1.1.0:
In [1]: df.reindex(mi_2, method='backfill')
Out[1]:
      c
0 -1  A
   0  A
   1  D
   3  A
   4  A
   5  C
pandas < 0.23, >= 1.1.0:
In [50]: df.reindex(mi_2, method='backfill')
Out[50]: 
        c
0 -1    A
   0    A
   1    B
   3    C
   4    D
   5  NaN
And the differences in reindexing df with mi_2 and using method='pad' can be seen here:
pandas >= 0.23, < 1.1.0:
In [1]: df.reindex(mi_2, method='pad')
Out[1]:
        c
0 -1  NaN
   0  NaN
   1    D
   3  NaN
   4    A
   5    C
pandas < 0.23, >= 1.1.0:
In [51]: df.reindex(mi_2, method='pad')
Out[51]: 
        c
0 -1  NaN
   0    A
   1    A
   3    C
   4    D
   5    D
Failed label-based lookups always raise KeyError#
Label lookups series[key], series.loc[key] and frame.loc[key]
used to raise either KeyError or TypeError depending on the type of
key and type of Index.  These now consistently raise KeyError (GH31867)
In [52]: ser1 = pd.Series(range(3), index=[0, 1, 2])
In [53]: ser2 = pd.Series(range(3), index=pd.date_range("2020-02-01", periods=3))
Previous behavior:
In [3]: ser1[1.5]
...
TypeError: cannot do label indexing on Int64Index with these indexers [1.5] of type float
In [4]: ser1["foo"]
...
KeyError: 'foo'
In [5]: ser1.loc[1.5]
...
TypeError: cannot do label indexing on Int64Index with these indexers [1.5] of type float
In [6]: ser1.loc["foo"]
...
KeyError: 'foo'
In [7]: ser2.loc[1]
...
TypeError: cannot do label indexing on DatetimeIndex with these indexers [1] of type int
In [8]: ser2.loc[pd.Timestamp(0)]
...
KeyError: Timestamp('1970-01-01 00:00:00')
New behavior:
In [3]: ser1[1.5]
...
KeyError: 1.5
In [4]: ser1["foo"]
...
KeyError: 'foo'
In [5]: ser1.loc[1.5]
...
KeyError: 1.5
In [6]: ser1.loc["foo"]
...
KeyError: 'foo'
In [7]: ser2.loc[1]
...
KeyError: 1
In [8]: ser2.loc[pd.Timestamp(0)]
...
KeyError: Timestamp('1970-01-01 00:00:00')
Similarly, DataFrame.at() and Series.at() will raise a TypeError instead of a ValueError if an incompatible key is passed, and KeyError if a missing key is passed, matching the behavior of .loc[] (GH31722)
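The consistent KeyError for missing labels can be checked with a small sketch like the one below. The helper lookup_error is a hypothetical name introduced only for this example, and the data is made up.

```python
import pandas as pd

ser = pd.Series([10, 20], index=["a", "b"])
int_ser = pd.Series(range(3), index=[0, 1, 2])


def lookup_error(getter):
    """Return the name of the exception raised by getter, or None."""
    try:
        getter()
    except (KeyError, TypeError, ValueError) as err:
        return type(err).__name__
    return None


# Missing labels raise KeyError through .loc, .at and plain indexing.
loc_error = lookup_error(lambda: ser.loc["z"])
at_error = lookup_error(lambda: ser.at["z"])
float_key_error = lookup_error(lambda: int_ser.loc[1.5])
print(loc_error, at_error, float_key_error)
```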
Failed Integer Lookups on MultiIndex Raise KeyError#
Indexing with integers with a MultiIndex that has an integer-dtype
first level incorrectly failed to raise KeyError when one or more of
those integer keys is not present in the first level of the index (GH33539)
In [54]: idx = pd.Index(range(4))
In [55]: dti = pd.date_range("2000-01-03", periods=3)
In [56]: mi = pd.MultiIndex.from_product([idx, dti])
In [57]: ser = pd.Series(range(len(mi)), index=mi)
Previous behavior:
In [5]: ser[[5]]
Out[5]: Series([], dtype: int64)
New behavior:
In [5]: ser[[5]]
...
KeyError: '[5] not in index'
DataFrame.merge() preserves right frame’s row order#
DataFrame.merge() now preserves the right frame’s row order when executing a right merge (GH27453)
In [58]: left_df = pd.DataFrame({'animal': ['dog', 'pig'],
   ....:                        'max_speed': [40, 11]})
   ....: 
In [59]: right_df = pd.DataFrame({'animal': ['quetzal', 'pig'],
   ....:                         'max_speed': [80, 11]})
   ....: 
In [60]: left_df
Out[60]: 
  animal  max_speed
0    dog         40
1    pig         11
In [61]: right_df
Out[61]: 
    animal  max_speed
0  quetzal         80
1      pig         11
Previous behavior:
>>> left_df.merge(right_df, on=['animal', 'max_speed'], how="right")
    animal  max_speed
0      pig         11
1  quetzal         80
New behavior:
In [62]: left_df.merge(right_df, on=['animal', 'max_speed'], how="right")
Out[62]: 
    animal  max_speed
0  quetzal         80
1      pig         11
Assignment to multiple columns of a DataFrame when some columns do not exist#
Assignment to multiple columns of a DataFrame when some of the columns do not exist would previously assign the values to the last column. Now, new columns will be constructed with the right values. (GH13658)
In [63]: df = pd.DataFrame({'a': [0, 1, 2], 'b': [3, 4, 5]})
In [64]: df
Out[64]: 
   a  b
0  0  3
1  1  4
2  2  5
Previous behavior:
In [3]: df[['a', 'c']] = 1
In [4]: df
Out[4]:
   a  b
0  1  1
1  1  1
2  1  1
New behavior:
In [65]: df[['a', 'c']] = 1
In [66]: df
Out[66]: 
   a  b  c
0  1  3  1
1  1  4  1
2  1  5  1
Consistency across groupby reductions#
Using DataFrame.groupby() with as_index=True and the aggregation nunique would include the grouping column(s) in the columns of the result. Now the grouping column(s) only appear in the index, consistent with other reductions. (GH32579)
In [67]: df = pd.DataFrame({"a": ["x", "x", "y", "y"], "b": [1, 1, 2, 3]})
In [68]: df
Out[68]: 
   a  b
0  x  1
1  x  1
2  y  2
3  y  3
Previous behavior:
In [3]: df.groupby("a", as_index=True).nunique()
Out[3]:
   a  b
a
x  1  1
y  1  2
New behavior:
In [69]: df.groupby("a", as_index=True).nunique()
Out[69]: 
   b
a   
x  1
y  2
Using DataFrame.groupby() with as_index=False and the function idxmax, idxmin, mad, nunique, sem, skew, or std would modify the grouping column. Now the grouping column remains unchanged, consistent with other reductions. (GH21090, GH10355)
Previous behavior:
In [3]: df.groupby("a", as_index=False).nunique()
Out[3]:
   a  b
0  1  1
1  1  2
New behavior:
In [70]: df.groupby("a", as_index=False).nunique()
Out[70]: 
   a  b
0  x  1
1  y  2
The method size() would previously ignore as_index=False. Now the grouping columns are returned as columns, making the result a DataFrame instead of a Series. (GH32599)
Previous behavior:
In [3]: df.groupby("a", as_index=False).size()
Out[3]:
a
x    2
y    2
dtype: int64
New behavior:
In [71]: df.groupby("a", as_index=False).size()
Out[71]: 
   a  size
0  x     2
1  y     2
agg() lost results with as_index=False when relabeling columns#
Previously agg() lost the result columns when the as_index option was
set to False and the result columns were relabeled. In this case the result values were replaced with
the previous index (GH32240).
In [72]: df = pd.DataFrame({"key": ["x", "y", "z", "x", "y", "z"],
   ....:                    "val": [1.0, 0.8, 2.0, 3.0, 3.6, 0.75]})
   ....: 
In [73]: df
Out[73]: 
  key   val
0   x  1.00
1   y  0.80
2   z  2.00
3   x  3.00
4   y  3.60
5   z  0.75
Previous behavior:
In [2]: grouped = df.groupby("key", as_index=False)
In [3]: result = grouped.agg(min_val=pd.NamedAgg(column="val", aggfunc="min"))
In [4]: result
Out[4]:
     min_val
 0   x
 1   y
 2   z
New behavior:
In [74]: grouped = df.groupby("key", as_index=False)
In [75]: result = grouped.agg(min_val=pd.NamedAgg(column="val", aggfunc="min"))
In [76]: result
Out[76]: 
  key  min_val
0   x     1.00
1   y     0.80
2   z     0.75
apply and applymap on DataFrame evaluates first row/column only once#
In [77]: df = pd.DataFrame({'a': [1, 2], 'b': [3, 6]})
In [78]: def func(row):
   ....:     print(row)
   ....:     return row
   ....: 
Previous behavior:
In [4]: df.apply(func, axis=1)
a    1
b    3
Name: 0, dtype: int64
a    1
b    3
Name: 0, dtype: int64
a    2
b    6
Name: 1, dtype: int64
Out[4]:
   a  b
0  1  3
1  2  6
New behavior:
In [79]: df.apply(func, axis=1)
a    1
b    3
Name: 0, dtype: int64
a    2
b    6
Name: 1, dtype: int64
Out[79]: 
   a  b
0  1  3
1  2  6
Backwards incompatible API changes#
Added check_freq argument to testing.assert_frame_equal and testing.assert_series_equal#
The check_freq argument was added to testing.assert_frame_equal() and testing.assert_series_equal() in pandas 1.1.0 and defaults to True. testing.assert_frame_equal() and testing.assert_series_equal() now raise AssertionError if the indexes do not have the same frequency. Before pandas 1.1.0, the index frequency was not checked.
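The following sketch shows one way the new check might surface. It builds two indexes holding the same timestamps, one with a frequency and one without, which is an arrangement chosen purely for illustration.

```python
import pandas as pd

# Same timestamps, but only the first index carries a frequency.
idx_with_freq = pd.date_range("2020-01-01", periods=3, freq="D")
idx_without_freq = pd.DatetimeIndex(list(idx_with_freq))

left = pd.Series([1, 2, 3], index=idx_with_freq)
right = pd.Series([1, 2, 3], index=idx_without_freq)

# With the default check_freq=True the frequency mismatch is reported.
try:
    pd.testing.assert_series_equal(left, right)
    freq_mismatch_detected = False
except AssertionError:
    freq_mismatch_detected = True

# Passing check_freq=False restores the earlier, more lenient comparison.
pd.testing.assert_series_equal(left, right, check_freq=False)
print(freq_mismatch_detected)
```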
Increased minimum versions for dependencies#
Some minimum supported versions of dependencies were updated (GH33718, GH29766, GH29723, pytables >= 3.4.3). If installed, we now require:
| Package | Minimum Version | Required | Changed |
|---|---|---|---|
| numpy | 1.15.4 | X | X |
| pytz | 2015.4 | X | |
| python-dateutil | 2.7.3 | X | X |
| bottleneck | 1.2.1 | | |
| numexpr | 2.6.2 | | |
| pytest (dev) | 4.0.2 | | |
For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.
| Package | Minimum Version | Changed | 
|---|---|---|
| beautifulsoup4 | 4.6.0 | |
| fastparquet | 0.3.2 | |
| fsspec | 0.7.4 | |
| gcsfs | 0.6.0 | X | 
| lxml | 3.8.0 | |
| matplotlib | 2.2.2 | |
| numba | 0.46.0 | |
| openpyxl | 2.5.7 | |
| pyarrow | 0.13.0 | |
| pymysql | 0.7.1 | |
| pytables | 3.4.3 | X | 
| s3fs | 0.4.0 | X | 
| scipy | 1.2.0 | X | 
| sqlalchemy | 1.1.4 | |
| xarray | 0.8.2 | |
| xlrd | 1.1.0 | |
| xlsxwriter | 0.9.8 | |
| xlwt | 1.2.0 | |
| pandas-gbq | 1.2.0 | X | 
See Dependencies and Optional dependencies for more.
Development changes#
- The minimum version of Cython is now the most recent bug-fix version (0.29.16) (GH33334). 
Deprecations#
- Lookups on a Series with a single-item list containing a slice (e.g. ser[[slice(0, 4)]]) are deprecated and will raise in a future version. Either convert the list to a tuple, or pass the slice directly instead (GH31333)
- DataFrame.mean() and DataFrame.median() with numeric_only=None will include datetime64 and datetime64tz columns in a future version (GH29941)
- Setting values with .loc using a positional slice is deprecated and will raise in a future version. Use .loc with labels or .iloc with positions instead (GH31840)
- DataFrame.to_dict() has deprecated accepting short names for orient and will raise in a future version (GH32515)
- Categorical.to_dense() is deprecated and will be removed in a future version, use np.asarray(cat) instead (GH32639)
- The fastpath keyword in the SingleBlockManager constructor is deprecated and will be removed in a future version (GH33092)
- Providing suffixes as a set in pandas.merge() is deprecated. Provide a tuple instead (GH33740, GH34741).
- Indexing a Series with a multi-dimensional indexer like [:, None] to return an ndarray now raises a FutureWarning. Convert to a NumPy array before indexing instead (GH27837)
- Index.is_mixed() is deprecated and will be removed in a future version, check index.inferred_type directly instead (GH32922)
- Passing any arguments but the first one to read_html() as positional arguments is deprecated. All other arguments should be given as keyword arguments (GH27573).
- Passing any arguments but path_or_buf (the first one) to read_json() as positional arguments is deprecated. All other arguments should be given as keyword arguments (GH27573).
- Passing any arguments but the first two to read_excel() as positional arguments is deprecated. All other arguments should be given as keyword arguments (GH27573).
- pandas.api.types.is_categorical() is deprecated and will be removed in a future version; use pandas.api.types.is_categorical_dtype() instead (GH33385)
- Index.get_value() is deprecated and will be removed in a future version (GH19728)
- Series.dt.week and Series.dt.weekofyear are deprecated and will be removed in a future version, use Series.dt.isocalendar().week instead (GH33595)
- DatetimeIndex.week and DatetimeIndex.weekofyear are deprecated and will be removed in a future version, use DatetimeIndex.isocalendar().week instead (GH33595)
- DatetimeArray.week and DatetimeArray.weekofyear are deprecated and will be removed in a future version, use DatetimeArray.isocalendar().week instead (GH33595)
- DateOffset.__call__() is deprecated and will be removed in a future version, use offset + other instead (GH34171)
- apply_index() is deprecated and will be removed in a future version. Use offset + other instead (GH34580)
- DataFrame.tshift() and Series.tshift() are deprecated and will be removed in a future version, use DataFrame.shift() and Series.shift() instead (GH11631)
- Indexing an Index object with a float key is deprecated, and will raise an IndexError in the future. You can manually convert to an integer key instead (GH34191).
- The squeeze keyword in groupby() is deprecated and will be removed in a future version (GH32380)
- The tz keyword in Period.to_timestamp() is deprecated and will be removed in a future version; use per.to_timestamp(...).tz_localize(tz) instead (GH34522)
- DatetimeIndex.to_perioddelta() is deprecated and will be removed in a future version. Use index - index.to_period(freq).to_timestamp() instead (GH34853)
- DataFrame.melt() accepting a value_name that already exists is deprecated, and will be removed in a future version (GH34731)
- The center keyword in the DataFrame.expanding() function is deprecated and will be removed in a future version (GH20647)
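Some of the suggested replacements above can be sketched as follows. The data is made up, and only a handful of the deprecations are covered.

```python
import numpy as np
import pandas as pd

# Instead of Series.dt.week / Series.dt.weekofyear:
dates = pd.Series(pd.date_range("2020-01-01", periods=3, freq="D"))
weeks = dates.dt.isocalendar().week

# Instead of Series.tshift(): pass freq to shift() to move the index.
ts = pd.Series([1, 2, 3], index=pd.date_range("2020-01-01", periods=3, freq="D"))
shifted = ts.shift(periods=1, freq="D")

# Instead of Categorical.to_dense():
cat = pd.Categorical(["a", "b", "a"])
dense = np.asarray(cat)

# Instead of Period.to_timestamp(tz=...):
per = pd.Period("2020-07", freq="M")
stamp = per.to_timestamp().tz_localize("UTC")

print(weeks.tolist(), shifted.index[0], dense, stamp)
```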
Performance improvements#
- Performance improvement in flex arithmetic ops between - DataFrameand- Serieswith- axis=0(GH31296)
- Performance improvement in arithmetic ops between - DataFrameand- Serieswith- axis=1(GH33600)
- The internal index method - _shallow_copy()now copies cached attributes over to the new index, avoiding creating these again on the new index. This can speed up many operations that depend on creating copies of existing indexes (GH28584, GH32640, GH32669)
- Significant performance improvement when creating a - DataFramewith sparse values from- scipy.sparsematrices using the- DataFrame.sparse.from_spmatrix()constructor (GH32821, GH32825, GH32826, GH32856, GH32858).
- Performance improvement for groupby methods - first()and- last()(GH34178)
- Performance improvement in - factorize()for nullable (integer and Boolean) dtypes (GH33064).
- Performance improvement when constructing - Categoricalobjects (GH33921)
- Fixed performance regression in - pandas.qcut()and- pandas.cut()(GH33921)
- Performance improvement in reductions ( - sum,- prod,- min,- max) for nullable (integer and Boolean) dtypes (GH30982, GH33261, GH33442).
- Performance improvement in arithmetic operations between two - DataFrameobjects (GH32779)
- Performance improvement in - pandas.core.groupby.RollingGroupby(GH34052)
- Performance improvement in arithmetic operations ( - sub,- add,- mul,- div) for- MultiIndex(GH34297)
- Performance improvement in - DataFrame[bool_indexer]when- bool_indexeris a- list(GH33924)
- Significant performance improvement of io.formats.style.Styler.render() when styles are added in various ways, such as io.formats.style.Styler.apply(), io.formats.style.Styler.applymap() or io.formats.style.Styler.bar() (GH19917)
Bug fixes#
Categorical#
- Passing an invalid fill_value to Categorical.take() raises a ValueError instead of TypeError (GH33660)
- Combining a Categorical that has integer categories and contains missing values with a float dtype column in operations such as concat() or append() will now result in a float column instead of an object dtype column (GH33607)
- Bug where merge() was unable to join on non-unique categorical indices (GH28189)
- Bug when passing categorical data to Index constructor along with dtype=object incorrectly returning a CategoricalIndex instead of object-dtype Index (GH32167)
- Bug where the Categorical comparison operator __ne__ would incorrectly evaluate to False when either element was missing (GH32276)
- Categorical.fillna() now accepts Categorical other argument (GH32420)
- The repr of Categorical did not distinguish between int and str (GH33676)
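Two of the Categorical fixes above are easy to observe directly. The sketch below uses small made-up categoricals; it only illustrates the fixed behavior and does not reproduce the original bug reports.

```python
import pandas as pd

# __ne__ now treats a missing element as "not equal", even when both
# sides are missing at the same position.
left = pd.Categorical(["a", None])
right = pd.Categorical(["a", None])
not_equal = left != right

# The repr now quotes string values, so integer and string categories
# can be told apart.
int_repr = repr(pd.Categorical([1, 2]))
str_repr = repr(pd.Categorical(["1", "2"]))

print(not_equal)
print(int_repr)
print(str_repr)
```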
Datetimelike#
- Passing an integer dtype other than int64 to np.array(period_index, dtype=...) will now raise TypeError instead of incorrectly using int64 (GH32255)
- Series.to_timestamp() now raises a TypeError if the axis is not a PeriodIndex. Previously an AttributeError was raised (GH33327)
- Series.to_period() now raises a TypeError if the axis is not a DatetimeIndex. Previously an AttributeError was raised (GH33327)
- Period no longer accepts tuples for the freq argument (GH34658)
- Bug in Timestamp where constructing a Timestamp from ambiguous epoch time and calling constructor again changed the Timestamp.value() property (GH24329)
- DatetimeArray.searchsorted(), TimedeltaArray.searchsorted(), PeriodArray.searchsorted() not recognizing non-pandas scalars and incorrectly raising ValueError instead of TypeError (GH30950)
- Bug in Timestamp where constructing Timestamp with dateutil timezone less than 128 nanoseconds before daylight saving time switch from winter to summer would result in nonexistent time (GH31043)
- Bug in Period.to_timestamp(), Period.start_time() with microsecond frequency returning a timestamp one nanosecond earlier than the correct time (GH31475)
- Timestamp raised a confusing error message when year, month or day is missing (GH31200)
- Bug in - DatetimeIndexconstructor incorrectly accepting- bool-dtype inputs (GH32668)
- Bug in - DatetimeIndex.searchsorted()not accepting a- listor- Seriesas its argument (GH32762)
- Bug where - PeriodIndex()raised when passed a- Seriesof strings (GH26109)
- Bug in - Timestamparithmetic when adding or subtracting an- np.ndarraywith- timedelta64dtype (GH33296)
- Bug in - DatetimeIndex.to_period()not inferring the frequency when called with no arguments (GH33358)
- Bug in - DatetimeIndex.tz_localize()incorrectly retaining- freqin some cases where the original- freqis no longer valid (GH30511)
- Bug in - DatetimeIndex.intersection()losing- freqand timezone in some cases (GH33604)
- Bug in - DatetimeIndex.get_indexer()where incorrect output would be returned for mixed datetime-like targets (GH33741)
- Bug in - DatetimeIndexaddition and subtraction with some types of- DateOffsetobjects incorrectly retaining an invalid- freqattribute (GH33779)
- Bug in - DatetimeIndexwhere setting the- freqattribute on an index could silently change the- freqattribute on another index viewing the same data (GH33552)
- DataFrame.min() and DataFrame.max() were not returning consistent results with Series.min() and Series.max() when called on objects initialized with empty pd.to_datetime()
- Bug in - DatetimeIndex.intersection()and- TimedeltaIndex.intersection()with results not having the correct- nameattribute (GH33904)
- Bug in - DatetimeArray.__setitem__(),- TimedeltaArray.__setitem__(),- PeriodArray.__setitem__()incorrectly allowing values with- int64dtype to be silently cast (GH33717)
- Bug in subtracting - TimedeltaIndexfrom- Periodincorrectly raising- TypeErrorin some cases where it should succeed and- IncompatibleFrequencyin some cases where it should raise- TypeError(GH33883)
- Bug in constructing a - Seriesor- Indexfrom a read-only NumPy array with non-ns resolution which converted to object dtype instead of coercing to- datetime64[ns]dtype when within the timestamp bounds (GH34843).
- The - freqkeyword in- Period,- date_range(),- period_range(),- pd.tseries.frequencies.to_offset()no longer allows tuples, pass as string instead (GH34703)
- Bug in - DataFrame.append()when appending a- Seriescontaining a scalar tz-aware- Timestampto an empty- DataFrameresulted in an object column instead of- datetime64[ns, tz]dtype (GH35038)
- OutOfBoundsDatetime issues an improved error message when a timestamp is out of implementation bounds (GH32967)
- Bug in - AbstractHolidayCalendar.holidays()when no rules were defined (GH31415)
- Bug in - Tickcomparisons raising- TypeErrorwhen comparing against timedelta-like objects (GH34088)
- Bug in - Tickmultiplication raising- TypeErrorwhen multiplying by a float (GH34486)
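As a rough illustration of two of the Datetimelike entries above, the sketch below checks that Series.to_timestamp() and Series.to_period() raise TypeError on a non-datetime-like index, and that DatetimeIndex.to_period() infers a frequency when none is passed. The sample values are invented for the example.

```python
import pandas as pd

ser = pd.Series([1, 2, 3])  # default RangeIndex, neither Period nor Datetime

# Record which exception type each conversion raises on the wrong index type.
raised = {}
for name in ("to_timestamp", "to_period"):
    try:
        getattr(ser, name)()
        raised[name] = None
    except TypeError as err:
        raised[name] = type(err).__name__

# No ``freq`` is stored on this index; to_period() infers a daily frequency.
dti = pd.DatetimeIndex(["2020-01-01", "2020-01-02", "2020-01-03"])
periods = dti.to_period()

print(raised)
print(periods)
```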
Timedelta#
- Bug in constructing a Timedelta with a high precision integer that would round the Timedelta components (GH31354)
- Bug in dividing np.nan or None by Timedelta incorrectly returning NaT (GH31869)
- Timedelta now understands µs as an identifier for microsecond (GH32899)
- Timedelta string representation now includes nanoseconds, when nanoseconds are non-zero (GH9309)
- Bug in comparing a Timedelta object against an np.ndarray with timedelta64 dtype incorrectly viewing all entries as unequal (GH33441)
- Bug in timedelta_range() that produced an extra point on an edge case (GH30353, GH33498)
- Bug in DataFrame.resample() that produced an extra point on an edge case (GH30353, GH13022, GH33498)
- Bug in DataFrame.resample() that ignored the loffset argument when dealing with timedelta (GH7687, GH33498)
- Bug in Timedelta and pandas.to_timedelta() that ignored the unit argument for string input (GH12136)
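The change to the Timedelta string representation can be seen with a one-nanosecond value. This is a small sketch with arbitrary values; a whole-second Timedelta is included for contrast.

```python
import pandas as pd

# A non-zero nanosecond component now shows up in the string form.
with_nanos = str(pd.Timedelta(1, unit="ns"))

# A value with no sub-second component keeps the short form.
whole_second = str(pd.Timedelta(seconds=1))

print(with_nanos)
print(whole_second)
```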
Timezones#
- Bug in - to_datetime()with- infer_datetime_format=Truewhere timezone names (e.g.- UTC) would not be parsed correctly (GH33133)
Numeric#
- Bug in - DataFrame.floordiv()with- axis=0not treating division-by-zero like- Series.floordiv()(GH31271)
- Bug in to_numeric() with string argument "uint64" and errors="coerce" silently failing (GH32394)
- Bug in to_numeric() with downcast="unsigned" failing for empty data (GH32493)
- Bug in DataFrame.mean() with numeric_only=False and either datetime64 dtype or PeriodDtype column incorrectly raising TypeError (GH32426)
- Bug in DataFrame.count() with level="foo" and index level "foo" containing NaNs causing a segmentation fault (GH21824)
- Bug in - DataFrame.diff()with- axis=1returning incorrect results with mixed dtypes (GH32995)
- Bug in - DataFrame.corr()and- DataFrame.cov()raising when handling nullable integer columns with- pandas.NA(GH33803)
- Bug in arithmetic operations between - DataFrameobjects with non-overlapping columns with duplicate labels causing an infinite loop (GH35194)
- Bug in - DataFrameand- Seriesaddition and subtraction between object-dtype objects and- datetime64dtype objects (GH33824)
- Bug in - Index.difference()giving incorrect results when comparing a- Float64Indexand object- Index(GH35217)
- Bug in - DataFramereductions (e.g.- df.min(),- df.max()) with- ExtensionArraydtypes (GH34520, GH32651)
- Series.interpolate() and DataFrame.interpolate() now raise a ValueError if limit_direction is 'forward' or 'both' and method is 'backfill' or 'bfill', or if limit_direction is 'backward' or 'both' and method is 'pad' or 'ffill' (GH34746)
Conversion#
- Bug in - Seriesconstruction from NumPy array with big-endian- datetime64dtype (GH29684)
- Bug in - Timedeltaconstruction with large nanoseconds keyword value (GH32402)
- Bug in DataFrame construction where sets would be duplicated rather than raising (GH32582)
- The DataFrame constructor no longer accepts a list of DataFrame objects. Because of changes to NumPy, DataFrame objects are now consistently treated as 2D objects, so a list of DataFrame objects is considered 3D, and no longer acceptable for the DataFrame constructor (GH32289).
- Bug in DataFrame when initializing a frame with lists and assigning columns with a nested list for MultiIndex (GH32173)
- Improved error message for invalid construction of list when creating a new index (GH35190)
Strings#
- Bug in the astype() method when converting “string” dtype data to nullable integer dtype (GH32450).
- Fixed issue where taking min or max of a StringArray or Series with StringDtype type would raise (GH31746)
- Bug in Series.str.cat() returning NaN output when other had Index type (GH33425)
- pandas.api.types.is_string_dtype() no longer incorrectly identifies categorical series as string.
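Two of the string fixes above can be exercised in a few lines. The sketch below uses invented sample values: it takes the min and max of a Series with "string" dtype and concatenates a Series with an Index via Series.str.cat().

```python
import pandas as pd

# min/max on StringDtype data no longer raise.
ser = pd.Series(["b", "a", "c"], dtype="string")
smallest, largest = ser.min(), ser.max()

# Passing an Index as ``others`` concatenates element-wise instead of
# producing NaN.
joined = pd.Series(["a", "b"]).str.cat(pd.Index(["x", "y"]))

print(smallest, largest)
print(joined)
```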
Interval#
- Bug in - IntervalArrayincorrectly allowing the underlying data to be changed when setting values (GH32782)
Indexing#
- DataFrame.xs()now raises a- TypeErrorif a- levelkeyword is supplied and the axis is not a- MultiIndex. Previously an- AttributeErrorwas raised (GH33610)
- Bug in slicing on a - DatetimeIndexwith a partial-timestamp dropping high-resolution indices near the end of a year, quarter, or month (GH31064)
- Bug in - PeriodIndex.get_loc()treating higher-resolution strings differently from- PeriodIndex.get_value()(GH31172)
- Bug in - Series.at()and- DataFrame.at()not matching- .locbehavior when looking up an integer in a- Float64Index(GH31329)
- Bug in - PeriodIndex.is_monotonic()incorrectly returning- Truewhen containing leading- NaTentries (GH31437)
- Bug in - DatetimeIndex.get_loc()raising- KeyErrorwith converted-integer key instead of the user-passed key (GH31425)
- Bug in - Series.xs()incorrectly returning- Timestampinstead of- datetime64in some object-dtype cases (GH31630)
- Bug in - DataFrame.iat()incorrectly returning- Timestampinstead of- datetimein some object-dtype cases (GH32809)
- Bug in - DataFrame.at()when either columns or index is non-unique (GH33041)
- Bug in Series.loc() and DataFrame.loc() when indexing with an integer key on an object-dtype Index that is not all-integers (GH31905)
- Bug in - DataFrame.iloc.__setitem__()on a- DataFramewith duplicate columns incorrectly setting values for all matching columns (GH15686, GH22036)
- Bug in - DataFrame.loc()and- Series.loc()with a- DatetimeIndex,- TimedeltaIndex, or- PeriodIndexincorrectly allowing lookups of non-matching datetime-like dtypes (GH32650)
- Bug in - Series.__getitem__()indexing with non-standard scalars, e.g.- np.dtype(GH32684)
- Bug in - Indexconstructor where an unhelpful error message was raised for NumPy scalars (GH33017)
- Bug in - DataFrame.lookup()incorrectly raising an- AttributeErrorwhen- frame.indexor- frame.columnsis not unique; this will now raise a- ValueErrorwith a helpful error message (GH33041)
- Bug in - Intervalwhere a- Timedeltacould not be added or subtracted from a- Timestampinterval (GH32023)
- Bug in DataFrame.copy() not invalidating _item_cache after copy, which caused post-copy value updates to not be reflected (GH31784)
- Fixed regression in - DataFrame.loc()and- Series.loc()throwing an error when a- datetime64[ns, tz]value is provided (GH32395)
- Bug in - Series.__getitem__()with an integer key and a- MultiIndexwith leading integer level failing to raise- KeyErrorif the key is not present in the first level (GH33355)
- Bug in - DataFrame.iloc()when slicing a single column- DataFramewith- ExtensionDtype(e.g.- df.iloc[:, :1]) returning an invalid result (GH32957)
- Bug in - DatetimeIndex.insert()and- TimedeltaIndex.insert()causing index- freqto be lost when setting an element into an empty- Series(GH33573)
- Bug in - Series.__setitem__()with an- IntervalIndexand a list-like key of integers (GH33473)
- Bug in Series.__getitem__() allowing missing labels with np.ndarray, Index, and Series indexers but not list. These now all raise KeyError (GH33646)
- Bug in DataFrame.truncate() and Series.truncate() where index was assumed to be monotone increasing (GH33756)
- Indexing with a list of strings representing datetimes failed on DatetimeIndex or PeriodIndex (GH11278)
- Bug in Series.at() where using it with a MultiIndex would raise an exception on valid inputs (GH26989)
- Bug in DataFrame.loc() with a dictionary of values changing columns with dtype of int to float (GH34573)
- Bug in Series.loc() where using it with a MultiIndex would raise an IndexingError when accessing a None value (GH34318)
- Bug in DataFrame.reset_index() and Series.reset_index() not preserving data types on an empty DataFrame or Series with a MultiIndex (GH19602)
- Bug in Series and DataFrame indexing with a time key on a DatetimeIndex with NaT entries (GH35114)
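The error-type changes described in the Indexing list can be checked with a short sketch. The frame, series, and labels below are made up for illustration: DataFrame.xs() with a level keyword on a flat index raises TypeError, and Series.__getitem__() raises KeyError for a missing label whether the indexer is a list, an ndarray, or an Index.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1, 2]}, index=["a", "b"])  # flat Index, no MultiIndex

# ``level`` only makes sense for a MultiIndex, so this raises TypeError.
try:
    df.xs("a", level=0)
    xs_error = None
except TypeError as err:
    xs_error = type(err).__name__

ser = pd.Series([10, 20], index=["a", "b"])

# "z" is not a label of ``ser``; every list-like indexer type raises KeyError.
key_errors = []
for indexer in (["a", "z"], np.array(["a", "z"]), pd.Index(["a", "z"])):
    try:
        ser[indexer]
        key_errors.append(False)
    except KeyError:
        key_errors.append(True)

print(xs_error, key_errors)
```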
Missing#
- Calling - fillna()on an empty- Seriesnow correctly returns a shallow copied object. The behaviour is now consistent with- Index,- DataFrameand a non-empty- Series(GH32543).
- Bug in Series.replace() raising a TypeError when the argument to_replace is of type dict/list and the Series contains <NA>. The method now handles this by ignoring <NA> values when doing the comparison for the replacement (GH32621)
- Bug in any() and all() incorrectly returning <NA> for all False or all True values using the nullable Boolean dtype and with skipna=False (GH33253)
- Clarified documentation on interpolate with method=akima. The der parameter must be scalar or None (GH33426)
- DataFrame.interpolate() now uses the correct axis convention. Previously, interpolating along columns led to interpolation along indices and vice versa. Furthermore, interpolating with methods pad, ffill, bfill and backfill is now identical to using these methods with DataFrame.fillna() (GH12918, GH29146)
- Bug in DataFrame.interpolate() raising a ValueError when called on a DataFrame with column names of string type. The method is now independent of the type of the column names (GH33956)
- Passing - NAinto a format string using format specs will now work. For example- "{:.1f}".format(pd.NA)would previously raise a- ValueError, but will now return the string- "<NA>"(GH34740)
- Bug in - Series.map()not raising on invalid- na_action(GH32815)
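The format-spec behavior of NA mentioned above can be sketched as below. The first call mirrors the example given in the note; the second, with a width-only spec, is an additional illustration.

```python
import pandas as pd

# A numeric format spec that cannot apply to NA falls back to "<NA>".
formatted = "{:.1f}".format(pd.NA)

# A spec that is valid for strings (here: right-align in 6 characters)
# is applied to the "<NA>" text.
padded = "{:>6}".format(pd.NA)

print(repr(formatted))
print(repr(padded))
```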
MultiIndex#
- DataFrame.swaplevel() now raises a TypeError if the axis is not a MultiIndex. Previously an AttributeError was raised (GH31126)
- Bug in DataFrame.loc() when used with a MultiIndex. The returned values were not in the same order as the given inputs (GH22797)
In [80]: df = pd.DataFrame(np.arange(4),
   ....:                   index=[["a", "a", "b", "b"], [1, 2, 1, 2]])
   ....: 
# Rows are now ordered as the requested keys
In [81]: df.loc[(['b', 'a'], [2, 1]), :]
Out[81]: 
     0
b 2  3
  1  2
a 2  1
  1  0
- Bug in DataFrame.truncate() was dropping MultiIndex names (GH34564)
- Bug in MultiIndex.intersection() was not guaranteed to preserve order when sort=False (GH31325)
In [82]: left = pd.MultiIndex.from_arrays([["b", "a"], [2, 1]])
In [83]: right = pd.MultiIndex.from_arrays([["a", "b", "c"], [1, 2, 3]])
# Common elements are now guaranteed to be ordered by the left side
In [84]: left.intersection(right, sort=False)
Out[84]: 
MultiIndex([('b', 2),
            ('a', 1)],
           )
- Bug when joining two MultiIndex without specifying level with different columns. The return_indexers parameter was ignored (GH34074)
IO#
- Passing a set as the names argument to pandas.read_csv(), pandas.read_table(), or pandas.read_fwf() will raise ValueError: Names should be an ordered collection. (GH34946)
- Bug in print-out when display.precision is zero (GH20359)
- Bug in read_json() where integer overflow occurred when the JSON contained big number strings (GH30320)
- read_csv() will now raise a ValueError when the arguments header and prefix are both not None (GH27394)
- Bug in - DataFrame.to_json()was raising- NotFoundErrorwhen- path_or_bufwas an S3 URI (GH28375)
- Bug in - DataFrame.to_parquet()overwriting pyarrow’s default for- coerce_timestamps; following pyarrow’s default allows writing nanosecond timestamps with- version="2.0"(GH31652).
- Bug in - read_csv()was raising- TypeErrorwhen- sep=Nonewas used in combination with- commentkeyword (GH31396)
- Bug in - HDFStorethat caused it to set to- int64the dtype of a- datetime64column when reading a- DataFramein Python 3 from fixed format written in Python 2 (GH31750)
- read_sas()now handles dates and datetimes larger than- Timestamp.maxreturning them as- datetime.datetimeobjects (GH20927)
- Bug in - DataFrame.to_json()where- Timedeltaobjects would not be serialized correctly with- date_format="iso"(GH28256)
- read_csv() will raise a ValueError when the column names passed in parse_dates are missing in the DataFrame (GH31251)
- Bug in - read_excel()where a UTF-8 string with a high surrogate would cause a segmentation violation (GH23809)
- Bug in - read_csv()was causing a file descriptor leak on an empty file (GH31488)
- Bug in - read_csv()was causing a segfault when there were blank lines between the header and data rows (GH28071)
- Bug in - read_csv()was raising a misleading exception on a permissions issue (GH23784)
- Bug in read_csv() was raising an IndexError when header=None and there were two extra data columns
- Bug in - read_sas()was raising an- AttributeErrorwhen reading files from Google Cloud Storage (GH33069)
- Bug in - DataFrame.to_sql()where an- AttributeErrorwas raised when saving an out of bounds date (GH26761)
- Bug in - read_excel()did not correctly handle multiple embedded spaces in OpenDocument text cells. (GH32207)
- Bug in - read_json()was raising- TypeErrorwhen reading a- listof Booleans into a- Series. (GH31464)
- Bug in - pandas.io.json.json_normalize()where location specified by- record_pathdoesn’t point to an array. (GH26284)
- pandas.read_hdf()has a more explicit error message when loading an unsupported HDF file (GH9539)
- Bug in - read_feather()was raising an- ArrowIOErrorwhen reading an s3 or http file path (GH29055)
- Bug in to_excel() where the column name render could not be handled and a KeyError was raised (GH34331)
- Bug in - execute()was raising a- ProgrammingErrorfor some DB-API drivers when the SQL statement contained the- %character and no parameters were present (GH34211)
- Bug in - StataReader()which resulted in categorical variables with different dtypes when reading data using an iterator. (GH31544)
- HDFStore.keys() now has an optional include parameter that allows the retrieval of all native HDF5 table names (GH29916)
- TypeError exceptions raised by read_csv() and read_table() were showing as parser_f when an unexpected keyword argument was passed (GH25648)
- Bug in read_excel() for ODS files removing 0.0 values (GH27222)
- Bug in - ujson.encode()was raising an- OverflowErrorwith numbers larger than- sys.maxsize(GH34395)
- Bug in - HDFStore.append_to_multiple()was raising a- ValueErrorwhen the- min_itemsizeparameter is set (GH11238)
- create_table() now raises an error when the column argument was not specified in data_columns on input (GH28156)
- read_json() can now read a line-delimited JSON file from a file URL when lines and chunksize are set.
- DataFrame.to_sql() now raises a more explicit ValueError when given DataFrames with -np.inf entries with MySQL (GH34431)
- Bug where capitalised file extensions were not decompressed by read_* functions (GH35164)
- Bug in - read_excel()that was raising a- TypeErrorwhen- header=Noneand- index_colis given as a- list(GH31783)
- Bug in - read_excel()where datetime values are used in the header in a- MultiIndex(GH34748)
- read_excel()no longer takes- **kwdsarguments. This means that passing in the keyword argument- chunksizenow raises a- TypeError(previously raised a- NotImplementedError), while passing in the keyword argument- encodingnow raises a- TypeError(GH34464)
- Bug in - DataFrame.to_records()was incorrectly losing timezone information in timezone-aware- datetime64columns (GH32535)
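The names validation described at the top of the IO list can be sketched with an in-memory CSV. The data and column names are invented for the example; a set is rejected because it has no defined order, while a list is accepted.

```python
import io

import pandas as pd

data = "1,2\n3,4\n"

# A set has no defined order, so it is rejected as ``names``.
message = None
try:
    pd.read_csv(io.StringIO(data), names={"a", "b"})
except ValueError as err:
    message = str(err)

# An ordered collection such as a list works as before.
df = pd.read_csv(io.StringIO(data), names=["a", "b"])

print(message)
print(df)
```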
Plotting#
- DataFrame.plot()for line/bar now accepts color by dictionary (GH8193).
- Bug in - DataFrame.plot.hist()where weights are not working for multiple columns (GH33173)
- Bug in - DataFrame.boxplot()and- DataFrame.plot.boxplot()lost color attributes of- medianprops,- whiskerprops,- cappropsand- boxprops(GH30346)
- Bug in - DataFrame.hist()where the order of- columnargument was ignored (GH29235)
- Bug in DataFrame.plot.scatter() where colorbars always used the first cmap when adding multiple plots with different cmap (GH33389)
- Bug in - DataFrame.plot.scatter()was adding a colorbar to the plot even if the argument- cwas assigned to a column containing color names (GH34316)
- Bug in - pandas.plotting.bootstrap_plot()was causing cluttered axes and overlapping labels (GH34905)
- Bug in - DataFrame.plot.scatter()caused an error when plotting variable marker sizes (GH32904)
GroupBy/resample/rolling#
- Using a - pandas.api.indexers.BaseIndexerwith- count,- min,- max,- median,- skew,- cov,- corrwill now return correct results for any monotonic- pandas.api.indexers.BaseIndexerdescendant (GH32865)
- DataFrameGroupBy.mean() and SeriesGroupBy.mean() (and similarly median(), std() and var()) now raise a TypeError if a non-accepted keyword argument is passed. Previously an UnsupportedFunctionCall was raised (AssertionError if min_count was passed into median()) (GH31485)
- Bug in DataFrameGroupBy.apply() and SeriesGroupBy.apply() raising ValueError when the by axis is not sorted, has duplicates, and the applied func does not mutate passed in objects (GH30667)
- Bug in DataFrameGroupBy.transform() producing an incorrect result with transformation functions (GH30918)
- Bug in - DataFrameGroupBy.transform()and- SeriesGroupBy.transform()were returning the wrong result when grouping by multiple keys of which some were categorical and others not (GH32494)
- Bug in - DataFrameGroupBy.count()and- SeriesGroupBy.count()causing segmentation fault when grouped-by columns contain NaNs (GH32841)
- Bug in DataFrame.groupby() and Series.groupby() producing an inconsistent type when aggregating a Boolean Series (GH32894)
- Bug in - DataFrameGroupBy.sum()and- SeriesGroupBy.sum()where a large negative number would be returned when the number of non-null values was below- min_countfor nullable integer dtypes (GH32861)
- Bug in - SeriesGroupBy.quantile()was raising on nullable integers (GH33136)
- Bug in - DataFrame.resample()where an- AmbiguousTimeErrorwould be raised when the resulting timezone aware- DatetimeIndexhad a DST transition at midnight (GH25758)
- Bug in - DataFrame.groupby()where a- ValueErrorwould be raised when grouping by a categorical column with read-only categories and- sort=False(GH33410)
- Bug in - DataFrameGroupBy.agg(),- SeriesGroupBy.agg(),- DataFrameGroupBy.transform(),- SeriesGroupBy.transform(),- DataFrameGroupBy.resample(), and- SeriesGroupBy.resample()where subclasses are not preserved (GH28330)
- Bug in SeriesGroupBy.agg() where any column name was accepted in the named aggregation of SeriesGroupBy. Now only str and callables are allowed; anything else raises a TypeError (GH34422)
- Bug in DataFrame.groupby() losing the name of the Index when one of the agg keys referenced an empty list (GH32580)
- Bug in - Rolling.apply()where- center=Truewas ignored when- engine='numba'was specified (GH34784)
- Bug in - DataFrame.ewm.cov()was throwing- AssertionErrorfor- MultiIndexinputs (GH34440)
- Bug in - core.groupby.DataFrameGroupBy.quantile()raised- TypeErrorfor non-numeric types rather than dropping the columns (GH27892)
- Bug in - core.groupby.DataFrameGroupBy.transform()when- func='nunique'and columns are of type- datetime64, the result would also be of type- datetime64instead of- int64(GH35109)
- Bug in - DataFrame.groupby()raising an- AttributeErrorwhen selecting a column and aggregating with- as_index=False(GH35246).
- Bug in DataFrameGroupBy.first() and DataFrameGroupBy.last() that would raise an unnecessary ValueError when grouping on multiple Categoricals (GH34951)
Reshaping#
- Bug affecting all numeric and Boolean reduction methods not returning the subclassed data type (GH25596)
- Bug in DataFrame.pivot_table() when only MultiIndexed columns is set (GH17038)
- DataFrame.unstack() and Series.unstack() can now take tuple names in MultiIndexed data (GH19966)
- Bug in DataFrame.pivot_table() when margin is True and only column is defined (GH31016)
- Fixed incorrect error message in - DataFrame.pivot()when- columnsis set to- None. (GH30924)
- Bug in - crosstab()when inputs are two- Seriesand have tuple names, the output will keep a dummy- MultiIndexas columns. (GH18321)
- DataFrame.pivot()can now take lists for- indexand- columnsarguments (GH21425)
- Bug in - concat()where the resulting indices are not copied when- copy=True(GH29879)
- Bug in - SeriesGroupBy.aggregate()was resulting in aggregations being overwritten when they shared the same name (GH30880)
- Bug where - Index.astype()would lose the- nameattribute when converting from- Float64Indexto- Int64Index, or when casting to an- ExtensionArraydtype (GH32013)
- Series.append()will now raise a- TypeErrorwhen passed a- DataFrameor a sequence containing- DataFrame(GH31413)
- DataFrame.replace()and- Series.replace()will raise a- TypeErrorif- to_replaceis not an expected type. Previously the- replacewould fail silently (GH18634)
- Bug in an in-place operation on a Series that was adding a column to the DataFrame from which it was originally dropped (using inplace=True) (GH30484)
- Bug in - DataFrame.apply()where callback was called with- Seriesparameter even though- raw=Truerequested. (GH32423)
- Bug in - DataFrame.pivot_table()losing timezone information when creating a- MultiIndexlevel from a column with timezone-aware dtype (GH32558)
- Bug in concat() where passing a non-dict mapping as objs would raise a TypeError (GH32863)
- DataFrame.agg()now provides more descriptive- SpecificationErrormessage when attempting to aggregate a non-existent column (GH32755)
- Bug in - DataFrame.unstack()when- MultiIndexcolumns and- MultiIndexrows were used (GH32624, GH24729 and GH28306)
- Appending a dictionary to a - DataFramewithout passing- ignore_index=Truewill raise- TypeError: Can only append a dict if ignore_index=Trueinstead of- TypeError: Can only append a :class:`Series` if ignore_index=True or if the :class:`Series` has a name(GH30871)
- Bug in - DataFrame.corrwith(),- DataFrame.memory_usage(),- DataFrame.dot(),- DataFrame.idxmin(),- DataFrame.idxmax(),- DataFrame.duplicated(),- DataFrame.isin(),- DataFrame.count(),- Series.explode(),- Series.asof()and- DataFrame.asof()not returning subclassed types. (GH31331)
- Bug in - concat()was not allowing for concatenation of- DataFrameand- Serieswith duplicate keys (GH33654)
- Bug in - cut()raised an error when the argument- labelscontains duplicates (GH33141)
- Bug in DataFrame.aggregate() and Series.aggregate() was causing a recursive loop in some cases (GH34224)
- Fixed bug in - melt()where melting- MultiIndexcolumns with- col_level > 0would raise a- KeyErroron- id_vars(GH34129)
- Bug in - Series.where()with an empty- Seriesand empty- condhaving non-bool dtype (GH34592)
- Fixed regression where - DataFrame.apply()would raise- ValueErrorfor elements with- Sdtype (GH34529)
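The list support in DataFrame.pivot() noted in the Reshaping list can be sketched as follows. The column names and values are hypothetical; the point is only that index accepts a list, which produces a MultiIndex on the rows.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "year": [2019, 2019, 2020, 2020],
        "quarter": ["Q1", "Q1", "Q1", "Q1"],
        "region": ["east", "west", "east", "west"],
        "sales": [1, 2, 3, 4],
    }
)

# ``index`` is a list here, so the result is indexed by (year, quarter).
wide = df.pivot(index=["year", "quarter"], columns="region", values="sales")

print(wide)
```

Keyword arguments are used throughout, which keeps the call valid on versions where the parameters of pivot() are keyword-only.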
Sparse#
- Creating a - SparseArrayfrom timezone-aware dtype will issue a warning before dropping timezone information, instead of doing so silently (GH32501)
- Bug in - arrays.SparseArray.from_spmatrix()wrongly read scipy sparse matrix (GH31991)
- Bug in - Series.sum()with- SparseArrayraised a- TypeError(GH25777)
- Bug where a DataFrame containing an all-sparse SparseArray was filled with NaN when indexed by a list-like (GH27781, GH29563)
- The repr of - SparseDtypenow includes the repr of its- fill_valueattribute. Previously it used- fill_value’s string representation (GH34352)
- Bug where empty - DataFramecould not be cast to- SparseDtype(GH33113)
- Bug in - arrays.SparseArray()was returning the incorrect type when indexing a sparse dataframe with an iterable (GH34526, GH34540)
ExtensionArray#
- Fixed bug where - Series.value_counts()would raise on empty input of- Int64dtype (GH33317)
- Fixed bug in - concat()when concatenating- DataFrameobjects with non-overlapping columns resulting in object-dtype columns rather than preserving the extension dtype (GH27692, GH33027)
- Fixed bug where - StringArray.isna()would return- Falsefor NA values when- pandas.options.mode.use_inf_as_nawas set to- True(GH33655)
- Fixed bug where Series construction with an EA dtype and an index but no data or scalar data failed (GH26469)
- Fixed bug that caused - Series.__repr__()to crash for extension types whose elements are multidimensional arrays (GH33770).
- Fixed bug where - Series.update()would raise a- ValueErrorfor- ExtensionArraydtypes with missing values (GH33980)
- Fixed bug where - StringArray.memory_usage()was not implemented (GH33963)
- Fixed bug where - DataFrameGroupBy()would ignore the- min_countargument for aggregations on nullable Boolean dtypes (GH34051)
- Fixed bug where the constructor of - DataFramewith- dtype='string'would fail (GH27953, GH33623)
- Bug where - DataFramecolumn set to scalar extension type was considered an object type rather than the extension type (GH34832)
- Fixed bug in - IntegerArray.astype()to correctly copy the mask as well (GH34931).
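Two of the ExtensionArray fixes above can be checked with a short sketch. The data is invented for the example: value_counts() on an empty Int64 Series returns an empty result instead of raising, and the DataFrame constructor accepts dtype="string".

```python
import pandas as pd

# Empty nullable-integer input no longer raises in value_counts().
empty_counts = pd.Series([], dtype="Int64").value_counts()

# The DataFrame constructor now accepts dtype="string"; None becomes <NA>.
df = pd.DataFrame({"a": ["x", "y"], "b": ["z", None]}, dtype="string")

print(len(empty_counts))
print(df.dtypes)
```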
Other#
- Set operations on an object-dtype - Indexnow always return object-dtype results (GH31401)
- Fixed - pandas.testing.assert_series_equal()to correctly raise if the- leftargument is a different subclass with- check_series_type=True(GH32670).
- Getting a missing attribute in a - DataFrame.query()or- DataFrame.eval()string raises the correct- AttributeError(GH32408)
- Fixed bug in - pandas.testing.assert_series_equal()where dtypes were checked for- Intervaland- ExtensionArrayoperands when- check_dtypewas- False(GH32747)
- Bug in - DataFrame.__dir__()caused a segfault when using unicode surrogates in a column name (GH25509)
- Bug in - DataFrame.equals()and- Series.equals()in allowing subclasses to be equal (GH34402).
Contributors#
A total of 368 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.
- 3vts + 
- A Brooks + 
- Abbie Popa + 
- Achmad Syarif Hidayatullah + 
- Adam W Bagaskarta + 
- Adrian Mastronardi + 
- Aidan Montare + 
- Akbar Septriyan + 
- Akos Furton + 
- Alejandro Hall + 
- Alex Hall + 
- Alex Itkes + 
- Alex Kirko 
- Ali McMaster + 
- Alvaro Aleman + 
- Amy Graham + 
- Andrew Schonfeld + 
- Andrew Shumanskiy + 
- Andrew Wieteska + 
- Angela Ambroz 
- Anjali Singh + 
- Anna Daglis 
- Anthony Milbourne + 
- Antony Lee + 
- Ari Sosnovsky + 
- Arkadeep Adhikari + 
- Arunim Samudra + 
- Ashkan + 
- Ashwin Prakash Nalwade + 
- Ashwin Srinath + 
- Atsushi Nukariya + 
- Ayappan + 
- Ayla Khan + 
- Bart + 
- Bart Broere + 
- Benjamin Beier Liu + 
- Benjamin Fischer + 
- Bharat Raghunathan 
- Bradley Dice + 
- Brendan Sullivan + 
- Brian Strand + 
- Carsten van Weelden + 
- Chamoun Saoma + 
- ChrisRobo + 
- Christian Chwala 
- Christopher Whelan 
- Christos Petropoulos + 
- Chuanzhu Xu 
- CloseChoice + 
- Clément Robert + 
- CuylenE + 
- DanBasson + 
- Daniel Saxton 
- Danilo Horta + 
- DavaIlhamHaeruzaman + 
- Dave Hirschfeld 
- Dave Hughes 
- David Rouquet + 
- David S + 
- Deepyaman Datta 
- Dennis Bakhuis + 
- Derek McCammond + 
- Devjeet Roy + 
- Diane Trout 
- Dina + 
- Dom + 
- Drew Seibert + 
- EdAbati 
- Emiliano Jordan + 
- Erfan Nariman + 
- Eric Groszman + 
- Erik Hasse + 
- Erkam Uyanik + 
- Evan D + 
- Evan Kanter + 
- Fangchen Li + 
- Farhan Reynaldo + 
- Farhan Reynaldo Hutabarat + 
- Florian Jetter + 
- Fred Reiss + 
- GYHHAHA + 
- Gabriel Moreira + 
- Gabriel Tutui + 
- Galuh Sahid 
- Gaurav Chauhan + 
- George Hartzell + 
- Gim Seng + 
- Giovanni Lanzani + 
- Gordon Chen + 
- Graham Wetzler + 
- Guillaume Lemaitre 
- Guillem Sánchez + 
- HH-MWB + 
- Harshavardhan Bachina 
- How Si Wei 
- Ian Eaves 
- Iqrar Agalosi Nureyza + 
- Irv Lustig 
- Iva Laginja + 
- JDkuba 
- Jack Greisman + 
- Jacob Austin + 
- Jacob Deppen + 
- Jacob Peacock + 
- Jake Tae + 
- Jake Vanderplas + 
- James Cobon-Kerr 
- Jan Červenka + 
- Jan Škoda 
- Jane Chen + 
- Jean-Francois Zinque + 
- Jeanderson Barros Candido + 
- Jeff Reback 
- Jered Dominguez-Trujillo + 
- Jeremy Schendel 
- Jesse Farnham 
- Jiaxiang 
- Jihwan Song + 
- Joaquim L. Viegas + 
- Joel Nothman 
- John Bodley + 
- John Paton + 
- Jon Thielen + 
- Joris Van den Bossche 
- Jose Manuel Martí + 
- Joseph Gulian + 
- Josh Dimarsky 
- Joy Bhalla + 
- João Veiga + 
- Julian de Ruiter + 
- Justin Essert + 
- Justin Zheng 
- KD-dev-lab + 
- Kaiqi Dong 
- Karthik Mathur + 
- Kaushal Rohit + 
- Kee Chong Tan 
- Ken Mankoff + 
- Kendall Masse 
- Kenny Huynh + 
- Ketan + 
- Kevin Anderson + 
- Kevin Bowey + 
- Kevin Sheppard 
- Kilian Lieret + 
- Koki Nishihara + 
- Krishna Chivukula + 
- KrishnaSai2020 + 
- Lesley + 
- Lewis Cowles + 
- Linda Chen + 
- Linxiao Wu + 
- Lucca Delchiaro Costabile + 
- MBrouns + 
- Mabel Villalba 
- Mabroor Ahmed + 
- Madhuri Palanivelu + 
- Mak Sze Chun 
- Malcolm + 
- Marc Garcia 
- Marco Gorelli 
- Marian Denes + 
- Martin Bjeldbak Madsen + 
- Martin Durant + 
- Martin Fleischmann + 
- Martin Jones + 
- Martin Winkel 
- Martina Oefelein + 
- Marvzinc + 
- María Marino + 
- Matheus Cardoso + 
- Mathis Felardos + 
- Matt Roeschke 
- Matteo Felici + 
- Matteo Santamaria + 
- Matthew Roeschke 
- Matthias Bussonnier 
- Max Chen 
- Max Halford + 
- Mayank Bisht + 
- Megan Thong + 
- Michael Marino + 
- Miguel Marques + 
- Mike Kutzma 
- Mohammad Hasnain Mohsin Rajan + 
- Mohammad Jafar Mashhadi + 
- MomIsBestFriend 
- Monica + 
- Natalie Jann 
- Nate Armstrong + 
- Nathanael + 
- Nick Newman + 
- Nico Schlömer + 
- Niklas Weber + 
- ObliviousParadigm + 
- Olga Lyashevska + 
- OlivierLuG + 
- Pandas Development Team 
- Parallels + 
- Patrick + 
- Patrick Cando + 
- Paul Lilley + 
- Paul Sanders + 
- Pearcekieser + 
- Pedro Larroy + 
- Pedro Reys 
- Peter Bull + 
- Peter Steinbach + 
- Phan Duc Nhat Minh + 
- Phil Kirlin + 
- Pierre-Yves Bourguignon + 
- Piotr Kasprzyk + 
- Piotr Niełacny + 
- Prakhar Pandey 
- Prashant Anand + 
- Puneetha Pai + 
- Quang Nguyễn + 
- Rafael Jaimes III + 
- Rafif + 
- RaisaDZ + 
- Rakshit Naidu + 
- Ram Rachum + 
- Red + 
- Ricardo Alanis + 
- Richard Shadrach + 
- Rik-de-Kort 
- Robert de Vries 
- Robin to Roxel + 
- Roger Erens + 
- Rohith295 + 
- Roman Yurchak 
- Ror + 
- Rushabh Vasani 
- Ryan 
- Ryan Nazareth 
- SAI SRAVAN MEDICHERLA + 
- SHUBH CHATTERJEE + 
- Sam Cohan 
- Samira-g-js + 
- Sandu Ursu + 
- Sang Agung + 
- SanthoshBala18 + 
- Sasidhar Kasturi + 
- SatheeshKumar Mohan + 
- Saul Shanabrook 
- Scott Gigante + 
- Sebastian Berg + 
- Sebastián Vanrell 
- Sergei Chipiga + 
- Sergey + 
- ShilpaSugan + 
- Simon Gibbons 
- Simon Hawkins 
- Simon Legner + 
- Soham Tiwari + 
- Song Wenhao + 
- Souvik Mandal 
- Spencer Clark 
- Steffen Rehberg + 
- Steffen Schmitz + 
- Stijn Van Hoey 
- Stéphan Taljaard 
- SultanOrazbayev + 
- Sumanau Sareen 
- SurajH1 + 
- Suvayu Ali + 
- Terji Petersen 
- Thomas J Fan + 
- Thomas Li 
- Thomas Smith + 
- Tim Swast 
- Tobias Pitters + 
- Tom + 
- Tom Augspurger 
- Uwe L. Korn 
- Valentin Iovene + 
- Vandana Iyer + 
- Venkatesh Datta + 
- Vijay Sai Mutyala + 
- Vikas Pandey 
- Vipul Rai + 
- Vishwam Pandya + 
- Vladimir Berkutov + 
- Will Ayd 
- Will Holmgren 
- William + 
- William Ayd 
- Yago González + 
- Yosuke KOBAYASHI + 
- Zachary Lawrence + 
- Zaky Bilfagih + 
- Zeb Nicholls + 
- alimcmaster1 
- alm + 
- andhikayusup + 
- andresmcneill + 
- avinashpancham + 
- benabel + 
- bernie gray + 
- biddwan09 + 
- brock + 
- chris-b1 
- cleconte987 + 
- dan1261 + 
- david-cortes + 
- davidwales + 
- dequadras + 
- dhuettenmoser + 
- dilex42 + 
- elmonsomiat + 
- epizzigoni + 
- fjetter 
- gabrielvf1 + 
- gdex1 + 
- gfyoung 
- guru kiran + 
- h-vishal 
- iamshwin 
- jamin-aws-ospo + 
- jbrockmendel 
- jfcorbett + 
- jnecus + 
- kernc 
- kota matsuoka + 
- kylekeppler + 
- leandermaben + 
- link2xt + 
- manoj_koneni + 
- marydmit + 
- masterpiga + 
- maxime.song + 
- mglasder + 
- moaraccounts + 
- mproszewska 
- neilkg 
- nrebena 
- ossdev07 + 
- paihu 
- pan Jacek + 
- partev + 
- patrick + 
- pedrooa + 
- pizzathief + 
- proost 
- pvanhauw + 
- rbenes 
- rebecca-palmer 
- rhshadrach + 
- rjfs + 
- s-scherrer + 
- sage + 
- sagungrp + 
- salem3358 + 
- saloni30 + 
- smartswdeveloper + 
- smartvinnetou + 
- themien + 
- timhunderwood + 
- tolhassianipar + 
- tonywu1999 
- tsvikas 
- tv3141 
- venkateshdatta1993 + 
- vivikelapoutre + 
- willbowditch + 
- willpeppo + 
- za + 
- zaki-indra +