These are the changes in pandas 1.1.0. See Release notes for a full changelog including other versions of pandas.
Previously, if labels were missing for a .loc call, a KeyError was raised stating that this was no longer supported.
Now the error message also includes a list of the missing labels (max 10 items, display width 80 characters). See GH34272.
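As a minimal sketch of where this message surfaces (the Series and labels below are made up for illustration), a list lookup with missing labels can be caught and inspected like this:

```python
import pandas as pd

ser = pd.Series([10, 20, 30], index=["a", "b", "c"])

message = ""
try:
    # "x" and "y" are not in the index, so this lookup raises KeyError
    ser.loc[["a", "x", "y"]]
except KeyError as err:
    # The exact wording differs between pandas versions,
    # but the missing labels are named in the message
    message = str(err)

print(message)
```

The exact text of the message is version dependent, so code should not parse it; catching KeyError is the stable part.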
Previously, declaring or converting to StringDtype was in general only possible if the data was already only str or nan-like (GH31204). StringDtype now works in all situations where astype(str) or dtype=str work:
For example, the below now works:
In [1]: ser = pd.Series([1, "abc", np.nan], dtype="string")

In [2]: ser
Out[2]:
0       1
1     abc
2    <NA>
Length: 3, dtype: string

In [3]: ser[0]
Out[3]: '1'

In [4]: pd.Series([1, 2, np.nan], dtype="Int64").astype("string")
Out[4]:
0       1
1       2
2    <NA>
Length: 3, dtype: string
PeriodIndex now supports partial string slicing for non-monotonic indexes, mirroring DatetimeIndex behavior (GH31096)
For example:
In [5]: dti = pd.date_range("2014-01-01", periods=30, freq="30D") In [6]: pi = dti.to_period("D") In [7]: ser_monotonic = pd.Series(np.arange(30), index=pi) In [8]: shuffler = list(range(0, 30, 2)) + list(range(1, 31, 2)) In [9]: ser = ser_monotonic[shuffler] In [10]: ser Out[10]: 2014-01-01 0 2014-03-02 2 2014-05-01 4 2014-06-30 6 2014-08-29 8 .. 2015-09-23 21 2015-11-22 23 2016-01-21 25 2016-03-21 27 2016-05-20 29 Freq: D, Length: 30, dtype: int64
In [11]: ser["2014"] Out[11]: 2014-01-01 0 2014-03-02 2 2014-05-01 4 2014-06-30 6 2014-08-29 8 2014-10-28 10 2014-12-27 12 2014-01-31 1 2014-04-01 3 2014-05-31 5 2014-07-30 7 2014-09-28 9 2014-11-27 11 Freq: D, Length: 13, dtype: int64 In [12]: ser.loc["May 2015"] Out[12]: 2015-05-26 17 Freq: D, Length: 1, dtype: int64
We’ve added DataFrame.compare() and Series.compare() for comparing two DataFrame or two Series (GH30429)
In [13]: df = pd.DataFrame( ....: { ....: "col1": ["a", "a", "b", "b", "a"], ....: "col2": [1.0, 2.0, 3.0, np.nan, 5.0], ....: "col3": [1.0, 2.0, 3.0, 4.0, 5.0] ....: }, ....: columns=["col1", "col2", "col3"], ....: ) ....: In [14]: df Out[14]: col1 col2 col3 0 a 1.0 1.0 1 a 2.0 2.0 2 b 3.0 3.0 3 b NaN 4.0 4 a 5.0 5.0 [5 rows x 3 columns]
In [15]: df2 = df.copy() In [16]: df2.loc[0, 'col1'] = 'c' In [17]: df2.loc[2, 'col3'] = 4.0 In [18]: df2 Out[18]: col1 col2 col3 0 c 1.0 1.0 1 a 2.0 2.0 2 b 3.0 4.0 3 b NaN 4.0 4 a 5.0 5.0 [5 rows x 3 columns]
In [19]: df.compare(df2)
Out[19]:
  col1       col3
  self other self other
0    a     c  NaN   NaN
2  NaN   NaN  3.0   4.0

[2 rows x 4 columns]
See User Guide for more details.
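The example above uses DataFrame.compare(). A small sketch of the Series counterpart, using made-up data, might look like this:

```python
import pandas as pd

s1 = pd.Series(["a", "b", "c", "d"])
s2 = pd.Series(["a", "x", "c", "y"])

# Only the positions that differ are reported, in "self" and "other" columns
diff = s1.compare(s2)

# keep_shape=True keeps every row and leaves NaN where the values agree
full = s1.compare(s2, keep_shape=True)

print(diff)
print(full)
```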
We’ve added a dropna keyword to DataFrame.groupby() and Series.groupby() in order to allow NA values in group keys. Users can set dropna to False if they want to include NA values in groupby keys. The default is True, which keeps backwards compatibility (GH3729)
In [20]: df_list = [[1, 2, 3], [1, None, 4], [2, 1, 3], [1, 2, 2]] In [21]: df_dropna = pd.DataFrame(df_list, columns=["a", "b", "c"]) In [22]: df_dropna Out[22]: a b c 0 1 2.0 3 1 1 NaN 4 2 2 1.0 3 3 1 2.0 2 [4 rows x 3 columns]
# Default ``dropna`` is set to True, which will exclude NaNs in keys
In [23]: df_dropna.groupby(by=["b"], dropna=True).sum()
Out[23]:
     a  c
b
1.0  2  3
2.0  2  5

[2 rows x 2 columns]

# In order to allow NaN in keys, set ``dropna`` to False
In [24]: df_dropna.groupby(by=["b"], dropna=False).sum()
Out[24]:
     a  c
b
1.0  2  3
2.0  2  5
NaN  1  4

[3 rows x 2 columns]
The default value of the dropna argument is True, which means NA values are not included in group keys.
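The same keyword applies to Series.groupby(). A minimal sketch with made-up values and keys:

```python
import pandas as pd

values = pd.Series([10, 20, 30, 40])
keys = pd.Series(["a", None, "a", None])

# Default (dropna=True): rows whose key is missing are left out
dropped = values.groupby(keys).sum()

# dropna=False: missing keys form their own group
kept = values.groupby(keys, dropna=False).sum()

print(dropped)
print(kept)
```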
We’ve added a key argument to the DataFrame and Series sorting methods, including DataFrame.sort_values(), DataFrame.sort_index(), Series.sort_values(), and Series.sort_index(). The key can be any callable function which is applied column-by-column to each column used for sorting, before sorting is performed (GH27237). See sort_values with keys and sort_index with keys for more information.
In [25]: s = pd.Series(['C', 'a', 'B']) In [26]: s Out[26]: 0 C 1 a 2 B Length: 3, dtype: object
In [27]: s.sort_values() Out[27]: 2 B 0 C 1 a Length: 3, dtype: object
Note how this is sorted with capital letters first. If we apply the Series.str.lower() method, we get
In [28]: s.sort_values(key=lambda x: x.str.lower()) Out[28]: 1 a 2 B 0 C Length: 3, dtype: object
When applied to a DataFrame, the key is applied per-column to all columns or a subset if by is specified, e.g.
In [29]: df = pd.DataFrame({'a': ['C', 'C', 'a', 'a', 'B', 'B'], ....: 'b': [1, 2, 3, 4, 5, 6]}) ....: In [30]: df Out[30]: a b 0 C 1 1 C 2 2 a 3 3 a 4 4 B 5 5 B 6 [6 rows x 2 columns]
In [31]: df.sort_values(by=['a'], key=lambda col: col.str.lower()) Out[31]: a b 2 a 3 3 a 4 4 B 5 5 B 6 0 C 1 1 C 2 [6 rows x 2 columns]
For more details, see examples and documentation in DataFrame.sort_values(), Series.sort_values(), and sort_index().
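The examples above cover sort_values(). A comparable sketch for sort_index(), where the key receives the index rather than a column (data invented for illustration):

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["b", "C", "a"])

# Default ordering places capital letters first
default_order = s.sort_index()

# The key is applied to the index before sorting
case_insensitive = s.sort_index(key=lambda idx: idx.str.lower())

print(default_order.index.tolist())
print(case_insensitive.index.tolist())
```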
Timestamp now supports the keyword-only fold argument according to PEP 495, similar to the parent datetime.datetime class. It supports both accepting fold as an initialization argument and inferring fold from other constructor arguments (GH25057, GH31338). Support is limited to dateutil timezones as pytz doesn’t support fold.
In [32]: ts = pd.Timestamp("2019-10-27 01:30:00+00:00") In [33]: ts.fold Out[33]: 0
In [34]: ts = pd.Timestamp(year=2019, month=10, day=27, hour=1, minute=30, ....: tz="dateutil/Europe/London", fold=1) ....: In [35]: ts Out[35]: Timestamp('2019-10-27 01:30:00+0000', tz='dateutil//usr/share/zoneinfo/Europe/London')
For more on working with fold, see Fold subsection in the user guide.
to_datetime() now supports parsing formats containing timezone names (%Z) and UTC offsets (%z) from different timezones and then converting them to UTC by setting utc=True. This returns a DatetimeIndex with timezone at UTC, as opposed to an Index with object dtype if utc=True is not set (GH32792).
In [36]: tz_strs = ["2010-01-01 12:00:00 +0100", "2010-01-01 12:00:00 -0100", ....: "2010-01-01 12:00:00 +0300", "2010-01-01 12:00:00 +0400"] ....: In [37]: pd.to_datetime(tz_strs, format='%Y-%m-%d %H:%M:%S %z', utc=True) Out[37]: DatetimeIndex(['2010-01-01 11:00:00+00:00', '2010-01-01 13:00:00+00:00', '2010-01-01 09:00:00+00:00', '2010-01-01 08:00:00+00:00'], dtype='datetime64[ns, UTC]', freq=None) In [38]: pd.to_datetime(tz_strs, format='%Y-%m-%d %H:%M:%S %z') Out[38]: Index([2010-01-01 12:00:00+01:00, 2010-01-01 12:00:00-01:00, 2010-01-01 12:00:00+03:00, 2010-01-01 12:00:00+04:00], dtype='object')
Grouper and DataFrame.resample() now support the arguments origin and offset. These let the user control the timestamp on which to adjust the grouping (GH31809).
By default, the bins of the grouping are adjusted based on the beginning of the day of the time series starting point. This works well with frequencies that are multiples of a day (like 30D) or that divide a day (like 90s or 1min), but it can create inconsistencies with frequencies that do not meet this criterion. To change this behavior you can now specify a fixed timestamp with the argument origin.
Two arguments are now deprecated (more information in the documentation of DataFrame.resample()):
base should be replaced by offset.
loffset should be replaced by directly adding an offset to the index of the DataFrame after resampling.
Small example of the use of origin:
In [39]: start, end = '2000-10-01 23:30:00', '2000-10-02 00:30:00' In [40]: middle = '2000-10-02 00:00:00' In [41]: rng = pd.date_range(start, end, freq='7min') In [42]: ts = pd.Series(np.arange(len(rng)) * 3, index=rng) In [43]: ts Out[43]: 2000-10-01 23:30:00 0 2000-10-01 23:37:00 3 2000-10-01 23:44:00 6 2000-10-01 23:51:00 9 2000-10-01 23:58:00 12 2000-10-02 00:05:00 15 2000-10-02 00:12:00 18 2000-10-02 00:19:00 21 2000-10-02 00:26:00 24 Freq: 7T, Length: 9, dtype: int64
Resample with the default behavior 'start_day' (origin is 2000-10-01 00:00:00):
In [44]: ts.resample('17min').sum() Out[44]: 2000-10-01 23:14:00 0 2000-10-01 23:31:00 9 2000-10-01 23:48:00 21 2000-10-02 00:05:00 54 2000-10-02 00:22:00 24 Freq: 17T, Length: 5, dtype: int64 In [45]: ts.resample('17min', origin='start_day').sum() Out[45]: 2000-10-01 23:14:00 0 2000-10-01 23:31:00 9 2000-10-01 23:48:00 21 2000-10-02 00:05:00 54 2000-10-02 00:22:00 24 Freq: 17T, Length: 5, dtype: int64
Resample using a fixed origin:
In [46]: ts.resample('17min', origin='epoch').sum() Out[46]: 2000-10-01 23:18:00 0 2000-10-01 23:35:00 18 2000-10-01 23:52:00 27 2000-10-02 00:09:00 39 2000-10-02 00:26:00 24 Freq: 17T, Length: 5, dtype: int64 In [47]: ts.resample('17min', origin='2000-01-01').sum() Out[47]: 2000-10-01 23:24:00 3 2000-10-01 23:41:00 15 2000-10-01 23:58:00 45 2000-10-02 00:15:00 45 Freq: 17T, Length: 4, dtype: int64
If needed you can adjust the bins with the argument offset (a Timedelta) that would be added to the default origin.
For a full example, see: Use origin or offset to adjust the start of the bins.
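The examples above only exercise origin. As a rough sketch of the other two points, the snippet below rebuilds the same ts series, shifts the bins with offset, and then relabels a resampled index by adding a Timedelta, which is one way to read the suggested replacement for loffset. The 8-minute shift is an arbitrary value chosen for illustration.

```python
import numpy as np
import pandas as pd

rng = pd.date_range("2000-10-01 23:30:00", "2000-10-02 00:30:00", freq="7min")
ts = pd.Series(np.arange(len(rng)) * 3, index=rng)

# offset is added to the default origin (start of day), so bins start at 23:30
shifted = ts.resample("17min", offset="23h30min").sum()

# In place of loffset: resample first, then shift the resulting labels
relabeled = ts.resample("17min").sum()
relabeled.index = relabeled.index + pd.Timedelta("8min")

print(shifted)
print(relabeled)
```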
For reading and writing to filesystems other than local and reading from HTTP(S), the optional dependency fsspec will be used to dispatch operations (GH33452). This will give unchanged functionality for S3 and GCS storage, which were already supported, but also add support for several other storage implementations such as Azure Datalake and Blob, SSH, FTP, Dropbox and GitHub. For docs and capabilities, see the fsspec docs.
The existing capability to interface with S3 and GCS will be unaffected by this change, as fsspec will still bring in the same packages as before.
Compatibility with matplotlib 3.3.0 (GH34850)
IntegerArray.astype() now supports datetime64 dtype (GH32538)
IntegerArray now implements the sum operation (GH33172)
Added pandas.errors.InvalidIndexError (GH34570).
Added DataFrame.value_counts() (GH5377)
Added a pandas.api.indexers.FixedForwardWindowIndexer() class to support forward-looking windows during rolling operations.
Added a pandas.api.indexers.VariableOffsetWindowIndexer() class to support rolling operations with non-fixed offsets (GH34994)
describe() now includes a datetime_is_numeric keyword to control how datetime columns are summarized (GH30164, GH34798)
Styler may now render CSS more efficiently where multiple cells have the same styling (GH30876)
highlight_null() now accepts a subset argument (GH31345)
When writing directly to a sqlite connection DataFrame.to_sql() now supports the multi method (GH29921)
pandas.errors.OptionError is now exposed in pandas.errors (GH27553)
Added api.extensions.ExtensionArray.argmax() and api.extensions.ExtensionArray.argmin() (GH24382)
timedelta_range() will now infer a frequency when passed start, end, and periods (GH32377)
Positional slicing on an IntervalIndex now supports slices with step > 1 (GH31658)
Series.str now has a fullmatch method that matches a regular expression against the entire string in each row of the Series, similar to re.fullmatch (GH32806).
DataFrame.sample() will now also allow array-like and BitGenerator objects to be passed to random_state as seeds (GH32503)
Index.union() will now raise RuntimeWarning for MultiIndex objects if the objects inside are unsortable. Pass sort=False to suppress this warning (GH33015)
Added Series.dt.isocalendar() and DatetimeIndex.isocalendar(), which return a DataFrame with year, week, and day calculated according to the ISO 8601 calendar (GH33206, GH34392).
The DataFrame.to_feather() method now supports additional keyword arguments (e.g. to set the compression) that are added in pyarrow 0.17 (GH33422).
cut() now accepts an ordered parameter, which defaults to ordered=True. If ordered=False and no labels are provided, an error will be raised (GH33141)
DataFrame.to_csv(), DataFrame.to_pickle(), and DataFrame.to_json() now support passing a dict of compression arguments when using the gzip and bz2 protocols. This can be used to set a custom compression level, e.g., df.to_csv(path, compression={'method': 'gzip', 'compresslevel': 1}) (GH33196)
melt() has gained an ignore_index (default True) argument that, if set to False, prevents the method from dropping the index (GH17440).
Series.update() now accepts objects that can be coerced to a Series, such as dict and list, mirroring the behavior of DataFrame.update() (GH33215)
transform() and aggregate() have gained engine and engine_kwargs arguments that support executing functions with Numba (GH32854, GH33388)
interpolate() now supports SciPy interpolation method scipy.interpolate.CubicSpline as method cubicspline (GH33670)
DataFrameGroupBy and SeriesGroupBy now implement the sample method for doing random sampling within groups (GH31775)
DataFrame.to_numpy() now supports the na_value keyword to control the NA sentinel in the output array (GH33820)
Added api.extensions.ExtensionArray.equals to the extension array interface, similar to Series.equals() (GH27081)
The minimum supported dta version has increased to 105 in read_stata() and StataReader (GH26667).
to_stata() supports compression using the compression keyword argument. Compression can either be inferred or explicitly set using a string or a dictionary containing both the method and any additional arguments that are passed to the compression library. Compression was also added to the low-level Stata-file writers StataWriter, StataWriter117, and StataWriterUTF8 (GH26599).
HDFStore.put() now accepts a track_times parameter. This parameter is passed to the create_table method of PyTables (GH32682).
Series.plot() and DataFrame.plot() now accept xlabel and ylabel parameters to set labels on the x and y axes (GH9093).
Made pandas.core.window.rolling.Rolling and pandas.core.window.expanding.Expanding iterable (GH11704)
Made option_context a contextlib.ContextDecorator, which allows it to be used as a decorator over an entire function (GH34253).
DataFrame.to_csv() and Series.to_csv() now accept an errors argument (GH22610)
transform() now allows func to be pad, backfill and cumcount (GH31269).
read_json() now accepts an nrows parameter (GH33916).
DataFrame.hist(), Series.hist(), core.groupby.DataFrameGroupBy.hist(), and core.groupby.SeriesGroupBy.hist() have gained the legend argument. Set to True to show a legend in the histogram. (GH6279)
concat() and append() now preserve extension dtypes, for example combining a nullable integer column with a numpy integer column will no longer result in object dtype but preserve the integer dtype (GH33607, GH34339, GH34095).
read_gbq() now allows disabling the progress bar (GH33360).
read_gbq() now supports the max_results kwarg from pandas-gbq (GH34639).
DataFrame.cov() and Series.cov() now support a new parameter ddof to support delta degrees of freedom as in the corresponding numpy methods (GH34611).
DataFrame.to_html() and DataFrame.to_string()’s col_space parameter now accepts a list or dict to change only some specific columns’ width (GH28917).
DataFrame.to_excel() can now also write OpenOffice spreadsheet (.ods) files (GH27222)
explode() now accepts ignore_index to reset the index, similar to pd.concat() or DataFrame.sort_values() (GH34932).
DataFrame.to_markdown() and Series.to_markdown() now accept index argument as an alias for tabulate’s showindex (GH32667)
read_csv() now accepts string values like “0”, “0.0”, “1”, “1.0” as convertible to the nullable Boolean dtype (GH34859)
pandas.core.window.ExponentialMovingWindow now supports a times argument that allows mean to be calculated with observations spaced by the timestamps in times (GH34839)
DataFrame.agg() and Series.agg() now accept named aggregation for renaming the output columns/indexes. (GH26513)
compute.use_numba now exists as a configuration option that utilizes the numba engine when available (GH33966, GH35374)
Series.plot() now supports asymmetric error bars. Previously, if Series.plot() received a “2xN” array with error values for yerr and/or xerr, the left/lower values (first row) were mirrored, while the right/upper values (second row) were ignored. Now, the first row represents the left/lower error values and the second row the right/upper error values. (GH9536)
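To make a few of the smaller enhancements listed above concrete, here is a short sketch touching Series.str.fullmatch, melt(ignore_index=False), explode(ignore_index=True), to_numpy(na_value=...) and DataFrame.value_counts(). All data and variable names are invented for illustration.

```python
import numpy as np
import pandas as pd

# Series.str.fullmatch: the pattern must match the entire string
codes = pd.Series(["ab12", "ab12x", "cd34"])
whole = codes.str.fullmatch(r"[a-z]{2}\d{2}")

# melt(ignore_index=False): the original index labels are kept
wide = pd.DataFrame({"x": [1, 2], "y": [3, 4]}, index=["r1", "r2"])
long = wide.melt(ignore_index=False)

# explode(ignore_index=True): the result gets a fresh 0..n-1 index
nested = pd.Series([[1, 2], [3]])
flat = nested.explode(ignore_index=True)

# to_numpy(na_value=...): choose the sentinel used for missing values
frame = pd.DataFrame({"a": [1.0, np.nan]})
filled = frame.to_numpy(na_value=-1.0)

# DataFrame.value_counts(): counts of unique rows
pairs = pd.DataFrame({"k": ["u", "u", "v"], "n": [1, 1, 2]})
counts = pairs.value_counts()

print(whole.tolist(), long.index.tolist(), flat.tolist())
print(filled.tolist())
print(counts)
```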
These are bug fixes that might have notable behavior changes.
This restores the behavior of MultiIndex.get_indexer() with method='backfill' or method='pad' to the behavior before pandas 0.23.0. In particular, MultiIndexes are treated as a list of tuples and padding or backfilling is done with respect to the ordering of these lists of tuples (GH29896).
As an example of this, given:
In [48]: df = pd.DataFrame({ ....: 'a': [0, 0, 0, 0], ....: 'b': [0, 2, 3, 4], ....: 'c': ['A', 'B', 'C', 'D'], ....: }).set_index(['a', 'b']) ....: In [49]: mi_2 = pd.MultiIndex.from_product([[0], [-1, 0, 1, 3, 4, 5]])
The differences in reindexing df with mi_2 and using method='backfill' can be seen here:
pandas >= 0.23, < 1.1.0:
In [1]: df.reindex(mi_2, method='backfill') Out[1]: c 0 -1 A 0 A 1 D 3 A 4 A 5 C
pandas < 0.23, >= 1.1.0:
In [50]: df.reindex(mi_2, method='backfill') Out[50]: c 0 -1 A 0 A 1 B 3 C 4 D 5 NaN [6 rows x 1 columns]
And the differences in reindexing df with mi_2 and using method='pad' can be seen here:
pandas >= 0.23, < 1.1.0
In [1]: df.reindex(mi_2, method='pad') Out[1]: c 0 -1 NaN 0 NaN 1 D 3 NaN 4 A 5 C
pandas < 0.23, >= 1.1.0
In [51]: df.reindex(mi_2, method='pad') Out[51]: c 0 -1 NaN 0 A 1 A 3 C 4 D 5 D [6 rows x 1 columns]
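The reindex results above come from the positions that MultiIndex.get_indexer() returns. A sketch that calls it directly on an index equivalent to the one in df (rebuilt here with from_tuples) is shown below; the expected positions in the comments are read off the pandas >= 1.1.0 outputs above, where -1 means no match.

```python
import pandas as pd

# Same (a, b) index as df above, and the same target mi_2
mi = pd.MultiIndex.from_tuples([(0, 0), (0, 2), (0, 3), (0, 4)])
mi_2 = pd.MultiIndex.from_product([[0], [-1, 0, 1, 3, 4, 5]])

# Tuples are compared in order; backfill takes the next entry, pad the previous one
backfill = mi.get_indexer(mi_2, method="backfill")  # expected: 0, 0, 1, 2, 3, -1
pad = mi.get_indexer(mi_2, method="pad")            # expected: -1, 0, 0, 2, 3, 3

print(backfill.tolist())
print(pad.tolist())
```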
Label lookups series[key], series.loc[key] and frame.loc[key] used to raise either KeyError or TypeError depending on the type of key and type of Index. These now consistently raise KeyError (GH31867)
In [52]: ser1 = pd.Series(range(3), index=[0, 1, 2]) In [53]: ser2 = pd.Series(range(3), index=pd.date_range("2020-02-01", periods=3))
Previous behavior:
In [3]: ser1[1.5]
...
TypeError: cannot do label indexing on Int64Index with these indexers [1.5] of type float

In [4]: ser1["foo"]
...
KeyError: 'foo'

In [5]: ser1.loc[1.5]
...
TypeError: cannot do label indexing on Int64Index with these indexers [1.5] of type float

In [6]: ser1.loc["foo"]
...
KeyError: 'foo'

In [7]: ser2.loc[1]
...
TypeError: cannot do label indexing on DatetimeIndex with these indexers [1] of type int

In [8]: ser2.loc[pd.Timestamp(0)]
...
KeyError: Timestamp('1970-01-01 00:00:00')
New behavior:
In [3]: ser1[1.5]
...
KeyError: 1.5

In [4]: ser1["foo"]
...
KeyError: 'foo'

In [5]: ser1.loc[1.5]
...
KeyError: 1.5

In [6]: ser1.loc["foo"]
...
KeyError: 'foo'

In [7]: ser2.loc[1]
...
KeyError: 1

In [8]: ser2.loc[pd.Timestamp(0)]
...
KeyError: Timestamp('1970-01-01 00:00:00')
Similarly, DataFrame.at() and Series.at() will raise a TypeError instead of a ValueError if an incompatible key is passed, and KeyError if a missing key is passed, matching the behavior of .loc[] (GH31722)
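A practical consequence is that a single except KeyError clause now covers these lookups. The sketch below records which exception each lookup raises; lookup_error is a hypothetical helper written for this example, not part of pandas.

```python
import pandas as pd

ser1 = pd.Series(range(3), index=[0, 1, 2])


def lookup_error(lookup):
    """Hypothetical helper: name of the exception raised by lookup(), or None."""
    try:
        lookup()
    except Exception as exc:
        return type(exc).__name__
    return None


results = {
    "ser1[1.5]": lookup_error(lambda: ser1[1.5]),
    "ser1.loc[1.5]": lookup_error(lambda: ser1.loc[1.5]),
    'ser1.loc["foo"]': lookup_error(lambda: ser1.loc["foo"]),
    # .at with a missing key, matching .loc
    "ser1.at[10]": lookup_error(lambda: ser1.at[10]),
}

print(results)
```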
Indexing with integers with a MultiIndex that has an integer-dtype first level incorrectly failed to raise KeyError when one or more of those integer keys is not present in the first level of the index (GH33539)
In [54]: idx = pd.Index(range(4)) In [55]: dti = pd.date_range("2000-01-03", periods=3) In [56]: mi = pd.MultiIndex.from_product([idx, dti]) In [57]: ser = pd.Series(range(len(mi)), index=mi)
Previous behavior:

In [5]: ser[[5]]
Out[5]: Series([], dtype: int64)

New behavior:

In [5]: ser[[5]]
...
KeyError: '[5] not in index'
DataFrame.merge() now preserves the right frame’s row order when executing a right merge (GH27453)
In [58]: left_df = pd.DataFrame({'animal': ['dog', 'pig'],
   ....:                         'max_speed': [40, 11]})
   ....:

In [59]: right_df = pd.DataFrame({'animal': ['quetzal', 'pig'],
   ....:                          'max_speed': [80, 11]})
   ....:

In [60]: left_df
Out[60]:
  animal  max_speed
0    dog         40
1    pig         11

[2 rows x 2 columns]

In [61]: right_df
Out[61]:
    animal  max_speed
0  quetzal         80
1      pig         11

[2 rows x 2 columns]

Previous behavior:

>>> left_df.merge(right_df, on=['animal', 'max_speed'], how="right")
    animal  max_speed
0      pig         11
1  quetzal         80

New behavior:

In [62]: left_df.merge(right_df, on=['animal', 'max_speed'], how="right")
Out[62]:
    animal  max_speed
0  quetzal         80
1      pig         11

[2 rows x 2 columns]
Assignment to multiple columns of a DataFrame when some of the columns do not exist would previously assign the values to the last column. Now, new columns will be constructed with the right values. (GH13658)
In [63]: df = pd.DataFrame({'a': [0, 1, 2], 'b': [3, 4, 5]}) In [64]: df Out[64]: a b 0 0 3 1 1 4 2 2 5 [3 rows x 2 columns]
Previous behavior:

In [3]: df[['a', 'c']] = 1

In [4]: df
Out[4]:
   a  b
0  1  1
1  1  1
2  1  1

New behavior:

In [65]: df[['a', 'c']] = 1

In [66]: df
Out[66]:
   a  b  c
0  1  3  1
1  1  4  1
2  1  5  1

[3 rows x 3 columns]
Using DataFrame.groupby() with as_index=True and the aggregation nunique would include the grouping column(s) in the columns of the result. Now the grouping column(s) only appear in the index, consistent with other reductions. (GH32579)
In [67]: df = pd.DataFrame({"a": ["x", "x", "y", "y"], "b": [1, 1, 2, 3]}) In [68]: df Out[68]: a b 0 x 1 1 x 1 2 y 2 3 y 3 [4 rows x 2 columns]
Previous behavior:

In [3]: df.groupby("a", as_index=True).nunique()
Out[4]:
   a  b
a
x  1  1
y  1  2

New behavior:

In [69]: df.groupby("a", as_index=True).nunique()
Out[69]:
   b
a
x  1
y  2

[2 rows x 1 columns]
Using DataFrame.groupby() with as_index=False and the function idxmax, idxmin, mad, nunique, sem, skew, or std would modify the grouping column. Now the grouping column remains unchanged, consistent with other reductions. (GH21090, GH10355)
Previous behavior:

In [3]: df.groupby("a", as_index=False).nunique()
Out[4]:
   a  b
0  1  1
1  1  2

New behavior:

In [70]: df.groupby("a", as_index=False).nunique()
Out[70]:
   a  b
0  x  1
1  y  2

[2 rows x 2 columns]
The method size() would previously ignore as_index=False. Now the grouping columns are returned as columns, making the result a DataFrame instead of a Series. (GH32599)
Previous behavior:

In [3]: df.groupby("a", as_index=False).size()
Out[4]:
a
x    2
y    2
dtype: int64

New behavior:

In [71]: df.groupby("a", as_index=False).size()
Out[71]:
   a  size
0  x     2
1  y     2

[2 rows x 2 columns]
Previously agg() lost the result columns when the as_index option was set to False and the result columns were relabeled. In this case the result values were replaced with the previous index (GH32240).
In [72]: df = pd.DataFrame({"key": ["x", "y", "z", "x", "y", "z"], ....: "val": [1.0, 0.8, 2.0, 3.0, 3.6, 0.75]}) ....: In [73]: df Out[73]: key val 0 x 1.00 1 y 0.80 2 z 2.00 3 x 3.00 4 y 3.60 5 z 0.75 [6 rows x 2 columns]
Previous behavior:

In [2]: grouped = df.groupby("key", as_index=False)

In [3]: result = grouped.agg(min_val=pd.NamedAgg(column="val", aggfunc="min"))

In [4]: result
Out[4]:
     min_val
 0   x
 1   y
 2   z

New behavior:

In [74]: grouped = df.groupby("key", as_index=False)

In [75]: result = grouped.agg(min_val=pd.NamedAgg(column="val", aggfunc="min"))

In [76]: result
Out[76]:
  key  min_val
0   x     1.00
1   y     0.80
2   z     0.75

[3 rows x 2 columns]
In [77]: df = pd.DataFrame({'a': [1, 2], 'b': [3, 6]}) In [78]: def func(row): ....: print(row) ....: return row ....:
Previous behavior:

In [4]: df.apply(func, axis=1)
a    1
b    3
Name: 0, dtype: int64
a    1
b    3
Name: 0, dtype: int64
a    2
b    6
Name: 1, dtype: int64
Out[4]:
   a  b
0  1  3
1  2  6

New behavior:

In [79]: df.apply(func, axis=1)
a    1
b    3
Name: 0, Length: 2, dtype: int64
a    2
b    6
Name: 1, Length: 2, dtype: int64
Out[79]:
   a  b
0  1  3
1  2  6

[2 rows x 2 columns]
The check_freq argument was added to testing.assert_frame_equal() and testing.assert_series_equal() in pandas 1.1.0 and defaults to True. testing.assert_frame_equal() and testing.assert_series_equal() now raise AssertionError if the indexes do not have the same frequency. Before pandas 1.1.0, the index frequency was not checked.
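A minimal sketch of the effect, assuming two Series that hold the same values and timestamps but whose indexes differ only in freq (the second index is built from a list, which leaves freq unset):

```python
import pandas as pd
from pandas import testing as tm

index_with_freq = pd.date_range("2020-01-01", periods=3, freq="D")
# Same timestamps, but constructing from a list does not set a frequency
index_without_freq = pd.DatetimeIndex(list(index_with_freq))

left = pd.Series([1, 2, 3], index=index_with_freq)
right = pd.Series([1, 2, 3], index=index_without_freq)

try:
    tm.assert_series_equal(left, right)
    freq_mismatch_detected = False
except AssertionError:
    # Raised because check_freq defaults to True
    freq_mismatch_detected = True

# Opting out of the frequency check makes the comparison pass
tm.assert_series_equal(left, right, check_freq=False)

print(freq_mismatch_detected)
```

Test suites that compare against hand-built expected indexes are the most likely to need check_freq=False or an explicit freq on the expected index.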
Some minimum supported versions of dependencies were updated (GH33718, GH29766, GH29723, pytables >= 3.4.3). If installed, we now require:
Package
Minimum Version
Required
Changed
numpy
1.15.4
X
2015.4
python-dateutil
2.7.3
bottleneck
1.2.1
numexpr
2.6.2
pytest (dev)
4.0.2
For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.
beautifulsoup4
4.6.0
fastparquet
0.3.2
0.7.4
gcsfs
0.6.0
lxml
3.8.0
matplotlib
2.2.2
numba
0.46.0
openpyxl
2.5.7
pyarrow
0.13.0
pymysql
0.7.1
pytables
3.4.3
s3fs
0.4.0
scipy
1.2.0
sqlalchemy
1.1.4
xarray
0.8.2
xlrd
1.1.0
xlsxwriter
0.9.8
xlwt
See Dependencies and Optional dependencies for more.
The minimum version of Cython is now the most recent bug-fix version (0.29.16) (GH33334).
Lookups on a Series with a single-item list containing a slice (e.g. ser[[slice(0, 4)]]) are deprecated and will raise in a future version. Either convert the list to a tuple, or pass the slice directly instead (GH31333)
DataFrame.mean() and DataFrame.median() with numeric_only=None will include datetime64 and datetime64tz columns in a future version (GH29941)
Setting values with .loc using a positional slice is deprecated and will raise in a future version. Use .loc with labels or .iloc with positions instead (GH31840)
DataFrame.to_dict() has deprecated accepting short names for orient and will raise in a future version (GH32515)
Categorical.to_dense() is deprecated and will be removed in a future version, use np.asarray(cat) instead (GH32639)
The fastpath keyword in the SingleBlockManager constructor is deprecated and will be removed in a future version (GH33092)
Providing suffixes as a set in pandas.merge() is deprecated. Provide a tuple instead (GH33740, GH34741).
Indexing a Series with a multi-dimensional indexer like [:, None] to return an ndarray now raises a FutureWarning. Convert to a NumPy array before indexing instead (GH27837)
Index.is_mixed() is deprecated and will be removed in a future version, check index.inferred_type directly instead (GH32922)
Passing any arguments but the first one to read_html() as positional arguments is deprecated. All other arguments should be given as keyword arguments (GH27573).
Passing any arguments but path_or_buf (the first one) to read_json() as positional arguments is deprecated. All other arguments should be given as keyword arguments (GH27573).
Passing any arguments but the first two to read_excel() as positional arguments is deprecated. All other arguments should be given as keyword arguments (GH27573).
pandas.api.types.is_categorical() is deprecated and will be removed in a future version; use pandas.api.types.is_categorical_dtype() instead (GH33385)
Index.get_value() is deprecated and will be removed in a future version (GH19728)
Series.dt.week and Series.dt.weekofyear are deprecated and will be removed in a future version, use Series.dt.isocalendar().week instead (GH33595)
DatetimeIndex.week and DatetimeIndex.weekofyear are deprecated and will be removed in a future version, use DatetimeIndex.isocalendar().week instead (GH33595)
DatetimeArray.week and DatetimeArray.weekofyear are deprecated and will be removed in a future version, use DatetimeArray.isocalendar().week instead (GH33595)
DateOffset.__call__() is deprecated and will be removed in a future version, use offset + other instead (GH34171)
apply_index() is deprecated and will be removed in a future version. Use offset + other instead (GH34580)
DataFrame.tshift() and Series.tshift() are deprecated and will be removed in a future version, use DataFrame.shift() and Series.shift() instead (GH11631)
Indexing an Index object with a float key is deprecated, and will raise an IndexError in the future. You can manually convert to an integer key instead (GH34191).
The squeeze keyword in groupby() is deprecated and will be removed in a future version (GH32380)
The tz keyword in Period.to_timestamp() is deprecated and will be removed in a future version; use per.to_timestamp(...).tz_localize(tz) instead (GH34522)
DatetimeIndex.to_perioddelta() is deprecated and will be removed in a future version. Use index - index.to_period(freq).to_timestamp() instead (GH34853)
DataFrame.melt() accepting a value_name that already exists is deprecated, and will be removed in a future version (GH34731)
The center keyword in the DataFrame.expanding() function is deprecated and will be removed in a future version (GH20647)
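Several of the deprecations above name a direct replacement. The sketch below collects a few of them side by side, using only the replacement spellings; the dates and variable names are invented for illustration.

```python
import numpy as np
import pandas as pd

# Instead of Categorical.to_dense()
cat = pd.Categorical(["a", "b", "a"])
dense = np.asarray(cat)

# Instead of Series.dt.week / Series.dt.weekofyear
dates = pd.Series(pd.to_datetime(["2019-12-30", "2021-01-01"]))
weeks = dates.dt.isocalendar().week

# Instead of calling a DateOffset like a function: offset + other
next_month_end = pd.Timestamp("2020-01-31") + pd.offsets.MonthEnd()

# Instead of Period.to_timestamp(tz=...)
per = pd.Period("2020-01", freq="M")
localized = per.to_timestamp().tz_localize("UTC")

# Instead of DatetimeIndex.to_perioddelta(freq)
index = pd.DatetimeIndex(["2020-01-15", "2020-02-10"])
delta = index - index.to_period("M").to_timestamp()

print(dense, weeks.tolist(), next_month_end, localized, delta)
```

Note that ISO weeks can belong to a neighbouring calendar year, which is why 2019-12-30 falls in week 1 and 2021-01-01 in week 53.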
Performance improvement in Timedelta constructor (GH30543)
Performance improvement in Timestamp constructor (GH30543)
Performance improvement in flex arithmetic ops between DataFrame and Series with axis=0 (GH31296)
Performance improvement in arithmetic ops between DataFrame and Series with axis=1 (GH33600)
The internal index method _shallow_copy() now copies cached attributes over to the new index, avoiding creating these again on the new index. This can speed up many operations that depend on creating copies of existing indexes (GH28584, GH32640, GH32669)
Significant performance improvement when creating a DataFrame with sparse values from scipy.sparse matrices using the DataFrame.sparse.from_spmatrix() constructor (GH32821, GH32825, GH32826, GH32856, GH32858).
Performance improvement for groupby methods first() and last() (GH34178)
Performance improvement in factorize() for nullable (integer and Boolean) dtypes (GH33064).
Performance improvement when constructing Categorical objects (GH33921)
Fixed performance regression in pandas.qcut() and pandas.cut() (GH33921)
Performance improvement in reductions (sum, prod, min, max) for nullable (integer and Boolean) dtypes (GH30982, GH33261, GH33442).
Performance improvement in arithmetic operations between two DataFrame objects (GH32779)
Performance improvement in pandas.core.groupby.RollingGroupby (GH34052)
Performance improvement in arithmetic operations (sub, add, mul, div) for MultiIndex (GH34297)
Performance improvement in DataFrame[bool_indexer] when bool_indexer is a list (GH33924)
Significant performance improvement of io.formats.style.Styler.render() when styles are added in various ways, such as with io.formats.style.Styler.apply(), io.formats.style.Styler.applymap() or io.formats.style.Styler.bar() (GH19917)
Passing an invalid fill_value to Categorical.take() raises a ValueError instead of TypeError (GH33660)
Combining a Categorical that has integer categories and contains missing values with a float dtype column in operations such as concat() or append() will now result in a float column instead of an object dtype column (GH33607)
Bug where merge() was unable to join on non-unique categorical indices (GH28189)
Bug when passing categorical data to Index constructor along with dtype=object incorrectly returning a CategoricalIndex instead of object-dtype Index (GH32167)
Bug where Categorical comparison operator __ne__ would incorrectly evaluate to False when either element was missing (GH32276)
Categorical.fillna() now accepts Categorical other argument (GH32420)
Repr of Categorical was not distinguishing between int and str (GH33676)
Passing an integer dtype other than int64 to np.array(period_index, dtype=...) will now raise TypeError instead of incorrectly using int64 (GH32255)
Series.to_timestamp() now raises a TypeError if the axis is not a PeriodIndex. Previously an AttributeError was raised (GH33327)
Series.to_period() now raises a TypeError if the axis is not a DatetimeIndex. Previously an AttributeError was raised (GH33327)
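To illustrate the two changes above, this sketch calls both methods on a Series with a default integer index (an assumed example) and records the exception type:

```python
import pandas as pd

# Default RangeIndex: neither a PeriodIndex nor a DatetimeIndex.
ser = pd.Series([1, 2, 3])

errors = {}
for name in ("to_timestamp", "to_period"):
    try:
        getattr(ser, name)()
    except TypeError as err:
        # A TypeError is raised rather than an AttributeError.
        errors[name] = type(err).__name__

print(errors)
```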
Period no longer accepts tuples for the freq argument (GH34658)
Bug in Timestamp where constructing a Timestamp from ambiguous epoch time and calling constructor again changed the Timestamp.value() property (GH24329)
Bug in DatetimeArray.searchsorted(), TimedeltaArray.searchsorted() and PeriodArray.searchsorted() not recognizing non-pandas scalars and incorrectly raising ValueError instead of TypeError (GH30950)
Bug in Timestamp where constructing Timestamp with dateutil timezone less than 128 nanoseconds before daylight saving time switch from winter to summer would result in nonexistent time (GH31043)
Bug in Period.to_timestamp(), Period.start_time() with microsecond frequency returning a timestamp one nanosecond earlier than the correct time (GH31475)
Timestamp raised a confusing error message when year, month or day is missing (GH31200)
Bug in DatetimeIndex constructor incorrectly accepting bool-dtype inputs (GH32668)
Bug in DatetimeIndex.searchsorted() not accepting a list or Series as its argument (GH32762)
Bug where PeriodIndex() raised when passed a Series of strings (GH26109)
Bug in Timestamp arithmetic when adding or subtracting an np.ndarray with timedelta64 dtype (GH33296)
Bug in DatetimeIndex.to_period() not inferring the frequency when called with no arguments (GH33358)
Bug in DatetimeIndex.tz_localize() incorrectly retaining freq in some cases where the original freq is no longer valid (GH30511)
Bug in DatetimeIndex.intersection() losing freq and timezone in some cases (GH33604)
Bug in DatetimeIndex.get_indexer() where incorrect output would be returned for mixed datetime-like targets (GH33741)
Bug in DatetimeIndex addition and subtraction with some types of DateOffset objects incorrectly retaining an invalid freq attribute (GH33779)
Bug in DatetimeIndex where setting the freq attribute on an index could silently change the freq attribute on another index viewing the same data (GH33552)
DataFrame.min() and DataFrame.max() were not returning consistent results with Series.min() and Series.max() when called on objects initialized with empty pd.to_datetime()
Bug in DatetimeIndex.intersection() and TimedeltaIndex.intersection() with results not having the correct name attribute (GH33904)
Bug in DatetimeArray.__setitem__(), TimedeltaArray.__setitem__(), PeriodArray.__setitem__() incorrectly allowing values with int64 dtype to be silently cast (GH33717)
Bug in subtracting TimedeltaIndex from Period incorrectly raising TypeError in some cases where it should succeed and IncompatibleFrequency in some cases where it should raise TypeError (GH33883)
Bug in constructing a Series or Index from a read-only NumPy array with non-ns resolution which converted to object dtype instead of coercing to datetime64[ns] dtype when within the timestamp bounds (GH34843).
The freq keyword in Period, date_range(), period_range(), pd.tseries.frequencies.to_offset() no longer allows tuples, pass as string instead (GH34703)
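A brief sketch of the string form of `freq`, using "2D" as an assumed example frequency:

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# Pass the multiple and the unit together as a single string.
dates = pd.date_range("2020-01-01", periods=3, freq="2D")
offset = to_offset("2D")

print(dates)
print(offset)
```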
Bug in DataFrame.append() when appending a Series containing a scalar tz-aware Timestamp to an empty DataFrame resulted in an object column instead of datetime64[ns, tz] dtype (GH35038)
OutOfBoundsDatetime issues an improved error message when timestamp is out of implementation bounds. (GH32967)
Bug in AbstractHolidayCalendar.holidays() when no rules were defined (GH31415)
Bug in Tick comparisons raising TypeError when comparing against timedelta-like objects (GH34088)
Bug in Tick multiplication raising TypeError when multiplying by a float (GH34486)
Bug in constructing a Timedelta with a high precision integer that would round the Timedelta components (GH31354)
Bug in dividing np.nan or None by Timedelta incorrectly returning NaT (GH31869)
Timedelta now understands µs as an identifier for microsecond (GH32899)
Timedelta string representation now includes nanoseconds, when nanoseconds are non-zero (GH9309)
Bug in comparing a Timedelta object against an np.ndarray with timedelta64 dtype incorrectly viewing all entries as unequal (GH33441)
Bug in timedelta_range() that produced an extra point in an edge case (GH30353, GH33498)
Bug in DataFrame.resample() that produced an extra point in an edge case (GH30353, GH13022, GH33498)
Bug in DataFrame.resample() that ignored the loffset argument when dealing with timedelta (GH7687, GH33498)
Bug in Timedelta and pandas.to_timedelta() that ignored the unit argument for string input (GH12136)
Bug in to_datetime() with infer_datetime_format=True where timezone names (e.g. UTC) would not be parsed correctly (GH33133)
Bug in DataFrame.floordiv() with axis=0 not treating division-by-zero like Series.floordiv() (GH31271)
Bug in to_numeric() with string argument "uint64" and errors="coerce" silently failing (GH32394)
Bug in to_numeric() with downcast="unsigned" failing for empty data (GH32493)
Bug in DataFrame.mean() with numeric_only=False and either datetime64 dtype or PeriodDtype column incorrectly raising TypeError (GH32426)
Bug in DataFrame.count() with level="foo" and index level "foo" containing NaNs causing a segmentation fault (GH21824)
Bug in DataFrame.diff() with axis=1 returning incorrect results with mixed dtypes (GH32995)
Bug in DataFrame.corr() and DataFrame.cov() raising when handling nullable integer columns with pandas.NA (GH33803)
Bug in arithmetic operations between DataFrame objects with non-overlapping columns with duplicate labels causing an infinite loop (GH35194)
Bug in DataFrame and Series addition and subtraction between object-dtype objects and datetime64 dtype objects (GH33824)
Bug in Index.difference() giving incorrect results when comparing a Float64Index and object Index (GH35217)
Bug in DataFrame reductions (e.g. df.min(), df.max()) with ExtensionArray dtypes (GH34520, GH32651)
Series.interpolate() and DataFrame.interpolate() now raise a ValueError if limit_direction is 'forward' or 'both' and method is 'backfill' or 'bfill', or if limit_direction is 'backward' or 'both' and method is 'pad' or 'ffill' (GH34746)
Bug in Series construction from NumPy array with big-endian datetime64 dtype (GH29684)
Bug in Timedelta construction with large nanoseconds keyword value (GH32402)
Bug in DataFrame construction where sets would be duplicated rather than raising (GH32582)
The DataFrame constructor no longer accepts a list of DataFrame objects. Because of changes to NumPy, DataFrame objects are now consistently treated as 2D objects, so a list of DataFrame objects is considered 3D, and no longer acceptable for the DataFrame constructor (GH32289).
Bug in DataFrame when initializing a frame with lists and assigning columns with a nested list for a MultiIndex (GH32173)
Improved error message for invalid construction of list when creating a new index (GH35190)
Bug in the astype() method when converting “string” dtype data to nullable integer dtype (GH32450).
Fixed issue where taking min or max of a StringArray or Series with StringDtype type would raise. (GH31746)
Bug in Series.str.cat() returning NaN output when other had Index type (GH33425)
pandas.api.dtypes.is_string_dtype() no longer incorrectly identifies categorical series as string.
Bug in IntervalArray incorrectly allowing the underlying data to be changed when setting values (GH32782)
DataFrame.xs() now raises a TypeError if a level keyword is supplied and the axis is not a MultiIndex. Previously an AttributeError was raised (GH33610)
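The new error type can be observed with a sketch like the one below, which uses an assumed two-row frame with a flat (non-MultiIndex) index:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2]}, index=["x", "y"])  # flat index, no levels

raised = None
try:
    df.xs("x", level=0)
except TypeError as err:
    # level= only makes sense for a MultiIndex, so a TypeError is raised.
    raised = type(err).__name__

print(raised)
```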
Bug in slicing on a DatetimeIndex with a partial-timestamp dropping high-resolution indices near the end of a year, quarter, or month (GH31064)
Bug in PeriodIndex.get_loc() treating higher-resolution strings differently from PeriodIndex.get_value() (GH31172)
Bug in Series.at() and DataFrame.at() not matching .loc behavior when looking up an integer in a Float64Index (GH31329)
Bug in PeriodIndex.is_monotonic() incorrectly returning True when containing leading NaT entries (GH31437)
Bug in DatetimeIndex.get_loc() raising KeyError with converted-integer key instead of the user-passed key (GH31425)
Bug in Series.xs() incorrectly returning Timestamp instead of datetime64 in some object-dtype cases (GH31630)
Bug in DataFrame.iat() incorrectly returning Timestamp instead of datetime in some object-dtype cases (GH32809)
Bug in DataFrame.at() when either columns or index is non-unique (GH33041)
Bug in Series.loc() and DataFrame.loc() when indexing with an integer key on an object-dtype Index that is not all-integers (GH31905)
Bug in DataFrame.iloc.__setitem__() on a DataFrame with duplicate columns incorrectly setting values for all matching columns (GH15686, GH22036)
Bug in DataFrame.loc() and Series.loc() with a DatetimeIndex, TimedeltaIndex, or PeriodIndex incorrectly allowing lookups of non-matching datetime-like dtypes (GH32650)
Bug in Series.__getitem__() indexing with non-standard scalars, e.g. np.dtype (GH32684)
Bug in Index constructor where an unhelpful error message was raised for NumPy scalars (GH33017)
Bug in DataFrame.lookup() incorrectly raising an AttributeError when frame.index or frame.columns is not unique; this will now raise a ValueError with a helpful error message (GH33041)
Bug in Interval where a Timedelta could not be added or subtracted from a Timestamp interval (GH32023)
Bug in DataFrame.copy() not invalidating _item_cache after copy caused post-copy value updates to not be reflected (GH31784)
Fixed regression in DataFrame.loc() and Series.loc() throwing an error when a datetime64[ns, tz] value is provided (GH32395)
Bug in Series.__getitem__() with an integer key and a MultiIndex with leading integer level failing to raise KeyError if the key is not present in the first level (GH33355)
Bug in DataFrame.iloc() when slicing a single column DataFrame with ExtensionDtype (e.g. df.iloc[:, :1]) returning an invalid result (GH32957)
Bug in DatetimeIndex.insert() and TimedeltaIndex.insert() causing index freq to be lost when setting an element into an empty Series (GH33573)
Bug in Series.__setitem__() with an IntervalIndex and a list-like key of integers (GH33473)
Bug in Series.__getitem__() allowing missing labels with np.ndarray, Index and Series indexers but not with list; these now all raise KeyError (GH33646)
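A small sketch of the now-consistent behaviour: the same labels, one of which is absent from the index, are passed as each of the four indexer types. The Series and labels are made-up example data.

```python
import numpy as np
import pandas as pd

ser = pd.Series([1, 2], index=["a", "b"])
labels = ["a", "z"]  # "z" is not in the index

indexers = {
    "list": labels,
    "ndarray": np.array(labels),
    "Index": pd.Index(labels),
    "Series": pd.Series(labels),
}

# Record whether each indexer type raises KeyError for the missing label.
raised = {}
for kind, indexer in indexers.items():
    try:
        ser[indexer]
        raised[kind] = False
    except KeyError:
        raised[kind] = True

print(raised)
```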
Bug in DataFrame.truncate() and Series.truncate() where the index was assumed to be monotonically increasing (GH33756)
Indexing with a list of strings representing datetimes failed on DatetimeIndex or PeriodIndex (GH11278)
Bug in Series.at() when used with a MultiIndex would raise an exception on valid inputs (GH26989)
Bug in DataFrame.loc() with a dictionary of values changing columns with int dtype to float (GH34573)
Bug in Series.loc() when used with a MultiIndex would raise an IndexingError when accessing a None value (GH34318)
Bug where DataFrame.reset_index() and Series.reset_index() would not preserve data types on an empty DataFrame or Series with a MultiIndex (GH19602)
Bug in Series and DataFrame indexing with a time key on a DatetimeIndex with NaT entries (GH35114)
Calling fillna() on an empty Series now correctly returns a shallow copied object. The behaviour is now consistent with Index, DataFrame and a non-empty Series (GH32543).
Bug in Series.replace() when argument to_replace is of type dict/list and is used on a Series containing <NA> was raising a TypeError. The method now handles this by ignoring <NA> values when doing the comparison for the replacement (GH32621)
Bug in any() and all() incorrectly returning <NA> for all False or all True values using the nullable Boolean dtype and with skipna=False (GH33253)
Clarified documentation on interpolate with method=akima. The der parameter must be scalar or None (GH33426)
DataFrame.interpolate() now uses the correct axis convention. Previously, interpolating along columns led to interpolation along indices and vice versa. Furthermore, interpolating with the methods pad, ffill, bfill and backfill is now identical to using these methods with DataFrame.fillna() (GH12918, GH29146)
Bug in DataFrame.interpolate() when called on a DataFrame with column names of string type was throwing a ValueError. The method is now independent of the type of the column names (GH33956)
Passing NA into a format string using format specs will now work. For example "{:.1f}".format(pd.NA) would previously raise a ValueError, but will now return the string "<NA>" (GH34740)
Bug in Series.map() not raising on invalid na_action (GH32815)
DataFrame.swaplevel() now raises a TypeError if the axis is not a MultiIndex. Previously an AttributeError was raised (GH31126)
Bug in DataFrame.loc() when used with a MultiIndex, where the returned values were not in the same order as the given inputs (GH22797)
In [80]: df = pd.DataFrame(np.arange(4),
   ....:                   index=[["a", "a", "b", "b"], [1, 2, 1, 2]])
   ....:

# Rows are now ordered as the requested keys
In [81]: df.loc[(['b', 'a'], [2, 1]), :]
Out[81]:
     0
b 2  3
  1  2
a 2  1
  1  0

[4 rows x 1 columns]
Bug in MultiIndex.intersection() was not guaranteed to preserve order when sort=False. (GH31325)
Bug in DataFrame.truncate() was dropping MultiIndex names. (GH34564)
In [82]: left = pd.MultiIndex.from_arrays([["b", "a"], [2, 1]])

In [83]: right = pd.MultiIndex.from_arrays([["a", "b", "c"], [1, 2, 3]])

# Common elements are now guaranteed to be ordered by the left side
In [84]: left.intersection(right, sort=False)
Out[84]:
MultiIndex([('b', 2),
            ('a', 1)],
           )
Bug when joining two MultiIndex objects with different columns without specifying level, where the return_indexers parameter was ignored (GH34074)
Passing a set as names argument to pandas.read_csv(), pandas.read_table(), or pandas.read_fwf() will raise ValueError: Names should be an ordered collection. (GH34946)
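The following sketch shows the new check for `pandas.read_csv()` on a tiny in-memory CSV (made-up data): a set is rejected, while an ordered collection such as a list is accepted.

```python
import io

import pandas as pd

data = "1,2\n3,4\n"  # illustrative CSV content without a header row

message = None
try:
    # A set has no defined order, so it is rejected as column names.
    pd.read_csv(io.StringIO(data), names={"a", "b"})
except ValueError as err:
    message = str(err)

# A list is ordered and therefore accepted.
df = pd.read_csv(io.StringIO(data), names=["a", "b"])

print(message)
print(df)
```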
Bug in print-out when display.precision is zero. (GH20359)
Bug in read_json() where integer overflow was occurring when json contains big number strings. (GH30320)
read_csv() will now raise a ValueError when the arguments header and prefix are both not None. (GH27394)
Bug in DataFrame.to_json() was raising NotFoundError when path_or_buf was an S3 URI (GH28375)
Bug in DataFrame.to_parquet() overwriting pyarrow’s default for coerce_timestamps; following pyarrow’s default allows writing nanosecond timestamps with version="2.0" (GH31652).
Bug in read_csv() was raising TypeError when sep=None was used in combination with comment keyword (GH31396)
Bug in HDFStore that caused it to set to int64 the dtype of a datetime64 column when reading a DataFrame in Python 3 from fixed format written in Python 2 (GH31750)
read_sas() now handles dates and datetimes larger than Timestamp.max returning them as datetime.datetime objects (GH20927)
Bug in DataFrame.to_json() where Timedelta objects would not be serialized correctly with date_format="iso" (GH28256)
read_csv() will raise a ValueError when the column names passed in parse_dates are missing in the DataFrame (GH31251)
Bug in read_excel() where a UTF-8 string with a high surrogate would cause a segmentation violation (GH23809)
Bug in read_csv() was causing a file descriptor leak on an empty file (GH31488)
Bug in read_csv() was causing a segfault when there were blank lines between the header and data rows (GH28071)
Bug in read_csv() was raising a misleading exception on a permissions issue (GH23784)
Bug in read_csv() was raising an IndexError when header=None and there were two extra data columns
Bug in read_sas() was raising an AttributeError when reading files from Google Cloud Storage (GH33069)
Bug in DataFrame.to_sql() where an AttributeError was raised when saving an out of bounds date (GH26761)
Bug in read_excel() did not correctly handle multiple embedded spaces in OpenDocument text cells. (GH32207)
Bug in read_json() was raising TypeError when reading a list of Booleans into a Series. (GH31464)
Bug in pandas.io.json.json_normalize() where location specified by record_path doesn’t point to an array. (GH26284)
pandas.read_hdf() has a more explicit error message when loading an unsupported HDF file (GH9539)
Bug in read_feather() was raising an ArrowIOError when reading an s3 or http file path (GH29055)
Bug where to_excel() could not handle the column name render and was raising a KeyError (GH34331)
Bug in execute() was raising a ProgrammingError for some DB-API drivers when the SQL statement contained the % character and no parameters were present (GH34211)
Bug in StataReader() which resulted in categorical variables with different dtypes when reading data using an iterator. (GH31544)
HDFStore.keys() has now an optional include parameter that allows the retrieval of all native HDF5 table names (GH29916)
TypeError exceptions raised by read_csv() and read_table() were showing as parser_f when an unexpected keyword argument was passed (GH25648)
Bug in read_excel() for ODS files removes 0.0 values (GH27222)
Bug in ujson.encode() was raising an OverflowError with numbers larger than sys.maxsize (GH34395)
Bug in HDFStore.append_to_multiple() was raising a ValueError when the min_itemsize parameter is set (GH11238)
create_table() now raises an error when the column argument was not specified in data_columns on input (GH28156)
read_json() can now read a line-delimited JSON file from a file URL when lines and chunksize are set.
DataFrame.to_sql() now raises a more explicit ValueError when writing DataFrames with -np.inf entries to MySQL (GH34431)
Bug where capitalised file extensions were not decompressed by read_* functions (GH35164)
Bug in read_excel() that was raising a TypeError when header=None and index_col is given as a list (GH31783)
Bug in read_excel() where datetime values are used in the header in a MultiIndex (GH34748)
read_excel() no longer takes **kwds arguments. This means that passing in the keyword argument chunksize now raises a TypeError (previously raised a NotImplementedError), while passing in the keyword argument encoding now raises a TypeError (GH34464)
Bug in DataFrame.to_records() was incorrectly losing timezone information in timezone-aware datetime64 columns (GH32535)
DataFrame.plot() for line/bar now accepts color by dictionary (GH8193).
Bug in DataFrame.plot.hist() where weights are not working for multiple columns (GH33173)
Bug in DataFrame.boxplot() and DataFrame.plot.boxplot() where the color attributes of medianprops, whiskerprops, capprops and boxprops were lost (GH30346)
Bug in DataFrame.hist() where the order of column argument was ignored (GH29235)
Bug in DataFrame.plot.scatter() where, when adding multiple plots with different cmap values, colorbars always used the first cmap (GH33389)
Bug in DataFrame.plot.scatter() was adding a colorbar to the plot even if the argument c was assigned to a column containing color names (GH34316)
Bug in pandas.plotting.bootstrap_plot() was causing cluttered axes and overlapping labels (GH34905)
Bug in DataFrame.plot.scatter() caused an error when plotting variable marker sizes (GH32904)
Using a pandas.api.indexers.BaseIndexer with count, min, max, median, skew, cov, corr will now return correct results for any monotonic pandas.api.indexers.BaseIndexer descendant (GH32865)
DataFrameGroupBy.mean() and SeriesGroupBy.mean() (and similarly median(), std() and var()) now raise a TypeError if a non-accepted keyword argument is passed. Previously an UnsupportedFunctionCall was raised (AssertionError if min_count was passed into median()) (GH31485)
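As an illustration of the change above, this sketch passes a deliberately invalid keyword (the hypothetical `not_a_keyword`) to a grouped `mean()` on made-up data and records the exception type:

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "a", "b"], "val": [1.0, 2.0, 3.0]})

raised = None
try:
    # not_a_keyword is a hypothetical argument that mean() does not accept.
    df.groupby("key")["val"].mean(not_a_keyword=1)
except TypeError as err:
    raised = type(err).__name__

print(raised)
```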
Bug in GroupBy.apply() raising a ValueError when the by axis is not sorted, has duplicates, and the applied func does not mutate passed in objects (GH30667)
Bug in DataFrameGroupBy.transform() producing an incorrect result with transformation functions (GH30918)
Bug in GroupBy.transform() was returning the wrong result when grouping by multiple keys of which some were categorical and others not (GH32494)
Bug in GroupBy.count() causing a segmentation fault when grouped-by columns contain NaNs (GH32841)
Bug in DataFrame.groupby() and Series.groupby() producing an inconsistent type when aggregating Boolean Series (GH32894)
Bug in DataFrameGroupBy.sum() and SeriesGroupBy.sum() where a large negative number would be returned when the number of non-null values was below min_count for nullable integer dtypes (GH32861)
Bug in SeriesGroupBy.quantile() was raising on nullable integers (GH33136)
Bug in DataFrame.resample() where an AmbiguousTimeError would be raised when the resulting timezone aware DatetimeIndex had a DST transition at midnight (GH25758)
Bug in DataFrame.groupby() where a ValueError would be raised when grouping by a categorical column with read-only categories and sort=False (GH33410)
Bug in GroupBy.agg(), GroupBy.transform(), and GroupBy.resample() where subclasses are not preserved (GH28330)
Bug in SeriesGroupBy.agg() where any column name was accepted in the named aggregation of SeriesGroupBy. Only str and callables are now allowed; anything else raises a TypeError (GH34422)
Bug where DataFrame.groupby() lost the name of the Index when one of the agg keys referenced an empty list (GH32580)
Bug in Rolling.apply() where center=True was ignored when engine='numba' was specified (GH34784)
Bug in DataFrame.ewm.cov() was throwing AssertionError for MultiIndex inputs (GH34440)
Bug in core.groupby.DataFrameGroupBy.quantile() raised TypeError for non-numeric types rather than dropping the columns (GH27892)
Bug in core.groupby.DataFrameGroupBy.transform() when func='nunique' and columns are of type datetime64, the result would also be of type datetime64 instead of int64 (GH35109)
Bug in DataFrame.groupby() raising an AttributeError when selecting a column and aggregating with as_index=False (GH35246).
Bug in DataFrameGroupBy.first() and DataFrameGroupBy.last() that would raise an unnecessary ValueError when grouping on multiple Categoricals (GH34951)
Bug affecting all numeric and Boolean reduction methods not returning subclassed data type. (GH25596)
Bug in DataFrame.pivot_table() when only MultiIndexed columns are set (GH17038)
DataFrame.unstack() and Series.unstack() can now take tuple names in MultiIndexed data (GH19966)
Bug in DataFrame.pivot_table() when margins is True and only column is defined (GH31016)
Fixed incorrect error message in DataFrame.pivot() when columns is set to None. (GH30924)
Bug in crosstab() where, when the inputs are two Series with tuple names, the output kept a dummy MultiIndex as columns. (GH18321)
DataFrame.pivot() can now take lists for index and columns arguments (GH21425)
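A minimal sketch of passing a list to `index`, using a made-up four-row frame (the column names are illustrative). Keyword arguments are used throughout.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "region": ["n", "n", "s", "s"],
        "year": [2019, 2019, 2020, 2020],
        "kind": ["x", "y", "x", "y"],
        "value": [1, 2, 3, 4],
    }
)

# A list for index produces a MultiIndex on the rows of the result.
wide = df.pivot(index=["region", "year"], columns="kind", values="value")
print(wide)
```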
Bug in concat() where the resulting indices are not copied when copy=True (GH29879)
Bug in SeriesGroupBy.aggregate() was resulting in aggregations being overwritten when they shared the same name (GH30880)
Bug where Index.astype() would lose the name attribute when converting from Float64Index to Int64Index, or when casting to an ExtensionArray dtype (GH32013)
Series.append() will now raise a TypeError when passed a DataFrame or a sequence containing DataFrame (GH31413)
DataFrame.replace() and Series.replace() will raise a TypeError if to_replace is not an expected type. Previously the replace would fail silently (GH18634)
Bug in an inplace operation on a Series that was adding a column back to the DataFrame from which it was originally dropped (using inplace=True) (GH30484)
Bug in DataFrame.apply() where callback was called with Series parameter even though raw=True requested. (GH32423)
Bug in DataFrame.pivot_table() losing timezone information when creating a MultiIndex level from a column with timezone-aware dtype (GH32558)
Bug in concat() where passing a non-dict mapping as objs would raise a TypeError (GH32863)
DataFrame.agg() now provides more descriptive SpecificationError message when attempting to aggregate a non-existent column (GH32755)
Bug in DataFrame.unstack() when MultiIndex columns and MultiIndex rows were used (GH32624, GH24729 and GH28306)
Appending a dictionary to a DataFrame without passing ignore_index=True will raise TypeError: Can only append a dict if ignore_index=True instead of TypeError: Can only append a Series if ignore_index=True or if the Series has a name (GH30871)
Bug in DataFrame.corrwith(), DataFrame.memory_usage(), DataFrame.dot(), DataFrame.idxmin(), DataFrame.idxmax(), DataFrame.duplicated(), DataFrame.isin(), DataFrame.count(), Series.explode(), Series.asof() and DataFrame.asof() not returning subclassed types. (GH31331)
Bug in concat() was not allowing for concatenation of DataFrame and Series with duplicate keys (GH33654)
Bug in cut() raised an error when the argument labels contains duplicates (GH33141)
Ensure only named functions can be used in eval() (GH32460)
Bug in DataFrame.aggregate() and Series.aggregate() that caused a recursive loop in some cases (GH34224)
Fixed bug in melt() where melting MultiIndex columns with col_level > 0 would raise a KeyError on id_vars (GH34129)
Bug in Series.where() with an empty Series and empty cond having non-bool dtype (GH34592)
Fixed regression where DataFrame.apply() would raise ValueError for elements with S dtype (GH34529)
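Several of the reshaping fixes listed above can be exercised with a short sketch. The sample data, the helper name column_sum, and the choice of types.MappingProxyType as the non-dict mapping are assumptions made for illustration and are not taken from the linked issues. The cut() call passes ordered=False, which cut() requires when labels contain duplicates.

```python
import types

import numpy as np
import pandas as pd

# replace(): a callable is not a valid `to_replace`, so a TypeError is raised
# instead of the call silently doing nothing.
ser = pd.Series(["a", "b", "c "])
try:
    ser.replace(lambda x: x.strip(), "z")
    replace_error = None
except TypeError as exc:
    replace_error = exc

# apply(raw=True): record the type that the callback actually receives.
seen_types = []


def column_sum(values):
    # Hypothetical helper; with raw=True `values` should be a NumPy array.
    seen_types.append(type(values))
    return values.sum()


df = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
sums = df.apply(column_sum, raw=True)

# concat(): a mapping proxy is a Mapping but not a dict; its keys become
# the outer level of the resulting index.
frames = types.MappingProxyType({"first": df, "second": df})
stacked = pd.concat(frames)

# cut(): duplicate labels are accepted when ordered=False is passed.
binned = pd.cut(
    [1, 5, 9],
    bins=[0, 4, 8, 12],
    labels=["low", "high", "low"],
    ordered=False,
)

print(type(replace_error).__name__)
print(seen_types)
print(stacked)
print(list(binned))
```

With pandas 1.1.0 or later, the first print should report a TypeError, and every entry of seen_types should be numpy.ndarray rather than Series.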
Creating a SparseArray from timezone-aware dtype will issue a warning before dropping timezone information, instead of doing so silently (GH32501)
Bug in arrays.SparseArray.from_spmatrix() that misread scipy sparse matrices (GH31991)
Bug in Series.sum() with SparseArray raised a TypeError (GH25777)
Bug where a DataFrame containing an all-sparse SparseArray was filled with NaN when indexed by a list-like (GH27781, GH29563)
The repr of SparseDtype now includes the repr of its fill_value attribute. Previously it used fill_value’s string representation (GH34352)
Bug where empty DataFrame could not be cast to SparseDtype (GH33113)
Bug in arrays.SparseArray() that returned the incorrect type when indexing a sparse DataFrame with an iterable (GH34526, GH34540)
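As a minimal sketch of two of the sparse fixes above, the following sums a Series backed by a SparseArray and casts an empty DataFrame to a SparseDtype. The input values and variable names are assumptions chosen for illustration.

```python
import pandas as pd

# Series.sum() on sparse data: previously this raised a TypeError (GH25777).
sparse_ser = pd.Series(pd.arrays.SparseArray([1, 0, 2]))
total = sparse_ser.sum()

# An empty DataFrame can now be cast to a SparseDtype (GH33113).
empty_sparse = pd.DataFrame().astype(pd.SparseDtype("float64"))

print(sparse_ser.dtype)
print(total)
print(empty_sparse.shape)
```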
Fixed bug where Series.value_counts() would raise on empty input of Int64 dtype (GH33317)
Fixed bug in concat() when concatenating DataFrame objects with non-overlapping columns resulting in object-dtype columns rather than preserving the extension dtype (GH27692, GH33027)
Fixed bug where StringArray.isna() would return False for NA values when pandas.options.mode.use_inf_as_na was set to True (GH33655)
Fixed bug where Series construction with an ExtensionArray dtype and an index, but with no data or with scalar data, would fail (GH26469)
Fixed bug that caused Series.__repr__() to crash for extension types whose elements are multidimensional arrays (GH33770).
Fixed bug where Series.update() would raise a ValueError for ExtensionArray dtypes with missing values (GH33980)
Fixed bug where StringArray.memory_usage() was not implemented (GH33963)
Fixed bug where DataFrameGroupBy() would ignore the min_count argument for aggregations on nullable Boolean dtypes (GH34051)
Fixed bug where the constructor of DataFrame with dtype='string' would fail (GH27953, GH33623)
Bug where DataFrame column set to scalar extension type was considered an object type rather than the extension type (GH34832)
Fixed bug in IntegerArray.astype() to correctly copy the mask as well (GH34931).
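Three of the extension-dtype fixes above lend themselves to a brief illustration: value_counts() on empty Int64 input, concat() of frames with non-overlapping columns, and DataFrame construction with dtype='string'. This is a sketch under assumed sample data; the column names and values are hypothetical.

```python
import pandas as pd

# value_counts() on an empty nullable-integer Series no longer raises (GH33317).
empty_counts = pd.Series([], dtype="Int64").value_counts()

# concat() with non-overlapping columns keeps the extension dtype rather than
# falling back to object dtype (GH27692, GH33027).
left = pd.DataFrame({"a": pd.array([1, 2], dtype="Int64")})
right = pd.DataFrame({"b": pd.array([3], dtype="Int64")})
combined = pd.concat([left, right], ignore_index=True)

# The DataFrame constructor accepts dtype="string" (GH27953, GH33623).
text_frame = pd.DataFrame({"name": ["x", "y"]}, dtype="string")

print(len(empty_counts))
print(combined.dtypes)
print(text_frame.dtypes)
```

In combined, the rows that did not supply a value for a column should hold the missing-value marker of the nullable dtype rather than forcing a cast.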
Set operations on an object-dtype Index now always return object-dtype results (GH31401)
Fixed pandas.testing.assert_series_equal() to correctly raise if the left argument is a different subclass with check_series_type=True (GH32670).
Getting a missing attribute in a DataFrame.query() or DataFrame.eval() string raises the correct AttributeError (GH32408)
Fixed bug in pandas.testing.assert_series_equal() where dtypes were checked for Interval and ExtensionArray operands when check_dtype was False (GH32747)
Bug in DataFrame.__dir__() caused a segfault when using unicode surrogates in a column name (GH25509)
Bug in DataFrame.equals() and Series.equals() in allowing subclasses to be equal (GH34402).
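The assert_series_equal() fix for check_series_type=True can be sketched as follows. LabeledSeries is a hypothetical subclass defined only for this example; it is passed as the left argument, which is the case the fix addresses.

```python
import pandas as pd


class LabeledSeries(pd.Series):
    """Hypothetical Series subclass used only for this illustration."""

    @property
    def _constructor(self):
        return LabeledSeries


left = LabeledSeries([1, 2, 3])
right = pd.Series([1, 2, 3])

# With the class check enabled, differing classes should raise AssertionError
# even though the values, index, and dtype match.
try:
    pd.testing.assert_series_equal(left, right, check_series_type=True)
    type_mismatch_detected = False
except AssertionError:
    type_mismatch_detected = True

# With the class check disabled, only the contents are compared.
pd.testing.assert_series_equal(left, right, check_series_type=False)

print(type_mismatch_detected)
```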
A total of 368 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.
3vts +
A Brooks +
Abbie Popa +
Achmad Syarif Hidayatullah +
Adam W Bagaskarta +
Adrian Mastronardi +
Aidan Montare +
Akbar Septriyan +
Akos Furton +
Alejandro Hall +
Alex Hall +
Alex Itkes +
Alex Kirko
Ali McMaster +
Alvaro Aleman +
Amy Graham +
Andrew Schonfeld +
Andrew Shumanskiy +
Andrew Wieteska +
Angela Ambroz
Anjali Singh +
Anna Daglis
Anthony Milbourne +
Antony Lee +
Ari Sosnovsky +
Arkadeep Adhikari +
Arunim Samudra +
Ashkan +
Ashwin Prakash Nalwade +
Ashwin Srinath +
Atsushi Nukariya +
Ayappan +
Ayla Khan +
Bart +
Bart Broere +
Benjamin Beier Liu +
Benjamin Fischer +
Bharat Raghunathan
Bradley Dice +
Brendan Sullivan +
Brian Strand +
Carsten van Weelden +
Chamoun Saoma +
ChrisRobo +
Christian Chwala
Christopher Whelan
Christos Petropoulos +
Chuanzhu Xu
CloseChoice +
Clément Robert +
CuylenE +
DanBasson +
Daniel Saxton
Danilo Horta +
DavaIlhamHaeruzaman +
Dave Hirschfeld
Dave Hughes
David Rouquet +
David S +
Deepyaman Datta
Dennis Bakhuis +
Derek McCammond +
Devjeet Roy +
Diane Trout
Dina +
Dom +
Drew Seibert +
EdAbati
Emiliano Jordan +
Erfan Nariman +
Eric Groszman +
Erik Hasse +
Erkam Uyanik +
Evan D +
Evan Kanter +
Fangchen Li +
Farhan Reynaldo +
Farhan Reynaldo Hutabarat +
Florian Jetter +
Fred Reiss +
GYHHAHA +
Gabriel Moreira +
Gabriel Tutui +
Galuh Sahid
Gaurav Chauhan +
George Hartzell +
Gim Seng +
Giovanni Lanzani +
Gordon Chen +
Graham Wetzler +
Guillaume Lemaitre
Guillem Sánchez +
HH-MWB +
Harshavardhan Bachina
How Si Wei
Ian Eaves
Iqrar Agalosi Nureyza +
Irv Lustig
Iva Laginja +
JDkuba
Jack Greisman +
Jacob Austin +
Jacob Deppen +
Jacob Peacock +
Jake Tae +
Jake Vanderplas +
James Cobon-Kerr
Jan Červenka +
Jan Škoda
Jane Chen +
Jean-Francois Zinque +
Jeanderson Barros Candido +
Jeff Reback
Jered Dominguez-Trujillo +
Jeremy Schendel
Jesse Farnham
Jiaxiang
Jihwan Song +
Joaquim L. Viegas +
Joel Nothman
John Bodley +
John Paton +
Jon Thielen +
Joris Van den Bossche
Jose Manuel Martí +
Joseph Gulian +
Josh Dimarsky
Joy Bhalla +
João Veiga +
Julian de Ruiter +
Justin Essert +
Justin Zheng
KD-dev-lab +
Kaiqi Dong
Karthik Mathur +
Kaushal Rohit +
Kee Chong Tan
Ken Mankoff +
Kendall Masse
Kenny Huynh +
Ketan +
Kevin Anderson +
Kevin Bowey +
Kevin Sheppard
Kilian Lieret +
Koki Nishihara +
Krishna Chivukula +
KrishnaSai2020 +
Lesley +
Lewis Cowles +
Linda Chen +
Linxiao Wu +
Lucca Delchiaro Costabile +
MBrouns +
Mabel Villalba
Mabroor Ahmed +
Madhuri Palanivelu +
Mak Sze Chun
Malcolm +
Marc Garcia
Marco Gorelli
Marian Denes +
Martin Bjeldbak Madsen +
Martin Durant +
Martin Fleischmann +
Martin Jones +
Martin Winkel
Martina Oefelein +
Marvzinc +
María Marino +
Matheus Cardoso +
Mathis Felardos +
Matt Roeschke
Matteo Felici +
Matteo Santamaria +
Matthew Roeschke
Matthias Bussonnier
Max Chen
Max Halford +
Mayank Bisht +
Megan Thong +
Michael Marino +
Miguel Marques +
Mike Kutzma
Mohammad Hasnain Mohsin Rajan +
Mohammad Jafar Mashhadi +
MomIsBestFriend
Monica +
Natalie Jann
Nate Armstrong +
Nathanael +
Nick Newman +
Nico Schlömer +
Niklas Weber +
ObliviousParadigm +
Olga Lyashevska +
OlivierLuG +
Pandas Development Team
Parallels +
Patrick +
Patrick Cando +
Paul Lilley +
Paul Sanders +
Pearcekieser +
Pedro Larroy +
Pedro Reys
Peter Bull +
Peter Steinbach +
Phan Duc Nhat Minh +
Phil Kirlin +
Pierre-Yves Bourguignon +
Piotr Kasprzyk +
Piotr Niełacny +
Prakhar Pandey
Prashant Anand +
Puneetha Pai +
Quang Nguyễn +
Rafael Jaimes III +
Rafif +
RaisaDZ +
Rakshit Naidu +
Ram Rachum +
Red +
Ricardo Alanis +
Richard Shadrach +
Rik-de-Kort
Robert de Vries
Robin to Roxel +
Roger Erens +
Rohith295 +
Roman Yurchak
Ror +
Rushabh Vasani
Ryan
Ryan Nazareth
SAI SRAVAN MEDICHERLA +
SHUBH CHATTERJEE +
Sam Cohan
Samira-g-js +
Sandu Ursu +
Sang Agung +
SanthoshBala18 +
Sasidhar Kasturi +
SatheeshKumar Mohan +
Saul Shanabrook
Scott Gigante +
Sebastian Berg +
Sebastián Vanrell
Sergei Chipiga +
Sergey +
ShilpaSugan +
Simon Gibbons
Simon Hawkins
Simon Legner +
Soham Tiwari +
Song Wenhao +
Souvik Mandal
Spencer Clark
Steffen Rehberg +
Steffen Schmitz +
Stijn Van Hoey
Stéphan Taljaard
SultanOrazbayev +
Sumanau Sareen
SurajH1 +
Suvayu Ali +
Terji Petersen
Thomas J Fan +
Thomas Li
Thomas Smith +
Tim Swast
Tobias Pitters +
Tom +
Tom Augspurger
Uwe L. Korn
Valentin Iovene +
Vandana Iyer +
Venkatesh Datta +
Vijay Sai Mutyala +
Vikas Pandey
Vipul Rai +
Vishwam Pandya +
Vladimir Berkutov +
Will Ayd
Will Holmgren
William +
William Ayd
Yago González +
Yosuke KOBAYASHI +
Zachary Lawrence +
Zaky Bilfagih +
Zeb Nicholls +
alimcmaster1
alm +
andhikayusup +
andresmcneill +
avinashpancham +
benabel +
bernie gray +
biddwan09 +
brock +
chris-b1
cleconte987 +
dan1261 +
david-cortes +
davidwales +
dequadras +
dhuettenmoser +
dilex42 +
elmonsomiat +
epizzigoni +
fjetter
gabrielvf1 +
gdex1 +
gfyoung
guru kiran +
h-vishal
iamshwin
jamin-aws-ospo +
jbrockmendel
jfcorbett +
jnecus +
kernc
kota matsuoka +
kylekeppler +
leandermaben +
link2xt +
manoj_koneni +
marydmit +
masterpiga +
maxime.song +
mglasder +
moaraccounts +
mproszewska
neilkg
nrebena
ossdev07 +
paihu
pan Jacek +
partev +
patrick +
pedrooa +
pizzathief +
proost
pvanhauw +
rbenes
rebecca-palmer
rhshadrach +
rjfs +
s-scherrer +
sage +
sagungrp +
salem3358 +
saloni30 +
smartswdeveloper +
smartvinnetou +
themien +
timhunderwood +
tolhassianipar +
tonywu1999
tsvikas
tv3141
venkateshdatta1993 +
vivikelapoutre +
willbowditch +
willpeppo +
za +
zaki-indra +