pandas.Series.drop_duplicates
- Series.drop_duplicates(*, keep='first', inplace=False, ignore_index=False)
Return Series with duplicate values removed.
- Parameters:
- keep : {‘first’, ‘last’, False}, default ‘first’
Method to handle dropping duplicates:
‘first’ : Drop duplicates except for the first occurrence.
‘last’ : Drop duplicates except for the last occurrence.
False : Drop all duplicates.
- inplace : bool, default False
If True, performs operation inplace and returns None.
- ignore_index : bool, default False
If True, the resulting axis will be labeled 0, 1, …, n - 1.
Added in version 2.0.0.
- Returns:
- Series or None
Series with duplicates dropped or None if inplace=True.
See also
Index.drop_duplicates
Equivalent method on Index.
DataFrame.drop_duplicates
Equivalent method on DataFrame.
Series.duplicated
Related method on Series, indicating duplicate Series values.
Series.unique
Return unique values as an array.
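The relationship between these methods can be sketched as follows. This is a minimal illustration, not part of the original reference; the variable names `mask`, `via_mask`, and `via_method` are chosen here for the example only. It shows that filtering with the negated `Series.duplicated` mask selects the same rows as `drop_duplicates`, while `Series.unique` returns only the values and discards the index labels.

```python
import pandas as pd

s = pd.Series(["llama", "cow", "llama", "beetle"], name="animal")

# duplicated() marks every repeat after the first occurrence as True.
mask = s.duplicated(keep="first")

# Keeping the rows where the mask is False matches drop_duplicates().
via_mask = s[~mask]
via_method = s.drop_duplicates(keep="first")
print(via_mask.equals(via_method))

# unique() returns the distinct values in order of appearance,
# without the original index labels.
print(list(s.unique()))
```

Use `drop_duplicates` when the original index labels matter, and `unique` when only the distinct values are needed.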
Examples
Generate a Series with duplicated entries.
>>> s = pd.Series(
...     ["llama", "cow", "llama", "beetle", "llama", "hippo"], name="animal"
... )
>>> s
0     llama
1       cow
2     llama
3    beetle
4     llama
5     hippo
Name: animal, dtype: object
The ‘keep’ parameter controls which occurrence in each set of duplicated entries is retained. The default value, ‘first’, keeps the first occurrence.
>>> s.drop_duplicates()
0     llama
1       cow
3    beetle
5     hippo
Name: animal, dtype: object
The value ‘last’ for parameter ‘keep’ keeps the last occurrence for each set of duplicated entries.
>>> s.drop_duplicates(keep="last")
1       cow
3    beetle
4     llama
5     hippo
Name: animal, dtype: object
The value False for parameter ‘keep’ discards all sets of duplicated entries.
>>> s.drop_duplicates(keep=False)
1       cow
3    beetle
5     hippo
Name: animal, dtype: object
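The `ignore_index` and `inplace` parameters are described above but not demonstrated. The following is a small sketch of how they behave, assuming pandas 2.0.0 or later (the version in which `ignore_index` was added); the names `deduped`, `s_copy`, and `result` are illustrative only.

```python
import pandas as pd

s = pd.Series(
    ["llama", "cow", "llama", "beetle", "llama", "hippo"], name="animal"
)

# ignore_index=True relabels the result 0, 1, ..., n - 1
# instead of keeping the original labels 0, 1, 3, 5.
deduped = s.drop_duplicates(ignore_index=True)
print(deduped.index.tolist())  # [0, 1, 2, 3]
print(deduped.tolist())        # ['llama', 'cow', 'beetle', 'hippo']

# inplace=True modifies the Series itself and returns None.
s_copy = s.copy()
result = s_copy.drop_duplicates(inplace=True)
print(result)                  # None
print(s_copy.index.tolist())   # [0, 1, 3, 5]
```

Because the in-place call returns None, assigning its result back to the original variable would discard the data; work on the modified Series directly instead.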