pandas.unique

pandas.unique(values)[source]

Hash table-based unique. Uniques are returned in order of appearance. This does NOT sort.

Significantly faster than numpy.unique. Includes NA values.

Parameters
values1d array-like
Returns
numpy.ndarray or ExtensionArray

The return can be:

  • Index : when the input is an Index

  • Categorical : when the input is a Categorical dtype

  • ndarray : when the input is a Series/ndarray

Return numpy.ndarray or ExtensionArray.

See also

Index.unique

Return unique values from an Index.

Series.unique

Return unique values of Series object.

Examples

>>> pd.unique(pd.Series([2, 1, 3, 3]))
array([2, 1, 3])
>>> pd.unique(pd.Series([2] + [1] * 5))
array([2, 1])
>>> pd.unique(pd.Series([pd.Timestamp('20160101'),
...                     pd.Timestamp('20160101')]))
array(['2016-01-01T00:00:00.000000000'], dtype='datetime64[ns]')
>>> pd.unique(pd.Series([pd.Timestamp('20160101', tz='US/Eastern'),
...                      pd.Timestamp('20160101', tz='US/Eastern')]))
array([Timestamp('2016-01-01 00:00:00-0500', tz='US/Eastern')],
      dtype=object)
>>> pd.unique(pd.Index([pd.Timestamp('20160101', tz='US/Eastern'),
...                     pd.Timestamp('20160101', tz='US/Eastern')]))
DatetimeIndex(['2016-01-01 00:00:00-05:00'],
...           dtype='datetime64[ns, US/Eastern]', freq=None)
>>> pd.unique(list('baabc'))
array(['b', 'a', 'c'], dtype=object)

An unordered Categorical will return categories in the order of appearance.

>>> pd.unique(pd.Series(pd.Categorical(list('baabc'))))
[b, a, c]
Categories (3, object): [b, a, c]
>>> pd.unique(pd.Series(pd.Categorical(list('baabc'),
...                                    categories=list('abc'))))
[b, a, c]
Categories (3, object): [b, a, c]

An ordered Categorical preserves the category ordering.

>>> pd.unique(pd.Series(pd.Categorical(list('baabc'),
...                                    categories=list('abc'),
...                                    ordered=True)))
[b, a, c]
Categories (3, object): [a < b < c]

An array of tuples

>>> pd.unique([('a', 'b'), ('b', 'a'), ('a', 'c'), ('b', 'a')])
array([('a', 'b'), ('b', 'a'), ('a', 'c')], dtype=object)