pandas.api.types.union_categoricals

pandas.api.types.union_categoricals(to_union, sort_categories=False, ignore_order=False)

Combine list-like of Categorical-like, unioning categories.

All categories must have the same dtype.

Parameters:
to_union : list-like

Categorical, CategoricalIndex, or Series with dtype='category'.

sort_categories : bool, default False

If true, resulting categories will be lexsorted, otherwise they will be ordered as they appear in the data.

ignore_order : bool, default False

If true, the ordered attribute of the Categoricals will be ignored. Results in an unordered categorical.

Returns:
Categorical
Raises:
TypeError
  • the inputs do not all have the same dtype

  • the inputs do not all have the same ordered property

  • all inputs are ordered and their categories are not identical

  • sort_categories=True and the Categoricals are ordered

ValueError

Empty list of categoricals passed
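
The error conditions listed above can be checked one at a time. The following is a minimal sketch rather than part of the pandas API: `raised_by` is a hypothetical helper introduced only for illustration, and it reports the exception type instead of the message, since the message wording may differ between pandas versions.

```python
import pandas as pd
from pandas.api.types import union_categoricals


def raised_by(to_union, **kwargs):
    """Hypothetical helper: name of the exception raised, or None on success."""
    try:
        union_categoricals(to_union, **kwargs)
    except (TypeError, ValueError) as err:
        return type(err).__name__
    return None


ints = pd.Categorical([1, 2])
strs = pd.Categorical(["a", "b"])
ordered_ab = pd.Categorical(["a", "b"], ordered=True)
ordered_abc = pd.Categorical(["a", "b", "c"], ordered=True)

# Categories of different dtypes (integers vs. strings).
dtype_mismatch = raised_by([ints, strs])
# One unordered input and one ordered input.
ordered_mismatch = raised_by([strs, ordered_ab])
# Both inputs ordered, but their categories differ.
categories_differ = raised_by([ordered_ab, ordered_abc])
# sort_categories=True combined with ordered inputs.
sort_ordered = raised_by([ordered_ab, ordered_ab], sort_categories=True)
# Nothing to union.
empty_input = raised_by([])

print(dtype_mismatch, ordered_mismatch, categories_differ, sort_ordered, empty_input)
```

Under these assumptions the first four calls report TypeError and the last reports ValueError.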

Notes

To learn more about categories, see the "Unioning" section of the pandas user guide on categorical data.

Examples

If you want to combine categoricals that do not necessarily have the same categories, union_categoricals will combine a list-like of categoricals. The new categories will be the union of the categories being combined.

>>> a = pd.Categorical(["b", "c"])
>>> b = pd.Categorical(["a", "b"])
>>> pd.api.types.union_categoricals([a, b])
['b', 'c', 'a', 'b']
Categories (3, object): ['b', 'c', 'a']

By default, the resulting categories are ordered as they appear in the categories of the data. If you want the categories to be lexsorted, pass sort_categories=True.

>>> pd.api.types.union_categoricals([a, b], sort_categories=True)
['b', 'c', 'a', 'b']
Categories (3, object): ['a', 'b', 'c']
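
It may help to see that sort_categories changes only the category order and the integer codes that point into it, not the values themselves. The sketch below recreates the same two inputs under the illustrative names `left` and `right` and compares the `categories` and `codes` attributes of both results.

```python
import pandas as pd
from pandas.api.types import union_categoricals

left = pd.Categorical(["b", "c"])
right = pd.Categorical(["a", "b"])

as_seen = union_categoricals([left, right])
lexsorted = union_categoricals([left, right], sort_categories=True)

# The values are identical in both results.
print(list(as_seen) == list(lexsorted))

# Only the category order, and therefore the codes, differ.
print(list(as_seen.categories), as_seen.codes.tolist())
print(list(lexsorted.categories), lexsorted.codes.tolist())
```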

union_categoricals also works when combining two categoricals that share the same categories and order information (for example, data you could also combine by appending).

>>> a = pd.Categorical(["a", "b"], ordered=True)
>>> b = pd.Categorical(["a", "b", "a"], ordered=True)
>>> pd.api.types.union_categoricals([a, b])
['a', 'b', 'a', 'b', 'a']
Categories (2, object): ['a' < 'b']

Raises TypeError because the categories are ordered and not identical.

>>> a = pd.Categorical(["a", "b"], ordered=True)
>>> b = pd.Categorical(["a", "b", "c"], ordered=True)
>>> pd.api.types.union_categoricals([a, b])
Traceback (most recent call last):
    ...
TypeError: to union ordered Categoricals, all categories must be the same

Ordered categoricals with different categories or orderings can be combined by using the ignore_order=True argument.

>>> a = pd.Categorical(["a", "b", "c"], ordered=True)
>>> b = pd.Categorical(["c", "b", "a"], ordered=True)
>>> pd.api.types.union_categoricals([a, b], ignore_order=True)
['a', 'b', 'c', 'c', 'b', 'a']
Categories (3, object): ['a', 'b', 'c']
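
As a further sketch of the same option, the snippet below covers both situations mentioned above: ordered inputs whose categories differ, and ordered inputs whose categories are the same but listed in a different order. The variable names are illustrative only. In each case the result drops the ordering, which can be confirmed through the `ordered` attribute.

```python
import pandas as pd
from pandas.api.types import union_categoricals

# Ordered inputs with different categories.
short = pd.Categorical(["a", "b"], ordered=True)
longer = pd.Categorical(["a", "b", "c"], ordered=True)
merged = union_categoricals([short, longer], ignore_order=True)

# Ordered inputs with the same categories in a different order.
ascending = pd.Categorical(["a", "b"], categories=["a", "b"], ordered=True)
descending = pd.Categorical(["a", "b"], categories=["b", "a"], ordered=True)
reconciled = union_categoricals([ascending, descending], ignore_order=True)

# Both results are unordered categoricals.
print(merged.ordered, list(merged.categories))
print(reconciled.ordered, list(reconciled.categories))
```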

union_categoricals also works with a CategoricalIndex or a Series containing categorical data, but note that the result will always be a plain Categorical.

>>> a = pd.Series(["b", "c"], dtype="category")
>>> b = pd.Series(["a", "b"], dtype="category")
>>> pd.api.types.union_categoricals([a, b])
['b', 'c', 'a', 'b']
Categories (3, object): ['b', 'c', 'a']
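
Because the result is a plain Categorical, the original container and its index are not carried over. The sketch below mixes two Series with a CategoricalIndex and then wraps the result in a new Series, which is one possible way to get back to a Series and not the only one. The variable names are illustrative.

```python
import pandas as pd
from pandas.api.types import union_categoricals

s1 = pd.Series(["b", "c"], dtype="category")
s2 = pd.Series(["a", "b"], dtype="category")
idx = pd.CategoricalIndex(["c", "d"])

combined = union_categoricals([s1, s2, idx])

# The result is a Categorical regardless of the input container types.
print(type(combined).__name__)
print(list(combined.categories))

# Wrap the result if a Series is needed; it gets a fresh default index.
combined_series = pd.Series(combined)
print(combined_series.dtype)
```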