pandas.Series.cat.set_categories
- Series.cat.set_categories(new_categories, ordered=None, rename=False)
Set the categories to the specified new categories.
new_categories can include new categories (which will result in unused categories) or remove old categories (which results in values set to NaN). If rename=True, the categories will simply be renamed (fewer or more items than in old categories will result in values set to NaN or in unused categories respectively).
This method can be used to perform more than one action of adding, removing, and reordering simultaneously and is therefore faster than performing the individual steps via the more specialised methods.
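As a rough illustration of combining several actions in one call, the sketch below removes one category, adds another, and reorders the rest, then repeats the same change with the specialised methods. The Series contents and the variable names (ser, combined, stepwise) are made up for this example.

```python
import pandas as pd

# Illustrative data; categories are inferred as ['a', 'b', 'c'].
ser = pd.Series(["a", "b", "c", "a"], dtype="category")

# One call: drop "c", add "d", and put "b" first.
combined = ser.cat.set_categories(["b", "a", "d"])

# The same change expressed one step at a time.
stepwise = (
    ser.cat.remove_categories(["c"])
    .cat.add_categories(["d"])
    .cat.reorder_categories(["b", "a", "d"])
)

print(list(combined.cat.categories))  # ['b', 'a', 'd']
print(list(stepwise.cat.categories))  # ['b', 'a', 'd']
print(combined.isna().tolist())       # [False, False, True, False]
```

The value that belonged to the removed category "c" becomes NaN in both versions, and "d" is present as an unused category.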
On the other hand, this method does not do checks (e.g., whether the old categories are included in the new categories on a reorder), which can result in surprising changes, for example when using special string dtypes, which do not consider an S1 string equal to a single-character Python string.
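The missing checks can be seen with a simple typo. In this sketch (illustrative data and variable names), an intended reorder misspells "b" as "B": set_categories accepts it and silently turns the "b" values into NaN, whereas reorder_categories rejects the same input.

```python
import pandas as pd

ser = pd.Series(["a", "b", "c"], dtype="category")

# Typo: "B" instead of "b". No error is raised; the old "b" value is lost.
silent = ser.cat.set_categories(["c", "B", "a"])
print(int(silent.isna().sum()))  # 1

# reorder_categories requires the same set of categories and raises instead.
error_message = None
try:
    ser.cat.reorder_categories(["c", "B", "a"])
except ValueError as exc:
    error_message = str(exc)
print(error_message is not None)  # True
```

When the intent is only to reorder, the specialised method is the safer choice for this reason.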
- Parameters:
- new_categories : Index-like
The categories in new order.
- ordered : bool, default None
Whether or not the categorical is treated as an ordered categorical. If not given, do not change the ordered information.
- rename : bool, default False
Whether or not the new_categories should be considered as a rename of the old categories or as reordered categories.
- Returns:
- Categorical
New categories to be used, with optional ordering changes.
- Raises:
- ValueError
If new_categories does not validate as categories.
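A small sketch of the ordered parameter and the ValueError case, using made-up data. It assumes that duplicate entries are one way for new_categories to fail validation; other invalid inputs may be rejected as well.

```python
import pandas as pd

ser = pd.Series(["low", "high", "mid"], dtype="category")

# ordered=None (the default) leaves the existing ordered flag unchanged.
same_flag = ser.cat.set_categories(["low", "mid", "high"])
print(same_flag.cat.ordered)  # False

# ordered=True makes the result an ordered categorical.
ranked = ser.cat.set_categories(["low", "mid", "high"], ordered=True)
print(ranked.cat.ordered)  # True
print(ranked.max())        # high

# Duplicate entries are not valid categories, so a ValueError is raised.
validation_error = None
try:
    ser.cat.set_categories(["low", "low", "high"])
except ValueError as exc:
    validation_error = str(exc)
print(validation_error is not None)  # True
```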
See also
rename_categories
Rename categories.
reorder_categories
Reorder categories.
add_categories
Add new categories.
remove_categories
Remove the specified categories.
remove_unused_categories
Remove categories which are not used.
Examples
For pandas.Series:
>>> raw_cat = pd.Categorical(
...     ["a", "b", "c", "A"], categories=["a", "b", "c"], ordered=True
... )
>>> ser = pd.Series(raw_cat)
>>> ser
0      a
1      b
2      c
3    NaN
dtype: category
Categories (3, object): ['a' < 'b' < 'c']
>>> ser.cat.set_categories(["A", "B", "C"], rename=True)
0      A
1      B
2      C
3    NaN
dtype: category
Categories (3, object): ['A' < 'B' < 'C']
For pandas.CategoricalIndex:
>>> ci = pd.CategoricalIndex(
...     ["a", "b", "c", "A"], categories=["a", "b", "c"], ordered=True
... )
>>> ci
CategoricalIndex(['a', 'b', 'c', nan], categories=['a', 'b', 'c'],
                 ordered=True, dtype='category')
>>> ci.set_categories(["A", "b", "c"])
CategoricalIndex([nan, 'b', 'c', nan], categories=['A', 'b', 'c'],
                 ordered=True, dtype='category')
>>> ci.set_categories(["A", "b", "c"], rename=True)
CategoricalIndex(['A', 'b', 'c', nan], categories=['A', 'b', 'c'],
                 ordered=True, dtype='category')
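The rename=True examples above keep the number of categories unchanged. The following sketch, with made-up data and variable names, shows what the description says about passing fewer or more names than there are old categories.

```python
import pandas as pd

ser = pd.Series(["a", "b", "c"], dtype="category")

# Fewer names than old categories: "c" has no replacement and becomes NaN.
fewer = ser.cat.set_categories(["X", "Y"], rename=True)
print(fewer.isna().tolist())  # [False, False, True]

# More names than old categories: "W" is added but unused.
more = ser.cat.set_categories(["X", "Y", "Z", "W"], rename=True)
print(more.tolist())               # ['X', 'Y', 'Z']
print(list(more.cat.categories))   # ['X', 'Y', 'Z', 'W']
```

With rename=True the new names are matched to the old categories by position, not by value.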