pandas.DataFrame.set_index#
- DataFrame.set_index(keys, *, drop=True, append=False, inplace=False, verify_integrity=False)[source]#
- Set the DataFrame index using existing columns. - Set the DataFrame index (row labels) using one or more existing columns or arrays (of the correct length). The index can replace the existing index or expand on it. - Parameters
- keyslabel or array-like or list of labels/arrays
- This parameter can be either a single column key, a single array of the same length as the calling DataFrame, or a list containing an arbitrary combination of column keys and arrays. Here, “array” encompasses - Series,- Index,- np.ndarray, and instances of- Iterator.
- dropbool, default True
- Delete columns to be used as the new index. 
- appendbool, default False
- Whether to append columns to existing index. 
- inplacebool, default False
- Whether to modify the DataFrame rather than creating a new one. 
- verify_integritybool, default False
- Check the new index for duplicates. Otherwise defer the check until necessary. Setting to False will improve the performance of this method. 
 
- Returns
- DataFrame or None
- Changed row labels or None if - inplace=True.
 
 - See also - DataFrame.reset_index
- Opposite of set_index. 
- DataFrame.reindex
- Change to new indices or expand indices. 
- DataFrame.reindex_like
- Change to same indices as other DataFrame. 
 - Examples - >>> df = pd.DataFrame({'month': [1, 4, 7, 10], ... 'year': [2012, 2014, 2013, 2014], ... 'sale': [55, 40, 84, 31]}) >>> df month year sale 0 1 2012 55 1 4 2014 40 2 7 2013 84 3 10 2014 31 - Set the index to become the ‘month’ column: - >>> df.set_index('month') year sale month 1 2012 55 4 2014 40 7 2013 84 10 2014 31 - Create a MultiIndex using columns ‘year’ and ‘month’: - >>> df.set_index(['year', 'month']) sale year month 2012 1 55 2014 4 40 2013 7 84 2014 10 31 - Create a MultiIndex using an Index and a column: - >>> df.set_index([pd.Index([1, 2, 3, 4]), 'year']) month sale year 1 2012 1 55 2 2014 4 40 3 2013 7 84 4 2014 10 31 - Create a MultiIndex using two Series: - >>> s = pd.Series([1, 2, 3, 4]) >>> df.set_index([s, s**2]) month year sale 1 1 1 2012 55 2 4 4 2014 40 3 9 7 2013 84 4 16 10 2014 31