pandas.DataFrame.sort_values#
- DataFrame.sort_values(by, *, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last', ignore_index=False, key=None)[source]#
- Sort by the values along either axis. - Parameters
- bystr or list of str
- Name or list of names to sort by. - if axis is 0 or ‘index’ then by may contain index levels and/or column labels. 
- if axis is 1 or ‘columns’ then by may contain column levels and/or index labels. 
 
- axis{0 or ‘index’, 1 or ‘columns’}, default 0
- Axis to be sorted. 
- ascendingbool or list of bool, default True
- Sort ascending vs. descending. Specify list for multiple sort orders. If this is a list of bools, must match the length of the by. 
- inplacebool, default False
- If True, perform operation in-place. 
- kind{‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}, default ‘quicksort’
- Choice of sorting algorithm. See also - numpy.sort()for more information. mergesort and stable are the only stable algorithms. For DataFrames, this option is only applied when sorting on a single column or label.
- na_position{‘first’, ‘last’}, default ‘last’
- Puts NaNs at the beginning if first; last puts NaNs at the end. 
- ignore_indexbool, default False
- If True, the resulting axis will be labeled 0, 1, …, n - 1. - New in version 1.0.0. 
- keycallable, optional
- Apply the key function to the values before sorting. This is similar to the key argument in the builtin - sorted()function, with the notable difference that this key function should be vectorized. It should expect a- Seriesand return a Series with the same shape as the input. It will be applied to each column in by independently.- New in version 1.1.0. 
 
- Returns
- DataFrame or None
- DataFrame with sorted values or None if - inplace=True.
 
 - See also - DataFrame.sort_index
- Sort a DataFrame by the index. 
- Series.sort_values
- Similar method for a Series. 
 - Examples - >>> df = pd.DataFrame({ ... 'col1': ['A', 'A', 'B', np.nan, 'D', 'C'], ... 'col2': [2, 1, 9, 8, 7, 4], ... 'col3': [0, 1, 9, 4, 2, 3], ... 'col4': ['a', 'B', 'c', 'D', 'e', 'F'] ... }) >>> df col1 col2 col3 col4 0 A 2 0 a 1 A 1 1 B 2 B 9 9 c 3 NaN 8 4 D 4 D 7 2 e 5 C 4 3 F - Sort by col1 - >>> df.sort_values(by=['col1']) col1 col2 col3 col4 0 A 2 0 a 1 A 1 1 B 2 B 9 9 c 5 C 4 3 F 4 D 7 2 e 3 NaN 8 4 D - Sort by multiple columns - >>> df.sort_values(by=['col1', 'col2']) col1 col2 col3 col4 1 A 1 1 B 0 A 2 0 a 2 B 9 9 c 5 C 4 3 F 4 D 7 2 e 3 NaN 8 4 D - Sort Descending - >>> df.sort_values(by='col1', ascending=False) col1 col2 col3 col4 4 D 7 2 e 5 C 4 3 F 2 B 9 9 c 0 A 2 0 a 1 A 1 1 B 3 NaN 8 4 D - Putting NAs first - >>> df.sort_values(by='col1', ascending=False, na_position='first') col1 col2 col3 col4 3 NaN 8 4 D 4 D 7 2 e 5 C 4 3 F 2 B 9 9 c 0 A 2 0 a 1 A 1 1 B - Sorting with a key function - >>> df.sort_values(by='col4', key=lambda col: col.str.lower()) col1 col2 col3 col4 0 A 2 0 a 1 A 1 1 B 2 B 9 9 c 3 NaN 8 4 D 4 D 7 2 e 5 C 4 3 F - Natural sort with the key argument, using the natsort <https://github.com/SethMMorton/natsort> package. - >>> df = pd.DataFrame({ ... "time": ['0hr', '128hr', '72hr', '48hr', '96hr'], ... "value": [10, 20, 30, 40, 50] ... }) >>> df time value 0 0hr 10 1 128hr 20 2 72hr 30 3 48hr 40 4 96hr 50 >>> from natsort import index_natsorted >>> df.sort_values( ... by="time", ... key=lambda x: np.argsort(index_natsorted(df["time"])) ... ) time value 0 0hr 10 3 48hr 40 2 72hr 30 4 96hr 50 1 128hr 20