Last modified: Dec 08, 2024 By Alexander Williams

Python Pandas sort_values() Simplified

Sorting data is an essential part of data analysis. In Python Pandas, the sort_values() method provides a simple way to sort rows or columns in a DataFrame.

What is sort_values()?

The sort_values() method in Pandas returns a DataFrame sorted by the values in one or more columns. With axis=1, it instead reorders the columns by the values in one or more rows.

With sort_values(), you can specify the sorting order, handle missing values, and sort by multiple criteria easily.

Basic Syntax of sort_values()


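DataFrame.sort_values(by, *, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last', ignore_index=False, key=None)

Parameters:

  • by: Column label or list of labels to sort by (row labels when axis=1).
  • axis: With axis=0 (the default), reorder rows by column values; with axis=1, reorder columns by row values.
  • ascending: Sort in ascending (True) or descending (False) order. Pass a list to set the direction per column.
  • inplace: If True, sort the original DataFrame and return None.
  • kind: Sorting algorithm ('quicksort', 'mergesort', 'heapsort', or 'stable'). For a DataFrame, it only applies when sorting by a single column or label.
  • na_position: Place NaN values at the start ('first') or end ('last').
  • ignore_index: If True, relabel the result 0, 1, ..., n - 1 instead of keeping the original index.
  • key: Function applied to the values before sorting. It receives a Series and must return a Series of the same shape.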
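The axis parameter is the least intuitive of these, so here is a minimal sketch of axis=1. The DataFrame, its row labels, and its column names are made up for illustration. With axis=1, by names a row label, and the columns are reordered by the values in that row.

```python
import pandas as pd

# Hypothetical quarterly figures; labels are illustrative only.
df = pd.DataFrame({'Q1': [30, 5], 'Q2': [10, 8], 'Q3': [20, 2]},
                  index=['north', 'south'])

# axis=1 reorders the columns using the values found in row 'north'.
by_north = df.sort_values(by='north', axis=1)
column_order = list(by_north.columns)

print(by_north)
print(column_order)  # ['Q2', 'Q3', 'Q1']
```

The original df is left unchanged, since inplace defaults to False.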

Sorting by a Single Column

Here’s a simple example where we sort a DataFrame by a single column:


import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 20]}

df = pd.DataFrame(data)

# Sort by 'Age'
sorted_df = df.sort_values(by='Age')
print(sorted_df)


       Name  Age
2  Charlie   20
0    Alice   25
1      Bob   30
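String columns sort case-sensitively by default, with uppercase letters ahead of lowercase ones. As a sketch, assuming pandas 1.1 or later (where the key parameter is available), a case-insensitive sort could look like this. The lowercase names are a variation on the data above, chosen to make the difference visible.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['alice', 'Bob', 'charlie'],
                   'Age': [25, 30, 20]})

# Default: 'Bob' sorts before 'alice' because 'B' < 'a'.
plain = df.sort_values(by='Name')

# key receives the column as a Series and must return a Series
# of the same length; here it lowercases the values for comparison only.
folded = df.sort_values(by='Name', key=lambda s: s.str.lower())

plain_names = plain['Name'].tolist()
folded_names = folded['Name'].tolist()

print(plain_names)   # ['Bob', 'alice', 'charlie']
print(folded_names)  # ['alice', 'Bob', 'charlie']
```

The key function only affects the comparison; the stored values keep their original casing.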

Sorting by Multiple Columns

You can sort by multiple columns by passing a list to the by parameter:


data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 25],
        'Score': [90, 80, 85]}

df = pd.DataFrame(data)

# Sort by 'Age' ascending, then break ties by 'Score' descending
sorted_df = df.sort_values(by=['Age', 'Score'], ascending=[True, False])
print(sorted_df)


       Name  Age  Score
0    Alice   25     90
2  Charlie   25     85
1      Bob   30     80

Sorting in Descending Order

To sort in descending order, set the ascending parameter to False:


sorted_df = df.sort_values(by='Age', ascending=False)
print(sorted_df)
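After a descending sort, the rows keep their original index labels, which can be confusing if you want a ranking. A minimal sketch, assuming pandas 1.0 or later (where ignore_index is available), that relabels the sorted rows from 0:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 20]})

# Oldest first; ignore_index=True replaces the original labels with 0, 1, 2.
ranked = df.sort_values(by='Age', ascending=False, ignore_index=True)

ranked_names = ranked['Name'].tolist()
ranked_index = ranked.index.tolist()

print(ranked)
```

Here ranked_names is ['Bob', 'Alice', 'Charlie'] and the index runs 0, 1, 2. Without ignore_index, the index would read 1, 0, 2.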

Handling Missing Values

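The na_position parameter controls whether NaN values appear first or last. The default is 'last', regardless of sort direction. A numeric column that contains NaN is stored as float, which is why the ages below print as 20.0 and 25.0: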


data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, None, 20]}

df = pd.DataFrame(data)

# Place NaN values at the start
sorted_df = df.sort_values(by='Age', na_position='first')
print(sorted_df)


       Name   Age
1      Bob   NaN
2  Charlie  20.0
0    Alice  25.0
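One detail that is easy to miss: na_position is independent of ascending. The sketch below uses the same data as above (with np.nan written explicitly) to show that a descending sort still puts NaN last unless you ask otherwise.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, np.nan, 20]})

# Descending sort with the default na_position='last': NaN stays at the end.
desc_default = df.sort_values(by='Age', ascending=False)

# Descending sort with NaN moved to the top.
desc_na_first = df.sort_values(by='Age', ascending=False, na_position='first')

default_names = desc_default['Name'].tolist()
na_first_names = desc_na_first['Name'].tolist()

print(default_names)   # ['Alice', 'Charlie', 'Bob']
print(na_first_names)  # ['Bob', 'Alice', 'Charlie']
```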

Real-World Use Case

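Sorting is a common preparation step before reporting or combining data. Some operations depend on it: pd.merge_asof, for example, expects both inputs to be sorted by the merge key. For a related workflow, check our article on creating pivot tables with Pandas.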
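As one illustration of sorting before a merge, here is a sketch using pd.merge_asof, which requires both frames to be sorted by the key. The trades and quotes tables, and their column names, are hypothetical.

```python
import pandas as pd

# Hypothetical data keyed by an integer timestamp, deliberately unsorted.
trades = pd.DataFrame({'time': [5, 1, 3], 'qty': [10, 20, 30]})
quotes = pd.DataFrame({'time': [4, 2, 0], 'price': [1.4, 1.2, 1.0]})

# merge_asof expects both inputs sorted by the key, so sort first.
trades_sorted = trades.sort_values(by='time')
quotes_sorted = quotes.sort_values(by='time')

# For each trade, pick the most recent quote at or before the trade time.
matched = pd.merge_asof(trades_sorted, quotes_sorted, on='time')

matched_prices = matched['price'].tolist()
print(matched)
```

Each trade at time 1, 3, and 5 is paired with the quote from time 0, 2, and 4 respectively.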

Sorting with inplace=True

Use the inplace parameter to sort the original DataFrame directly. With inplace=True the method returns None, so do not assign its result back to a variable:


df.sort_values(by='Age', inplace=True)
print(df)
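A common mistake is to assign the result of an in-place sort, which replaces your DataFrame with None. The sketch below shows both the pitfall and the reassignment style that avoids it. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 20]})

# inplace=True sorts df itself and returns None.
result = df.sort_values(by='Age', inplace=True)
inplace_names = df['Name'].tolist()

print(result)         # None
print(inplace_names)  # ['Charlie', 'Alice', 'Bob']

# Equivalent reassignment style, which also works in method chains.
df2 = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                    'Age': [25, 30, 20]})
df2 = df2.sort_values(by='Age')
reassigned_names = df2['Name'].tolist()
```

Both approaches leave the rows in the same order. Reassignment is often preferred because the call can be chained with other methods.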

Key Takeaways

sort_values() covers multi-column sorting, per-column sort direction, and NaN placement through a small set of parameters.

Experiment with these parameters on a small DataFrame to see how they interact. Sorting data effectively is a foundational skill for data analysis.

Conclusion

The sort_values() method is the standard way to order a DataFrame by its values in Python Pandas. Its defaults (ascending order, NaN values last, a new DataFrame returned) cover most everyday cases, and the remaining parameters handle the rest.

If you found this guide helpful, explore our related article on aggregating data with Pandas agg().