Last modified: Dec 02, 2024 By Alexander Williams

Python Pandas info(): Data Overview Made Simple

Understanding your data is a crucial first step in any analysis. The info() method in Pandas offers a concise summary of your DataFrame or Series.

This guide explores the syntax, parameters, and use cases of info(), along with examples to help you master it.

What Is the info() Method?

The info() method provides a summary of a DataFrame, including:

  • Number of rows and columns.
  • Column names and data types.
  • Non-null counts.
  • Memory usage.

Syntax of info()

Here’s the basic syntax:


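DataFrame.info(verbose=None, buf=None, max_cols=None, memory_usage=None, show_counts=None)

By default, info() prints the summary to the console and returns None. Note that show_counts replaced the older null_counts parameter, which was deprecated in pandas 1.2 and removed in pandas 2.0.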

Installing Pandas

Before using info(), ensure Pandas is installed. Follow How to Install Pandas in Python for detailed guidance.


pip install pandas

Using info(): Examples

Here’s an example of how info() works:


import pandas as pd

# Sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}

df = pd.DataFrame(data)

# Display DataFrame information
df.info()

Output:



<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Name    3 non-null      object
 1   Age     3 non-null      int64 
 2   City    3 non-null      object
dtypes: int64(1), object(2)
memory usage: 200.0+ bytes

Exploring Parameters

Here’s a breakdown of useful parameters:

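  • verbose: Whether to print the full per-column summary. If left as None, Pandas follows the display.max_info_columns option.
  • buf: The writable buffer that receives the output. Defaults to sys.stdout.
  • max_cols: The column count above which info() switches from the full summary to a truncated one.
  • memory_usage: Whether to display memory usage. True reports a quick estimate; 'deep' inspects object columns for a more accurate figure at a higher computational cost.
  • show_counts: Whether to show the non-null counts.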
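
As a minimal sketch of the buf and verbose parameters, the snippet below sends the summary to an in-memory text buffer instead of the console. The variable names (buffer, summary) and the sample data are illustrative assumptions, not part of the Pandas API.

```python
import io

import pandas as pd

# Illustrative sample data, matching the earlier example
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

# Write the summary into a string buffer instead of stdout
buffer = io.StringIO()
df.info(buf=buffer, verbose=False)  # verbose=False omits the per-column table

# The captured text can now be logged, saved, or inspected
summary = buffer.getvalue()
print(summary)
```

Capturing the output this way is handy when you want the summary in a log file or a report rather than on screen.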

Example: Using memory_usage


# Report the real memory footprint, including the contents of object columns
df.info(memory_usage='deep')

Output (the exact memory figure depends on your Pandas and Python versions):



<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Name    3 non-null      object
 1   Age     3 non-null      int64 
 2   City    3 non-null      object
dtypes: int64(1), object(2)
memory usage: 1.0 KB
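
The "+" in the default output (for example "200.0+ bytes") marks the figure as a lower bound, because object columns are not inspected unless you ask for a deep measurement. One way to see the difference per column is DataFrame.memory_usage(), sketched below with the same illustrative data. The variable names shallow and deep are assumptions chosen for this example.

```python
import pandas as pd

# Illustrative sample data, matching the earlier example
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

# Quick estimate: object columns are not inspected in depth
shallow = df.memory_usage(deep=False)

# Deep measurement: also accounts for the contents of object columns
deep = df.memory_usage(deep=True)

print(shallow)
print(deep)
print(f"shallow total: {shallow.sum()} bytes, deep total: {deep.sum()} bytes")
```

The totals depend on your Pandas version and platform, so treat them as relative indicators rather than fixed numbers.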

Use Cases of info()

The info() method is essential for:

  • Validating data types before processing.
  • Identifying missing or incomplete data.
  • Evaluating memory usage in large datasets.
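
The first two use cases can be sketched as follows. The DataFrame below is a made-up example with two deliberately missing values; count(), isna() and dtypes give the same information as the info() summary in a form you can use in code.

```python
import pandas as pd

# Hypothetical data with one missing name and one missing age
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', None],
    'Age': [25, float('nan'), 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

# The Non-Null Count column reveals the gaps at a glance
df.info()

# Programmatic equivalents of what the summary shows
non_null = df.count()      # non-null values per column
missing = df.isna().sum()  # missing values per column
print(missing)
print(df.dtypes)           # Age is stored as float64 because it holds NaN
```

A non-null count lower than the number of index entries is the quickest sign that a column needs cleaning.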

For efficient data export, check out Python Pandas to_csv() or Python Pandas to_excel().

Working with Large Data

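When a dataset is large, info() gives you a quick read on its size and column layout without printing any rows. For wide DataFrames, Pandas switches to a truncated summary by default once the column count exceeds the display.max_info_columns option.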

Example: Limited Column Output

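The max_cols parameter sets the column count above which info() switches from the full per-column table to a truncated summary:


# df has 3 columns, which exceeds max_cols=2, so the per-column table is omitted
df.info(max_cols=2)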
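
To compare the truncated and full summaries side by side, the sketch below captures both as strings. The info_text helper is hypothetical, written only for this example. It also passes show_counts=True, the current name of the non-null count switch in pandas 1.2 and later, because the counts are otherwise hidden when the column limit is exceeded.

```python
import io

import pandas as pd

# Illustrative sample data, matching the earlier example
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})


def info_text(frame, **kwargs):
    """Return the info() summary as a string (hypothetical helper)."""
    buffer = io.StringIO()
    frame.info(buf=buffer, **kwargs)
    return buffer.getvalue()


# 3 columns exceed max_cols=2, so only a one-line column overview is printed
truncated = info_text(df, max_cols=2)

# verbose=True forces the per-column table; show_counts=True restores the counts
full = info_text(df, max_cols=2, verbose=True, show_counts=True)

print(truncated)
print(full)
```

In short, max_cols only sets the threshold for the default behavior, while an explicit verbose setting takes precedence over it.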

Complementing info() with head() and tail()

While info() summarizes structure, use head() or tail() to view actual data samples.
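
A short sketch of that combination, using the same illustrative data as before (the names first_two and last_one are assumptions for this example):

```python
import pandas as pd

# Illustrative sample data, matching the earlier example
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

# Structure first: columns, dtypes, non-null counts, memory
df.info()

# Then actual values: the first two rows and the last row
first_two = df.head(2)
last_one = df.tail(1)
print(first_two)
print(last_one)
```

Both head() and tail() return new DataFrames, so you can inspect or process the samples further.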

Key Takeaways

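The info() method summarizes a DataFrame's structure in a single call: the index range, column names, data types, non-null counts, and memory usage. Use it to spot missing values, confirm column types, and gauge memory footprint before you start processing.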

Conclusion

By mastering info(), you’ll improve your ability to analyze and manage datasets. It’s a simple yet indispensable tool for data professionals.