Last modified: Nov 19, 2025 By Alexander Williams
Filter Search Excel Data Using Python xlrd
Excel files contain valuable data. Python helps analyze this data. The xlrd library reads Excel files efficiently. This tutorial shows filtering and searching techniques.
You will learn practical methods. These methods work with real Excel files. We cover basic to advanced filtering.
Setting Up xlrd
First, install xlrd. Use pip for installation. The process is straightforward.
Our guide Install xlrd and xlwt in Python provides detailed instructions. Follow it for proper setup.
# Install xlrd using pip
pip install xlrd
Successfully installed xlrd-2.0.1
Loading Excel Files
Start by loading your Excel file. Use the open_workbook function. This creates a workbook object.
Check our Python xlrd Guide: Load Workbooks and Iterate Sheets for more details.
import xlrd
# Load Excel workbook
workbook = xlrd.open_workbook('sales_data.xlsx')
# Get the first worksheet
worksheet = workbook.sheet_by_index(0)
print(f"Worksheet loaded: {worksheet.name}")
print(f"Total rows: {worksheet.nrows}")
print(f"Total columns: {worksheet.ncols}")
Worksheet loaded: Sales
Total rows: 100
Total columns: 6
Basic Data Filtering
Filtering finds specific data. You can search by value. Or filter by conditions.
Let's filter sales above $1000. We iterate through rows. Then check each value.
def filter_sales_above(worksheet, min_sales):
"""
Filter rows where sales exceed minimum value
"""
filtered_data = []
# Start from row 1 to skip header
for row_idx in range(1, worksheet.nrows):
sales_value = worksheet.cell_value(row_idx, 3) # Sales column
if sales_value > min_sales:
row_data = []
for col_idx in range(worksheet.ncols):
row_data.append(worksheet.cell_value(row_idx, col_idx))
filtered_data.append(row_data)
return filtered_data
# Filter sales above $1000
high_sales = filter_sales_above(worksheet, 1000)
print(f"Found {len(high_sales)} records with sales above $1000")
Found 23 records with sales above $1000
Searching for Specific Values
Searching finds exact matches. This is useful for names or codes. Let's search for a specific product.
def search_product(worksheet, product_name):
"""
Search for specific product in worksheet
"""
matches = []
for row_idx in range(1, worksheet.nrows):
current_product = worksheet.cell_value(row_idx, 1) # Product column
if product_name.lower() in current_product.lower():
match_data = []
for col_idx in range(worksheet.ncols):
match_data.append(worksheet.cell_value(row_idx, col_idx))
matches.append(match_data)
return matches
# Search for laptop products
laptop_sales = search_product(worksheet, "laptop")
print(f"Found {len(laptop_sales)} laptop products")
Found 15 laptop products
Advanced Filtering with Multiple Conditions
Real-world filtering often needs multiple conditions. Combine criteria for precise results.
Filter high sales in specific regions. This requires checking multiple columns.
def advanced_filter(worksheet, region, min_sales):
"""
Advanced filter with region and sales criteria
"""
filtered = []
for row_idx in range(1, worksheet.nrows):
current_region = worksheet.cell_value(row_idx, 2) # Region column
current_sales = worksheet.cell_value(row_idx, 3) # Sales column
if current_region == region and current_sales >= min_sales:
row_data = []
for col_idx in range(worksheet.ncols):
row_data.append(worksheet.cell_value(row_idx, col_idx))
filtered.append(row_data)
return filtered
# Filter North region sales above $800
north_high_sales = advanced_filter(worksheet, "North", 800)
print(f"Found {len(north_high_sales)} high sales in North region")
Found 12 high sales in North region
Date-Based Filtering
Excel dates need special handling. xlrd provides date conversion. Filter data by date ranges.
from datetime import datetime
def filter_by_date_range(worksheet, start_date, end_date):
"""
Filter rows within date range
"""
filtered = []
date_column = 4 # Assuming date is in column 4
for row_idx in range(1, worksheet.nrows):
excel_date = worksheet.cell_value(row_idx, date_column)
# Convert Excel date to Python datetime
if isinstance(excel_date, float):
date_value = xlrd.xldate_as_datetime(excel_date, workbook.datemode)
if start_date <= date_value <= end_date:
row_data = []
for col_idx in range(worksheet.ncols):
row_data.append(worksheet.cell_value(row_idx, col_idx))
filtered.append(row_data)
return filtered
# Define date range
start = datetime(2023, 1, 1)
end = datetime(2023, 3, 31)
# Filter Q1 2023 data
q1_data = filter_by_date_range(worksheet, start, end)
print(f"Found {len(q1_data)} records in Q1 2023")
Found 45 records in Q1 2023
Working with Specific Rows and Columns
Sometimes you need specific data sections. Our guide Read Specific Rows and Columns with Python xlrd covers this topic.
Extract only needed columns. This improves performance.
def extract_columns(worksheet, column_indices):
"""
Extract specific columns from worksheet
"""
extracted_data = []
for row_idx in range(worksheet.nrows):
row_data = []
for col_idx in column_indices:
if col_idx < worksheet.ncols:
row_data.append(worksheet.cell_value(row_idx, col_idx))
extracted_data.append(row_data)
return extracted_data
# Extract only product names and sales
product_sales = extract_columns(worksheet, [1, 3])
print(f"Extracted {len(product_sales)} rows with product and sales data")
Extracted 100 rows with product and sales data
Performance Optimization Tips
Large Excel files need optimization. Use these techniques for better performance.
Read only needed data. Avoid loading entire files. Process data in chunks if possible.
Use list comprehensions. They are faster than loops. Pre-compile your search patterns.
# Optimized version using list comprehension
def optimized_search(worksheet, search_term):
return [
[worksheet.cell_value(row_idx, col_idx) for col_idx in range(worksheet.ncols)]
for row_idx in range(1, worksheet.nrows)
if search_term.lower() in worksheet.cell_value(row_idx, 1).lower()
]
# Fast search
fast_results = optimized_search(worksheet, "tablet")
print(f"Fast search found {len(fast_results)} tablets")
Fast search found 8 tablets
Error Handling
Always handle potential errors. Excel files can have issues. Protect your code.
Use try-except blocks. Handle file not found errors. Manage data type issues.
def safe_filter(worksheet, column_index, search_value):
"""
Safe filtering with error handling
"""
results = []
for row_idx in range(1, worksheet.nrows):
try:
cell_value = worksheet.cell_value(row_idx, column_index)
# Handle different data types
if str(search_value).lower() in str(cell_value).lower():
results.append([worksheet.cell_value(row_idx, col_idx)
for col_idx in range(worksheet.ncols)])
except Exception as e:
print(f"Error processing row {row_idx}: {e}")
continue
return results
# Safe filtering example
safe_results = safe_filter(worksheet, 1, "phone")
print(f"Safe search found {len(safe_results)} phones")
Safe search found 11 phones
Conclusion
Python xlrd provides powerful Excel filtering. You can search by values, dates, and multiple criteria.
Remember these key points. Always handle errors properly. Optimize for large files. Use specific row and column access.
These techniques help analyze Excel data efficiently. They save time and improve accuracy.
For more xlrd tutorials, check our Python xlrd Tutorial: Read Excel XLS Files Step by Step guide.
Start filtering your Excel data today. Python makes it easy and efficient.