Last modified: Nov 24, 2025 By Alexander Williams
Merge Split Excel Sheets with Python pyexcel
Excel files are everywhere in business. But managing them can be hard. You might have multiple files to combine. Or one large file to split. Python pyexcel makes this easy.
This tutorial shows you how. You will learn to merge and split Excel sheets. We use simple Python code. No complex formulas needed.
What is pyexcel?
pyexcel is a Python library. It helps you work with spreadsheet data. You can read, write, and manipulate Excel files. It supports many formats like XLS, XLSX, and CSV.
The library is simple to use. You don't need Microsoft Excel installed. It works on Windows, Mac, and Linux. Perfect for automation tasks.
If you need to install pyexcel, check our guide on Install pyexcel in Python with pip and Virtualenv. It covers installation step by step.
Setting Up Your Environment
First, you need Python installed. Then install pyexcel using pip. Open your terminal or command prompt.
# Install pyexcel and required plugins
pip install pyexcel pyexcel-xlsx pyexcel-xls
These packages handle different Excel formats. pyexcel-xlsx works with XLSX files. pyexcel-xls handles older XLS format.
If you get import errors, read Fix Python ImportError: No Module Named pyexcel. It solves common installation problems.
Merging Multiple Excel Files
Merging combines multiple files into one. This is common when you have monthly reports. Or data from different departments.
Let's merge three sales files. Each file has the same columns. We want one combined file.
import pyexcel as pe
# List of files to merge
files_to_merge = [
'sales_january.xlsx',
'sales_february.xlsx',
'sales_march.xlsx'
]
# Create empty list for all records
all_records = []
# Read each file and add to combined list
for file in files_to_merge:
sheet = pe.get_sheet(file_name=file)
all_records += sheet.to_array()
# Create new sheet with all data
combined_sheet = pe.Sheet(all_records)
# Save merged file
combined_sheet.save_as("merged_sales.xlsx")
print("Files merged successfully!")
Files merged successfully!
The code reads each file. It extracts all data rows. Then combines them into one sheet. Finally saves as a new file.
Before merging, you might need to Clean Normalize Spreadsheet Data Python pyexcel. Clean data ensures better results.
Merging Files with Different Formats
Sometimes files have different formats. One might be XLSX, another CSV. pyexcel handles this easily.
import pyexcel as pe
# Files in different formats
mixed_files = [
'data.csv',
'old_data.xls',
'new_data.xlsx'
]
all_data = []
for file in mixed_files:
# pyexcel auto-detects format
sheet = pe.get_sheet(file_name=file)
all_data.extend(sheet.to_array())
# Save as single Excel file
pe.save_as(array=all_data, dest_file_name="combined_data.xlsx")
print("Mixed format files merged!")
pyexcel automatically detects file formats. You don't need separate code for each type. This saves time and effort.
Learn more about format handling in Handle Multiple Spreadsheet Formats with Python pyexcel.
Splitting Large Excel Files
Splitting breaks large files into smaller ones. This helps with file size limits. Or when you need to share specific data.
Let's split a large employee file. We'll create files by department.
import pyexcel as pe
# Read the main file
main_sheet = pe.get_sheet(file_name="employees.xlsx")
# Get all data as array
all_data = main_sheet.to_array()
# Extract header (first row)
header = all_data[0]
# Group data by department (assuming department is column 3)
department_data = {}
for row in all_data[1:]: # Skip header
dept = row[2] # Department column
if dept not in department_data:
department_data[dept] = [header] # Start with header
department_data[dept].append(row)
# Save separate files for each department
for dept, data in department_data.items():
dept_sheet = pe.Sheet(data)
filename = f"employees_{dept}.xlsx"
dept_sheet.save_as(filename)
print(f"Created {filename}")
print("File splitting completed!")
Created employees_Sales.xlsx
Created employees_IT.xlsx
Created employees_HR.xlsx
File splitting completed!
The code reads the main file. It groups rows by department. Then saves each group as a separate file. Each new file has the original header.
Splitting by Row Count
Another common split method is by row count. You might want files with exactly 100 rows each. This is useful for batch processing.
import pyexcel as pe
def split_by_row_count(filename, rows_per_file=100):
# Read source file
source_sheet = pe.get_sheet(file_name=filename)
all_data = source_sheet.to_array()
header = all_data[0]
data_rows = all_data[1:]
file_count = 1
# Process in chunks
for i in range(0, len(data_rows), rows_per_file):
chunk = data_rows[i:i + rows_per_file]
# Add header to each chunk
chunk_with_header = [header] + chunk
# Save chunk as new file
chunk_sheet = pe.Sheet(chunk_with_header)
output_file = f"{filename}_part{file_count}.xlsx"
chunk_sheet.save_as(output_file)
print(f"Created {output_file} with {len(chunk)} rows")
file_count += 1
# Split large file into chunks of 50 rows
split_by_row_count("large_dataset.xlsx", 50)
Created large_dataset.xlsx_part1.xlsx with 50 rows
Created large_dataset.xlsx_part2.xlsx with 50 rows
Created large_dataset.xlsx_part3.xlsx with 23 rows
This function splits any file by row count. The last file may have fewer rows. That's normal and expected.
Advanced Merging Techniques
Sometimes you need smarter merging. You might want to merge only specific sheets. Or combine data based on common columns.
import pyexcel as pe
def merge_with_conditions(file_list, output_file):
merged_data = None
for file in file_list:
# Get all sheets from file
sheets = pe.get_book(file_name=file)
for sheet_name in sheets.sheet_names():
sheet_data = sheets[sheet_name].to_array()
if merged_data is None:
merged_data = sheet_data
else:
# Skip header for subsequent files
merged_data.extend(sheet_data[1:])
# Save merged data
pe.save_as(array=merged_data, dest_file_name=output_file)
print(f"Merged {len(file_list)} files into {output_file}")
# Usage
files = ['q1_report.xlsx', 'q2_report.xlsx']
merge_with_conditions(files, 'half_year_report.xlsx')
This advanced function handles multiple sheets per file. It merges all data while managing headers properly. Useful for complex Excel workbooks.
Best Practices for Excel Automation
Follow these tips for better results. They prevent common errors and issues.
Always backup your original files. Automation can sometimes go wrong. Keep copies of your source data.
Check data consistency before merging. Ensure columns match across files. Different structures cause problems.
Use meaningful file names. This helps identify processed files. Include dates or version numbers.
Handle errors gracefully. Use try-except blocks in production code. This prevents crashes from bad files.
Common Issues and Solutions
You might encounter some problems. Here are common issues and their fixes.
Memory errors with very large files. Split them into smaller chunks first. Then process each chunk separately.
Formatting loss during conversion. pyexcel focuses on data, not formatting. For complex formatting, consider other tools.
Character encoding problems. Specify encoding when reading files. Use encoding='utf-8' parameter if needed.
Conclusion
Python pyexcel makes Excel automation easy. You can merge multiple files into one. Or split large files into smaller ones.
The library handles different file formats. It works without Microsoft Excel. The code is simple and readable.
Start with the basic examples. Then try the advanced techniques. You'll save hours of manual work.
Remember to explore our Python pyexcel Tutorial: Read Write Excel CSV Files for more Excel automation techniques.
Automate your Excel tasks today. Focus on analysis instead of manual work. Python pyexcel makes it possible.