Last modified: Nov 10, 2024 By Alexander Williams
Python: Convert CSV Files to Excel Format - Complete Guide
Converting CSV files to Excel format is a common task in data processing. Python offers several powerful libraries to handle this conversion efficiently. In this guide, we'll explore different methods to convert CSV files to Excel format.
Using Pandas for CSV to Excel Conversion
The most straightforward approach is using the pandas
library, which provides robust tools for data manipulation. First, install pandas using pip:
pip install pandas openpyxl
Here's a basic example of converting a CSV file to Excel:
import pandas as pd
# Read CSV file
df = pd.read_csv('input.csv')
# Save as Excel file
df.to_excel('output.xlsx', index=False)
Adding Formatting Options
For more control over the output, you can add formatting options. This is particularly useful when dealing with missing data in CSV files:
import pandas as pd
# Read CSV file
df = pd.read_csv('input.csv')
# Create Excel writer object
with pd.ExcelWriter('output.xlsx', engine='openpyxl') as writer:
df.to_excel(writer, sheet_name='Sheet1', index=False)
# Get workbook and worksheet objects
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Add formatting
worksheet.column_dimensions['A'].width = 15
worksheet.column_dimensions['B'].width = 20
Handling Multiple Sheets
When working with multiple CSV files, you might want to combine them into a single Excel workbook. This approach is similar to merging multiple CSV files:
import pandas as pd
# Read multiple CSV files
df1 = pd.read_csv('file1.csv')
df2 = pd.read_csv('file2.csv')
# Create Excel writer object
with pd.ExcelWriter('output.xlsx') as writer:
df1.to_excel(writer, sheet_name='Sheet1', index=False)
df2.to_excel(writer, sheet_name='Sheet2', index=False)
Data Processing Before Conversion
Sometimes you might need to process your data before converting it. For instance, you might want to sort your CSV data or extract specific columns:
import pandas as pd
# Read CSV file
df = pd.read_csv('input.csv')
# Process data
df_sorted = df.sort_values('column_name')
df_selected = df[['column1', 'column2']]
# Save processed data
df_selected.to_excel('processed_output.xlsx', index=False)
Error Handling
It's important to implement proper error handling when working with file operations:
import pandas as pd
import sys
try:
df = pd.read_csv('input.csv')
df.to_excel('output.xlsx', index=False)
except FileNotFoundError:
print("Input file not found")
except Exception as e:
print(f"An error occurred: {str(e)}")
Conclusion
Converting CSV files to Excel format in Python is straightforward with the pandas library. Remember to consider formatting options, handle errors properly, and process your data as needed.
For large datasets, consider using efficient processing methods to optimize performance. Always validate your output to ensure data integrity.