Save Scraped Data to CSV with BeautifulSoup
Web scraping is a powerful skill. But the data is only useful once you save it. CSV files are a practical choice. They are simple, human-readable, and open in almost any spreadsheet or data tool.
This guide shows you how. We use BeautifulSoup for parsing. We use the Pandas library to create the CSV file. The process is straightforward.
Prerequisites and Setup
You need Python installed. You also need to install three key libraries. Use pip, the Python package manager.
First, install Requests for fetching pages. Second, install BeautifulSoup; the package name is beautifulsoup4. Third, install Pandas for data handling.
If you need help installing BeautifulSoup, see our Install BeautifulSoup in Python Step by Step guide.
pip install requests beautifulsoup4 pandas
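To confirm everything installed correctly, try importing all three packages in one command. If it exits without errors, you are ready to go.
python -c "import requests, bs4, pandas"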
The Basic Workflow
The process has four main steps. Fetch the HTML content. Parse it with BeautifulSoup. Extract the data into a structured format. Finally, write it to a CSV file.
We will use the requests library to fetch pages. For parsing, we use BeautifulSoup. For CSV creation, we use pandas.DataFrame.to_csv().
Step 1: Fetch and Parse HTML
Start by importing the necessary modules. Then, make a GET request to your target URL. Pass the HTML text to BeautifulSoup.
Choose a parser. We recommend 'html.parser' for simplicity. For complex pages, see our BeautifulSoup vs lxml: Which Python Parser to Use guide.
import requests
from bs4 import BeautifulSoup
import pandas as pd
# Target URL
url = 'https://example.com/books'
# Fetch the page (a timeout stops the request from hanging forever)
response = requests.get(url, timeout=10)
response.raise_for_status()  # Raise an error for 4xx/5xx responses
html_content = response.text
# Parse with BeautifulSoup
soup = BeautifulSoup(html_content, 'html.parser')
Step 2: Extract Data into a List of Dictionaries
First, identify how the data is structured on the page. Inspect the HTML to find repeating patterns. Then select the matching elements with BeautifulSoup methods like find_all().
Loop through the found elements. Extract text or attributes. Store each item as a dictionary. Append all dictionaries to a list.
This list-of-dicts format works perfectly with Pandas. For complex nested data, check our Parse Nested HTML with BeautifulSoup Guide.
# Find all book containers (adjust selector as needed)
book_elements = soup.find_all('div', class_='book')

# List to hold our data
books_data = []

for book in book_elements:
    # Extract data points
    title = book.find('h2').text.strip()
    author = book.find('span', class_='author').text.strip()
    price = book.find('p', class_='price').text.strip()

    # Create a dictionary for this book
    book_info = {
        'Title': title,
        'Author': author,
        'Price': price
    }

    # Add to the list
    books_data.append(book_info)

# Let's see what we collected
print(f"Scraped {len(books_data)} books.")
Scraped 5 books.
Step 3: Convert Data to a DataFrame and Save as CSV
Pandas makes this step easy. Pass your list of dictionaries to pd.DataFrame(). This creates a structured table.
Then, call the to_csv() method. Specify the filename. Pass index=False so Pandas does not write the row index as an extra first column.
# Convert list of dictionaries to a DataFrame
df = pd.DataFrame(books_data)
# Save the DataFrame to a CSV file
csv_filename = 'scraped_books.csv'
df.to_csv(csv_filename, index=False, encoding='utf-8')
print(f"Data successfully saved to '{csv_filename}'")
Data successfully saved to 'scraped_books.csv'
Handling Common Issues
You might encounter encoding problems. Pandas writes UTF-8 by default, but specifying encoding='utf-8' in to_csv() makes the choice explicit and prevents character corruption when the file moves between systems. If Excel shows garbled accented characters, use encoding='utf-8-sig' instead; the byte-order mark it adds helps Excel detect the encoding.
For more on this, see our BeautifulSoup Unicode Encoding Issues Guide.
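A quick sanity check is to read the file back with the same encoding and inspect the first rows. This minimal sketch reuses the scraped_books.csv filename from the step above.
import pandas as pd

# Read the CSV back and spot-check that characters survived intact
df_check = pd.read_csv('scraped_books.csv', encoding='utf-8')
print(df_check.head())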
Some websites load content with JavaScript after the initial page load. Requests only sees the raw HTML, so the data never appears in response.text. For those sites, you need a browser-driven tool like Selenium or Requests-HTML.
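As an illustration, here is a minimal sketch of the Selenium route, assuming Selenium 4 and Chrome are installed. The browser renders the JavaScript, and BeautifulSoup parses the rendered source exactly as before.
from bs4 import BeautifulSoup
from selenium import webdriver

# Launch a real browser so the page's JavaScript can run
driver = webdriver.Chrome()
driver.get('https://example.com/books')

# Hand the fully rendered HTML to BeautifulSoup as usual
soup = BeautifulSoup(driver.page_source, 'html.parser')
driver.quit()

print(soup.title.text)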
Complete Example Script
Here is the full script from start to finish. It fetches, parses, extracts, and saves data. You can adapt it for your projects.
import requests
from bs4 import BeautifulSoup
import pandas as pd

def scrape_and_save_to_csv(url, csv_filename):
    """Fetches data from a URL and saves it to a CSV file."""
    # 1. Fetch HTML (timeout avoids hanging; raise_for_status catches HTTP errors)
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html.parser')

    # 2. Extract Data (Example: Product List)
    products = soup.find_all('div', class_='product')
    data_list = []
    for product in products:
        name = product.find('h3').text.strip()
        category = product.find('span', class_='cat').text.strip()
        data_list.append({'Product Name': name, 'Category': category})

    # 3. Save to CSV
    if data_list:
        df = pd.DataFrame(data_list)
        df.to_csv(csv_filename, index=False, encoding='utf-8')
        print(f"Saved {len(data_list)} records to {csv_filename}")
    else:
        print("No data found to save.")

# Run the function
scrape_and_save_to_csv('https://example.com/products', 'products.csv')
Conclusion
Saving scraped data to CSV is a fundamental task. BeautifulSoup extracts the data. Pandas organizes and exports it.
Remember to handle encoding and dynamic content. This basic pipeline is powerful. You can now store your scraped data for analysis.
Start with a simple page. Master the steps. Then tackle more complex projects like pagination.
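As a taste of pagination, here is a minimal sketch. It assumes a hypothetical site that exposes pages through a ?page=N query parameter; real sites vary, so inspect the pagination links and adjust the URL pattern and page range.
import requests
from bs4 import BeautifulSoup
import pandas as pd

all_books = []
for page in range(1, 4):  # Hypothetical: scrape pages 1 through 3
    # Assumed URL pattern; check the real site's "next page" links
    url = f'https://example.com/books?page={page}'
    response = requests.get(url, timeout=10)
    soup = BeautifulSoup(response.text, 'html.parser')
    for book in soup.find_all('div', class_='book'):
        all_books.append({'Title': book.find('h2').text.strip()})

# Combine every page into a single CSV
pd.DataFrame(all_books).to_csv('all_books.csv', index=False, encoding='utf-8')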