Last modified: Jan 10, 2026 by Alexander Williams

Save Scraped Data to CSV with BeautifulSoup

Web scraping is a powerful skill, but the data you collect is only useful once you save it. CSV files are a perfect choice: they are simple and widely compatible.

This guide shows you how. We use BeautifulSoup for parsing and the Pandas library to write the CSV file. The process is straightforward.

Prerequisites and Setup

You need Python installed. You also need to install two key libraries. Use pip, the Python package manager.

First, install BeautifulSoup; note that its package name on PyPI is beautifulsoup4. Second, install Pandas for data handling.

If you need help installing BeautifulSoup, see our Install BeautifulSoup in Python Step by Step guide.


pip install beautifulsoup4 pandas

The Basic Workflow

The process has four main steps: fetch the HTML content, parse it with BeautifulSoup, extract the data into a structured format, and finally write it to a CSV file.

We will use the requests library to fetch pages. For parsing, we use BeautifulSoup. For CSV creation, we use pandas.DataFrame.to_csv().

Step 1: Fetch and Parse HTML

Start by importing the necessary modules. Then, make a GET request to your target URL. Pass the HTML text to BeautifulSoup.

Choose a parser. We recommend 'html.parser' for simplicity, since it ships with Python. For complex pages, see our BeautifulSoup vs lxml: Which Python Parser to Use guide; a one-line swap is shown after the code below.


import requests
from bs4 import BeautifulSoup
import pandas as pd

# Target URL
url = 'https://example.com/books'

# Fetch the page (a timeout prevents hanging on unresponsive servers)
response = requests.get(url, timeout=10)
response.raise_for_status()  # Stop early on HTTP errors (4xx/5xx)
html_content = response.text

# Parse with BeautifulSoup
soup = BeautifulSoup(html_content, 'html.parser')
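If you opt for lxml instead, the only change is the parser argument. This assumes you have installed the separate lxml package (pip install lxml):


# lxml is faster and more lenient with malformed markup
soup = BeautifulSoup(html_content, 'lxml')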

Step 2: Extract Data into a List of Dictionaries

You must identify the data structure on the page. Inspect the HTML to find patterns. Use BeautifulSoup methods like find_all().

Loop through the found elements. Extract text or attributes. Store each item as a dictionary and append it to a list.

This list-of-dicts format works perfectly with Pandas. For complex nested data, check our Parse Nested HTML with BeautifulSoup Guide.


# Find all book containers (adjust selector as needed)
book_elements = soup.find_all('div', class_='book')

# List to hold our data
books_data = []

for book in book_elements:
    # Extract data points
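    # .find() returns None if a tag is missing; these lines assume
    # every book container includes all three elements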
    title = book.find('h2').text.strip()
    author = book.find('span', class_='author').text.strip()
    price = book.find('p', class_='price').text.strip()

    # Create a dictionary for this book
    book_info = {
        'Title': title,
        'Author': author,
        'Price': price
    }
    # Add to the list
    books_data.append(book_info)

# Let's see what we collected
print(f"Scraped {len(books_data)} books.")

Scraped 5 books.
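The loop above pulls visible text only. If you also need attribute values, such as a link's href, BeautifulSoup exposes them with dictionary-style access. Here is a minimal sketch, assuming each book container also holds an <a> tag (hypothetical markup):


# Hypothetical markup: each book container holds an <a> tag
for book in book_elements:
    link = book.find('a')
    if link is not None:
        # .get() returns None instead of raising KeyError
        # when the attribute is missing
        print(link.get('href'))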

Step 3: Convert Data to a DataFrame and Save as CSV

Pandas makes this step easy. Pass your list of dictionaries to pd.DataFrame(). This creates a structured table.

Then, use the to_csv() method. Specify the filename. Use index=False so Pandas does not write the DataFrame's row index as an extra column.


# Convert list of dictionaries to a DataFrame
df = pd.DataFrame(books_data)

# Save the DataFrame to a CSV file
csv_filename = 'scraped_books.csv'
df.to_csv(csv_filename, index=False, encoding='utf-8')

print(f"Data successfully saved to '{csv_filename}'")

Data successfully saved to 'scraped_books.csv'
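A quick way to verify the file is to read it back with Pandas and inspect the first rows:


# Read the CSV back to confirm the structure survived the round trip
check = pd.read_csv('scraped_books.csv')
print(check.head())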

Handling Common Issues

You might encounter encoding problems. Always specify encoding='utf-8' in to_csv(). This prevents character corruption.

For more on this, see our BeautifulSoup Unicode Encoding Issues Guide.
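One well-known wrinkle: Excel often misreads plain UTF-8 CSV files. The 'utf-8-sig' encoding writes a byte-order mark that Excel uses to detect the encoding. A minimal variation of the Step 3 call:


# 'utf-8' suits most tools; 'utf-8-sig' adds a BOM so Excel
# displays non-ASCII characters (é, ü, £) correctly
df.to_csv('scraped_books.csv', index=False, encoding='utf-8-sig')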

Some websites load content dynamically. Requests alone won't work. For those, you need tools like Selenium or Requests-HTML.
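Here is a minimal sketch of the Selenium approach, assuming Selenium 4+ is installed (pip install selenium) and Chrome is available locally; the URL is a placeholder. The browser renders the JavaScript, then BeautifulSoup parses the finished HTML:


from selenium import webdriver
from bs4 import BeautifulSoup

# Launch a real browser so JavaScript runs before we grab the HTML
driver = webdriver.Chrome()
driver.get('https://example.com/books')  # placeholder URL
html = driver.page_source  # the fully rendered page source
driver.quit()

soup = BeautifulSoup(html, 'html.parser')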

Complete Example Script

Here is the full script from start to finish. It fetches, parses, extracts, and saves data. You can adapt it for your projects.


import requests
from bs4 import BeautifulSoup
import pandas as pd

def scrape_and_save_to_csv(url, csv_filename):
    """Fetches data from a URL and saves it to a CSV file."""
    # 1. Fetch HTML
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # Fail fast on HTTP errors (4xx/5xx)
    soup = BeautifulSoup(response.text, 'html.parser')

    # 2. Extract Data (Example: Product List)
    products = soup.find_all('div', class_='product')
    data_list = []

    for product in products:
        name = product.find('h3').text.strip()
        category = product.find('span', class_='cat').text.strip()
        data_list.append({'Product Name': name, 'Category': category})

    # 3. Save to CSV
    if data_list:
        df = pd.DataFrame(data_list)
        df.to_csv(csv_filename, index=False, encoding='utf-8')
        print(f"Saved {len(data_list)} records to {csv_filename}")
    else:
        print("No data found to save.")

# Run the function
scrape_and_save_to_csv('https://example.com/products', 'products.csv')

Conclusion

Saving scraped data to CSV is a fundamental task. BeautifulSoup extracts the data. Pandas organizes and exports it.

Remember to handle encoding and dynamic content. This basic pipeline is powerful. You can now store your scraped data for analysis.

Start with a simple page. Master the steps. Then tackle more complex projects, like the pagination pattern sketched below.
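As a parting sketch, here is one common pagination pattern. It assumes a hypothetical site that exposes the page number as a query parameter and uses the same 'book' markup as earlier; adjust both to match your target:


import requests
from bs4 import BeautifulSoup
import pandas as pd

all_books = []
# Hypothetical URL scheme: ?page=1, ?page=2, ... up to page 5
for page in range(1, 6):
    response = requests.get(f'https://example.com/books?page={page}', timeout=10)
    soup = BeautifulSoup(response.text, 'html.parser')
    # Collect one record per book container on this page
    for book in soup.find_all('div', class_='book'):
        all_books.append({'Title': book.find('h2').text.strip()})

pd.DataFrame(all_books).to_csv('all_pages.csv', index=False)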