Last modified: Jan 19, 2026, by Alexander Williams

Scrape Job Listings with BeautifulSoup and Save Them to Excel

Job hunting can be time-consuming. Web scraping can automate the data collection.

This guide shows you how to extract job listings with Python's BeautifulSoup.

Finally, you will save the clean data into an Excel file, ready for analysis.

Prerequisites and Setup

You need Python installed on your computer. Basic Python knowledge helps.

Open your terminal or command prompt and install the required libraries with the pip package manager:


pip install beautifulsoup4 requests pandas openpyxl

BeautifulSoup parses HTML. Requests fetches web pages.

Pandas handles data. Openpyxl writes Excel files.

If you are new to BeautifulSoup, read our Web Scraping Guide with BeautifulSoup for Beginners.

Understanding the Target Website Structure

First, inspect the job board website. Right-click on a job listing.

Select "Inspect" or "Inspect Element". This opens developer tools.

Look for HTML tags containing job titles, companies, and locations.

Common tags are <div>, <h2>, and <span>.

Identify their class names or IDs. We will use them to extract data.

For this tutorial, we'll use a simple example structure.
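To make this concrete, below is a minimal sketch of the kind of markup we will assume for the rest of the tutorial. The tags and class names (job-listing, job-title, company, location) are illustrative, not taken from any real job board; you would discover the real ones through the inspection step above.

from bs4 import BeautifulSoup

# Illustrative markup only; a real job board will use its own
# tags and class names, found via "Inspect Element".
sample_html = """
<div class="job-listing">
  <h2 class="job-title"><a href="/jobs/123">Python Developer</a></h2>
  <span class="company">Acme Corp</span>
  <span class="location">Remote</span>
</div>
"""

soup = BeautifulSoup(sample_html, 'html.parser')
print(soup.find('h2', class_='job-title').text.strip())   # Python Developer
print(soup.find('span', class_='company').text.strip())   # Acme Corp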

Fetching the Web Page with Requests

Use the requests.get() function and pass it the target URL.

Check the response status code. A 200 code means success.

Then, pass the page content to BeautifulSoup for parsing.


import requests
from bs4 import BeautifulSoup

# URL of the job listings page
url = 'https://example.com/jobs'

# A descriptive User-Agent identifies your scraper to the site
headers = {'User-Agent': 'job-scraper-tutorial/1.0'}

# Send a GET request; the timeout stops the script from hanging
response = requests.get(url, headers=headers, timeout=10)

# Check if the request was successful
if response.status_code == 200:
    # Parse the HTML content
    soup = BeautifulSoup(response.content, 'html.parser')
    print("Page fetched and parsed successfully.")
else:
    print(f"Failed to retrieve page. Status code: {response.status_code}")

Parsing and Extracting Job Data

Now, use BeautifulSoup's find methods. Locate the job containers.

Use find_all() to get a list of all job posting elements.

Loop through each container and extract specific details like the title.

Store each job's data in a dictionary. Append it to a master list.

Our Clean HTML Data with BeautifulSoup guide can help with complex parsing.


from urllib.parse import urljoin

# Find all job listing containers (adjust selector based on actual site)
job_listings = soup.find_all('div', class_='job-listing')

jobs_data = []

for job in job_listings:
    # Extract job title
    title_elem = job.find('h2', class_='job-title')
    title = title_elem.text.strip() if title_elem else 'N/A'

    # Extract company name
    company_elem = job.find('span', class_='company')
    company = company_elem.text.strip() if company_elem else 'N/A'

    # Extract job location
    location_elem = job.find('span', class_='location')
    location = location_elem.text.strip() if location_elem else 'N/A'

    # Extract job link
    link_elem = job.find('a', href=True)
    link = link_elem['href'] if link_elem else 'N/A'
    # Resolve relative links against the page URL
    if link != 'N/A':
        link = urljoin(url, link)

    # Create a dictionary for this job
    job_info = {
        'Title': title,
        'Company': company,
        'Location': location,
        'Link': link
    }
    jobs_data.append(job_info)

print(f"Extracted {len(jobs_data)} job listings.")

Page fetched and parsed successfully.
Extracted 10 job listings.

Saving Data to an Excel File with Pandas

Pandas makes saving data easy. Convert the list of dictionaries to a DataFrame.

Use the to_excel() method. Specify the filename.

The openpyxl engine will create the .xlsx file in your project folder.


import pandas as pd

# Create a DataFrame from the list of job dictionaries
df = pd.DataFrame(jobs_data)

# Save the DataFrame to an Excel file
excel_filename = 'job_listings.xlsx'
df.to_excel(excel_filename, index=False, engine='openpyxl')

print(f"Job listings successfully saved to {excel_filename}")

Job listings successfully saved to job_listings.xlsx
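As a quick sanity check, you can read the file straight back with pandas and preview the rows. This assumes the script above ran in the same folder.

import pandas as pd

# Load the exported file back and preview the first rows
df_check = pd.read_excel('job_listings.xlsx', engine='openpyxl')
print(df_check.head())
print(f"{len(df_check)} rows loaded.")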

Handling Pagination and Complex Sites

Real job sites have multiple pages. You need to scrape them all.

Find the "Next" button link. Loop through pages until none are left.

Some sites use AJAX or infinite scroll. This requires advanced techniques.

For handling multiple pages, see our tutorial on Advanced BeautifulSoup Pagination & Infinite Scroll.

Always add delays between requests. Use time.sleep().

This respects the website's server. It prevents your IP from being blocked.
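Here is a minimal sketch of that loop, using the example site from above. The 'next-page' class is a hypothetical placeholder; inspect the real site's "Next" button to find the correct selector.

import time
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

page_url = 'https://example.com/jobs'
all_listings = []

while page_url:
    response = requests.get(page_url, timeout=10)
    if response.status_code != 200:
        break

    soup = BeautifulSoup(response.content, 'html.parser')
    all_listings.extend(soup.find_all('div', class_='job-listing'))

    # 'next-page' is a hypothetical class; adjust it to match the
    # real site's "Next" button link
    next_link = soup.find('a', class_='next-page', href=True)
    page_url = urljoin(page_url, next_link['href']) if next_link else None

    # Pause between requests to respect the server
    time.sleep(2)

print(f"Collected {len(all_listings)} listings across all pages.")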

Ethical Scraping and Best Practices

Always check the website's robots.txt file. Respect the rules.

Do not overload servers with rapid requests. Space them out.

Use scraped data for personal analysis only. Do not republish it.

Some sites have APIs. Using an API is often better than scraping.

Identify yourself in requests. Use a descriptive User-Agent header.
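One concrete way to honor robots.txt is Python's built-in urllib.robotparser, which checks a URL against the site's rules before you request it. The sketch below uses the example URL and the illustrative User-Agent from earlier.

from urllib.robotparser import RobotFileParser

USER_AGENT = 'job-scraper-tutorial/1.0'  # illustrative; describe your own scraper

# Load and parse the site's robots.txt
rp = RobotFileParser()
rp.set_url('https://example.com/robots.txt')
rp.read()

url = 'https://example.com/jobs'
if rp.can_fetch(USER_AGENT, url):
    print(f"robots.txt allows fetching {url}")
else:
    print(f"robots.txt disallows {url}; skip it")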

Conclusion

You have learned to scrape job listings. Requests fetches the pages, and BeautifulSoup parses out the data.

Pandas saves it to a clean Excel spreadsheet. This automates data collection.

Start with a simple site. Master the basics of HTML structure.

Then move to more complex targets with pagination. Always scrape ethically.

This skill is valuable for job market research and data analysis projects.