Last modified: Jan 10, 2023 By Alexander Williams
BeautifulSoup: Get Image URL
In this tutorial, we learn how to get the URL of an image using BeautifulSoup.
Getting the URL of an image
Syntax:
image['src']
Example:
from bs4 import BeautifulSoup
#html source
html = """
<div>
<h1>Hello BeautifulSoup</h1>
<img src="img_girl.jpg" alt="Girl in a jacket">
<img src="img_boy.jpg" alt="boy in a jacket">
<img src="img_mom.jpg" alt="mom in a jacket">
</div>
"""
#BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
#find all images
all_imgs = soup.find_all('img')
#print image url
for image in all_imgs:
    print(image['src'])
Output:
img_girl.jpg
img_boy.jpg
img_mom.jpg
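The src values printed above are relative paths. As a minimal sketch, assuming a hypothetical page address https://example.com/gallery/ (base_url is an illustration and not part of the original example), they can be resolved into absolute URLs with urljoin from Python's standard library:

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# hypothetical page address, used only for illustration
base_url = "https://example.com/gallery/"

#html source
html = """
<div>
<img src="img_girl.jpg" alt="Girl in a jacket">
<img src="/static/img_boy.jpg" alt="boy in a jacket">
</div>
"""

#BeautifulSoup
soup = BeautifulSoup(html, "html.parser")

# resolve every src against the base URL
absolute_urls = [urljoin(base_url, image["src"]) for image in soup.find_all("img")]

#print absolute image url
for url in absolute_urls:
    print(url)
```

A src without a leading slash is resolved relative to the page's directory, while a src that starts with a slash is resolved against the site root.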
Note: image['src'] raises a KeyError if the src attribute does not exist.
Let's see an example:
from bs4 import BeautifulSoup
#html source
html = """
<div>
<h1>Hello BeautifulSoup</h1>
<img alt="Girl in a jacket">
<img src="img_boy.jpg" alt="boy in a jacket">
<img src="img_girl.jpg" alt="mom in a jacket">
</div>
"""
#BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
#find all images
all_imgs = soup.find_all('img')
#print image url
for image in all_imgs:
    print(image['src'])
Output:
  File "tuto.py", line 49, in <module>
    print(image['src'])
  File "/home/py/.local/lib/python3.8/site-packages/bs4/element.py", line 1406, in __getitem__
    return self.attrs[key]
KeyError: 'src'
In the next part, we're going to solve this issue.
Solving the KeyError: 'src' issue
Syntax:
soup.find_all('img', src=True)
Example:
from bs4 import BeautifulSoup
#html source
html = """
<div>
<h1>Hello BeautifulSoup</h1>
<img src="img_girl.jpg" alt="Girl in a jacket">
<img src="img_boy.jpg" alt="boy in a jacket">
<img alt="mom in a jacket">
</div>
"""
#BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
#find all images
all_imgs = soup.find_all('img', src=True)
#print image url
for image in all_imgs:
    print(image['src'])
Output:
img_girl.jpg
img_boy.jpg
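Filtering with src=True skips images that have no src. If you would rather keep every image and handle the missing ones yourself, one possible alternative (a sketch, not the only approach) is the tag's get() method, which returns None instead of raising KeyError when the attribute is absent:

```python
from bs4 import BeautifulSoup

#html source
html = """
<div>
<img src="img_girl.jpg" alt="Girl in a jacket">
<img alt="mom in a jacket">
</div>
"""

#BeautifulSoup
soup = BeautifulSoup(html, "html.parser")

# get() returns None when the attribute is missing
sources = [image.get("src") for image in soup.find_all("img")]

#print image url, or a notice when there is none
for src in sources:
    if src is None:
        print("image without src")
    else:
        print(src)
```

Here sources still contains one entry per image, so you can count or report the images that lack a URL.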