Last modified: March 29, 2021

BeautifulSoup: Extract the Contents of an Element

Beautiful Soup provides the .contents property, which returns a list of an element's direct children: both nested tags and text strings.

Extract contents of an element

Get all the contents of a div:


from bs4 import BeautifulSoup

html = '''
<div>
<div><p>hello</p></div>
<span class="span" aria-label="4 people reacted to this post">click</span>
<a href="url.com">Link</a>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')

# Find the first div (the outer one)
c = soup.find('div')

# Print the div's contents as a list of its direct children
print(c.contents)

Output:

['\n', <div><p>hello</p></div>, '\n', <span aria-label="4 people reacted to this post" class="span">click</span>, '\n', <a href="url.com">Link</a>, '\n']
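The '\n' entries in the list above are the line breaks between the tags, which Beautiful Soup stores as text nodes. If only the tags are wanted, one possible approach is to keep the children that are Tag instances. The following is a minimal sketch, not the only way to do it; the sample markup is reconstructed from the output shown above, and the names outer and tags_only are illustrative.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Sample markup, assumed to match the output listed above
html = '''
<div>
<div><p>hello</p></div>
<span class="span" aria-label="4 people reacted to this post">click</span>
<a href="url.com">Link</a>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')
outer = soup.find('div')

# Keep only Tag children; the '\n' entries are NavigableString objects
tags_only = [child for child in outer.contents if isinstance(child, Tag)]

# Print the tag names that remain after filtering
print([tag.name for tag in tags_only])
```

With this markup, the filtered list should hold the inner div, the span, and the a element, in document order.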

Print the elements one by one:


for e in c.contents:
    print(e)

Output:

<div><p>hello</p></div>


<span aria-label="4 people reacted to this post" class="span">click</span>


<a href="url.com">Link</a>

Print only the elements whose tag name is a:


for e in c.contents:
    if e.name == "a":
        print(e)

Output:

<a href="url.com">Link</a>
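The name check works on every item in .contents because text nodes report None as their name, so the comparison with "a" simply fails for them. As an alternative sketch, find_all with recursive=False should give the same result by searching only the direct children. The shortened markup and the variable names below are illustrative assumptions.

```python
from bs4 import BeautifulSoup

# Shortened sample markup (an assumption for this sketch)
html = '<div>\n<span class="span">click</span>\n<a href="url.com">Link</a>\n</div>'

soup = BeautifulSoup(html, 'html.parser')
outer = soup.find('div')

# Text children have .name set to None, so e.name == "a" skips them
names = [child.name for child in outer.contents]
print(names)

# recursive=False limits the search to the div's direct children
links = outer.find_all('a', recursive=False)
print(links)
```

The find_all variant avoids the explicit loop, while the loop over .contents is handy when other child types need handling in the same pass.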