How to Get Inner Div Using BeautifulSoup

In this BeautifulSoup tutorial, we'll learn how to get the inner div of an element using the .contents property or findChildren().

If you are ready, let's get started.

Get inner div using .contents property

The .contents property returns the direct children of an element as a list. The list holds both tags and the strings between them, including whitespace.

Syntax

element.contents

1. .contents with find()

In the following example, we'll find the outer div element with find() and get its contents, which include the inner div.

from bs4 import BeautifulSoup

html_source = '''
<div class="category-bar">
    <h3>Categories</h3>
    <div class="categories">
        <ul>
        <li><a href="/category/python">Python</a></li>
        </ul>
    </div>
</div>
'''

# Parse
soup = BeautifulSoup(html_source, "html.parser")

# Find Div with "category-bar" in class
category_div = soup.find("div", class_="category-bar")

# Get inner div
inner = category_div.contents

# Print Result
print(inner)

Output:

['\n', <h3>Categories</h3>, '\n', <div class="categories">
<ul>
<li><a href="/category/python">Python</a></li>
</ul>
</div>, '\n']

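As you can see, .contents returns the direct children of the div as a list. The list includes the inner div as well as the '\n' strings for the whitespace between tags.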
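
If the whitespace strings get in the way, one option is to keep only the items that are tags. The following is a minimal sketch, assuming the same HTML as above; the variable name tag_children is only illustrative.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

html_source = '''
<div class="category-bar">
    <h3>Categories</h3>
    <div class="categories">
        <ul>
        <li><a href="/category/python">Python</a></li>
        </ul>
    </div>
</div>
'''

# Parse
soup = BeautifulSoup(html_source, "html.parser")

# Find Div with "category-bar" in class
category_div = soup.find("div", class_="category-bar")

# Keep only Tag objects; the whitespace between tags is stored as strings
tag_children = [child for child in category_div.contents if isinstance(child, Tag)]

# Print the name of each remaining child
for child in tag_children:
    print(child.name)
```

This should leave only the h3 and the inner div in the list.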

Now, let's print them one by one:

from bs4 import BeautifulSoup

html_source = '''
<div class="category-bar">
    <h3>Categories</h3>
    <div class="categories">
        <ul>
        <li><a href="/category/python">Python</a></li>
        </ul>
    </div>
</div>
'''

# Parse
soup = BeautifulSoup(html_source, "html.parser")

# Find Div with "category-bar" in class
category_div = soup.find("div", class_="category-bar")

# Get inner div
inner = category_div.contents

# Print contents of div one by one
for el in inner:
    print(el)
    print('###')

Output:

###
<h3>Categories</h3>
###


###
<div class="categories">
<ul>
<li><a href="/category/python">Python</a></li>
</ul>
</div>
###

As you can see, .contents returns only the direct children of the div. The inner div comes back as a single item, with its own children still nested inside it. To get every nested element separately, you should use findChildren().
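
If you only need the inner div itself rather than the whole list, you can also call find() on the outer div. This is a small sketch, assuming the inner div has the class "categories" as in the example HTML; the names inner_div and link are illustrative.

```python
from bs4 import BeautifulSoup

html_source = '''
<div class="category-bar">
    <h3>Categories</h3>
    <div class="categories">
        <ul>
        <li><a href="/category/python">Python</a></li>
        </ul>
    </div>
</div>
'''

# Parse
soup = BeautifulSoup(html_source, "html.parser")

# Find the outer div first
category_div = soup.find("div", class_="category-bar")

# Search inside the outer div for the inner div
inner_div = category_div.find("div", class_="categories")

# find() returns None when nothing matches, so check before using the result
if inner_div is not None:
    link = inner_div.find("a")
    print(link.get_text())
    print(link["href"])
```

Searching from category_div instead of soup limits the search to that element's descendants.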

2. .contents with select_one()

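We can also use select() or select_one() with .contents. Note that select() returns a list of matches, so .contents has to be read from each match, while select_one() returns a single element. In the following example, we'll select the div element with select_one() and get its contents (the same thing as example 1).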

from bs4 import BeautifulSoup

# HTML source
html_source = '''
<div class="category-bar">
    <h3>Categories</h3>
    <div class="categories">
        <ul>
        <li><a href="/category/python">Python</a></li>
        </ul>
    </div>
</div>
'''

# Parse
soup = BeautifulSoup(html_source, "html.parser")

# Select Div
category_div = soup.select_one("div.category-bar")

# Get inner div
inner = category_div.contents

# Print Result
print(inner)

Output:

['\n', <h3>Categories</h3>, '\n', <div class="categories">
<ul>
<li><a href="/category/python">Python</a></li>
</ul>
</div>, '\n']
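
CSS selectors can also target the inner div directly with the child combinator (>). The sketch below assumes the same HTML and a bs4 version that ships CSS selector support; the names inner_div and direct_tags are illustrative.

```python
from bs4 import BeautifulSoup

html_source = '''
<div class="category-bar">
    <h3>Categories</h3>
    <div class="categories">
        <ul>
        <li><a href="/category/python">Python</a></li>
        </ul>
    </div>
</div>
'''

# Parse
soup = BeautifulSoup(html_source, "html.parser")

# Select the div that is a direct child of div.category-bar
inner_div = soup.select_one("div.category-bar > div.categories")
print(inner_div["class"])

# Select every tag that is a direct child of div.category-bar
direct_tags = soup.select("div.category-bar > *")
for tag in direct_tags:
    print(tag.name)
```

Unlike .contents, select() returns tags only, so no whitespace strings appear in the result.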

Get inner div using findChildren()

findChildren() is an older alias of find_all(). Called without arguments, it returns every tag nested inside the element, at any depth, as separate items. Let's see how to use it:

from bs4 import BeautifulSoup

# HTML source
html_source = '''
<div class="category-bar">
    <h3>Categories</h3>
    <div class="categories">
        <ul>
        <li><a href="/category/python">Python</a></li>
        </ul>
    </div>
</div>
'''

# Parse
soup = BeautifulSoup(html_source, "html.parser")

# Select Div
category_div = soup.select_one("div.category-bar")

# Get inner div 
inner = category_div.findChildren()

# Print Result
for el in inner:
    print(el)
    print("###")

Output:

<h3>Categories</h3>
###
<div class="categories">
<ul>
<li><a href="/category/python">Python</a></li>
</ul>
</div>
###
<ul>
<li><a href="/category/python">Python</a></li>
</ul>
###
<li><a href="/category/python">Python</a></li>
###
<a href="/category/python">Python</a>
###

Great! As you can see, every nested element has been printed separately.
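
Because findChildren() is an alias of find_all(), the same filters apply. The sketch below uses find_all() and shows two of them: a tag name, and recursive=False to stop at the direct children. The variable names are illustrative.

```python
from bs4 import BeautifulSoup

html_source = '''
<div class="category-bar">
    <h3>Categories</h3>
    <div class="categories">
        <ul>
        <li><a href="/category/python">Python</a></li>
        </ul>
    </div>
</div>
'''

# Parse
soup = BeautifulSoup(html_source, "html.parser")

# Select Div
category_div = soup.select_one("div.category-bar")

# Every nested tag, at any depth
all_tags = category_div.find_all()

# Only the tags that are direct children
direct_tags = category_div.find_all(recursive=False)

# Only the nested div elements
divs_only = category_div.find_all("div")

print([tag.name for tag in all_tags])
print([tag.name for tag in direct_tags])
print([tag.name for tag in divs_only])
```

The element you search from is never part of its own results, so divs_only holds just the inner div.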

Conclusion

Alright, we're done with the tutorial. For more examples of getting the children of any element, check out Beautifulsoup - How to get the children of a tag.

Happy learning ♥