Last modified: Jan 10, 2023 By Alexander Williams
BeautifulSoup - How to get the children of a tag
In this tutorial, we'll learn how to get the children of any tag in two ways:
- the .children attribute
- the .contents attribute
We'll also learn the difference between .children and .contents.
Get children using the .children attribute
The .children attribute returns the children of a tag as an iterator.
Syntax
element.children
Example
The following example will get the children of the <article> tag.
from bs4 import BeautifulSoup
# HTML Content
html = '''
<article>
<h1>Headline 1</h1>
<h2>Headline 2</h2>
<h3>Headline 3</h3>
</article>
'''
# Parse
soup = BeautifulSoup(html, 'html.parser')
# Find <article> tag
article = soup.find('article')
# Print the children of <article>
for child in article.children:
    print(child)
Output:
<h1>Headline 1</h1>
<h2>Headline 2</h2>
<h3>Headline 3</h3>
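As you can see, we've got all the children of the <article> tag. The line breaks between the tags are also children (text nodes), so the actual output contains blank lines between the headlines; they are omitted here.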
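The line breaks between tags in the markup are children too (as text nodes), so a loop over .children often filters them out. The following is a minimal sketch of one way to do that, assuming the same markup as above; the name tag_children is only illustrative.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# HTML Content (same markup as the example above)
html = '''
<article>
<h1>Headline 1</h1>
<h2>Headline 2</h2>
<h3>Headline 3</h3>
</article>
'''

# Parse
soup = BeautifulSoup(html, 'html.parser')

# Find <article> tag
article = soup.find('article')

# Keep only the children that are tags; whitespace strings are skipped
tag_children = [child for child in article.children if isinstance(child, Tag)]

# Print the name of each tag child
for child in tag_children:
    print(child.name)
```

This should print h1, h2 and h3, one per line.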
Get children using the .contents attribute
The .contents attribute returns the children of a tag as a list.
Syntax
tag.contents
Example
from bs4 import BeautifulSoup
# HTML Content
html = '''
<article>
<h1>Headline 1</h1>
<h2>Headline 2</h2>
<h3>Headline 3</h3>
</article>
'''
# Parse
soup = BeautifulSoup(html, 'html.parser')
# Find <article> tag
article = soup.find('article')
# Print the children of <article> as a list
print(article.contents)
Output:
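['\n', <h1>Headline 1</h1>, '\n', <h2>Headline 2</h2>, '\n', <h3>Headline 3</h3>, '\n']
As you can see, we got the children as a list. The '\n' entries are the line breaks between the tags, which BeautifulSoup keeps as text nodes.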
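Because .contents is a regular Python list, the usual list operations work on it. Here is a small sketch, assuming markup written on a single line (so there are no whitespace text nodes between the tags); the variable name children is only illustrative.

```python
from bs4 import BeautifulSoup

# Markup without whitespace between tags, so every child is a tag
html = '<article><h1>Headline 1</h1><h2>Headline 2</h2><h3>Headline 3</h3></article>'

# Parse and find the <article> tag
soup = BeautifulSoup(html, 'html.parser')
article = soup.find('article')

children = article.contents

print(len(children))       # number of direct children: 3
print(children[0])         # first child: <h1>Headline 1</h1>
print(children[-1].name)   # name of the last child: h3
print(children[1:])        # slice: every child after the first
```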
The difference between .children and .contents
As mentioned above, .children returns the children as an iterator, and .contents returns them as a list.
The following example will get the type of the data:
from bs4 import BeautifulSoup
# Parse (html is the same string used in the examples above)
soup = BeautifulSoup(html, 'html.parser')
# Find <article> tag
article = soup.find('article')
# Print Type of data
print(type(article.children))
print(type(article.contents))
Output:
<class 'list_iterator'>
<class 'list'>
For more information about the difference between a generator and a list, see the following question:
Generator expressions vs. list comprehensions
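In practice, the difference shows up when you loop more than once. The sketch below, which assumes a small single-line document, shows that the iterator from .children is used up after one pass, while the list from .contents can be looped over again. The variable names are only illustrative.

```python
from bs4 import BeautifulSoup

html = '<article><h1>Headline 1</h1><h2>Headline 2</h2></article>'
article = BeautifulSoup(html, 'html.parser').find('article')

# .children: the same iterator object can only be consumed once
children_iter = article.children
first_pass = [child.name for child in children_iter]
second_pass = [child.name for child in children_iter]  # already exhausted

print(first_pass)    # ['h1', 'h2']
print(second_pass)   # []

# .contents: a list can be looped over as many times as needed
children_list = article.contents
list_first = [child.name for child in children_list]
list_second = [child.name for child in children_list]

print(list_first == list_second)   # True
```

Accessing article.children again gives a fresh iterator, so only a stored iterator runs out.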
.children or .contents: which one should we use?
Use .children for:
- parsing large documents
- iterating over the children once
Use .contents for:
- parsing small documents
- iterating over the children multiple times
- using list methods such as index ...
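The two use cases above can be sketched side by side. This is an illustrative example only, assuming a small single-line document; the names names and position are hypothetical.

```python
from bs4 import BeautifulSoup

html = '<article><h1>Headline 1</h1><h2>Headline 2</h2><h3>Headline 3</h3></article>'
article = BeautifulSoup(html, 'html.parser').find('article')

# Single pass over the children: .children is enough
names = [child.name for child in article.children]
print(names)      # ['h1', 'h2', 'h3']

# List method: find the position of the <h2> tag among the children
h2 = article.find('h2')
position = article.contents.index(h2)
print(position)   # 1
```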