Last modified: Jan 10, 2023 By Alexander Williams
BeautifulSoup: Get Text value of Element using .string & .strings properties
.string and .strings are properties that get the text value of elements. This tutorial will teach us when and how to use these two properties.
Let's get started.
The .string property to get the text value of an element
The .string property returns an element's text when the element has exactly one child and that child is a string (or a single tag that itself has a .string). Otherwise, it returns None.
Syntax
element.string
Example
In the following example, we will get the text value of the <p> element.
from bs4 import BeautifulSoup
# HTML source
html_doc = '''
<div class="root">
<p class="child_1">Hello Python</p>
<p class="child_2">Hello BeautifulSoup</p>
<p class="child_3">Hello Flask</p>
<p class="child_4">Hello Django</p>
</div>
'''
# Parse
soup = BeautifulSoup(html_doc, 'html.parser')
# Find "<p>" element
f_p = soup.find("p")
# Get text value of "<p>" element
print(f_p.string)
Output:
Hello Python
As you can see, we've used the find() method to find the first <p> element. Next, we've got the text value of the element.
Now let's find all the <p> elements and get the text value of each one.
from bs4 import BeautifulSoup
# HTML source
html_doc = '''
<div class="root">
<p class="child_1">Hello Python</p>
<p class="child_2">Hello BeautifulSoup</p>
<p class="child_3">Hello Flask</p>
<p class="child_4">Hello Django</p>
</div>
'''
# Parse
soup = BeautifulSoup(html_doc, 'html.parser')
# Find "<p>" elements
f_p = soup.find_all("p")
for p in f_p:
    # Get text value of "<p>" element
    print(p.string)
Output:
Hello Python
Hello BeautifulSoup
Hello Flask
Hello Django
Voila!
Now, let's try to get the text value of the <div> element.
from bs4 import BeautifulSoup
# HTML source
html_doc = '''
<div class="root">
<p class="child_1">Hello Python</p>
<p class="child_2">Hello BeautifulSoup</p>
<p class="child_3">Hello Flask</p>
<p class="child_4">Hello Django</p>
</div>
'''
# Parse
soup = BeautifulSoup(html_doc, 'html.parser')
# Find "<div>" element
f_d = soup.find("div")
# Get text value of "<div>" element
print(f_d.string)
Output:
None
Oops! We got None. Why?
As mentioned above, the .string property returns None when the element doesn't directly contain a single text value, and our <div> contains several child elements rather than one string.
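The rule can be checked with a small sketch. The HTML below is made up for illustration (it is not the tutorial's document), and the variable names are arbitrary; it shows three cases: text mixed with a child tag, a single nested tag, and an empty element.

```python
from bs4 import BeautifulSoup

# Illustrative HTML (an assumption, not the tutorial's html_doc)
html_doc = '<div><p>Hello <b>Python</b></p><p><b>Hello Flask</b></p><p></p></div>'
soup = BeautifulSoup(html_doc, 'html.parser')
mixed, nested, empty = soup.find_all('p')

# Text plus a child tag: more than one child, so .string is None
mixed_value = mixed.string
# Exactly one child tag: .string is taken from that child
nested_value = nested.string
# No children at all: .string is None
empty_value = empty.string

print(mixed_value, nested_value, empty_value)
```

Only the second <p> yields a value, because it is the only one with exactly one child.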
To get all text values of children, we can use the .strings property.
The .strings property to get the text values of an element and its children
The .strings property returns every text value inside the element, including the text of its children and their descendants. It returns them as a generator, so you iterate over it (or pass it to list()) to read the values.
Syntax
element.strings
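Since the property produces its values lazily, one way to look at everything it yields is to pass it to list(). The sketch below uses a shortened, made-up version of the HTML, and the names div and all_strings are only illustrative.

```python
from bs4 import BeautifulSoup

# Shortened HTML for illustration; the "\n" characters sit between the tags
html_doc = '<div>\n<p>Hello Python</p>\n<p>Hello Flask</p>\n</div>'
soup = BeautifulSoup(html_doc, 'html.parser')
div = soup.find('div')

# list() consumes the generator so every yielded string can be inspected,
# including the whitespace-only strings between the <p> tags
all_strings = list(div.strings)
print(all_strings)
```

Note that the whitespace between tags counts as text too, so it shows up alongside the visible strings.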
Example
In the following example, we'll get the text values of the <div> element's children.
from bs4 import BeautifulSoup
# HTML source
html_doc = '''
<div class="root">
<p class="child_1">Hello Python</p>
<p class="child_2">Hello BeautifulSoup</p>
<p class="child_3">Hello Flask</p>
<p class="child_4">Hello Django</p>
</div>
'''
# Parse
soup = BeautifulSoup(html_doc, 'html.parser')
# Find "<div>" element
f_d = soup.find("div")
for d in f_d.strings:
    print(d)
Output:


Hello Python


Hello BeautifulSoup


Hello Flask


Hello Django
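As you can see, the program works as expected, but the output includes blank lines: .strings also yields the whitespace-only strings (the newlines) that sit between the <p> tags. To skip those and strip leading and trailing whitespace from each value, use .stripped_strings instead.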
f_d = soup.find("div")
for d in f_d.stripped_strings:
    print(d)
Output:
Hello Python
Hello BeautifulSoup
Hello Flask
Hello Django
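If you need the values as a collection rather than printing them one by one, a minimal sketch is to collect the generator into a list and, optionally, join it. The names texts and joined and the ", " separator are illustrative choices, and the HTML is a shortened version of the one above.

```python
from bs4 import BeautifulSoup

# Shortened version of the tutorial's HTML
html_doc = '''
<div class="root">
<p class="child_1">Hello Python</p>
<p class="child_2">Hello BeautifulSoup</p>
</div>
'''
soup = BeautifulSoup(html_doc, 'html.parser')
div = soup.find('div')

# Collect the stripped text values into a list
texts = list(div.stripped_strings)
# Join them into one string with a separator of our choosing
joined = ', '.join(texts)

print(texts)
print(joined)
```

A generator can only be iterated once, so storing the result in a list is handy when the values are needed more than once.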
Conclusion
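In this tutorial, we've learned two BeautifulSoup properties: .string, which gets the text value of a single element, and .strings (along with .stripped_strings), which gets the text values of an element and its children.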
For more tutorials about BeautifulSoup, check out:
Understand How to Use the attribute in Beautifulsoup