Last modified: Feb 15, 2023 By Alexander Williams

How to Use BeautifulSoup To Extract Title Tag

This tutorial explores the methods BeautifulSoup provides for extracting the title tag from HTML, with a hands-on example for each method. These methods include:

  • The .title property
  • The find() function
  • The select() function

By the end of this tutorial, you will clearly understand how to use each method to extract the title tag in HTML using BeautifulSoup. You will also know how to extract the title tag from any website page.

Extract title tag using .title property

The '.title' property extracts the title tag from the HTML code. If the title tag is present, it returns the tag. If it's not, it returns 'None'.

Here is an example:

from bs4 import BeautifulSoup

html = '''
<!DOCTYPE html>
<html>
  <head>
    <title>My Simple HTML Page</title>
  </head>
  <body>
    <h1>Hello World!</h1>
    <p>This is my first HTML page.</p>
  </body>
</html>
'''

soup = BeautifulSoup(html, "html.parser") # Parse HTML

title = soup.title # Get Title Tag

print(title)

Output:

<title>My Simple HTML Page</title>

Now, let's get the content inside the title tag.

from bs4 import BeautifulSoup

html = '''
<!DOCTYPE html>
<html>
  <head>
    <title>My Simple HTML Page</title>
  </head>
  <body>
    <h1>Hello World!</h1>
    <p>This is my first HTML page.</p>
  </body>
</html>
'''

soup = BeautifulSoup(html, "html.parser") # Parse HTML

title = soup.title # Get title Tag

print(title.string) # Print the content of title tag

Output:

My Simple HTML Page

As you can see, we used the ".string" attribute to extract the content of the title tag. Keep in mind that if the HTML has no title tag at all, "soup.title" is 'None', and accessing ".string" on it raises "AttributeError: 'NoneType' object has no attribute 'string'". If the tag is present but empty, ".string" simply returns 'None'.

To avoid this error, first check whether the title tag is present. See the code below:

from bs4 import BeautifulSoup

html = '''
<!DOCTYPE html>
<html>
  <head>
    <title>My Simple HTML Page</title>
  </head>
  <body>
    <h1>Hello World!</h1>
    <p>This is my first HTML page.</p>
  </body>
</html>
'''

soup = BeautifulSoup(html, "html.parser") # Parse HTML

title = soup.title # Get title Tag

if title: # Check if the Title Tag is present
    print("The title tag is present")
    #print(title.string)

else:
    print("The title tag is not present")

Output:

The title tag is present
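
The presence check and the text extraction can also be combined in one place. The following is a minimal sketch, not part of the original examples: the helper name title_text is hypothetical, and it uses get_text(strip=True) instead of .string so that surrounding whitespace is removed.

```python
from bs4 import BeautifulSoup


def title_text(html):
    """Return the title text, or None if the tag is missing or empty.

    title_text is a hypothetical helper name used only for this sketch.
    """
    soup = BeautifulSoup(html, "html.parser")  # Parse HTML
    title = soup.title                         # None when there is no <title> tag
    if title is None:
        return None
    text = title.get_text(strip=True)          # "" for an empty <title></title>
    return text or None                        # Treat an empty title as missing


print(title_text("<html><head><title>My Simple HTML Page</title></head></html>"))
print(title_text("<html><head></head><body></body></html>"))
print(title_text("<html><head><title></title></head></html>"))
```

The first call prints the title text, and the other two print None, so the caller only has to handle a single "no title" case.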

For more information about the .string property, see the article "BeautifulSoup: .string & .strings properties".

Extract the title tag using the find() function

Another way to extract the title tag is by using the 'find()' function. This function finds the first tag with a given name, class, or ID. Here's an example:

from bs4 import BeautifulSoup

html = '''
<!DOCTYPE html>
<html>
  <head>
    <title>My Simple HTML Page</title>
  </head>
  <body>
    <h1>Hello World!</h1>
    <p>This is my first HTML page.</p>
  </body>
</html>
'''

soup = BeautifulSoup(html, "html.parser") # Parse HTML

title = soup.find("title") # Find Title Tag

print(title)

Output:

<title>My Simple HTML Page</title>

If the title tag is not present, it returns 'None'.

If you want to find the title tag that has content, set 'string=True', as in the following example:

from bs4 import BeautifulSoup

html = '''
<!DOCTYPE html>
<html>
  <head>
    <title>My Simple HTML Page</title>
  </head>
  <body>
    <h1>Hello World!</h1>
    <p>This is my first HTML page.</p>
  </body>
</html>
'''

soup = BeautifulSoup(html, "html.parser") # Parse HTML

title = soup.find("title", string=True) # Find a title tag that has content

print(title)

Output:

<title>My Simple HTML Page</title>

With 'string=True', find() returns the title tag only if it has text content. Otherwise, it returns 'None'.
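
To make the difference concrete, here is a small sketch that compares find("title") with find("title", string=True) on a title that has text and on an empty one. The helper name find_titles and the two sample documents are assumptions made for illustration.

```python
from bs4 import BeautifulSoup


def find_titles(html):
    """Return (plain, strict) results for one HTML string.

    find_titles is a hypothetical helper used only for this comparison.
    """
    soup = BeautifulSoup(html, "html.parser")
    plain = soup.find("title")                 # Any <title> tag, empty or not
    strict = soup.find("title", string=True)   # Only a <title> whose .string is not None
    return plain, strict


with_text = "<html><head><title>My Simple HTML Page</title></head></html>"
empty = "<html><head><title></title></head></html>"

for html in (with_text, empty):
    plain, strict = find_titles(html)
    print(repr(plain), repr(strict))
```

For the empty title, the plain search should still return the tag while the 'string=True' search returns None, which makes 'string=True' a convenient way to skip titles with no text.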

Extract the title tag using select() function

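We can also use CSS selectors to extract the title tag from HTML. The 'select()' function returns a list of all elements that match the selector, while 'select_one()' returns only the first matching element, or 'None' if nothing matches. Since a page normally has a single title tag, 'select_one()' is the more convenient choice here.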

Let's see an example:

from bs4 import BeautifulSoup

html = '''
<!DOCTYPE html>
<html>
  <head>
    <title>Hello Select()</title>
  </head>
  <body>
    <h1>Hello World!</h1>
    <p>This is my first HTML page.</p>
  </body>
</html>
'''

soup = BeautifulSoup(html, "html.parser") # Parse HTML

title = soup.select_one("title") # select Title Tag

print(title)

Output:

<title>Hello Select()</title>
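
As a short sketch of how the two selector functions differ, the snippet below runs both on the same document. The "head > title" and "footer" selectors are illustrative choices, not taken from the example above.

```python
from bs4 import BeautifulSoup

html = '''
<html>
  <head>
    <title>Hello Select()</title>
  </head>
  <body>
    <h1>Hello World!</h1>
  </body>
</html>
'''

soup = BeautifulSoup(html, "html.parser")  # Parse HTML

matches = soup.select("title")             # List of every matching tag
first = soup.select_one("head > title")    # First <title> that is a direct child of <head>
missing = soup.select_one("footer")        # None, because there is no <footer> tag

print(len(matches))
print(first.string)
print(missing)
```

Because select() always returns a list, take matches[0] only after checking that the list is not empty. select_one() can return None, so check for that before accessing .string.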

Extract the title tag from any website

To extract the title tag from a website, we need to use 'requests' with BeautifulSoup. 'Requests' is used to send HTTP requests and retrieve information from a web page.

If 'requests' is not installed yet, install it with the following command:

pip install requests

Then use the following code to fetch the page and extract its title tag:

# Import the required libraries
import requests
from bs4 import BeautifulSoup

url = "https://pytutorial.com" # URL

response = requests.get(url) # Send an HTTP GET

soup = BeautifulSoup(response.text, "html.parser") # parse HTML

title = soup.title # Get Title Tag

print(title) # Print

Output:

<title>PyTutorial - Python and Django Blog</title>

Voila! We successfully got the title tag.
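
Real pages can be slow, return an error status, or have no title at all. The following is a hedged sketch of a more defensive version of the code above: the function names extract_title and fetch_title and the 10-second default timeout are assumptions chosen for illustration.

```python
import requests
from bs4 import BeautifulSoup


def extract_title(html):
    """Return the stripped title text from an HTML string, or None."""
    soup = BeautifulSoup(html, "html.parser")  # Parse HTML
    if soup.title is None:                     # No <title> tag in the page
        return None
    return soup.title.get_text(strip=True) or None


def fetch_title(url, timeout=10):
    """Download a page and return its title text, or None.

    Raises requests.HTTPError for 4xx/5xx responses and other
    requests exceptions for timeouts and connection problems.
    """
    response = requests.get(url, timeout=timeout)  # Send an HTTP GET
    response.raise_for_status()                    # Fail on error status codes
    return extract_title(response.text)
```

Calling fetch_title("https://pytutorial.com") sends the request and returns the title text. Keeping the parsing in extract_title means it can be tried on a plain HTML string without any network access.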

Conclusion

In conclusion, BeautifulSoup is a powerful library for web scraping and parsing HTML content in Python. To extract the title tag from HTML, we used the .title property, the find() function, and the select_one() function. The select() function accepts the same selectors but returns a list of all matches.

To extract the title tag from a website, we used requests to send an HTTP request and retrieve the page content, then used the .title property to get the title tag from it.