Last modified: Oct 21, 2024 By Alexander Williams

Python Get Root and Sub Domain From URL

Extracting the domain from a URL is a common task when working with web data in Python, and the standard library handles most of it.

This tutorial covers how to get the root domain and subdomain from a URL in Python.

Whether you're new to Python or a more experienced user who needs a refresher, this guide is here to help.

Before getting started, look at the following image to understand the parts of the URL.

[Image: domain parts]

Image Source: btmapplications

Get Root Domain From URL Using urlparse

urlparse is a function in the urllib.parse module that breaks a URL into its components, such as the scheme, the network location, and the path. Each component is available as an attribute of the result.

The module is part of the standard Python library, so there is no need to install it.

We'll use it to get the root domain from a URL.

How to use urlparse

In the following example, we'll get the domain from https://pytutorial.com/hello.

from urllib.parse import urlparse # 👉️ Import the urlparse function

url = "https://pytutorial.com/hello" # 👉️ URL

domain = urlparse(url).netloc # 👉️ Get Domain From URL

print(domain)
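
One thing to watch for: when the URL has no scheme (for example pytutorial.com/hello), urlparse treats the whole string as a path and .netloc comes back empty. The sketch below shows one possible workaround; get_netloc is a hypothetical helper name, not part of urllib, and it assumes the input is either a full URL or a bare host followed by an optional path.

```python
from urllib.parse import urlparse

def get_netloc(url: str) -> str:
    """Return the network location, tolerating a missing scheme (hypothetical helper)."""
    parsed = urlparse(url)
    if not parsed.netloc:
        # Without "//", urlparse reads the whole string as a path,
        # so prepend "//" to mark where the network location starts.
        parsed = urlparse("//" + url)
    return parsed.netloc

print(get_netloc("https://pytutorial.com/hello"))  # pytutorial.com
print(get_netloc("pytutorial.com/hello"))          # pytutorial.com
```

Inputs that are not web addresses at all, such as mailto: links, would need extra checks that this sketch leaves out.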

The .netloc attribute returns the network location, which is the domain together with the subdomain if there is one. Let's see an example with a subdomain.

url = "https://blog.pytutorial.com/hello" # 👉️ URL

domain = urlparse(url).netloc # 👉️ Get Domain From URL

print(domain)

Output:

blog.pytutorial.com
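
Keep in mind that .netloc also includes a port number and login details when the URL contains them. If you only want the host name, the parsed result has .hostname and .port attributes as well. Here is a small illustration; the URL, user name, and password are made up for the example.

```python
from urllib.parse import urlparse

# Made-up URL with credentials and a port, for illustration only
url = "https://user:secret@blog.pytutorial.com:8443/hello"
parsed = urlparse(url)

print(parsed.netloc)    # user:secret@blog.pytutorial.com:8443
print(parsed.hostname)  # blog.pytutorial.com
print(parsed.port)      # 8443
```

Note that .hostname is always returned in lowercase, while .netloc keeps the original casing.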

To get the root domain from a host name that includes a subdomain, keep only the last two dot-separated parts.

print('.'.join(domain.split('.')[-2:])) # 👉️ Get Root Domain From Subdomain

Output:

pytutorial.com

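Here is an example of a .co.com domain. .netloc still returns the full host, but note that the two-part split above would return co.com for this URL, because the suffix itself has two parts.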

url = "https://pytutorial.co.com/hello" # 👉️ URL

domain = urlparse(url).netloc # 👉️ Get Domain From URL

print(domain)

Output:

pytutorial.co.com
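
If you need the root domain and the subdomain separately for suffixes like .co.com, one option is to check the last two parts of the host against a list of known multi-part suffixes. The code below is only a rough sketch: split_domain and MULTI_PART_SUFFIXES are hypothetical names, and the three suffixes listed are a tiny sample. A complete solution would rely on the Public Suffix List, which third-party packages such as tldextract build on.

```python
from urllib.parse import urlparse

# Illustrative sample only: a real list of public suffixes is far longer.
MULTI_PART_SUFFIXES = {"co.com", "co.uk", "com.au"}

def split_domain(url: str) -> tuple[str, str]:
    """Return (subdomain, root_domain) for a URL (hypothetical helper)."""
    host = urlparse(url).hostname or ""
    labels = host.split(".")
    # Keep three labels when the last two form a known multi-part suffix.
    keep = 3 if ".".join(labels[-2:]) in MULTI_PART_SUFFIXES else 2
    root = ".".join(labels[-keep:])
    subdomain = ".".join(labels[:-keep])
    return subdomain, root

print(split_domain("https://blog.pytutorial.co.com/hello"))  # ('blog', 'pytutorial.co.com')
print(split_domain("https://blog.pytutorial.com/hello"))     # ('blog', 'pytutorial.com')
print(split_domain("https://pytutorial.com/hello"))          # ('', 'pytutorial.com')
```

The subdomain comes back as an empty string when the URL has none.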

Get Domain with protocol From URL

To get the domain together with its protocol, combine the .scheme and .netloc attributes.

url = "https://pytutorial.com/hello" # 👉️ URL

protocol = urlparse(url).scheme # 👉️ Get Domain protocol From URL

domain = urlparse(url).netloc # 👉️ Get Domain From URL

domain_w_protocol = f"{protocol}://{domain}"

print(domain_w_protocol)

Output:

https://pytutorial.com

Here we've used the .scheme attribute to get the protocol, then joined it to the domain with an f-string. The same approach works when the URL has a subdomain:

url = "https://blog.pytutorial.co.com/hello" # 👉️ URL

protocol = urlparse(url).scheme # 👉️ Get Domain protocol From URL

domain = urlparse(url).netloc # 👉️ Get Domain From URL

domain_w_protocol = f"{protocol}://{domain}"

print(domain_w_protocol)

Output:

https://blog.pytutorial.co.com
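
As an alternative to building the string by hand, you can blank out the parts you don't need and let urllib rebuild the URL. This is a minimal sketch, assuming the input is a full URL with a scheme; get_origin is a hypothetical helper name, and the query string and fragment in the example URL are made up.

```python
from urllib.parse import urlparse

def get_origin(url: str) -> str:
    """Return scheme://netloc for a URL (hypothetical helper)."""
    parsed = urlparse(url)
    # Blank out everything after the network location, then rebuild the URL.
    stripped = parsed._replace(path="", params="", query="", fragment="")
    return stripped.geturl()

print(get_origin("https://blog.pytutorial.co.com/hello?page=2#top"))
# https://blog.pytutorial.co.com
```

This works because urlparse returns a named tuple, so _replace gives back a copy with the chosen fields changed, and geturl() joins the remaining parts into a string.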

Conclusion

Python is a general-purpose language used for web development, data analysis, machine learning, and more. This article showed how to use urlparse to get the domain from a URL, with or without its protocol.