Last modified: Oct 21, 2024 By Alexander Williams
Python Get Root and Sub Domain From URL
Extracting the domain from a URL is a common task, and Python's standard library handles it well.
This tutorial covers the basics of using Python to get the root domain and subdomain from a URL.
Whether you're new to Python or an experienced user who needs a refresher, this guide is here to help.
Before getting started, look at the following image to understand the parts of a URL.
[Image: the parts of a URL. Image source: btmapplications]
Get Root Domain From URL Using urlparse
urlparse is a function in the urllib.parse module that breaks a URL into its components and exposes each one as an attribute of the result.
The module is part of the Python standard library, so there is nothing to install.
We'll use it to get the root domain from a URL.
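As a quick orientation, the sketch below prints the main components that urlparse exposes. The URL is a hypothetical one, with a query string and fragment added only so that every attribute has something to show.

```python
from urllib.parse import urlparse

# Hypothetical URL chosen to exercise each component.
parsed = urlparse("https://blog.pytutorial.com/hello?lang=en#intro")

print(parsed.scheme)    # https
print(parsed.netloc)    # blog.pytutorial.com
print(parsed.path)      # /hello
print(parsed.query)     # lang=en
print(parsed.fragment)  # intro
```

The rest of this tutorial only needs .scheme and .netloc.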
How to use urlparse
In the following example, we'll get the domain from https://pytutorial.com/hello.
from urllib.parse import urlparse # 👉️ Import the urlparse function
url = "https://pytutorial.com/hello" # 👉️ URL
domain = urlparse(url).netloc # 👉️ Get Domain From URL
print(domain)
Output:
pytutorial.com
.netloc returns the full network location, which includes the subdomain if one is present. Here is an example with a subdomain.
url = "https://blog.pytutorial.com/hello" # 👉️ URL
domain = urlparse(url).netloc # 👉️ Get Domain From URL
print(domain)
Output:
blog.pytutorial.com
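One caveat worth knowing: .netloc is the whole network location, so it also carries a port or login details when the URL has them. The sketch below uses a hypothetical URL with both to compare .netloc with the .hostname attribute, which keeps only the host and lowercases it.

```python
from urllib.parse import urlparse

# Hypothetical URL with credentials, mixed case, and a port.
url = "https://user:secret@Blog.PyTutorial.com:8443/hello"
parsed = urlparse(url)

print(parsed.netloc)    # user:secret@Blog.PyTutorial.com:8443
print(parsed.hostname)  # blog.pytutorial.com
```

For plain URLs like the ones in this tutorial, both attributes give the same result.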
To get the root domain from a host that includes a subdomain, keep only the last two dot-separated labels.
print('.'.join(domain.split('.')[-2:])) # 👉️ Get Root Domain From Subdomain
Output:
pytutorial.com
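The same split can also return the subdomain itself. Here is a minimal sketch, assuming the root domain is always the last two labels; split_host is a hypothetical helper name, not part of urllib.

```python
from urllib.parse import urlparse

def split_host(url):
    """Return (subdomain, root_domain) using the naive two-label rule."""
    host = urlparse(url).hostname or ""
    labels = host.split(".")
    root = ".".join(labels[-2:])       # last two labels
    subdomain = ".".join(labels[:-2])  # everything before them, may be empty
    return subdomain, root

print(split_host("https://blog.pytutorial.com/hello"))  # ('blog', 'pytutorial.com')
print(split_host("https://pytutorial.com/hello"))       # ('', 'pytutorial.com')
```

The two-label assumption does not hold for every domain ending, so treat this as a starting point.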
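Here is an example of a .co.com domain. Note that .netloc still returns the full host; applying the two-label split above to it would return co.com rather than pytutorial.co.com.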
url = "https://pytutorial.co.com/hello" # 👉️ URL
domain = urlparse(url).netloc # 👉️ Get Domain From URL
print(domain)
Output:
pytutorial.co.com
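Because a two-label split would reduce this host to co.com, one possible workaround is to check the ending against a list of known multi-label suffixes first. The sketch below assumes a tiny hand-written list; MULTI_LABEL_SUFFIXES and root_domain are hypothetical names, and a real project would more likely rely on the Public Suffix List than on a hard-coded set.

```python
from urllib.parse import urlparse

# Hypothetical, deliberately tiny suffix list, for illustration only.
MULTI_LABEL_SUFFIXES = {"co.com", "co.uk"}

def root_domain(url):
    """Return the root domain, keeping three labels for known two-label suffixes."""
    host = urlparse(url).hostname or ""
    labels = host.split(".")
    # Keep three labels when the last two form a known multi-label suffix.
    keep = 3 if ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES else 2
    return ".".join(labels[-keep:])

print(root_domain("https://blog.pytutorial.co.com/hello"))  # pytutorial.co.com
print(root_domain("https://blog.pytutorial.com/hello"))     # pytutorial.com
```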
Get Domain with protocol From URL
To get the domain together with its protocol, combine the .scheme and .netloc attributes.
url = "https://pytutorial.com/hello" # 👉️ URL
protocol = urlparse(url).scheme # 👉️ Get Domain protocol From URL
domain = urlparse(url).netloc # 👉️ Get Domain From URL
domain_w_protocol = f"{protocol}://{domain}"
print(domain_w_protocol)
Output:
https://pytutorial.com
Here we've used the .scheme attribute to get the protocol, then joined it to the domain with an f-string. The same approach works for a URL with a subdomain:
url = "https://blog.pytutorial.co.com/hello" # 👉️ URL
protocol = urlparse(url).scheme # 👉️ Get Domain protocol From URL
domain = urlparse(url).netloc # 👉️ Get Domain From URL
domain_w_protocol = f"{protocol}://{domain}"
print(domain_w_protocol)
Output:
https://blog.pytutorial.co.com
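As an alternative to the f-string, the standard library's urlunparse can rebuild the URL from its parts. The sketch below wraps that in a hypothetical helper, base_url, and also covers a common pitfall: when the input has no scheme, urlparse leaves .netloc empty and puts the host in .path. The default_scheme fallback is an assumption made for this example.

```python
from urllib.parse import urlparse, urlunparse

def base_url(url, default_scheme="https"):
    """Return scheme://netloc for url; default_scheme is an assumed fallback."""
    # Without "://", urlparse treats the host as part of the path,
    # so add a scheme before parsing.
    if "://" not in url:
        url = f"{default_scheme}://{url}"
    parsed = urlparse(url)
    # Keep scheme and netloc, blank out path, params, query, and fragment.
    return urlunparse((parsed.scheme, parsed.netloc, "", "", "", ""))

print(base_url("https://blog.pytutorial.co.com/hello"))  # https://blog.pytutorial.co.com
print(base_url("pytutorial.com/hello"))                  # https://pytutorial.com
```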
Conclusion
Python is a general-purpose language used for web development, data analysis, machine learning, and more. This article showed how to use it to get the domain, the root domain, and the protocol from a URL with urlparse.