Last modified: Jan 10, 2023 By Alexander Williams
Remove HTTPS or HTTP From URL Python
To remove HTTPS or HTTP from a URL, we can use:
- the replace() built-in function
- Regex
Remove HTTPS or HTTP from URL using replace()
replace() is a built-in string method that returns a copy of the string with occurrences of one substring replaced by another.
replace() Syntax
string.replace(old_value, new_value, count)
- old_value: the value that we want to be replaced
- new_value: the value that replaces the old value
- count: (optional) the maximum number of occurrences to replace; if omitted, all occurrences are replaced
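To make the count parameter concrete, here is a minimal sketch. The sample string and the variable names are assumptions made up for illustration.

```python
# Illustrative sample text (not from the tutorial)
text = "http://a.com and http://b.com"

all_removed = text.replace("http://", "")     # no count: every occurrence is replaced
first_only = text.replace("http://", "", 1)   # count=1: only the first occurrence is replaced

print(all_removed)  # a.com and b.com
print(first_only)   # a.com and http://b.com
```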
Remove HTTPS or HTTP from URL Examples
In the following example, we'll use replace() to remove HTTPS from a URL string.
url = "https://pytutorial.com" # 👉️ URL
new_url = url.replace("https://", "") # 👉️ Remove "https" from URL
print(new_url) # 👉️ Print
Output:
pytutorial.com
As you can see, to remove HTTPS from the URL, we've replaced https:// with an empty value. We can do the same thing if we want to remove HTTP.
url = "http://pytutorial.com" # 👉️ URL
new_url = url.replace("http://", "") # 👉️ Remove "http" from URL
print(new_url) # 👉️ Print
Output:
pytutorial.com
To handle both cases at once, chain two replace() calls. This works whichever protocol the URL uses, HTTPS or HTTP.
url = "https://pytutorial.com" # 👉️ URL
new_url = url.replace("https://", "").replace("http://", "") # 👉️ Remove "HTTPS" and "HTTP" from URL
print(new_url) # 👉️ Print
Output:
pytutorial.com
Let's look at an example with www in the URL.
url = "https://www.pytutorial.com" # 👉️ URL
new_url = url.replace("https://", "").replace("http://", "") # 👉️ Remove "HTTPS" and "HTTP"
print(new_url) # 👉️ Print
Output:
www.pytutorial.com
If you want to remove www as well, chain another replace() call, as in the following example.
url = "https://www.pytutorial.com" # 👉️ URL
new_url = url.replace("https://", "").replace("http://", "").replace("www.", "") # 👉️ Remove "HTTPS" and "HTTP" and "WWW"
print(new_url) # 👉️ Print
Output:
pytutorial.com
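One caveat with the chained replace() calls above: replace() substitutes the text wherever it occurs, not only at the start of the string. The sketch below compares it with a prefix-only approach. The strip_prefixes helper and the example.com URL are hypothetical and used only for illustration.

```python
def strip_prefixes(url, prefixes=("https://", "http://", "www.")):
    """Hypothetical helper: remove each prefix only if the string starts with it."""
    for prefix in prefixes:
        if url.startswith(prefix):
            url = url[len(prefix):]
    return url

# Illustrative URL that carries a second URL in its query string
url = "https://example.com/redirect?to=http://www.other.com"

chained = url.replace("https://", "").replace("http://", "").replace("www.", "")
prefix_only = strip_prefixes(url)

print(chained)      # example.com/redirect?to=other.com
print(prefix_only)  # example.com/redirect?to=http://www.other.com
```

The prefix-only version leaves the rest of the URL untouched, which is usually what we want when the URL may contain another address.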
Remove HTTPS or HTTP from URL using Regex
re.sub() is a function from the re module that replaces every match of a regular expression with a string. We can use it to remove HTTPS or HTTP from URLs.
sub() Syntax
re.sub(pattern, new_value, string, count=0, flags=0)
re.sub() accepts five parameters, but the most important for us are:
- pattern: the regular expression that you want to match
- new_value: the value that replaces each match
- string: the target string
- count: the maximum number of replacements; 0 (the default) replaces every match
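The effect of count and flags can be seen in a short sketch. The sample text and variable names are assumptions; re.IGNORECASE is a standard flag of the re module.

```python
import re

# Illustrative sample text (not from the tutorial)
text = "HTTP://a.com and https://b.com"
pattern = r"https?://"  # "s?" makes the "s" optional, so both http:// and https:// match

case_sensitive = re.sub(pattern, "", text)
ignore_case = re.sub(pattern, "", text, flags=re.IGNORECASE)
first_only = re.sub(pattern, "", text, count=1, flags=re.IGNORECASE)

print(case_sensitive)  # HTTP://a.com and b.com
print(ignore_case)     # a.com and b.com
print(first_only)      # a.com and https://b.com
```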
Remove HTTPS or HTTP from URL Regex Examples
In the following example, we'll remove HTTPS and HTTP from the URL.
import re # 👉️ Import re module
url = "https://www.pytutorial.com" # 👉️ URL
pattern = "https?://" # 👉️ pattern
new_url = re.sub(pattern, "", url) # 👉️ Remove HTTPS and HTTP from URL
print(new_url) # 👉️ Print
Output:
www.pytutorial.com
In the following example, we'll also remove www from the URL.
import re # 👉️ Import re module
url = "https://www.pytutorial.com" # 👉️ URL
pattern = r"^https?://(www\.)?" # 👉️ pattern: "www." is optional and the dot is escaped
new_url = re.sub(pattern, "", url) # 👉️ Remove HTTPS and HTTP and WWW from URL
print(new_url) # 👉️ Print
Output:
pytutorial.com
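Two details are worth checking in any pattern of this kind: an unescaped . matches any character, and a www that is not wrapped in an optional group must be present for the pattern to match at all. The sketch below compares two patterns on a few inputs. The names loose and strict and the test URLs are assumptions for illustration.

```python
import re

loose = r"https?://www.?"       # "www" is required; "." matches any single character
strict = r"^https?://(www\.)?"  # anchored at the start; "www." is optional; the dot is literal

urls = (
    "https://www.pytutorial.com",
    "https://pytutorial.com",
    "http://wwwx.pytutorial.com",
)

results = {url: (re.sub(loose, "", url), re.sub(strict, "", url)) for url in urls}

for url, (loose_out, strict_out) in results.items():
    print(url, "->", loose_out, "|", strict_out)
# https://www.pytutorial.com -> pytutorial.com | pytutorial.com
# https://pytutorial.com -> https://pytutorial.com | pytutorial.com
# http://wwwx.pytutorial.com -> .pytutorial.com | wwwx.pytutorial.com
```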
Remove HTTPS or HTTP from URL Script
Let's write a simple function that removes HTTPS, HTTP, and WWW from a URL using Regex.
import re # 👉️ Import re module
def rm_https(url):
    return re.sub(r"^https?://(www\.)?", "", url) # 👉️ Remove HTTPS and HTTP and WWW from URL
This function accepts a URL string as its parameter. Now let's see how to use it.
url = "https://www.pytutorial.com" # 👉️ URL
rm = rm_https(url) # 👉️ Remove HTTPS and HTTP and WWW from URL
print(rm) # 👉️ Print
Output:
pytutorial.com
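As an alternative to pattern matching, the standard library's urllib.parse module can split a URL into its components, so the scheme can be dropped without a regex. This is a minimal sketch, not part of the tutorial's script. The strip_scheme helper is a hypothetical name, and it leaves www in place.

```python
from urllib.parse import urlsplit

def strip_scheme(url):
    """Hypothetical helper: return the URL without a leading http:// or https://."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return url  # no HTTP(S) scheme found: leave the input untouched
    # Rebuild everything that follows the scheme: host, path, query, fragment
    rest = parts.netloc + parts.path
    if parts.query:
        rest += "?" + parts.query
    if parts.fragment:
        rest += "#" + parts.fragment
    return rest

print(strip_scheme("https://www.pytutorial.com"))      # www.pytutorial.com
print(strip_scheme("http://pytutorial.com/a?b=1#c"))   # pytutorial.com/a?b=1#c
print(strip_scheme("pytutorial.com"))                  # pytutorial.com
```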
Conclusion
In conclusion, we've learned two ways to remove HTTPS and HTTP from URLs. The first is the replace() string method, and the second is re.sub().
If you only need to remove a fixed string such as https://, replace() is recommended because it is simpler and typically faster. On the other hand, if you need to match several variants at once (HTTP or HTTPS, with or without www), re.sub() is your best choice.
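To check the speed difference on your own machine, a rough comparison with the standard timeit module might look like the sketch below. The function names and the iteration count are assumptions, and the timings will vary between systems and Python versions.

```python
import re
import timeit

url = "https://www.pytutorial.com"
pattern = re.compile(r"https?://")  # compiled once so only the substitution is timed

def with_replace():
    return url.replace("https://", "").replace("http://", "")

def with_regex():
    return pattern.sub("", url)

# Each call runs the function 100,000 times and returns the total time in seconds
print("replace():", timeit.timeit(with_replace, number=100_000))
print("re.sub(): ", timeit.timeit(with_regex, number=100_000))
```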
You can download the code for this tutorial on GitHub.
Happy Coding </>