Last modified: Nov 08, 2024 By Alexander Williams

Python Regex Quantifiers: The Complete Guide

Regular expressions are powerful tools for pattern matching, and quantifiers make them even more flexible. In Python's re module, quantifiers help specify how many times a pattern should match.

Understanding the Asterisk (*) Quantifier

The asterisk (*) matches zero or more occurrences of the preceding pattern. Because zero occurrences also count as a match, it suits elements that may be absent or repeated any number of times.


import re

text = "color colour colouur"
pattern = r"colou*r"
matches = re.findall(pattern, text)
print(matches)


['color', 'colour', 'colouur']
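
To make the "zero or more" behavior concrete, here is a minimal sketch that checks a few hypothetical sample strings against ab*c. The strings and the variable names are illustrative assumptions, not part of the example above.

```python
import re

# Hypothetical samples: zero, one, and three "b" characters, plus a non-match.
candidates = ["ac", "abc", "abbbc", "adc"]
pattern = r"ab*c"

# re.fullmatch succeeds only when the entire string fits the pattern.
star_results = {s: bool(re.fullmatch(pattern, s)) for s in candidates}
print(star_results)
# {'ac': True, 'abc': True, 'abbbc': True, 'adc': False}
```

Note that "ac" matches even though it contains no "b" at all, which is exactly what separates * from +.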

The Plus (+) Quantifier

The plus (+) matches one or more occurrences of the preceding pattern. Unlike *, it requires at least one match to be present. This is useful when you need to ensure a pattern exists.


import re

text = "file1.txt file2.txt file.txt file"
pattern = r"file\d+\.txt"
print(re.findall(pattern, text))


['file1.txt', 'file2.txt']
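
The difference from * is easiest to see by running both quantifiers over the same text. The sketch below reuses the sample string from above; the variable names are only illustrative.

```python
import re

text = "file1.txt file2.txt file.txt file"

# + requires at least one digit between "file" and ".txt".
plus_matches = re.findall(r"file\d+\.txt", text)
# * also accepts the case with no digits at all.
star_matches = re.findall(r"file\d*\.txt", text)

print(plus_matches)  # ['file1.txt', 'file2.txt']
print(star_matches)  # ['file1.txt', 'file2.txt', 'file.txt']
```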

The Question Mark (?) Quantifier

The question mark (?) makes the preceding pattern optional, matching zero or one occurrence. It's perfect for handling optional characters or patterns in your text.


import re

text = "analyze analyse analyzed analysed"
pattern = r"analy[sz]ed?"  # "s" or "z", then "e", then an optional "d"
print(re.findall(pattern, text))


['analyze', 'analyse', 'analyzed', 'analysed']
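
As a smaller illustration of "zero or one", the following sketch applies ? to the color/colour text from the asterisk section. It is meant only to show that ? stops at a single optional character.

```python
import re

text = "color colour colouur"

# u? allows no "u" or exactly one "u", so "colouur" is rejected.
optional_u = re.findall(r"colou?r", text)
print(optional_u)  # ['color', 'colour']
```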

Using Curly Braces {}

Curly braces allow you to specify exact quantities or ranges for pattern matching. They offer the most precise control over repetition.

Python accepts four brace forms:

  • {n}: Exactly n occurrences
  • {n,}: At least n occurrences
  • {,m}: At most m occurrences (from zero up to m)
  • {n,m}: From n to m occurrences, inclusive

import re

text = "ab abc abbc abbbc abbbbc"
pattern1 = r"ab{2}c"    # Exactly 2 b's
pattern2 = r"ab{2,}c"   # 2 or more b's
pattern3 = r"ab{1,3}c"  # 1 to 3 b's

print(re.findall(pattern1, text))
print(re.findall(pattern2, text))
print(re.findall(pattern3, text))


['abbc']
['abbc', 'abbbc', 'abbbbc']
['abc', 'abbc', 'abbbc']
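
Curly braces can also express the three shorthand quantifiers: * behaves like {0,}, + like {1,}, and ? like {0,1}. The sketch below compares each pair on a hypothetical sample string; the names text, equivalents, and comparison are assumptions made for the illustration.

```python
import re

text = "ac abc abbc abbbc"

# Each shorthand quantifier paired with its curly-brace equivalent.
equivalents = [
    (r"ab*c", r"ab{0,}c"),
    (r"ab+c", r"ab{1,}c"),
    (r"ab?c", r"ab{0,1}c"),
]

comparison = []
for shorthand, braces in equivalents:
    short_result = re.findall(shorthand, text)
    brace_result = re.findall(braces, text)
    comparison.append((shorthand, short_result, short_result == brace_result))
    print(shorthand, short_result, short_result == brace_result)
# ab*c ['ac', 'abc', 'abbc', 'abbbc'] True
# ab+c ['abc', 'abbc', 'abbbc'] True
# ab?c ['ac', 'abc'] True
```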

Combining Quantifiers with Other Regex Features

You can combine quantifiers with other regex features, such as character classes and escaped literals, for powerful text processing. The pattern below is a simplified email matcher rather than a full address validator.


import re

text = "email@domain.com user.name@site.co.uk test@test"
pattern = r"[\w.-]+@[\w.-]+\.\w+"  # local part, "@", domain, and a final ".tld"
print(re.findall(pattern, text))


['email@domain.com', 'user.name@site.co.uk']
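
Quantifiers also pair well with capture groups when you want to extract pieces of a match. The sketch below assumes ISO-style dates in a made-up sentence; the text and the date_pattern name are hypothetical.

```python
import re

# Hypothetical sample text containing two YYYY-MM-DD dates.
text = "Released 2024-11-08, patched 2024-12-01."

# {4} and {2} fix the digit counts; each group captures one date component.
date_pattern = r"(\d{4})-(\d{2})-(\d{2})"
dates = re.findall(date_pattern, text)
print(dates)  # [('2024', '11', '08'), ('2024', '12', '01')]
```

When a pattern contains several groups, re.findall returns one tuple per match instead of the full matched string.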

Best Practices and Tips

Always use raw strings (r"pattern") when writing regex patterns to avoid issues with backslashes. Consider using re.compile for patterns you'll use repeatedly.
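
A minimal sketch of both tips follows, assuming a small list of made-up input lines. The compiled pattern is reused across lines, and the final print shows why raw strings matter.

```python
import re

# Compile once, then reuse the pattern object for every line.
digits_txt = re.compile(r"file\d+\.txt")

# Illustrative input lines, not taken from any real data set.
lines = ["file1.txt file.txt", "no match here", "file22.txt file333.txt"]
compiled_results = [digits_txt.findall(line) for line in lines]
print(compiled_results)
# [['file1.txt'], [], ['file22.txt', 'file333.txt']]

# In a normal string "\b" is a single backspace character; in a raw string
# it stays as backslash + "b", which the regex engine reads as a word boundary.
print(len("\b"), len(r"\b"))  # 1 2
```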

When working with special characters, remember to escape them properly using backslashes or the re.escape() function.
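
Quantifier characters are themselves special, so a literal "+" needs escaping. The sketch below, using a hypothetical sample string, compares an unescaped search with one passed through re.escape().

```python
import re

# Hypothetical sample text; we want to find the literal string "1+1".
text = "1+1=2, 11, 111"
literal = "1+1"

# Unescaped, "+" acts as a quantifier: one or more "1", followed by "1".
unescaped = re.findall(literal, text)
# re.escape() neutralizes the "+", so only the literal text matches.
escaped = re.findall(re.escape(literal), text)

print(unescaped)  # ['11', '111']
print(escaped)    # ['1+1']
```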

Conclusion

Regex quantifiers are essential tools for flexible pattern matching in Python. Understanding how to use *, +, ?, and {} effectively will help you create more precise and powerful regular expressions.