Last modified: Nov 08, 2024 By Alexander Williams

Python re.finditer: Iterate Over Pattern Matches in Text

Python's re.finditer finds all non-overlapping matches of a pattern in a string and returns an iterator of match objects, whereas re.findall returns a list of strings (or tuples of strings when the pattern has several groups).

Understanding re.finditer Basics

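Unlike re.search, which returns only the first match, re.finditer yields every match, scanning the string from left to right. Because matches are produced one at a time, it never builds a full list of results, which helps when processing large texts.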

Basic Syntax

re.finditer takes a pattern and a string (plus optional flags) and returns an iterator of match objects:


import re
text = "Python is awesome. Python is powerful."
pattern = r"Python"
matches = re.finditer(pattern, text)
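
The call above returns a lazy iterator rather than a list, so the matches only appear once the iterator is consumed. A minimal sketch of that behavior, reusing the same text and pattern (the names first and remaining are illustrative, not part of the re module):

```python
import re

text = "Python is awesome. Python is powerful."
pattern = r"Python"

matches = re.finditer(pattern, text)

# The iterator yields match objects one at a time.
first = next(matches)
print(first.group(), first.span())  # Python (0, 6)

# Whatever has not been consumed yet can still be collected.
remaining = list(matches)
print(len(remaining))  # 1

# The iterator is now exhausted; iterating again yields nothing.
print(list(matches))  # []
```

If the matches are needed more than once, store them in a list or call re.finditer again.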

Working with Match Objects

Each match object exposes the matched text, its start and end positions, and any captured groups:


import re

text = "Python is awesome. Python is powerful."
pattern = r"Python"

for match in re.finditer(pattern, text):
    print(f"Match found at position {match.start()}-{match.end()}: {match.group()}")


Match found at position 0-6: Python
Match found at position 19-25: Python

Using Groups with re.finditer

You can use capturing groups to extract specific parts of matches:


import re

text = "Email: john@example.com, Contact: alice@email.com"
pattern = r"(\w+)@(\w+)\.com"

for match in re.finditer(pattern, text):
    print(f"Full match: {match.group()}")
    print(f"Username: {match.group(1)}")
    print(f"Domain: {match.group(2)}\n")


Full match: john@example.com
Username: john
Domain: example

Full match: alice@email.com
Username: alice
Domain: email
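
Numbered groups can get hard to follow as patterns grow. As a possible variation on the example above, the sketch below names the groups with the (?P<name>...) syntax and reads them through match.groupdict(). The group names user and domain are chosen here for illustration only.

```python
import re

text = "Email: john@example.com, Contact: alice@email.com"
# (?P<name>...) gives each capturing group a name.
pattern = r"(?P<user>\w+)@(?P<domain>\w+)\.com"

# groupdict() maps each group name to the text it captured.
contacts = [match.groupdict() for match in re.finditer(pattern, text)]
print(contacts)
# [{'user': 'john', 'domain': 'example'}, {'user': 'alice', 'domain': 'email'}]
```

Named groups remain accessible by number as well, so match.group(1) and match.group("user") return the same text.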

Memory Efficiency

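re.findall builds the complete list of results before returning, while re.finditer produces one match object at a time. When a text contains many matches, this keeps memory use low and lets you stop early without scanning the rest of the string.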
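
A rough sketch of the difference, assuming a made-up input of 100,000 repeated log lines. It uses itertools.islice to take only the first three matches from the iterator:

```python
import re
from itertools import islice

# Hypothetical large input: 100,000 repetitions of a short log line.
text = "error: disk full\n" * 100_000
pattern = r"error: (\w+)"

# re.findall builds the complete list before returning.
all_matches = re.findall(pattern, text)
print(len(all_matches))  # 100000

# re.finditer yields matches lazily, so taking the first three
# stops the scan right after the third match.
first_three = [m.group(1) for m in islice(re.finditer(pattern, text), 3)]
print(first_three)  # ['disk', 'disk', 'disk']
```

Note that the input string itself still has to be in memory in both cases. The saving comes from not holding every result at once.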

Position Information

Unlike re.findall, which returns only the matched text, re.finditer yields match objects that carry the position of each match:


import re

text = "Price: $10.99, Cost: $20.50"
pattern = r"\$\d+\.\d+"

for match in re.finditer(pattern, text):
    span = match.span()
    print(f"Found {match.group()} at positions {span}")


Found $10.99 at positions (7, 13)
Found $20.50 at positions (21, 27)
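
Spans are useful for more than reporting. As one illustrative use, the sketch below slices the original string around each span to wrap every price in brackets. The names pieces, last_end, and highlighted are hypothetical helpers for this example.

```python
import re

text = "Price: $10.99, Cost: $20.50"
pattern = r"\$\d+\.\d+"

pieces = []
last_end = 0
for match in re.finditer(pattern, text):
    start, end = match.span()
    # Keep the text between the previous match and this one unchanged.
    pieces.append(text[last_end:start])
    # text[start:end] is exactly the matched substring.
    pieces.append("[" + text[start:end] + "]")
    last_end = end
# Append whatever follows the final match.
pieces.append(text[last_end:])

highlighted = "".join(pieces)
print(highlighted)  # Price: [$10.99], Cost: [$20.50]
```

For a plain substitution like this one, re.sub would be shorter. The span-based loop is worth the extra lines when each match needs position-dependent handling.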

Conclusion

re.finditer is ideal for iterative pattern matching in Python, offering memory efficiency and detailed match information through its iterator-based approach.

Use re.finditer when you need to process matches one at a time or when working with large texts where memory usage is a concern.