Last modified: Apr 25, 2026 By Alexander Williams
Python Split By Multiple Characters
Python's str.split() method is great for splitting strings by a single delimiter. But what if you need to split by multiple characters, like commas, semicolons, and spaces? This is a common task for cleaning data or parsing logs. This guide covers the best ways to do it, with clear examples for beginners.
Why You Need Multiple Delimiters
Real-world data is messy. A single file might use different separators. For example, user input can contain commas, pipes, or newlines. You need a method that handles all these at once.
Python offers two main approaches: using the re.split() function from the regular expressions module, or chaining str.split() with str.replace(). We'll explore both.
Method 1: Using re.split() (Recommended)
The re.split() function is the most powerful and flexible way to split by multiple characters. It uses a pattern to define all delimiters at once.
First, import the re module. Then, list the delimiters inside square brackets []. The brackets form a character class, which matches any one of the listed characters. For example, [,; ] splits by comma, semicolon, or space.
import re
text = "apple,banana;grape orange"
# Split by comma, semicolon, or space
result = re.split(r'[,; ]', text)
print(result)
['apple', 'banana', 'grape', 'orange']
You can add as many characters as needed inside the brackets. For instance, [,;:\n\t] splits by comma, semicolon, colon, newline, and tab.
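As a quick illustration of the larger character class mentioned above, here is a minimal sketch. The sample string is made up for the example.

```python
import re

# Hypothetical sample that mixes five different separators.
text = "id:42,name;alice\trole\nadmin"

# One character class covers comma, semicolon, colon, newline, and tab.
fields = re.split(r'[,;:\n\t]', text)
print(fields)  # ['id', '42', 'name', 'alice', 'role', 'admin']
```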
Consecutive delimiters are not merged by default. Each delimiter is its own split point, so re.split() returns an empty string between two adjacent delimiters. You can filter these out afterward.
import re
text = "a,,b,,c"
result = re.split(r'[,]', text)
print(result) # Notice empty strings
['a', '', 'b', '', 'c']
To remove empty strings, use a list comprehension: [item for item in result if item].
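The sketch below shows that filter next to one possible alternative: adding + after the pattern so a run of delimiters counts as a single split point. Which one fits depends on whether empty fields carry meaning in your data.

```python
import re

text = "a,,b,,c"

# Option 1: split on each comma, then drop the empty strings.
filtered = [item for item in re.split(r',', text) if item]

# Option 2: treat a run of commas as a single delimiter.
collapsed = re.split(r',+', text)

print(filtered)   # ['a', 'b', 'c']
print(collapsed)  # ['a', 'b', 'c']

# Caveat: a leading or trailing delimiter still yields an empty string.
edges = re.split(r',+', ",a,b,")
print(edges)      # ['', 'a', 'b', '']
```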
Method 2: Chaining str.split() and str.replace()
If you don't want to use regular expressions, you can chain str.replace() to turn all delimiters into one, then split. This is simpler but less efficient for many delimiters.
Replace all delimiters with a common one, like a space, then split by that.
text = "apple,banana;grape orange"
# Replace comma and semicolon with space
temp = text.replace(',', ' ').replace(';', ' ')
result = temp.split()
print(result)
['apple', 'banana', 'grape', 'orange']
This works well for a few delimiters. But for many, the code becomes long and hard to read. Also, str.split() without arguments splits by any whitespace and removes empty strings automatically.
Be careful: str.replace() creates a new string each time. For large text, this can be slower than re.split().
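If the chain of replace() calls grows, one regex-free option is str.translate(), which maps every delimiter to a space in a single pass. This is a different technique from the replace() chain above, and split_on_any is a hypothetical helper name used only for this sketch. It works for single-character delimiters only.

```python
def split_on_any(text, delimiters):
    """Split text on any single-character delimiter, without regex.

    Hypothetical helper for illustration, not a standard library function.
    """
    # Build a table that maps each delimiter character to a space.
    table = str.maketrans({d: ' ' for d in delimiters})
    # split() with no arguments also splits on existing whitespace
    # and drops empty strings.
    return text.translate(table).split()


result = split_on_any("apple,banana;grape|orange kiwi", ",;|")
print(result)  # ['apple', 'banana', 'grape', 'orange', 'kiwi']
```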
Handling Special Characters in Patterns
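Some characters have special meaning in regular expressions, like ., *, +, ?, |, (, ), [, ], {, }, ^, $, \. Outside square brackets, you must escape them with a backslash \ to use them as literal delimiters. Inside square brackets, most of them are already literal. Only ], \, ^ (when it comes first), and - (between two characters) need escaping there. Escaping the others is harmless.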
import re
text = "word1.word2?word3"
# Split by dot or question mark
result = re.split(r'[\.\?]', text)
print(result)
['word1', 'word2', 'word3']
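Notice the backslash before the dot and question mark. Inside square brackets these backslashes are optional, so r'[.?]' gives the same result. Outside brackets they are required, because an unescaped dot matches any character.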
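When the delimiters come from a variable, or when some of them are longer than one character, escaping by hand gets error-prone. A minimal sketch using re.escape() and alternation follows. The delimiter list and sample text are made up for the example.

```python
import re

# Hypothetical delimiters: two are multi-character, two contain metacharacters.
delimiters = ["->", "||", "."]

# Escape each delimiter so it is matched literally, then join them with |.
# Sorting longest-first keeps a long delimiter from being shadowed
# by a shorter one that is its prefix.
ordered = sorted(delimiters, key=len, reverse=True)
pattern = "|".join(re.escape(d) for d in ordered)

text = "a->b||c.d"
parts = re.split(pattern, text)
print(parts)  # ['a', 'b', 'c', 'd']
```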
Using re.split() with Capturing Groups
Sometimes you want to keep the delimiters in the result. Use a capturing group by wrapping the pattern in parentheses ().
import re
text = "a,b;c"
result = re.split(r'(,|;)', text)
print(result)
['a', ',', 'b', ';', 'c']
This is useful when you need to reconstruct the string or analyze separators.
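As a small sketch of both uses: joining the pieces restores the original string, and slicing separates the fields from the delimiters.

```python
import re

text = "a,b;c"
parts = re.split(r'(,|;)', text)

# Joining the pieces restores the original string, delimiters included.
assert ''.join(parts) == text

# Even indexes hold the fields, odd indexes hold the delimiters.
fields = parts[0::2]
separators = parts[1::2]
print(fields)      # ['a', 'b', 'c']
print(separators)  # [',', ';']
```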
Practical Example: Parsing a CSV with Mixed Delimiters
Imagine you have a CSV file where some lines use commas, others use semicolons, and there are spaces. Using re.split() makes it easy.
import re
csv_line = "John Doe, 25; New York, USA"
# Split on a comma or semicolon, plus any whitespace that follows it
parts = re.split(r'[,;]\s*', csv_line)
print(parts)
['John Doe', '25', 'New York', 'USA']
The pattern [,;]\s* matches a comma or semicolon together with any whitespace that follows it, so the extra spaces are consumed as part of the delimiter. This cleans the data in one step.
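Real lines may also have spaces before a delimiter or at the ends of the line. One way to handle that, sketched here with made-up rows, is to allow whitespace on both sides of the delimiter and strip each line first. Note that this does not handle quoted fields. For real CSV files, the standard csv module is the safer choice.

```python
import re

# Hypothetical rows; the second has padding around the line and before delimiters.
lines = [
    "John Doe, 25; New York, USA",
    "  Jane Roe ;31 ,Paris;  France  ",
]

# \s* on both sides absorbs spaces before and after each delimiter.
# strip() removes padding at the ends of the line.
rows = [re.split(r'\s*[,;]\s*', line.strip()) for line in lines]

for row in rows:
    print(row)
# ['John Doe', '25', 'New York', 'USA']
# ['Jane Roe', '31', 'Paris', 'France']
```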
Performance Considerations
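For simple splits with one or two delimiters, str.replace() chaining is fast and readable. For complex patterns or many delimiters, re.split() is better. In CPython the regex engine is implemented in C, so it's efficient even for large strings.
If you're processing huge files, consider compiling the pattern once with re.compile() and reusing it. The re module already caches recently used patterns, so the gain is usually modest, but a compiled pattern skips the cache lookup on every call.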
import re
pattern = re.compile(r'[,;:\s]+')
text = "a,b;c:d e"
result = pattern.split(text)
print(result)
['a', 'b', 'c', 'd', 'e']
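In a file-processing loop, the compiled pattern would typically be created once and applied to every line. The following is a rough sketch under that assumption. DELIMITERS and split_lines are hypothetical names, and the sample list stands in for lines read from a file.

```python
import re

# Compile once, outside the loop.
DELIMITERS = re.compile(r'[,;:\s]+')


def split_lines(lines):
    """Yield the non-empty fields of each line (hypothetical helper)."""
    for line in lines:
        # A leading or trailing delimiter produces '' at the ends, so filter.
        yield [field for field in DELIMITERS.split(line) if field]


sample = ["a,b;c:d e", ";x, y\n", "  "]
result = list(split_lines(sample))
print(result)  # [['a', 'b', 'c', 'd', 'e'], ['x', 'y'], []]
```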
Common Mistakes to Avoid
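Beginners often forget to import the re module. Another mistake is passing several delimiters to str.split() as one string. For example, text.split(',;') does not split on a comma or a semicolon. It splits only where the exact two-character sequence ,; appears.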
Also, be careful with whitespace. str.split() without arguments splits by any whitespace and removes empty strings. But re.split(r'[ ]', text) only splits by a single space, not tabs or newlines.
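Both pitfalls are easy to reproduce. The short sketch below contrasts each mistake with a working alternative, using made-up sample strings.

```python
import re

text = "a,b;c"

# str.split() treats ",;" as one two-character separator, which never occurs here.
wrong = text.split(",;")
right = re.split(r'[,;]', text)
print(wrong)  # ['a,b;c']
print(right)  # ['a', 'b', 'c']

spaced = "one two\tthree\nfour"

# [ ] matches only the space character, so the tab and newline stay in place.
space_only = re.split(r'[ ]', spaced)
# \s+ matches any run of whitespace, much like str.split() with no arguments.
any_space = re.split(r'\s+', spaced)
print(space_only)      # ['one', 'two\tthree\nfour']
print(any_space)       # ['one', 'two', 'three', 'four']
print(spaced.split())  # ['one', 'two', 'three', 'four']
```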
Conclusion
Splitting strings by multiple characters in Python is easy with re.split(). It's flexible, handles special characters, and can even keep delimiters. For simple cases, chaining str.replace() works too. Practice with your own data to see which method fits best. Remember to escape special characters and filter empty strings if needed.
For more on handling text data, see our Python Character Encoding Guide for Beginners. It covers common encoding issues that arise when reading text files.