Last modified: May 28, 2026
Python JSON Dump UTF-8 Guide
JSON is a popular data format. Python makes it easy to write JSON files. But handling special characters like é or 中文 requires proper encoding. UTF-8 is the standard. This guide shows you how to use json.dump with UTF-8.
What is json.dump?
json.dump serializes a Python object and writes it to a file object. It takes two main arguments: the data and an open, writable text file. By default (ensure_ascii=True), it escapes every non-ASCII character as a \uXXXX sequence, regardless of the file's encoding. That makes your file hard to read.
For example, the character ñ becomes \u00f1. This is valid JSON but not human-friendly. Passing ensure_ascii=False and opening the file as UTF-8 keeps the original characters.
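The difference is easy to see in a minimal sketch. It uses json.dumps, the string-returning counterpart of json.dump, only so the result can be printed; the dictionary is made up for illustration.

```python
import json

text = {"letter": "ñ"}  # illustrative sample data

escaped = json.dumps(text)                       # default: ensure_ascii=True
readable = json.dumps(text, ensure_ascii=False)  # keep the character as-is

print(escaped)   # {"letter": "\u00f1"}
print(readable)  # {"letter": "ñ"}

# Both strings parse back to the same Python object.
same = json.loads(escaped) == json.loads(readable) == text
print(same)      # True
```

Either form is valid JSON for any parser; the choice only affects how the file looks and how many bytes it takes.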
Why Use UTF-8 with JSON?
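UTF-8 can encode every Unicode character. It is the dominant encoding on the web, and RFC 8259 requires it for JSON exchanged between systems. When you write JSON files for APIs or data exchange, UTF-8 ensures compatibility.
When no encoding is given, Python's open() function uses the locale's preferred encoding. On Windows, that is often cp1252. Writing a character that the code page cannot represent raises UnicodeEncodeError, and characters it can represent are stored as non-UTF-8 bytes. Always specify encoding='utf-8'.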
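To check what a given machine would use, the sketch below prints the locale's preferred encoding and compares how one character is stored under UTF-8 and cp1252. It assumes locale.getpreferredencoding(False) reflects the default that open() applies, which holds for typical setups but can differ when UTF-8 mode is enabled.

```python
import locale

# The encoding open() typically falls back to when none is given.
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)  # varies by platform and settings

# The same character produces different bytes under each encoding.
text = "ñ"
utf8_bytes = text.encode("utf-8")
cp1252_bytes = text.encode("cp1252")

print(utf8_bytes)    # b'\xc3\xb1'
print(cp1252_bytes)  # b'\xf1'
```

A program that reads the cp1252 bytes as UTF-8 will not get ñ back, which is why relying on the platform default is fragile.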
Basic Example of json.dump with UTF-8
Here is a simple example. We write a dictionary with special characters to a file.
import json
data = {
    "name": "José",
    "city": "São Paulo",
    "language": "Python"
}
# Open file with UTF-8 encoding
with open("data.json", "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False)
The ensure_ascii=False parameter is crucial. Without it, Python escapes non-ASCII characters. With it, the file contains readable text.
Output file content:
{"name": "José", "city": "São Paulo", "language": "Python"}
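One way to confirm that the file really holds UTF-8 text rather than escapes is to inspect its raw bytes. This is a small sketch that writes to a temporary directory; the scratch path is an assumption made so the example cleans up after itself.

```python
import json
import tempfile
from pathlib import Path

data = {"name": "José", "city": "São Paulo", "language": "Python"}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.json"  # hypothetical scratch location
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False)
    raw = path.read_bytes()

# é is stored as the two UTF-8 bytes C3 A9, not as the escape \u00e9.
has_utf8_bytes = b"Jos\xc3\xa9" in raw
has_escape = b"\\u00e9" in raw

print(has_utf8_bytes)  # True
print(has_escape)      # False
```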
Handling Nested Data
Complex JSON structures work the same way. Lists, nested dictionaries, and mixed types all benefit from UTF-8 encoding.
import json
users = [
    {"id": 1, "name": "Müller"},
    {"id": 2, "name": "François"}
]
with open("users.json", "w", encoding="utf-8") as f:
    json.dump(users, f, ensure_ascii=False, indent=4)
Output with indentation:
[
    {
        "id": 1,
        "name": "Müller"
    },
    {
        "id": 2,
        "name": "François"
    }
]
Common Errors and Solutions
Error: UnicodeEncodeError
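This happens when you pass ensure_ascii=False and the file's encoding cannot represent a character in your data. Without an explicit encoding, the open() function uses the locale default, which may not support your characters. With the default ensure_ascii=True the output is pure ASCII, so the error does not occur, but the file contains escapes.
Solution: Always set encoding='utf-8' when opening the file.
# Risky: uses the locale encoding; may raise UnicodeEncodeError
with open("data.json", "w") as f:
    json.dump(data, f, ensure_ascii=False)
# Correct: specify UTF-8
with open("data.json", "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False)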
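The failure can be reproduced on any platform by opening the file with a narrow encoding on purpose. In this sketch, cp1252 stands in for a Windows locale default (an assumption), and try_dump is a hypothetical helper written only for the demonstration.

```python
import json
import tempfile
from pathlib import Path

data = {"language": "中文"}  # characters that cp1252 cannot represent

def try_dump(path, ensure_ascii):
    """Return True if writing succeeds, False on UnicodeEncodeError."""
    try:
        # cp1252 simulates a non-UTF-8 platform default.
        with open(path, "w", encoding="cp1252") as f:
            json.dump(data, f, ensure_ascii=ensure_ascii)
        return True
    except UnicodeEncodeError:
        return False

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.json"
    escaped_ok = try_dump(path, ensure_ascii=True)    # output is pure ASCII
    readable_ok = try_dump(path, ensure_ascii=False)  # must encode 中文

print(escaped_ok)   # True
print(readable_ok)  # False
```

The escaped write succeeds because every byte is ASCII; only the combination of ensure_ascii=False and an encoding that lacks the character fails.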
Error: File Contains Escaped Unicode
If you forget ensure_ascii=False, your file has \uXXXX sequences. This is valid but ugly.
Solution: Add ensure_ascii=False to json.dump.
Reading UTF-8 JSON Files
To read the files you wrote, open them with the same encoding and pass the file object to json.load.
import json
with open("data.json", "r", encoding="utf-8") as f:
    loaded_data = json.load(f)
print(loaded_data["name"])  # Output: José
Always read a file with the encoding it was written in. A mismatch either raises UnicodeDecodeError or silently produces garbled text.
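A short sketch of what a mismatch looks like: the file is written as UTF-8 and then read once correctly and once as cp1252. The temporary path and the choice of cp1252 as the wrong encoding are assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.json"  # hypothetical scratch location
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"name": "José"}, f, ensure_ascii=False)

    with open(path, "r", encoding="utf-8") as f:
        right = json.load(f)["name"]
    with open(path, "r", encoding="cp1252") as f:  # deliberate mismatch
        wrong = json.load(f)["name"]

print(right)  # José
print(wrong)  # JosÃ©
```

No exception is raised in the second read, so this kind of corruption can go unnoticed until the text is displayed.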
Best Practices for UTF-8 JSON
- Always use encoding='utf-8' in open().
- Set ensure_ascii=False in json.dump.
- Use indent for readability in development.
- Test with sample Unicode data.
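The practices above can be bundled into a pair of small helpers and checked with a round trip. This is one possible sketch; save_json, load_json, and the sample data are hypothetical names chosen for the example.

```python
import json
import tempfile
from pathlib import Path

def save_json(obj, path, pretty=False):
    """Write obj as UTF-8 JSON, keeping non-ASCII characters unescaped."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(obj, f, ensure_ascii=False, indent=4 if pretty else None)

def load_json(path):
    """Read a UTF-8 JSON file back into Python objects."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

# Sample Unicode data covering several scripts and an emoji.
sample = {"es": "ñ", "zh": "中文", "emoji": "🙂", "de": ["Müller", "Straße"]}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.json"
    save_json(sample, path, pretty=True)
    round_tripped = load_json(path)

print(round_tripped == sample)  # True
```

Keeping the encoding in one place means no call site can forget it.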
Performance Considerations
Using UTF-8 adds no meaningful overhead. Python handles encoding efficiently. The ensure_ascii=False parameter usually reduces file size for non-ASCII text, because a six-character \uXXXX escape takes more bytes than the UTF-8 encoding of the same character (two bytes for ñ).
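The size difference can be measured directly. The sketch below compares the UTF-8 byte length of the escaped and unescaped output for a small made-up record; real savings depend on how much non-ASCII text the data contains.

```python
import json

data = {"city": "São Paulo", "language": "中文"}  # illustrative sample data

# Byte length on disk when each form is saved as UTF-8.
escaped_size = len(json.dumps(data).encode("utf-8"))
readable_size = len(json.dumps(data, ensure_ascii=False).encode("utf-8"))

print(escaped_size, readable_size)
print(readable_size < escaped_size)  # True
```

For data that is mostly ASCII, the two sizes are nearly identical.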
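For large files, consider streaming. json.dump sends the encoded output to the file object in chunks, whereas json.dumps builds the whole document as one string in memory first. The Python object itself must still fit in memory in both cases.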
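The chunked behavior can be observed with json.JSONEncoder.iterencode, which yields the output piece by piece. In this sketch an io.StringIO buffer stands in for an open UTF-8 file, and the records are made up.

```python
import io
import json

records = [{"id": i, "name": "Müller"} for i in range(3)]

# json.dumps builds the complete document as one string.
whole = json.dumps(records, ensure_ascii=False)

# json.dump writes to the file object piece by piece instead.
buffer = io.StringIO()  # stands in for an open UTF-8 text file
json.dump(records, buffer, ensure_ascii=False)

# iterencode exposes the same kind of chunked output directly.
chunks = list(json.JSONEncoder(ensure_ascii=False).iterencode(records))

print(buffer.getvalue() == whole)  # True
print("".join(chunks) == whole)    # True
print(len(chunks) > 1)             # True
```

The final text is identical either way; only the peak memory needed for the encoded string differs.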
Conclusion
Using json.dump with UTF-8 is simple. Open the file with encoding='utf-8' and set ensure_ascii=False. This keeps your JSON readable and compatible. Handle errors by checking your encoding settings. With these steps, you can work with any language or symbol in your JSON files.