Last modified: May 28, 2026

Python JSON Dump UTF-8 Guide

JSON is a popular data format. Python makes it easy to write JSON files. But handling special characters like é or 中文 requires proper encoding. UTF-8 is the standard. This guide shows you how to use json.dump with UTF-8.

What is json.dump?

json.dump serializes a Python object and writes it to a file. It takes two main arguments: the data and a file object opened for writing in text mode. By default (ensure_ascii=True), it escapes every non-ASCII character as a \uXXXX sequence. That makes your file hard to read.

For example, the character ñ becomes \u00f1. This is valid JSON but not human-friendly. Passing ensure_ascii=False keeps the original characters, and opening the file as UTF-8 lets Python write them to disk.
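
A quick way to see the difference is to compare both settings with json.dumps, which returns the same text that json.dump would write to a file. This is a minimal sketch; the variable names are only illustrative.

```python
import json

# Default behaviour: ensure_ascii=True escapes every non-ASCII character.
escaped = json.dumps({"letter": "ñ"})

# ensure_ascii=False keeps the original character in the output.
readable = json.dumps({"letter": "ñ"}, ensure_ascii=False)

print(escaped)   # {"letter": "\u00f1"}
print(readable)  # {"letter": "ñ"}

# Both forms decode to the same Python object.
print(json.loads(escaped) == json.loads(readable))  # True
```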

Why Use UTF-8 with JSON?

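UTF-8 can represent every Unicode character. It is the default encoding for modern web applications, and the JSON standard (RFC 8259) requires UTF-8 for JSON exchanged between systems outside a closed ecosystem. When you write JSON files for APIs or data exchange, UTF-8 ensures compatibility.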

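When you omit the encoding argument, Python's open() function uses the platform's default encoding. On Windows, that may be cp1252. Characters outside that encoding, such as 中文, then raise an error when written, and files can be misread on another system. Always specify encoding='utf-8'.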
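
To check what your own platform would use, you can ask the locale module and test whether a given string fits a given encoding. The sketch below is illustrative; fits_in is a hypothetical helper, not part of the standard library.

```python
import locale

# The encoding open() falls back to when no encoding argument is given.
# Typical values are "UTF-8" on Linux and macOS, "cp1252" on many Windows setups.
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)


def fits_in(text, encoding):
    """Return True if every character in text can be encoded with encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# cp1252 covers Western European letters but not Chinese characters.
print(fits_in("José", "cp1252"))  # True
print(fits_in("中文", "cp1252"))  # False
print(fits_in("中文", "utf-8"))   # True
```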

Basic Example of json.dump with UTF-8

Here is a simple example. We write a dictionary with special characters to a file.

import json

data = {
    "name": "José",
    "city": "São Paulo",
    "language": "Python"
}

# Open file with UTF-8 encoding
with open("data.json", "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False)

The ensure_ascii=False parameter is crucial. Without it, Python escapes non-ASCII characters. With it, the file contains readable text.

Output file content:

{"name": "José", "city": "São Paulo", "language": "Python"}

Handling Nested Data

Complex JSON structures work the same way. Lists, nested dictionaries, and mixed types all benefit from UTF-8 encoding.

import json

users = [
    {"id": 1, "name": "Müller"},
    {"id": 2, "name": "François"}
]

with open("users.json", "w", encoding="utf-8") as f:
    json.dump(users, f, ensure_ascii=False, indent=4)

Output with indentation:

[
    {
        "id": 1,
        "name": "Müller"
    },
    {
        "id": 2,
        "name": "François"
    }
]

Common Errors and Solutions

Error: UnicodeEncodeError

This happens when you pass ensure_ascii=False and the file's encoding cannot represent one of your characters. Without an encoding argument, open() uses a platform default such as cp1252, which cannot encode text like 中文.

Solution: Always set encoding='utf-8' when opening the file.

# Wrong: may raise UnicodeEncodeError if the default encoding lacks a character
with open("data.json", "w") as f:
    json.dump(data, f, ensure_ascii=False)

# Correct: specify UTF-8
with open("data.json", "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False)
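
The error can be reproduced on any platform by opening the file with a deliberately limited encoding. In this sketch, encoding="ascii" stands in for a platform default that lacks the characters being written; the file name and temporary directory are assumptions for the demo.

```python
import json
import os
import tempfile

data = {"greeting": "中文"}
path = os.path.join(tempfile.mkdtemp(), "demo.json")

# encoding="ascii" simulates a default encoding that cannot represent the text.
try:
    with open(path, "w", encoding="ascii") as f:
        json.dump(data, f, ensure_ascii=False)
    failed = False
except UnicodeEncodeError:
    failed = True

print(failed)  # True

# With the default ensure_ascii=True the output is pure ASCII escapes,
# so the same limited encoding writes it without error.
with open(path, "w", encoding="ascii") as f:
    json.dump(data, f)
```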

Error: File Contains Escaped Unicode

If you forget ensure_ascii=False, your file has \uXXXX sequences. The file is still valid JSON and loads back to the same data, but it is hard to read and edit by hand.

Solution: Add ensure_ascii=False to json.dump.

Reading UTF-8 JSON Files

To read the files you wrote, use json.load with the same encoding.

import json

with open("data.json", "r", encoding="utf-8") as f:
    loaded_data = json.load(f)

print(loaded_data["name"])  # Output: José

Always read with the same encoding you used for writing. A mismatch can raise UnicodeDecodeError or silently produce garbled text.
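
The following sketch shows what a mismatch looks like in practice. It writes a file as UTF-8, then reads it once with cp1252 and once with UTF-8; the temporary path is an assumption for the demo.

```python
import json
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "data.json")

with open(path, "w", encoding="utf-8") as f:
    json.dump({"name": "José"}, f, ensure_ascii=False)

# The wrong encoding does not raise here; it silently garbles the text.
with open(path, "r", encoding="cp1252") as f:
    wrong = json.load(f)

with open(path, "r", encoding="utf-8") as f:
    right = json.load(f)

print(wrong["name"])  # JosÃ©
print(right["name"])  # José
```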

Best Practices for UTF-8 JSON

  • Always use encoding='utf-8' in open().
  • Set ensure_ascii=False in json.dump.
  • Use indent for readability in development.
  • Test with sample Unicode data.
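
One way to apply these practices consistently is to wrap them in a pair of small helpers. The names write_json and read_json are hypothetical, chosen for this sketch, and the round-trip check at the end uses a temporary file.

```python
import json
import os
import tempfile


def write_json(path, obj, pretty=False):
    """Write obj to path as UTF-8 JSON, keeping non-ASCII characters as-is."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(obj, f, ensure_ascii=False, indent=4 if pretty else None)


def read_json(path):
    """Read a UTF-8 encoded JSON file and return the decoded object."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Round-trip test with sample Unicode data.
sample = {"name": "José", "city": "São Paulo", "note": "中文"}
path = os.path.join(tempfile.mkdtemp(), "sample.json")
write_json(path, sample, pretty=True)
print(read_json(path) == sample)  # True
```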

Performance Considerations

Writing UTF-8 adds no meaningful overhead. Python handles encoding efficiently. The ensure_ascii=False parameter usually reduces file size for non-ASCII text, because a six-character \uXXXX escape takes more bytes than the UTF-8 encoding of the same character.
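
The size difference can be measured by encoding both outputs and comparing byte counts. This is a small sketch with made-up sample data; real savings depend on how much non-ASCII text your data contains.

```python
import json

data = {"name": "José", "city": "São Paulo", "text": "中文"}

# Bytes written when non-ASCII characters are escaped as \uXXXX (6 bytes each).
escaped = json.dumps(data).encode("utf-8")

# Bytes written when characters are kept and encoded as UTF-8 (2 or 3 bytes each here).
readable = json.dumps(data, ensure_ascii=False).encode("utf-8")

print(len(escaped) - len(readable))  # 14
```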

For large data, prefer json.dump with a file object over calling json.dumps and writing the result. json.dump passes the output to the file in chunks instead of building one large string in memory.
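
The chunked behaviour can be observed with a minimal file-like object that counts write() calls. CountingWriter is a hypothetical class written for this sketch, and the exact number of calls is a CPython implementation detail, so the example only checks that there is more than one.

```python
import json


class CountingWriter:
    """Minimal file-like object that records how json.dump calls write()."""

    def __init__(self):
        self.calls = 0
        self.chars = 0

    def write(self, chunk):
        self.calls += 1
        self.chars += len(chunk)
        return len(chunk)


records = [{"id": i, "name": "Müller"} for i in range(1000)]

writer = CountingWriter()
json.dump(records, writer, ensure_ascii=False)

# json.dumps builds the complete document as a single string.
whole = json.dumps(records, ensure_ascii=False)

print(writer.calls > 1)            # True: output arrived in several pieces
print(writer.chars == len(whole))  # True: same total content either way
```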

Conclusion

Using json.dump with UTF-8 is simple. Open the file with encoding='utf-8' and set ensure_ascii=False. This keeps your JSON readable and compatible. Handle errors by checking your encoding settings. With these steps, you can work with any language or symbol in your JSON files.