Last modified: Jan 27, 2026 By Alexander Williams

Python defaultdict Tutorial: Simplify Dictionaries

Python dictionaries are powerful. They store data as key-value pairs. But they have a common problem. Trying to access a key that doesn't exist raises a KeyError. This can break your program.

The collections.defaultdict offers a smart solution. It is a subclass of the built-in dict. It automatically provides a default value for a missing key. This makes your code cleaner and safer.

What is a defaultdict?

A defaultdict is a container from the collections module. You must import it before use. Its main feature is the default_factory argument.

You pass a callable (such as int, list, or set) as default_factory. When a missing key is accessed with [], defaultdict calls this function with no arguments. It stores the result as the key's initial value and returns it.

This prevents the dreaded KeyError. It also reduces boilerplate code for initialization.
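As a rough illustration of the behavior described above, the sketch below contrasts a plain dict with a defaultdict. The make_default function and the calls list are hypothetical names, added only to show that the factory runs with no arguments and only for missing keys.

```python
from collections import defaultdict

plain = {"a": 1}
try:
    plain["b"]
except KeyError as exc:
    print("KeyError for key:", exc)  # a plain dict refuses the lookup

calls = []  # hypothetical log of factory invocations


def make_default():
    # Hypothetical factory: takes no arguments and returns the default value.
    calls.append("called")
    return 0


safe = defaultdict(make_default, {"a": 1})
print(safe["a"])   # 1, existing key, factory not called
print(safe["b"])   # 0, missing key, factory supplies the value
print(len(calls))  # 1
```

The second positional argument seeds the defaultdict with initial data, just as it would for dict().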

Importing and Creating a defaultdict

First, import defaultdict from the collections module. Then, create an instance by specifying the default factory type.


from collections import defaultdict

# A defaultdict with int as the default factory.
# Missing keys will automatically get a value of int(), which is 0.
count_dict = defaultdict(int)

# A defaultdict with list as the default factory.
# Missing keys will automatically get an empty list [].
list_dict = defaultdict(list)

# A defaultdict with set as the default factory.
# Missing keys will automatically get an empty set set().
set_dict = defaultdict(set)
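To make the comments above concrete, here is a small sketch of what each factory returns on a missing key. The key names and the status dictionary are made up for illustration.

```python
from collections import defaultdict

count_dict = defaultdict(int)
list_dict = defaultdict(list)
set_dict = defaultdict(set)

print(count_dict["hits"])  # 0
print(list_dict["items"])  # []
print(set_dict["tags"])    # set()

# Any zero-argument callable works; a lambda can supply a custom constant.
status = defaultdict(lambda: "unknown")
print(status["server-1"])  # unknown

# Each lookup above also stored the key with its default value.
print(dict(count_dict))    # {'hits': 0}
```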

Practical Examples of defaultdict

Let's see how defaultdict solves real problems. We will compare it with a regular dictionary.

Example 1: Counting Items

Counting items is a common task. With a regular dict, you must check if a key exists. With defaultdict(int), it's automatic.


words = ["apple", "banana", "apple", "orange", "banana", "apple"]

# Method 1: Using a regular dictionary (verbose)
count_regular = {}
for word in words:
    if word in count_regular:
        count_regular[word] += 1
    else:
        count_regular[word] = 1
print("Regular dict:", count_regular)

# Method 2: Using defaultdict (clean)
count_default = defaultdict(int)
for word in words:
    count_default[word] += 1  # No KeyError! Starts at 0.
print("Default dict:", dict(count_default))


Regular dict: {'apple': 3, 'banana': 2, 'orange': 1}
Default dict: {'apple': 3, 'banana': 2, 'orange': 1}

The defaultdict version is shorter and clearer. The line count_default[word] += 1 works for both new and existing keys.

Example 2: Grouping Items with Lists

Grouping items under a key is another classic use case. A defaultdict(list) is perfect for this. For related operations, you might need to import lists into a Python dictionary from external sources.


# Group students by their grade
students = [("Alice", "A"), ("Bob", "B"), ("Charlie", "A"), ("Diana", "C"), ("Eve", "B")]

grouped = defaultdict(list)
for name, grade in students:
    grouped[grade].append(name)  # The list is created automatically.

print(dict(grouped))


{'A': ['Alice', 'Charlie'], 'B': ['Bob', 'Eve'], 'C': ['Diana']}

Without defaultdict, you would need to write an if statement to create the first list for each new grade.
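For comparison, the regular-dict versions might look like the sketch below. It shows the explicit if check and the setdefault() alternative; the variable names grouped_if and grouped_setdefault are illustrative.

```python
students = [("Alice", "A"), ("Bob", "B"), ("Charlie", "A"), ("Diana", "C"), ("Eve", "B")]

# Variant 1: create the list by hand the first time a grade appears.
grouped_if = {}
for name, grade in students:
    if grade not in grouped_if:
        grouped_if[grade] = []
    grouped_if[grade].append(name)

# Variant 2: setdefault() returns the existing list or stores a new one.
grouped_setdefault = {}
for name, grade in students:
    grouped_setdefault.setdefault(grade, []).append(name)

print(grouped_if)
print(grouped_if == grouped_setdefault)  # True
```

Both variants produce the same grouping as the defaultdict(list) version, with more code per insertion.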

Example 3: Building a Graph with Sets

For relationships where duplicates are not allowed, use defaultdict(set). This is common in graph structures.


# Represent friendships in a social network as directed edges
friendships = [("Alice", "Bob"), ("Alice", "Charlie"), ("Bob", "Alice"), ("Charlie", "Diana")]

graph = defaultdict(set)
for person_a, person_b in friendships:
    graph[person_a].add(person_b)
    # For an undirected graph, you'd also add: graph[person_b].add(person_a)

# Sets are unordered, so sort each one for stable output
print({k: sorted(v) for k, v in graph.items()})


{'Alice': ['Bob', 'Charlie'], 'Bob': ['Alice'], 'Charlie': ['Diana']}
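If the friendships should be symmetric, one possible variation adds each edge in both directions, as the comment in the code suggests. This is a sketch using the same sample data. The name undirected is chosen here for illustration.

```python
from collections import defaultdict

friendships = [("Alice", "Bob"), ("Alice", "Charlie"), ("Bob", "Alice"), ("Charlie", "Diana")]

undirected = defaultdict(set)
for person_a, person_b in friendships:
    undirected[person_a].add(person_b)
    undirected[person_b].add(person_a)  # mirror the edge

# Sort keys and neighbors so the printed order is stable.
for person in sorted(undirected):
    print(person, sorted(undirected[person]))
# Alice ['Bob', 'Charlie']
# Bob ['Alice']
# Charlie ['Alice', 'Diana']
# Diana ['Charlie']
```

Because the values are sets, the duplicate pair ("Bob", "Alice") does not create a repeated entry.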

Key Differences from a Regular Dictionary

defaultdict behaves almost identically to a normal dict. But there are crucial differences.

1. Automatic Default Values: The main feature. Missing keys get a value from default_factory.

2. The default_factory Attribute: You can inspect or change it. Setting it to None makes the defaultdict behave like a standard dictionary, so missing keys raise KeyError again.


dd = defaultdict(list)
print(dd.default_factory)  # Output: <class 'list'>

dd.default_factory = None
# Now it acts like a regular dict
# dd["new_key"]  # This would raise a KeyError

3. Behavior with __getitem__ vs get(): The [] operator triggers the default factory. The get() method does not. It behaves like a regular dict's get.


dd = defaultdict(int)
print(dd["missing"])  # Triggers factory, returns 0, adds key.
print(dd.get("another_missing"))  # Does NOT trigger factory, returns None.
print(dict(dd))


0
None
{'missing': 0}
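The differences listed above can be checked together in a short sketch. The key names here are arbitrary.

```python
from collections import defaultdict

dd = defaultdict(list)
dd["kept"].append(1)

# Membership tests, like get(), never call the factory.
print("absent" in dd)  # False
print(len(dd))         # 1

# With the factory removed, missing keys raise KeyError again.
dd.default_factory = None
print(dd["kept"])      # [1], existing keys are unaffected

raised = False
try:
    dd["new_key"]
except KeyError:
    raised = True
print(raised)          # True
```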

Common Use Cases and Patterns

defaultdict shines in many scenarios beyond counting and grouping.

Nested Dictionaries: You can create dictionaries of dictionaries easily. Use lambda: defaultdict(int) as the factory.


# Counting fruits by their color
data = [("apple", "red"), ("banana", "yellow"), ("cherry", "red"), ("apple", "green")]

nested_count = defaultdict(lambda: defaultdict(int))
for fruit, color in data:
    nested_count[color][fruit] += 1

# defaultdict is a dict subclass, so json.dumps can serialize it directly
import json
print(json.dumps(nested_count))


{"red": {"apple": 1, "cherry": 1}, "yellow": {"banana": 1}, "green": {"apple": 1}}
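If the nested structure should stop creating keys once it is built, one option is to convert it to plain dictionaries. The to_plain_dict helper below is a hypothetical name and only a sketch. It assumes the values are either dictionaries or leaf values.

```python
from collections import defaultdict


def to_plain_dict(value):
    # Hypothetical helper: recursively copy dict-like values into plain dicts.
    if isinstance(value, dict):
        return {key: to_plain_dict(inner) for key, inner in value.items()}
    return value


nested_count = defaultdict(lambda: defaultdict(int))
nested_count["red"]["apple"] += 1
nested_count["red"]["cherry"] += 1

plain = to_plain_dict(nested_count)
print(plain)                        # {'red': {'apple': 1, 'cherry': 1}}
print(type(plain["red"]).__name__)  # dict
```

After the conversion, a lookup such as plain["blue"] raises KeyError instead of silently adding an empty inner dictionary.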

Accumulating Values: Great for summing or concatenating data associated with keys.
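A minimal sketch of both accumulation styles follows. The sales and log_lines records are invented sample data, not taken from any real source.

```python
from collections import defaultdict

# Summing: float() returns 0.0, so += works on the first occurrence.
sales = [("north", 120.0), ("south", 80.0), ("north", 30.5), ("east", 45.0)]
totals = defaultdict(float)
for region, amount in sales:
    totals[region] += amount
print(dict(totals))  # {'north': 150.5, 'south': 80.0, 'east': 45.0}

# Concatenating: str() returns "", so += appends from the start.
log_lines = [("db", "connect"), ("web", "start"), ("db", "query")]
merged = defaultdict(str)
for source, message in log_lines:
    merged[source] += message + ";"
print(dict(merged))  # {'db': 'connect;query;', 'web': 'start;'}
```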

Remember, defaultdict only changes how missing keys are handled. Other dictionary behavior, such as pop() to remove items and the preservation of insertion order, works the same way as in a regular dict.

Potential Pitfalls and Best Practices

1. Unintended Key Creation: Simply accessing a key with dd[key] creates it. This can inflate your dictionary size. Use in or get() if you only want to check existence.

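2. Pickling and Unpickling: Pickling a defaultdict also pickles its default_factory, so the factory itself must be picklable. Built-ins like int and list work. Lambda functions and locally defined callables cannot be pickled by the standard pickle module, so pickling fails with an error.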

3. Using Mutable Defaults: This is safe with defaultdict(list). Each new key gets a new, separate empty list. This avoids the famous "mutable default argument" problem of functions.

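4. When Not to Use It: If your logic explicitly requires a KeyError for missing keys, use a regular dict. Also, if only a few lookups need a default, a regular dict with get() or setdefault() keeps that choice explicit at the call site. Note that setdefault() evaluates its default argument on every call, even when the key already exists.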
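The first three pitfalls can be demonstrated in a few lines. This is only a sketch with arbitrary key names. The exact exception raised for a lambda factory depends on where the lambda is defined, so the code catches both likely types and does not assume which one occurs.

```python
from collections import defaultdict
import pickle

# Pitfall 1: reading with [] inserts the key; in and get() do not.
seen = defaultdict(int)
"a" in seen
seen.get("b")
print(len(seen))  # 0
seen["c"]
print(len(seen))  # 1

# Pitfall 2: a built-in factory survives a pickle round trip.
restored = pickle.loads(pickle.dumps(defaultdict(list, {"k": [1]})))
print(restored.default_factory is list)  # True

# A lambda factory makes pickling fail.
lambda_failed = False
try:
    pickle.dumps(defaultdict(lambda: 0))
except (pickle.PicklingError, AttributeError):
    lambda_failed = True
print(lambda_failed)  # True

# Pitfall 3: every missing key gets its own fresh list.
buckets = defaultdict(list)
buckets["x"].append(1)
print(buckets["y"])                  # []
print(buckets["x"] is buckets["y"])  # False
```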

If you later need to process this dictionary data, you might convert the Python dict to JSON for web APIs or use a DictWriter to write CSV files.

Conclusion

The collections.defaultdict is a powerful tool. It simplifies code that builds dictionaries incrementally. It eliminates checks for key existence and manual initialization.

It is most useful for counting, grouping, and building nested data structures. Remember its key behavior: it creates entries on missing key access with [].

Use it when you want clean, safe, and Pythonic code for dictionary accumulation tasks. It's a small class with a big impact on code readability.