Initializing new values and defaultdicts (Python: Dictionaries and Sets)

Imagine you need to store mutable values, such as lists, in a dictionary. While working with this dictionary, you often have a key and an element to add to that key's list, but the key may not be in the dictionary yet. Here is the code you have to write:

if key not in dictionary:
    dictionary[key] = []  # Initializing the list
dictionary[key].append(value)  # Changing the list

This situation is quite common. The authors of the Python standard library recognized this and added the setdefault method to dictionaries. We can rewrite the code above using this method:

dictionary.setdefault(key, []).append(value)

It is compact and concise. But what does the setdefault method do? It takes a key and a default value and returns the value that the dictionary associates with that key. If the key is not in the dictionary, the method first stores the default value under that key and then returns it. Because the method returns the stored object itself rather than a copy, appending to the result changes the list inside the dictionary. In the example above, the default value is an empty list [].
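To make the "returns a reference" part concrete, here is a minimal sketch. The names groups, letters, and same_letters are hypothetical and chosen only for illustration.

```python
# Hypothetical data: group words by their first letter.
groups = {}

# The key is missing: setdefault stores the new list and returns it.
letters = groups.setdefault('a', [])
letters.append('apple')

# The key exists: setdefault ignores the default and returns the stored list.
same_letters = groups.setdefault('a', [])
same_letters.append('avocado')

print(groups)                  # {'a': ['apple', 'avocado']}
print(letters is groups['a'])  # True
```

Both calls return the very same list object, which is why the one-line form with append works.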

The defaultdict type

The Python standard library includes the collections module. Among other things, this module provides the defaultdict type. A defaultdict is an ordinary dictionary with one distinctive property: where a regular dictionary raises a KeyError for a missing key, a defaultdict creates a default value, stores it under that key, and returns it. Let us look at an example:

from collections import defaultdict
d = defaultdict(int)
d['a'] += 5
d['b'] = d['c'] + 10
d  # defaultdict(<class 'int'>, {'a': 5, 'c': 0, 'b': 10})

When we created the dictionary, we passed int as an argument. Called without arguments, int() returns 0. The dictionary d makes this call whenever we read a key that does not exist yet.

Therefore, d['a'] += 5 will result in 5 because:

  • First, we create an initial value for the 'a' key making an int() call and getting 0
  • Second, we add 5 to it

In the line d['b'] = d['c'] + 10, we:

  • Read d['c'], which creates the value 0 for the missing 'c' key
  • And then write the sum of 0 + 10 to the 'b' key, which is a plain assignment and does not call int()
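The steps above imply that only a d[key] read creates a missing key. The sketch below illustrates this under the same defaultdict(int) setup. The key 'z' is a hypothetical extra key used for the demonstration.

```python
from collections import defaultdict

d = defaultdict(int)

d['b'] = d['c'] + 10  # reading d['c'] stores 0 under 'c'
print(dict(d))        # {'c': 0, 'b': 10}

# Lookups that do not go through d[key] leave the dictionary alone.
print('z' in d)       # False
print(d.get('z'))     # None
print(len(d))         # 2

# A plain d[key] read is enough to add the key.
d['z']
print(len(d))         # 3
```

This is worth keeping in mind when checking for membership: use in or get if you do not want the key to appear as a side effect.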

Here is another example, this time with an initializer function we made ourselves:

def new_value():
    return 'foo'
x = defaultdict(new_value)
x[1]  # 'foo'
x['bar']  # 'foo'
x  # defaultdict(<function new_value at 0x7f2232cf5a60>, {1: 'foo', 'bar': 'foo'})

Apart from the representation of the initializer function, which shows its name and memory address, we can see that every key we accessed now holds the string 'foo'.

The differences between defaultdict and setdefault

Why have both if they are so similar? Let us compare these two lines:

a.setdefault(key, []).append…
# vs
b[key].append…

# b is the defaultdict(list)

The lines are very similar. However, in the first line Python creates a new empty list on every call, whether or not the key exists, because it evaluates the arguments before calling setdefault(key, []). The defaultdict calls its initializer only when it does not find the key. Creating an empty list is cheap, so we can ignore the cost in this case.

But if creating the default value is expensive, for example because it requires a database lookup, the defaultdict option is much preferable.
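The difference in how often the default gets created can be counted directly. This is a rough sketch: make_default is a hypothetical stand-in for an expensive initializer, and the calls dictionary exists only to count invocations.

```python
from collections import defaultdict

calls = {'eager': 0, 'lazy': 0}

def make_default(kind):
    # Hypothetical stand-in for an expensive initializer.
    calls[kind] += 1
    return []

a = {}
b = defaultdict(lambda: make_default('lazy'))

for value in range(3):
    # The argument is evaluated on every iteration, even when 'key' exists.
    a.setdefault('key', make_default('eager')).append(value)
    # The factory runs only on the first access, when 'key' is missing.
    b['key'].append(value)

print(calls)  # {'eager': 3, 'lazy': 1}
```

Both dictionaries end up with the same contents. Only the number of initializer calls differs.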

Why use setdefault at all? Well, you can use it to initialize different keys with different default values. Since we pass the default value on each call, we can even store different types of data under different keys. With defaultdict, we have no control over which value goes with which key: the dictionary calls the same initializer function every time, and Python does not pass the key to it.
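As an illustration of per-key defaults, the sketch below fills one dictionary with a list, a set, and a nested dictionary. The settings dictionary and its keys are hypothetical.

```python
settings = {}

# Each call supplies its own default, so the value types can differ per key.
settings.setdefault('tags', []).append('python')
settings.setdefault('owners', set()).add('alice')
settings.setdefault('limits', {})['max'] = 10

print(settings)
# {'tags': ['python'], 'owners': {'alice'}, 'limits': {'max': 10}}
```

A single defaultdict could not do this, because its one initializer would produce the same kind of value for every key.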

Finally, there are cases where defaultdict is unsuitable because you need to initialize the values differently, but setdefault does not help either. This happens when the values are immutable, so we cannot change them through the returned reference. In such cases, the get method with a default value handles the missing key:

x['count'] = x.get('count', 0) + 1  # a missing key counts as 0
x['path'] = x.get('path', '') + '/' + dirname  # dirname: the directory name to append

Yes, you have to repeat the key, but the code reads well, so this is an acceptable solution.
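To see why setdefault falls short with immutable values, compare it with the get pattern in a small sketch. The variable count is a hypothetical local name.

```python
x = {}

# setdefault stores 0 and returns it, but int is immutable:
# the addition builds a new object that never reaches the dictionary.
count = x.setdefault('count', 0)
count += 1
print(x)  # {'count': 0}

# Reading with get and assigning back updates the stored value.
x['count'] = x.get('count', 0) + 1
print(x)  # {'count': 1}
```

With a list, the in-place append changes the stored object. With an int or a str, there is nothing to change in place, so the result has to be written back under the key.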

