Dictionaries and sets are most useful once they are fully built. A half-built set or dictionary is of little use, because we cannot reliably check it for keys or values.
The situation is different with lists. A list is a sequence of items, so we often need to traverse it only once. Sometimes we don't even need to reach the end of the list, for example, when we're looking for one specific item. It's important to remember that map and filter don't produce lists. Instead, they return new iterators built on top of their iterable arguments.
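As a small illustration of the point above, the sketch below chains map and filter and pulls values out on demand. The names numbers, squares, and even_squares are made up for this example, and the behavior shown assumes Python 3.

```python
numbers = [1, 2, 3, 4, 5, 6]

# In Python 3, map and filter return lazy iterators, not lists.
squares = map(lambda x: x * x, numbers)
even_squares = filter(lambda x: x % 2 == 0, squares)

is_list = isinstance(even_squares, list)
print(is_list)  # False

# next() pulls exactly one value through the whole chain.
first = next(even_squares)
print(first)  # 4

# list() consumes whatever is left.
rest = list(even_squares)
print(rest)  # [16, 36]
```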
Iterator-based data pipelines are efficient because they do no work until the consumer at the end of the pipeline asks for a result. This is especially true when we combine them with functions from the itertools module.
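Here is a minimal sketch of such a pipeline, assuming we want the first three squares divisible by three from an endless source. The helper square and the list computed are hypothetical and exist only to record which inputs were actually processed.

```python
from itertools import count, islice

computed = []  # records which inputs were actually processed


def square(x):
    computed.append(x)
    return x * x


# count(1) is an infinite source, yet building the pipeline computes nothing.
pipeline = filter(lambda x: x % 3 == 0, map(square, count(1)))
computed_before = list(computed)
print(computed_before)  # []

# islice asks for three results, so the pipeline does just enough work.
first_three = list(islice(pipeline, 3))
print(first_three)  # [9, 36, 81]
print(computed)     # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The work is driven entirely by the consumer: without islice or another consumer, square would never be called.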
List comprehensions (sometimes called list generators) have another important property: they always build the whole list, even if we do not need all of its items. In an ordinary loop, you can stop early using break, but you cannot interrupt a list comprehension. Besides, that wouldn't look declarative. However, Python does have a way to use iterator laziness in declarative code.
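The difference can be made visible by counting calls. In the sketch below, slow_square and calls are hypothetical names used only to record how much work each approach does.

```python
calls = []  # records every argument passed to slow_square


def slow_square(x):
    calls.append(x)
    return x * x


# The square-bracket form computes all ten squares up front.
squares = [slow_square(x) for x in range(10)]
eager_calls = len(calls)

# An explicit loop can stop as soon as the answer is found.
calls.clear()
found = None
for x in range(10):
    s = slow_square(x)
    if s > 10:
        found = s
        break
loop_calls = len(calls)

print(eager_calls, loop_calls, found)  # 10 5 16
```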
Generator expressions
We said above that sometimes a sequence doesn't need to be computed in full. We'll add that you seldom need to build and keep a finished list at all.
In those rare cases where you do need a list, you can use a list comprehension. But most problems are solved with generator expressions. They look like list comprehensions. The only difference is that they use round brackets instead of square brackets:
[x * x for x in range(10)]
# [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
(x * x for x in range(10))
# <generator object <genexpr> at 0x7fe76f7e5db0>
As you can see, the result of evaluating the second expression isn't a list but a generator object. We use it to postpone computing the elements of a sequence until we need them.
The generator objects are essentially iterators, so we cannot traverse them more than once:
def print6(xs):
    for i, x in enumerate(xs):
        print(x)
        if i == 5:
            break
i = (x * x for x in range(10))
print6(i)
# => 0
# => 1
# => 4
# => 9
# => 16
# => 25
print6(i) # Keep going through the elements
# => 36
# => 49
# => 64
# => 81
print6(i) # There's nothing else left
Here __iter__ is called on the iterator each time, but the iterator returns itself rather than a new iterator. The program computes and uses the elements but does not save them anywhere.
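A short check, offered as a sketch, shows this directly: calling iter() on a generator object returns the same object, while a list hands out a fresh iterator every time. The variable names are made up for this example.

```python
squares = (x * x for x in range(3))
items = [0, 1, 4]

# A generator object is its own iterator.
same_object = iter(squares) is squares
print(same_object)  # True

# A list is not an iterator: each iter() call creates a new one.
list_is_own_iterator = iter(items) is items
print(list_is_own_iterator)  # False

# So the generator can be consumed once, and the list many times.
first_pass = list(squares)
second_pass = list(squares)
print(first_pass, second_pass)  # [0, 1, 4] []
print(list(items), list(items))  # [0, 1, 4] [0, 1, 4]
```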
Brackets
You can often find a generator expression in a place in the code where the interpreter can unambiguously understand the boundaries of that expression.
The most common example is a generator expression passed as the only argument of a function:
f((… for … in …))
In these cases, we can omit the generator expression's own parentheses, as long as it doesn't hurt readability. Dropping the redundant brackets often makes the code more concise:
any(x > 100 for x in range(1000000))
# True
We can translate this code as follows:
Is any X greater than a hundred among all Xs between zero and a million?
Python evaluates this expression almost instantly: it checks the numbers one at a time and stops as soon as it reaches the first one greater than a hundred. Suppose we used any([… for …]). In this case, Python would also stop at the first True value in the list, but it would first build a list of a million elements in memory.
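To see how much work each variant does, we can count the checks. This is only a sketch: is_big and checked are hypothetical helpers that record every number tested.

```python
checked = []  # records every number passed to is_big


def is_big(x):
    checked.append(x)
    return x > 100


# Generator expression: any() stops at the first True, which is 101.
lazy_result = any(is_big(x) for x in range(1000000))
lazy_checked = len(checked)
print(lazy_result, lazy_checked)  # True 102

# List in square brackets: every number is tested before any() starts.
checked.clear()
eager_result = any([is_big(x) for x in range(1000000)])
eager_checked = len(checked)
print(eager_result, eager_checked)  # True 1000000
```

Both calls return the same answer, but the first one tests 102 numbers and the second tests all of them.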
Generator expressions in coding
Try to use generator expressions wherever possible. Almost any function working with sequences in one form or another can use generator objects.
Even when unpacking a sequence into a function's positional arguments, it's better to use a generator expression:
print(*(x for x in "Hello World!" if x.isupper()))
# => H W
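A few more built-ins that accept a generator expression directly are sketched below. The list words is made-up sample data.

```python
words = ["generator", "expressions", "are", "lazy"]

# Each function pulls items from the generator one at a time;
# no intermediate list is created.
total_length = sum(len(w) for w in words)
longest = max(len(w) for w in words)
initials = "".join(w[0].upper() for w in words)

print(total_length)  # 27
print(longest)       # 11
print(initials)      # GEAL
```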
Generator expressions are also worth using as intermediate steps inside list, set, and dict comprehensions, as well as inside other generator expressions and map- or filter-based pipelines. Early versions of Python had only list comprehensions and generator expressions, so sets and dictionaries were constructed this way:
set(x * x for x in range(10))
# {0, 1, 64, 4, 36, 9, 16, 49, 81, 25}
dict((x, x * x) for x in range(10))
# {0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}
Here, the generator expression produces elements one at a time. The set() and dict() functions take those elements one at a time and insert them in the right places. This is already quite an efficient method. Dedicated syntax for set and dict comprehensions was added later to make the code more expressive.
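For comparison, here is a brief sketch of the dedicated curly-brace syntax next to the function-call form. Both produce equal results; the variable names are chosen only for this example.

```python
# Set comprehension: curly braces around a single expression.
squares_set = {x * x for x in range(10)}

# Dict comprehension: curly braces around a key: value pair.
squares_dict = {x: x * x for x in range(10)}

# The function-call forms build the same objects.
same_set = squares_set == set(x * x for x in range(10))
same_dict = squares_dict == dict((x, x * x) for x in range(10))
print(same_set, same_dict)  # True True
print(squares_dict[7])      # 49
```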