- What are list generators
- When to use list generators
- How the declarative nature of list generators manifests
In everyday development, you will often see code that works with iterators, because they are built into the language and tightly integrated into the standard library.
We usually assemble iterators and the operations on them into data pipelines. Only at the end of each pipeline is there a reduce() call or something else that doesn't pass elements on. Most of these pipelines consist of two types of operations:
- Converting individual elements with the map() function. It converts the entire stream using another function that handles the individual items
- Changing the composition of the elements via filtration or multiplication. The filter() function can filter the data. And map() paired with chain() from the itertools module turns each element into several without changing the nesting level
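The overall shape described above, lazy steps that pass elements on and a final call that consumes them, can be sketched as follows. This is only an illustration: the sum of squares and the names evens, squares, and total are assumptions made for the example, not part of the lesson's task.

```python
from functools import reduce
from operator import add

# Lazy stages: nothing is computed until the final step asks for items.
evens = filter(lambda x: x % 2 == 0, range(10))  # 0, 2, 4, 6, 8
squares = map(lambda x: x * x, evens)            # 0, 4, 16, 36, 64

# Terminal stage: reduce() consumes the stream and returns a single value.
total = reduce(add, squares, 0)
print(total)  # 120
```

After reduce() has run, the squares iterator is exhausted, which is why such a call sits at the very end of a pipeline.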
For example, imagine we want a list of numbers like this:
[0, 0, 2, 2, 4, 4...]
Each increasing even number appears twice. Let's write a suitable pipeline:
# Getting a stream of even numbers
def is_even(x):
    return x % 2 == 0
list(filter(is_even, range(20)))
# [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
# Duplicating each of them
def dup(x):
    return [x, x]
list(map(dup, filter(is_even, range(20))))
# [[0, 0], [2, 2], [4, 4], [6, 6], [8, 8], [10, 10], [12, 12], [14, 14], [16, 16], [18, 18]]
# Making the pipeline flat again
from itertools import chain
list(chain(*map(dup, filter(is_even, range(20)))))
# [0, 0, 2, 2, 4, 4, 6, 6, 8, 8, 10, 10, 12, 12, 14, 14, 16, 16, 18, 18]
# Making a single-line variant
list(chain(*map(lambda x: [x, x], filter(lambda x: x % 2 == 0, range(20)))))
# [0, 0, 2, 2, 4, 4, 6, 6, 8, 8, 10, 10, 12, 12, 14, 14, 16, 16, 18, 18]
As you can see, the task is done by connecting ready-made elements rather than by writing all the code manually in the form of a for loop. We can already see the problem with this construction kit: if there are no ready-made functions or predicates for the elements, we have to declare them beforehand or use lambda. Both options are inconvenient.
When another person reads code built from separate functions, they have to keep jumping back and forth through it. And lambda looks unwieldy. But don't despair: Python has a syntax that simplifies working with pipelines.
What are list generators
Let's try to solve the same problem another way:
[x for num in range(20) for x in [num, num] if num % 2 == 0]
# [0, 0, 2, 2, 4, 4, 6, 6, 8, 8, 10, 10, 12, 12, 14, 14, 16, 16, 18, 18]
It's also a one-liner. You can get used to this syntax, although it may not look convenient now. Let's try to format the whole expression:
[x
 for num in range(20)
 for x in [num, num]
 if num % 2 == 0
]
# [0, 0, 2, 2, 4, 4, 6, 6, 8, 8, 10, 10, 12, 12, 14, 14, 16, 16, 18, 18]
The code now looks like two nested loops. We can write similar code with regular loops:
res = []
for y in range(20):
    for x in [y, y]:
        if y % 2 == 0:
            res.append(x)
res
# [0, 0, 2, 2, 4, 4, 6, 6, 8, 8, 10, 10, 12, 12, 14, 14, 16, 16, 18, 18]
The code looks very similar, but there are two differences:
- In the first version, we create a new list, and in the second one, we modify a previously created one
- The first version is an expression, and the second is a set of instructions. Consequently, we can use the first version as part of any other expression. We didn't have to declare any auxiliary functions, and we didn't need any lambdas
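The second difference is easy to check in practice. In the sketch below, the same list generator is passed straight into sum() and placed inside a dictionary literal; the names total and report are made up for this illustration.

```python
# The list generator is an expression, so it can be passed as an argument...
total = sum([x for num in range(20) for x in [num, num] if num % 2 == 0])

# ...or used inside a larger expression, such as a dictionary literal.
report = {
    "count": len([num for num in range(20) if num % 2 == 0]),
    "total": total,
}
print(report)  # {'count': 10, 'total': 180}
```

The loop-based version could not be embedded like this: it would first need its own block of statements and a temporary variable.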
Expressions that look like [… for … in …] are called list generators (the official Python documentation calls them list comprehensions). Let's consider the components of this syntax. List generators are defined as follows:
[EXPRESSION for VARIABLE in SOURCE if CONDITION]
Let's look at this pattern in more detail:
- EXPRESSION can use VARIABLE and is computed into a future list item
- VARIABLE is the name to which the SOURCE elements are bound one after another
- SOURCE is any iterator or iterable object
- CONDITION is an expression that can use VARIABLE and is computed at each iteration
If the condition is false, the current iteration is skipped, so no new item is added to the final list. If we omit the condition together with the if keyword, the result is the same as with if True. There can be several variables: unpacking of tuples and lists, including nested ones, works here too.
Here are a few examples:
# squares of numbers
[x*x for x in [1, 2, 3]]
# [1, 4, 9]
# Codes of lowercase letters from a given string
[ord(c) for c in "Hello!!" if c.isalpha() and c.islower()]
# [101, 108, 108, 111]
# Indexes of pairs whose elements are equal to each other
[i for i, (x, y) in enumerate([(1, 2), (4, 4), (5, 7), (0, 0)]) if x == y]
# [1, 3]
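The two remaining rules, the omitted condition and unpacking into several variables, can be checked the same way. The points list and the variable names below are hypothetical sample data chosen for this sketch.

```python
points = [("a", (1, 2)), ("b", (3, 4))]  # hypothetical sample data

# Omitting the condition gives the same result as writing `if True`.
without_condition = [name for name, _ in points]
with_true = [name for name, _ in points if True]

# Nested unpacking: each item is split into a name and the pair (x, y).
sums = [(name, x + y) for name, (x, y) in points]

print(without_condition == with_true)  # True
print(sums)  # [('a', 3), ('b', 7)]
```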
When to use list generators
We saw above that list generators don't replace all the built-in functions for working with iterators. The two go well together.
On the other hand, it's better not to mix list generators with map() and filter(), because they are interchangeable. Also, don't mix list generators with side effects. The point of list generators is concise, compact code, and the reader shouldn't have to think about what changes and where while a list is being created.
This applies not only to code with the map() and filter() functions but to declarative pipelines in general. It's worth keeping code written in different paradigms in separate sections. For example, I/O is one of the main kinds of side effects. It may be at the beginning of the pipeline or at the end of it, but not in the middle.
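One possible way to keep I/O at the edges of a pipeline is sketched below. The input is simulated with io.StringIO so the example runs on its own; the sample data and the names source, lines, numbers, and doubled are assumptions made for this illustration.

```python
import io

# Hypothetical input: a file-like object stands in for a real file or stdin.
source = io.StringIO("3\n8\nfoo\n10\n")

# I/O at the beginning: read all the raw lines.
lines = source.read().splitlines()

# Pure middle section: no printing, no changes to outside state.
numbers = [int(line) for line in lines if line.isdigit()]
doubled = [n * 2 for n in numbers]

# I/O at the end: output the finished result.
print(doubled)  # [6, 16, 20]
```

Putting a print() call inside either list generator would also work, but then the reader would have to track output while reading a description of the list.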
How the declarative nature of list generators manifests
Let's see how the list generator differs from the explicitly imperative double loop. When you use loops, you can build a list and also produce other side effects, for example, change the objects in lists.
The loops perform repetitive actions, so side effects are okay in them.
The list generators, in turn, describe what each item is, not how to get it from the outside world or output it to the console. You can look at the different parts and see that:
- The list generator describes the result. It says: "The resulting list is a list of numbers taken from the range up to 20"
- The procedural solution shows how to get a result. It says: "For each number in the range up to 20, add a number to the list"
The for loops look the same in both cases because Python loops are more declarative than loops in some other languages. Thus, for loops are considered imperative in Python because of their body, not their header.
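The contrast can be seen in a small side-by-side sketch. The words list and both variable names are hypothetical; the point is only that the two headers match while the bodies differ in nature.

```python
words = ["apple", "Bob", "cherry"]  # hypothetical sample data

# Declarative: the expression states what each item of the result is.
lengths = [len(word) for word in words]

# Imperative: the same header, but the body is an action
# that changes state outside the loop.
lengths_loop = []
for word in words:
    lengths_loop.append(len(word))

print(lengths == lengths_loop)  # True
```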