The declarative approach to defining sequences has many advantages.
Sometimes, though, it's necessary to tone down the declarative style and use imperative techniques, such as mutable state or the ability to abort generation early. In other cases, the elements of the output sequence don't depend on each other or on the input sequence clearly enough for declarative methods to work. Simply put, sometimes you need to drop to a level where producing each element is as simple and controllable as printing it with print(). Such code looks very imperative, but it is effective.
In Python, iterative computation is ubiquitous, so low-level programming tools for data streams are built into the language. That's what we'll study in this lesson.
The yield keyword
Imagine we need to construct a sequence of numbers whose elements increase exponentially. If we want to print these numbers, the code would likely look like this:
def iterate(x0, m):
    x = x0
    while True:
        print(x)
        x *= m

iterate(1, 1.1)
print("impossible!")
As soon as we call the iterate procedure, it prints ever-increasing numbers indefinitely, because we haven't provided any way to end the loop. The call to iterate never finishes, so none of the code after it ever executes.
Now imagine we need the sequence outside the iterate procedure. We can't simply replace print() with return, as that would stop the generation process after the first element. We could pass in a list to which the procedure appends items instead of printing them, but we'd never get to use that list because the process never finishes, and we cannot always know in advance how many iterations we'll need. To cope with these problems, we need a new keyword, yield:
def iterate(x0, m):
    x = x0
    while True:
        yield x  # instead of print()
        x *= m

iterate(1, 1.1)
# <generator object iterate at 0x...>
Notice that the iterate call now evaluates to a generator object, <generator object>, and the function body doesn't run yet. iterate is now a genuine function, because it computes a specific result.
Such functions are called generator functions. They're constructed using the yield keyword and return a generator object. But where are the numbers? They'll be given to us by the generator object, which in this case works as an infinite-sequence iterator. Note the wording, "works as an iterator": in Python, many things work according to conventions, so if something behaves like an iterator, it is considered an iterator.
Here's how you can apply this function:
for n in iterate(1, 1.2):
    print(n)
    if n > 3:
        break
# => 1
# => 1.2
# => 1.44
# => 1.728
# => 2.0736
# => 2.48832
# => 2.9859839999999997
# => 3.5831807999999996
Here, the caller decides how many elements it needs and when. The generator function code doesn't carry the burden of decisions it doesn't need to make.
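Instead of breaking out of a loop by hand, the caller can also delegate the cutoff to itertools.islice, which stops asking the generator for elements after a given count. A minimal sketch:

```python
from itertools import islice

def iterate(x0, m):
    x = x0
    while True:
        yield x
        x *= m

# Take exactly five elements from the infinite sequence;
# islice stops requesting values from the generator after that.
first_five = list(islice(iterate(1, 1.2), 5))
print(first_five)  # => [1, 1.2, 1.44, 1.728, 2.0736]
```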
Initialization, pause, and end of generation
In the example above, the yield keyword is similar to return: it, too, hands back a single value, and control goes back to the code that requested the item from the iterator.
The difference is that return stops the function body once and for all, whereas yield merely suspends execution. Execution resumes when the caller asks for a new element via next().
It continues until one of these events occurs:
- It encounters another yield
- It encounters a return
- It executes the last line of the function body
In the first case, the caller receives the generated value, and execution pauses again. The other two events work the same way: they complete the iteration process. The code above the first yield is often called the initialization code; Python executes it when next() is first applied to the generator object.
During the initialization and termination phases, it's convenient, for example, to open a file whose contents the iterator will portion out, and then to close that file in time. Declarative generators don't have this capability per se, so it's worth knowing how to write generator functions for the sake of this flexibility.
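A sketch of such a file-reading generator (the name read_records is hypothetical): the file is opened in the initialization code, and a finally clause guarantees it is closed when iteration ends.

```python
def read_records(path):
    f = open(path)  # initialization: runs on the first next()
    try:
        for line in f:
            yield line.rstrip('\n')
    finally:
        f.close()   # termination: runs when iteration finishes
```

The finally clause also runs if the caller abandons the generator early: closing (or garbage-collecting) a generator object raises GeneratorExit inside the suspended body, which triggers the cleanup.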
Consider a small example, reporting all phases of how it works:
def f():
    print('Initializing...')
    yield 'one'
    print('Continue...')
    yield 'two'
    print('Stopping...')

i = f()
# Nothing has been done yet
i
# <generator object f at 0x...>
next(i)  # very first next()
# => Initializing...
# => 'one'
# Initialization passed, the first value is received
next(i)
# => Continue...
# => 'two'
# We execute the code between the first yield and the next one and obtain the second value
next(i)
# => Stopping...
# Traceback (most recent call last):
#   ...
#     next(i)
# StopIteration
# Execution has reached the end of the function body, so iteration is complete
j = iter(i)  # Trying to get a new iterator
j is i
# True
# iter() returned a reference to the original object
next(j)
# Traceback (most recent call last):
#   ...
#     next(j)
# StopIteration
# The same sequence cannot be traversed again
This example demonstrates that the generator object responds to iter(), but we cannot reuse it. You can, however, always get a fresh instance by calling the generator function again. At the same time, preserving state between several sections of code that consume the iterator's elements can be very useful.