The declarative approach to defining sequences has many advantages.
Sometimes, though, it's necessary to tone down the declarative style and use imperative techniques, such as mutable state or the ability to abort the generation process early. In other cases, the elements of the output sequence don't depend on each other or on the input sequence in any explicit way, so declarative methods don't fit well. Simply put, sometimes you need to drop to a level where producing elements is simple and controllable, like printing them with print(). Such code will look very imperative, but it will be effective.
In Python, iterative computation is ubiquitous, so low-level programming tools for data streams are built into the language. That's what we'll study in this lesson.
The yield keyword
Imagine we need to construct a sequence of numbers whose elements increase exponentially. If we want to print these numbers, the code would likely look like this:
def iterate(x0, m):
    x = x0
    while True:
        print(x)
        x *= m

iterate(1, 1.1)
print("impossible!")
As soon as we call the iterate procedure, it prints ever-increasing numbers indefinitely, because the loop has no exit condition. The call to iterate never finishes, so none of the code after it is executed.
Now imagine we need the sequence outside the iterate procedure. We can't use return instead of print(), because return would end the generation process at the very first element.
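To see why return doesn't help, here is a minimal sketch. The name iterate_with_return is made up for illustration, and the multiplier 2 is used only to keep the numbers simple. The function exits on its first pass through the loop, so every call hands back the starting value and nothing more.

```python
def iterate_with_return(x0, m):
    # Hypothetical variant of iterate, for illustration only.
    x = x0
    while True:
        return x  # leaves the function immediately
        x *= m    # never reached


first = iterate_with_return(1, 2)
second = iterate_with_return(1, 2)
print(first, second)  # each call starts over from x0
```

Calling the function again doesn't continue the sequence, because nothing about the previous call is remembered.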
We could pass in a list as an argument and have the procedure append items to it instead of printing them. However, we would never get to use that list, because the procedure never finishes, and we can't always know in advance how many iterations we'll need. To cope with the problems described above, we need a new keyword, yield:
def iterate(x0, m):
    x = x0
    while True:
        yield x  # instead of print()
        x *= m

iterate(1, 1.1)
# <generator object iterate at 0x...>
Notice that the call to iterate evaluates to a generator object, shown as <generator object iterate at 0x...>, and the program doesn't get stuck in an endless loop. Now iterate is a function rather than a procedure, because it returns a specific result.
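One way to check that calling such a function doesn't start its loop is to give the body a visible side effect. The sketch below assumes a made-up variant, noisy_iterate, that records its progress in a list named log. Both names are hypothetical and exist only for this illustration.

```python
import inspect

log = []  # hypothetical trace of what the function body has done


def noisy_iterate(x0, m):
    log.append("started")
    x = x0
    while True:
        log.append(f"yielding {x}")
        yield x
        x *= m


gen = noisy_iterate(1, 2)

print(inspect.isgeneratorfunction(noisy_iterate))  # the function contains yield
print(inspect.isgenerator(gen))                    # the call produced a generator object
print(log)                                         # nothing recorded: the body hasn't run
```

The log stays empty after the call, which suggests that the body is only executed once something starts asking the generator object for elements.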
Such functions are called generator functions. They're written using the yield keyword and return a generator object. But where are the numbers? They'll be given to us by the generator object, which in this case works as an iterator over an infinite sequence. Note the wording, "works as an iterator". In Python, many things work according to conventions, so if something behaves like an iterator, it is considered an iterator.
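As a minimal sketch of how the numbers can be pulled out, the built-in next() asks the generator object for one element at a time, and itertools.islice from the standard library takes a bounded piece of the infinite stream. The multiplier 2 is used here instead of 1.1 only to keep the values exact, and the variable names are chosen for illustration.

```python
from itertools import islice


def iterate(x0, m):
    x = x0
    while True:
        yield x
        x *= m


powers = iterate(1, 2)

# The generator object follows the iterator convention:
# iter() returns the object itself, and next() produces the next element.
same_object = iter(powers) is powers
first = next(powers)
second = next(powers)

# islice stops after five more elements, so the endless loop inside
# iterate never blocks the caller.
next_five = list(islice(powers, 5))

print(same_object, first, second, next_five)
```

Each request resumes the function right after the last yield, which is why the slice continues from where the two next() calls left off instead of starting over.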