In the previous lesson, we mentioned that consecutive calls to the write method append text to the end of the file. In this lesson, we continue with this topic and learn how to work with files line by line.
How to write text line-by-line
We often have an iterator that yields text line by line. You could write each line in a loop, but there's a better way: the writelines method. Here's how it works:
f = open("foo.txt", "w")
f.writelines(["cat\n", "dog\n"])
f.close()
f = open("foo.txt", "r")
print(f.read())
# => cat
# => dog
f.close()
As you can see, we wrote all the lines in the correct order. This approach is preferable when you need to write a large amount of text that you receive and process line by line. You could accumulate the entire text in a single string beforehand, but that may require a lot of memory. It's better to write the lines as they become available, and writelines is well suited to that.
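As a minimal sketch of writing lines as they become available, the example below passes a generator to writelines instead of a list. The animal_lines function and the temporary file path are assumptions made for the demo. Note that writelines does not add newline characters itself, so each string has to end with "\n" already.

```python
import os
import tempfile


def animal_lines():
    # Hypothetical generator: yields one line at a time, each ending with "\n"
    for name in ["cat", "dog", "bird"]:
        yield name + "\n"


# Throwaway file so the sketch does not touch real files (an assumption for the demo)
path = os.path.join(tempfile.mkdtemp(), "foo.txt")

f = open(path, "w")
f.writelines(animal_lines())  # accepts any iterable of strings, not only a list
f.close()

f = open(path)
content = f.read()
f.close()
print(content)
# => cat
# => dog
# => bird
```

Because the generator produces one line at a time, the full text never has to exist as a single string in the program.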
How to read text line-by-line
You can not only write to a file line by line but also read it in the same way:
f = open("foo.txt")
f.readline() # 'cat\n'
f.readline() # 'dog\n'
f.readline() # ''
f.close()
Python treats the newline character as the line separator. Each readline call moves the position to the next line, and once the text is over, all subsequent calls return an empty string.
Note that the text lines include the newline characters themselves.
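The empty string can serve as a stop signal. Here is a small sketch, assuming a temporary file with the same two lines, that reads with readline until the text is over and strips the trailing newline from each line:

```python
import os
import tempfile

# Throwaway file with known content (an assumption for the demo)
path = os.path.join(tempfile.mkdtemp(), "foo.txt")
f = open(path, "w")
f.writelines(["cat\n", "dog\n"])
f.close()

names = []
f = open(path)
line = f.readline()
while line != "":  # an empty string means there is nothing left to read
    names.append(line.rstrip("\n"))  # drop the trailing newline character
    line = f.readline()
f.close()

print(names)  # => ['cat', 'dog']
```

A blank line inside the file does not stop the loop early, because readline returns it as "\n" rather than as an empty string.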
The readline method is convenient when we want to control the reading process. However, we often want to read all the lines of text. To do that, iterate over the file object. It acts as an iterator of lines that you can read in a loop:
f = open("foo.txt")
for line in f:
    print(line)
# => cat
# =>
# => dog
# =>
f.close()
If you don't specify a mode, as in the examples above, the file opens in read mode. Convenient, right? Think about why the loop printed extra empty lines.
The file line iterator, as expected, is lazy. It reads lines only as needed and stops when there is nothing more to read.
Laziness allows you, among other things, to not read the entire file:
f = open("foo.txt")
for line in f:
    print(line)
    break
# => cat
print(f.read())
# => dog
f.close()
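The same laziness lets you consume part of a file one way and the rest another way. The sketch below assumes a hypothetical file whose first line is a header. It takes that line with the built-in next function and then iterates over the remaining lines:

```python
import os
import tempfile

# Throwaway file with a header line (hypothetical content for the demo)
path = os.path.join(tempfile.mkdtemp(), "animals.txt")
f = open(path, "w")
f.writelines(["name\n", "cat\n", "dog\n"])
f.close()

f = open(path)
header = next(f)  # consumes only the first line
rest = [line.rstrip("\n") for line in f]  # continues from the second line
f.close()

print(header.rstrip("\n"))  # => name
print(rest)  # => ['cat', 'dog']
```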
If you want all the lines of text as a list, call the readlines method. Keep in mind that it reads the whole file into memory at once.
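A short sketch of readlines, again using a temporary file as an assumption for the demo:

```python
import os
import tempfile

# Throwaway file with known content (an assumption for the demo)
path = os.path.join(tempfile.mkdtemp(), "foo.txt")
f = open(path, "w")
f.writelines(["cat\n", "dog\n"])
f.close()

f = open(path)
lines = f.readlines()  # reads every remaining line into a list
f.close()

print(lines)  # => ['cat\n', 'dog\n']
```

As with readline, each element keeps its newline character.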
How to use streaming for large files
Iterators are very convenient for stream processing of files. Stream processing means you can handle large files without holding the entire file in memory. Here's an example of a script that numbers the lines of an input file and writes them to an output file:
input_file = open("input.txt", "r")
output_file = open("output.txt", "w")
for i, line in enumerate(input_file, 1):
    output_file.write(f"{i}) {line}")
input_file.close()
output_file.close()
Save this script in a file and see how it works.
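The two halves of this lesson also combine. As a hedged variation on the script above (not a replacement for it), the sketch below feeds a generator expression to writelines, so lines are still numbered and written one at a time. The temporary directory and sample content are assumptions for the demo.

```python
import os
import tempfile

# Throwaway directory so the sketch does not touch real files (an assumption for the demo)
tmp_dir = tempfile.mkdtemp()
input_path = os.path.join(tmp_dir, "input.txt")
output_path = os.path.join(tmp_dir, "output.txt")

# Prepare a small input file
f = open(input_path, "w")
f.writelines(["cat\n", "dog\n"])
f.close()

input_file = open(input_path)
output_file = open(output_path, "w")
# The generator expression produces numbered lines lazily, one per input line
numbered = (f"{i}) {line}" for i, line in enumerate(input_file, 1))
output_file.writelines(numbered)
input_file.close()
output_file.close()

f = open(output_path)
result = f.read()
f.close()
print(result)
# => 1) cat
# => 2) dog
```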