Aggregation 2 — Python: Trees

Let's practice another kind of data aggregation over file system trees. We'll write a function that takes a directory and returns a list of its first-level subdirectories, each paired with the number of files it contains, counting files in all nested subdirectories:

from hexlet import fs

tree = fs.mkdir('/', [
    fs.mkdir('etc', [
        fs.mkdir('apache'),
        fs.mkdir('nginx', [
            fs.mkfile('nginx.conf'),
        ]),
    ]),
    fs.mkdir('consul', [
        fs.mkfile('config.json'),
        fs.mkfile('file.tmp'),
        fs.mkdir('data'),
    ]),
    fs.mkfile('hosts'),
    fs.mkfile('resolve'),
])

print(get_subdirectories_info(tree))
# => [('etc', 1), ('consul', 2)]

We can break this task down into two steps:

1. Counting the number of files inside a directory
2. Calling the file counting function on each of the subdirectories

Let's start by counting the number of files. It is a classic aggregation task:

def get_files_count(node):
    if fs.is_file(node):
        return 1
    children = fs.get_children(node)
    descendant_counts = list(map(get_files_count, children))
    return sum(descendant_counts)
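
To see the base case and the recursive case at work without installing anything, here is a minimal self-contained sketch. The `mkfile`, `mkdir`, `is_file`, and `get_children` helpers below are simplified dictionary-based stand-ins written only for this illustration; their internal shape is an assumption and not the actual `hexlet.fs` implementation.

```python
# Simplified stand-ins for the fs helpers (assumed shape, not the real library)
def mkfile(name):
    return {'name': name, 'type': 'file'}


def mkdir(name, children=None):
    return {'name': name, 'type': 'directory', 'children': children or []}


def is_file(node):
    return node['type'] == 'file'


def get_children(node):
    return node['children']


def get_files_count(node):
    # Base case: a file counts as one
    if is_file(node):
        return 1
    # Recursive case: add up the counts of all children.
    # An empty directory gives sum of nothing, which is 0.
    return sum(get_files_count(child) for child in get_children(node))


consul = mkdir('consul', [
    mkfile('config.json'),
    mkfile('file.tmp'),
    mkdir('data'),
])

print(get_files_count(consul))           # => 2
print(get_files_count(mkdir('apache')))  # => 0
```

Note that empty directories need no special handling: they have no children, so the sum is zero.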

The next step is to take the children of the source node, keep only the directories, and run the count on each of them:

def get_subdirectories_info(node):
    children = fs.get_children(node)
    # We are only interested in directories
    filtered = filter(fs.is_directory, children)
    # Running the count for each directory
    result = map(
        lambda child: (fs.get_name(child), get_files_count(child)),
        filtered,
    )
    return list(result)

In other words, we took the node's immediate children, filtered out everything that is not a directory, and then mapped each remaining directory to a tuple of its name and its file count.
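
The same filter-then-map pipeline can also be written as a single list comprehension, which some readers find easier to scan. The sketch below is an alternative phrasing and not the lesson's own solution; the dictionary-based helpers are hypothetical stand-ins for `hexlet.fs`, included only so the example runs on its own.

```python
# Hypothetical dictionary-based stand-ins for the fs helpers
def mkfile(name):
    return {'name': name, 'type': 'file'}


def mkdir(name, children=None):
    return {'name': name, 'type': 'directory', 'children': children or []}


def is_directory(node):
    return node['type'] == 'directory'


def get_name(node):
    return node['name']


def get_children(node):
    return node['children']


def get_files_count(node):
    # A file counts as one; a directory is the sum over its children
    if not is_directory(node):
        return 1
    return sum(get_files_count(child) for child in get_children(node))


def get_subdirectories_info(node):
    # The "if" clause filters, the leading expression maps
    return [
        (get_name(child), get_files_count(child))
        for child in get_children(node)
        if is_directory(child)
    ]


tree = mkdir('/', [
    mkdir('etc', [
        mkdir('apache'),
        mkdir('nginx', [mkfile('nginx.conf')]),
    ]),
    mkdir('consul', [
        mkfile('config.json'),
        mkfile('file.tmp'),
        mkdir('data'),
    ]),
    mkfile('hosts'),
    mkfile('resolve'),
])

print(get_subdirectories_info(tree))
# => [('etc', 1), ('consul', 2)]
```

The top-level files `hosts` and `resolve` do not appear in the result because only directories pass the filter.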
