Register to get access to free programming courses with interactive exercises

Accumulator Python: Trees

In some situations, additional information is needed when traversing the tree, depending on the node location. We cannot obtain this information from the node description because the node doesn't contain it. We should collect this information during the crawl. This information may include the full path to the file or the depth of the current node. Any given node won't know where it is. The file location in the file structure depends on the nodes that lead to a particular file.

In this lesson, we'll look at the accumulator, a parameter that collects the necessary data during tree traversal. It may further complicate code, but it's impossible to perform these tasks without it. We will take this task as an example and try to find all empty directories in our file system. First, we will implement a simple version, complicate it, and introduce an accumulator. Here's an example of the file system:

from hexlet import fs

tree = fs.mkdir('/', [
    fs.mkdir('etc', [
        fs.mkdir('apache'),
        fs.mkdir('nginx', [
            fs.mkfile('nginx.conf'),
        ]),
        fs.mkdir('consul', [
            fs.mkfile('config.json'),
            fs.mkdir('data'),
        ]),
    ]),
    fs.mkdir('logs'),
    fs.mkfile('hosts'),
])

There are three empty directories in this structure:

  • /logs
  • /etc/apache
  • /etc/consul/data

The code that does it looks like this:

def find_empty_dir_paths(tree):
    name = fs.get_name(tree)
    # Getting the children of the node
    children = fs.get_children(tree)
    # Returning the directory if there are no children
    if len(children) == 0:
        return name
    # Filtering the files because we do not need them
    dir_names = filter(fs.is_directory, children)
    # Looking for empty directories inside the current one
    empty_dir_names = list(map(find_empty_dir_paths, dir_names))
    # Flattening the list
    return fs.flatten(empty_dir_names)


print(find_empty_dir_paths(tree))
# => ['apache', 'data', 'logs']

https://repl.it/@hexlet/python-trees-search-find-empty-dirs

The most unusual thing in this implementation is the flatten() function. Why is it needed? If you leave only map(), then the result may be surprising:

find_empty_dir_paths(tree)
# [['apache', [], ['data']], 'logs']

It happens due to the list returned at each nesting level. The output is a list of lists that resembles the structure of the original file tree. To prevent this from happening, you need to correct the code using flatten().

Let us try to complicate the task. We will find all the empty directories with a maximum search depth of 2 levels. In other words, the directories /logs and /etc/apache fit this condition, but /etc/consul/data does not. First, you need to understand where to get the depth. Python calculates tree depth as the number of edges from the root to the desired node. It is easy to calculate visually, but what about the code? We can represent the depth of a particular node as the depth of the previous node plus one.

The next step is to add a variable we passed at each recursive call falling into the directory. This variable, in the case of our task, contains the current depth within itself. It means we add a unit at each level inside each directory. This variable is an accumulator since it accumulates data.

The only problem is that the original function find_empty_dir_paths() has node as one parameter. We cannot pass the depth to all nested directories and files using it.

We will have to introduce an internal function that will be able to throw the accumulator further up the tree:

def find_empty_dir_paths(tree):
    # There is an internal function that can transfer the battery
    # The accumulator is `depth`, a variable containing the current depth
    def walk(node, depth):
        name = fs.get_name(node)
        children = fs.get_children(node)
        # If there are no children, we return the directory
        if len(children) == 0:
            return name
        # There is no point in looking any further
        # if it is the second nesting level and the directory is not empty
        if depth == 2:
            # We got an empty list because the outside is flat
            # It reveals blank lists
            return []
        # Leaving only the directories
        dir_paths = filter(fs.is_directory, children)
        # Don't forget to increase the depth
        output = list(map(
            lambda child: walk(child, depth + 1),
            dir_paths,
          ))
        # Straightening the list before returning it
        return fs.flatten(output)

    # We start with a depth of `0`
    return walk(tree, 0)


print(find_empty_dir_paths(tree))
# => ['apache', 'logs']

https://repl.it/@hexlet/python-trees-accumulator-find-empty-dirs-with-depth

You can go even further and allow it to specify the maximum depth from outside:

def find_empty_dir_paths(tree, max_depth=2):
    # ...

But how to make it to view the whole tree by default? For example, you can take a deliberately large number and make it the default value. This approach will work, but it is more of a workaround.

The correct way to do this is to use infinity as the default value:

def find_empty_dir_paths(tree, max_depth=math.inf):
    # ...

Are there any more questions? Ask them in the Discussion section.

The Hexlet support team or other students will answer you.

For full access to the course you need a professional subscription.

A professional subscription will give you full access to all Hexlet courses, projects and lifetime access to the theory of lessons learned. You can cancel your subscription at any time.

Get access
130
courses
1000
exercises
2000+
hours of theory
3200
tests

Sign up

Programming courses for beginners and experienced developers. Start training for free

  • 130 courses, 2000+ hours of theory
  • 1000 practical tasks in a browser
  • 360 000 students
By sending this form, you agree to our Personal Policy and Service Conditions

Our graduates work in companies:

Bookmate
Health Samurai
Dualboot
ABBYY
Suggested learning programs
profession
new
Developing web applications with Django
10 months
from scratch
under development
Start at any time

Use Hexlet to the fullest extent!

  • Ask questions about the lesson
  • Test your knowledge in quizzes
  • Practice in your browser
  • Track your progress

Sign up or sign in

By sending this form, you agree to our Personal Policy and Service Conditions
Toto Image

Ask questions if you want to discuss a theory or an exercise. Hexlet Support Team and experienced community members can help find answers and solve a problem.