Let's practice another kind of data aggregation on file system trees. We'll write a function that accepts a directory and returns a list of its first-level subdirectories, each paired with the number of files it contains, counting files in all nested subdirectories:
from hexlet import fs

tree = fs.mkdir('/', [
    fs.mkdir('etc', [
        fs.mkdir('apache'),
        fs.mkdir('nginx', [
            fs.mkfile('nginx.conf'),
        ]),
    ]),
    fs.mkdir('consul', [
        fs.mkfile('config.json'),
        fs.mkfile('file.tmp'),
        fs.mkdir('data'),
    ]),
    fs.mkfile('hosts'),
    fs.mkfile('resolve'),
])

print(get_subdirectories_info(tree))
# => [('etc', 1), ('consul', 2)]
https://repl.it/@hexlet/python-trees-search-get-subdirectories-info
We can break this task down into two steps:
- Counting the number of files inside a directory
- Calling the file counting function on each of the subdirectories
Let's start by counting the number of files. It is a classic aggregation task:
def get_files_count(node):
    if fs.is_file(node):
        return 1

    children = fs.get_children(node)
    descendant_counts = list(map(get_files_count, children))
    return sum(descendant_counts)
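To watch the recursion work without the hexlet package installed, here is a minimal sketch that models nodes as plain dictionaries. The names `make_file`, `make_dir`, and `count_files` are hypothetical stand-ins introduced for illustration; they are not part of the `fs` module.

```python
# Hypothetical stand-ins for the hexlet fs helpers: nodes are plain dicts.
def make_file(name):
    return {'name': name, 'type': 'file'}


def make_dir(name, children=None):
    return {'name': name, 'type': 'directory', 'children': children or []}


def count_files(node):
    # Base case: a file counts as one.
    if node['type'] == 'file':
        return 1
    # Recursive case: a directory contributes the sum of its children's counts.
    # An empty directory gives sum of nothing, which is 0.
    return sum(count_files(child) for child in node['children'])


consul = make_dir('consul', [
    make_file('config.json'),
    make_file('file.tmp'),
    make_dir('data'),
])

print(count_files(consul))  # => 2
```

The empty `data` directory adds nothing to the total, which is why no separate check for empty directories is needed.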
The next step is to take the children of the source node, keep only the directories, and run the count on each of them:
def get_subdirectories_info(node):
    children = fs.get_children(node)
    # We are only interested in directories
    filtered = filter(fs.is_directory, children)
    # Running the count for each directory
    result = map(
        lambda child: (fs.get_name(child), get_files_count(child)),
        filtered,
    )
    return list(result)
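In other words, we took the children of the node directly, filtered out everything that is not a directory, and then mapped each remaining directory to a tuple of its name and its file count. Files at the first level, such as hosts and resolve, never reach the counting step, so they do not appear in the result.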
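The filter-then-map pipeline can also be expressed as a single list comprehension. The sketch below is one possible variant, not the lesson's reference solution; it assumes dictionary-based nodes, and the helpers `make_file`, `make_dir`, `count_files`, and `subdirectories_info` are hypothetical names used only so the example runs on its own.

```python
# Hypothetical dict-based nodes standing in for the hexlet fs tree.
def make_file(name):
    return {'name': name, 'type': 'file'}


def make_dir(name, children=None):
    return {'name': name, 'type': 'directory', 'children': children or []}


def count_files(node):
    # A file counts as one; a directory sums the counts of its children.
    if node['type'] == 'file':
        return 1
    return sum(count_files(child) for child in node['children'])


def subdirectories_info(node):
    # The `if` clause does the filtering, the tuple expression does the mapping.
    return [
        (child['name'], count_files(child))
        for child in node['children']
        if child['type'] == 'directory'
    ]


tree = make_dir('/', [
    make_dir('etc', [
        make_dir('apache'),
        make_dir('nginx', [make_file('nginx.conf')]),
    ]),
    make_dir('consul', [
        make_file('config.json'),
        make_file('file.tmp'),
        make_dir('data'),
    ]),
    make_file('hosts'),
    make_file('resolve'),
])

print(subdirectories_info(tree))  # => [('etc', 1), ('consul', 2)]
```

Whether to use `filter` and `map` or a comprehension is mostly a matter of taste; both traverse the first-level children once and call the recursive count only on directories.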