Asynchronous programming helps us use computing resources efficiently, but it creates difficulties where things used to be simple. First of all, this concerns control flow.
Imagine we have the task of reading the contents of two files and writing them to a third file (merging files):
import fs from 'fs';

// The '?' placeholders mark where the callbacks should go;
// content stands for the data to be written
fs.readFile('./first', 'utf-8', '?');
fs.readFile('./second', 'utf-8', '?');
fs.writeFile('./new-file', content, '?');
The whole task boils down to performing three operations one after another, since we can only write the new file once we've read the data from the first two.
There's only one way to arrange this kind of code: each subsequent operation must run inside the previous one's callback. That's how we build the chain of calls:
import fs from 'fs';

fs.readFile('./first', 'utf-8', (_error1, data1) => {
  fs.readFile('./second', 'utf-8', (_error2, data2) => {
    fs.writeFile('./new-file', `${data1}${data2}`, (_error3) => {
      console.log('File has been written');
    });
  });
});
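To see why the nesting is necessary, here's a naive (and broken) sketch for comparison, where all three operations start at the top level:

import fs from 'fs';

let data1;
let data2;

// These calls only start the reads; the callbacks run later
fs.readFile('./first', 'utf-8', (_error, data) => { data1 = data; });
fs.readFile('./second', 'utf-8', (_error, data) => { data2 = data; });

// This line runs before either read has finished,
// so it writes the string "undefinedundefined"
fs.writeFile('./new-file', `${data1}${data2}`, () => {});

The write starts immediately, while both variables are still undefined, which is exactly why each operation has to be launched from the previous one's callback.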
In real programs, the number of operations can be much larger. You could end up with dozens of callbacks nested inside one another.
This property of asynchronous code is often called Callback Hell: the sheer number of nested callbacks makes programs hard to analyze. There's even a website, http://callbackhell.com/, dedicated to this problem. It gives the following example:
import fs from 'fs';

// This code handles errors, which we'll discuss in the next lesson
fs.readdir(source, (err, files) => {
  if (err) {
    console.log('Error finding files: ' + err)
  } else {
    files.forEach((filename, fileIndex) => {
      console.log(filename)
      // Regular functions are used below so that `this`
      // refers to the gm object inside the callbacks
      gm(source + filename).size(function (err, values) {
        if (err) {
          console.log('Error identifying file size: ' + err)
        } else {
          console.log(filename + ' : ' + values)
          aspect = (values.width / values.height)
          widths.forEach(function (width, widthIndex) {
            height = Math.round(width / aspect)
            console.log('resizing ' + filename + ' to ' + height + 'x' + height)
            this.resize(width, height).write(dest + 'w' + width + '_' + filename, (err) => {
              if (err) console.log('Error writing file: ' + err)
            })
          }.bind(this))
        }
      })
    })
  }
})
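One remedy commonly suggested (including on that site) is to give callbacks names and move them out of the nesting. As a rough sketch, with function names made up for illustration, our file-merging example could be flattened like this:

import fs from 'fs';

const writeMerged = (data1, data2) => {
  fs.writeFile('./new-file', `${data1}${data2}`, (_error3) => {
    console.log('File has been written');
  });
};

const readSecond = (data1) => {
  fs.readFile('./second', 'utf-8', (_error2, data2) => writeMerged(data1, data2));
};

fs.readFile('./first', 'utf-8', (_error1, data1) => readSecond(data1));

The order of operations stays the same, but the code no longer grows to the right with every step.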
In some cases, we don't know in advance how many operations we'll perform. For example, you may need to read the contents of a directory and find out who owns each file (its uid). If the code were synchronous, the solution would look like this:
import path from 'path';
import fs from 'fs';

const getFileOwners = (dirpath) => {
  // Read the contents of the directory
  const files = fs.readdirSync(dirpath);
  // Get information for each file and generate the result
  return files
    .map((fname) => [fname, fs.statSync(path.join(dirpath, fname))])
    .map(([fname, stat]) => ({ filename: fname, owner: stat.uid }));
};

// For some directory, the result might look like this:
// [ { filename: 'Makefile', owner: 65534 },
//   { filename: '__tests__', owner: 65534 },
//   { filename: 'babel.config.js', owner: 65534 },
//   { filename: 'info.js', owner: 65534 },
//   { filename: 'package.json', owner: 65534 } ]
Any sequential code is pretty straightforward. Each successive line executes after the previous one finishes, and map is guaranteed to process the elements one after another.
But asynchronous code isn't that obvious. As we've discussed, reading the directory is a single operation we perform anyway. But how do we organize the analysis of the files when there can be any number of them? Unfortunately, without ready-made abstractions that simplify this task, we end up with code so convoluted that it's better never to write it in real life.
This code is for educational purposes only:
import path from 'path';
import fs from 'fs';

const getFileOwners = (dirpath, cb) => {
  fs.readdir(dirpath, (_error1, filenames) => {
    const readFileStat = (items, result = []) => {
      if (items.length === 0) {
        // Error handling hasn't been considered yet
        cb(null, result);
        return;
      }
      const [first, ...rest] = items;
      const filepath = path.join(dirpath, first);
      fs.stat(filepath, (_error2, stat) => {
        readFileStat(rest, [...result, { filename: first, owner: stat.uid }]);
      });
    };
    readFileStat(filenames);
  });
};
Let's look at the general principle.
First, we define a helper function readFileStat, which calls itself recursively from inside the fs.stat callback. Each call processes one file and shrinks the items array containing the files that haven't been processed yet. Its second parameter accumulates the result, which is eventually passed to the callback cb received as the second argument of getFileOwners.
The example above implements an iterative process built on recursive functions. To understand it better, try copying the code to your computer, adding some debug output inside it, and running it with different arguments.
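For instance, a call might look like this (a minimal sketch; the directory path and the output are only illustrations):

getFileOwners('.', (_error, owners) => {
  // Fires once readFileStat has worked through every file
  console.log(owners);
  // => [ { filename: 'Makefile', owner: 65534 }, ... ]
});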