HTML-tree — JS: Trees

Tree structures are found in many different areas, such as family trees and file systems. In this lesson, we'll look at the HTML markup tree, a structure that is common in web development.

<html>
  <body>
    <h1>Community</h1>
    <p>Discussion among Hexlet users</p>
    <hr>
    <input>
    <div class='hexlet-community'>
      <div class='text-xs-center'></div>
      <div class='fa fa-spinner'></div>
    </div>
  </body>
</html>

The root is the html tag. It is important to note that some tags can't have nested tags within them, such as hr and input.

Let's describe this tree with a structure that is easy to work with. The first step is to decide which properties each node has. At a minimum, we can distinguish name, type, class (stored as className), and children, plus content for text nodes. Real HTML nodes have many more properties, but these are enough for now. With this structure, the HTML tree above looks like this:

const htmlTree = {
  name: 'html',
  type: 'tag-internal',
  children: [
    {
      name: 'body',
      type: 'tag-internal',
      children: [
        {
          name: 'h1',
          type: 'tag-internal',
          children: [
            {
              type: 'text',
              content: 'Community',
            },
          ],
        },
        {
          name: 'p',
          type: 'tag-internal',
          children: [
            {
              type: 'text',
              content: 'Discussion among Hexlet users',
            },
          ],
        },
        {
          name: 'hr',
          type: 'tag-leaf',
        },
        {
          name: 'input',
          type: 'tag-leaf',
        },
        {
          name: 'div',
          type: 'tag-internal',
          className: 'hexlet-community',
          children: [
            {
              name: 'div',
              type: 'tag-internal',
              className: 'text-xs-center',
              children: [],
            },
            {
              name: 'div',
              type: 'tag-internal',
              className: 'fa fa-spinner',
              children: [],
            },
          ],
        },
      ],
    },
  ],
};

The main property of each node is its type. Our tree contains tags and text. Text can only be nested inside a tag and can't have children of its own, so a text node is always a leaf. Some tags are leaves too, so tags get two types: tag-internal for tags that can have children, and tag-leaf for tags that cannot. In total, our HTML tree needs three node types:

tag-internal - tags that can have children. These are internal nodes
tag-leaf - tags that cannot have children. These are leaf nodes
text - plain text. These are leaf nodes
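
As a rough illustration of how these types relate to real HTML, the sketch below picks a tag type from a tag name. The names voidTags, getTagType, and isLeaf are hypothetical helpers, not part of the lesson's code. The list is an assumption based on the void elements defined by the HTML standard.

```javascript
// Hypothetical helper: tags that the HTML standard defines as void elements,
// i.e., tags that can never have children
const voidTags = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img',
  'input', 'link', 'meta', 'source', 'track', 'wbr',
]);

// Picks one of the lesson's two tag types based on the tag name
const getTagType = (name) => (voidTags.has(name) ? 'tag-leaf' : 'tag-internal');

// A node is a leaf if it is text or a tag that cannot have children
const isLeaf = (node) => node.type === 'text' || node.type === 'tag-leaf';

console.log(getTagType('hr')); // => tag-leaf
console.log(getTagType('div')); // => tag-internal
console.log(isLeaf({ type: 'text', content: 'Community' })); // => true
```

Note that an empty div is still a tag-internal node: the type describes whether a tag can have children, not whether it currently has any.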

Now we can work with our tree. For example, let's filter out all the empty nodes. To do this, we must first decide what "empty" means for each type, because each type is filtered differently:

tag-internal - empty if it has no children or if all of its children are empty
tag-leaf - can't have children, so it is never considered empty and is always kept
text - can't have children but has content, so it is empty when its content is empty

The filtering function will look like this:

const filterEmpty = (tree) => {
  const filtered = tree.children
    .map((node) => {
      // Filter the descendants of internal nodes before filtering the current level
      if (node.type === 'tag-internal') {
        // Recursive call: it returns a copy of the node without empty descendants.
        // The current level is not processed until every nested level has been filtered
        return filterEmpty(node);
      }
      return node;
    })
    .filter((node) => {
      const { type } = node;
      // Each type is filtered differently, switch is good for this
      switch (type) {
        case 'tag-internal': {
          // At this point, the current node descendants have been filtered out (only non-empty remain)
          const { children } = node;
          // check the current node, if it's not empty, then return true (the node stays)
          return children.length > 0;
        }
        case 'tag-leaf':
          // leaf nodes are always output
          return true;
        case 'text': {
          const { content } = node;
          // A text node stays only if its content is not empty
          return !!content; // Convert to a Boolean so the callback always returns true or false
        }
      }
    });
  return { ...tree, children: filtered };
};

The filterEmpty function takes a tag-internal node as a parameter and processes the elements nested in it. First, it goes through the node's children and calls itself recursively on every tag-internal child, so nested elements are filtered too. Then it calls the filter() method, where each type is checked according to the rules defined above.
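
To see the order of operations on something smaller, here is a minimal sketch. It uses a condensed version of filterEmpty, repeated so the snippet runs on its own, and a hypothetical sample tree that is not part of the lesson.

```javascript
// Condensed version of the lesson's filterEmpty, repeated so the snippet runs on its own
const filterEmpty = (tree) => {
  const filtered = tree.children
    .map((node) => (node.type === 'tag-internal' ? filterEmpty(node) : node))
    .filter((node) => {
      switch (node.type) {
        case 'tag-internal':
          return node.children.length > 0;
        case 'tag-leaf':
          return true;
        case 'text':
          return Boolean(node.content);
        default:
          // Assumption for this sketch: unknown node types are dropped
          return false;
      }
    });
  return { ...tree, children: filtered };
};

// Hypothetical sample: <div><span></span><p>Hi</p><br></div>
const sample = {
  name: 'div',
  type: 'tag-internal',
  children: [
    { name: 'span', type: 'tag-internal', children: [] },
    { name: 'p', type: 'tag-internal', children: [{ type: 'text', content: 'Hi' }] },
    { name: 'br', type: 'tag-leaf' },
  ],
};

const result = filterEmpty(sample);
console.log(result.children.map((node) => node.name)); // => [ 'p', 'br' ]
// The original tree is not modified, because filterEmpty builds new objects
console.log(sample.children.length); // => 3
```

The empty span is removed, the p tag survives because its text has content, and the br leaf is always kept.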

After filtering, we get this tree:

{
  name: 'html',
  type: 'tag-internal',
  children: [
    {
      name: 'body',
      type: 'tag-internal',
      children: [
        {
          name: 'h1',
          type: 'tag-internal',
          children: [
            {
              type: 'text',
              content: 'Community',
            },
          ],
        },
        {
          name: 'p',
          type: 'tag-internal',
          children: [
            {
              type: 'text',
              content: 'Discussion among Hexlet users',
            },
          ],
        },
        {
          name: 'hr',
          type: 'tag-leaf',
        },
        {
          name: 'input',
          type: 'tag-leaf',
        },
      ],
    },
  ],
};

This tree no longer contains the div element with the hexlet-community class, even though that div had children. Its children were empty and were filtered out first, so by the time the div itself was checked, it had no children left. Now we can build a string from the tree:

// For convenience, let's define a separate function to generate the class attribute
// The value is quoted, so that several classes ('fa fa-spinner') stay in one attribute
const buildClass = (node) => (node.className ? ` class="${node.className}"` : '');

// Main function for building a page
const buildHtml = (node) => {
  const { type, name } = node;
  // Each type is generated in its own way, so we use switch
  switch (type) {
    case 'tag-internal': {
      // This type can have children, so we first generate the output of the children
      const childrenView = node.children.map(buildHtml).join('');
      // Then we wrap it in the opening and closing tags of the parent node
      return `<${name}${buildClass(node)}>${childrenView}</${name}>`;
    }
    case 'tag-leaf':
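      // Leaf tags have no children and no closing tag
      return `<${name}${buildClass(node)}>`;
    case 'text':
      // Text nodes are output as their content
      return node.content;
    default:
      // Fail loudly on an unknown type instead of returning undefined
      throw new Error(`Unknown node type: ${type}`);
  }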
};

// Getting a filtered tree
const filteredTree = filterEmpty(htmlTree);

// Generating the result
const html = buildHtml(filteredTree);
console.log(html); // => <html><body><h1>Community</h1><p>Discussion among Hexlet users</p><hr><input></body></html>

If you put each tag on its own line and add indentation, you can see that the resulting HTML has no empty tags:

<html>
  <body>
    <h1>Community</h1>
    <p>Discussion among Hexlet users</p>
    <hr>
    <input>
  </body>
</html>
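
The indentation itself can also be produced by a recursive function. The sketch below is one possible way to do it, not the lesson's code: formatHtml and its depth parameter are hypothetical, and class attributes are left out to keep the example short.

```javascript
// Hypothetical helper: builds indented HTML, two spaces per nesting level
const formatHtml = (node, depth = 0) => {
  const pad = '  '.repeat(depth);
  switch (node.type) {
    case 'tag-internal': {
      const { name, children } = node;
      // Tags that hold only text stay on one line, as in <h1>Community</h1>
      if (children.every((child) => child.type === 'text')) {
        const text = children.map((child) => child.content).join('');
        return `${pad}<${name}>${text}</${name}>`;
      }
      // Otherwise each child goes on its own line, one level deeper
      const inner = children.map((child) => formatHtml(child, depth + 1)).join('\n');
      return `${pad}<${name}>\n${inner}\n${pad}</${name}>`;
    }
    case 'tag-leaf':
      return `${pad}<${node.name}>`;
    case 'text':
      return `${pad}${node.content}`;
    default:
      throw new Error(`Unknown node type: ${node.type}`);
  }
};

// Hypothetical sample tree
const page = {
  name: 'body',
  type: 'tag-internal',
  children: [
    { name: 'h1', type: 'tag-internal', children: [{ type: 'text', content: 'Community' }] },
    { name: 'hr', type: 'tag-leaf' },
  ],
};

console.log(formatHtml(page));
// <body>
//   <h1>Community</h1>
//   <hr>
// </body>
```

The nesting depth is passed down through the recursive calls, so each level only needs to know its own indentation.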

https://replit.com/@hexlet/js-trees-html-en

The code for handling trees turns out short and clear. This is a consequence of a convenient structure for representing the HTML tree. Once the node types are identified, all that remains is to describe the logic for each type. Every internal node is itself the root of a smaller tree of the same shape, so the same function processes it recursively. Choosing the right structure makes processing trees significantly simpler.
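
The same pattern carries over to other tasks on this structure. As a minimal sketch, the hypothetical function collectClasses below gathers every className in a tree. The sample tree is a shortened variant of the div from this lesson.

```javascript
// Hypothetical helper: collects every className found in the tree
const collectClasses = (node) => {
  const own = node.className ? [node.className] : [];
  // Only internal tags have children; leaf tags and text nodes end the recursion
  if (node.type !== 'tag-internal') {
    return own;
  }
  // Each child is processed by the same function, and the results are merged
  return [...own, ...node.children.flatMap((child) => collectClasses(child))];
};

const tree = {
  name: 'div',
  type: 'tag-internal',
  className: 'hexlet-community',
  children: [
    { name: 'div', type: 'tag-internal', className: 'text-xs-center', children: [] },
    { name: 'input', type: 'tag-leaf' },
    { name: 'div', type: 'tag-internal', className: 'fa fa-spinner', children: [] },
  ],
};

console.log(collectClasses(tree));
// => [ 'hexlet-community', 'text-xs-center', 'fa fa-spinner' ]
```

As with filterEmpty and buildHtml, the function only describes what to do with one node of each type and leaves the rest to recursion.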

Recommended materials

Intro to HTML
