In this lesson, we will discuss character classes.
A character class is a notation that matches any single character from a given set.
Let us look at a simple example of how character classes work. Suppose we only need to find letters. To do this, describe the set of allowed characters in square brackets, for example, the letters of the English alphabet:
We can see that all alphabetical characters in the string are highlighted:
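As a minimal sketch, the idea can be tried in Python with the standard re module. The lesson itself does not name a language, so the choice of Python, the sample text, and the pattern [a-zA-Z] are assumptions made for illustration.

```python
import re

# Hypothetical sample text, not taken from the lesson.
text = "Regex 101: fun!"

# [a-zA-Z] matches any single lowercase or uppercase English letter.
letters = re.findall(r"[a-zA-Z]", text)
print(letters)  # ['R', 'e', 'g', 'e', 'x', 'f', 'u', 'n']
```

The space, the digits, the colon, and the exclamation mark are skipped because they are not listed in the class.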
You can search for numbers from zero to nine in the same way:
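A short sketch of a digit search, assuming Python's re module and the range [0-9] as the class in question. The string is the sample that appears in this lesson.

```python
import re

# Sample string from this lesson.
text = "a 11_34-1938 t"

# [0-9] matches any single digit from zero to nine.
digits = re.findall(r"[0-9]", text)
print(digits)  # ['1', '1', '3', '4', '1', '9', '3', '8']
```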
And in this example, we specify just two characters, each of which will be found:
a 11_34-1938 t
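The two characters used in the original example are not preserved here, so the sketch below assumes the class [1t] purely for illustration. It shows that each listed character is matched on its own, wherever it occurs in the string.

```python
import re

text = "a 11_34-1938 t"

# [1t] is an assumed class: it matches either the digit 1 or the letter t.
found = re.findall(r"[1t]", text)
print(found)  # ['1', '1', '1', 't']
```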
With character classes, you can use a mechanism called negation, which inverts the search.
To negate a class, put the character ^ right after the opening square bracket, before the first character of the set. The search then finds every character except those listed after ^.
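Negation can be sketched as follows, assuming Python's re module and the negated class [^0-9], which is chosen here only as an example.

```python
import re

text = "a 11_34-1938 t"

# [^0-9] matches any single character that is NOT a digit.
non_digits = re.findall(r"[^0-9]", text)
print(non_digits)  # ['a', ' ', '_', '-', ' ', 't']
```

Note that the spaces, the underscore, and the hyphen are all returned, because the negated class accepts everything outside the listed range.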
If we need to find a hyphen as well as letters, we put the hyphen at the beginning or at the end of the set. That way, the hyphen is treated as a literal character and not as the range operator:
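A small sketch of both placements, assuming Python's re module. The classes [a-z-] and [-a-z] are illustrative choices; in both, the hyphen between a and z still defines a range, while the hyphen at the edge is literal.

```python
import re

text = "a 11_34-1938 t"

# Hyphen at the end of the set: a literal hyphen plus the range a-z.
hyphen_last = re.findall(r"[a-z-]", text)
# Hyphen at the start of the set: same meaning.
hyphen_first = re.findall(r"[-a-z]", text)

print(hyphen_last)                  # ['a', '-', 't']
print(hyphen_first == hyphen_last)  # True
```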
Regular expressions often use special predefined character classes. They are written using the backslash \ and have their own designations in the regular expression language.
In the previous lesson, we used \ as an escape character. Here we also use it as part of the notation.
Let us find all the digits in the text using \d:
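A minimal sketch of the predefined digit class, assuming Python's re module. For an ASCII string like the one below, \d gives the same result as [0-9]; in Python 3 string patterns, \d can also match decimal digits from other scripts.

```python
import re

text = "a 11_34-1938 t"

# \d is the predefined class for digits.
digits_short = re.findall(r"\d", text)
print(digits_short)  # ['1', '1', '3', '4', '1', '9', '3', '8']

# For this ASCII-only text, the explicit range finds the same characters.
print(digits_short == re.findall(r"[0-9]", text))  # True
```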
If we specify an uppercase \D, the search will retrieve all other characters, including spaces and tabs:
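The inverse class can be sketched the same way, again assuming Python's re module. The second string, with a tab in it, is a hypothetical addition that shows whitespace being matched.

```python
import re

text = "a 11_34-1938 t"

# \D matches every character that is not a digit.
others = re.findall(r"\D", text)
print(others)  # ['a', ' ', '_', '-', ' ', 't']

# Hypothetical string with a tab and a space between digits.
print(re.findall(r"\D", "1\t2 3"))  # ['\t', ' ']
```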
There are also:
\s, which helps search for whitespace characters
\S, representing all non-whitespace characters
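Both whitespace classes can be tried in a short sketch, assuming Python's re module. The string with a tab and a newline is hypothetical and is there only to show that \s covers more than the space character.

```python
import re

text = "a 11_34-1938 t"

# \s matches whitespace characters; this text contains two spaces.
spaces = re.findall(r"\s", text)
print(spaces)  # [' ', ' ']

# \S matches everything that is not whitespace.
non_spaces = "".join(re.findall(r"\S", text))
print(non_spaces)  # a11_34-1938t

# Tabs and newlines are whitespace too (hypothetical string).
mixed = re.findall(r"\s", "x\ty\nz")
print(mixed)  # ['\t', '\n']
```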
As we can see, the principle is simple. A lowercase letter denotes a class, and the same letter in uppercase represents everything that does not belong to that class.
There is another popular class, \w. It includes all letters of the alphabet, all digits, and the underscore. Whitespace characters do not belong to this class, nor do punctuation characters such as the hyphen.
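For ASCII text, \w is equivalent to this notation: [0-9a-zA-Z_]. Some engines also extend \w to letters and digits outside ASCII. Note that searches in character ranges are case-sensitive, so a-z is followed by A-Z to cover both lowercase and uppercase letters.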
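The equivalence can be checked with a small sketch, assuming Python's re module. In Python 3, \w also matches non-ASCII letters by default, so the re.ASCII flag is used here to restrict it to the set described above.

```python
import re

text = "a 11_34-1938 t"

# With re.ASCII, \w matches only [0-9a-zA-Z_].
word_chars = re.findall(r"\w", text, flags=re.ASCII)
explicit = re.findall(r"[0-9a-zA-Z_]", text)

print("".join(word_chars))     # a11_341938t
print(word_chars == explicit)  # True

# Ranges are case-sensitive: [a-z] alone skips uppercase letters.
lower_only = re.findall(r"[a-z]", "aBc")
print(lower_only)  # ['a', 'c']
```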
\W searches for the opposite of \w. So we can find hyphens and whitespace characters:
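A minimal sketch of the inverse class, assuming Python's re module and the sample string from this lesson.

```python
import re

text = "a 11_34-1938 t"

# \W matches every character that is not a letter, digit, or underscore.
non_word = re.findall(r"\W", text)
print(non_word)  # [' ', '-', ' ']
```

The underscore is not in the result because it belongs to \w.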