HTTP is a text-based protocol that a client, for example a browser, uses to communicate with a server.
To see how HTTP works, we'll send a request to the Google server and analyze what it looks like. To do so, we'll use a utility called telnet. Here is how an HTTP request made with telnet begins:
# We pass the site address and specify the TCP port
# After that, it connects to the server via the TCP protocol
telnet google.com 80
HTTP is an application layer protocol. In other words, it's designed to communicate between two programs, located on different computers — client and server. But HTTP cannot connect two remote computers on its own. Other protocols are used for this, including TCP.
TCP lets you connect programs on remote computers by creating a channel through which they communicate. To do so, you need to know two parameters: the IP address and the port.
The telnet command above does exactly that. First, it establishes a TCP connection. Then, if you provided the correct IP address and port, it switches to a mode where you can exchange HTTP messages.
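The TCP step that telnet performs can be sketched in Python. This is a minimal illustration, not what telnet does internally; the function name open_tcp_channel and the 5-second timeout are assumptions made for the example.

```python
import socket


def open_tcp_channel(ip_address, port, timeout=5.0):
    """Open a TCP connection using the two parameters: IP address and port."""
    # create_connection performs the TCP handshake and returns a connected socket
    return socket.create_connection((ip_address, port), timeout=timeout)
```

Calling open_tcp_channel with a server's IP address and port 80 would give a socket over which an HTTP request can then be written.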
And at this point, some questions arise. We passed the site address, but where does the IP address come from? Any website address is just a name that covers its IP address. The name is set for convenience, to make it easier to remember for users.
All network programs, including browsers and telnet, convert the site name to its IP address. This is done using the DNS system, another pillar of the Internet:
# How to find out the IP address using the ping utility
# The address may be different, IP addresses may change
ping google.com
# 220.127.116.11
# Then you can use the address to connect to the server
telnet 18.104.22.168 80
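The same name-to-address lookup can be done from a program. The sketch below uses Python's standard socket.gethostbyname, which asks the system resolver for one IPv4 address; the wrapper name resolve is an assumption made for the example.

```python
import socket


def resolve(hostname):
    """Return one IPv4 address for the given site name as a dotted string."""
    # The system resolver (usually backed by DNS) does the actual lookup.
    # If an IPv4 address is passed instead of a name, it is returned unchanged.
    return socket.gethostbyname(hostname)
```

For a name such as google.com the result depends on when and where the lookup runs, so no fixed output is shown here.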
Why is the port number 80? This is a generally accepted convention. Sites served over HTTP are available on port 80, and sites served over HTTPS on port 443. Browsers use these defaults, which is why you don't have to specify the port in the address bar.
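The convention above can be expressed as a small lookup. This is only a sketch of the idea, not how any particular browser implements it; the names DEFAULT_PORTS and port_for are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Conventional default ports for the two schemes mentioned in the text
DEFAULT_PORTS = {"http": 80, "https": 443}


def port_for(url):
    """Return the port from the URL, or the scheme's default if none is given."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]
```

With this helper, port_for("http://google.com/") gives 80, while an address with an explicit port, such as http://google.com:8080/, gives 8080.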
Now the connection has been established, and we can see that we're connected to a web server. A web server is a program that serves HTTP requests, in this case requests to google.com:
telnet google.com 80
Trying 22.214.171.124...
Connected to google.com.
Escape character is '^]'.
After connecting, the web server starts to wait for the HTTP request. All that's left is to send it.
Any request consists of several parts: a request line, headers, and an optional body.
In the request line we specify a special word, or a verb. HTTP has various verbs, but we won't go into details now. Let's just say that they determine how to respond to this request.
In this case, we'll use the verb HEAD. It simply asks the server to give only headers, no content. GET is a more common verb. It's GET that we use to request the content of the site.
After the verb, we give the path to the resource, the request URI. If we need to specify the root of the site, we use /.
After that, all we need to do is give the name of the protocol and its version. In this course, we'll only talk about HTTP 1.0 and 1.1, because this is the fundamental part of the protocol, and it's worth starting with them.
There are fundamental differences between the versions, which you need to know and understand well. Version 1.0 continues to be used for various purposes by command line utilities.
Overall, this is enough, and we do not have to do anything else for 1.0:
HEAD / HTTP/1.0
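The request line is just three values separated by spaces: the verb, the request URI, and the protocol with its version. A minimal sketch, where the function name request_line and its defaults are assumptions made for the example:

```python
def request_line(verb, uri="/", version="1.0"):
    """Build an HTTP request line: verb, request URI, protocol and version."""
    return f"{verb} {uri} HTTP/{version}"
```

Calling request_line("HEAD") produces the same line as shown above, and request_line("GET", "/", "1.1") produces the line for a GET request under HTTP 1.1.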
Then there are the headers. They allow you to send additional information. For example, browsers might provide information about themselves so that it's clear where the request comes from.
In addition, headers state which compression formats the client supports, which format it's ready to receive the response in, and so on. The number of standard headers is quite large, and you can add your own too.
Let's take a look at what headers look like. We specify a name and a value, separated by a colon.
Usually, header names are capitalized, but the case isn't important. The order of headers doesn't matter either: the server parses all the headers together, regardless of the order in which we pass them.
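To illustrate why the case of a header name doesn't matter, here is a sketch of how a program might read a single header line. The function name parse_header is an assumption made for the example, and lowercasing the name is one common way to make comparisons case-insensitive.

```python
def parse_header(line):
    """Split 'Name: value' at the first colon and normalize the name's case."""
    name, _, value = line.partition(":")
    # Lowercasing the name makes 'User-Agent' and 'user-agent' equivalent
    return name.strip().lower(), value.strip()
```

Both "User-Agent: google chrome" and "user-agent:google chrome" are read as the same name and value.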
Browsers use many different headers. For example, the User-Agent header is used for analytics and for adapting the site's design to different screens or browsers. But even without it, everything should work:
HEAD / HTTP/1.0
User-Agent: google chrome
It's important to remember one key point. Since HTTP is a protocol, it has a set of rules, and you mustn't break them.
HTTP is a text-based protocol, and all its rules are simple agreements. For example, headers must be separated from each other by a line break. We can't put them on one line, even if we separate them with commas. The other rules are just as strict.
Here's another example. The server needs a way to determine that you've finished transferring data, so there has to be some kind of marker. In HTTP, this marker is two line breaks in a row. After that, the server assumes that all the data has been sent and no more will follow. Essentially, the two line breaks are what triggers the server's response.
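The separation rules above can be sketched as a function that assembles the text of a request. On the wire, HTTP line breaks are the two-character sequence CR LF, so the closing marker is an empty line, that is, two CRLF sequences in a row. The function name build_request is an assumption made for the example.

```python
CRLF = "\r\n"


def build_request(request_line, headers=None):
    """Join the request line and headers, one per line, and add the end marker."""
    lines = [request_line]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # The first CRLF ends the last line; the second one is the empty line
    # that tells the server the request is complete
    return (CRLF.join(lines) + CRLF + CRLF).encode("ascii")
```

A request without headers comes out as the request line followed directly by the two line breaks.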
Let's make a HEAD request and see what the server returns:
telnet google.com 80
Trying 126.96.36.199...
Connected to google.com.
Escape character is '^]'.
HEAD / HTTP/1.0

HTTP/1.0 200 OK
Date: Sat, 18 Jan 2020 09:24:50 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Server: gws
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN
Set-Cookie: 1P_JAR=2020-01-18-09; expires=Mon, 17-Feb-2020 09:24:50 GMT; path=/; domain=.google.com; Secure
Set-Cookie: NID=196=wsHLMAMfnAaSyF7zduokI8TJeE5UoIKPHYC58HYH93VMnev9Nc2bAjhRdzoc4UhmuOd7ZVCorDnzGDe51yPefsRMeVyOFnYdHYYgQNqI8A1dYuk4pDK4OJurQgL4lX8kiNGSNi_kkUESFQ-MqLCB_YspxA9JRejhZdkTRtGyHNk; expires=Sun, 19-Jul-2020 09:24:50 GMT; path=/; domain=.google.com; HttpOnly
Accept-Ranges: none
Vary: Accept-Encoding

Connection closed by foreign host.
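The whole telnet session above can be approximated in a few lines of Python. This is a sketch under assumptions: the function name head_request, the 5-second timeout, and the choice to decode the reply as ISO-8859-1 are made for the example. It sends the same HEAD request and reads until the server closes the connection.

```python
import socket


def head_request(host, port=80, timeout=5.0):
    """Send 'HEAD / HTTP/1.0' and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        # Request line followed by the empty line that ends the request
        conn.sendall(b"HEAD / HTTP/1.0\r\n\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("iso-8859-1")
```

Calling head_request("google.com") should return text that starts with a status line and continues with headers, though the exact headers will differ from the session shown above.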
Here we got a response.
It starts with the status line HTTP/1.0 200 OK, which specifies the protocol version and gives the response status.
HTTP defines many different statuses, such as 400 and 500. They show that the information hasn't been found, that there are errors on the server, and so on. Every status has a mnemonic name, which is passed as the last value in the status line. For example, 200 with the name OK means everything went well.
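The three parts of a status line can be pulled apart with a single split. The sketch below also uses Python's standard http.HTTPStatus to look up the mnemonic name for a code; the function names parse_status_line and standard_reason are assumptions made for the example.

```python
from http import HTTPStatus


def parse_status_line(line):
    """Split a status line into protocol version, numeric code, and name."""
    # Split on the first two spaces only: the name itself may contain spaces
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


def standard_reason(code):
    """Return the standard mnemonic name for a status code."""
    return HTTPStatus(code).phrase
```

For the response above, parse_status_line("HTTP/1.0 200 OK") gives the version HTTP/1.0, the code 200, and the name OK.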
Then a large number of different headers follow. There is nothing complicated about them, and you don't have to learn them all. The common ones are clear enough.
Every header consists of a key, a colon, and a value. You may notice that some of them relate to encoding and caching. Some headers are specific to the current server, for example X-XSS-Protection: 0, where the X- prefix indicates a custom header. No web server or web browser will crash when such additional headers are sent.
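Since every header is a key, a colon, and a value, a response's headers can be collected into a dictionary. The sketch below keeps a list of values per name, because a header such as Set-Cookie can appear more than once, as it does in the response above. The function name parse_headers is an assumption made for the example.

```python
def parse_headers(lines):
    """Collect header lines into a dict mapping lowercase names to value lists."""
    headers = {}
    for line in lines:
        # Split at the first colon only: values may contain colons themselves
        name, _, value = line.partition(":")
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return headers
```

Unknown or custom headers such as X-XSS-Protection are stored like any other, which matches the point that extra headers don't break anything.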
In HTTP 1.0, the server closes the connection once the response has been sent.
That explains one interesting detail at the end of the session: Connection closed by foreign host. Servers also usually set an idle timeout, often about 30 seconds, and close the connection if nothing arrives during that time.
That is why telnet is a bit harder to work with for beginners. They type the request slowly, and the connection closes before they finish, which is a bit of a pain. So we recommend writing the request in a file first and then sending it through telnet.
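If the request is prepared in a file, a small helper can turn it into the exact bytes to send, so nothing has to be typed against the clock. This is a sketch only; the function name load_request is an assumption made for the example, and it assumes a request without a body, with one line per request line or header.

```python
from pathlib import Path


def load_request(path):
    """Read a prepared request from a text file and return wire-ready bytes."""
    lines = Path(path).read_text(encoding="ascii").splitlines()
    # Drop trailing blank lines so that exactly one empty line ends the request
    while lines and lines[-1] == "":
        lines.pop()
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

The returned bytes can then be written to an open TCP connection in one call, well within any server timeout.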