HTTP Protocol

Theory: HTTP 1.0

HTTP is a text protocol used for communication between a client, for example a browser, and a server. It works like this:

The client sends a specific request to the server, requesting or passing the necessary data
The server, depending on the request, performs the necessary logic and returns the result, for example an HTML page or a redirect

To see how HTTP works, we'll send a request to the Google server and analyze what it looks like. We'll use a utility called telnet for this:

# We pass the site address and specify the TCP port
# After that, it connects to the server via the TCP protocol
telnet google.com 80

HTTP is an application layer protocol. In other words, it's designed for communication between two programs located on different computers: a client and a server. But HTTP cannot connect two remote computers on its own. Other protocols are used for this, including TCP.

The TCP protocol allows you to connect programs on remote computers, creating a channel for communicating with each other. To do so, you need to know two parameters:

The IP address of the computer to which you want to connect
The port that the required program is tied to

The telnet command above does exactly that. First it establishes a TCP connection to the given address and port. Once the connection is open, anything you type is sent to the server, so you can write the HTTP request by hand.
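The same two parameters appear when a program opens a TCP connection itself. The following is a minimal sketch in Python, not what telnet does internally. It assumes a throwaway listener on the loopback address 127.0.0.1 standing in for the remote server, so that it runs without network access. The helper name `open_channel` is hypothetical.

```python
import socket


def open_channel(ip_address: str, port: int) -> socket.socket:
    """Open a TCP connection to the program listening on ip_address:port."""
    return socket.create_connection((ip_address, port), timeout=5)


# A local listener stands in for the remote server (assumption for this demo).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
server_ip, server_port = listener.getsockname()

# The client needs exactly two things: the IP address and the port.
client = open_channel(server_ip, server_port)
peer_ip, peer_port = client.getpeername()
print(f"connected to {peer_ip}:{peer_port}")

client.close()
listener.close()
```

Against a real web server, the call would take the server's IP address and port 80 in place of the loopback values.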

At this point, a question arises. We passed the site address, but where does the IP address come from? A website address is just a name that stands for an IP address. The name exists for convenience, because it's easier for users to remember.

All network programs, including browsers and telnet, convert the site name to its IP address. This is done using DNS (the Domain Name System), another pillar of the Internet:

# How to find out the IP address using the ping utility
# The address may be different, IP addresses may change
ping google.com # 74.125.21.139

# Then you can use the address to connect to the server
telnet 74.125.21.139 80
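Programs usually perform this lookup through the operating system's resolver rather than through ping. A minimal sketch in Python, using the standard `socket.getaddrinfo` call; the helper name `resolve` is hypothetical, and the demo call passes a literal IP address so that it works without network access:

```python
import socket


def resolve(host: str, port: int = 80) -> list[str]:
    """Return the distinct IP addresses the system resolver reports for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # sockaddr is (ip, port, ...) for IPv4 and IPv6
        if ip not in addresses:
            addresses.append(ip)
    return addresses


# A literal IP address is returned as is, with no DNS query involved.
print(resolve("127.0.0.1"))
```

Calling `resolve("google.com")` requires network access, and the addresses it returns may differ between runs and locations.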

Why is the port number 80? This is a generally accepted convention. Sites served over HTTP are available on port 80, and sites served over HTTPS on port 443. That's why you normally don't have to type a port in the browser's address bar.
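The convention can be expressed as a small lookup. This is only an illustrative sketch: the function name `port_for` and the `DEFAULT_PORTS` table are hypothetical, and the code does not show how any particular browser implements it.

```python
from urllib.parse import urlsplit

# Conventional ports for the two schemes discussed in the text.
DEFAULT_PORTS = {"http": 80, "https": 443}


def port_for(url: str) -> int:
    """Return the explicit port in url, or the conventional default for its scheme."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


print(port_for("http://google.com/"))      # 80
print(port_for("https://google.com/"))     # 443
print(port_for("http://localhost:8080/"))  # 8080
```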

Now the connection is established, and the output below confirms it. On the other end is a web server, a program that handles HTTP requests to google.com:

telnet google.com 80

Trying 74.125.21.139...
Connected to google.com.
Escape character is '^]'.

After connecting, the web server waits for an HTTP request. All that's left is to send it.

What is a request?

Any request consists of several parts:

Request line: verb, request URI, and protocol version
Headers

In the request line, we specify a special word called a verb (also known as a method). HTTP has various verbs, but we won't go into details now. For now, it's enough to know that the verb tells the server how to handle the request.

In this case, we'll use the verb HEAD. It simply asks the server to give only headers, no content. GET is a more common verb. It's GET that we use to request the content of the site.

After the verb, we give the path to the resource — request URI. If we need to specify the root of the site, we use /.

After that, all we need to do is give the name of the protocol and its version. In this course, we'll only talk about HTTP 1.0 and 1.1, because this is the fundamental part of the protocol, and it's worth starting with them.

There are fundamental differences between the versions, which you need to know and understand well. Version 1.0 continues to be used for various purposes by command line utilities.

Overall, this is enough, and we do not have to do anything else for 1.0:

HEAD / HTTP/1.0

Then there are the headers. They allow you to send additional information. For example, browsers might provide information about themselves so that it's clear where the request comes from.

In addition, headers tell the server which compression formats the client supports, which format it is ready to receive the response in, and so on. The number of standard headers is quite large, and you can add your own too.

Let's take a look at what headers look like. Each one is a name and a value separated by a colon, like this: name: value.

By convention, header names are written with capitalized words, as in User-Agent, but case doesn't matter here. The order of headers is not fixed either. The server parses the request as a whole, regardless of the order in which the headers are passed.

Browsers use many different headers. For example, the header user-agent is used for analytics and adaptive design of the site for different screens or browsers. But even without it, everything should work:

HEAD / HTTP/1.0
User-Agent: google chrome

It's important to remember one key point. Since HTTP is a protocol, it has a set of rules, and you mustn't break them.

HTTP is a text-based protocol, and all its rules are simple conventions. For example, headers must be separated from each other by a line break. We can't put several of them on one line, even if we separate them with commas. The other rules are just as strict.

Here's another example. The server needs a way to determine that you've finished sending the request, so there has to be some kind of marker. In HTTP, this marker is two line breaks in a row, that is, an empty line after the last header. Once the server sees it, it treats the request as complete. That's why the server responds as soon as you press Enter twice.
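The rules described so far (a request line, one header per line, and an empty line at the end) can be sketched in a few lines of Python. The helper names `build_request` and `request_is_complete` are hypothetical. On the wire, an HTTP line break is the two-character sequence CRLF (`\r\n`), so the closing marker is `\r\n\r\n`.

```python
def build_request(method: str, path: str, headers: dict[str, str],
                  version: str = "HTTP/1.0") -> bytes:
    """Assemble a request: request line, one header per line, then an empty line."""
    lines = [f"{method} {path} {version}"]
    for name, value in headers.items():
        lines.append(f"{name}: {value}")
    # The first CRLF ends the last line, the second one is the empty line
    # that tells the server the request is finished.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def request_is_complete(received: bytes) -> bool:
    """A request without a body is complete once the empty line has arrived."""
    return b"\r\n\r\n" in received


request = build_request("HEAD", "/", {"User-Agent": "google chrome"})
print(request)
print(request_is_complete(request))                # True
print(request_is_complete(b"HEAD / HTTP/1.0\r\n"))  # False
```

The second check returns False because only one line break has been sent, which is the state the server sees while someone is still typing headers in telnet.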

What will the response be?

Let's make a HEAD request and see what comes back as a response:

telnet google.com 80

Trying 64.233.164.100...
Connected to google.com.
Escape character is '^]'.
HEAD / HTTP/1.0

HTTP/1.0 200 OK
Date: Sat, 18 Jan 2020 09:24:50 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Server: gws
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN
Set-Cookie: 1P_JAR=2020-01-18-09; expires=Mon, 17-Feb-2020 09:24:50 GMT; path=/; domain=.google.com; Secure
Set-Cookie: NID=196=wsHLMAMfnAaSyF7zduokI8TJeE5UoIKPHYC58HYH93VMnev9Nc2bAjhRdzoc4UhmuOd7ZVCorDnzGDe51yPefsRMeVyOFnYdHYYgQNqI8A1dYuk4pDK4OJurQgL4lX8kiNGSNi_kkUESFQ-MqLCB_YspxA9JRejhZdkTRtGyHNk; expires=Sun, 19-Jul-2020 09:24:50 GMT; path=/; domain=.google.com; HttpOnly
Accept-Ranges: none
Vary: Accept-Encoding

Connection closed by foreign host.

Here we got a response.

There is the status line HTTP/1.0 200 OK, which specifies the protocol and gives the response status: 200 OK.

HTTP defines many different statuses, such as 404 and 500. They show, for example, that the requested resource hasn't been found or that an error occurred on the server. Every status code comes with a short text name, which is sent right after the code. For example, 200 comes with OK, meaning the request succeeded.
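Since the status line is plain text with three space-separated parts, taking it apart is straightforward. A minimal sketch in Python, where `parse_status_line` is a hypothetical helper and `http.HTTPStatus` is the standard library's table of registered codes and their names:

```python
from http import HTTPStatus


def parse_status_line(line: str) -> tuple[str, int, str]:
    """Split a status line into protocol version, numeric code, and text name."""
    # Split on the first two spaces only: the text name may contain spaces.
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason


version, code, reason = parse_status_line("HTTP/1.0 200 OK")
print(version, code, reason)

# The standard library knows the conventional name for each registered code.
print(HTTPStatus(404).phrase)  # Not Found
print(HTTPStatus(500).phrase)  # Internal Server Error
```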

Then comes a large number of headers. There is nothing complicated about them, and you don't have to learn them all. Some are common, and their meaning is clear enough.

All headers consist of a key, a colon, and a value. You may notice that some of them relate to encoding and caching. Some headers are specific to the current server.

For example, X-XSS-Protection: 0, where the X- prefix traditionally marks a non-standard header. Web servers and browsers simply ignore headers they don't recognize, so sending such additional headers won't break anything.
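The key, colon, value structure makes headers easy to read programmatically. The sketch below is one possible approach in Python, with a hypothetical helper `parse_headers`. It lower-cases the names because case doesn't matter, and keeps a list per name because a header such as Set-Cookie can appear more than once. The sample lines are taken from the response above, with the cookie values shortened to placeholders.

```python
def parse_headers(block: str) -> dict[str, list[str]]:
    """Map lower-cased header names to the list of values seen for each name."""
    headers: dict[str, list[str]] = {}
    for line in block.splitlines():
        if not line.strip():
            continue  # skip empty lines
        # Split at the first colon only: values may contain colons themselves.
        name, _, value = line.partition(":")
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return headers


# Header lines from the response above (cookie values shortened for the demo).
raw = """\
Date: Sat, 18 Jan 2020 09:24:50 GMT
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
X-XSS-Protection: 0
Set-Cookie: 1P_JAR=placeholder; path=/; domain=.google.com; Secure
Set-Cookie: NID=placeholder; path=/; domain=.google.com; HttpOnly
"""

headers = parse_headers(raw)
print(headers["content-type"])
print(len(headers["set-cookie"]))
```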

In HTTP 1.0, the server closes the connection once it has sent the response.

At the end, we can see one interesting detail: Connection closed by foreign host. This is the server closing the connection. Servers also don't wait forever for a request: they are usually configured with a timeout, for example 30 seconds, and close the connection if nothing arrives during that time.

That is why telnet is a bit harder to work with for beginners. They type the request slowly, and the connection closes before they finish, which is a bit of a pain. So we recommend writing the request in a file first and then sending it through telnet in one go.
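The same idea, preparing the whole request in advance and sending it at once, can be sketched in Python. This is an illustration under stated assumptions, not a replacement for the telnet workflow: `send_prepared` and `serve_once` are hypothetical helpers, and a tiny stand-in server on 127.0.0.1 answers with a fixed response so the example runs without network access.

```python
import socket
import threading


def send_prepared(host: str, port: int, payload: bytes, timeout: float = 5.0) -> bytes:
    """Send the whole request at once and read until the server closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # empty read: the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


def serve_once(listener: socket.socket) -> None:
    """Stand-in server (assumption): answer one request, then close, as in HTTP 1.0."""
    conn, _ = listener.accept()
    with conn:
        received = b""
        while b"\r\n\r\n" not in received:  # wait for the empty line
            data = conn.recv(4096)
            if not data:
                break
            received += data
        conn.sendall(b"HTTP/1.0 200 OK\r\nServer: demo\r\n\r\n")


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
worker = threading.Thread(target=serve_once, args=(listener,))
worker.start()

response = send_prepared("127.0.0.1", port, b"HEAD / HTTP/1.0\r\n\r\n")
worker.join()
listener.close()
print(response.decode("ascii"))
```

Because the request leaves the client in a single call, the server's timeout never comes into play, which is the point of preparing the request beforehand.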
