
Revision 1108215 of Connection management in HTTP/1.x

  • Revision slug: Web/HTTP/Connection_management_in_HTTP_1.x
  • Revision title: Connection management in HTTP/1.x
  • Revision id: 1108215
  • Created:
  • Creator: DarrenLester
  • Is current revision? No
  • Comment: Grammar fixes

Revision Content

{{HTTPSidebar}}

Connection management is a key topic of HTTP: opening and maintaining connections largely impacts the performance of Web sites or Web applications. In HTTP/1.x, there are several models: short-lived connections, persistent connections, and HTTP pipelining.

Most of the time, HTTP relies on TCP as the transport protocol. TCP provides a connection between the client and the server. In its infancy, HTTP provided a single model to handle these connections. Connections were short-lived: a new one was created each time a request needed to be sent, and closed once the answer had been received.

This simple model limits performance: opening a TCP connection is a time-consuming operation. Several messages have to be exchanged between the client and the server, and the latency and bandwidth of the network affect performance each time a request is sent. Modern Web pages need numerous requests (a dozen or more) to provide the information needed, so this early model has proved insufficient.

Two additional models were created in HTTP/1.1. The persistent-connection model keeps connections open between successive requests, sparing the time needed to open new connections; pipelining goes one step further by sending several requests successively without waiting for the answers, sparing a good deal of the network's latency.

Comparison of the performance of the three HTTP/1.x connection models: short-lived connections, persistent connections, and HTTP pipelining.

HTTP/2 adds further models of connection management.

An important point to keep in mind is that connection management in HTTP applies to the connection between two consecutive nodes, that is, hop-by-hop and not end-to-end. The model used between a client and its first proxy may be different from the one used between the proxy and the destination server (or any intermediate proxies). The HTTP headers involved in defining the connection model, like {{HTTPHeader("Connection")}} and {{HTTPHeader("Keep-Alive")}}, are hop-by-hop headers, and their values can be changed by intermediate nodes.
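Because these headers only apply to a single hop, an intermediary must not forward them blindly. As a minimal Python sketch (not part of the original article; the header names and values are merely illustrative), a proxy would strip the hop-by-hop headers, plus any header named in Connection, before forwarding a message:

  # Headers that are always hop-by-hop, plus anything the Connection header names,
  # must be removed by a proxy before the message is forwarded to the next node.
  HOP_BY_HOP = {"connection", "keep-alive", "proxy-connection",
                "te", "trailer", "transfer-encoding", "upgrade"}

  def strip_hop_by_hop(headers):
      # Headers listed in the Connection header are hop-by-hop for this hop only.
      named = {name.strip().lower()
               for name in headers.get("Connection", "").split(",") if name.strip()}
      return {key: value for key, value in headers.items()
              if key.lower() not in HOP_BY_HOP and key.lower() not in named}

  print(strip_hop_by_hop({"Connection": "keep-alive", "Keep-Alive": "timeout=5",
                          "Host": "example.org", "Accept": "*/*"}))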

Short-lived connections

The original model of HTTP, and the default one in HTTP/1.0, is short-lived connections. Each HTTP request is performed on its own connection; that means that there is a TCP handshake happening before each HTTP request, and these are serialized.

The TCP handshake itself is time-consuming, but a TCP connection adapts itself to its load and becomes more efficient with longer-lived connections. This model doesn't make use of this feature of TCP and performance is degraded from the optimum by always transmitting over cold connections, rather than over warm ones.

This is the default model used with HTTP/1.0 (that is, it is the one used if there is no {{HTTPHeader("Connection")}} header, or if its value is set to close). With HTTP/1.1, this model is only used if the {{HTTPHeader("Connection")}} header is sent with a value of close.
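For illustration (this snippet is not part of the original article), a minimal Python sketch using the standard-library http.client module can force the short-lived model over HTTP/1.1 by sending Connection: close explicitly; example.org is just a placeholder host:

  import http.client

  conn = http.client.HTTPConnection("example.org", 80)
  conn.request("GET", "/", headers={"Connection": "close"})
  response = conn.getresponse()
  body = response.read()
  print(response.status, len(body))

  # Because of "Connection: close", the server closes the TCP connection after
  # this response; the next request would have to pay for a brand-new handshake.
  conn.close()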

Unless dealing with a very old system that doesn't support persistent connections, there is no compelling reason to use this model.

Persistent connections

Short-lived connections have two major drawbacks: the time to establish a new connection is significant, and the performance of the underlying TCP connection gets better only when the connection has been in use for some time (a warm connection). To mitigate these problems, even before HTTP/1.1, the concept of a persistent connection was devised; it is sometimes called a keep-alive connection.

A persistent connection is a connection that stays open for some time and can be reused for several requests, sparing the need for a new TCP handshake and making better use of TCP's performance capabilities. The connection will not stay open forever: idle connections are closed after some time (a server may use the {{HTTPHeader("Keep-Alive")}} header to specify a minimum amount of time the connection should be kept open).

Persistent connections also have drawbacks: even when idling they consume server resources, and under heavy load, {{glossary("DoS attack", "DoS attacks")}} can be conducted. In such cases, using non-persistent connections, which are closed as soon as they are idle, can bring better performance.

HTTP/1.0 connections are not persistent by default. Setting {{HTTPHeader("Connection")}} to anything other than close, usually keep-alive, will make them persistent.

On the other hand, in HTTP/1.1, connections are persistent by default and the header is not needed (but it is often added so that it is taken into account in case of a fallback to HTTP/1.0).
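As a rough sketch (again not from the original article, and assuming the server honors persistence), the following Python code reuses a single HTTP/1.1 connection for several requests, so the TCP handshake is paid only once; example.org and the paths are placeholders:

  import http.client

  conn = http.client.HTTPConnection("example.org", 80)
  for path in ("/", "/style.css", "/script.js"):
      conn.request("GET", path)   # reuses the already-open TCP connection
      response = conn.getresponse()
      response.read()             # the body must be fully read before the next request
      # A server may announce how long it will keep the idle connection open.
      print(path, response.status, response.getheader("Keep-Alive"))
  conn.close()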

HTTP pipelining

HTTP pipelining is not activated by default in modern browsers:

  • Buggy proxies are still common and these lead to strange and erratic behaviors that Web developers cannot foresee and diagnose easily.
  • Pipelining is complex to implement correctly: the size of the resource being transferred, the effective RTT that will be used, and the effective bandwidth all have a direct impact on the improvement provided by the pipeline. Without knowing these, important messages may be delayed behind unimportant ones. The notion of importance even evolves during page layout! HTTP pipelining therefore brings only a marginal improvement in most cases.
  • Pipelining is subject to the head-of-line blocking (HOL) problem.

For these reasons, pipelining has been superseded by a better algorithm, multiplexing, which is used by HTTP/2.

By default, HTTP requests are issued sequentially, with the next request being issued only after the response to the current request has been completely received. Depending on network latencies and bandwidth limitations, this can result in a significant delay before the next request is seen by the server.

Pipelining is the process of sending several requests successively over the same persistent connection, without waiting for the answers. This avoids the latency of the connection. Theoretically, performance could also be improved if two HTTP requests were packed into the same TCP message. The typical MSS (Maximum Segment Size) is big enough to contain several simple requests, although the size of HTTP requests has grown over the years.
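For illustration only (browsers and most client libraries do not actually pipeline), a raw-socket Python sketch of pipelining could look like the following: two requests are written back-to-back on one connection before any response is read, with example.org as a placeholder host:

  import socket

  requests = (
      "GET / HTTP/1.1\r\nHost: example.org\r\n\r\n"
      "GET /about HTTP/1.1\r\nHost: example.org\r\nConnection: close\r\n\r\n"
  )

  with socket.create_connection(("example.org", 80)) as sock:
      sock.sendall(requests.encode("ascii"))  # both requests leave before any reply arrives
      chunks = []
      while True:
          data = sock.recv(4096)
          if not data:                        # server closed after the second response
              break
          chunks.append(data)

  # The two responses arrive in order on the same connection and would still need
  # to be parsed apart here.
  print(len(b"".join(chunks)), "bytes received")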

Not all kinds of HTTP requests can be pipelined: only {{glossary("idempotent")}} methods, that is {{HTTPMethod("GET")}}, {{HTTPMethod("HEAD")}}, {{HTTPMethod("PUT")}}, and {{HTTPMethod("DELETE")}}, can be replayed safely; if a failure happens, the pipeline content can simply be replayed.

These days, every HTTP/1.1-compliant proxy and server should support pipelining, but in practice, a lot of them have limitations: that is one reason that no modern browser activates this feature by default.

Domain sharding

Unless you have a very specific immediate need, don't use this deprecated technique; switch to HTTP/2 instead. In HTTP/2, domain sharding is no longer useful: the HTTP/2 connection is able to handle parallel unprioritized requests very well. Domain sharding is even detrimental to performance. Most HTTP/2 implementations use a technique called connection coalescing to undo any domain sharding.

As an HTTP/1.x connection serializes requests, even when there is no real ordering between them, it can't be optimal when the available bandwidth is large enough. To work around this, browsers open several connections to each domain and send requests in parallel. Originally they opened 2 or 3 connections, but this has increased, and the most common number of parallel connections is now 6. They can't really open more without triggering DoS protection on the server side.

If the server wants the Web site or application to react faster, it can trick browsers into opening more connections. Instead of having all resources on the same domain, say www.example.com, it splits them over several, like www1.example.com, www2.example.com, and www3.example.com. Each of these domains resolves to the same server, and the browser will open 6 connections to each of them (18 in our example). This technique is called domain sharding.
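As a hypothetical sketch of how a site could assign resources to shards (the hash-based scheme and hostnames below are purely illustrative; HTTP itself prescribes nothing here), each resource is mapped to a stable shard hostname so it is always fetched from the same one:

  import hashlib

  SHARDS = ["www1.example.com", "www2.example.com", "www3.example.com"]

  def shard_url(path):
      # Hash the path so a given resource always ends up on the same shard hostname.
      digest = hashlib.md5(path.encode("utf-8")).digest()
      host = SHARDS[digest[0] % len(SHARDS)]
      return "https://" + host + path

  for path in ("/img/logo.png", "/css/site.css", "/js/app.js"):
      print(shard_url(path))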

Conclusion

Better connection management considerably improves the performance of HTTP. With HTTP/1.1 or HTTP/1.0, using a persistent connection, at least until it becomes idle, leads to the best performance. However, over the years and after the failure of pipelining, better connection management models have been designed and incorporated into HTTP/2.
