Revision 1121965 of Connection management in HTTP/1.x

Revision slug: Web/HTTP/Connection_management_in_HTTP_1.x
Revision title: Connection management in HTTP/1.x
Revision id: 1121965
Created: Sep 21, 2016, 4:04:19 AM
Creator: bunnybooboo
Is current revision? No
Comment editorial review pipelining

Tags:

Revision Content

Connection management is a key topic in HTTP: opening and maintaining connections largely impacts the performance of Web sites and Web applications. In HTTP/1.x, there are several models: short-lived connections, persistent connections, and HTTP pipelining.

HTTP mostly relies on TCP for its transport protocol, providing a connection between the client and the server. In its infancy, HTTP used a single model to handle such connections. These connections were short-lived: a new one created each time a request needed sending, and closed once the answer had been received.

This simple model held an innate limitation on performance: opening each TCP connection is a resource-consuming operation. Several messages must be exchanged between the client and the server. Network latency and bandwidth affect performance when a request needs sending. Modern Web pages require many requests (a dozen or more) to serve the amount of information needed, proving this earlier model inefficient.

Two newer models were created in HTTP/1.1. The persistent-connection model keeps connections opened between successive requests, reducing the time needed to open new connections. The HTTP pipelining model goes one step further, by sending several successive requests without even waiting for an answer, reducing much of the latency in the network.

Compares the performance of the three HTTP/1.x connection models: short-lived connections, persistent connections, and HTTP pipelining.

HTTP/2 adds additional models for connection management.

It's important point to note that connection management in HTTP applies to the connection between two consecutive nodes, which is hop-by-hop and not end-to-end. The model used in connections between a client and its first proxy may differ from the model between a proxy and the destination server (or any intermediate proxies). The HTTP headers involved in defining the connection model, like {{HTTPHeader("Connection")}} and {{HTTPHeader("Keep-Alive")}}, are hop-by-hop headers with their values able to be changed by intermediary nodes.

Short-lived connections

The original model of HTTP, and the default one in HTTP/1.0, is short-lived connections. Each HTTP request is completed on its own connection; this means a TCP handshake happens before each HTTP request, and these are serialized.

The TCP handshake itself is time-consuming, but a TCP connection adapts to its load, becoming more efficient with more sustained (or warm) connections. Short-lived connections do not make use of this efficiency feature of TCP, and performance degrades from optimum by persisting to transmit over a new, cold connection.

This model is the default model used in HTTP/1.0 (if there is no {{HTTPHeader("Connection")}} header, or if its value is set to close). In HTTP/1.1, this model is only used when the {{HTTPHeader("Connection")}} header is sent with a value of close.

Unless dealing with a very old system, which doesn't support a persistent connection, there is no compelling reason to use this model.

Persistent connections

Short-lived connections have two major hitches: the time taken to establish a new connection is significant, and performance of the underlying TCP connection gets better only when this connection has been in use for some time (warm connection). To ease these problems, the concept of a persistent connection has been designed, even prior to HTTP/1.1. Alternatively this may be called a keep-alive connection.

A persistent connection is a one which remains open for a period, and can be reused for several requests, saving the the need for a new TCP handshake, and utilizing TCP's performance enhancing capabilities. This connection will not stay open forever: idle connections are closed after some time (a server may use the {{HTTPHeader("Keep-Alive")}} header to specify a minimum time the connection should be kept open).

Persistent connections also have drawbacks; even when idling they consume server resources, and under heavy load, {{glossary("DoS attack", "DoS attacks")}} can be conducted. In such cases, using non-persistent connections, which are closed as soon as they are idle, can provide better performance.

HTTP/1.0 connections are not persistent by default. Setting {{HTTPHeader("Connection")}} to anything other than close, usually retry-after, will make them persistent.

In HTTP/1.1, peristence is the default, and the header is no longer needed (but it is often added as a defensive measure against cases requiring a fallback to HTTP/1.0).

HTTP pipelining

HTTP pipelining is not activated by default in modern browsers:

Buggy proxies are still common and these lead to strange and erratic behaviors that Web developers cannot foresee and diagnose easily.
Pipelining is complex to implement correctly: the size of the resource being transferred, the effective RTT that will be used, as well as the effective bandwidth, have a direct incidence on the improvement provided by the pipeline. Without knowing these, important messages may be delayed behind unimportant ones. The notion of important even evolves during page layout! HTTP pipelining therefore brings a marginal improvement in most cases only.
Pipelining is subject to the HOL problem.

For these reasons, pipelining got been superseded by a better algorithm, multiplexing, that is used by HTTP/2.

By default, HTTP requests are issued sequentially. The next request is only issued once the response to the current request has been received. As they are affected by network latencies and bandwidth limitations, this can result in significant delay before the next request is seen by the server.

Pipelining is the process to send successive requests, over the same persistent connection, without waiting for the answer. This avoids latency of the connection. Theoretically, performance could also be improved if two HTTP requests were to be packed into the same TCP message. The typical MSS (Maximum Segment Size), is big enough to contain several simple requests, although the demand in size of HTTP requests continues to grow.

Not all types of HTTP requests can be pipelined: only {{glossary("idempotent")}} methods, that is {{HTTPMethod("GET")}}, {{HTTPMethod("HEAD")}}, {{HTTPMethod("PUT")}} and {{HTTPMethod("DELETE")}}, can be replayed safely: should a failure happen, the pipeline content can simply be repeated.

Today, every HTTP/1.1-compliant proxy and server should support pipelining, though many have limitations in practice: a significant reason no modern browser activates this feature by default.

Domain sharding

Unless you have a very specific immediate need, don't use this deprecated technique; switch to HTTP/2 instead. In HTTP/2, domain sharding is no more useful: the HTTP/2 connection is able to handle parallel unprioritized requests very well. Domain sharding is even detrimental to performance. Most HTTP/2 implementation use a technique called connection coalescing to revert eventual domain sharding.

As an HTTP/1.x connection is serializing requests, even if there is no real ordering between them, it can't be optimal, when the available bandwidth is large enough. To work around this, browsers open several connection to each domain and send request in parallel. Originally, they were opening 2 or 3 connections, but this increased and the most common amount of parallel connections is now 6. They can't really open more of these without triggering DoS protection on the server side.

If the server wants that the Web site or application react faster, he can trick the browsers to open more connections. Instead of having all resources on the same domain, say www.example.com it splits over several, like www1.example.com, www2.example.com, www3.example.com. Each of these domains resolve to the same server, and the browsers will open 6 connections to each of them (18 in our example). This technique is called domain sharding.

Conclusion

Better connection management allows to considerably improve performance of HTTP. With HTTP/1.1 or HTTP/1.0, using a persistent connection – at least until it gets idle – leads to the best performance. But, over the years and the failure of pipelining, better connection management models have been designed and incorporated into HTTP/2.

Revision Source

<div>{{HTTPSidebar}}</div>

<p class="summary">Connection management is a&nbsp;key topic in HTTP: opening and maintaining connections largely impacts the performance of Web sites and Web applications. In HTTP/1.x, there are several models: <em>short-lived connections</em>, <em>persistent connections</em>, and <em>HTTP pipelining. </em></p>

<p>HTTP mostly relies on TCP for its transport protocol, providing a connection between the client and the server. In its infancy, HTTP used a single model to handle such connections. These connections were short-lived: a new one created each time a request needed sending, and closed once the answer had been received.</p>

<p>This simple model held an innate limitation on performance: opening each TCP connection is a resource-consuming operation. Several messages must be exchanged between the client and the server. Network latency and bandwidth affect performance when a request needs sending. Modern Web pages require many requests (a dozen or more) to serve the amount of information needed, proving this earlier model inefficient.</p>

<p>Two newer models were created in HTTP/1.1. The persistent-connection model keeps connections opened between successive requests, reducing the time needed to open new connections. The HTTP pipelining model goes one step further, by sending several successive requests without even waiting for an answer, reducing much of the latency in the network.</p>

<p><img alt="Compares the performance of the three HTTP/1.x connection models: short-lived connections, persistent connections, and HTTP pipelining." src="https://mdn.mozillademos.org/files/13727/HTTP1_x_Connections.png" style="height:538px; width:813px" /></p>

<div class="note">
<p>HTTP/2 adds additional models for connection management.</p>
</div>

<p>It's important point to note that connection management in HTTP applies to the connection between two consecutive nodes, which is <a href="/en-US/docs/Web/HTTP/Headers#hbh">hop-by-hop</a> and not <a href="/en-US/docs/Web/HTTP/Headers#e2e">end-to-end</a>. The model used in connections between a client and its first proxy may differ from the model between a proxy and the destination server (or any intermediate proxies). The HTTP headers involved in defining the connection model, like {{HTTPHeader("Connection")}} and {{HTTPHeader("Keep-Alive")}}, are <a href="/en-US/docs/Web/HTTP/Headers#hbh">hop-by-hop</a> headers with their values able to be changed by intermediary nodes.</p>

<h2 id="Short-lived_connections">Short-lived connections</h2>

<p>The original model of HTTP, and the default one in HTTP/1.0, is <em>short-lived connections</em>. Each HTTP request is completed on its own connection; this means a TCP handshake happens before each HTTP request, and these are serialized.</p>

<p>The TCP handshake itself is time-consuming, but a TCP connection adapts to its load, becoming more efficient with more sustained (or warm) connections. Short-lived connections do not make use of this efficiency feature of TCP, and performance degrades from optimum by persisting to transmit over a new, cold connection.</p>

<p>This model is the default model used in HTTP/1.0 (if there is no {{HTTPHeader("Connection")}} header, or if its value is set to <code>close</code>). In HTTP/1.1, this model is only used when the {{HTTPHeader("Connection")}} header is sent with a value of <code>close</code>.</p>

<div class="note">
<p>Unless dealing with a very old system, which doesn't support a persistent connection, there is no compelling reason to use this model.</p>
</div>

<h2 id="Persistent_connections">Persistent connections</h2>

<p>Short-lived connections have two major hitches: the time taken to establish a new connection is significant, and performance of the underlying TCP connection gets better only when this connection has been in use for some time (warm connection). To ease these problems, the concept of a <em>persistent connection</em> has been designed, even prior to HTTP/1.1. Alternatively this may be called a <em>keep-alive connection</em>.</p>

<p>A persistent connection is a one which remains open for a period, and can be reused for several requests, saving the the need for a new TCP handshake, and utilizing TCP's performance enhancing capabilities. This connection will not stay open forever: idle connections are closed after some time (a server may use the {{HTTPHeader("Keep-Alive")}} header to specify a minimum time the connection should be kept open).</p>

<p>Persistent connections also have drawbacks; even when idling they consume server resources, and under heavy load, {{glossary("DoS attack", "DoS attacks")}} can be conducted. In such cases, using non-persistent connections, which are closed as soon as they are idle, can provide better performance.</p>

<p>HTTP/1.0 connections are not persistent by default. Setting {{HTTPHeader("Connection")}} to anything other than <code>close</code>, usually <code>retry-after</code>, will make them persistent.</p>

<p>In HTTP/1.1, peristence is the default, and the header is no longer needed (but it is often added as a defensive measure against cases requiring a fallback to HTTP/1.0).</p>

<h2 id="HTTP_pipelining">HTTP pipelining</h2>

<div class="note">
<p>HTTP pipelining is not activated by default in modern browsers:</p>

<ul>
 <li>Buggy <a href="https://en.wikipedia.org/wiki/Proxy_server">proxies</a> are still common and these lead to strange and erratic behaviors that Web developers cannot foresee and diagnose easily.</li>
 <li>Pipelining is complex to implement correctly: the size of the resource being transferred, the effective <a href="https://en.wikipedia.org/wiki/Round-trip_delay_time">RTT</a> that will be used, as well as the effective bandwidth, have a direct incidence on the improvement provided by the pipeline. Without knowing these, important messages may be delayed behind unimportant ones. The notion of important even evolves during page layout! HTTP pipelining therefore brings a marginal improvement in most cases only.</li>
 <li>Pipelining is subject to the <a href="https://en.wikipedia.org/wiki/Head-of-line_blocking">HOL</a> problem.</li>
</ul>

<p>For these reasons, pipelining got been superseded by a better algorithm, <em>multiplexing</em>, that is used by HTTP/2.</p>
</div>

<p>By default, <a href="/en/HTTP" title="en/HTTP">HTTP</a> requests are issued sequentially. The next request is only issued once the response to the current request has been received. As they are affected by network latencies and bandwidth limitations, this can result in significant delay before the next request is <em>seen</em> by the server.</p>

<p>Pipelining is the process to send successive requests, over the same persistent connection, without waiting for the answer. This avoids latency of the connection. Theoretically, performance could also be improved if two HTTP requests were to be packed into the same TCP message. The typical <a href="https://en.wikipedia.org/wiki/Maximum_segment_size">MSS</a> (Maximum Segment Size), is big enough to contain several simple requests, although the demand in size of HTTP requests continues to grow.</p>

<p>Not all types of HTTP requests can be pipelined: only {{glossary("idempotent")}} methods, that is {{HTTPMethod("GET")}}, {{HTTPMethod("HEAD")}}, {{HTTPMethod("PUT")}} and {{HTTPMethod("DELETE")}}, can be replayed safely: should a failure happen, the pipeline content can simply be repeated.</p>

<p>Today, every HTTP/1.1-compliant proxy and server should support pipelining, though many have limitations in practice: a significant reason no modern browser activates this feature by default.</p>

<h2 id="Domain_sharding">Domain sharding</h2>

<div class="note">
<p>Unless you have a very specific immediate need, don't use this deprecated technique; switch to HTTP/2 instead. In HTTP/2, domain sharding is no more useful: the HTTP/2 connection is able to handle parallel unprioritized requests very well. Domain sharding is even detrimental to performance. Most HTTP/2 implementation use a technique called <a href="I wonder if it's related to the nobash/nobreak/nopick secret exit s of Elrond's chambers.">connection coalescing</a> to revert eventual domain sharding.</p>
</div>

<p>As an HTTP/1.x connection is serializing requests, even if there is no real ordering between them, it can't be optimal, when the available bandwidth is large enough. To work around this, browsers open several connection to each domain and send request in parallel. Originally, they were opening 2 or 3 connections, but this increased and the most common amount of parallel connections is now 6. They can't really open more of these without triggering <a href="/en-US/docs/Glossary/DOS_attack">DoS</a> protection on the server side.</p>

<p>If&nbsp; the server wants that the Web site or application react faster, he can trick the browsers to open more connections. Instead of having all resources on the same domain, say <code>www.example.com</code> it splits over several, like <code>www1.example.com</code>, <code>www2.example.com</code>, <code>www3.example.com</code>. Each of these domains resolve to the <em>same</em> server, and the browsers will open 6 connections to each of them (18 in our example). This technique is called <em>domain sharding</em>.</p>

<p><img alt="" src="https://mdn.mozillademos.org/files/13783/HTTPSharding.png" style="height:727px; width:463px" /></p>

<h2 id="Conclusion">Conclusion</h2>

<p>Better connection management allows to considerably improve performance of HTTP. With HTTP/1.1 or HTTP/1.0, using a persistent connection – at least until it gets idle – leads to the best performance. But, over the years and the failure of pipelining, better connection management models have been designed and incorporated into HTTP/2.</p>

Revert to this revision