Please note, this is a STATIC archive of website developer.mozilla.org from 03 Nov 2016, cach3.com does not collect or store any user information, there is no "phishing" involved.

Revision 1108375 of Content negotiation

  • Revision slug: Web/HTTP/Content_negotiation
  • Revision title: Content negotiation
  • Revision id: 1108375
  • Created:
  • Creator: teoli
  • Is current revision? No
  • Comment

Revision Content

{{HTTPSidebar}}

In HTTP, content negotiation is the mechanism used, when several equivalent representations of a given resource can be served, to provide the one best suited to the end user.

A specific document is called a resource. When a client wants to obtain it, it requests it using its URL. The server uses this URL to choose one of the variants it can provide (each variant being called a representation) and returns this specific representation to the client. The overall resource, as well as each of the representations, has a specific URL. How a specific representation is chosen when the resource is requested is determined by content negotiation, and there are several ways of doing it.

The determination of the best suited representation is made through one of two mechanisms:

  • Specific HTTP headers sent by the client (server-driven negotiation or proactive negotiation), which is the standard way of negotiating a specific kind of resource.
  • The {{HTTPStatus("300")}} Multiple Choices or {{HTTPStatus("406")}} Not Acceptable HTTP response codes sent by the server (agent-driven negotiation or reactive negotiation), which are used as a fallback mechanism but are very rare.

Over the years, other content negotiation proposals, like transparent content negotiation and the Alternates header, have been made. They failed to get traction and were abandoned.

Server-driven content negotiation

In server-driven content negotiation, or proactive content negotiation, the browser (or any other kind of user agent) sends several HTTP headers along with the URL. These headers describe the user's preferred choice. The server uses them as hints, and an internal algorithm lets it choose the best content to serve to the client. The algorithm is server-specific and not defined in the standard. See, for example, the Apache 2.2 negotiation algorithm.

The HTTP/1.1 standard defines the list of standard headers that start server-driven negotiation ({{ httpheader("Accept") }}, {{ httpheader("Accept-Charset") }}, {{ httpheader("Accept-Encoding") }}, {{ httpheader("Accept-Language") }}). Although, strictly speaking, {{ httpheader("User-Agent") }} is not in this list, it is sometimes also used to send a specific representation of the requested resource, though this is not considered good practice. The server uses the {{HTTPHeader("Vary")}} header to indicate which headers it actually used for content negotiation (or, more precisely, the associated request headers), so that caches can work optimally.

In addition to these, there is an experimental proposal, called client hints, to add more headers to the list of available headers. Client hints advertise what kind of device the user agent runs on (for example, a desktop computer or a mobile device).

Even if server-driven content negotiation is the most common way to agree on a specific representation of a resource, it has several drawbacks:

  • The server doesn't have total knowledge of the browser. Even with the Client Hints extension, it does not have complete knowledge of the capabilities of the browser. Unlike reactive content negotiation, where the client makes the choice, the server's choice is always somewhat arbitrary.
  • The information sent by the client is quite verbose (HTTP/2 header compression mitigates this problem) and is a privacy risk (HTTP fingerprinting).
  • As several representations of a given resource are sent, shared caches are less efficient and server implementations are slightly more complex.

The Accept header

The {{HTTPHeader("Accept")}} header lists the MIME types of the media that the agent is willing to process. It is a comma-separated list of MIME types, each optionally combined with a quality factor, a parameter giving the relative degree of preference between the different MIME types.
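
As an illustration of how such a list can be read, here is a minimal sketch that splits a header value into items and quality factors and orders them by preference. The function name `parse_quality_list` is hypothetical, and the sketch makes simplifying assumptions: parameters other than `q` are ignored, a missing `q` counts as 1, and a malformed `q` is treated as 0.

```python
def parse_quality_list(header_value):
    """Parse a comma-separated header value (for example an Accept header)
    into (item, q) pairs sorted from most to least preferred."""
    entries = []
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        pieces = [p.strip() for p in part.split(";")]
        item = pieces[0]
        q = 1.0  # the quality factor defaults to 1 when it is not given
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: a malformed q means "not acceptable"
        entries.append((item, q))
    # sorted() is stable, so items with equal q keep their header order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


preferences = parse_quality_list("text/html, application/xml;q=0.9, */*;q=0.8")
```

A real server would additionally match wildcards such as `*/*` against the media types it can actually produce; that step is left out here.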

The {{HTTPHeader("Accept")}} header is defined by the browser, or any other user agent, and can vary according to the context, like fetching an HTML page or an image, a video, or a script: it is different when fetching a document entered in the address bar or an element linked via an {{ HTMLElement("img") }}, {{ HTMLElement("video") }} or {{ HTMLElement("audio") }} element. Browsers are free to use the value of the header that they think is the most adequate; an exhaustive list of default values for common browsers is available.

The Accept-CH header {{experimental_inline}}

This is part of an experimental technology called Client Hints, currently implemented only in Chrome 46 and later.

The experimental {{HTTPHeader("Accept-CH")}} header lists configuration data that can be used by the server to select an appropriate response. Valid values are:

Value           Meaning
DPR             Indicates the client's device pixel ratio.
Viewport-Width  Indicates the layout viewport width in CSS pixels.
Width           Indicates the resource width in physical pixels (in other words, the intrinsic size of an image).

The Accept-Charset header

The {{HTTPHeader("Accept-Charset")}} header indicates to the server which character encodings are understood by the user agent. Traditionally, it was set to a different value for each browser locale, like ISO-8859-1,utf-8;q=0.7,*;q=0.7 for a Western European locale.

UTF-8 is now well supported and is the preferred way of encoding characters, and sending this header exposes more entropy (see configuration-based entropy), so most browsers omit it: Internet Explorer 8, Safari 5, Opera 11 and Firefox 10 all stopped sending this header.

The Accept-Encoding header

The {{HTTPHeader("Accept-Encoding")}} header defines the acceptable content encodings, mainly the supported compression algorithms. The value is a q-factor list, like br, gzip;q=0.8, that indicates the priority of the encoding values. The default value identity is at the lowest priority, unless otherwise declared.

Compressing HTTP messages is one of the most important ways to improve the performance of a Web site: it shrinks the size of the data transmitted and makes better use of the available bandwidth. Browsers always send this header, and the server should be configured to honor it and use compression.
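
The selection a server might make from this header can be sketched as follows. This is only an illustration under stated assumptions, not the algorithm of any particular server: the function name `choose_encoding` and the `supported` list are hypothetical, and `identity` is treated as an implicit last resort unless the client gives it (or `*`) a quality factor of 0.

```python
def choose_encoding(accept_encoding, supported):
    """Return the supported content encoding the client prefers most,
    "identity" as a last resort, or None if nothing is acceptable."""
    prefs = {}
    for part in accept_encoding.split(","):
        token, _, params = part.strip().partition(";")
        token = token.strip().lower()
        if not token:
            continue
        q = 1.0  # default quality factor
        name, _, value = params.strip().partition("=")
        if name.strip().lower() == "q":
            try:
                q = float(value)
            except ValueError:
                q = 0.0  # assumption: malformed q means "not acceptable"
        prefs[token] = q

    def q_for(coding):
        # An explicit entry wins; otherwise fall back to the "*" wildcard.
        if coding in prefs:
            return prefs[coding]
        return prefs.get("*")

    best, best_q = None, 0.0
    for coding in supported:  # assumed to be in server preference order
        q = q_for(coding)
        if q is not None and q > best_q:
            best, best_q = coding, q
    if best is not None:
        return best
    identity_q = q_for("identity")
    if identity_q is None or identity_q > 0:
        return "identity"  # implicitly acceptable, at the lowest priority
    return None


encoding = choose_encoding("br, gzip;q=0.8", ["gzip", "br"])
```

With the example value from the text, a server supporting both gzip and br would pick br; a server supporting only gzip would still compress with gzip.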

The Accept-Language header

The {{HTTPHeader("Accept-Language")}} header is used to indicate the language preference of the user, as a list of values with quality factors, like de, en;q=0.7. A default value is often set according to the language of the user agent's graphical interface, but most browsers allow setting different language preferences.

Because of the configuration-based entropy increase, a modified value can be used to fingerprint the user; changing it is therefore not recommended, and a Web site cannot trust this value to reflect the actual wish of the user. Site designers must not be over-zealous in using language detection via this header, as it can lead to a poor user experience:

  • They should always provide a way to override the server-chosen language, e.g., by providing small links near the top of the page. Most user agents provide a default value for the Accept-Language header, adapted to the user interface language, and end users often do not modify it, either because they do not know how or because they are not able to, as in an Internet café for instance.
  • Once a user has overridden the server-chosen language, a site should no longer use language detection and should stick with the explicitly chosen language. In other words, only entry pages of a site should select the proper language using this header.
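
The two recommendations above can be sketched in code: an explicit user choice always wins, and the header is consulted only when no such choice exists. The function name `pick_language`, its parameters, and the default of "en" are assumptions made for illustration; the matching is deliberately simple (exact tag, then primary subtag).

```python
def pick_language(accept_language, available, explicit_choice=None, default="en"):
    """Pick a language for a page. An explicit user choice (for example one
    stored after the user clicked a language link) overrides the header."""
    if explicit_choice in available:
        return explicit_choice
    ranges = []
    for part in accept_language.split(","):
        tag, _, params = part.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        q = 1.0  # default quality factor
        name, _, value = params.strip().partition("=")
        if name.strip().lower() == "q":
            try:
                q = float(value)
            except ValueError:
                q = 0.0  # assumption: malformed q means "not acceptable"
        if q > 0:
            ranges.append((tag, q))
    ranges.sort(key=lambda item: item[1], reverse=True)  # stable sort
    lowered = {lang.lower(): lang for lang in available}
    for tag, _ in ranges:
        if tag in lowered:
            return lowered[tag]
        primary = tag.split("-")[0]  # e.g. "fr-ca" falls back to "fr"
        if primary in lowered:
            return lowered[primary]
    return default


language = pick_language("de, en;q=0.7", ["en", "fr"])
```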

The User-Agent header

Though there are legitimate uses of this header for selecting content, it is considered bad practice to rely on it to define what features are supported by the user agent. Instead, prefer feature-oriented object detection.

The {{HTTPHeader("User-Agent")}} header identifies the browser sending the request. This string may contain a space-separated list of product tokens and comments.

A product token is a name followed by a '/' and a version number, like Firefox/4.0.1. There may be as many of them as the user agent wants. A comment is a free string delimited by parentheses. Obviously parentheses cannot be used in that string. The inner format of a comment is not defined by the standard, though several browsers put several tokens in it, separated by ';'.
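
To make the structure concrete, the following minimal sketch splits a User-Agent string into product tokens and comments as described above. The function name `parse_user_agent` and the sample string are illustrative assumptions; the regular expression follows the simplified description in this article (no nested parentheses) and is not a full implementation of the HTTP grammar.

```python
import re

# Either a parenthesised comment, or a product token "name" or "name/version".
_UA_PART = re.compile(r"\(([^()]*)\)|([^\s()/]+)(?:/(\S+))?")


def parse_user_agent(value):
    """Return (products, comments): products is a list of (name, version)
    pairs, comments is a list of lists of ';'-separated items."""
    products, comments = [], []
    for match in _UA_PART.finditer(value):
        comment, name, version = match.groups()
        if comment is not None:
            comments.append([item.strip() for item in comment.split(";")])
        else:
            products.append((name, version))
    return products, comments


products, comments = parse_user_agent(
    "Mozilla/5.0 (X11; Linux x86_64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"
)
```

As the note at the top of this section says, such parsing is best kept for diagnostics or statistics rather than for deciding which features a user agent supports.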

The Vary response header

In contrast to the previous Accept-* headers, which are sent by the client, the {{HTTPHeader("Vary")}} HTTP header is sent by the web server in its response. It indicates the list of headers used by the server during the server-driven content negotiation phase. The header is needed to inform the cache of the decision criteria so that it can reproduce them, allowing the cache to be functional while preventing it from serving erroneous content to the user.

The special value of '*' means that the server-driven content negotiation also uses information not conveyed in a header to choose the appropriate content.

The Vary header was added in version 1.1 of HTTP and is necessary to allow caches to work appropriately. A cache, in order to work with server-driven content negotiation, needs to know which criteria were used by the server to select the transmitted content. That way, the cache can replay the algorithm and serve acceptable content directly, without further requests to the server. Obviously, the wildcard '*' prevents caching from occurring, as the cache cannot know what element is behind it.
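
One way to picture what a cache does with Vary is as part of its lookup key: two requests may share a stored response only if they agree on every request header that Vary names. The sketch below is a simplified illustration; the function name `cache_key` is hypothetical, and real caches normalize header values more carefully than this.

```python
def cache_key(url, request_headers, vary):
    """Build a lookup key from the URL and the request headers listed in the
    response's Vary header. Returns None when Vary is '*' (not reusable)."""
    names = [name.strip().lower() for name in vary.split(",") if name.strip()]
    if "*" in names:
        return None  # the cache cannot know what the server based its choice on
    lowered = {key.lower(): value.strip() for key, value in request_headers.items()}
    # A missing header is recorded as an empty string so it still takes part
    # in the comparison.
    selected = tuple((name, lowered.get(name, "")) for name in sorted(names))
    return (url, selected)


key = cache_key(
    "/doc",
    {"Accept-Language": "de", "User-Agent": "ExampleBrowser/1.0"},
    "Accept-Language",
)
```

Here a second request with a different User-Agent but the same Accept-Language yields the same key, so the stored response can be reused; a request with a different Accept-Language does not.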

Agent-driven negotiation

Server-driven negotiation suffers from a few downsides:

  • It doesn't scale well. There is one header per feature used in the negotiation. If one wants to use screen size, resolution or other dimensions, a new HTTP header must be created.
  • The headers must be sent on every request. This is not too problematic with few headers, but as they multiply, the message size could lead to a decrease in performance.
  • The more headers are sent, the more entropy is sent, allowing for better HTTP fingerprinting and raising the corresponding privacy concerns.

From the start, HTTP allowed another negotiation type, agent-driven negotiation. In this negotiation, when facing an ambiguous request, the server sends back a page containing links to the available alternative resources. The user is presented with the resources and chooses the one to use.

Unfortunately, the HTTP standard does not specify the format of the page that allows choosing between the available resources, which prevents the process from being easily automated. Besides serving as a fallback for server-driven negotiation, this method is almost always used in conjunction with scripting, especially with JavaScript redirection: after having checked the negotiation criteria, the script performs the redirection.
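
Since the format of the choice page is left open, any example is necessarily one arbitrary possibility. The sketch below builds a 300 Multiple Choices response whose body is a plain HTML list of links; the function name `multiple_choices_response`, the tuple it returns, and the sample URLs are all assumptions made for illustration.

```python
from html import escape


def multiple_choices_response(alternatives):
    """Build (status, headers, body) for a 300 response that lists the
    alternative representations as links. `alternatives` is a list of
    (url, label) pairs."""
    items = "\n".join(
        f'<li><a href="{escape(url, quote=True)}">{escape(label)}</a></li>'
        for url, label in alternatives
    )
    body = f"<ul>\n{items}\n</ul>"
    headers = {"Content-Type": "text/html; charset=utf-8"}
    return 300, headers, body


status, headers, body = multiple_choices_response(
    [("/doc.en.html", "English"), ("/doc.de.html", "Deutsch")]
)
```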

A second problem is that one more request is needed to fetch the real resource, which slows the availability of the resource to the user.

Also note that the caching of the resource is trivial, as each resource has a different URI.

Revision Source

<p>{{HTTPSidebar}}</p>

<p class="summary">In <a href="/en-US/docs/Glossary/HTTP">HTTP</a>, <em><strong>content negotiation</strong></em> is the mechanism used, when several equivalent representations of a given resource can be served, to provide the one best suited to the end user.</p>

<p>A specific document is called a <em>resource</em>. When a client wants to obtain it, it requests it using its URL. The server uses this URL to choose one of the variants it can provide (each variant being called a <em>representation</em>) and returns this specific representation to the client. The overall resource, as well as each of the representations, has a specific URL. How a specific representation is chosen when the resource is requested is determined by <em>content negotiation</em>, and there are several ways of doing it.</p>

<p>The determination of the best suited representation is made through one of two mechanisms:</p>

<ul>
 <li>Specific <a href="/en-US/docs/Web/HTTP/Headers" title="en/HTTP/Headers">HTTP headers</a> sent by the client (<em>server-driven negotiation</em> or <em>proactive negotiation</em>), which is the standard way of negotiating a specific kind of resource.</li>
 <li>The {{HTTPStatus("300")}} <code>Multiple Choices</code> or {{HTTPStatus("406")}} <code>Not Acceptable</code> <a href="/en-US/docs/Web/HTTP/Status" title="https://developer.mozilla.org/en/HTTP/HTTP_response_codes">HTTP response codes</a> sent by the server (<em>agent-driven negotiation</em> or <em>reactive negotiation</em>), which are used as a fallback mechanism but are very rare.</li>
</ul>

<p>Over the years, other content negotiation proposals, like <em>transparent content negotiation</em> and the <code>Alternates</code> header, have been made. They failed to get traction and were abandoned.</p>

<h2 id="Server-driven_content_negotiation">Server-driven content negotiation</h2>

<p>In <em>server-driven content negotiation</em>, or proactive content negotiation, the browser (or any other kind of user agent) sends several HTTP headers along with the URL. These headers describe the user's preferred choice. The server uses them as hints, and an internal algorithm lets it choose the best content to serve to the client. The algorithm is server-specific and not defined in the standard. See, for example, the <a class="external" href="https://httpd.apache.org/docs/2.2/en/content-negotiation.html#algorithm" title="https://httpd.apache.org/docs/2.2/en/content-negotiation.html#algorithm">Apache 2.2 negotiation algorithm</a>.</p>

<p>The HTTP/1.1 standard defines the list of standard headers that start server-driven negotiation ({{ httpheader("Accept") }}, {{ httpheader("Accept-Charset") }}, {{ httpheader("Accept-Encoding") }}, {{ httpheader("Accept-Language") }}). Although, strictly speaking, {{ httpheader("User-Agent") }} is not in this list, it is sometimes also used to send a specific representation of the requested resource, though this is not considered good practice. The server uses the {{HTTPHeader("Vary")}} header to indicate which headers it actually used for content negotiation (or, more precisely, the associated request headers), so that <a href="/en-US/docs/Web/HTTP/Caching">caches</a> can work optimally.</p>

<p>In addition to these, there is an experimental proposal, called <em>client hints</em>, to add more headers to the list of available headers. Client hints advertise what kind of device the user agent runs on (for example, a desktop computer or a mobile device).</p>

<p>Even if server-driven content negotiation is the most common way to agree on a specific representation of a resource, it has several drawbacks:</p>

<ul>
 <li>The server doesn't have total knowledge of the browser. Even with the Client Hints extension, it does not have complete knowledge of the capabilities of the browser. Unlike reactive content negotiation, where the client makes the choice, the server's choice is always somewhat arbitrary.</li>
 <li>The information sent by the client is quite verbose (HTTP/2 header compression mitigates this problem) and is a privacy risk (HTTP fingerprinting).</li>
 <li>As several representations of a given resource are sent, shared caches are less efficient and server implementations are slightly more complex.</li>
</ul>

<h3 id="The_Accept_header">The <code>Accept</code> header</h3>

<p>The {{HTTPHeader("Accept")}} header lists the MIME types of the media that the agent is willing to process. It is a comma-separated list of MIME types, each optionally combined with a quality factor, a parameter giving the relative degree of preference between the different MIME types.</p>

<p>The {{HTTPHeader("Accept")}} header is defined by the browser, or any other user agent, and can vary according to the context, like fetching an HTML page or an image, a video, or a script: it is different when fetching a document entered in the address bar or an element linked via an {{ HTMLElement("img") }}, {{ HTMLElement("video") }} or {{ HTMLElement("audio") }} element. Browsers are free to use the value of the header that they think is the most adequate; an exhaustive list of <a href="/en-US/docs/Web/HTTP/Content_negotiation/List_of_default_Accept_values">default values for common browsers</a> is available.</p>

<h3 id="The_Accept-CH_header_experimental_inline">The <code>Accept-CH</code> header&nbsp;{{experimental_inline}}</h3>

<div class="note">
<p>This is part of an <strong>experimental</strong> technology called <em>Client Hints</em>, currently implemented only in Chrome 46 and later.</p>
</div>

<p>The experimental {{HTTPHeader("Accept-CH")}} header lists configuration data that can be used by the server to select an appropriate response. Valid values are:</p>

<table class="standard-table">
 <thead>
  <tr>
   <th scope="col">Value</th>
   <th scope="col">Meaning</th>
  </tr>
 </thead>
 <tbody>
  <tr>
   <td><code>DPR</code></td>
   <td>Indicates the client's device pixel ratio.</td>
  </tr>
  <tr>
   <td><code>Viewport-Width</code></td>
   <td>Indicates the layout viewport width in CSS pixels.&nbsp;</td>
  </tr>
  <tr>
   <td><code>Width</code></td>
   <td>Indicates the resource width in physical pixels (in other words the intrinsic size of an image).</td>
  </tr>
 </tbody>
</table>

<h3 id="The_Accept-Charset_header">The <code>Accept-Charset</code> header</h3>

<p>The {{HTTPHeader("Accept-Charset")}} header indicates to the server which character encodings are understood by the user agent. Traditionally, it was set to a different value for each browser locale, like <code>ISO-8859-1,utf-8;q=0.7,*;q=0.7</code> for a Western European locale.</p>

<p>UTF-8 is now well supported and is the preferred way of encoding characters, and sending this header exposes more entropy (see <a class="link-https" href="https://www.eff.org/deeplinks/2010/01/primer-information-theory-and-privacy" title="https://www.eff.org/deeplinks/2010/01/primer-information-theory-and-privacy">configuration-based entropy</a>), so most browsers omit it: Internet Explorer 8, Safari 5, Opera 11 and Firefox 10 all stopped sending this header.</p>

<h3 id="The_Accept-Encoding_header">The <code>Accept-Encoding</code> header</h3>

<p>The {{HTTPHeader("Accept-Encoding")}} header defines the acceptable content encodings, mainly the supported compression algorithms. The value is a q-factor list, like <code>br, gzip;q=0.8</code>, that indicates the priority of the encoding values. The default value <code>identity</code> is at the lowest priority, unless otherwise declared.</p>

<p>Compressing HTTP messages is one of the most important ways to improve the performance of a Web site: it shrinks the size of the data transmitted and makes better use of the available bandwidth. Browsers always send this header, and the server should be configured to honor it and use compression.</p>

<h3 id="The_Accept-Language_header">The <code>Accept-Language</code> header</h3>

<p>The {{HTTPHeader("Accept-Language")}} header is used to indicate the language preference of the user, as a list of values with quality factors, like <code>de, en;q=0.7</code>. A default value is often set according to the language of the user agent's graphical interface, but most browsers allow setting different language preferences.</p>

<p>Because of the <a class="link-https" href="https://www.eff.org/deeplinks/2010/01/primer-information-theory-and-privacy" title="https://www.eff.org/deeplinks/2010/01/primer-information-theory-and-privacy">configuration-based entropy</a> increase, a modified value can be used to fingerprint the user; changing it is therefore not recommended, and a Web site cannot trust this value to reflect the actual wish of the user. Site designers must not be over-zealous in using language detection via this header, as it can lead to a poor user experience:</p>

<ul>
 <li>They should always provide a way to override the server-chosen language, e.g., by providing small links near the top of the page. Most user agents provide a default value for the <code>Accept-Language</code> header, adapted to the user interface language, and end users often do not modify it, either because they do not know how or because they are not able to, as in an Internet café for instance.</li>
 <li>Once a user has overridden the server-chosen language, a site should no longer use language detection and should stick with the explicitly chosen language. In other words, only entry pages of a site should select the proper language using this header.</li>
</ul>

<h3 id="The_User-Agent_header">The <code>User-Agent</code> header</h3>

<div class="note">
<p>Though there are legitimate uses of this header for selecting content, <a href="/en/Browser_Detection_and_Cross_Browser_Support#Limit_the_use_of_User_Agent_String_based_Detection" title="https://developer.mozilla.org/en/Browser_Detection_and_Cross_Browser_Support#Limit_the_use_of_User_Agent_String_based_Detection">it is considered bad practice</a> to rely on it to define what features are supported by the user agent. Instead, prefer <a href="/en/Browser_Detection_and_Cross_Browser_Support#Use_feature_oriented_object_detection" title="https://developer.mozilla.org/en/Browser_Detection_and_Cross_Browser_Support#Use_feature_oriented_object_detection">feature-oriented object detection</a>.</p>
</div>

<p>The {{HTTPHeader("User-Agent")}} header identifies the browser sending the request. This string may contain a space-separated list of <em>product tokens</em> and <em style="font-style:italic">comments</em>.</p>

<p>A <em>product token</em> is a name followed by a '<code>/</code>' and a version number, like <code>Firefox/4.0.1</code>. There may be as many of them as the user agent wants. A <em>comment</em> is a free string delimited by parentheses. Obviously parentheses cannot be used in that string. The inner format of a comment is not defined by the standard, though several browsers put several tokens in it, separated by '<code>;</code>'.</p>

<h3 id="The_Vary_response_header">The <code>Vary</code> response header</h3>

<p>In contrast to the previous <code>Accept-*</code> headers, which are sent by the client, the {{HTTPHeader("Vary")}} HTTP header is sent by the web server in its response. It indicates the list of headers used by the server during the server-driven content negotiation phase. The header is needed to inform the cache of the decision criteria so that it can reproduce them, allowing the cache to be functional while preventing it from serving erroneous content to the user.</p>

<p>The special value of '<code>*</code>' means that the server-driven content negotiation also uses information not conveyed in a header to choose the appropriate content.</p>

<p>The <code>Vary</code> header was added in version 1.1 of HTTP and is necessary to allow caches to work appropriately. A cache, in order to work with server-driven content negotiation, needs to know which criteria were used by the server to select the transmitted content. That way, the cache can replay the algorithm and serve acceptable content directly, without further requests to the server. Obviously, the wildcard '<code>*</code>' prevents caching from occurring, as the cache cannot know what element is behind it.</p>

<h2 id="Agent-driven_negotiation">Agent-driven negotiation</h2>

<p>Server-driven negotiation suffers from a few downsides:</p>

<ul>
 <li>It doesn't scale well. There is one header per feature used in the negotiation. If one wants to use screen size, resolution or other dimensions, a new HTTP header must be created.</li>
 <li>The headers must be sent on every request. This is not too problematic with few headers, but as they multiply, the message size could lead to a decrease in performance.</li>
 <li>The more headers are sent, the more entropy is sent, allowing for better HTTP fingerprinting and raising the corresponding privacy concerns.</li>
</ul>

<p>From the start, HTTP allowed another negotiation type, <em>agent-driven negotiation</em>. In this negotiation, when facing an ambiguous request, the server sends back a page containing links to the available alternative resources. The user is presented with the resources and chooses the one to use.</p>

<p>Unfortunately, the HTTP standard does not specify the format of the page that allows choosing between the available resources, which prevents the process from being easily automated. Besides serving as a fallback for <em>server-driven negotiation</em>, this method is almost always used in conjunction with scripting, especially with JavaScript redirection: after having checked the negotiation criteria, the script performs the redirection.</p>

<p>A second problem is that one more request is needed to fetch the real resource, which slows the availability of the resource to the user.</p>

<p>Also note that the caching of the resource is trivial, as each resource has a different URI.</p>