Revision 1134217 of Content negotiation (コンテンツネゴシエーション)

Revision URL slug: Web/HTTP/Content_negotiation
Revision title: コンテンツネゴシエーション (Content negotiation)
Revision ID: 1134217
Created: 2016/10/25 8:01:33
Author: yyss
Current revision? No
Comment:

Tags:

Content of this revision

In HTTP, content negotiation is the mechanism used to serve different versions of a resource at the same URL, so that the user agent can specify which version is best suited to the user (for example, the language of a document, the format of an image, or the content encoding).

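Principles of content negotiation

A specific document is called a resource. When a client wants to obtain it, the client requests it using its URL. The server uses this URL to choose one of the variants it provides (each variant is called a representation) and returns that specific representation to the client. The overall resource, as well as each of its representations, has its own URL. How a specific representation is chosen when the resource is called is determined by content negotiation, and there are several ways for the client and the server to negotiate.

The best-suited representation is determined through one of two mechanisms:

Specific HTTP headers sent by the client (server-driven negotiation or proactive negotiation). This is the standard way of negotiating a specific kind of resource.
The {{HTTPStatus("300")}} (Multiple Choices) or {{HTTPStatus("406")}} (Not Acceptable) HTTP response codes sent by the server (agent-driven negotiation or reactive negotiation). These are used as fallback mechanisms.

Over the years, other content negotiation proposals, such as transparent content negotiation and the Alternates header, have been made. They failed to gain traction and were abandoned.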

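Server-driven content negotiation

In server-driven content negotiation, or proactive content negotiation, the browser (or any other kind of user agent) sends several HTTP headers along with the URL. These headers describe the user's preferences. The server uses them as hints, and an internal algorithm chooses the best content to serve to the client. The algorithm is server-specific and is not defined by the standard. See, for example, the Apache 2.2 negotiation algorithm.

The HTTP/1.1 standard defines a list of standard headers that start server-driven negotiation ({{HTTPHeader("Accept")}}, {{HTTPHeader("Accept-Charset")}}, {{HTTPHeader("Accept-Encoding")}}, {{HTTPHeader("Accept-Language")}}). Strictly speaking, {{HTTPHeader("User-Agent")}} is not in this list, but it is sometimes used to send a specific representation of the requested resource, although this is not considered good practice. The server uses the {{HTTPHeader("Vary")}} response header to indicate which request headers it actually used for content negotiation, so that caches can work properly.

In addition, there is an experimental proposal, called client hints, to add more headers to the list of headers available for negotiation. Client hints advertise what kind of device the user agent is running on (for example, a desktop computer or a mobile device).

Even though server-driven content negotiation is the most common way to agree on a specific representation of a resource, it has several drawbacks:

The server does not have complete knowledge of the browser. Even with the Client Hints extension, it cannot fully know the browser's capabilities. Unlike reactive content negotiation, where the client makes the choice, the server's choice is always somewhat arbitrary.
The information sent by the client is quite verbose (HTTP/2 header compression mitigates this problem) and poses a privacy risk (HTTP fingerprinting).
Because several representations of a given resource are sent, shared caches are less efficient and server implementations are more complex.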

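The `Accept` header

The {{HTTPHeader("Accept")}} header lists the MIME types of the media resources that the agent is willing to process. It is a comma-separated list of MIME types, each combined with a quality factor, a parameter that indicates the relative degree of preference between the different MIME types.

The {{HTTPHeader("Accept")}} header is defined by the browser, or any other user agent, and can vary according to what is being fetched, such as an HTML page, an image, a video, or a script. It differs between fetching a document entered in the address bar and fetching one linked via an {{HTMLElement("img")}}, {{HTMLElement("video")}}, or {{HTMLElement("audio")}} element. Browsers are free to use the value they consider most adequate for this header. An exhaustive list of default values for common browsers is available.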
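The quality-factor list described above can be read as a ranked list of preferences. The following is a minimal Python sketch of that reading, not the algorithm of any particular server. The function name `parse_accept` and the sample header value are illustrative assumptions. The sketch splits naively on commas, so it would mishandle quoted parameter values, and it ignores the rule that more specific media ranges take precedence over wildcards.

```python
def parse_accept(header: str) -> list[tuple[str, float]]:
    """Return (media_range, q) pairs, most preferred first."""
    entries = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0  # a missing quality factor means q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    # Assumption: an unparsable q is treated as "not acceptable".
                    q = 0.0
        entries.append((media_range, q, position))
    # Higher q first; ties keep the order in which the client listed them.
    entries.sort(key=lambda e: (-e[1], e[2]))
    return [(media_range, q) for media_range, q, _ in entries]


# Illustrative header value (an assumption, not taken from a specific browser).
example = "text/html, application/xhtml+xml, application/xml;q=0.9, */*;q=0.8"
for media_range, q in parse_accept(example):
    print(media_range, q)
```

A server could walk this list from the top and return the first representation it is able to produce.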

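The `Accept-CH` header {{experimental_inline}}

This is part of an experimental technology called Client Hints. At the time of this revision, it is implemented only in Chrome 46 and later.

The experimental {{HTTPHeader("Accept-CH")}} header lists configuration data that the server can use to select an appropriate resource. Valid values are:

Value	Meaning
`DPR`	Indicates the client's device pixel ratio.
`Viewport-Width`	Indicates the layout viewport width in CSS pixels.
`Width`	Indicates the resource width in physical pixels (in other words, the intrinsic width of an image).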

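The `Accept-Charset` header

The {{HTTPHeader("Accept-Charset")}} header tells the server which character encodings the user agent understands. Traditionally, it was set to a different value for each browser locale, for example ISO-8859-1,utf-8;q=0.7,*;q=0.7 for a Western European locale.

UTF-8 is now well supported and is the preferred way of encoding characters. In addition, to protect privacy by reducing configuration-based entropy, most browsers omit the Accept-Charset header: Internet Explorer 8, Safari 5, Opera 11, and Firefox 10 have abandoned this header.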

The `Accept-Encoding` header

The {{HTTPHeader("Accept-Encoding")}} header defines the acceptable content encodings (supported compression formats). The value is a q-factor list (for example, br, gzip;q=0.8) that indicates the priority of the encoding values. The default value identity has the lowest priority (unless otherwise declared).

Compressing HTTP messages is one of the most important ways to improve the performance of a website: it shrinks the size of the data transmitted and makes better use of the available bandwidth. Browsers always send this header, and the server should be configured to honor it and to use compression.
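As a rough illustration of how a server might honor this header, the Python sketch below picks one content coding from the q-factor list. The function name `choose_encoding` and the tiny weight given to an unlisted identity are assumptions made for the example; real servers apply their own rules.

```python
from typing import Optional


def choose_encoding(accept_encoding: str, supported: list[str]) -> Optional[str]:
    """Pick the server-supported coding the client prefers most, or None."""
    prefs: dict[str, float] = {}
    for item in accept_encoding.split(","):
        token, _, param = item.strip().partition(";")
        token = token.strip().lower()
        if not token:
            continue
        name, _, value = param.partition("=")
        q = 1.0  # a missing quality factor means q=1
        if name.strip().lower() == "q":
            try:
                q = float(value)
            except ValueError:
                q = 0.0  # assumption: an unparsable q means "not acceptable"
        prefs[token] = q

    def quality(coding: str) -> float:
        if coding in prefs:
            return prefs[coding]
        if "*" in prefs:
            return prefs["*"]
        # An unlisted "identity" stays acceptable but ranks below listed codings.
        return 0.0001 if coding == "identity" else 0.0

    best, best_q = None, 0.0
    for coding in [*supported, "identity"]:
        q = quality(coding)
        if q > best_q:
            best, best_q = coding, q
    return best


print(choose_encoding("br, gzip;q=0.8", ["gzip", "br"]))  # br
print(choose_encoding("br, gzip;q=0.8", ["gzip"]))        # gzip
print(choose_encoding("br", ["gzip"]))                    # identity
```

A result of `None` means that nothing acceptable is available, for instance when the client sends `identity;q=0` and the server supports none of the listed codings.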

The `Accept-Language` header

The {{HTTPHeader("Accept-Language")}} header indicates the language preference of the user. It is a list of values with quality factors (for example, "de, en;q=0.7"). A default value is often set according to the language of the user agent's graphical interface, but most browsers allow different language preferences to be set.

Because a modified value increases configuration-based entropy, it can be used to fingerprint the user, so changing it is not recommended, and a website cannot trust this value to reflect the user's actual wish. Site designers must not be overzealous in using language detection via this header, as it can lead to a poor user experience:

They should always provide a way to override the server-chosen language, for example by providing a language menu on the site. Most user agents provide a default value for the Accept-Language header, adapted to the user interface language, and end users often do not modify it, either because they do not know how or because they cannot, as in an Internet café.
Once a user has overridden the server-chosen language, a site should no longer use language detection and should stick with the explicitly chosen language. In other words, only entry pages of a site should select the proper language using this header.
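The two recommendations above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `pick_language`, the `explicit_choice` parameter (standing for a choice the user made in a language menu, stored for example in a cookie), and the simple prefix matching are all hypothetical and much cruder than a full language-tag matching scheme.

```python
from typing import Optional


def pick_language(accept_language: str, available: list[str], default: str,
                  explicit_choice: Optional[str] = None) -> str:
    """Choose a site language, letting an explicit user choice win."""
    # Once the user has chosen a language, header-based detection is skipped.
    if explicit_choice in available:
        return explicit_choice

    ranked = []
    for position, item in enumerate(accept_language.split(",")):
        tag, _, param = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        name, _, value = param.partition("=")
        q = 1.0  # a missing quality factor means q=1
        if name.strip().lower() == "q":
            try:
                q = float(value)
            except ValueError:
                q = 0.0
        if q > 0:
            ranked.append((-q, position, tag))

    for _, _, tag in sorted(ranked):
        if tag == "*":
            return default
        for candidate in available:
            c = candidate.lower()
            # Exact match, or a match on the primary subtag in either direction
            # ("de" matches "de-AT", and "en-US" matches "en").
            if c == tag or c.startswith(tag + "-") or tag.startswith(c + "-"):
                return candidate
    return default


print(pick_language("de, en;q=0.7", ["en", "fr"], default="en"))  # en
print(pick_language("de, en;q=0.7", ["en", "de", "fr"], default="en",
                    explicit_choice="fr"))                        # fr
```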

The `User-Agent` header

Though there are legitimate uses of this header for selecting content, it is considered bad practice to rely on it to define what features are supported by the user agent.

The {{HTTPHeader("User-Agent")}} header identifies the browser sending the request. This string may contain a space-separated list of product tokens and comments.

A product token is a name followed by a '/' and a version number, like Firefox/4.0.1. There may be as many of them as the user agent wants. A comment is a free string delimited by parentheses. Obviously, parentheses cannot be used freely in that string. The inner format of a comment is not defined by the standard, though several browsers put several tokens in it, separated by ';'.
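To make the structure of product tokens and comments concrete, here is a small Python sketch that splits a User-Agent value into its parts. The function name `parse_user_agent`, the regular expression, and the sample string are illustrative assumptions. The sketch does not handle nested or escaped parentheses, and, in line with the note above, it should not be used to infer which features a browser supports.

```python
import re

# A comment in parentheses, or a product token "name" with an optional "/version".
PART = re.compile(r"\(([^()]*)\)|([^\s/()]+)(?:/([^\s()]+))?")


def parse_user_agent(value: str) -> list[tuple]:
    """Split a User-Agent value into product tokens and comments."""
    parts = []
    for match in PART.finditer(value):
        comment, name, version = match.groups()
        if comment is not None:
            # Browsers commonly separate the tokens inside a comment with ';'.
            parts.append(("comment", [t.strip() for t in comment.split(";")]))
        else:
            parts.append(("product", name, version))
    return parts


# Illustrative value; real strings vary widely between browsers.
sample = "Mozilla/5.0 (X11; Linux x86_64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"
for part in parse_user_agent(sample):
    print(part)
```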

The `Vary` response header

Unlike the previous Accept-* headers, which are sent by the client, the {{HTTPHeader("Vary")}} HTTP header is sent by the web server in its response. It lists the headers used by the server during the server-driven content negotiation phase. The header is needed to inform the cache of the decision criteria so that the cache can reproduce them, allowing it to be functional while preventing it from serving erroneous content to the user.

The special value of '*' means that the server-driven content negotiation also uses information not conveyed in a header to choose the appropriate content.

The Vary header was added in version 1.1 of HTTP and is necessary to allow caches to work appropriately. A cache, in order to work with server-driven content negotiation, needs to know which criteria were used by the server to select the transmitted content. That way, the cache can replay the algorithm and serve acceptable content directly, without further requests to the server. Obviously, the wildcard '*' prevents caching from occurring, as the cache cannot know what element is behind it.
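One way to picture how a cache uses Vary is to treat the listed request headers as part of the cache key. The Python sketch below illustrates that idea under simplifying assumptions: the function name `cache_key` is hypothetical, header values are compared as plain strings, and real caches perform additional normalization and validation.

```python
from typing import Optional


def cache_key(url: str, request_headers: dict[str, str], vary: str) -> Optional[tuple]:
    """Build a cache key from the URL and the request headers named in Vary.

    Returns None when Vary is '*', because the stored response then cannot be
    reused without contacting the server.
    """
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    if "*" in names:
        return None
    # Header names are case-insensitive, so compare them in lower case.
    lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
    return (url, tuple((n, lowered.get(n, "")) for n in sorted(names)))


german = cache_key("/doc", {"Accept-Language": "de", "User-Agent": "A"}, "Accept-Language")
german_other_ua = cache_key("/doc", {"accept-language": "de", "User-Agent": "B"}, "Accept-Language")
french = cache_key("/doc", {"Accept-Language": "fr"}, "Accept-Language")

print(german == german_other_ua)  # True: User-Agent is not listed in Vary
print(german == french)           # False: the negotiated header differs
```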

Agent-driven negotiation

Server-driven negotiation has a few downsides. It doesn't scale well: there is one header per feature used in the negotiation. If you want to use screen size, resolution, or other dimensions, a new HTTP header must be created. The headers must be sent on every request. This is not too problematic with few headers, but as they multiply, the message size leads to a decrease in performance. The more precise the headers that are sent, the more entropy is sent, allowing for more HTTP fingerprinting and the corresponding privacy concerns.

From the beginnings of HTTP, the protocol has allowed another negotiation type: agent-driven negotiation or reactive negotiation. In this negotiation, when facing an ambiguous request, the server sends back a page containing links to the available alternative resources. The user is presented with the resources and chooses the one to use.

Unfortunately, the HTTP standard does not specify the format of the page used to choose between the available resources, which prevents the process from being easily automated. Besides falling back to server-driven negotiation, this method is almost always used in conjunction with scripting, especially JavaScript redirection: after checking the negotiation criteria, the script performs the redirection. A second problem is that one more request is needed to fetch the real resource, slowing the availability of the resource to the user.
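Since the standard leaves the format of the choice page open, any concrete layout is a local decision. The Python sketch below shows one possible shape for a 300 (Multiple Choices) response. The function name `multiple_choices`, the HTML list format, and the example URLs are assumptions made purely for illustration.

```python
from html import escape


def multiple_choices(alternatives: dict[str, str]) -> tuple[int, dict[str, str], str]:
    """Build a (status, headers, body) triple listing alternative resources.

    `alternatives` maps a human-readable label to the URL of a representation.
    The HTML list used here is just one possible layout.
    """
    items = "\n".join(
        f'<li><a href="{escape(url, quote=True)}">{escape(label)}</a></li>'
        for label, url in alternatives.items()
    )
    body = f"<ul>\n{items}\n</ul>"
    headers = {"Content-Type": "text/html; charset=utf-8"}
    return 300, headers, body


# Hypothetical URLs for two language variants of the same document.
status, headers, body = multiple_choices({
    "English": "/doc.en.html",
    "Deutsch": "/doc.de.html",
})
print(status)
print(body)
```

The user, or a script running in the page, then issues a second request for the chosen URL, which is the extra round trip mentioned above.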

