Please note, this is a STATIC archive of website developer.mozilla.org from 03 Nov 2016, cach3.com does not collect or store any user information, there is no "phishing" involved.

Revision 1039678 of WebVTT

  • Revision slug: Web/API/Web_Video_Text_Tracks_Format
  • Revision title: WebVTT
  • Revision id: 1039678
  • Created:
  • Creator: Ac1521
  • Is current revision? No
  • Comment

Revision Content

{{HTMLVersionHeader("5")}}

Introduction to WebVTT

WebVTT stands for Web Video Text Tracks. It is a W3C standard that defines a common format for video text tracks, so that browsers can support them as a standard feature. It is used in conjunction with the HTML <track> tag. Typical uses of WebVTT are captions and subtitles, which are written and edited in a separate WebVTT file and are displayed in time with the audio or video playing on a web page.

WebVTT is a format for displaying timed text tracks (e.g. subtitles) with the {{HTMLElement("track")}} element. The primary purpose of WebVTT files is to add subtitles to a {{HTMLElement("video")}}.

WebVTT is a text based format. A WebVTT file must be encoded in UTF-8 format. Where you can use spaces you can also use tabs.

The MIME type of WebVTT is text/vtt.

An example of WebVTT is given below:

WEBVTT

00:11.000 --> 00:13.000
<v Roger Bingham>We are in New York City

00:13.000 --> 00:16.000
<v Roger Bingham>We’re actually at the Lucern Hotel, just down the street

00:16.000 --> 00:18.000
<v Roger Bingham>from the American Museum of Natural History

00:18.000 --> 00:20.000
<v Roger Bingham>And with me is Neil deGrasse Tyson

00:20.000 --> 00:22.000
<v Roger Bingham>Astrophysicist, Director of the Hayden Planetarium

00:22.000 --> 00:24.000
<v Roger Bingham>at the AMNH.

A WebVTT file contains cues, and the text of each cue can span a single line or multiple lines, as shown below:

WEBVTT

00:01.000 --> 00:04.000
Never drink liquid nitrogen.

00:05.000 --> 00:09.000
- It will perforate your stomach.
- You could die.

WebVTT Body

The structure of a WebVTT file requires two things and has four optional components.

  • An optional byte order mark (BOM)
  • The string WEBVTT
  • An optional text header to the right of WEBVTT.
    • There must be at least one space after WEBVTT
    • You might use this to add a description to the file
    • You may use anything except newlines or the string "-->"
  • A blank line, which is equivalent to two consecutive newlines.
  • Zero or more cues or comments.
  • Zero or more blank lines
Example 1 - Simplest possible WEBVTT file
WEBVTT
Example 2 - Very simple WebVTT file
WEBVTT - This file has no cues.
Example 3 - Common WebVTT example
WEBVTT - This file has cues.

14
00:01:14.815 --> 00:01:18.114
- What?
- Where are we now?

15
00:01:18.171 --> 00:01:20.991
- This is big bat country.

16
00:01:21.058 --> 00:01:23.868
- [ Bats Screeching ]
- They won't get in your hair. They're after the bugs.
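
The header rules listed in this section can be checked mechanically. The following is a minimal sketch in Python, not part of any WebVTT library; the function name has_valid_webvtt_header is an assumption made for illustration, and it only inspects the first line (it does not verify the blank line that must follow the header).

```python
def has_valid_webvtt_header(text):
    """Check the first line of a file against the header rules above.

    Hypothetical helper for illustration only.
    """
    # Drop an optional byte order mark (BOM) before looking at the signature.
    if text.startswith("\ufeff"):
        text = text[1:]
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    # The simplest header is the bare string WEBVTT.
    if first_line == "WEBVTT":
        return True
    # An optional text header needs a space or tab after WEBVTT
    # and must not contain the string "-->".
    if first_line.startswith(("WEBVTT ", "WEBVTT\t")):
        return "-->" not in first_line
    return False
```

With this sketch, the first lines of Examples 1 to 3 are accepted, while a line such as "WEBVTTX" is rejected because nothing separates the signature from the text that follows it.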

Inner Structure of WebVTT file

The structure of a caption can be understood from the code above: the first time is the start time for showing the caption, and the second time, after the "-->" arrow, is the end time for that caption. The caption text is written on the following lines; in this example each line of dialogue starts with a hyphen (-). Comments can also be placed before or after a cue, as in any programming language, to record information or reminders while working through a large file. Comments start with the NOTE keyword and can span a single line or multiple lines; the format is the same for both.

WebVTT Comment

Comments are an optional component that can be used to add information to a WebVTT file. Comments are intended for those reading the file and are not seen by users. A comment may contain newlines but cannot contain a blank line, which is equivalent to two consecutive newlines. A blank line signifies the end of a comment.

A comment cannot contain the string "-->", the ampersand character (&), or the less-than sign (<). Instead use the escape sequence "&amp;" for ampersand and "&lt;" for less-than. It is also recommended that you use the greater-than escape sequence "&gt;" instead of the greater-than character (>) to avoid confusion with tags.

A comment consists of three parts:

  • The string NOTE
  • A space or a newline
  • Zero or more characters other than those noted above
Example 4 - Common WebVTT example
NOTE This is a comment
Example 5 - Multi-line comment
NOTE
Another comment that is spanning
more than one line.

NOTE You can also make a comment
across more than one line this way.
Example 6 - Common comment usage
WEBVTT - Translation of that film I like

NOTE
This translation was done by Kyle so that
some friends can watch it with their parents.

1
00:02:15.000 --> 00:02:20.000
- Ta en kopp varmt te.
- Det är inte varmt.

2
00:02:20.000 --> 00:02:25.000
- Har en kopp te.
- Det smakar som te.  

NOTE This last line may not translate well.

3
00:02:25.000 --> 00:02:30.000
-Ta en kopp
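
Because a blank line ends both comments and cues, a file can be split into blocks and the comment blocks picked out by their NOTE prefix. The sketch below illustrates that idea in Python; split_blocks and is_comment are hypothetical names, and the code makes no attempt to validate the cues themselves.

```python
import re


def split_blocks(text):
    """Split a WebVTT file into blocks separated by blank lines."""
    # Treat CRLF and lone CR line endings like LF.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    pieces = re.split(r"\n{2,}", normalized)
    return [piece.strip("\n") for piece in pieces if piece.strip()]


def is_comment(block):
    """A comment starts with NOTE followed by a space, a tab, or a newline."""
    first_line = block.split("\n", 1)[0]
    return first_line == "NOTE" or first_line.startswith(("NOTE ", "NOTE\t"))
```

Applied to Example 6, this would yield the header block, two comment blocks and three cue blocks.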

Usage

CSS can be embedded in an HTML page so that the cues attached to that page are formatted to match the look and feel of the site. The styles are defined in the head of the HTML page, and the track file is linked to the page using the <track> tag and its src attribute. Note that the file extension for WebVTT files is ‘.vtt’. This can be done as follows:

<!doctype html>
<html>
 <head>
  <title>Styling WebVTT cues</title>
  <style>
   video::cue {
     background-image: linear-gradient(to bottom, dimgray, lightgray);
     color: papayawhip;
   }
   video::cue(b) {
     color: peachpuff;
   }
  </style>
 </head>
 <body>
  <video controls autoplay src="video.webm">
   <track default src="track.vtt">
  </video>
 </body>
</html>

The example above styles the cues from a CSS style sheet in the HTML file. Another way is to define the styles directly in the WebVTT file, which is useful when the look and feel of the HTML pages changes but the cue styling should stay consistent throughout the website. An example of styling within the WebVTT file using CSS is given below:

WEBVTT

STYLE
::cue {
  background-image: linear-gradient(to bottom, dimgray, lightgray);
  color: papayawhip;
}
/* Style blocks cannot use blank lines nor "dash dash greater than" */

NOTE comment blocks can be used between style blocks.

STYLE
::cue(b) {
  color: peachpuff;
}

hello
00:00:00.000 --> 00:00:10.000
Hello <b>world</b>.

NOTE style blocks cannot appear after the first cue.

Cue identifiers can also be used in CSS to define a style for particular cues in the file. In the example below, the transcription credit is shown in red and the first cue in lime. Note that the CSS selectors use escape sequences in the same way as they do in style sheets for HTML pages:

WEBVTT

1
00:00.000 --> 00:02.000
That’s an, an, that’s an L!

crédit de transcription
00:04.000 --> 00:05.000
Transcrit par Célestes™

The cues above can then be styled with the following CSS:

::cue(#\31) { color: lime; }
::cue(#crédit\ de\ transcription) { color: red; }

Positioning of captions or subtitles is also supported. It is done with cue settings written after the cue timings, as shown below:

WEBVTT

00:00:00.000 --> 00:00:04.000 position:10%,line-left align:left size:35%
Where did he go?

00:00:03.000 --> 00:00:06.500 position:90% align:right size:35%
I think he went down this lane.

00:00:04.000 --> 00:00:06.500 position:45%,line-right align:center size:35%
What are you waiting for?

The HTML, CSS and WebVTT files are connected by referencing one file from another, in the same way an image is referenced using the <img> tag and its src attribute. A WebVTT file is attached to an HTML page in a similar way, except that the <track> tag is used instead of <img>.

WebVTT Cues

A cue is a single subtitle block that has a single start time, end time, and textual payload. Example 6 consists of the header, a blank line, and then three cues and two comments separated by blank lines. A cue consists of five components:

  • An optional cue identifier followed by a newline
  • Cue timings
  • Optional cue settings with at least one space before the first and between each setting
  • A single newline
  • The cue payload text
Example 7 - Example of a cue
1 - Title Crawl
00:00:05.000 --> 00:00:10.000 line:0 position:20% size:60% align:start
Some time ago in a place rather distant....
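
The five components of a cue can be separated with a few string operations. The following Python sketch is an illustration under simplifying assumptions, not a conforming parser: parse_cue_block is a hypothetical name, and the function expects one cue block with its blank-line separators already removed.

```python
def parse_cue_block(block):
    """Split one cue block into identifier, timings, settings and payload.

    Hypothetical helper for illustration only.
    """
    lines = block.split("\n")
    identifier = None
    # A first line without "-->" is the optional cue identifier.
    if "-->" not in lines[0]:
        identifier = lines[0]
        lines = lines[1:]
    if not lines or "-->" not in lines[0]:
        raise ValueError("cue block has no timing line")
    start, rest = lines[0].split("-->", 1)
    # After the arrow comes the end time, then zero or more settings.
    end, *settings = rest.split()
    return {
        "id": identifier,
        "start": start.strip(),
        "end": end,
        "settings": settings,
        "payload": "\n".join(lines[1:]),
    }
```

For a cue like the one in Example 7, the result would hold the identifier "1 - Title Crawl", the two timestamps, a list of four settings and the payload text.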

Cue Identifier

The identifier is a name that identifies the cue. It can be used to reference the cue from a script. It must not contain a newline and cannot contain the string "-->". It must end with a single newline. They do not have to be unique, although it is common to number them (e.g. 1, 2, 3, ...).

Example 8 - Cue identifier from Example 7
1 - Title Crawl
Example 9 - Common usage of identifiers
WEBVTT

1
00:00:22.230 --> 00:00:24.606
This is the first subtitle.

2
00:00:30.739 --> 00:00:34.074
This is the second.

3
00:00:34.159 --> 00:00:35.743
Third

Cue Timings

A cue timing indicates when the cue is shown. It has a start and end time which are represented by timestamps. The end time must be greater than the start time, and the start time must be greater than or equal to all previous start times. Cues may have overlapping timings.

If the WebVTT file is being used for chapters ({{HTMLElement("track")}} {{htmlattrxref("kind")}} is chapters) then the file cannot have overlapping timings.

Each cue timing contains five components:

  • Timestamp for start time
  • At least one space
  • The string "-->"
  • At least one space
  • Timestamp for end time
    • Which must be greater than the start time

The timestamps must be in one of two formats:

  • mm:ss.ttt
  • hh:mm:ss.ttt

Where the components are defined as follows:

  • hh is hours
    • Must be at least two digits
    • Hours can be greater than two digits (e.g. 9999:00:00.000)
  • mm is minutes
    • Must be between 00 and 59 inclusive
  • ss is seconds
    • Must be between 00 and 59 inclusive
  • ttt is milliseconds
    • Must be between 000 and 999 inclusive
Example 10 - Basic cue timing examples
00:22.230 --> 00:24.606
00:30.739 --> 00:00:34.074
00:00:34.159 --> 00:35.743
00:00:35.827 --> 00:00:40.122
Example 11 - Overlapping cue timing examples
00:00:00.000 --> 00:00:10.000
00:00:05.000 --> 00:01:00.000
00:00:30.000 --> 00:00:50.000
Example 12 - Non-overlapping cue timing examples
00:00:00.000 --> 00:00:10.000
00:00:10.000 --> 00:01:00.581
00:01:00.581 --> 00:02:00.100
00:02:01.000 --> 00:02:01.000
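
The timestamp formats and the timing rules above can be expressed as a small checker. This is a sketch in Python under stated assumptions: the names timestamp_to_ms, parse_timing and has_overlap are hypothetical, times are kept as integer milliseconds to avoid rounding, and has_overlap assumes the timings are already ordered by start time, as the rules above require.

```python
import re

# Optional hours (two or more digits), then mm:ss.ttt with mm and ss in 00-59.
TIMESTAMP = re.compile(r"(?:([0-9]{2,}):)?([0-5][0-9]):([0-5][0-9])\.([0-9]{3})")


def timestamp_to_ms(stamp):
    """Convert mm:ss.ttt or hh:mm:ss.ttt to integer milliseconds."""
    match = TIMESTAMP.fullmatch(stamp)
    if match is None:
        raise ValueError(f"not a WebVTT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = match.groups()
    total_seconds = int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)
    return total_seconds * 1000 + int(millis)


def parse_timing(line):
    """Parse 'start --> end', ignoring any cue settings after the end time."""
    start_text, end_text = line.split("-->", 1)
    start = timestamp_to_ms(start_text.strip())
    end = timestamp_to_ms(end_text.split()[0])
    if end <= start:
        raise ValueError("end time must be greater than start time")
    return start, end


def has_overlap(timings):
    """Report whether any cue starts before the previous cue has ended."""
    return any(
        prev_end > next_start
        for (_, prev_end), (next_start, _) in zip(timings, timings[1:])
    )
```

A check like has_overlap could be used before serving a file as a chapters track, where overlapping timings are not allowed.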

Cue Settings

Cue settings are optional components used to position where the cue payload text will be displayed over the video. This includes whether the text is displayed horizontally or vertically. There can be zero or more of them, and they can be used in any order so long as each setting is used no more than once.

The cue settings are added to the right of the cue timings. There must be one or more spaces between the cue timing and the first setting and between each setting. A setting's name and value are separated by a colon. The settings are case sensitive so use lower case as shown. There are five cue settings:

  • vertical
    • Indicates that the text will be displayed vertically rather than horizontally, such as in some Asian languages.
    Table 1 - vertical values
    vertical:rl writing direction is right to left
    vertical:lr writing direction is left to right
  • line
    • Specifies where text appears vertically. If vertical is set, line specifies where text appears horizontally.
    • Value can be a line number
      • The line height is the height of the first line of the cue as it appears on the video
      • Positive numbers indicate top down
      • Negative numbers indicate bottom up
    • Or value can be a percentage
      • Must be an integer (i.e. no decimals) between 0 and 100 inclusive
      • Must be followed by a percent sign (%)
    Table 2 - line examples
                 vertical omitted   vertical:rl   vertical:lr
    line:0       top                right         left
    line:-1      bottom             left          right
    line:0%      top                right         left
    line:100%    bottom             left          right
  • position
    • Specifies where the text will appear horizontally. If vertical is set, position specifies where the text will appear vertically.
    • Value is a percentage
    • Must be an integer (no decimals) between 0 and 100 inclusive
    • Must be followed by a percent sign (%)
    Table 3 - position examples
                     vertical omitted   vertical:rl   vertical:lr
    position:0%      left               top           top
    position:100%    right              bottom        bottom
  • size
    • Specifies the width of the text area. If vertical is set, size specifies the height of the text area.
    • Value is a percentage
    • Must be an integer (i.e. no decimals) between 0 and 100 inclusive
    • Must be followed by a percent sign (%)
    Table 4 - size examples
                 vertical omitted   vertical:rl   vertical:lr
    size:100%    full width         full height   full height
    size:50%     half width         half height   half height
  • align
    • Specifies the alignment of the text. Text is aligned within the space given by the size cue setting if it is set.
    Table 5 - align values
                    vertical omitted       vertical:rl          vertical:lr
    align:start     left                   top                  top
    align:middle    centred horizontally   centred vertically   centred vertically
    align:end       right                  bottom               bottom
Example 13 - Cue setting examples

The first line demonstrates no settings. The second line might be used to overlay text on a sign or label. The third line might be used for a title. The last line might be used for an Asian language.

00:00:05.000 --> 00:00:10.000
00:00:05.000 --> 00:00:10.000 line:63% position:72% align:start
00:00:05.000 --> 00:00:10.000 line:0 position:20% size:60% align:start
00:00:05.000 --> 00:00:10.000 vertical:rl line:-1 align:end
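
The rules for cue settings (name and value separated by a colon, lower-case names, each setting used at most once) can be sketched as a small parser. The Python below is an illustration only: parse_cue_settings is a hypothetical name, and it validates just the vertical and size values, leaving line, position and align values unchecked.

```python
import re

ALLOWED_SETTINGS = {"vertical", "line", "position", "size", "align"}


def parse_cue_settings(text):
    """Turn 'line:0 size:60%' into {'line': '0', 'size': '60%'}."""
    settings = {}
    for item in text.split():
        name, colon, value = item.partition(":")
        if not colon or not value:
            raise ValueError(f"expected name:value, got {item!r}")
        # Setting names are case sensitive, so "Line" is rejected here.
        if name not in ALLOWED_SETTINGS:
            raise ValueError(f"unknown cue setting: {name!r}")
        if name in settings:
            raise ValueError(f"cue setting used more than once: {name!r}")
        if name == "vertical" and value not in ("rl", "lr"):
            raise ValueError("vertical must be rl or lr")
        if name == "size":
            # size is an integer percentage between 0% and 100% inclusive.
            if not re.fullmatch(r"[0-9]{1,3}%", value) or int(value[:-1]) > 100:
                raise ValueError("size must be between 0% and 100%")
        settings[name] = value
    return settings
```

Returning a dictionary keeps the settings independent of the order in which they were written, which matches the rule that settings may appear in any order.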

Cue Payload

The payload is where the main information or content is located. In normal usage the payload contains the subtitles to be displayed. The payload text may contain newlines but it cannot contain a blank line, which is equivalent to two consecutive newlines. A blank line signifies the end of a cue.

A cue text payload cannot contain the string "-->", the ampersand character (&), or the less-than sign (<). Instead use the escape sequence "&amp;" for ampersand and "&lt;" for less-than. It is also recommended that you use the greater-than escape sequence "&gt;" instead of the greater-than character (>) to avoid confusion with tags. If you are using the WebVTT file for metadata these restrictions do not apply.

In addition to the three escape sequences mentioned above, there are three others. All six are listed in the table below.

Table 6 - Escape sequences
Name                 Character   Escape Sequence
Ampersand            &           &amp;
Less-than            <           &lt;
Greater-than         >           &gt;
Left-to-right mark   (U+200E)    &lrm;
Right-to-left mark   (U+200F)    &rlm;
Non-breaking space   (U+00A0)    &nbsp;
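
When cue text is generated from arbitrary strings, the first three escape sequences in the table can be applied automatically. The following Python sketch shows one way to do so; escape_cue_text is a hypothetical name, and the function deliberately leaves the three invisible characters untouched.

```python
def escape_cue_text(text):
    """Escape &, < and > so that plain text is safe inside a cue payload."""
    # The ampersand is replaced first so that the entities added by the
    # later replacements are not escaped a second time.
    return (
        text.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
    )
```

Escaping the greater-than sign also removes any literal "-->" from the text, since it becomes "--&gt;".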

Cue Payload Text Tags

There are a number of tags, such as <b>, that can be used. However, if the WebVTT file is used in a {{HTMLElement("track")}} element where the attribute {{htmlattrxref("kind")}} is chapters then you cannot use tags.

  • Timestamp tag
    • The timestamp must be greater than the cue's start timestamp, greater than any previous timestamp in the cue payload, and less than the cue's end timestamp. The active text is the text between the timestamp and the next timestamp, or the end of the payload if there is no other timestamp in the payload. Any text before the active text in the payload is previous text. Any text beyond the active text is future text. This enables karaoke style captions.
    Example 12 - Karaoke style text
    1
    00:16.500 --> 00:18.500
    When the moon <00:17.500>hits your eye
    
    1
    00:00:18.500 --> 00:00:20.500
    Like a <00:19.000>big-a <00:19.500>pizza <00:20.000>pie
    
    1
    00:00:20.500 --> 00:00:21.500
    That's <00:00:21.000>amore
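
The way timestamp tags divide a payload into timed segments can be sketched with a regular expression. The Python below is an illustration, not a full payload parser: split_karaoke is a hypothetical name, and it ignores every other kind of tag.

```python
import re

# A timestamp tag is a timestamp wrapped in angle brackets.
TIMESTAMP_TAG = re.compile(r"<((?:[0-9]{2,}:)?[0-5][0-9]:[0-5][0-9]\.[0-9]{3})>")


def split_karaoke(payload):
    """Return (timestamp, text) pairs; the first pair has no timestamp."""
    # re.split with one capturing group alternates text and timestamps.
    parts = TIMESTAMP_TAG.split(payload)
    segments = [(None, parts[0])]
    segments.extend(zip(parts[1::2], parts[2::2]))
    return segments
```

Each pair gives the moment at which its text becomes the active text; the leading pair is shown from the start of the cue.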
          

The following tags are the HTML tags allowed in a cue and require opening and closing tags (e.g. <b>text</b>).

  • Class tag (<c></c>)
    • Style the contained text using a CSS class.
    Example 14 - Class tag
    <c.classname>text</c>
  • Italics tag (<i></i>)
    • Italicize the contained text.
    Example 15 - Italics tag
    <i>text</i>
  • Bold tag (<b></b>)
    • Bold the contained text.
    Example 16 - Bold tag
    <b>text</b>
  • Underline tag (<u></u>)
    • Underline the contained text.
    Example 17 - Underline tag
    <u>text</u>
  • Ruby tag (<ruby></ruby>)
    • Used with ruby text tags to display ruby characters (i.e. small annotative characters above other characters).
    Example 18 - Ruby tag
    <ruby>WWW<rt>World Wide Web</rt>oui<rt>yes</rt></ruby>
  • Ruby text tag (<rt></rt>)
    • Used with ruby tags to display ruby characters (i.e. small annotative characters above other characters).
    Example 19 - Ruby text tag
    <ruby>WWW<rt>World Wide Web</rt>oui<rt>yes</rt></ruby>
  • Voice tag (<v></v>)
    • Similar to class tag, also used to style the contained text using CSS.
    Example 20 - Voice tag
    <v Bob>text</v>

Interfaces

There are two interfaces or APIs used in WebVTT which are:

VTTCue interface

This interface represents a cue in the Document Object Model API; the attributes it supports can be used to create and alter cues in a number of ways.

A cue is created with the constructor VTTCue(startTime, endTime, text), which sets the start time, end time and text of the cue. The region that the cue belongs to can then be set using cue.region. The vertical, snapToLines, line, lineAlign, position, positionAlign, size, align and text attributes can be used to alter the cue and its presentation, much as CSS is used to alter the form, shape and visibility of objects in HTML. The VTTCue interface therefore provides a wide range of attributes that can be used directly to alter a cue. The following interface is used to expose WebVTT cues in the DOM API:

enum AutoKeyword { "auto" };
enum DirectionSetting { "" /* horizontal */, "rl", "lr" };
enum LineAlignSetting { "start", "center", "end" };
enum PositionAlignSetting { "start", "center", "end", "auto" };
enum AlignSetting { "start", "center", "end", "left", "right" };

[Constructor(double startTime, double endTime, DOMString text)]
interface VTTCue : TextTrackCue {
  attribute VTTRegion? region;
  attribute DirectionSetting vertical;
  attribute boolean snapToLines;
  attribute (double or AutoKeyword) line;
  attribute LineAlignSetting lineAlign;
  attribute (double or AutoKeyword) position;
  attribute PositionAlignSetting positionAlign;
  attribute double size;
  attribute AlignSetting align;
  attribute DOMString text;
  DocumentFragment getCueAsHTML();
};

VTT Region interface

This is the second interface in the WebVTT API.

A new VTTRegion object can be created with the new keyword and can then be used to contain multiple cues. The properties of VTTRegion are width, lines, regionAnchorX, regionAnchorY, viewportAnchorX, viewportAnchorY and scroll, which specify the look and feel of the region. The interface below is used to expose WebVTT regions in the DOM API:

enum ScrollSetting { "" /* none */, "up" };

[Constructor]
interface VTTRegion {
  attribute double width;
  attribute long lines;
  attribute double regionAnchorX;
  attribute double regionAnchorY;
  attribute double viewportAnchorX;
  attribute double viewportAnchorY;
  attribute ScrollSetting scroll;
};

Methods and properties

The methods, constructors and setting types of the WebVTT API are used to create and alter cues and regions; the two interfaces have different attributes. They are grouped by interface below for better understanding:

  • VTTCue

    • The method, constructor and setting types available with this interface are:
      • getCueAsHTML(): returns the text of the cue as an HTML DocumentFragment
      • VTTCue constructor: creates new cue objects
      • AutoKeyword: the "auto" value accepted by line and position
      • DirectionSetting: sets the writing direction of the caption text
      • LineAlignSetting: adjusts the line alignment
      • PositionAlignSetting: adjusts the position alignment of the text
  • VTTRegion

    • The constructor and setting type used for regions are listed below along with a description of their functionality:
      • ScrollSetting: adjusts the scrolling setting of all nodes present in the given region
      • VTTRegion constructor: creates new VTT regions

Tutorial on how to write a WebVTT file

A simple WebVTT file can be written in a few steps. Before starting, note that any plain text editor such as Notepad can be used, as long as the file is saved with the ‘.vtt’ extension. The steps are given below:

  1. Open a text editor such as Notepad.
  2. The first line of a WebVTT file is standardized, in the same way that some other languages require a header at the start of the file to indicate the file type. On the very first line you have to write ‘WEBVTT’.
  3. Leave the second line blank. On the third line, specify the timing for the first cue. For example, a first cue starting at 1 second and ending at 5 seconds is written as ‘00:01.000 --> 00:05.000’.
  4. On the next line, write the caption for this cue, which will be shown from the 1st second to the 5th second.
  5. Following similar steps, a complete WebVTT file for a specific video or audio file can be made.
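
The same steps can be automated when captions come from a program rather than a text editor. The following Python sketch writes the header, a blank line, and then one block per cue; the names format_timestamp and build_webvtt are hypothetical, times are given in integer milliseconds, and the caption text is assumed to be already escaped and free of blank lines.

```python
def format_timestamp(ms):
    """Format integer milliseconds as hh:mm:ss.ttt."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def build_webvtt(cues):
    """Build the text of a WebVTT file from (start_ms, end_ms, text) tuples."""
    # Step 2: the file starts with the WEBVTT signature.
    blocks = ["WEBVTT"]
    for start_ms, end_ms, text in cues:
        # Steps 3 and 4: a timing line followed directly by the caption.
        timing = f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}"
        blocks.append(f"{timing}\n{text}")
    # Blocks are separated by one blank line.
    return "\n\n".join(blocks) + "\n"
```

For example, build_webvtt([(1000, 5000, "Hello")]) produces a file with the header, a blank line, the timing line 00:00:01.000 --> 00:00:05.000 and the caption. The result should be saved as UTF-8 with the ‘.vtt’ extension.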

CSS Pseudo-classes

CSS pseudo-classes allow us to select objects of a particular type or state and style them differently from other objects. They work in WebVTT files in a similar manner to the way they work in HTML files.

WebVTT also supports localization and the use of classes, which can be used in the same way as in HTML and CSS to style particular kinds of objects; here they are used for styling and classifying the parts of a cue, as shown below:

WEBVTT

04:02.500 --> 04:05.000
J’ai commencé le basket à l'âge de 13, 14 ans

04:05.001 --> 04:07.800
Sur les <i.foreignphrase><lang en>playground</lang></i>, ici à Montpellier

In the above example, the <i> tag italicizes the foreign phrase and carries the class foreignphrase, while the <lang> tag defines the language of the enclosed caption text.

The type of pseudo-class is determined by the selector that uses it, and it works in a similar way as in HTML. The following CSS pseudo-classes can be used:

  • lang (language): e.g. p:lang(it)
  • link: e.g. a:link
  • nth-last-child: e.g. p:nth-last-child(2)
  • nth-child(n): e.g. p:nth-child(2)

Here p and a are the HTML tags for paragraph and link, respectively; they can be replaced by the identifiers used for cues in the WebVTT file.

Specifications

Specification Status Comment
{{SpecName("WebVTT")}} {{Spec2("WebVTT")}} Initial definition

Compatibility

{{CompatibilityTable}}

Feature         Chrome   Firefox (Gecko)   Internet Explorer   Opera   Safari
Basic support   18       28                10                  15.0    7

Feature         Android   Firefox Mobile (Gecko)   Chrome for Mobile   Opera Mobile   Safari Mobile
Basic support   4.4       {{CompatNo}}             35.0                21.0           7

 

Revision Source

<div>{{HTMLVersionHeader("5")}}</div>

<h2>Introduction to WebVTT</h2>

<p>WebVTT stands for Web Video Text Tracks. It is a W3C standard that defines a common format for video text tracks, so that browsers can support them as a standard feature. It is used in conjunction with the HTML &lt;track&gt; tag. Typical uses of WebVTT are captions and subtitles, which are written and edited in a separate WebVTT file and are displayed in time with the audio or video playing on a web page.</p>

<p>WebVTT is a format for displaying timed text tracks (e.g. subtitles) with the {{HTMLElement("track")}} element. The primary purpose of WebVTT files is to add subtitles to a {{HTMLElement("video")}}.</p>

<p>WebVTT is a text based format. A WebVTT file must be encoded in UTF-8 format. Where you can use spaces you can also use tabs.</p>

<p>The MIME type of WebVTT is <code>text/vtt</code>.</p>

<p>An example of WebVTT is given below:</p>

<pre>
WEBVTT

00:11.000 --&gt; 00:13.000
&lt;v Roger Bingham&gt;We are in New York City

00:13.000 --&gt; 00:16.000
&lt;v Roger Bingham&gt;We’re actually at the Lucern Hotel, just down the street

00:16.000 --&gt; 00:18.000
&lt;v Roger Bingham&gt;from the American Museum of Natural History

00:18.000 --&gt; 00:20.000
&lt;v Roger Bingham&gt;And with me is Neil deGrasse Tyson

00:20.000 --&gt; 00:22.000
&lt;v Roger Bingham&gt;Astrophysicist, Director of the Hayden Planetarium

00:22.000 --&gt; 00:24.000
&lt;v Roger Bingham&gt;at the AMNH.</pre>

<p>A WebVTT file contains cues, and the text of each cue can span a single line or multiple lines, as shown below:</p>

<pre>
WEBVTT

00:01.000 --&gt; 00:04.000
Never drink liquid nitrogen.

00:05.000 --&gt; 00:09.000
- It will perforate your stomach.
- You could die.
</pre>

<h2 id="WebVTT_Body">WebVTT Body</h2>

<p>The structure of a WebVTT file requires two things and has four optional components.</p>

<ul>
 <li>An optional byte order mark (BOM)</li>
 <li>The string <code>WEBVTT</code></li>
 <li>An optional text header to the right of <code>WEBVTT</code>.
  <ul>
   <li>There must be at least one space after <code>WEBVTT</code></li>
   <li>You might use this to add a description to the file</li>
   <li>You may use anything except newlines or the string "<code>--&gt;"</code></li>
  </ul>
 </li>
 <li>A blank line, which is equivalent to two consecutive newlines.</li>
 <li>Zero or more cues or comments.</li>
 <li>Zero or more blank lines</li>
</ul>

<h5 id="Example_1_-_Simplest_possible_WEBVTT_file">Example 1 - Simplest possible WEBVTT file</h5>

<pre class="eval">
WEBVTT
</pre>

<h5 id="Example_2_-_Very_simple_WebVTT_file">Example 2 - Very simple WebVTT file</h5>

<pre class="eval">
WEBVTT - This file has no cues.
</pre>

<h5 id="Example_3_-_Common_WebVTT_example">Example 3 - Common WebVTT example</h5>

<pre class="eval">
WEBVTT - This file has cues.

14
00:01:14.815 --&gt; 00:01:18.114
- What?
- Where are we now?

15
00:01:18.171 --&gt; 00:01:20.991
- This is big bat country.

16
00:01:21.058 --&gt; 00:01:23.868
- [ Bats Screeching ]
- They won't get in your hair. They're after the bugs.
</pre>

<h3>Inner Structure of WebVTT file</h3>

<p>The structure of a caption can be understood from the code above: the first time is the start time for showing the caption, and the second time, after the "--&gt;" arrow, is the end time for that caption. The caption text is written on the following lines; in this example each line of dialogue starts with a hyphen (-). Comments can also be placed before or after a cue, as in any programming language, to record information or reminders while working through a large file. Comments start with the NOTE keyword and can span a single line or multiple lines; the format is the same for both.</p>

<h2 id="WebVTT_Comment">WebVTT Comment</h2>

<p>Comments are an optional component that can be used to add information to a WebVTT file. Comments are intended for those reading the file and are not seen by users. A comment may contain newlines but cannot contain a blank line, which is equivalent to two consecutive newlines. A blank line signifies the end of a comment.</p>

<p>A comment cannot contain the string "<code>--&gt;",</code>&nbsp;the ampersand character (&amp;), or the less-than sign (&lt;). Instead use the escape sequence "&amp;amp;" for ampersand and "&amp;lt;" for less-than. It is also recommended that you use the greater-than escape sequence "&amp;gt;" instead of the greater-than character (&gt;) to avoid confusion with tags.</p>

<p>A comment consists of three parts:</p>

<ul>
 <li>The string <code>NOTE</code></li>
 <li>A space or a newline</li>
 <li>Zero or more characters other than those noted above</li>
</ul>

<h5 id="Example_4_-_Common_WebVTT_example">Example 4 - Common WebVTT example</h5>

<pre class="eval">
NOTE This is a comment
</pre>

<h5 id="Example_5_-_Multi-line_comment">Example 5 - Multi-line comment</h5>

<pre class="eval">
NOTE
Another comment that is spanning
more than one line.

NOTE You can also make a comment
across more than one line this way.
</pre>

<h5 id="Example_6_-_Common_comment_usage">Example 6 - Common comment usage</h5>

<pre class="eval">
WEBVTT - Translation of that film I like

NOTE
This translation was done by Kyle so that
some friends can watch it with their parents.

1
00:02:15.000 --&gt; 00:02:20.000
- Ta en kopp varmt te.
- Det är inte varmt.

2
00:02:20.000 --&gt; 00:02:25.000
- Har en kopp te.
- Det smakar som te.  

NOTE This last line may not translate well.

3
00:02:25.000 --&gt; 00:02:30.000
-Ta en kopp

</pre>
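<p>Because a blank line ends a comment, comment blocks can be separated from the rest of a file by splitting the text on blank lines. The following is a minimal JavaScript sketch of that idea, not a full WebVTT parser. The function name <code>removeComments</code> is hypothetical, and the sketch assumes that blank lines contain no spaces or tabs.</p>

```javascript
// Hypothetical helper: returns the blocks of a WebVTT file that are not comments.
// Assumes blocks are separated by truly empty lines (no stray spaces or tabs).
function removeComments(vttText) {
  return vttText
    .replace(/\r\n?/g, "\n")                 // normalize line endings
    .split(/\n{2,}/)                         // a blank line ends every block
    .filter((block) => block.trim() !== "")  // drop empty leftovers
    // A comment is the string NOTE followed by a space, a tab, a newline,
    // or the end of the block. "NOTEWORTHY" is therefore not a comment.
    .filter((block) => !/^NOTE(?:[ \t\n]|$)/.test(block));
}
```

<p>Applied to Example 6, this sketch would keep the header block and the three cue blocks and drop the two <code>NOTE</code> blocks.</p>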

<h2>Usage</h2>

<p>Cues can be styled with CSS so that they match the look and feel of the page they are attached to. The styles can be defined in the head of the HTML page, and the WebVTT file is linked to the page with the &lt;track&gt; tag and its src attribute. Note that WebVTT files use the <code>.vtt</code> file extension. This can be done as follows:</p>

<pre>
&lt;!doctype html&gt;
&lt;html&gt;
 &lt;head&gt;
  &lt;title&gt;Styling WebVTT cues&lt;/title&gt;
  &lt;style&gt;
   video::cue {
     background-image: linear-gradient(to bottom, dimgray, lightgray);
     color: papayawhip;
   }
   video::cue(b) {
     color: peachpuff;
   }
  &lt;/style&gt;
 &lt;/head&gt;
 &lt;body&gt;
  &lt;video controls autoplay src="video.webm"&gt;
   &lt;track default src="track.vtt"&gt;
  &lt;/video&gt;
 &lt;/body&gt;
&lt;/html&gt;</pre>

<p>The example above applies the styles from within the HTML file. Another way is to define the styles directly in the WebVTT file, which is useful when the design of the HTML pages changes but the cue styling should stay consistent throughout the website. An example of CSS styling within a WebVTT file is given below:</p>

<pre>
WEBVTT

STYLE
::cue {
  background-image: linear-gradient(to bottom, dimgray, lightgray);
  color: papayawhip;
}
/* Style blocks cannot use blank lines nor "dash dash greater than" */

NOTE comment blocks can be used between style blocks.

STYLE
::cue(b) {
  color: peachpuff;
}

hello
00:00:00.000 --&gt; 00:00:10.000
Hello &lt;b&gt;world&lt;/b&gt;.

NOTE style blocks cannot appear after the first cue.</pre>

<p>Cue identifiers in a WebVTT file can also be used to define a style for particular cues. In the example below, the transcription credit is colored red and the cue with the identifier 1 is colored lime. Note that the selectors use CSS escape sequences, just as they would in a style sheet for an HTML page:</p>

<pre>
WEBVTT

1
00:00.000 --&gt; 00:02.000
That’s an, an, that’s an L!

crédit de transcription
00:04.000 --&gt; 00:05.000
Transcrit par Célestes™
</pre>

<p>The style sheet then selects each cue by its identifier:</p>

<pre>
::cue(#\31) { color: lime; }
::cue(#crédit\ de\ transcription) { color: red; }
</pre>

<p>Positioning of captions or subtitles is also supported. It is done with cue settings placed after the cue timings, as shown below:</p>

<pre>
WEBVTT

00:00:00.000 --&gt; 00:00:04.000 position:10%,line-left align:left size:35%
Where did he go?

00:00:03.000 --&gt; 00:00:06.500 position:90% align:right size:35%
I think he went down this lane.

00:00:04.000 --&gt; 00:00:06.500 position:45%,line-right align:center size:35%
What are you waiting for?</pre>

<p>Connecting the HTML, CSS and WebVTT files is simple: the WebVTT file is attached to an HTML page in much the same way that an image is referenced with the &lt;img&gt; tag and its src attribute, except that WebVTT uses the &lt;track&gt; tag, placed inside the &lt;video&gt; element.</p>

<h2 id="WebVTT_Cues">WebVTT Cues</h2>

<p>A cue is a single subtitle block that has a single start time, end time, and textual payload. Example 6 consists of the header, a blank line, and then three cues and two comments, each separated by blank lines. A cue consists of five components:</p>

<ul>
 <li>An optional cue identifier followed by a newline</li>
 <li>Cue timings</li>
 <li>Optional cue settings with at least one space before the first and between each setting</li>
 <li>One or more newlines</li>
 <li>The cue payload text</li>
</ul>

<h5 id="Example_7_-_Example_of_a_cue">Example 7 - Example of a cue</h5>

<pre class="eval">
1 - Title Crawl
00:00:05.000 --&gt; 00:00:10.000 line:0 position:20% size:60% align:start
Some time ago in a place rather distant....</pre>
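<p>The five components listed above can be pulled apart with a few string operations. The following JavaScript sketch is only an illustration under simplifying assumptions: the function name <code>splitCueBlock</code> is hypothetical, the block is assumed to use <code>\n</code> line endings, and no validation of the timestamps or settings is attempted.</p>

```javascript
// Hypothetical helper: splits one cue block (no blank lines inside)
// into identifier, start, end, settings and payload.
function splitCueBlock(block) {
  const lines = block.split("\n");
  // The identifier is optional: it is present when the first line has no "-->".
  const identifier = lines[0].includes("-->") ? null : lines.shift();
  const timingLine = lines.shift();
  if (timingLine === undefined || !timingLine.includes("-->")) {
    throw new Error("Cue block has no timing line");
  }
  // Left of the arrow is the start time; right of it, the end time and settings.
  const [start, rest] = timingLine.split("-->").map((part) => part.trim());
  const [end, ...settings] = rest.split(/[ \t]+/);
  // Everything after the timing line is the payload text.
  return { identifier, start, end, settings, payload: lines.join("\n") };
}
```

<p>For the cue in Example 7, the sketch would report the identifier <code>1 - Title Crawl</code>, four settings, and the single payload line.</p>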

<h3 id="Cue_Identifier">Cue Identifier</h3>

<p>The identifier is a name that identifies the cue. It can be used to reference the cue from a script. It must not contain a newline and cannot contain the string "<code>--&gt;</code>". It must end with a single newline. Identifiers do not have to be unique, although it is common to number them (e.g. 1, 2, 3, ...).</p>

<h5 id="Example_8_-_Cue_identifier_from_Example_7">Example 8 - Cue identifier from Example 7</h5>

<pre class="eval">
1 - Title Crawl</pre>

<h5 id="Example_9_-_Common_usage_of_identifiers">Example 9 - Common usage of identifiers</h5>

<pre class="eval">
WEBVTT

1
00:00:22.230 --&gt; 00:00:24.606
This is the first subtitle.

2
00:00:30.739 --&gt; 00:00:34.074
This is the second.

3
00:00:34.159 --&gt; 00:00:35.743
Third
</pre>

<h3 id="Cue_Timings">Cue Timings</h3>

<p>A cue timing indicates when the cue is shown. It has a start and end time which are represented by timestamps. The end time must be greater than the start time, and the start time must be greater than or equal to all previous start times. Cues may have overlapping timings.</p>

<p>If the WebVTT file is being used for chapters ({{HTMLElement("track")}} {{htmlattrxref("kind")}} is <code>chapters</code>) then the file cannot have overlapping timings.</p>

<p>Each cue timing contains five components:</p>

<ul>
 <li>Timestamp for start time</li>
 <li>At least one space</li>
 <li>The string "<code>--&gt;"</code></li>
 <li>At least one space</li>
 <li>Timestamp for end time
  <ul>
   <li>Which must be greater than the start time</li>
  </ul>
 </li>
</ul>

<p>The timestamps must be in one of two formats:</p>

<ul>
 <li><code>mm:ss.ttt</code></li>
 <li><code>hh:mm:ss.ttt</code></li>
</ul>

<p>Where the components are defined as follows:</p>

<ul>
 <li><code>hh</code> is hours

  <ul>
   <li>Must be at least two digits</li>
   <li>Hours can be greater than two digits (e.g. 9999:00:00.000)</li>
  </ul>
 </li>
 <li><code>mm</code> is minutes
  <ul>
   <li>Must be between 00 and 59 inclusive</li>
  </ul>
 </li>
 <li><code>ss</code> is seconds
  <ul>
   <li>Must be between 00 and 59 inclusive</li>
  </ul>
 </li>
 <li><code>ttt</code> is milliseconds
  <ul>
   <li>Must be between 000 and 999 inclusive</li>
  </ul>
 </li>
</ul>

<h5 id="Example_10_-_Basic_cue_timing_examples">Example 10 - Basic cue timing examples</h5>

<pre class="eval">
00:22.230 --&gt; 00:24.606
00:30.739 --&gt; 00:00:34.074
00:00:34.159 --&gt; 00:35.743
00:00:35.827 --&gt; 00:00:40.122</pre>
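<p>The two timestamp formats and the limits on each component can be expressed as a single regular expression. The sketch below is one possible way to convert a timestamp to milliseconds in JavaScript; the function name <code>parseTimestamp</code> is hypothetical, and returning whole milliseconds is a choice made here to avoid floating-point rounding.</p>

```javascript
// Hypothetical helper: converts "mm:ss.ttt" or "hh:mm:ss.ttt" to milliseconds.
function parseTimestamp(timestamp) {
  // hours: optional, two or more digits; minutes and seconds: 00-59;
  // milliseconds: exactly three digits.
  const match = /^(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})$/.exec(timestamp);
  if (match === null) {
    throw new Error(`Invalid WebVTT timestamp: ${timestamp}`);
  }
  const [, hh = "0", mm, ss, ttt] = match; // hours default to 0 when omitted
  return ((Number(hh) * 60 + Number(mm)) * 60 + Number(ss)) * 1000 + Number(ttt);
}

// A cue timing is valid only if the end time is greater than the start time.
function isValidTiming(start, end) {
  return parseTimestamp(end) > parseTimestamp(start);
}
```

<p>Because both formats reduce to the same number, timings that mix them, such as the second and third lines of Example 10, can be compared directly.</p>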

<h5 id="Example_11_-_Overlapping_cue_timing_examples">Example 11 - Overlapping cue timing examples</h5>

<pre class="eval">
00:00:00.000 --&gt; 00:00:10.000
00:00:05.000 --&gt; 00:01:00.000
00:00:30.000 --&gt; 00:00:50.000</pre>

<h5 id="Example_12_-_Non-overlapping_cue_timing_examples">Example 12 - Non-overlapping cue timing examples</h5>

<pre class="eval">
00:00:00.000 --&gt; 00:00:10.000
00:00:10.000 --&gt; 00:01:00.581
00:01:00.581 --&gt; 00:02:00.100
00:02:01.000 --&gt; 00:02:02.000</pre>
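<p>Whether a list of timings overlaps, which matters for chapter tracks, can be checked in one pass. The JavaScript sketch below is an illustration only: the function name <code>hasOverlap</code> is hypothetical, times are assumed to be already converted to milliseconds, and the cues are assumed to be in file order with non-decreasing start times, as required above.</p>

```javascript
// Hypothetical helper. cues: array of { start, end } in milliseconds, in file order.
function hasOverlap(cues) {
  let latestEnd = -Infinity;
  for (const { start, end } of cues) {
    // A cue overlaps if it starts before an earlier cue has ended.
    // Starting exactly when the previous cue ends is not an overlap.
    if (start < latestEnd) return true;
    latestEnd = Math.max(latestEnd, end);
  }
  return false;
}
```

<p>With the timings of Example 11 the sketch reports an overlap, because the second cue starts at 5 seconds while the first is still showing until 10 seconds.</p>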

<h3 id="Cue_Settings">Cue Settings</h3>

<p>Cue settings are optional components used to position where the cue payload text will be displayed over the video. This includes whether the text is displayed horizontally or vertically. There can be zero or more of them, and they can be used in any order so long as each setting is used no more than once.</p>

<p>The cue settings are added to the right of the cue timings. There must be one or more spaces between the cue timing and the first setting and between each setting. A setting's name and value are separated by a colon. The settings are case sensitive so use lower case as shown. There are five cue settings:</p>

<ul>
 <li><strong>vertical</strong>

  <ul>
   <li>Indicates that the text will be displayed vertically rather than horizontally, such as in some Asian languages.</li>
  </ul>

  <table>
   <thead>
    <tr>
     <th colspan="2">Table 1 - vertical values</th>
    </tr>
   </thead>
   <tbody>
    <tr>
     <th><code>vertical:rl</code></th>
     <td>writing direction is right to left</td>
    </tr>
    <tr>
     <th><code>vertical:lr</code></th>
     <td>writing direction is left to right</td>
    </tr>
   </tbody>
  </table>
 </li>
 <li><strong>line</strong>
  <ul>
   <li>Specifies where text appears vertically. If vertical is set, line specifies where text appears horizontally.</li>
   <li>Value can be a line number
    <ul>
     <li>The line height is the height of the first line of the cue as it appears on the video</li>
     <li>Positive numbers indicate top down</li>
     <li>Negative numbers indicate bottom up</li>
    </ul>
   </li>
   <li>Or value can be a percentage
    <ul>
     <li>Must be an integer (i.e. no decimals) between 0 and 100 inclusive</li>
     <li>Must be followed by a percent sign (%)</li>
    </ul>
   </li>
  </ul>

  <table>
   <thead>
    <tr>
     <th colspan="4">Table 2 - line examples</th>
    </tr>
   </thead>
   <tbody>
    <tr>
     <th>&nbsp;</th>
     <th><code>vertical</code> omitted</th>
     <th><code>vertical:rl</code></th>
     <th><code>vertical:lr</code></th>
    </tr>
    <tr>
     <th><code>line:0</code></th>
     <td>top</td>
     <td>right</td>
     <td>left</td>
    </tr>
    <tr>
     <th><code>line:-1</code></th>
     <td>bottom</td>
     <td>left</td>
     <td>right</td>
    </tr>
    <tr>
     <th><code>line:0%</code></th>
     <td>top</td>
     <td>right</td>
     <td>left</td>
    </tr>
    <tr>
     <th><code>line:100%</code></th>
     <td>bottom</td>
     <td>left</td>
     <td>right</td>
    </tr>
   </tbody>
  </table>
 </li>
 <li><strong>position</strong>
  <ul>
   <li>Specifies where the text will appear horizontally. If vertical is set, position specifies where the text will appear vertically.</li>
   <li>Value is a percentage</li>
   <li>Must be an integer (no decimals) between 0 and 100 inclusive</li>
   <li>Must be followed by a percent sign (%)</li>
  </ul>

  <table>
   <thead>
    <tr>
     <th colspan="4">Table 3 - position examples</th>
    </tr>
   </thead>
   <tbody>
    <tr>
     <th>&nbsp;</th>
     <th><code>vertical</code> omitted</th>
     <th><code>vertical:rl</code></th>
     <th><code>vertical:lr</code></th>
    </tr>
    <tr>
     <th><code>position:0%</code></th>
     <td>left</td>
     <td>top</td>
     <td>top</td>
    </tr>
    <tr>
     <th><code>position:100%</code></th>
     <td>right</td>
     <td>bottom</td>
     <td>bottom</td>
    </tr>
   </tbody>
  </table>
 </li>
 <li><strong>size</strong>
  <ul>
   <li>Specifies the width of the text area. If vertical is set, size specifies the height of the text area.</li>
   <li>Value is a percentage</li>
   <li>Must be an integer (i.e. no decimals) between 0 and 100 inclusive</li>
   <li>Must be followed by a percent sign (%)</li>
  </ul>

  <table>
   <thead>
    <tr>
     <th colspan="4">Table 4 - size examples</th>
    </tr>
   </thead>
   <tbody>
    <tr>
     <th>&nbsp;</th>
     <th><code>vertical</code> omitted</th>
     <th><code>vertical:rl</code></th>
     <th><code>vertical:lr</code></th>
    </tr>
    <tr>
     <th><code>size:100%</code></th>
     <td>full width</td>
     <td>full height</td>
     <td>full height</td>
    </tr>
    <tr>
     <th><code>size:50%</code></th>
     <td>half width</td>
     <td>half height</td>
     <td>half height</td>
    </tr>
   </tbody>
  </table>
 </li>
 <li><strong>align</strong>
  <ul>
   <li>Specifies the alignment of the text. Text is aligned within the space given by the size cue setting if it is set.</li>
  </ul>

  <table>
   <thead>
    <tr>
     <th colspan="4">Table 5 - align values</th>
    </tr>
   </thead>
   <tbody>
    <tr>
     <th>&nbsp;</th>
     <th><code>vertical</code> omitted</th>
     <th><code>vertical:rl</code></th>
     <th><code>vertical:lr</code></th>
    </tr>
    <tr>
     <th><code>align:start</code></th>
     <td>left</td>
     <td>top</td>
     <td>top</td>
    </tr>
    <tr>
     <th><code>align:center</code></th>
     <td>centred horizontally</td>
     <td>centred vertically</td>
     <td>centred vertically</td>
    </tr>
    <tr>
     <th><code>align:end</code></th>
     <td>right</td>
     <td>bottom</td>
     <td>bottom</td>
    </tr>
   </tbody>
  </table>
 </li>
</ul>

<h5 id="Example_13_-_Cue_setting_examples">Example 13 - Cue setting examples</h5>

<p>The first line demonstrates no settings. The second line might be used to overlay text on a sign or label. The third line might be used for a title. The last line might be used for an Asian language.</p>

<pre class="eval">
00:00:05.000 --&gt; 00:00:10.000
00:00:05.000 --&gt; 00:00:10.000 line:63% position:72% align:start
00:00:05.000 --&gt; 00:00:10.000 line:0 position:20% size:60% align:start
00:00:05.000 --&gt; 00:00:10.000 vertical:rl line:-1 align:end
</pre>
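<p>The rules for cue settings (names separated from values by a colon, lower-case names, each setting used at most once) can be sketched as a small JavaScript function. This is an illustration under stated assumptions, not how browsers behave: the function name <code>parseCueSettings</code> is hypothetical, values are kept as strings without range checks, and the sketch throws on unknown settings where a real parser might simply ignore them.</p>

```javascript
// Hypothetical helper: turns "line:0 position:20%" into { line: "0", position: "20%" }.
function parseCueSettings(settingsText) {
  const known = new Set(["vertical", "line", "position", "size", "align"]);
  const settings = {};
  for (const token of settingsText.trim().split(/[ \t]+/)) {
    if (token === "") continue; // no settings at all
    const colon = token.indexOf(":");
    const name = token.slice(0, colon);
    const value = token.slice(colon + 1);
    // Names are case sensitive, so "Line:0" is rejected here.
    if (colon < 1 || value === "" || !known.has(name)) {
      throw new Error(`Unrecognized cue setting: ${token}`);
    }
    if (name in settings) {
      throw new Error(`Duplicate cue setting: ${name}`);
    }
    settings[name] = value;
  }
  return settings;
}
```

<p>The settings may appear in any order, so the result is returned as an object keyed by setting name rather than as a list.</p>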

<h3 id="Cue_Payload">Cue Payload</h3>

<p>The payload is where the main information or content is located. In normal usage the payload contains the subtitles to be displayed.&nbsp;The payload text may contain newlines but it cannot contain a blank line, which is equivalent to two consecutive newlines. A blank line signifies the end of a cue.</p>

<p>A cue text payload cannot contain the string "<code>--&gt;</code>", the ampersand character (&amp;), or the less-than sign (&lt;). Instead use the escape sequence "&amp;amp;" for ampersand and "&amp;lt;" for less-than. It is also recommended that you use the greater-than escape sequence "&amp;gt;" instead of the greater-than character (&gt;) to avoid confusion with tags. If you are using the WebVTT file for metadata these restrictions do not apply.</p>

<p>In addition to the three escape sequences mentioned above, there are three others. All six are listed in the table below.</p>

<table>
 <thead>
  <tr>
   <th colspan="3">Table 6 - Escape sequences</th>
  </tr>
 </thead>
 <tbody>
  <tr>
   <th>Name</th>
   <th>Character</th>
   <th>Escape Sequence</th>
  </tr>
  <tr>
   <td>Ampersand</td>
   <td>&amp;</td>
   <td><code>&amp;amp;</code></td>
  </tr>
  <tr>
   <td>Less-than</td>
   <td>&lt;</td>
   <td><code>&amp;lt;</code></td>
  </tr>
  <tr>
   <td>Greater-than</td>
   <td>&gt;</td>
   <td><code>&amp;gt;</code></td>
  </tr>
  <tr>
   <td>Left-to-right mark</td>
   <td>&nbsp;</td>
   <td><code>&amp;lrm;</code></td>
  </tr>
  <tr>
   <td>Right-to-left mark</td>
   <td>&nbsp;</td>
   <td><code>&amp;rlm;</code></td>
  </tr>
  <tr>
   <td>Non-breaking space</td>
   <td><code>&nbsp;</code></td>
   <td><code>&amp;nbsp;</code></td>
  </tr>
 </tbody>
</table>
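<p>When cue text is generated from arbitrary strings, the first three escape sequences in Table 6 can be applied programmatically. The following JavaScript sketch assumes the input is plain text with no cue tags in it, since any tags would be escaped as well; the function name <code>escapeCueText</code> is hypothetical.</p>

```javascript
// Hypothetical helper: escapes plain text for use as a cue payload.
function escapeCueText(text) {
  return text
    .replace(/&/g, "&amp;") // must run first so the escapes below are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;"); // also removes any "-->" from the payload
}
```

<p>Escaping the greater-than sign is what guarantees that the forbidden string <code>--&gt;</code> cannot appear in the output.</p>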

<h3 id="Cue_Payload_Text_Tags">Cue Payload Text Tags</h3>

<p>There are a number of tags, such as <code>&lt;b&gt;</code>, that can be used. However, if the WebVTT file is used in a {{HTMLElement("track")}} element where the attribute {{htmlattrxref("kind")}} is <code>chapters</code> then you cannot use tags.</p>

<ul>
 <li><strong>Timestamp tag</strong>

  <ul>
   <li>The timestamp must be greater than the cue's start timestamp, greater than any previous timestamp in the cue payload, and less than the cue's end timestamp. The <em>active text</em> is the text between the timestamp and the next timestamp, or the end of the payload if there is no other timestamp in the payload. Any text before the <em>active text</em> in the payload is <em>previous text</em>. Any text beyond the <em>active text</em> is <em>future text</em>. This enables karaoke style captions.</li>
  </ul>

  <div>
  <h5 id="Example_12_-_Karaoke_style_text">Example 12 - Karaoke style text</h5>

  <pre class="eval">
1
00:16.500 --&gt; 00:18.500
When the moon &lt;00:17.500&gt;hits your eye

1
00:00:18.500 --&gt; 00:00:20.500
Like a &lt;00:19.000&gt;big-a &lt;00:19.500&gt;pizza &lt;00:20.000&gt;pie

1
00:00:20.500 --&gt; 00:00:21.500
That's &lt;00:00:21.000&gt;amore
      </pre>
  </div>
 </li>
</ul>

<p>The following tags are the HTML tags allowed in a cue and require opening and closing tags (e.g. <code>&lt;b&gt;text&lt;/b&gt;</code>).</p>

<ul>
 <li><strong>Class tag</strong> (<code>&lt;c&gt;&lt;/c&gt;</code>)

  <ul>
   <li>Style the contained text using a CSS class.</li>
  </ul>

  <div>
  <h5 id="Example_14_-_Class_tag">Example 14 - Class tag</h5>

  <pre class="brush: html">
&lt;c.classname&gt;text&lt;/c&gt;</pre>
  </div>
 </li>
 <li><strong>Italics tag</strong> (<code>&lt;i&gt;&lt;/i&gt;</code>)
  <ul>
   <li>Italicize the contained text.</li>
  </ul>

  <div>
  <h5 id="Example_15_-_Italics_tag">Example 15 - Italics tag</h5>

  <pre class="brush: html">
&lt;i&gt;text&lt;/i&gt;</pre>
  </div>
 </li>
 <li><strong>Bold tag</strong> (<code>&lt;b&gt;&lt;/b&gt;</code>)
  <ul>
   <li>Bold the contained text.</li>
  </ul>

  <div>
  <h5 id="Example_16_-_Bold_tag">Example 16 - Bold tag</h5>

  <pre class="brush: html">
&lt;b&gt;text&lt;/b&gt;</pre>
  </div>
 </li>
 <li><strong>Underline tag</strong> (<code>&lt;u&gt;&lt;/u&gt;</code>)
  <ul>
   <li>Underline the contained text.</li>
  </ul>

  <div>
  <h5 id="Example_17_-_Underline_tag">Example 17 - Underline tag</h5>

  <pre class="brush: html">
&lt;u&gt;text&lt;/u&gt;</pre>
  </div>
 </li>
 <li><strong>Ruby tag</strong> (<code>&lt;ruby&gt;&lt;/ruby&gt;</code>)
  <ul>
   <li>Used with ruby text tags to display <a href="https://en.wikipedia.org/wiki/Ruby_character">ruby characters</a> (i.e. small annotative characters above other characters).</li>
  </ul>

  <div>
  <h5 id="Example_18_-_Ruby_tag">Example 18 - Ruby tag</h5>

  <pre class="brush: html">
&lt;ruby&gt;WWW&lt;rt&gt;World Wide Web&lt;/rt&gt;oui&lt;rt&gt;yes&lt;/rt&gt;&lt;/ruby&gt;</pre>
  </div>
 </li>
 <li><strong>Ruby text tag</strong> (<code>&lt;rt&gt;&lt;/rt&gt;</code>)
  <ul>
   <li>Used with ruby tags to display <a href="https://en.wikipedia.org/wiki/Ruby_character">ruby characters</a> (i.e. small annotative characters above other characters).</li>
  </ul>

  <div>
  <h5 id="Example_19_-_Ruby_text_tag">Example 19 - Ruby text tag</h5>

  <pre class="brush: html">
&lt;ruby&gt;WWW&lt;rt&gt;World Wide Web&lt;/rt&gt;oui&lt;rt&gt;yes&lt;/rt&gt;&lt;/ruby&gt;</pre>
  </div>
 </li>
 <li><strong>Voice tag</strong> (<code>&lt;v&gt;&lt;/v&gt;</code>)
  <ul>
   <li>Identifies the speaker of the contained text, as in <code>&lt;v Bob&gt;</code>. Like the class tag, it can also be used to style the contained text using CSS.</li>
  </ul>

  <div>
  <h5 id="Example_20_-_Voice_tag">Example 20 - Voice tag</h5>

  <pre class="brush: html">
&lt;v Bob&gt;text&lt;/v&gt;</pre>
  </div>
 </li>
</ul>

<h2>Interfaces</h2>

<p>The WebVTT API exposes two interfaces:</p>

<h3>VTTCue interface</h3>

<p>The VTTCue interface exposes cues in the Document Object Model API. Its attributes can be used to prepare and alter cues in a number of ways.</p>

<p>A cue is created with the constructor <code>VTTCue(startTime, endTime, text)</code>, which sets the start time, end time and text of the cue. The region that the cue belongs to can then be set using <code>cue.region</code>. The <code>vertical</code>, <code>snapToLines</code>, <code>line</code>, <code>lineAlign</code>, <code>position</code>, <code>positionAlign</code>, <code>size</code>, <code>align</code> and <code>text</code> attributes can be used to alter the cue and its layout, much as CSS alters the form and visibility of elements in HTML. The following interface is used to expose WebVTT cues in the DOM API:</p>

<pre>
enum <dfn data-dfn-type="enum" data-export="" id="enumdef-autokeyword"><strong>AutoKeyword</strong></dfn> { "auto" };

enum <dfn data-dfn-type="enum" data-export="" id="enumdef-directionsetting"><strong>DirectionSetting</strong></dfn> { "" /* horizontal */, "rl", "lr" };

enum <dfn data-dfn-type="enum" data-export="" id="enumdef-linealignsetting"><strong>LineAlignSetting</strong></dfn> { "start", "center", "end" };

enum <dfn data-dfn-type="enum" data-export="" id="enumdef-positionalignsetting"><strong>PositionAlignSetting</strong></dfn> { "start", "center", "end", "auto" };

enum <dfn data-dfn-type="enum" data-export="" id="enumdef-alignsetting"><strong>AlignSetting</strong></dfn> { "start", "center", "end", "left", "right" };

[<a data-link-type="constructor" href="https://w3c.github.io/webvtt/#dom-vttcue-vttcue">Constructor</a>(double <dfn data-dfn-for="VTTCue/VTTCue(startTime, endTime, text)" data-dfn-type="argument" data-export="" id="dom-vttcue-vttcue-starttime-endtime-text-starttime"><strong>startTime</strong></dfn>, double <dfn data-dfn-for="VTTCue/VTTCue(startTime, endTime, text)" data-dfn-type="argument" data-export="" id="dom-vttcue-vttcue-starttime-endtime-text-endtime"><strong>endTime</strong></dfn>, DOMString <dfn data-dfn-for="VTTCue/VTTCue(startTime, endTime, text)" data-dfn-type="argument" data-export="" id="dom-vttcue-vttcue-starttime-endtime-text-text"><strong>text</strong></dfn>)]

interface <dfn data-dfn-type="interface" data-export="" id="vttcue"><strong>VTTCue</strong></dfn> : <a data-link-type="idl-name" href="https://html.spec.whatwg.org/multipage/embedded-content.html#texttrackcue">TextTrackCue</a> {

  attribute <a data-link-type="idl-name" href="https://w3c.github.io/webvtt/#vttregion">VTTRegion</a>? <a data-link-type="attribute" data-type="VTTRegion? " href="https://w3c.github.io/webvtt/#dom-vttcue-region">region</a>;

  attribute <a data-link-type="idl-name" href="https://w3c.github.io/webvtt/#enumdef-directionsetting">DirectionSetting</a> <a data-link-type="attribute" data-type="DirectionSetting " href="https://w3c.github.io/webvtt/#dom-vttcue-vertical">vertical</a>;

  attribute boolean <a data-link-type="attribute" data-type="boolean " href="https://w3c.github.io/webvtt/#dom-vttcue-snaptolines">snapToLines</a>;

  attribute (double or <a data-link-type="idl-name" href="https://w3c.github.io/webvtt/#enumdef-autokeyword">AutoKeyword</a>) <a data-link-type="attribute" data-type="(double or AutoKeyword) " href="https://w3c.github.io/webvtt/#dom-vttcue-line">line</a>;

  attribute <a data-link-type="idl-name" href="https://w3c.github.io/webvtt/#enumdef-linealignsetting">LineAlignSetting</a> <a data-link-type="attribute" data-type="LineAlignSetting " href="https://w3c.github.io/webvtt/#dom-vttcue-linealign">lineAlign</a>;

  attribute (double or <a data-link-type="idl-name" href="https://w3c.github.io/webvtt/#enumdef-autokeyword">AutoKeyword</a>) <a data-link-type="attribute" data-type="(double or AutoKeyword) " href="https://w3c.github.io/webvtt/#dom-vttcue-position">position</a>;

  attribute <a data-link-type="idl-name" href="https://w3c.github.io/webvtt/#enumdef-positionalignsetting">PositionAlignSetting</a> <a data-link-type="attribute" data-type="PositionAlignSetting " href="https://w3c.github.io/webvtt/#dom-vttcue-positionalign">positionAlign</a>;

  attribute double <a data-link-type="attribute" data-type="double " href="https://w3c.github.io/webvtt/#dom-vttcue-size">size</a>;

  attribute <a data-link-type="idl-name" href="https://w3c.github.io/webvtt/#enumdef-alignsetting">AlignSetting</a> <a data-link-type="attribute" data-type="AlignSetting " href="https://w3c.github.io/webvtt/#dom-vttcue-align">align</a>;

  attribute DOMString <a data-link-type="attribute" data-type="DOMString " href="https://w3c.github.io/webvtt/#dom-vttcue-text">text</a>;

  <a data-link-type="idl-name" href="https://dom.spec.whatwg.org/#documentfragment">DocumentFragment</a> <a data-link-type="method" href="https://w3c.github.io/webvtt/#dom-vttcue-getcueashtml">getCueAsHTML</a>();

};
</pre>
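<p>As a rough illustration of the interface above, the following JavaScript sketch creates a cue and copies a few settings onto it. The names <code>makeCue</code> and <code>CueCtor</code> are hypothetical; the constructor is passed in as a parameter only so that the sketch can be exercised outside a browser, where the global <code>VTTCue</code> is not available. The usage shown in the trailing comment assumes a page that contains a &lt;video&gt; element.</p>

```javascript
// Hypothetical helper: builds a cue with the given constructor (VTTCue in a browser).
// Times are in seconds, as in VTTCue(startTime, endTime, text).
function makeCue(CueCtor, startSeconds, endSeconds, text, settings = {}) {
  if (!(endSeconds > startSeconds)) {
    throw new RangeError("Cue end time must be greater than its start time");
  }
  const cue = new CueCtor(startSeconds, endSeconds, text);
  // Copy only attributes that appear in the interface definition above.
  const allowed = ["vertical", "snapToLines", "line", "lineAlign",
                   "position", "positionAlign", "size", "align"];
  for (const name of allowed) {
    if (name in settings) cue[name] = settings[name];
  }
  return cue;
}

// Assumed browser usage:
//   const track = document.querySelector("video").addTextTrack("captions");
//   track.mode = "showing";
//   track.addCue(makeCue(VTTCue, 1, 5, "Never drink liquid nitrogen.", { align: "start" }));
```

<p>Restricting the copied names to a fixed list keeps unrelated properties in <code>settings</code> from being written onto the cue object.</p>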

<h3>VTTRegion interface</h3>

<p>This is the second interface in the WebVTT API.</p>

<p>The <code>new</code> keyword can be used to create a VTTRegion object, which can then contain multiple cues. The properties <code>width</code>, <code>lines</code>, <code>regionAnchorX</code>, <code>regionAnchorY</code>, <code>viewportAnchorX</code>, <code>viewportAnchorY</code> and <code>scroll</code> specify the look and feel of the region. The following interface is used to expose WebVTT regions in the DOM API:</p>

<pre>
enum <dfn data-dfn-type="enum" data-export="" id="enumdef-scrollsetting"><strong>ScrollSetting</strong></dfn> { "" /* none */, "up" };

[<a data-link-type="constructor" href="https://w3c.github.io/webvtt/#dom-vttregion-vttregion">Constructor</a>]

interface <dfn data-dfn-type="interface" data-export="" id="vttregion"><strong>VTTRegion</strong></dfn> {

  attribute double <a data-link-type="attribute" data-type="double " href="https://w3c.github.io/webvtt/#dom-vttregion-width">width</a>;

  attribute long <a data-link-type="attribute" data-type="long " href="https://w3c.github.io/webvtt/#dom-vttregion-lines">lines</a>;

  attribute double <a data-link-type="attribute" data-type="double " href="https://w3c.github.io/webvtt/#dom-vttregion-regionanchorx">regionAnchorX</a>;

  attribute double <a data-link-type="attribute" data-type="double " href="https://w3c.github.io/webvtt/#dom-vttregion-regionanchory">regionAnchorY</a>;

  attribute double <a data-link-type="attribute" data-type="double " href="https://w3c.github.io/webvtt/#dom-vttregion-viewportanchorx">viewportAnchorX</a>;

  attribute double <a data-link-type="attribute" data-type="double " href="https://w3c.github.io/webvtt/#dom-vttregion-viewportanchory">viewportAnchorY</a>;

  attribute <a data-link-type="idl-name" href="https://w3c.github.io/webvtt/#enumdef-scrollsetting">ScrollSetting</a> <a data-link-type="attribute" data-type="ScrollSetting " href="https://w3c.github.io/webvtt/#dom-vttregion-scroll">scroll</a>;

};</pre>

<h2>Methods and properties</h2>

<p>The methods and related types of the two interfaces are used to create and alter cues and regions. They are grouped below by interface:</p>

<ul style="list-style-type:circle;">
 <li>
  <h3>VTTCue</h3>

  <ul>
   <li>The following are available for cues:
    <ul style="list-style-type:circle;">
     <li><code>getCueAsHTML()</code>: a method that returns the cue text as a DocumentFragment.</li>
     <li><code>VTTCue()</code> constructor: creates new cue objects.</li>
     <li><code>AutoKeyword</code>: an enumeration with the single value "auto", accepted by the line and position attributes.</li>
     <li><code>DirectionSetting</code>: an enumeration that sets the direction of the caption text.</li>
     <li><code>LineAlignSetting</code>: an enumeration that adjusts the line alignment.</li>
     <li><code>PositionAlignSetting</code>: an enumeration that adjusts the position alignment of the text.</li>
    </ul>
   </li>
  </ul>
 </li>
 <li>
  <h3>VTTRegion</h3>

  <ul>
   <li>The following are available for regions:
    <ul style="list-style-type:circle;">
     <li><code>ScrollSetting</code>: an enumeration that adjusts the scrolling of all nodes present in a region.</li>
     <li><code>VTTRegion()</code> constructor: creates new regions.</li>
    </ul>
   </li>
  </ul>
 </li>
</ul>

<h2>Tutorial on how to write a WebVTT file</h2>

<p>A simple WebVTT file can be written in a few steps. Any plain text editor, such as Notepad, can be used, as long as the file is saved with the <code>.vtt</code> extension. The steps are given below:</p>

<ol>
 <li>Open a text editor.</li>
 <li>The first line of a WebVTT file is a standard header that indicates the file type, similar to the headers some other languages require. On the very first line, write:
  <pre>
WEBVTT</pre>
 </li>
 <li>Leave the second line blank. On the third line, specify the timing of the first cue. For example, a first cue that starts at 1 second and ends at 5 seconds is written as:
  <pre>
00:01.000 --&gt; 00:05.000</pre>
 </li>
 <li>On the next line, write the caption for this cue, which will be shown from the 1st to the 5th second.</li>
 <li>Following the same steps, a complete WebVTT file for a specific video or audio file can be made.</li>
</ol>
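<p>The same steps can be followed by a program instead of by hand. The JavaScript sketch below writes the header, a blank line, and then one block per cue. The function names <code>formatTimestamp</code> and <code>buildWebVTT</code> are hypothetical, times are assumed to be whole, non-negative milliseconds, and the caption text is assumed to be already escaped and free of blank lines.</p>

```javascript
// Hypothetical helper: formats whole milliseconds as hh:mm:ss.ttt.
function formatTimestamp(ms) {
  const pad = (value, width) => String(value).padStart(width, "0");
  const hours = Math.floor(ms / 3600000);
  const minutes = Math.floor(ms / 60000) % 60;
  const seconds = Math.floor(ms / 1000) % 60;
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)}.${pad(ms % 1000, 3)}`;
}

// Hypothetical helper. cues: array of { start, end, text }, times in milliseconds.
function buildWebVTT(cues) {
  const blocks = cues.map(({ start, end, text }) =>
    `${formatTimestamp(start)} --> ${formatTimestamp(end)}\n${text}`);
  // The header comes first; every block is separated by one blank line.
  return ["WEBVTT", ...blocks].join("\n\n") + "\n";
}
```

<p>The resulting string can then be saved as a UTF-8 encoded file with the <code>.vtt</code> extension.</p>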

<h2>CSS Pseudo-classes</h2>

<p>CSS pseudo-classes allow us to select a type of object that we want to differentiate from other objects. They work in WebVTT files in a similar manner to the way they work in HTML files.</p>

<p>WebVTT also supports localization and class names, which can be used in the same way as in HTML and CSS to style particular types of objects. Here they are used for classifying and styling cues, as shown below:</p>

<pre>
WEBVTT

04:02.500 --&gt; 04:05.000
J’ai commencé le basket à l'âge de 13, 14 ans

04:05.001 --&gt; 04:07.800
Sur les &lt;i.foreignphrase&gt;&lt;lang en&gt;playground&lt;/lang&gt;&lt;/i&gt;, ici à Montpellier</pre>

<p>In the above example, the &lt;lang&gt; tag defines the language of part of the caption, and the &lt;i&gt; tag, which is for italics, carries the class name foreignphrase.</p>

<p>The type of pseudo-class is determined by the selector that uses it, and it works in a similar way as in HTML. The following CSS pseudo-classes can be used:</p>

<ul>
 <li>lang (language): e.g. p:lang(it)</li>
 <li>link: e.g. a:link</li>
 <li>nth-last-child: e.g. p:nth-last-child(2)</li>
 <li>nth-child(n): e.g. p:nth-child(2)</li>
</ul>

<p>Here p and a are the HTML tags for paragraphs and links, respectively; they can be replaced by the identifiers that are used for cues in the WebVTT file.</p>

<h2 id="Specifications">Specifications</h2>

<table class="standard-table">
 <tbody>
  <tr>
   <th>Specification</th>
   <th>Status</th>
   <th>Comment</th>
  </tr>
  <tr>
   <td>{{SpecName("WebVTT")}}</td>
   <td>{{Spec2("WebVTT")}}</td>
   <td>Initial definition</td>
  </tr>
 </tbody>
</table>

<h2 id="Compatibility">Compatibility</h2>

<p>{{CompatibilityTable}}</p>

<div id="compat-desktop">
<table class="compat-table">
 <tbody>
  <tr>
   <th>Feature</th>
   <th>Chrome</th>
   <th>Firefox (Gecko)</th>
   <th>Internet Explorer</th>
   <th>Opera</th>
   <th>Safari</th>
  </tr>
  <tr>
   <td>Basic support</td>
   <td>18</td>
   <td>28</td>
   <td>10</td>
   <td>15.0</td>
   <td>7</td>
  </tr>
 </tbody>
</table>
</div>

<div id="compat-mobile">
<table class="compat-table">
 <tbody>
  <tr>
   <th>Feature</th>
   <th>Android</th>
   <th>Firefox Mobile (Gecko)</th>
   <th>Chrome for Mobile</th>
   <th>Opera Mobile</th>
   <th>Safari Mobile</th>
  </tr>
  <tr>
   <td>Basic support</td>
   <td>4.4</td>
   <td>{{CompatNo}}</td>
   <td>35.0</td>
   <td>21.0</td>
   <td>7</td>
  </tr>
 </tbody>
</table>
</div>
