The Audio Data API extends the HTML5 specification of the <audio> and <video> media elements by exposing audio metadata and raw audio data. This lets users visualize audio data, process it, and create new audio data.
Please note that this document describes a non-standard experimental API. This API is considered deprecated and may not be supported in future releases. The World Wide Web Consortium (W3C) has chartered the Audio Working Group to develop standardized audio API specifications. Please refer to the Audio Working Group website for further details.
Reading audio streams
The loadedmetadata event
When the metadata of the media element is available, the element triggers a loadedmetadata event. Once this event has fired, the following attributes can be read from the media element:
- mozChannels: Number of channels
- mozSampleRate: Sample rate, in samples per second
- mozFrameBufferLength: Number of samples collected across all channels in each frame buffer
This information is needed later to decode the audio data stream. The following example reads these values from an audio element:
<!DOCTYPE html>
<html>
  <head>
    <title>JavaScript Metadata Example</title>
  </head>
  <body>
    <audio id="audio-element" src="song.ogg" controls="true" style="width: 512px;">
    </audio>
    <script>
      // Declared explicitly so they are not implicit globals.
      var channels, rate, frameBufferLength;

      function loadedMetadata() {
        channels = audio.mozChannels;
        rate = audio.mozSampleRate;
        frameBufferLength = audio.mozFrameBufferLength;
      }

      var audio = document.getElementById('audio-element');
      audio.addEventListener('loadedmetadata', loadedMetadata, false);
    </script>
  </body>
</html>
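The three values can be combined to work out how much audio one frame buffer covers. The sketch below is only an illustration: the helper name frameBufferDuration is hypothetical, the input values are made up rather than read from a real file, and it assumes that mozFrameBufferLength counts samples across all channels, as described above.

```javascript
// Hypothetical helper: seconds of audio covered by one frame buffer.
// Assumes frameBufferLength counts samples across all channels.
function frameBufferDuration(channels, sampleRate, frameBufferLength) {
  var samplesPerChannel = frameBufferLength / channels;
  return samplesPerChannel / sampleRate;
}

// Assumed example values: 2 channels, 44100 Hz, 2048 samples per buffer.
var seconds = frameBufferDuration(2, 44100, 2048);
```

In a page using this API, the arguments would come from audio.mozChannels, audio.mozSampleRate, and audio.mozFrameBufferLength inside the loadedmetadata handler.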
The MozAudioAvailable event
As the audio is played, sample data is made available to the audio layer and the audio buffer (size defined in mozFrameBufferLength) gets filled with those samples. Once the buffer is full, the event MozAudioAvailable is triggered. This event therefore contains the raw samples of a period of time. Those samples may or may not have been played yet at the time of the event and have not been adjusted for mute or volume settings on the media element. Playing, pausing, and seeking the audio also affect the streaming of this raw audio data.
The MozAudioAvailable event has 2 attributes:
- frameBuffer: Framebuffer (i.e., an array) containing decoded audio sample data (i.e., floats)
- time: Timestamp for these samples measured from the start in seconds
The framebuffer contains an array of audio samples. It's important to note that the samples are not separated by channels; they are all delivered together. For example, for a two-channel signal: Channel1-Sample1 Channel2-Sample1 Channel1-Sample2 Channel2-Sample2 Channel1-Sample3 Channel2-Sample3.
We can extend the previous example to display the timestamp and the first two samples in a <pre> element:
<!DOCTYPE html>
<html>
  <head>
    <title>JavaScript Visualization Example</title>
  </head>
  <body>
    <audio id="audio-element" src="revolve.ogg" controls="true" style="width: 512px;">
    </audio>
    <pre id="raw">hello</pre>
    <script>
      // Declared explicitly so they are not implicit globals.
      var channels, rate, frameBufferLength;

      function loadedMetadata() {
        channels = audio.mozChannels;
        rate = audio.mozSampleRate;
        frameBufferLength = audio.mozFrameBufferLength;
      }

      function audioAvailable(event) {
        var frameBuffer = event.frameBuffer;
        var t = event.time;

        // Show the timestamp and the first two samples of this buffer.
        var text = "Samples at: " + t + "\n";
        text += frameBuffer[0] + " " + frameBuffer[1];
        raw.innerHTML = text;
      }

      var raw = document.getElementById('raw');
      var audio = document.getElementById('audio-element');
      audio.addEventListener('MozAudioAvailable', audioAvailable, false);
      audio.addEventListener('loadedmetadata', loadedMetadata, false);
    </script>
  </body>
</html>
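Because the samples in the frame buffer are interleaved, per-channel processing usually starts by splitting them apart. The following is a minimal sketch of that step, assuming the Channel1-Sample1 Channel2-Sample1 layout described above; the function name deinterleave is hypothetical and not part of the API.

```javascript
// Hypothetical helper: split an interleaved frame buffer into one
// Float32Array per channel.
function deinterleave(frameBuffer, channels) {
  var samplesPerChannel = Math.floor(frameBuffer.length / channels);
  var result = [];
  for (var c = 0; c < channels; c++) {
    result.push(new Float32Array(samplesPerChannel));
  }
  for (var i = 0; i < samplesPerChannel; i++) {
    for (var ch = 0; ch < channels; ch++) {
      // Sample i of channel ch sits at index i * channels + ch.
      result[ch][i] = frameBuffer[i * channels + ch];
    }
  }
  return result;
}

// Made-up two-channel data: L1 R1 L2 R2 L3 R3.
var interleaved = new Float32Array([0.5, -0.5, 0.25, -0.25, 0.125, -0.125]);
var split = deinterleave(interleaved, 2);
// split[0] holds 0.5, 0.25, 0.125; split[1] holds -0.5, -0.25, -0.125
```

Inside a MozAudioAvailable handler, the call would look like deinterleave(event.frameBuffer, audio.mozChannels).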
Creating an audio stream
It is also possible to create and set up an <audio> element for raw writing from script (i.e., without a src attribute). Content scripts can specify the audio stream's characteristics and then write audio samples. To do so, create an audio object and call the mozSetup() function to specify the number of channels and the sample rate (in Hz). For example:
// Create a new audio element
var audioOutput = new Audio();
// Set up the audio element with a 2-channel, 44.1 kHz audio stream.
audioOutput.mozSetup(2, 44100);
Once this is done, the samples need to be created. They have the same format as the ones in the MozAudioAvailable event. The samples are then written to the audio stream with the mozWriteAudio() function. Note that not all of the samples are necessarily written to the stream. The function returns the number of samples actually written, so the next call can continue from the first unwritten sample. You can see an example below:
// Write samples using a JS Array
var samples = [0.242, 0.127, 0.0, -0.058, -0.242, ...];
var numberSamplesWritten = audioOutput.mozWriteAudio(samples);

// Write samples using a Typed Array
var samples = new Float32Array([0.242, 0.127, 0.0, -0.058, -0.242, ...]);
var numberSamplesWritten = audioOutput.mozWriteAudio(samples);
In the following example, we create an audio pulse:
<!doctype html>
<html>
  <head>
    <title>Generating audio in real time</title>
    <script type="text/javascript">
      function playTone() {
        var output = new Audio();
        // One channel at 44100 samples per second.
        output.mozSetup(1, 44100);

        // 22050 mono samples at 44100 Hz last half a second.
        var samples = new Float32Array(22050);
        for (var i = 0; i < samples.length; i++) {
          samples[i] = Math.sin(i / 20);
        }
        output.mozWriteAudio(samples);
      }
    </script>
  </head>
  <body>
    <p>This demo plays a half-second tone when you click the button below.</p>
    <button onclick="playTone();">Play</button>
  </body>
</html>
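The expression Math.sin(i / 20) in the example fixes the pitch implicitly: at 44100 samples per second it corresponds to roughly 44100 / (2π × 20) ≈ 351 Hz. As an illustration of choosing the pitch explicitly, the sketch below fills a mono buffer for a given frequency and duration. The helper name generateTone and its parameters are hypothetical and not part of the API.

```javascript
// Hypothetical helper: mono sine tone as a Float32Array.
// frequency in Hz, sampleRate in samples per second, seconds of audio.
function generateTone(frequency, sampleRate, seconds) {
  var samples = new Float32Array(Math.round(sampleRate * seconds));
  // Phase advance per sample, in radians.
  var step = 2 * Math.PI * frequency / sampleRate;
  for (var i = 0; i < samples.length; i++) {
    samples[i] = Math.sin(i * step);
  }
  return samples;
}

// One second of a 440 Hz tone at 44100 samples per second.
var tone = generateTone(440, 44100, 1);
```

The resulting array could then be passed to mozWriteAudio() on an element set up with mozSetup(1, 44100), as in the example above.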
The mozCurrentSampleOffset() method gives the audible position of the audio stream, meaning the position of the last heard sample.
// Get current audible position of the underlying audio stream, measured in samples.
var currentSampleOffset = audioOutput.mozCurrentSampleOffset();
Audio data written with the mozWriteAudio() method needs to be written at a regular interval and in equal portions, so that the written data stays a little ahead of the current sample offset (the offset currently being played by the hardware, which can be obtained with mozCurrentSampleOffset()). Here "a little" means something on the order of 500 ms of samples. For example, when working with two channels at 44100 samples per second, a writing interval of 100 ms, and a pre-buffer of 500 ms, one would write an array of (2 * 44100 / 10) = 8820 samples at each interval, keeping the total number of samples written at about (currentSampleOffset + 2 * 44100 / 2) = currentSampleOffset + 44100.
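The arithmetic in this example can be sketched as a small helper. This is an illustration only: the name writePlan and its parameters are hypothetical, and it assumes, as the figures above do, that sample counts include every channel.

```javascript
// Hypothetical helper: portion size per write and the total number of
// samples that should have been written to stay ahead of playback.
function writePlan(channels, sampleRate, intervalMs, prebufferMs, currentSampleOffset) {
  // Samples per second across all channels.
  var samplesPerSecond = channels * sampleRate;
  return {
    portionSize: Math.round(samplesPerSecond * intervalMs / 1000),
    targetTotal: currentSampleOffset + Math.round(samplesPerSecond * prebufferMs / 1000)
  };
}

// Two channels, 44100 Hz, 100 ms interval, 500 ms pre-buffer, playback at offset 0.
var plan = writePlan(2, 44100, 100, 500, 0);
// plan.portionSize is 8820 and plan.targetTotal is 44100
```

In a real page, currentSampleOffset would be the value returned by mozCurrentSampleOffset() at the time of each write.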
It's also possible to auto-detect the minimal duration of the pre-buffer, such that the sound is played without interruptions and the lag between writing and playback is minimal. To do this, start writing the data in small portions and wait for the value returned by mozCurrentSampleOffset() to be greater than 0.
var prebufferSize = sampleRate * 0.020; // Initial buffer is 20 ms
var autoLatency = true, started = new Date().valueOf();
...
// Auto latency detection
if (autoLatency) {
  // Grow the pre-buffer to cover the time elapsed since writing started.
  prebufferSize = Math.floor(sampleRate * (new Date().valueOf() - started) / 1000);
  if (audio.mozCurrentSampleOffset()) { // Play position moved?
    autoLatency = false;
  }
}
Processing an audio stream
Since the MozAudioAvailable event and the mozWriteAudio() method both use Float32Array values, it is possible to take the output of one audio stream and pass it directly (or process it first and then pass it) to a second one. The first audio stream needs to be muted so that only the second audio element is heard.
<audio id="a1" src="song.ogg" controls>
</audio>
<script>
  var a1 = document.getElementById('a1'),
      a2 = new Audio(),
      buffers = [];

  function loadedMetadata() {
    // Mute a1 audio.
    a1.volume = 0;
    // Set up a2 to be identical to a1, and play through there.
    a2.mozSetup(a1.mozChannels, a1.mozSampleRate);
  }

  function audioAvailable(event) {
    // Write the current framebuffer
    var frameBuffer = event.frameBuffer;
    writeAudio(frameBuffer);
  }

  a1.addEventListener('MozAudioAvailable', audioAvailable, false);
  a1.addEventListener('loadedmetadata', loadedMetadata, false);

  function writeAudio(audio) {
    buffers.push(audio);

    // If there's buffered data, write that
    while (buffers.length > 0) {
      var buffer = buffers.shift();
      var written = a2.mozWriteAudio(buffer);
      // If all data wasn't written, keep the unwritten tail in the buffers.
      // subarray() is used here because it is the typed-array method for
      // taking a view of the remaining samples.
      if (written < buffer.length) {
        buffers.unshift(buffer.subarray(written));
        return;
      }
    }
  }
</script>
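The example above passes the frame buffer through unchanged. As an illustration of the "process first and then pass" case, the sketch below scales every sample by a gain factor before it is written. The function name applyGain is hypothetical, and clamping the result to the range -1 to 1 is an assumption about the expected sample range rather than something the API description states.

```javascript
// Hypothetical processing step: scale samples by a gain factor.
// Clamping to [-1, 1] is an assumption about the valid sample range.
function applyGain(frameBuffer, gain) {
  var out = new Float32Array(frameBuffer.length);
  for (var i = 0; i < frameBuffer.length; i++) {
    var scaled = frameBuffer[i] * gain;
    out[i] = Math.max(-1, Math.min(1, scaled));
  }
  return out;
}

// Made-up input samples, halved in amplitude.
var processed = applyGain(new Float32Array([0.5, -0.25, 1.0]), 0.5);
// processed holds 0.25, -0.125, 0.5
```

In the handler above, this would amount to calling writeAudio(applyGain(event.frameBuffer, 0.5)) instead of writeAudio(frameBuffer). Because the gain is applied to every sample regardless of channel, the interleaved layout does not need to be unpacked first.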