The Web Audio API provides a wide range of features that let developers choose audio sources, add effects to them, create visualizations, apply spatial effects (such as panning), and much more.
Web Audio concepts and usage
The Web Audio API performs audio operations inside an audio context and has been designed to allow modular routing. Basic audio operations are performed with audio nodes, which are linked together to form an audio routing graph. Several sources with different channel layouts can be used within a single audio context. This modular design makes it possible to create complex audio output with dynamic effects.
Audio nodes are linked via their inputs and outputs into simple chains or webs. They typically start with one or more sources. Sources provide arrays of audio amplitude values (samples), each describing a very small slice of time (often less than one ten-thousandth of a second). The samples are either computed (see OscillatorNode), taken from audio or video files (see AudioBufferSourceNode and MediaElementAudioSourceNode), or taken from audio streams (see MediaStreamAudioSourceNode). Audio files are themselves nothing more than sequences of samples, produced by electronic instruments and microphones and mixed down into a single, complex recording. The outputs of audio nodes can be routed to the inputs of other audio nodes, which mix or modify the samples and provide the result at their own outputs. A typical modification is changing the volume (see GainNode). Once the signal has been processed as desired, it can be routed to the input of a destination (AudioContext.destination), which sends it to the speakers or headphones. This last connection is only required if the user is supposed to hear the signal.
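As a rough illustration of what "arrays of samples" means, the following sketch fills a Float32Array with one second of a sine tone. The helper name makeSineSamples and the chosen values (440 Hz at a 44100 Hz sample rate) are assumptions made for this example and are not part of the Web Audio API.

```javascript
// Hypothetical helper: returns `durationSeconds` worth of sine-wave samples.
// Each sample is an amplitude between -1 and 1.
function makeSineSamples(frequency, sampleRate, durationSeconds) {
  const length = Math.round(sampleRate * durationSeconds);
  const samples = new Float32Array(length);
  for (let i = 0; i < length; i++) {
    const t = i / sampleRate; // time of this sample in seconds
    samples[i] = Math.sin(2 * Math.PI * frequency * t);
  }
  return samples;
}

const sineSamples = makeSineSamples(440, 44100, 1);
console.log(sineSamples.length); // 44100
```

In a browser, an array like this could be copied into an AudioBuffer created with AudioContext.createBuffer(), for example via buffer.getChannelData(0).set(sineSamples), and then played through an AudioBufferSourceNode.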
A typical workflow for Web Audio looks like this:
- Create an audio context
- Provide audio sources inside the context (for example <audio>, an oscillator, or a stream)
- Create effect nodes (for example reverb, filter, or panner)
- Choose the final destination of the audio (for example the system speakers)
- Connect the sources to the effects, and the effects to the destination
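The workflow above can be sketched roughly as follows. The helper name buildSimpleGraph and the chosen values (440 Hz, gain 0.5) are assumptions for illustration; the browser-only part is guarded so that the snippet does nothing where no AudioContext exists.

```javascript
// Hypothetical helper: builds source -> gain -> destination in the given context.
function buildSimpleGraph(audioCtx) {
  const oscillator = audioCtx.createOscillator(); // audio source
  const gainNode = audioCtx.createGain();         // effect: volume control
  oscillator.frequency.value = 440;               // illustrative pitch
  gainNode.gain.value = 0.5;                      // illustrative volume
  oscillator.connect(gainNode);                   // source -> effect
  gainNode.connect(audioCtx.destination);         // effect -> output device
  return { oscillator, gainNode };
}

// Only runs where the Web Audio API is available (a browser).
if (typeof window !== "undefined" && (window.AudioContext || window.webkitAudioContext)) {
  const audioCtx = new (window.AudioContext || window.webkitAudioContext)();
  const graph = buildSimpleGraph(audioCtx);
  graph.oscillator.start();
  graph.oscillator.stop(audioCtx.currentTime + 1); // play for one second
}
```

Browsers may keep a new context suspended until the user has interacted with the page, so in practice a call like this usually sits inside a click handler.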
Processing happens with very precise timing and low latency, which lets developers write code that responds quickly to events and targets individual samples even at high sample rates. This makes the Web Audio API suitable for applications that need high accuracy, such as grooveboxes.
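To give an idea of how precise scheduling might look, the sketch below computes beat times and starts one short beep per beat on the context's own clock (AudioContext.currentTime). The helper names beatTimes and scheduleBeeps, the 0.1 second lead time, and the 0.05 second beep length are assumptions for this example.

```javascript
// Hypothetical helper: start times (in seconds) for `count` beats at `bpm`.
function beatTimes(startTime, bpm, count) {
  const secondsPerBeat = 60 / bpm;
  const times = [];
  for (let i = 0; i < count; i++) {
    times.push(startTime + i * secondsPerBeat);
  }
  return times;
}

// Hypothetical helper: schedules one short beep per beat in the given context.
function scheduleBeeps(audioCtx, bpm, count) {
  const times = beatTimes(audioCtx.currentTime + 0.1, bpm, count);
  for (const t of times) {
    const osc = audioCtx.createOscillator();
    osc.connect(audioCtx.destination);
    osc.start(t);        // start exactly at the scheduled time
    osc.stop(t + 0.05);  // stop 50 ms later
  }
  return times;
}

console.log(beatTimes(0, 120, 4)); // [0, 0.5, 1, 1.5]
```

Because the start and stop times are handed to the audio engine in advance, playback does not depend on when the JavaScript callback happens to run.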
The Web Audio API also lets us control how audio is spatialized. Using a source-listener model, we can control the panning model and thereby simulate, for example, distance-induced attenuation or the Doppler effect caused by a moving source (such as a police car).
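As a small worked example of distance-induced attenuation, the sketch below evaluates the "inverse" distance model as it is described in the Web Audio specification for PannerNode. The function name inverseDistanceGain is an assumption; refDistance and rolloffFactor mirror the PannerNode properties of the same names, with 1 as the default for both.

```javascript
// Gain applied to a source at `distance` under the inverse distance model.
// Distances below refDistance are clamped, so the gain never exceeds 1.
function inverseDistanceGain(distance, refDistance = 1, rolloffFactor = 1) {
  const d = Math.max(distance, refDistance);
  return refDistance / (refDistance + rolloffFactor * (d - refDistance));
}

console.log(inverseDistanceGain(1));  // 1
console.log(inverseDistanceGain(2));  // 0.5
console.log(inverseDistanceGain(10)); // 0.1
```

In an actual graph the browser performs this calculation itself; the sketch only shows how the volume falls off as the source moves away from the listener.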
Note: You can read about the theory of the Web Audio API in much more detail in the article Basic concepts behind Web Audio API.
Web Audio API interfaces
The Web Audio API has 28 interfaces and events, which are divided into nine categories.
General audio graph definition
General containers and definitions that shape audio graphs in Web Audio API usage.
AudioContext
- The AudioContext interface represents the graph formed by linked AudioNodes. An audio context controls the creation of the nodes and the execution of audio processing and decoding. The audio context is the prerequisite for everything else and must therefore be created first.
AudioNode
- The AudioNode interface represents an audio-processing module such as an audio source (for example an HTML <audio> or <video> element), an audio destination, or an intermediate processing module (for example a filter like BiquadFilterNode, or a volume control like GainNode).
AudioParam
- The AudioParam interface represents an audio-related parameter, like one of an AudioNode. It can be set to a specific value or a change in value, and can be scheduled to happen at a specific time and following a specific pattern.
ended (event)
- The ended event is fired when playback has stopped because the end of the media was reached.
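To illustrate how an AudioParam can be scheduled rather than set directly, the following sketch fades a parameter in from silence. The helper name fadeIn and the two-second duration are assumptions; setValueAtTime() and linearRampToValueAtTime() are AudioParam methods.

```javascript
// Hypothetical helper: ramps an AudioParam from 0 to `target`
// over `duration` seconds, beginning at `startTime`.
function fadeIn(param, startTime, duration, target = 1) {
  param.setValueAtTime(0, startTime);                          // anchor the ramp
  param.linearRampToValueAtTime(target, startTime + duration); // end of the ramp
}

// Only runs where the Web Audio API is available (a browser).
if (typeof window !== "undefined" && window.AudioContext) {
  const fadeCtx = new window.AudioContext();
  const fadeGain = fadeCtx.createGain();
  fadeGain.connect(fadeCtx.destination);
  fadeIn(fadeGain.gain, fadeCtx.currentTime, 2);
}
```

Any source connected to the gain node would then become audible gradually over the two seconds.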
Defining audio sources
Interfaces that define audio sources for use in the Web Audio API.
OscillatorNode
- The OscillatorNode interface represents a periodic waveform, such as a sine wave. It is an AudioNode audio-processing module that causes a given frequency of wave to be created.
AudioBuffer
- The AudioBuffer interface represents a short audio asset residing in memory, created from an audio file using the AudioContext.decodeAudioData() method, or created with raw data using AudioContext.createBuffer(). Once decoded into this form, the audio can then be put into an AudioBufferSourceNode.
AudioBufferSourceNode
- The AudioBufferSourceNode interface represents an audio source consisting of in-memory audio data, stored in an AudioBuffer. It is an AudioNode that acts as an audio source.
MediaElementAudioSourceNode
- The MediaElementAudioSourceNode interface represents an audio source consisting of an HTML5 <audio> or <video> element. It is an AudioNode that acts as an audio source.
MediaStreamAudioSourceNode
- The MediaStreamAudioSourceNode interface represents an audio source consisting of a WebRTC MediaStream (such as a webcam or microphone). It is an AudioNode that acts as an audio source.
Defining audio effects filters
Interfaces for defining effects that you want to apply to your audio sources.
BiquadFilterNode
- The BiquadFilterNode interface represents a simple low-order filter. It is an AudioNode that can represent different kinds of filters, tone control devices or graphic equalizers. A BiquadFilterNode always has exactly one input and one output.
ConvolverNode
- The ConvolverNode interface is an AudioNode that performs a linear convolution on a given AudioBuffer, often used to achieve a reverb effect.
DelayNode
- The DelayNode interface represents a delay line; an AudioNode audio-processing module that causes a delay between the arrival of input data and its propagation to the output.
DynamicsCompressorNode
- The DynamicsCompressorNode interface provides a compression effect, which lowers the volume of the loudest parts of the signal in order to help prevent clipping and distortion that can occur when multiple sounds are played and multiplexed together at once.
GainNode
- The GainNode interface represents a change in volume. It is an AudioNode audio-processing module that causes a given gain to be applied to the input data before its propagation to the output.
StereoPannerNode
- The StereoPannerNode interface represents a simple stereo panner node that can be used to pan an audio stream left or right.
WaveShaperNode
- The WaveShaperNode interface represents a non-linear distorter. It is an AudioNode that uses a curve to apply a waveshaping distortion to the signal. Besides obvious distortion effects, it is often used to add a warm feeling to the signal.
PeriodicWave
- Used to define a periodic waveform that can be used to shape the output of an OscillatorNode.
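As one possible way to combine the effect nodes listed above, the sketch below builds a simple echo from a DelayNode and a GainNode wired into a feedback loop. The helper name createEcho and its return shape are assumptions for illustration; the feedback amount should stay below 1 so that the echoes die away.

```javascript
// Hypothetical helper: input -> delay -> (feedback gain -> delay ...) loop.
// `delaySeconds` must not exceed the DelayNode's maximum delay time
// (1 second when createDelay() is called without an argument).
function createEcho(audioCtx, delaySeconds, feedbackAmount) {
  const input = audioCtx.createGain();
  const delay = audioCtx.createDelay();
  const feedback = audioCtx.createGain();
  delay.delayTime.value = delaySeconds;
  feedback.gain.value = feedbackAmount; // below 1, so each repeat is quieter
  input.connect(delay);
  delay.connect(feedback);
  feedback.connect(delay); // closes the feedback loop
  return { input: input, output: delay };
}
```

A caller would connect a source to echo.input, and connect both echo.input (the dry signal) and echo.output (the delayed signal) to the destination.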
Defining audio destinations
Once you are done processing your audio, these interfaces define where to output it.
AudioDestinationNode
- The AudioDestinationNode interface represents the end destination of an audio source in a given context, usually the speakers of your device.
MediaStreamAudioDestinationNode
- The MediaStreamAudioDestinationNode interface represents an audio destination consisting of a WebRTC MediaStream with a single AudioMediaStreamTrack, which can be used in a similar way to a MediaStream obtained from Navigator.getUserMedia. It is an AudioNode that acts as an audio destination.
Data analysis and visualisation
If you want to extract time, frequency and other data from your audio, the AnalyserNode is what you need.
AnalyserNode
- The AnalyserNode interface represents a node able to provide real-time frequency and time-domain analysis information, for the purposes of data analysis and visualization.
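When working with frequency data from an AnalyserNode, each array index corresponds to a frequency band whose width is the sample rate divided by the FFT size. The sketch below shows that mapping; the helper names binToFrequency and loudestBin, and the sample data, are assumptions for this example.

```javascript
// Lower edge frequency (in Hz) of the analyser bin at `binIndex`.
function binToFrequency(binIndex, sampleRate, fftSize) {
  return binIndex * sampleRate / fftSize;
}

// Index of the largest value in an array of frequency data.
function loudestBin(dataArray) {
  let best = 0;
  for (let i = 1; i < dataArray.length; i++) {
    if (dataArray[i] > dataArray[best]) best = i;
  }
  return best;
}

// Stand-in for data filled by analyser.getByteFrequencyData(dataArray).
const sampleData = new Uint8Array([0, 10, 200, 30]);
console.log(binToFrequency(loudestBin(sampleData), 44100, 2048)); // 43.06640625
```

In a real graph the sample rate would come from audioCtx.sampleRate and the FFT size from analyser.fftSize.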
Splitting and merging audio channels
To split and merge audio channels, you'll use these interfaces.
ChannelSplitterNode
- The ChannelSplitterNode interface separates the different channels of an audio source out into a set of mono outputs.
ChannelMergerNode
- The ChannelMergerNode interface reunites different mono inputs into a single output. Each input will be used to fill a channel of the output.
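As a sketch of how the two interfaces can work together, the following helper swaps the left and right channels of a stereo source. The name swapStereoChannels is an assumption; the second and third arguments of connect() select the output of the splitter and the input of the merger.

```javascript
// Hypothetical helper: returns a node carrying `source` with left/right swapped.
function swapStereoChannels(audioCtx, source) {
  const splitter = audioCtx.createChannelSplitter(2);
  const merger = audioCtx.createChannelMerger(2);
  source.connect(splitter);
  splitter.connect(merger, 0, 1); // left output  -> right input
  splitter.connect(merger, 1, 0); // right output -> left input
  return merger;
}
```

The returned merger can then be connected to further nodes or to the destination like any other AudioNode.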
Audio spatialization
These interfaces allow you to add audio spatialization panning effects to your audio sources.
AudioListener
- The AudioListener interface represents the position and orientation of the unique person listening to the audio scene used in audio spatialization.
PannerNode
- The PannerNode interface represents the behavior of a signal in space. It is an AudioNode audio-processing module describing its position with right-hand Cartesian coordinates, its movement using a velocity vector and its directionality using a directionality cone.
Audio processing via JavaScript
If you want to use an external script to process your audio source, the below Node and events make it possible.
Note: As of the August 29 2014 Web Audio API spec publication, these features have been marked as deprecated, and are soon to be replaced by Audio Workers.
ScriptProcessorNode
- The ScriptProcessorNode interface allows the generation, processing, or analyzing of audio using JavaScript. It is an AudioNode audio-processing module that is linked to two buffers, one containing the current input, one containing the output. An event, implementing the AudioProcessingEvent interface, is sent to the object each time the input buffer contains new data, and the event handler terminates when it has filled the output buffer with data.
audioprocess (event)
- The audioprocess event is fired when an input buffer of a Web Audio API ScriptProcessorNode is ready to be processed.
AudioProcessingEvent
- The Web Audio API AudioProcessingEvent represents events that occur when a ScriptProcessorNode input buffer is ready to be processed.
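The two-buffer model described above can be sketched as follows: the handler reads each channel of the input buffer and writes a processed copy into the output buffer. The helper names applyGainToBlock and createHalfVolumeProcessor, the buffer size of 4096, and the gain of 0.5 are assumptions for this example.

```javascript
// Copies every channel from inputBuffer to outputBuffer, scaled by `gain`.
function applyGainToBlock(inputBuffer, outputBuffer, gain) {
  for (let ch = 0; ch < outputBuffer.numberOfChannels; ch++) {
    const input = inputBuffer.getChannelData(ch);
    const output = outputBuffer.getChannelData(ch);
    for (let i = 0; i < input.length; i++) {
      output[i] = input[i] * gain;
    }
  }
}

// Hypothetical helper: a mono ScriptProcessorNode that halves the volume.
function createHalfVolumeProcessor(audioCtx) {
  const node = audioCtx.createScriptProcessor(4096, 1, 1);
  node.onaudioprocess = function (event) {
    applyGainToBlock(event.inputBuffer, event.outputBuffer, 0.5);
  };
  return node;
}
```

The returned node is connected between a source and a destination like any other AudioNode; the handler then runs once per block of 4096 sample frames.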
Offline/background audio processing
It is possible to process/render an audio graph very quickly in the background (rendering it to an AudioBuffer rather than to the device's speakers) with the following.
OfflineAudioContext
- The OfflineAudioContext interface is an AudioContext interface representing an audio-processing graph built from linked together AudioNodes. In contrast with a standard AudioContext, an OfflineAudioContext doesn't really render the audio but rather generates it, as fast as it can, in a buffer.
complete (event)
- The complete event is fired when the rendering of an OfflineAudioContext is terminated.
OfflineAudioCompletionEvent
- The OfflineAudioCompletionEvent represents events that occur when the processing of an OfflineAudioContext is terminated. The complete event implements this interface.
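A minimal sketch of offline rendering, based on the complete event described above, might look like this. The helper name renderOffline and the chosen context parameters (2 channels, 2 seconds at 44100 Hz) are assumptions; the rendered audio is read from the event's renderedBuffer property.

```javascript
// Hypothetical helper: starts rendering and resolves with the rendered AudioBuffer.
function renderOffline(offlineCtx) {
  return new Promise(function (resolve) {
    offlineCtx.oncomplete = function (event) {
      resolve(event.renderedBuffer); // the finished audio, held in memory
    };
    offlineCtx.startRendering();
  });
}

// Only runs where the Web Audio API is available (a browser).
if (typeof window !== "undefined" && window.OfflineAudioContext) {
  const offlineCtx = new window.OfflineAudioContext(2, 44100 * 2, 44100);
  const tone = offlineCtx.createOscillator();
  tone.connect(offlineCtx.destination);
  tone.start();
  renderOffline(offlineCtx).then(function (buffer) {
    console.log("Rendered " + buffer.duration + " seconds of audio");
  });
}
```

The resulting buffer can be played back later through an AudioBufferSourceNode in a regular AudioContext.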
Audio Workers
Audio workers provide the ability for direct scripted audio processing to be done inside a web worker context, and are defined by a couple of interfaces (new as of 29th August 2014). These are not implemented in any browsers yet. When implemented, they will replace ScriptProcessorNode, and the other features discussed in the Audio processing via JavaScript section above.
AudioWorkerNode
- The AudioWorkerNode interface represents an AudioNode that interacts with a worker thread to generate, process, or analyse audio directly.
AudioWorkerGlobalScope
- The AudioWorkerGlobalScope interface is a DedicatedWorkerGlobalScope-derived object representing a worker context in which an audio processing script is run; it is designed to enable the generation, processing, and analysis of audio data directly using JavaScript in a worker thread.
AudioProcessEvent
- This is an Event object that is dispatched to AudioWorkerGlobalScope objects to perform processing.
Obsolete interfaces
The following interfaces were defined in old versions of the Web Audio API spec, but are now obsolete and have been replaced by other interfaces.
JavaScriptNode
- Used for direct audio processing via JavaScript. This interface is obsolete, and has been replaced by ScriptProcessorNode.
WaveTableNode
- Used to define a periodic waveform. This interface is obsolete, and has been replaced by PeriodicWave.
Example
This example shows a wide variety of Web Audio API functions being used. You can see this code in action on the Voice-change-O-matic demo (also check out the full source code on GitHub). It is an experimental voice changer toy demo; keep your speakers turned down low when you use it, at least to start!
To find out more about what the different methods and properties do, search the reference pages.
var audioCtx = new (window.AudioContext || window.webkitAudioContext)(); // define audio context
// Webkit/blink browsers need prefix, Safari won't work without window.

var voiceSelect = document.getElementById("voice"); // select box for selecting voice effect options
var visualSelect = document.getElementById("visual"); // select box for selecting audio visualization options
var mute = document.querySelector('.mute'); // mute button
var drawVisual; // requestAnimationFrame

// Assumption: the page contains a <canvas> element for the visualization;
// the excerpt uses canvas and canvasCtx without showing where they are defined.
var canvas = document.querySelector('canvas');
var canvasCtx = canvas.getContext('2d');
var WIDTH, HEIGHT;
var source; // set once microphone access has been granted

var analyser = audioCtx.createAnalyser();
var distortion = audioCtx.createWaveShaper();
var gainNode = audioCtx.createGain();
var biquadFilter = audioCtx.createBiquadFilter();

function makeDistortionCurve(amount) { // function to make curve shape for distortion/wave shaper node to use
  var k = typeof amount === 'number' ? amount : 50,
    n_samples = 44100,
    curve = new Float32Array(n_samples),
    deg = Math.PI / 180,
    i = 0,
    x;
  for ( ; i < n_samples; ++i ) {
    x = i * 2 / n_samples - 1;
    curve[i] = ( 3 + k ) * x * 20 * deg / ( Math.PI + k * Math.abs(x) );
  }
  return curve;
}

navigator.getUserMedia (
  // constraints - only audio needed for this app
  {
    audio: true
  },

  // Success callback
  function(stream) {
    source = audioCtx.createMediaStreamSource(stream);
    source.connect(analyser);
    analyser.connect(distortion);
    distortion.connect(biquadFilter);
    biquadFilter.connect(gainNode);
    gainNode.connect(audioCtx.destination); // connecting the different audio graph nodes together

    visualize();
    voiceChange();
  },

  // Error callback
  function(err) {
    console.log('The following gUM error occurred: ' + err);
  }
);

// The stream itself is not needed here: the analyser node already receives the audio.
function visualize() {
  WIDTH = canvas.width;
  HEIGHT = canvas.height;

  var visualSetting = visualSelect.value;
  console.log(visualSetting);

  if(visualSetting == "sinewave") {
    analyser.fftSize = 2048;
    var bufferLength = analyser.frequencyBinCount; // half the FFT value
    var dataArray = new Uint8Array(bufferLength); // create an array to store the data

    canvasCtx.clearRect(0, 0, WIDTH, HEIGHT);

    function draw() {
      drawVisual = requestAnimationFrame(draw);

      analyser.getByteTimeDomainData(dataArray); // get waveform data and put it into the array created above

      canvasCtx.fillStyle = 'rgb(200, 200, 200)'; // draw wave with canvas
      canvasCtx.fillRect(0, 0, WIDTH, HEIGHT);

      canvasCtx.lineWidth = 2;
      canvasCtx.strokeStyle = 'rgb(0, 0, 0)';

      canvasCtx.beginPath();

      var sliceWidth = WIDTH * 1.0 / bufferLength;
      var x = 0;

      for(var i = 0; i < bufferLength; i++) {
        var v = dataArray[i] / 128.0;
        var y = v * HEIGHT/2;

        if(i === 0) {
          canvasCtx.moveTo(x, y);
        } else {
          canvasCtx.lineTo(x, y);
        }

        x += sliceWidth;
      }

      canvasCtx.lineTo(canvas.width, canvas.height/2);
      canvasCtx.stroke();
    }

    draw();

  } else if(visualSetting == "off") {
    canvasCtx.clearRect(0, 0, WIDTH, HEIGHT);
    canvasCtx.fillStyle = "red";
    canvasCtx.fillRect(0, 0, WIDTH, HEIGHT);
  }
}

function voiceChange() {
  distortion.curve = null; // null makes the wave shaper pass the signal through unchanged
  biquadFilter.gain.value = 0; // reset the effects each time the voiceChange function is run

  var voiceSetting = voiceSelect.value;
  console.log(voiceSetting);

  if(voiceSetting == "distortion") {
    distortion.curve = makeDistortionCurve(400); // apply distortion to sound using waveshaper node
  } else if(voiceSetting == "biquad") {
    biquadFilter.type = "lowshelf";
    biquadFilter.frequency.value = 1000;
    biquadFilter.gain.value = 25; // apply lowshelf filter to sounds using biquad
  } else if(voiceSetting == "off") {
    console.log("Voice settings turned off"); // do nothing, as off option was chosen
  }
}

// event listeners to change visualize and voice settings
visualSelect.onchange = function() {
  window.cancelAnimationFrame(drawVisual);
  visualize();
};

voiceSelect.onchange = function() {
  voiceChange();
};

mute.onclick = voiceMute;

function voiceMute() { // toggle to mute and unmute sound
  if(mute.id == "") {
    gainNode.gain.value = 0; // gain set to 0 to mute sound
    mute.id = "activated";
    mute.innerHTML = "Unmute";
  } else {
    gainNode.gain.value = 1; // gain set to 1 to unmute sound
    mute.id = "";
    mute.innerHTML = "Mute";
  }
}
Specifications
Specification | Status | Comment |
---|---|---|
Web Audio API | Working Draft | |
Browser compatibility
Feature | Chrome | Edge | Firefox (Gecko) | Internet Explorer | Opera | Safari (WebKit) |
---|---|---|---|---|---|---|
Basic support | 14 webkit | (Yes) | 23 | Not supported | 15 webkit, 22 (unprefixed) | 6 webkit |
Feature | Android | Chrome | Firefox Mobile (Gecko) | Firefox OS | IE Phone | Opera Mobile | Safari Mobile |
---|---|---|---|---|---|---|---|
Basic support | Not supported | 28 webkit | 25 | 1.2 | Not supported | Not supported | 6 webkit |
See also
- Using the Web Audio API
- Visualizations with Web Audio API
- Voice-change-O-matic example
- Violent Theremin example
- Web audio spatialisation basics
- Mixing Positional Audio and WebGL
- Developing Game Audio with the Web Audio API
- Porting webkitAudioContext code to standards based AudioContext
- Tones: a simple library for playing specific tones/notes using the Web Audio API.
- howler.js: a JS audio library that defaults to Web Audio API and falls back to HTML5 Audio, as well as providing other useful features.