So, ladies and gentlemen, please give a warm round of applause to Jaume Sanchez. Thank you. Hello, hello everyone. First, welcome. I want to thank the organizers of Future.js for getting this going, which is awesome. I'm going to be talking about Web Audio, the Web Audio API. This is an example of a Three.js interactive rendering using Web Audio. I'm going to explain how you can use all the nodes and all the functions the Web Audio API provides to create something like this, which has several spatialization components and sequences of sound effects that you can trigger and schedule, and it covers most of the bases of the Web Audio API. I'm going to be using slides.com. It's the first time I've used it, so I have no idea if this is actually going to go all the way through. But, yeah, first slide. Okay.

So, audio in the browser. For many years we only had a small number of solutions for playing back audio on our web pages, which basically depended on external plugins like Flash or QuickTime, or at some point on the codecs of the system, which was a really dark time for multimedia on the web. Then, a few years ago, with the arrival of HTML5, we got the audio element, which is a pretty nice solution to embed audio natively in our web pages, the same way as images or text. We also got video along with audio, but it was basically designed as a playback experience: we could change the volume, we could stop it, but there wasn't much more we could do. For many of the experiences we've been building over the years, projects like SoundManager and SoundManager 2 allowed us to use a hybrid approach, with HTML5 audio where available and Flash fallbacks, so it was easy to support Chrome, Firefox and IE and still have sound. But everyone was demanding something better, something more professional, something with more power, which is the Web Audio API.

For a while there was also the Mozilla Audio Data API. It proposed a different approach: they wanted everything to be done in JavaScript, so your frequency analysis or your filters would be JavaScript code, while the Web Audio API provides a native implementation of all the basic functions you need when you deal with sound processing. So that's basically what Web Audio provides: a way for developers to manipulate and play audio assets on web pages or in applications, and it goes beyond that. Some people have said that Web Audio is to the HTML5 audio element what Canvas is to the image element. I wish that were the case, because Canvas is very limited in that respect. It's more as if Canvas could also do color filtering and Photoshop-style blending and all that; the Web Audio API provides way more features than Canvas.

So how do you use it? You basically have to instantiate an AudioContext. That's your main object. You usually create just one context per page, because you want everything to go through the same mixer: everything has to end up in the user's output system, usually the speakers or the headphones. And once you've got that instantiated, you can start creating your routing graph, adding nodes to the system. The Web Audio API is based on nodes; it's one of those APIs that works with nodes. All the basic audio operations, like playing a sound, changing the volume or adding a filter, are performed by nodes. The nodes get connected together in a graph, and that graph defines the behavior of your sound system.
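To make that concrete, here is a minimal sketch of creating a context and a tiny graph. The names are illustrative, and the vendor-prefix handling is covered in more detail later in the talk.

```js
// A minimal sketch: one AudioContext per page, nodes wired into a graph.
// (Vendor prefixes for older browsers are discussed later.)
var AudioContextClass = window.AudioContext || window.webkitAudioContext;
var context = new AudioContextClass();

// A source (an oscillator here, just so something makes noise),
// a gain node to control the volume, and the destination (the speakers).
var oscillator = context.createOscillator();
var gain = context.createGain();

oscillator.connect(gain);           // source -> gain
gain.connect(context.destination);  // gain   -> speakers
gain.gain.value = 0.5;

oscillator.start(0);
```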
Nodes are connected through their inputs and outputs, and that allows a lot of flexibility when you try to create basic playback systems, sound post-processing or dynamic behavior for sound. These are some examples of routing graphs. The one at the top is a very basic one: you can see there's a source, not any specific source, it can be anything that feeds sound into the system. It's connected to a biquad filter, which is a second-order filter that does something to the sound, and that goes to the destination, which is usually the speakers. The one at the bottom is a much more complex system: there are three sources and several distortions, and it creates two paths. One is the dry path, which is unprocessed, and the other is the wet path, which has a convolution applied, and then everything is mixed back together into the destination. So basically you have to get your system clear in your head, and that's the graph you're going to implement. It's really easy; don't get intimidated by these weird graphs.

So, sources. Sources are our main way of putting data into the system. Something has to make noise; something has to provide us with audio data. I'm going to talk about the four main ways you can put sound into the Web Audio API, which are createBufferSource, createMediaElementSource, createMediaStreamSource and createOscillator. The way you use them is: once you've got your context, you create your source. In this case it's a buffer source, an AudioBufferSourceNode. I have to apologize, because this is going to get tongue-twisting at some points: it's going to be audio buffer node, node, audio node... but there aren't that many words in the Web Audio API, so the same words repeat most of the time. Then you assign a buffer, which is your data, to the audio buffer source node, and you start it. This is actually going to play, but you're not going to hear anything, because there's one important step missing, which is connecting the destination.

The destination is what you plug your node into. So in the previous case we had the same thing: create the context, create the buffer source, assign the buffer, connect it to the context destination, which is the end of our Web Audio chain and, again, usually the headphones. You start the sound, and then you'd hear it in your headphones. But the combinations are endless. Once you're able to create nodes, you can start connecting them. Here's an example in which an oscillator is connected to a gain node, the gain node is connected to a convolver node, the convolver node is connected to a notch filter, and the notch filter finally goes to the destination. And this is dynamic: you can change it at any time, you can create dynamic effects, whatever you want.

Speaking of the buffer source specifically, which is probably the one you're going to use the most, because it lets you play back a sound file: once you call createBufferSource on the context, the AudioBufferSourceNode has a few attributes and methods. The most important one is buffer, which holds IEEE 32-bit floats ranging from -1 to +1. This is your raw data, which you would usually load with an XMLHttpRequest, more about that later. You assign it, and then you can change the playback rate and tell the source whether it loops and where the loop should start and end.
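As a rough sketch of how those attributes fit together, assuming you already have a decoded AudioBuffer in a variable called buffer (loading is covered next):

```js
// Sketch of the AudioBufferSourceNode attributes just described,
// assuming `buffer` already holds a decoded AudioBuffer.
var source = context.createBufferSource();
source.buffer = buffer;              // the raw sample data
source.playbackRate.value = 1.5;     // an AudioParam, hence .value
source.loop = true;                  // keep looping the sample
source.loopStart = 0;                // loop region, in seconds
source.loopEnd = buffer.duration;

source.connect(context.destination); // without this you hear nothing
source.start(0);
```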
You can also start it, stop it, and know when it has ended. Okay, so this is the most common code you see when you look at a Web Audio tutorial. Create a context, create an XMLHttpRequest, because loading is asynchronous. You have to specify the response type you want for the request: it's going to be 'arraybuffer', so it gets loaded as binary data. Once you've got the response, in onload, you use the context method called decodeAudioData, which returns a buffer that's ready for the AudioBufferSourceNode to play. Then you can connect it, assign the buffer to your buffer source node, and play it.

I'm going to show you a demo. This is basically the same thing; it's a bit different from what the tutorials tell you. Sorry, I'm going to make it a bit bigger, that's probably better. The first thing you'll see is that, since the audio context has been an experimental technology, and as things go in web development, it's been behind vendor prefixes, so you have to check that you've got the right AudioContext. In stable Chrome right now it's still webkitAudioContext; in Canary it's already unprefixed, so it's AudioContext. Mozilla has dropped the prefix, and Safari, I think, still has webkitAudioContext. Then there's this loadSound method that requests a sound and plays it, like we said before. The only thing I've changed is that when I create the source, I change the playback rate and assign a random value, so you can hear... it's not very exciting, but it's a start. So, yeah. Note that it's playbackRate.value. That's weird, why is that? Usually you'd just assign a value directly. It's because it's an AudioParam, and we'll talk about that later.

They usually advise you to load these sounds using XMLHttpRequest. There's one problem with this approach. It's very convenient, it's asynchronous, so it doesn't block your main thread, but you have to download everything: onload only fires when the whole file has been downloaded. So if you've got a 30-megabyte MP3, you have to wait for the whole buffer to be downloaded before you can play it. You don't get the convenience of the HTML5 audio element, where you can preload or buffer progressively; that doesn't work with this way of loading. But for short sound effects it works fine, so it's probably a good way to start.

Oh, sorry, there's another thing. When you get the response out of the onload handler and pass it to decodeAudioData, which gives us a buffer: you cannot replay an AudioBufferSourceNode. Once an AudioBufferSourceNode has been played, it's done. You cannot play it again; it will throw an exception. So you have to keep the buffer around, recreate your source, and then play it again. A more convenient way of managing sounds is to have a function that does the same thing but stores the decoded result in an AudioBuffer, and then a playSound function that creates the buffer source and plays it. It's exactly the same; it's going to sound the same as before. But this way is more convenient, because you're not loading the same sound every time, you're just reusing the AudioBuffer.
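A minimal sketch of that caching pattern might look like this; the file name and function names are just illustrative:

```js
// Load and decode once, then create a fresh AudioBufferSourceNode for
// every playback, because a source that has already played can't be reused.
var cachedBuffer = null;

function loadSound(url, onLoaded) {
  var request = new XMLHttpRequest();
  request.open('GET', url, true);
  request.responseType = 'arraybuffer';       // we want raw binary data
  request.onload = function () {
    context.decodeAudioData(request.response, function (buffer) {
      cachedBuffer = buffer;                  // keep the decoded buffer around
      if (onLoaded) onLoaded();
    });
  };
  request.send();
}

function playSound() {
  var source = context.createBufferSource();  // new source per playback
  source.buffer = cachedBuffer;               // reuse the decoded data
  source.connect(context.destination);
  source.start(0);
}

loadSound('sample.mp3', playSound);
```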
So, I should go step by step into the more complex nodes, but I'm going to introduce something a bit more advanced now, because it's more convenient for the talk. There's another node that's very useful, the analyser node, which allows us to perform real-time analysis in the frequency domain and in the time domain; the frequency analysis is what's commonly called the fast Fourier transform. When your audio buffer is played back, the data is in the time domain; that's the classic waveform shape you see. But visualizers and equalizers and most filters work on frequencies, and the analyser node gives us access to that data. The way you use it is: create an analyser and set the FFT size, which is the number of samples you want your transform to have. This is all sound engineering and mathematics stuff; you only have to know that it should be a power of two, usually 256 or 512 samples. Then you provide your own array, and getFloatFrequencyData or getByteFrequencyData will store that information in it, and you can use it.

So in this case, for instance, and this is getting more and more complex, but still: we create the audio context, we create the buffer and the analyser node, we set it to 256 samples, we create this frequencyData array of bytes, a Uint8Array, and we create a canvas, because we want to see what's in that spectrum, we want to plot it. The rest is the same: we have the loadSound and playSound functions from before. The only difference is that in playSound we connect the source to the analyser, and the analyser is connected to the output, the context destination. And the update function, which is called with requestAnimationFrame, basically just reads the byte frequency data into the frequencyData array and plots it. So that's basically what we have, and we can see the frequencies of whatever we're sending into the analyser. The funny thing is that we're creating several buffer source nodes in parallel every time I click. All those nodes are connected to the analyser, so when you play several sounds they get accumulated and then go to the destination; you can see the frequencies sort of pile up.

What this is showing is the amount of energy in each frequency band. This is very useful if you want to create some kind of visualization or sound-reactive application, because the lower frequencies, this part of the spectrum, are the bass, the low notes, and the higher you go in the spectrum, the higher the sounds. So if you wanted to write your own MP3 encoder, for example, you might just zero all the frequencies from this point onwards, or whatever you want. Basically you link it to something on your page, or in your game, and make it react.

That was with a buffer source. As I said, there are several ways of putting sound into the system, and this one doesn't use loadSound; it uses WebRTC's getUserMedia. WebRTC is the technology in the browser that allows us, among many other things, to access the webcam and the microphone, or webcams and microphones, because there can be several connected to the system. navigator.getUserMedia prompts the browser to ask the user for permission to use the microphone. And notice that I've commented out the part where the analyser is connected to the destination, so you're not going to hear anything if this works.
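As a sketch, using the prefixed getUserMedia of the time, the microphone-into-analyser routing looks roughly like this (the canvas plotting itself is left out):

```js
// Route the microphone into an analyser without hearing it:
// the analyser is deliberately NOT connected to the destination.
navigator.getUserMedia = navigator.getUserMedia ||
                         navigator.webkitGetUserMedia ||
                         navigator.mozGetUserMedia;

var analyser = context.createAnalyser();
analyser.fftSize = 256;  // power of two
var frequencyData = new Uint8Array(analyser.frequencyBinCount);

navigator.getUserMedia({ audio: true }, function (stream) {
  var micSource = context.createMediaStreamSource(stream);
  micSource.connect(analyser);
  // analyser.connect(context.destination);  // left out: analyse, don't play back
  update();
}, function (error) {
  console.log('Microphone access denied', error);
});

function update() {
  requestAnimationFrame(update);
  analyser.getByteFrequencyData(frequencyData);  // energy per band, 0-255
  // ...plot frequencyData onto a canvas here...
}
```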
Hello, hello. You can see the sound, but you can't hear it, which is also convenient if you don't want your input to feed back into the output; you just want to process it. So again: getUserMedia, you have to check the prefixed versions, and what it does is, once it prompts the user to allow access, you get a success callback with the stream, and then context.createMediaStreamSource creates a source node from it. It behaves exactly the same as the AudioBufferSourceNode, only it doesn't have a playback rate, because you cannot make reality speak faster. Okay.

So now we've seen that it's very useful to play sounds or to access the system's input, and we saw this AudioParam type. It's an object, a data type used in several parts of the API, and it's basically one of the most powerful features of the Web Audio API: it allows you to schedule value changes over time. For instance, setValueAtTime is like your normal setter: you want a frequency, or a gain, to be 100, right now. That's easy. But if you wanted it to move slowly to 120 or some other value, you'd probably have to implement your own easing or linear interpolation and deal with it yourself. With setValueAtTime you can say: I want this value to be 100 now, or in two seconds, or whenever. So you can basically schedule everything from the very beginning: from the moment you start, you can say, I want all of this to happen. Then there's linearRampToValueAtTime, which basically says: whatever it is now, I want it to go to zero in five seconds, which is a fade-out. It's very cool. So please, if you implement mute on your web pages, fade the sound in and out, because an abrupt cut is grating. linearRampToValueAtTime is linear interpolation, and exponentialRampToValueAtTime follows a curve. Then there's setTargetAtTime, which is a bit more complex: it basically says, I want you to approach that value at this rate, I don't care exactly when; it's a falloff, an implementation of a falloff value. With setValueCurveAtTime you can specify a curve of values, and cancelScheduledValues just removes everything you've scheduled for that parameter. I'll show this with the next nodes.

The gain node is basically a gain control node, basically the volume, although in audio, gain, volume and power are different concepts measured in different units. It's got one attribute, gain, which is also an AudioParam, so again you can program it, you can schedule it. In this case, same thing: create a context, create a buffer source, create a volume node, set its value to 0.1. That's a scaling factor applied to your waveform, so in this case it would be one tenth; 1 would be no modification, 2 would be twice the amplitude. Connect the source to the volume node and the volume node to the destination, play it, and you're done. If you keep a reference to your volume node you can change it any way you want, and it will just change the volume.
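For example, a minimal sketch of the fade-out idea, assuming volume is the gain node from that example:

```js
// Schedule a two-second fade-out on the gain AudioParam instead of cutting abruptly.
// Assumes `volume` is the GainNode created above.
var now = context.currentTime;

volume.gain.setValueAtTime(volume.gain.value, now);  // anchor the current value
volume.gain.linearRampToValueAtTime(0, now + 2);     // ramp down to silence over 2s

// A fade-in would be the same thing in reverse, ramping from 0 back up to 0.1.
```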
The next one is the delay node; it adds a delay, conveniently. Basically the same thing: you create it with context.createDelay, you specify the maximum delay time that node can handle, and then you set delayTime, which is again an AudioParam. In this case, for instance, we create a buffer source with a sound, add a delay of three seconds out of the 100 seconds we specified as the maximum, and play it, so it plays three seconds after the actual start.

What can this be used for? A very easy and cheap echo or reverb effect: you basically connect your buffer source into a loop between the delay and gain nodes. You can see it in the last lines: the audio source is connected to the delay node, the delay node is connected to the gain node, the gain node is connected back to the delay node, and the delay node is connected to the context destination. So it creates a feedback loop that attenuates the signal on each pass, and you get this echo effect. I'm going to show you this; it's basically the same thing: create the context, create the analyser, create the canvas to show it, load the sound and play the sound, as we've seen. It creates the audio source, sets a random playback rate, creates the loop, and just plays it. What you get is the same sample, but you can hear that echo. That echo isn't made with anything fancy, just a loop of delay and gain: you get the echo from the delay, it's attenuated because you lower the gain, and the same sound keeps going round. You can run into runaway feedback with this, so you have to be careful.
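A minimal sketch of that feedback routing, assuming source is a buffer source node that's already set up:

```js
// Cheap echo: source -> delay -> destination, with a feedback loop
// delay -> feedbackGain -> delay that attenuates each repeat.
// Assumes `source` is an AudioBufferSourceNode that is already set up.
var delay = context.createDelay(100);   // maximum delay time, in seconds
delay.delayTime.value = 0.3;            // 300 ms between echoes

var feedbackGain = context.createGain();
feedbackGain.gain.value = 0.5;          // each repeat is half as loud

source.connect(delay);
delay.connect(feedbackGain);
feedbackGain.connect(delay);            // the feedback loop
delay.connect(context.destination);

source.start(0);
```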
Then there's the oscillator source. You create it with createOscillator and assign a type; there are five types: sine wave, square wave, sawtooth, triangle and custom. If you use custom, you have to specify your own cyclic waveform with setWaveTable. And again there's a frequency, which is an AudioParam, so you can specify the frequency you want your oscillator, your LFO, to run at, and you can schedule it, ramp it, do lots of things with it. So basically you can start building your own synthesizer with oscillators. Same thing: create the audio context, create the oscillator, assign a frequency value and the type. One important thing, and this is one of those things that are in flux: some implementations accept a string for the type, like 'triangle' or 'sawtooth', and others, I think it's the Safari implementation, require a numeric constant, so keep that in mind. Connect the oscillator to the destination and start it. Okay, sorry.

Okay, filters. This is a very useful type of node; it creates a second-order filter, and it has all these types. Lowpass: only frequencies below the frequency attribute are let through. Highpass: the opposite, only frequencies above that frequency are let through. Bandpass: only frequencies around that frequency, within the Q attribute, are allowed to pass. Lowshelf and highshelf are like lowpass and highpass, but a gain is applied, so you can amplify those frequencies. Same with peaking: it's like bandpass, but with amplification. A notch filter is the opposite of a bandpass: it lets everything through except a specific band. And allpass lets all frequencies through but applies a phase shift.

In this case, for instance, I'm going to show you: up there I'm creating an oscillator with a sawtooth type and a frequency value, and then a filter whose type is set with a number; I think it's a notch, I don't even remember which number it is right now. And what I'm doing is hooking the mouse movement up to it: you can see here that the frequency of the filter changes to a value related to the position of the mouse, at context.currentTime plus 0.1, so in one tenth of a second it's going to reach that value. context.currentTime is very convenient when you're scheduling events, because it tells you exactly where you are in time. It starts from zero when you create the context and progresses while it's running, so you can say, okay, this is my now, so now plus one second, do this. You have to keep track of time. And the same thing is done for the oscillator frequency. So you get this: vertical is the frequency of the oscillator, and horizontal is the cutoff frequency of the filter, and you can hear how the frequencies go up and down. If you want to create a radio, this is probably what you need to do. It's a very lame demo, but that's basically how all the Moog, Korg and Roland sound machines started, so you can build your own synths with this.

So, convolution. Convolution, I mean, if you know what it is, cool; if you don't, just use it and forget about the theory, because the theory is mind-boggling, but a convolution effect is actually really simple to use. Basically you've got a sample, in this case one that characterizes the dynamic response of a system. For instance, when you do this, the echo that you hear can be recorded, and when you have a different sample, you use that recording with a convolver node and it adds all that echo, all the dynamic characteristics of the environment you sampled, what is called the impulse response. You can search the internet and find WAV or MP3 files that are impulse responses of a cathedral, or the inside of a metallic tank. You create a convolver with the context, assign that buffer, loaded exactly the same way as an audio buffer with XMLHttpRequest, and apply it, and everything you put into your system before the convolver turns into that environment.

So, demo time. Same thing: we create the canvas for the visualization, we use loadSound twice, because we're going to load our original sample and the cathedral impulse response, then create a convolver and hook it up to our audio buffer source. And it creates this: all this complex texture of sound out of a single sample. I mean, if you tried to do this with a whole set of delay, gain and filter nodes it would take really, really long, and this is very convenient: you just feed it an audio file and you get that impulse response applied. Otherwise you'd probably have to calculate, I don't know, materials and the way different frequencies bounce off the materials you're modeling. So this is pretty cool.
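As a small sketch, assuming loadSound from earlier has been generalized to hand back the decoded buffer for any URL, and that 'cathedral-ir.wav' is a hypothetical impulse response file:

```js
// Convolution reverb: play a sample through a convolver loaded with an
// impulse response. `loadSound(url, callback)` is assumed to decode the
// file and call back with an AudioBuffer; 'cathedral-ir.wav' is hypothetical.
var convolver = context.createConvolver();
convolver.connect(context.destination);

loadSound('cathedral-ir.wav', function (impulseResponse) {
  convolver.buffer = impulseResponse;         // the sampled environment

  loadSound('sample.mp3', function (buffer) {
    var source = context.createBufferSource();
    source.buffer = buffer;
    source.connect(convolver);                // everything routed through it
    source.start(0);                          // ...now sounds like the cathedral
  });
});
```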
And I think this is the last one I'm going to cover: the panner, which is also very cool if you're creating games or some kind of immersive experience. It allows you to create spatialization effects by doing something very simple. When you instantiate a panner node you can define its position, its velocity, its direction and the focus of its sound cone, so it can be omnidirectional or very directional. You move your sound, with all its effects and all its nodes, around in 3D, and it's tied to the audio listener, context.listener, for which you can also specify the position, the velocity and everything; that's you, the listener. Once you've got this running, it basically calculates everything: if the source is far away it gets attenuated, and if it moves fast towards you it calculates the Doppler effect from your relative speeds. So that's awesome, and it's out of the box, you just use it. There's a more thorough tutorial about this on HTML5 Rocks and I recommend you look at it, because covering it would take longer than the time I have for this talk.

And that's it for the main ones. There are some others: you can split channels and merge them again; the dynamics compressor tries to prevent clipping and things getting out of hand, so you usually just stick it at the end of your chain, before the context destination; you can create periodic waves; there's the wave shaper, also for complex things; and the JavaScript node, which is like the shader of 3D graphics: it allows you to do anything you want with the input and the output. Anything you can't do with the built-in nodes, you can just build there. You want a pitch-shift effect? You go into the frequency domain, shift it, turn it back into the time domain and let it go. There are plenty of those, but they're rather more complex and require a bit of sound programming knowledge.

So, the demo I was playing before: there are 50 cubes, and each cube has its own sound loop, a proximity-activated sound, and an activation sound for when it lights up. Each one has its own analyser, which is also connected to the microphone, and a panner so you can place it around in 3D. Then there's an atmosphere playing with its own node, a convolver to add some reverb, a filter for the sake of it, and a dynamics compressor to try to keep everything under control. So all this sound, all this texture, is created out of a lot of different sounds. The sound system here is really good, but it's not doing the sound justice; with headphones it's way more intense. You can hear all this complex behavior, just out of scheduling events. And you can even... I have to reload, sorry. So the sound, the vibration of the light, is based on each cube's own sound, but you can also use your microphone. So if you want to do something with an installation, something that is interactive, something users can interact with, this is a pretty awesome technology. You can hear, when you get closer, that's feedback, that's not me. And you can hear the sounds when you move over a sound source, so it creates a very complex texture very easily. I'm going to blow this demo up so you can see how it's built for each cube, and it's not that difficult.

So, awesome. Now, what can you do with all this technology? There's WebRTC getUserMedia, so you can start doing cool interactions with users on a web page by letting them talk into it and doing whatever, I don't know, the sky's the limit. There's the Web MIDI API, which is not just for cheesy-sounding songs; it's actually an interface, so you can use MIDI keyboards or any MIDI device plugged into your computer, and then use all the sliders and knobs and all the events, everything, like sound machines, anything you want. You can go into procedural sound generation, for games, for sites, or for the sake of it, creating an audio node and feeding things into the system, or using oscillators. You can use real-time effects like spatialization, or you can have your site play a soundtrack and, if the user goes underwater, muffle the frequencies so it sounds like it's underwater.
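That underwater trick, for instance, could be a simple lowpass filter dropped into the soundtrack's chain; a minimal sketch, assuming music is the node playing your soundtrack:

```js
// "Underwater" effect: route the soundtrack through a lowpass filter and
// sweep its cutoff down when the player dives. `music` is assumed to be
// the source node playing the soundtrack.
var underwaterFilter = context.createBiquadFilter();
underwaterFilter.type = 'lowpass';
underwaterFilter.frequency.value = 20000;     // wide open: sounds normal

music.connect(underwaterFilter);
underwaterFilter.connect(context.destination);

function goUnderwater() {
  var now = context.currentTime;
  underwaterFilter.frequency.setValueAtTime(underwaterFilter.frequency.value, now);
  underwaterFilter.frequency.linearRampToValueAtTime(300, now + 0.5);   // muffle
}

function surface() {
  var now = context.currentTime;
  underwaterFilter.frequency.setValueAtTime(underwaterFilter.frequency.value, now);
  underwaterFilter.frequency.linearRampToValueAtTime(20000, now + 0.5); // open up
}
```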
These are the references. I mean, there are lots of them, but you're better off sticking to the latest spec. There are a few clashing versions of the spec, so be careful: in some you have to call createGainNode and in others createGain, so the syntax differs sometimes. And the tutorials, I mean, every tutorial on HTML5 Rocks is awesome, but these two, the getting-started one and the one on mixing positional audio and WebGL, those are really, really good. And that's it. Thank you. Now, questions, you can finally ask.