Thank you, and welcome from my side as well. Who attended my presentation at the multimedia mini-conference two days ago? Just want to know how big the overlap is. It's a few people. OK, all of you, you can still go for the book, yes.

So, what we're doing today, and I say that up front so you can all be prepared and think about something: at the end of this talk, I'm giving away one copy of my book, which I spent the last nine months writing. John knows how many headaches it produced. But it's got some pretty cool examples in it. It's basically a book of recipes for what to do with HTML5 video, and that's pretty much what I'm talking about today. I'm showing off some of the examples; not all of them, that would take too long. And what I want you to do, if you're interested in getting a copy of this book for free today (and I can sign it as well if you want), is think about an application that you might want to write that includes HTML5 video, or some really tricky problem that you're thinking about, or participate really intensively during my presentation with questions. But don't stop me too much, because I've got 40 slides to get through. So at the end, I'll decide on the spur of the moment who gets the book, based on how you interact. All right, thank you very much.

We're talking about, well, the latest and coolest in HTML5 media. That title is only topped by the dorkier one of "The Definitive Guide to HTML5 Video". But what it's really about is what browsers are up to these days, and it's really, really exciting what they're doing in multimedia. I've been in multimedia for 15 years. I've seen stuff come and go, but this is what has excited me the most, so I hope you get as excited about it as I am. Let's get into it.

One of the most important websites that I keep coming back to, and I want to tell you about it right up front, is called caniuse.com. It's about all the HTML5 features in the different browsers, and it tells you which features are supported in which browsers. That's really important if you're authoring a website and you want to know what the browsers are up to, what level of HTML5 implementation they've reached. And when I say HTML5, I actually mean a little bit more than just the HTML5 markup. I include in this HTML5 buzzword everything that people include in an HTML5 platform: CSS3, all the JavaScript APIs, which is web workers and web sockets and all sorts of other things that form around this whole platform.

Three things I want to talk about today: video manipulation, audio manipulation and media accessibility. The last two will get a bit short. My talk on Monday at the multimedia mini-conf had a focus on audio manipulation, so I'm going to cover it very briefly here today. Video manipulation will be the biggest focus, and we'll talk a little bit about accessibility at the end.

So, HTML5 has a new audio and a new video tag, HTML5 media, but it's not just that really. It's a combination of technologies. This is the table of contents of my book, and you can see I've got CSS3 here, JavaScript, media and SVG, media and Canvas, media and web workers, the audio API (the two different versions that exist), media accessibility and internationalization, and finally audio and video devices. Of all those chapters, the first four are pretty solid: the technology behind them is pretty solid and pretty solidly implemented in all the browsers.
The middle three, SVG, Canvas and web workers, exist in browsers, but there are diverging levels of implementation. And the last two, the audio API and accessibility, or rather the last three with the device API, basically don't exist yet. Only the specifications exist, no implementations yet, but it is happening, and I'm pretty sure that within a year we can talk about all these things much more concretely. Which also means that this book is actually out of date by the time you're getting it, because things are happening all the time. That's why I've set up a website, html5videoguide.net, and it's my intention to post to that website all the new things that are happening around HTML5 audio and video. So in about a year I will probably need to do an update of the book, or maybe in two years, who knows. We'll see.

So, focus on the video element and video manipulation first. This is a fairly innocent video element; there's not much happening around it. It's just a video element with controls and a poster image. Normally videos start from the beginning, and the first frame, which is normally shown as the representative image of the video, is often black, so not much information. People therefore tend to pick an image from the middle of the video, take a screenshot, and put that in as the poster image. Now, poster in this context doesn't actually mean a movie poster as you know it from old-style cinema; it just means a representative image. We've got a width on this one just to scale it down a bit, and we've got three source elements inside this video to be able to satisfy the different browsers, which still haven't decided on a common baseline codec that every browser will implement. As long as that's the case, we will have to use at least two source elements in order to satisfy the browsers. It's either a combination of MPEG-4 and WebM, or MPEG-4 and Ogg Theora, that you will have to use. I would suggest for anyone setting up a new website: use WebM. It will be a lot better looked after in the future. Ogg is sort of in transition; it still has more tools available than WebM, but that is constantly changing, and hopefully we will eventually get a common baseline codec.

Well, I was actually lying. That last page wasn't just a video element. You can see the rounded corners: that's not the plain video element, that's the video element with a little bit of CSS just to make it look a bit nicer, and this is the CSS that I threw on here. For the rounded corners, border-radius is the CSS3 feature, so we're quite happy to have that around now and don't have to do any tricky things anymore to get rounded corners, and I just threw a nice little border on it as well.

Oh, while we're at CSS3, let's look at some other nice things we can do with CSS3 and the video element. For example, here, as I mouse over, I get a transition: the video comes closer to me and gets a black background. That's done with the hover effect here. We specify the video styling as it is normally, we have a video:hover styling section as well, and then we have a section that describes the transition between the two states. That transition property, duration and timing function is something new in CSS3. For simplicity, I've just used the standard properties here. Unfortunately, none of the browsers actually use these standard properties yet; they all use them with their own prefixes, -moz, -o for Opera, -webkit, and so on and so forth. So if you want to use this, make sure to put all of those prefixed copies in your style sheet.
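To make that concrete, here is a minimal sketch of the kind of markup and CSS just described; the file names, sizes and colors are made up for illustration:

```html
<style>
  video {
    width: 400px;
    border: 3px solid #333;
    border-radius: 15px;                /* the CSS3 rounded corners */
    -moz-transition: all 0.5s ease;     /* prefixed copies for each  */
    -o-transition: all 0.5s ease;       /* browser engine, plus the  */
    -webkit-transition: all 0.5s ease;  /* standard property last    */
    transition: all 0.5s ease;
  }
  video:hover {
    width: 500px;                       /* "comes closer" on mouse-over */
    background-color: black;
  }
</style>

<video controls poster="middle_frame.png" width="400">
  <source src="video.mp4"  type="video/mp4">
  <source src="video.webm" type="video/webm">
  <source src="video.ogv"  type="video/ogg">
</video>
```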
But it does work in all of the browsers. It's a pretty cool, nice little effect. And here I use a little bit more cool CSS3 stuff: I've actually put a bit of transformations in here. The transformations make it possible for elements to be rotated. So here, for example, as I mouse over it, not only does it get larger, it also rotates and gets into a state of, well, me wanting to play it, because it's picked up out of this pile of videos that I've sort of thrown down. So you can do cool things like this now. This is CSS3 in use with a little bit of video.

OK, there is something else I want to show you, but it's not possible in Firefox, which is the browser that I'm currently using to demonstrate this. It can only be done in Safari, so let's swap over to Safari. Stop it. Anyone who's had some experience on the early Amigas: I think there it was already possible to define something like a cube and throw videos on its sides. We can now do that in a web browser. So that's what we're doing here. We've got a cube, and I've set that cube up with a bit of transitions and so on, so it can move in 3D. And as we mouse over, it stops, and I can watch the video, or I can decide I don't want to watch this video anymore, let it turn a bit more, and watch a different video. So I think this is some pretty cool stuff that's in the works, only implemented in Safari this far, not even in Google Chrome. I hope we'll see stuff like that a lot more in the future, because it's cool.

All right, enough coolness, and enough CSS3. Let's move on. Everything I've done so far was done without JavaScript. Anyone who's a web developer will goggle: wow, 3D stuff, and no JavaScript at all, just CSS3. Excellent, so much easier. But of course we also have a JavaScript API. And I'm explaining this JavaScript API with a page here that I've copied from Philippe Le Hégaret, who is from the W3C and who's put this together nicely. I didn't think I needed to re-implement it; he's done such a nice job. What you can do is call functions like play(), pause() and load() to manipulate the video resource that you're using. With load() you could load a new video resource into the same video element in the same position, and that way you could dynamically create, for example, a playlist. play() and pause() let you play and stop the video, of course. These are the functions to interact with the video.

Then, everything that's happening on the video element actually throws events. Here are the events that can be thrown. So for example, am I seeking anywhere? Yes, this is a seek: when I go current time plus one, then that's seeking, so then I've just seeked. You'll see the seeked count increase as I press this button. It's still seeking, even though it's at the end. You can catch all of these events and do something as they happen.

And we've got the media properties, which is what I've interacted with here. We can, for example, set the current time to a certain position, or seek backwards, just by setting currentTime, and you'll see the properties shown here change. You can change them directly in your JavaScript, basically this way. So you can read video.currentSrc, you can set the source to a new one, and you can catch all of these events. Easy enough.
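For illustration, here is a minimal sketch of that API surface (the element id, file names and buttons are invented): calling the functions, setting currentTime, and catching the seek events:

```html
<video id="v" controls>
  <source src="video.mp4"  type="video/mp4">
  <source src="video.webm" type="video/webm">
</video>
<button id="toggle">play/pause</button>
<button id="forward">+1s</button>
<script>
  var video = document.getElementById('v');

  // Functions: play() and pause(); load() would swap in a new resource.
  document.getElementById('toggle').onclick = function () {
    if (video.paused) video.play();
    else video.pause();
  };

  // Properties: seeking is just setting currentTime.
  document.getElementById('forward').onclick = function () {
    video.currentTime = video.currentTime + 1;
  };

  // Events: seeking and seeked fire around every seek operation.
  video.addEventListener('seeked', function () {
    console.log('seeked to ' + video.currentTime);
  }, false);
</script>
```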
So here is something I've done with all this stuff: I've re-skinned a player. This is a player that has much bigger buttons, obviously more useful for the older generation, who can't see so well anymore, or anyone who likes big buttons. I liked it. Let me play it. ("So I just wanted to introduce you to the W3C, and to do so I have some exciting information: the W3C has been acquired by Twitter.") There's more to it; you're gonna have to find it on YouTube, because I uploaded this video to YouTube as well. I don't think I've... oh, it's probably published as part of the little videos that I published on Monday along with the slides from the multimedia mini-conf. So if you go to my slides from the multimedia mini-conf, they're linked. Sorry, you're just gonna have to bear with it: my slides are on John's server, and he's complained that I'm pulling more than 20 megabits per second right now, so... don't worry about it. It's up there.

But what you can do here, nicely: these buttons I've defined through images and CSS and so on, and I've used JavaScript behind them to interact with the video. So I can also change the loudness. Come on... ("So I just wanted to introduce you to the W3C...") Whatever. OK, just with the buttons: I can turn the sound off, turn it on, have it play more quietly and louder, and so on. The design is also taken from somebody else, so... It's a very nice little player for making things a bit more accessible.

So now we move on to looking at video and SVG. And the SVG chapter, I must admit, was the hardest of the whole book to write, simply because SVG implementations are just not uniform across all the browsers. But there are some cool things you can do with video. Once SVG has actually become inline and properly part of HTML, you can do things like these little examples here, where you can take SVG filters and throw them on video elements and get cool effects like these. They're small, as everything is small here. I'll turn off the sound because it's just annoying. But you can do filters like this blur filter, or this canvas-type filter, the blurring here with the lines, or here something like edge detection. So there are lots of things you can do with these. I've got all of these examples up on html5videoguide.net, so if you want to look at the source, do it there. They're basically SVG filters applied through CSS to the video elements, so it's actually a really nice and simple way to use them. Only works in Firefox 4, though.

So now we're getting to something a bit more useful for actually manipulating video, and that's the interface between video and canvas. Canvas is a lot more standardized across the browsers; there are some little rough edges that we can avoid, but in general we can actually do something with video in the canvas. Here, for example, we're grabbing the video frame and throwing it into a canvas, displaying it in a canvas. And the way we're doing this is: we obviously define a video element and a canvas element, and as the video is playing, we catch the timeupdate event. The timeupdate event gives us an interrupt at frequent and regular intervals while the video is playing. At the moment it's about 100 milliseconds between each timeupdate event; every browser used to have its own interval there, but at the moment it seems to be standardizing at around 100 milliseconds. At each timeupdate, we call this paintFrame function, and all we do is draw an image from the video element. It's as simple as that.
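As a minimal sketch of that timeupdate approach (the ids, file name and dimensions are invented for the example):

```html
<video id="v" src="video.webm" controls></video>
<canvas id="c" width="320" height="160"></canvas>
<script>
  var video   = document.getElementById('v');
  var canvas  = document.getElementById('c');
  var context = canvas.getContext('2d');

  // Called roughly every 100 ms while the video plays:
  // copy the current video frame into the canvas.
  function paintFrame() {
    context.drawImage(video, 0, 0, canvas.width, canvas.height);
  }
  video.addEventListener('timeupdate', paintFrame, false);
</script>
```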
We pull an image from the video element and throw it into the canvas; done. Now, if that timeupdate event isn't fast enough for us... Every 100 milliseconds is a sensible interval if you want to do something in the canvas that doesn't have to look like a video, that can look a bit like individual images. You could do, for example, face recognition or something like that at that time distance. But if you want something that's a lot more real-time, that looks a lot more like a video, you actually have to do it by catching the play event, which is only thrown once, at the beginning, when you hit the play button. And as you catch that, you set a timeout. To do it as fast as possible, you set it to zero, which is probably not that sensible on a normal web page; you probably don't want to repaint more often than your video has frames, for example. But you can set it to zero and then paint the frame at that time, calling your function over and over again. And when you do that, you get the video basically mirrored in the canvas by just repainting it image by image, which is what this demo does.

Note that when it reaches the end and I press the play button again, there's no play event thrown. That's because, in theory, the video doesn't end at the end: it's still sitting there, waiting for more data to come. So if I hit the play button again and it rewinds, it doesn't actually throw another play event. It's a standards discussion, and we haven't really come to a better solution than that. So you might need to also catch the case where the video has ended and is playing again, and then go back into the painting routine as well. So that's what comes out. All right. Please do interrupt me with any questions you have. It is a big audience and a big room, but I will try to repeat what you're saying, and then we can all share it.

All right. Another thing that you can do with the video element: now that we have the images in the canvas, we can actually get at the image data, the individual pixels. We get the individual pixels by taking what's painted into the canvas, running it through getImageData to get the frame, and then we can loop over the frame data and do cool things like what we're doing here. We're using the average color of the pixels in the video to change the ambient frame around the video. It's pretty cool. This is just a CSS color change here.

Now that we have access to the individual pixels, we can also manipulate them and determine which pixels we want to display and which we don't. In this instance, what I'm doing is putting a threshold on the pixel color and taking only those pixels that are above a certain threshold, so basically the white ones, and I'm painting those into this other canvas, and this other canvas has a picture in it already. This way you can do bluescreen-type effects: you can put the video in front of a different background. It's not difficult to do; it's basically the same as before. We create an output image where we throw the background image. Then we loop over the pixels of our input image, which is the frame from the video, and we calculate the RGB values and put a threshold over them: above roughly 200 it's white, or at least very bright, so we paint those pixels and we don't paint the others.
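Here is a sketch that combines the two techniques just described, the play-event-plus-setTimeout repaint loop and the pixel threshold; it assumes a background picture has already been painted into the output canvas, and the ids and dimensions are invented:

```html
<video id="v" src="video.webm" controls></video>
<canvas id="scratch" width="320" height="160" style="display:none;"></canvas>
<canvas id="out" width="320" height="160"></canvas>
<script>
  var video   = document.getElementById('v');
  var scratch = document.getElementById('scratch').getContext('2d');
  var out     = document.getElementById('out').getContext('2d');

  function paintFrame() {
    // Grab the current frame's pixels via a hidden scratch canvas.
    scratch.drawImage(video, 0, 0, 320, 160);
    var frame  = scratch.getImageData(0, 0, 320, 160);
    var output = out.getImageData(0, 0, 320, 160);

    // Keep only the bright pixels; everything else stays background.
    for (var i = 0; i < frame.data.length; i += 4) {
      var r = frame.data[i], g = frame.data[i + 1], b = frame.data[i + 2];
      if (r > 200 && g > 200 && b > 200) {
        output.data[i]     = r;
        output.data[i + 1] = g;
        output.data[i + 2] = b;
        output.data[i + 3] = 255;
      }
    }
    out.putImageData(output, 0, 0);

    // Repaint as fast as possible until playback stops.
    if (!video.paused && !video.ended) setTimeout(paintFrame, 0);
  }
  video.addEventListener('play', paintFrame, false);
</script>
```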
Easy as that. While we're doing pretty nice little things: you can do similar things with a reflection. Here, what we do is take the image and move it 160 pixels down, which is below the video. So we basically make a copy of the image, and then we turn it around: we scale it and flip it, and as we've flipped it, it's now a mirror image. Then all we need to do is put a gradient, which is this thing here, over the mirrored image, and this way we have a reflection. That's actually the only sane way I found to do reflections. You can try to do reflections in CSS in other ways, but then you have two video elements, and it's almost impossible to keep two video elements in sync with each other when they're trying to be a reflection. Here we only deal with one video, so it's really the only way to do reflections sensibly.

Here we're getting into something a little bit more CPU-intensive: face detection. The way we're doing it is essentially the same as just before: we're identifying those pixels that have a certain color combination. In the literature I found this formula, which says pixels of this color tend to be face color. As you will see, the ones that I've painted in this blue are the pixels that it identifies. Sometimes it works, and sometimes it goes horribly wrong. But I've just taken it as a simple approach; it's the simplest way to identify face color, and you could improve that algorithm twenty-fold if you put a bit more effort behind it. Here it lags a bit behind, because I'm using the timeupdate event, so it's jumping a little. And maybe I need to find a picture where it's actually a face. What I do then is find the groups of pixels that are all blue, that is, face-colored; I find the largest block, and I find the boundaries around that block, and that's what I identify as a region where there's potentially a face.

Now, I've thrown all of this calculation into (oh, thank you, we're running out of time) a web worker, which is basically a thread that works in parallel to my main page. That means my main video up here can continue to play smoothly, without interruption, while the web worker is calculating the face region and so on. That calculation is quite intensive, but it shouldn't stop the main page from displaying the video properly, or doing anything else for that matter. So that's how we use web workers.

Another cool thing that's possible with HTML5 is a little experiment I want to do with you today. I've set up a Node.js server on this laptop at that address, and if you go there, you should be able to watch a video in a shared fashion. Oops, sorry. I've got it here. So if you're loading that... it might fail with this many people, I don't know. I found a bug before and just fixed it, so hopefully it's going to work. If people go to this page and don't do anything yet with the video: when I click the play button, it should play on everyone else's laptop as well. It's not currently doing that, though, because I can see it's not sending any... ah, yeah, there you go, play requests are being sent. A few people... are people seeing it? Is it playing? If I press pause, then it's paused. If I seek... it fails. No, it works! How cool is that? We can share video. You could now be in Germany, and we could watch this video together at the same speed. How cool is that? And then, of course, put an IRC channel next to it and talk about it and stuff.
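This is not the demo's actual code, but a rough sketch of how such shared playback could look on the client side; the server URL and the message format are invented, and the remote flag is a crude guard against echoing remote commands back out:

```js
var video  = document.getElementById('v');
var socket = new WebSocket('ws://example.local:8080/');
var remote = false;  // set while applying a command that came from the server

video.addEventListener('play', function () {
  if (remote) { remote = false; return; }
  socket.send(JSON.stringify({ cmd: 'play' }));
}, false);

video.addEventListener('pause', function () {
  if (remote) { remote = false; return; }
  socket.send(JSON.stringify({ cmd: 'pause' }));
}, false);

video.addEventListener('seeked', function () {
  if (remote) { remote = false; return; }
  socket.send(JSON.stringify({ cmd: 'seek', time: video.currentTime }));
}, false);

// The server broadcasts every command to all other connected viewers.
socket.onmessage = function (event) {
  var msg = JSON.parse(event.data);
  remote = true;
  if (msg.cmd === 'play')       video.play();
  else if (msg.cmd === 'pause') video.pause();
  else if (msg.cmd === 'seek')  video.currentTime = msg.time;
};
```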
If somebody wants to write that application, that would actually be a really, really good candidate for the book. Just as a side idea. All right, let's get back. These are just some nice ideas; I'm really trying to kick your imagination here with all the examples I'm bringing, so you can come up with ideas for some good new video application that you might want to write on the web.

Seeing as we have only five minutes, I'm going through the rest very quickly. I was told before that I can go over time a little bit. Yeah, okay. Same thing with the audio element: the audio element in essence is fairly boring and innocent. But if we take Firefox 4 and use the audio API, we can do very nice things, as I showed off on Monday. Come on, let's just run this for a bit. This is not an example that I coded up; this is an example that the Mozilla developers did, and it's very cool, because you get access to the raw audio samples. You can display them as a waveform, and you can filter them. This is all happening in the web browser, remember. It's very cool. Or you can display the Fourier transform and even visualize it somehow, or do beat detection. And now comes the cracker: we can combine this with some 3D stuff. I have a lot more really cool demos of this, so go and check out their work; those are the guys who've done it. It's really awesome. I just have no words. It's awesome.

So let's go back to our presentation. This is linked from my presentation, and eventually my presentation will be online somewhere as well. Right now, we're just gonna leave it there. I want to show you the key elements to doing the audio stuff I've just shown you. All you need to do is listen for an extra event that Mozilla have introduced, called MozAudioAvailable. As soon as audio samples are available, this event fires, and then you can pick up the audio samples from the frame buffer. So it's actually quite simple, and then you can do all sorts of things with them.

You can also generate sound. Here we're just generating a 440 hertz tone, which you can obviously hear as sound, and what you need to do is just create an audio element in JavaScript and throw some samples at it. The samples are basically the same as the frame buffer from before. So this is very cool, and it's one of the contenders for an audio API at the W3C. (You're gonna have to give me a few more minutes. Yeah.)

And there's an alternative proposal for an audio API as well. This one is implemented as a demo in Safari, in WebKit. You can do similarly cool things there, but where the Mozilla API is defined on samples (you pull out samples and you throw in samples, and that's all it does), this one is defined on filter graphs. So it's more like a GStreamer approach, where you put one filter after the other. It depends on how you want to do your audio processing; I actually think both APIs are useful, and they should make both available. But we'll see what the working group comes up with. Here, for example, is a simple drum machine written with this other API. And you can do all sorts of other cool things there; I'll leave that to you to play with. You're gonna have to install a special version of Safari on your Mac, or on somebody else's Mac. That's the only place where it's currently possible to run, unfortunately. That's because the key person developing this works at Google but came from Apple. So, yeah.
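As a sketch of the Mozilla pattern described above (reading samples via the MozAudioAvailable event, and generating a tone by writing samples), with the caveat that this prefixed API only existed in Firefox 4:

```js
// Reading: fires whenever a new frame buffer of raw samples is available.
var source = document.getElementsByTagName('audio')[0];
source.addEventListener('MozAudioAvailable', function (event) {
  var samples = event.frameBuffer;  // Float32Array of raw audio samples
  // ...display a waveform, run an FFT, filter, etc...
}, false);

// Writing: create an audio element in JavaScript and throw samples at it.
var output = new Audio();
output.mozSetup(1, 44100);          // one channel at 44100 Hz

var samples = new Float32Array(4096);
for (var i = 0; i < samples.length; i++) {
  samples[i] = Math.sin(2 * Math.PI * 440 * i / 44100);  // 440 Hz tone
}
output.mozWriteAudio(samples);      // plays roughly 0.1 s of the tone
```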
Where has my browser gone? Here. So, two minutes on media accessibility. We're currently focusing on getting captions, subtitles and audio descriptions working, basically. And we want to do that both as external text files and as in-band content. In-band means we get the captions or the audio descriptions as part of the media file. So, three key specifications have been made for the external text files and for the in-band content. WebVTT is the latest; it used to be called WebSRT. It's a new file format, a bit like SRT, but it does a few more things, which is why we gave it a new name. We've also created a track element to put that into the video element and the audio element (well, mostly the video element, actually), so we can put it all together in markup and actually reference the WebVTT file from our HTML page. And all of that has a JavaScript API: it's called TextTrack.

A WebVTT file looks like this. It's got a WEBVTT identifier at the beginning; we're still discussing whether we should drop the FILE part of that magic string, since WEBVTT alone would be sufficient in our eyes. Other than that, it basically looks a bit like a WebSRT file. And then we expect something like this to happen, where we can have chapters, and we can have subtitles, and we can have them display while we're watching a video, or be read out, without having to code anything additional in JavaScript. Now, this demo is a JavaScript implementation of the whole thing, so it's just a first step towards solving the problem; the next thing we want to do, obviously, is get the browser vendors to implement it natively in the browsers.

There's also some formatting possible in WebVTT. Here, for example, the alignment, the directionality, the line positioning and the positioning on the video can be controlled. That's particularly important for internationalization, and also to avoid covering things that are shown on screen, so we don't overlap interesting information in the video. And we can also do karaoke-style paint-on subtitles, with these timestamps in the middle of the cues (we call them cues), and you can then have the words appear one after the other.

Just a quick view of the track element; this is what it looks like. We throw the track element into the video. We can give it a kind, so we can tell it whether it's subtitles, captions, descriptions, chapters or metadata. The browser will do different things depending on what kind of track we have, and it will make a menu available, with English, French and German on this one for the subtitles, just as I showed before. This is a quick view of the TextTrack API, so you can also look at these things from JavaScript and manipulate them from JavaScript: activate them, deactivate them, and so on. And as a final thing, we've also got styling, a cue pseudo-selector for styling the individual cues, so we can apply the CSS in our browser to the WebVTT cues and style them the way we want.

All right. So: WebVTT format improvements, a JavaScript implementation, that's what's happening right now in the standards body, and we're hoping that the browser vendors will implement this so we can get it out as quickly as possible.
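To make that concrete, a small sketch of what a WebVTT file and the corresponding track markup could look like; the file names are invented, and the cue-setting syntax shown reflects the draft as it stood, so details may have changed:

```
WEBVTT

1
00:00:01.000 --> 00:00:04.000
Hello, and welcome to this video.

2
00:00:05.000 --> 00:00:09.000 align:start line:5%
This cue carries alignment and line
positioning settings on its timing line.
```

And it gets referenced from markup via track elements, one per language or kind:

```html
<video controls poster="middle_frame.png">
  <source src="video.mp4"  type="video/mp4">
  <source src="video.webm" type="video/webm">
  <track src="subtitles_en.vtt" kind="subtitles" srclang="en" label="English">
  <track src="subtitles_fr.vtt" kind="subtitles" srclang="fr" label="French">
  <track src="chapters_en.vtt"  kind="chapters"  srclang="en" label="Chapters">
</video>
```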
So we're getting to the end of this, and we should probably start with the questions before we give away the book. Who's got some questions to ask, some proposals to make, things they want to do with HTML5? Go ahead.

So my first question is: is your presentation just a CSS style sheet over some elements, or are you using some sort of tool, or what?

I'll show you my presentation.

What is the magic behind it?

Oops, that's my presentation. It's one index file; it's got all this stuff in it. And I stole this setup, where is it, this slides JavaScript, from somebody else. But it's really cool, which is why I'm using it all the time. And of course, in order to do all these demos, I need to run in the browser anyway, so it's all natively in the browser.

Yeah, that's really nice. My second question: I don't really follow the working group standards mailing lists or anything, but was there any sort of reaction when Google dropped H.264 MP4 support from Chrome? I know a lot of people were...

In the working group, privately or on the IRC channel of the working group, a few people went "yay, this is a good move towards an open format". But in general, people just ignored it, because it doesn't make any difference to the standards work. As long as the browsers can't all agree on a baseline codec, it doesn't make a difference. I'm actually very surprised about all the outcry that happened, because Firefox never supported it. For web development, it makes no difference whatsoever. Firefox and Opera never supported it, so you had to support two formats anyway.

I should preface this by saying that I think we run a site that's using Ogg and H.264 with the video element in production in Australia. We've run it for the last six months now; I think it's the only public website in Australia doing this, the National Gallery of Victoria.

How are you faring?

Reasonably well. The hardest thing was to convince our AV people to create both the Ogg Theoras and the H.264s, because they had this nice GUI for H.264 and we had to convince them to... well, now they're using Firefox and they're all fine. But the hardest thing was that we had to write our own custom JavaScript playlist app that would support both H.264 and Ogg, depending on which browser you're in. We wrote our own because there was none out there at the time.

Well, hopefully in the future you're only going to have to do WebM, somewhere in the future, but we'll see.

Sorry, but my question was about WebGL. Do you know if there's much work between the people working on WebGL and the W3C standards groups? Because WebGL is in a separate Khronos group. About making them compatible, so you could put video elements into WebGL content when you're rendering, are the two talking to each other?

I know it's being talked about, but I don't actually know whether we've gone down that track yet. I think it's more about getting the video element stable right now, making sure that all browsers react in exactly the same way to all of the state changes and all of the events. So I don't think we've gone down that track yet. Next one.

Your demonstration of the video sharing...

Yes.

...and your mentioning of all the devices stuff which is not yet implemented, made me wonder: in the future, is it going to be really easy to share your desktop using your browser with, say, a remote IT support person?

Yeah, I think with WebSockets and so on, we're actually going to have a possibility to do that. You have an answer to that?

No, I have a question.

I don't actually know enough about this, but I think we're moving towards that. Did you have an answer?

It's been done already.
There is a GTK library which allows you to do exactly that.

That's right, I've seen it. It's been done, but it's been done server-side somehow, I think. I don't know exactly how it works, but you should check it out; it's really cool.

And it's really clunky to use.

We have time for one more question.

Well, come to me afterwards. In the HTML5 standard, are they going to allow the audio API to grab audio from the mic and the live devices on the system?

Yes, that's what the audio device element is all about. So there's a device element, which I also describe in here. The only implementation we have of that right now is the proprietary Ericsson one, I think. And they've made a video of it, which is on YouTube; that's as much as we know about it. It's built on a WebKit branch, and it looks really awesome. They interact with the camera and the microphone through the web browser and send that to somebody else, and they can do Skype-like stuff through the web browser alone, nothing else. It's very cool. Once that's all specified (we're still specifying it) and it's all implemented, it'll be cool. But one big thing about the device element which I want to mention is that it's actually putting a lot of pressure right now on the browser vendors to come to terms with the baseline codec, because you can't do a video conference between somebody using Safari and somebody using Firefox unless you have a common codec. So this is actually a very big question right now: how are we going to solve that? It's an interesting development right now. Sorry, you were?

This was being streamed to one of the other rooms; we're testing it on-site.

Were you streaming? You're being streamed on the camera up here. I think that's being streamed in Ogg, as far as I know; not in HTML5 as such, it's being streamed in Ogg Theora, so yeah, it's available through the HTML5 video element. Ah, that's cool. We've got a side room; that's way cool.

All right. Well, as for the book: come to me and talk to me about stuff. I think we've run out of time, and I'll give it to one of you who comes up here. Thank you very much. So.