Okay, after a brief interlude, I'll hand over to Jan now, who will give us a talk about producing WebM content with GStreamer. Thanks, Jan.

Hi again. Okay, so this talk is about, as Jonathan just said, producing video in the WebM format using GStreamer-based tools, but more generically it's just about producing video with GStreamer, a sort of tour of the different tools. And the second half is about what WebM is and why people might use it. A quick note: I work for Oracle. As my employer, they don't actually pay me to do GStreamer stuff, but they do pay me, so that supports it in a sense.

So, to start with, what is WebM? WebM is the name of a free video standard that comes from Google. Specifically, it's the name they gave their container format, which is based on Matroska. I'm not quite sure why they felt the need to rebrand Matroska, because it was already pretty well known as a file container, but nevertheless. WebM serves as an umbrella term for a Matroska container inside which there is VP8 video with Vorbis audio. Sorry? [Audience: To the exclusion of all other codecs?] Right.

So, the video codec is the most interesting part. Google acquired the VP8 codec through On2, the company that originally released the VP3 codec that became Theora. From VP3 they went through a few revisions, up to VP8. What Google released was their reference encoder, their reference decoder, and a sort of half-baked spec that you can't really implement unless you look at their encoder and decoder. In fact, they're a good demonstration of why you need independent implementations of things in order to verify that your spec is complete and accurate. Because, of course, as soon as people started trying to implement their own versions of the VP8 codec, they got to the point where they couldn't go any further without referring to the source code, which is an indication that the spec is incomplete. And worse, they found that there are bugs where the reference encoder and decoder don't match the spec. But nobody had noticed, because the encoder and decoder share the same code, so they both have the same bugs: the videos come out not matching the spec, but they're decoded not matching the spec either, so it all works out. They're not as good as they could be, and Google's response was, oh well, we'll just update the spec, when really they should have fixed their encoder and decoder, because the way the spec had it was better. Nevertheless, some of the FFmpeg guys did their own implementation of a decoder, and that yielded the ffvp8 decoder, which is faster than the Google reference decoder and gives us a separate, independent implementation to verify encoder results against.

And the final point is that WebM is potentially a good choice for the web as a video format, for a bunch of reasons. It has the potential to be a decent codec, on a par with baseline H.264, for example, but free. The big drawback of H.264, of course, is that it's not a capital-F Free standard: you can't go around using it willy-nilly without starting to have to buy into patent pools. Whereas Google's claim for VP8 is that it is not patent-encumbered. Whether that's entirely true or not is yet to be seen.
But if it is true, then it gives us a free standard of decent quality, potentially better than Theora as a video standard. [Audience question, partly inaudible, about whether the spec alone was enough to implement ffvp8.] No, I don't know. I didn't know that.

The other reason I'd give for this potentially being a good choice for web video is that we may not really have a choice, with giants like Google and Apple slugging it out and deciding the browser formats. We can have our own format for Firefox, but really we're going to have to support whatever the web starts producing as video content when people start putting up large quantities of video. YouTube is probably the biggest, YouTube by default is going to go with Google's decision, and it is already implementing WebM as a format. So that's the introduction to WebM.

The next bit is about the GStreamer support for it, that is, about getting WebM support, and here are the details for a couple of distros. On Ubuntu 10.10 you need to get the PPA and add it as a separate repository; run an upgrade and it will upgrade all your GStreamer packages and pull in the VP8 libraries to give you WebM support, both production and playback. On Fedora, Fedora 14 has it, and for Fedora 12 and 13 it's in the updates. On Gentoo, of course, it was in Portage as soon as it was in GStreamer: they just bumped their pointers and everyone went and compiled for a few days. And on Debian it's in testing and unstable, but not in stable.

If you get that and run the upgrades on your GStreamer, what you get is a few new components installed in your GStreamer plugin stack. You'll probably already have vorbisenc, because it's in the base plugins that are on every distro that runs GNOME, but you will get the WebM muxer and the VP8 encoder, and you'll get the VP8 decoder, though this talk is mostly about producing content rather than playing it. So you get the encoder, and you get the specific WebM muxer, which restricts you to only using VP8 and Vorbis as the input codecs.

And then, it's a bit easier to see on the screen, there's a diagram of what it looks like when you encode WebM content with GStreamer. People who've already used GStreamer will know that it's a framework that relies on building processing pipelines for media: you plug together components, you put elements into the pipeline and connect them to describe how the data should flow, and how the data flows determines the formats it goes through. So in this case we've got some raw video coming into the VP8 encoder, being encoded there and then handed off to the muxer; the same for the audio channel; and then beyond the muxer you're writing it out to a file, so there's a filesink on the end. That's what GStreamer is doing internally, and the whole section to the left of that is left out of the diagram, because it depends on where you're getting your video from: whether you're streaming it, capturing it, or decoding it from somewhere else.
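To make that concrete, here's a minimal sketch of that same pipeline in gst-launch syntax, using GStreamer's test sources to stand in for the left-hand side of the diagram. The element names are the 0.10-era ones covered in this talk; the buffer counts and output file name are just illustrative:

    # Test video -> vp8enc -> webmmux, test audio -> vorbisenc -> webmmux,
    # and the muxer's output into a filesink.
    gst-launch-0.10 \
        videotestsrc num-buffers=300 ! ffmpegcolorspace ! vp8enc \
            ! webmmux name=mux ! filesink location=test.webm \
        audiotestsrc num-buffers=430 ! audioconvert ! vorbisenc ! mux.

Swap the test sources for a capture or decode source and you get the left-hand side of the diagram back.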
[Audience: Why is the encoder in the bad plugins?] The distinction between the different repositories for GStreamer is based on a couple of criteria. We have the GStreamer core, which is media-agnostic: it doesn't know anything about specific media, it's just the scheduling and negotiation logic. We have the base plugins, which are a set of demonstration plugins of the highest quality, with good documentation, that demonstrate how to implement other types of media handling in GStreamer. And then the majority of the media handling actually lives in the good, the bad and the ugly repositories, and which of those a plugin lives in is based on properties like whether it's patent-free, whether it's high-quality code, whether it has a maintainer, and whether it has documentation. All new plugins start in bad, and they live there until they meet the quality and documentation criteria; then they move to either good or ugly, depending on whether they're patent-encumbered or licensed in a way that prevents use with GPL apps. So vp8enc is in the bad plugins because no one has brought it up to the level needed for it to move, and probably because there's not yet enough evidence as to whether it should go to good or ugly on patent considerations.

Next I want to talk about some of the methods you can use for producing WebM video, some of the tools that are available. The first one, which Jamie demonstrated in the last talk, is Transmageddon, which has a nice simple UI and almost no configuration for you to do. You just open it and select a file that you want to transcode. So you pick a file; you can pick a preset, which you don't want to do for WebM, so you just leave that as no preset; and you select WebM as the output format, and it automatically switches to Vorbis and VP8. You hit transcode, let it go, and you end up with a file after a couple of minutes. Pretty simple.

The second option is Arista, which is also a transcoding utility and has an almost equally simple UI. You choose a file, you choose your computer as the device, you choose WebM as the format, and then you add that to the encoding queue and give it an output location and file name. Save, and it'll go and do that. The benefits of Arista: it also has a simple UI; you can enqueue multiple files at a time and it will work its way through the queue; and it gives you a live preview, which is kind of useful for seeing where it's up to in the file. A more general benefit over Transmageddon is that Arista has presets you can download from a database on the web. They're all built into Transmageddon, but with Arista, people can generate new profiles for specific devices. They could produce, for example, a YouTube-optimised preset that converts specifically to 480p WebM for upload to YouTube, and as soon as they put that in the online database, you'd get access to it. Everyone's watching the video instead of listening to me, aren't you?

PiTiVi is the third option. You already saw PiTiVi a bit in the last talk, but we didn't specifically show you how to produce WebM. It's as simple as you might hope: when you go to render a project, the default container is Ogg; you switch it to WebM, the video switches to VP8, the audio to Vorbis, and then you hit render. Keen observers might also note that this is a different render dialogue from the one Jamie demoed; that's because I'm running PiTiVi from Git, and they've updated it with a bit more detail.

My favourite, which isn't rendered too well there, is the gst-launch syntax. You can just write your own GStreamer pipeline and paste it into a terminal. I don't know why Jamie thinks that's complex.
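For the record, the pipeline on the slide looks something like the sketch below. I'm assuming an Ogg input file and the 0.10-era element names, so adjust names and locations to taste:

    # Decode anything decodebin can handle, re-encode the video as VP8 and
    # the audio as Vorbis, and mux both into a WebM file.
    gst-launch-0.10 filesrc location=input.ogv ! decodebin name=dec \
        dec. ! queue ! ffmpegcolorspace ! vp8enc \
            ! webmmux name=mux ! filesink location=output.webm \
        dec. ! queue ! audioconvert ! vorbisenc ! mux.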
You just gst-launch: you take a filesrc and tell it where the file is; you decode that with decodebin and give it a name; you run the video through a queue, through vp8enc, into webmmux, and give that a name; you take the decoder's audio, run it through a queue, through vorbisenc and into the muxer; and you send the muxer's output to a filesink. And then, you know, away it goes. The benefit of gst-launch is that you get immediate access to all of the 241 plugins and 1167 elements that are available in GStreamer for doing video effects: you can control the sample rate, do equalising on the audio, rotate the video, all those things.

So if we want... Ah, [audience question about the exclamation marks in the pipeline] because a bang only counts in a shell if it's got a character straight after it. A bang followed by a space gets passed through by the shell, so pasting this is fine; a bang with a character after it is what would break.

Next one: the GStreamer Editing Services, which I also mentioned. That's the new framework the PiTiVi guys are working on, a new module in the GStreamer repositories that gives simplified access to video transitions and video effects. It has a command line that starts to look a little more like an ffmpeg-style transcode: you can just do a GStreamer Editing Services launch and give it an input file. You say: start that file at time zero. So it's file name, cut-in location, and then the duration of the clip, and a zero for the duration means play the entire file. So effectively this command line says: take this input file and play the whole thing. I forget what the -s is... -s is smart render. That tells it: don't decode and then re-encode if you can avoid it. So the idea is that if you feed this an Ogg file, it will avoid re-encoding the Vorbis; it will just pass that through and only transcode the video portion. Then you give it an output file and tell it you want video/webm as the output format for the container, along with the video codec and the audio codec. Run that, it will go for the usual transcoding duration, and then it gives you the output file.

The nice thing about the GStreamer Editing Services, which they're still kind of getting to, is that they have this UI you can use to automatically build up the same description I just gave you on the command line. You can add a file, set the duration, set the point where you come into the file, add another file, and insert effects and transitions in between; for example, you can change which transition it is. This is a bit of a demo that they're still working on, but the idea is that you generate, like this, the set of instructions: play this file, play this much of it, do this transition, do this effect on the video. Then you can save that out as a description, and what you end up with is just a simple text file that describes the whole thing. Because it's text, you can easily edit it, change the file names, and then feed the file as input to the ges-launch command to run a specific transcode. This kind of text file would be easy to script. [Audience: Are the files in that description played in sequence or in parallel?] They're in sequence. It's like a playlist file: a playlist that also lets you specify transitions and effects. This is effectively what PiTiVi will be doing as its underlying layer now: when you do manipulations on the timeline, it will just be modifying this description that gets handed to the GStreamer Editing Services layer.
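Reconstructed as a command line, that looks roughly like the sketch below. The file, in-point, and duration arguments and the -s smart-render switch are as described above, but the exact option spellings varied across early GES releases, so treat the flags and format string here as assumptions:

    # Play all of clip.ogv (in-point 0, duration 0 = whole file), smart-render
    # where possible (-s), and write a WebM file. Flag spellings assumed.
    ges-launch-0.10 -s clip.ogv 0 0 -o output.webm -f video/webm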
Then the final method I want to talk about, and we're running really ahead of time here, is Flumotion as a WebM production method. Flumotion is a streaming server produced by Fluendo in Spain, and it's designed for live streaming tasks. They added WebM support in the 0.8.0 release, which is available now. They produce a repository for Fedora, because that's what they use for the streaming platform they run. There are no packages that I know of for Ubuntu 10.10 yet, but I assume it will come as an automatic upgrade in the next release of Ubuntu. In the meantime, I've installed it from source, which is pretty easy once you have the GStreamer PPA; Flumotion is a one-package compile.

It comes with a really nice UI, and I'll give a really simple demo for anyone who hasn't used Flumotion before. It has very advanced capabilities: you can enrol a whole bunch of computers, run a Flumotion daemon on each of them, and connect them to a centralised manager machine, and you can get all of those machines to do different tasks in your streaming platform. So you can have one machine doing the video capture, handing off across the network to another machine doing the encoding, which hands off to fifty machines acting as the HTTP front-ends, for example. It has this nice wizard, and a really simple test mode that will just run on localhost. I want to create a live stream; I'll just use the test video and test audio, which are colour bars with some text and a sine-wave tone; and then I just use WebM as the format. Use the default video and audio bitrates, give it a URL, hit finish, and it starts up all these different components, which as you can see are all running on my localhost, but they could be running on other machines across the network. [Audience: Flumotion is in Ubuntu, but I think it's 0.6.4 rather than 0.8.0.] So there you go.

Let me turn that off. It's just a simple demo stream, but I could also, for example, have created it to capture from my web camera and microphone and play that back here, though that would have produced a fairly nasty feedback effect. So that's it: Flumotion is pretty easy if you want to set it up to stream in the WebM format. You can stream on demand from files, or capture from a webcam and deliver out onto the web pretty easily.
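For reference, a from-source test setup along the lines of that demo is a sketch like this. The config path and the user/test credentials follow Flumotion's sample configuration as I remember it, so treat both as assumptions:

    # Start a manager with the sample planet config, attach one worker to it,
    # then run the admin UI and walk through the wizard as in the demo.
    flumotion-manager conf/managers/default/planet.xml &
    flumotion-worker -u user -p test &
    flumotion-admin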
My slides are back; that's all the methods I was going to talk about. So then, just some quick examples. These are not hugely amazing, because they're up on the screen at low resolution and they've already been scaled down quite significantly, but just as a quick comparison: that's a single frame from a high-definition VP8 encode, and that's the same frame in H.264. These were taken from a nice, exhaustive comparison that one of the main H.264 developers did as a technical critique of VP8 as a video encoding format; these are some of the screenshots from the tail end of it. It's really hard to see the difference there, but it's definitely there: especially in the leaves, VP8 doesn't do as good a job as the H.264 encoder. A big part of that is that their encoder has been optimised for signal-to-noise ratio instead of using any kind of perceptual criteria for video quality. They do well on signal-to-noise ratio, but when you actually look at the output, fine detail has been lost; they haven't done well at concentrating on the things that are important to a person looking at the image.

And then just a couple of videos. Here's a ten-second example. This is a bit awkward now, since I'm also using Totem as my presentation tool. This is the original video. There's not a lot of loss; there's a little bit of detail that gets lost in this particular encoding, but there's also a bit of noise around the edges where it was downscaled to begin with. So that was an example at a megabit per second, which is quite a bit for a 352x240 video; the original was uncompressed Y4M.

This one's possibly a better example. This is original HDV footage at 25 megabits per second; it's got a lot of nice detail in the feathers, and a little in the water in the background. Quite high res. And here's the equivalent video taken down to about 630 kilobits per second: it's preserved a lot of the fine detail pretty nicely. It loses some of the detail in the water, but that's okay, because you're not really looking at the water. I think that one's a pretty good example at 630 kilobits; I think it'd be a reasonable alternative to DivX, for example.

So, that's basically all the material I've prepared, so I'll just take people's questions, if there are any. [Audience: Is Flumotion using Twisted?] Flumotion, yes, is based on Twisted: it's all Python in the front end, and then GStreamer does the heavy lifting on the video content. The architecture of Flumotion is pretty well put together; they use it as the platform for Fluendo's streaming services. In Spain, Fluendo run streaming for TV stations and things; it's a professional app. I think they peaked at fifty gigabits a second or something with their streaming last year. Any other questions? Thank you very much for listening. I'll get an early mark.