Okay, so we'll not waste any more time and get Timothy Terriberry started with his talk on video codecs.

All right, thank you. I'm going to talk about the Daala video codec project, which is a joint project between the Xiph.Org Foundation and Mozilla. If you were at the keynote yesterday, you heard Eben tell us that patents are no longer a problem for free software. So, great, I'm done, I can go home.

Yeah, so the situation has gotten a lot better. I'm not going to call Eben a liar, but there are still places where these things are real problems, and codecs are one of them. He mentioned that the Open Invention Network is a fantastic tool for free software projects to defend themselves against patent aggressors, and while it does a lot of very good things, if you go read the agreement, somewhere buried deep in all of the legalese it has this nice phrase where it points out that "to the extent that any of the Linux Environment Components identified in this edition contain audio, still video and/or motion video codecs, such codecs shall be deemed not to constitute a Linux Environment Component." Which is a good example of why you should not let lawyers near English, but it also serves to illustrate that even all of the tools we have managed to build up don't help us in this space.

So why is it worth solving this problem anyway? Maybe this one codec thing is something we can just ignore and it won't be a problem, right? The rest of the space is fine. Well, the problem is that encumbered codecs are a billion-dollar toll tax on communications. Every cost that gets added to a codec, even if it's only a few cents, gets repeated a million-fold across all multimedia software. Go look at a modern cell phone: every single component of that phone is getting cheaper every year, with one exception, and that's the cost of licensing patents and licensing codecs.

All of this licensing is anti-competitive. You hear people talk about "fair, reasonable and non-discriminatory" licensing; I've had very well respected lawyers in this field tell me they don't know what that means, because it doesn't actually mean anything. The specific point of a good deal of codec patents is to create these discriminatory regimes in commodity hardware businesses. The idea is that you pay less to license the patents than your competitors, because you happen to own some patents yourself and can get cross-licensing deals, and now new people can't come into your space, because the margins on these things are so razor thin that a few extra cents is all it takes to make them unprofitable and you profitable.

They're also an excuse for adding proprietary software to your system. How many people here have Flash on their Linux system? Yeah, there are a few hands. When you start relying on proprietary software for communications, you lose the assurances that you get from using free software about the privacy of those communications. You no longer have a system that you can look at and validate and guarantee that it is actually respecting your freedoms. So it speaks to that second problem that Eben has turned his focus to as well.

You can say, well, we also have free software implementations of all these proprietary codecs too, right? We can just use those and not pay for the patents, and nobody's gotten sued over that yet, right?
The problem is that ignoring these licensing costs creates risks that can show up at any time. It works great until you actually build your business around it, sell a few million units, and then somebody shows up with their hand out and you get this tax on success. You'll see that companies like Skype, for example, famously used VP8, and VP7 before that, to build up their business, and it wasn't until they got large enough to be able to absorb these licensing costs that they actually switched over to codecs like H.264. But if they had not had the opportunity to use VP7 and VP8, they would never have gotten started.

So creating good codecs is a challenging problem. However, we don't need many of them; ideally we really only need one, and the free software ecosystem is up to the challenge of making them. If you go look at the proprietary, patented codecs, all the best implementations of them are already free software. You may not be able to use them without going and getting a patent license from someone else, but the ability of the open-source and free software community to build these things is already there.

The basic thing we have to overcome is network effects. The primary costs of deploying codecs are not necessarily the licensing costs, but the compatibility problems with everybody else. If we want to be able to use free codecs in all of our products, we have to make sure that everybody else uses them too. Where royalty-free codecs are established, non-free codecs have never displaced them; some good examples are JPEG, PNG and FLAC. There have been many different competitors that have tried to displace JPEG; none of them have even tried to charge licensing fees, all of them have claimed to be technically better than JPEG, and they still haven't succeeded, simply because JPEG is so widespread. The same is not necessarily true for royalty-bearing codecs; there are good examples of those being displaced by royalty-free versions.

But being royalty-free is not enough. Everyone in this room hopefully cares about avoiding other people's IPR, but other people out there care about different things. They say, "I can afford to buy a codec for my business; what I want is the best quality per bit, or the simplest integration into my application." So you have to be good on all of these fronts. It's not enough to say, well, we're a little bit worse on quality per bit, but we're free. To get the kind of network effects you need, so that everybody is using these things, you really have to be better in every way.

We've already done this for audio. We made a nice codec called Opus; it was published by the IETF about two years ago, and it can single-handedly replace ten different codecs across all sorts of applications and be better than all of them. That's despite the fact that when we started, we said our goal was to make a royalty-free codec covering a broad range of applications, but we were never expecting it to outperform everything. What we had originally targeted was low-latency communications; we were going to make a bunch of trade-offs to make that possible, and we assumed that meant stuff like high-latency music streaming
was something we were never going to be able to compete on. Well, it turned out we were able to compete on that, and it wasn't until we reached the quality levels where we started to beat everything that people got really interested in this codec. That illustrates the point: you really do have to be better on a whole bunch of different fronts. Opus is now the standard codec for WebRTC, and it's making lots of inroads in other places, so we really hope that in the next few years it will start to displace a lot of the proprietary audio codecs, which, by the way, are generally even more expensive to license than video.

So, having succeeded in audio, we can now turn our attention to video. The latest generation is being fought out between HEVC and VP9, and our goal is to be better than both of those without infringing HEVC's IPR.

In order to do that, we need to not only make a codec that's really good, we need to convince people that it is safe to use, and to do that we need a better strategy than "well, we went and read a lot of patents." The problem is that people don't believe you when you say you read a lot of patents. We've been doing this for, I don't know, 13 or 14 years for me personally, and some of us in the Xiph.Org Foundation for almost 20 years now, and you would think that after that point you would have earned some amount of credibility that, yes, we know how to design a codec that doesn't read on other people's patents. But when you ask somebody to actually ship your codec, they stop and think really hard for a bit. And not without good reason, because doing the kind of analysis it takes to say "there are thousands of patents in this space and we don't infringe any of them" is really error-prone. We try to stay fairly far away from the line between what's permissible and what's impermissible, but a single mistake can ruin years of development.

A good example of that is actually the H.264 Baseline profile. When they started standardizing it around 2003, they said they would like to make it royalty-free, and they got a whole bunch of companies to sign agreements saying, yes, this is going to be royalty-free, we will put our patents in if everybody else does. Only they didn't get the people who submitted the initial proposal for H.264; all they got were the people who added things on top of that, and the people who submitted the initial proposal said, no, we don't like that idea.
And so that's now not royalty-free. There have been various attempts over the years to make it royalty-free, to try to convince those other parties, and we are now 12 years later and it's still been unsuccessful. All the work people did when they thought they were making a nice royalty-free codec has been locked up behind those patents.

So our strategy is to look for elements that are common to broad classes of patents. I'm not going to go into a detailed explanation of how you work around patents; if you were in Wellington five years ago, Andrew Tridgell gave a fantastic talk on how to do just that, and I recommend you go watch the recording if you haven't. But to give a short summary: every patent claim has some long list of elements in it, and we only need to avoid one of those elements to be able to say "we don't do that," which is the best defense you can have against a patent. So what we're going to do is try to identify elements that show up again and again in many, many different patents, and replace those with fundamentally different techniques. The idea is that we don't necessarily have to have read every patent, although we will still try to do that, but if something shows up out of the woodwork some number of years from now, there's a good chance we won't read on it, simply because our design is so far outside what they were considering when they filed the patent that it does not apply.

This is a generally higher-risk, higher-reward strategy than the incremental-change approach that a lot of codec development has followed for the past 20 years. The way that works, you show up at a big meeting with a hundred different people proposing slightly incremental changes that each get you a one or two percent improvement, each one of which was of course patented by somebody's employer, and you combine all these things together and at the end of the day you say, look, our codec is fifty percent better than the previous one. You can go to that well a few times, but eventually you start to hit diminishing returns, and it gets harder and harder to find incremental changes that improve things. So we are taking the higher-risk approach of going in some fundamentally different directions, which will potentially get us out of these local minima. At the same time, it helps us avoid wide swaths of the IPR that other people have already filed. It's also sort of its own reward, in the sense that it creates new challenges that other people haven't solved. If I want to implement technique A, and technique A has some fundamental engineering challenges associated with it, and I go patent all the solutions to those challenges, then somebody else who comes along and says "I'm going to implement technique B" doesn't have the same problems, because technique B is not like technique A. There is a whole class of inventions where people say that to state the problem is to state the solution, right?
And a good patent lawyer is always trying to patent the problem rather than the solution. By using these different techniques, we are actually trying to solve different problems, and so we avoid lots of IPR that way as well. Now, all that said, we still have to read a lot of patents, because you can never be certain about any of this until you actually do your homework.

So, I said we want to do some things that are fundamentally different. We started out by identifying about four key areas where we thought there were good, credible technological alternatives, and I'm going to talk about each of these in more or less detail, depending on which one. The first is the idea of a displaced frame difference, which is used in motion compensation; if you don't know what that is, we'll talk about it in a couple of slides. The second is adaptive loop filtering, which is used to remove blocking artifacts: if you've ever seen low-quality video, you've seen block edges show up at low bit rates, and this is one of the techniques used to avoid that. The next is spatial prediction, or intra prediction; the idea there is that sometimes you have to code frames without referencing other frames, and so you want to predict one area of the frame from something nearby. And finally binary arithmetic coding, specifically context modeling; this is the way the final bits get coded into the bitstream, and we have a different way of doing that as well.

So let's talk about the displaced frame difference. The idea behind motion compensation, as traditionally done in most codecs, is that you have some reference frame that you've already decoded and you want to predict your input frame. What you do is copy blocks from the already decoded frame, offset by motion vectors that account for things moving around the frame, and when you've copied all those blocks, you subtract them from the input and you're left with a residual. The residual is now much closer to zero and has a lot less information in it; there are still some errors where you weren't able to predict things very well, but this is where a good percentage of your compression comes from. The "displaced frame difference" is the term of art for that residual. In and of itself it's not patentable; people have been using it for 30 years, so I'm sure somebody tried to patent it 30 years ago, but at this point you can say it's fairly well in the clear. But a lot of the other details, like how you code the motion vectors you're using to copy the blocks, and how you do the interpolation when a motion vector doesn't point to an integer offset, all of those details around motion compensation have many, many patents on them.
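To make the traditional pipeline concrete, here's a rough numpy sketch of block-based motion compensation and the displaced frame difference it produces. It assumes whole-pixel motion vectors, a single reference frame, and frame dimensions that are multiples of the block size; the function and parameter names are mine, not Daala's.

```python
import numpy as np

def displaced_frame_difference(ref, cur, mvs=None, block=16):
    """Classic motion compensation: copy blocks from a reference frame at
    offsets given by motion vectors, subtract the prediction from the input,
    and return the residual (the 'displaced frame difference')."""
    h, w = cur.shape                      # assumes h and w are multiples of `block`
    pred = np.zeros_like(cur)
    for by in range(0, h, block):
        for bx in range(0, w, block):
            dy, dx = mvs[by // block][bx // block] if mvs else (0, 0)
            # Clamp the source position so the copied block stays inside the frame.
            sy = min(max(by + dy, 0), h - block)
            sx = min(max(bx + dx, 0), w - block)
            pred[by:by + block, bx:bx + block] = ref[sy:sy + block, sx:sx + block]
    residual = cur.astype(np.int32) - pred.astype(np.int32)
    return pred, residual
```

A traditional codec then transforms, quantizes and entropy-codes that residual; everything that follows is about not doing that subtraction.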
But every single one of those patents, to a greater or lesser degree, says that at some point we do a subtraction, compute this displaced frame difference, and code the residual. We actually filed our own patent on what we're doing, and the technical writer the law firm assigned to write up the details added the step of subtracting the predictor and coding the residual to our diagrams and to our claims, and we had to go explain to him: no, we're not doing that. He honestly could not understand how you could have a codec that didn't do that. So I'm going to try to help you understand.

What we do instead starts with an idea called perceptual vector quantization. This is based on work we did in the Opus audio codec, with the idea behind it being to preserve energy. In audio, preserving energy refers to the kinds of audio textures you normally hear: different tones, different impulse responses, all that stuff. In video it means things like film grain and fine detail; we don't want to smooth everything out.

The way this is done is that we separate the energy, or gain, of the information we're trying to encode from its shape, which is how that energy is distributed in frequency space, across the spectrum. What that boils down to is that we have a bunch of numbers we'd like to encode; we put them in a vector, we take the magnitude of the vector, its length, and we code that separately from the direction the vector is pointing in, which is just some point on a large n-dimensional hypersphere.

This has a bunch of potential advantages. We can give each of those two components a different number of bits, so we can say "there's some energy here, but I'm not really going to tell you exactly where it is," which is very hard to do in traditional approaches. That lets us preserve some of the contrast of the image we've been fed without low-passing everything, even if we're throwing away some information in the process.

We also get activity masking for free. What activity masking says is that in regions of high contrast, we can throw away more information than we otherwise could before you notice, because the relative error introduced by the inaccuracies is smaller when the numbers we start with are bigger. The thing you need to know to decide how much you can throw away is exactly this gain, this energy, which tells you how much contrast there is. Other codecs, when they try to make these kinds of adaptive decisions about how much to throw away, have to signal it separately with extra bits; we don't have to signal anything separately, we get it for free.

It also winds up just being a better overall representation of these coefficients. If you think of something like a fade to black, you can see how having a single number representing the magnitude of the information fits that kind of change much more easily: I can just make that number smaller and the whole thing fades, whereas if I were coding each of those numbers independently, I'd have to change everything every frame.
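Here is a minimal sketch of that gain/shape split, using numpy; the function names are mine, not Daala's, and real PVQ of course quantizes both parts rather than keeping them as floats.

```python
import numpy as np

def gain_shape_split(coeffs):
    """Split a band of transform coefficients into a gain (the vector's length,
    i.e. how much energy/contrast there is) and a shape (a unit vector on the
    hypersphere, i.e. where that energy goes)."""
    x = np.asarray(coeffs, dtype=np.float64).ravel()
    gain = np.linalg.norm(x)
    shape = x / gain if gain > 0 else np.zeros_like(x)
    return gain, shape

def gain_shape_reconstruct(gain, shape):
    # The decoder rebuilds the band by scaling the (quantized) shape by the
    # (quantized) gain; the two parts can be given different bit budgets.
    return gain * np.asarray(shape)
```

Because the gain is coded explicitly, an encoder can decide how coarsely to quantize a band directly from that one number, which is the "activity masking for free" point above.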
So what does all of that have to do with displaced frame differences? The problem is that if I subtract my predictor and code a residual, I've lost this idea of the energy: the energy of my residual is completely unrelated to the energy of my original signal, so all of the advantages I just described disappear. But we really want to do prediction; as I said, this is where most of the compression benefit comes from, because it does such a great job of reducing the amount of information we need to code. So what are we going to do?

Well, you step back and ask what this prediction is actually doing. What it's really doing is changing the probability of points near the predictor. I'm not going to go through Shannon's entropy theorem and all that with you; if you haven't seen it before you can go read Wikipedia, but the gist of it is that highly probable things are cheap to code. If you're computing a displaced frame difference, "highly probable" means "near zero": after I've done the subtraction, most of my values are pretty close to zero, so I can use a very simple model where things near zero take very few bits and things far from zero take lots of bits, and that's what gives me my compression.

We can already do the same thing with the gains: they're just single numbers, so I can compute the gain of my predictor and the gain of the pixels I'm trying to encode, subtract those, and now I have a number near zero. That fits very easily into the same framework, and this is okay because we're not subtracting pixels anymore, we're subtracting the lengths of these vectors, which is something very different in terms of patent law. These details matter.
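A tiny sketch of that gain prediction, continuing the toy functions above (again, my own names and a floating-point simplification of what a real encoder would quantize):

```python
import numpy as np

def gain_delta(input_band, predictor_band):
    """Predict the gain without ever subtracting pixels: take the length of the
    input vector and the length of the predictor vector and code their
    difference, which is near zero whenever the predictor is good and is
    therefore cheap to entropy-code."""
    g_in = np.linalg.norm(np.asarray(input_band, float).ravel())
    g_pred = np.linalg.norm(np.asarray(predictor_band, float).ravel())
    return g_in - g_pred
```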
Then I have the shape to deal with, and it turns out that enumerating points on a sphere near some other arbitrary point is a really hard problem: how do I represent in the codec that the points over here are more probable? We actually tried to do this for Opus, and we spent almost five years toying around with different approaches, and none of them ever worked very well. Eventually, sadly after we had finished Opus, we came up with the idea that we could just transform the space to single out points near the predictor. You're all thinking, well, of course, that's exactly what you do; and it winds up looking a lot simpler than it sounds, so let's go through a simple example.

Imagine I have this 2D projection of my n-dimensional hypersphere, and I have this input vector. I've normalized it, so we're not talking about the gains here, we're just talking about the shapes, which means the direction of the vector is the only thing that matters. I also have a predictor, and the decoder knows what the predictor is because it computes it the same way the encoder does, so I don't have to code anything extra to tell the decoder what it is. Now I compute what's called a Householder reflection that maps the predictor onto one of the coordinate axes. If you don't know what a Householder reflection is, that's something else you can go look up on Wikipedia, but the basic idea is that it's a reflection through a plane that takes an arbitrary vector and maps it onto an axis, so all the numbers in the vector are now zero except for one of them, which makes it very simple to deal with. What we then do is compute and code an angle between the input and the prediction. That angle now represents how close you are to the predictor, so we can model the probabilities of that angle instead of trying to model the probabilities of all the individual points on the sphere. Then we code the other dimensions somehow, and I'm not going to go into the details of that "somehow," but the point is that you've taken this n-dimensional hypersphere, picked out one direction, coded an angle saying how close to the predictor you are, and you're left with an (n-1)-dimensional hypersphere, which is kind of hard to see in a 2D projection, so you've reduced the problem by one dimension.

This new parameter theta, the angle, is actually somewhat intuitive: it says how much like the predictor we are. We started out with n different numbers that were all unrelated, where you couldn't really point to any one of them and say what it means; now we've got two numbers that are relatively intuitive parameters. One tells me how much contrast I have, the other tells me how close to my predictor I am, and you can start to do intuitive things with this. When theta equals zero, that means "use the predictor exactly," so values near zero are highly probable: I expect my predictor to be good, and that should be cheap to code. All of the probability modeling of points near the predictor is grouped into this one parameter theta.

The remaining n-1 dimensions we code using vector quantization. If you attended my talk on CELT in 2009, how many people were in Hobart? Yeah, so I'll refer you to that presentation for how all of that works. But the basic idea is that we know the magnitude of the vector we're coding is the gain times the sine of theta, and we can use that extra piece of information to remove one degree of freedom and save some bits when we vector-quantize it. So at the end of the day, instead of subtracting my predictor from my input pixels, which, if you think of them as two vectors, is essentially a translation, what we're really doing is scaling the vectors and then applying a reflection.
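Here is a small numpy sketch of that geometry: reflect the predictor onto a coordinate axis, apply the same reflection to the input, and measure theta. It works on unit-norm float vectors as a simplified illustration; Daala's actual integer implementation and the subsequent vector quantization are not shown.

```python
import numpy as np

def householder_theta(x, r):
    """Map the predictor r onto a coordinate axis with a Householder reflection,
    apply the same reflection to the input x, and return the angle theta between
    input and predictor plus the reflected input. theta == 0 means 'use the
    predictor exactly', so small values should be cheap to code."""
    x = np.asarray(x, float)
    r = np.asarray(r, float)
    x = x / np.linalg.norm(x)            # shapes only: normalise both vectors
    r = r / np.linalg.norm(r)
    s = -1.0 if r[0] >= 0 else 1.0       # sign chosen to avoid cancellation
    v = r.copy()
    v[0] -= s                            # v = r - s*e0 defines the reflection plane
    reflect = lambda u: u - 2.0 * v * (v @ u) / (v @ v)
    y = reflect(x)                       # reflect(r) would be s*e0: one nonzero entry
    theta = np.arccos(np.clip(x @ r, -1.0, 1.0))
    return theta, y
```

The components of `y` other than the first are what get vector-quantized, with their magnitude known in advance to be the gain times sin(theta), which is the saved degree of freedom mentioned above.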
Whatever else you can say about anything I just explained, it is nothing like computing a displaced frame difference. And the good part is that it works, which was surprising to us, too. The graph on the left shows the peak signal-to-noise ratio of perceptual vector quantization versus scalar quantization, which is what most traditional codecs do: you quantize each number individually. In this case we're not actually turning on activity masking, and the reason is that activity masking makes PSNR worse, because PSNR is a terrible, terrible metric, but everybody uses it. We also look at various perceptual metrics, one of which is called Fast SSIM. It's less widely accepted in the video coding industry, but we find it does a good job of telling you, for example, how much texture you preserve from the original image, which is what activity masking is supposed to help with. When we turn activity masking on we get that second curve over there, which shows we are in fact doing much better. And we also look at the images and videos we've encoded, and they look better too, which is a nice thing to double-check sometimes.

So that's the displaced frame difference. There are, as I said, several other differences.

One is the idea of loop filters. A loop filter is a filter you run across the edge between two blocks to try to remove these blocking artifacts, and it is adaptive: the strength of the filter depends on the size of the difference across the edge. If there's a large difference, that was probably caused by something that was actually different in the image you were trying to code, whereas if there's a relatively small difference, that's probably just noise you added because you threw away a bunch of information when you coded it, so you want to apply a stronger filter there to smooth things out. These filters are also not invertible, in the sense that after you've applied one, there's no way to undo it. Filters like this have been used for a long time, going all the way back to H.263 and in our own codec Theora, but those were very primitive and didn't work very well, because they were trying to be extremely cheap on CPU: when those codecs were designed, 15 or 20 years ago, CPUs were not nearly as good as they are now. As CPUs got better, people designed more complex filters, and there has been an explosion of these filters that work much better than the earlier ones but also have lots more patents on them. Since those patents are all less than 20 years old, we would like to avoid that whole mess.

What we do instead is use lapped transforms. The idea of a lapped transform is that you have one of these filters running across the block edges, just like the adaptive filters, but it is non-adaptive and it is invertible, which means I can run it forwards, run it backwards, and get back exactly what I started with. Being invertible means I can run the inverse in the encoder as a pre-filter, and now it doesn't need to be adaptive, because I don't have to worry about whether there was really some difference there that I shouldn't destroy with the filter: I already inverted it on the encoder side, so there's nothing to destroy anymore.
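To make the invertibility point concrete, here's a toy lifting-style pre-filter/post-filter pair operating on the two samples that straddle a block edge. The coefficients are made up for illustration and this is not Daala's actual lapping filter, but the lifting structure means the pair is exactly invertible by construction.

```python
def prefilter_edge(a, b, p=0.4, u=0.25):
    """Encoder-side pre-filter across a block edge (a ends one block, b starts
    the next). Built from lifting steps, so it is exactly invertible for any
    choice of p and u."""
    b = b - p * a
    a = a + u * b
    return a, b

def postfilter_edge(a, b, p=0.4, u=0.25):
    """Decoder-side post-filter: undoes the lifting steps in reverse order."""
    a = a - u * b
    b = b + p * a
    return a, b

# Round-trip check: post-filtering the pre-filtered samples recovers the input.
assert all(abs(x - y) < 1e-12
           for x, y in zip(postfilter_edge(*prefilter_edge(3.0, 7.0)), (3.0, 7.0)))
```

Because the decoder can always undo the pre-filter exactly, the filter doesn't need to adapt to the content the way a deblocking loop filter does.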
This is not losing information the way that these the loop filter loses information The techniques date back to the mid 90s, which is just about perfect for us There are some patents in this space, but they're vastly fewer of them and vastly they're all much older Than the loop filter patents and so it's a lot easier for us to go through and analyze these things and figure out whether or not this is okay and We've had people come up to us and say oh, but there are patents on those things. You can't use that And we said go go look and they come back to us later and say oh, yeah, okay. That's fine And that makes me feel a lot better But the idea of this pre-filter is That what it's really doing is taking your image and making it blocky Right, and so this is This actually winds up helping compression in the sense of these these discontinuities across these block boundaries are Essentially free to code right when you're coding things a block at a time It doesn't matter what that there's a difference there and So we are taking advantage of that that extra freedom by making things more blocking in the encoder so that on the decoder side we can undo all of that and Get a blocking free image that has taken advantage of the fact that we could code blocks in the interior So one of the problems is that once we start using lap transforms This problem becomes harder and so this tool is called and said spatial or inter prediction And the idea is that you want to predict some block from the neighbors nearby that you have already coded and So the way this normally works is you explicitly code a direction that says I'm going to copy some my neighboring pixels along this direction and Then just extend that boundary Into the current block along that direction just by copying whatever pixels are on our inner near your neighbors, right? Well, that doesn't work with lapping because we can't pop copy the pixels until we undo the lapping But we can't undo the lapping until we've copied the pixels So what we do instead is we don't copy pixels. We copy transform coefficients and We've spent a whole lot of time trying to figure out how to do this and Still copy things in these arbitrary diagonal directions and there's a lot of fun math you can do that to try to to solve that problem and Ultimately, we couldn't make any of it work very well We can make it work fine if you had infinite CPU, but few people have that So currently what we're doing is something that is much simpler. We just copy things in the horizontal and vertical direction and For the chroma planes, which is the color information. We copy that from the the black and white lumen information Which basically means we don't have to care about coding a direction for it because we're trying to say is that you know the boundaries of different colored objects is are In the same place as the boundaries in black and white objects that you know that they're representing when you split up the signal that way So all of this prediction is not as good as copying pixels around But we make up for it in other places in particular the lap transforms give us an improved coding performance You know that we give some back when we do this inter prediction in a slightly less optimal way Finally there's binary arithmetic coding So arithmetic coding is the thing that you use to actually generate a number of bits based on the probability of some Simplier in coding. 
Finally, there's binary arithmetic coding. Arithmetic coding is the thing you use to actually generate bits from the probability of each symbol you're coding; it's the process that makes higher-probability things cheaper to code. What traditional video codecs do is code only binary decisions. That makes it very cheap to code one symbol, because you only have to consider "am I coding a zero or a one," and you use different numbers of bits to signal zero or one depending on the probabilities; those decisions are easy to make. However, to code an entire video frame you need to code lots of symbols, and this is an inherently serial process: you can't parallelize any of it. It actually winds up being one of the bottlenecks in hardware implementations of modern codecs: you need a clock rate high enough to push every one of those symbols through your arithmetic coder.

The idea of arithmetic coding, and binary arithmetic coding in particular, is not in itself patentable any more; the original IBM patents from 1979 have all expired. But the probability modeling around it, and the ideas for taking the non-binary values you want to encode and chopping them up into individual binary decisions, are still heavily patented, and that's all stuff we would like to avoid. It turns out the original definition of arithmetic coding was not restricted to coding binary decisions, so we'll just remove that restriction. We code values with up to 16 different possibilities, which, if you think about it, is equivalent to coding up to four binary decisions simultaneously. It's slightly more expensive than doing things the binary way, but it's not four times more expensive, because a lot of the computational overhead of running the arithmetic coder is per symbol you encode, not per bit that comes out the other end. What that effectively gives us is four-way parallelism, which is good when you're trying to make the kind of hardware implementations that can now run at a clock rate potentially four times lower, which means less power.

Now, we can't do probability modeling in the same way, because we can't model 16 different probabilities with one byte. Maybe we could, but it wouldn't be very informative. Instead, we do things like assume the distribution over those 16 values is roughly a Laplacian or exponential distribution, and then use a single value, the expected value (the average of those points), to model the whole distribution. That keeps the amount of state we need to store relatively small. We have to do a little extra computation to derive the individual probabilities from the shape of that distribution, but that isn't too expensive. The other nice thing is that we're not doing this binarization: we're not converting things into individual binary decisions, we're coding them directly, essentially as hex digits.
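Here is a small sketch of that kind of distribution modeling: build the probabilities for a 16-valued symbol from a single parameter, the expected value, assuming a roughly exponential shape. This is my own toy parameterisation for illustration, not Daala's actual tables.

```python
import numpy as np

def exponential_pmf(expected, nsyms=16):
    """Model P(symbol = k) for k in [0, nsyms) as a geometric/exponential decay
    whose mean matches `expected`. One stored statistic (a running average) is
    enough to describe the whole 16-entry distribution."""
    expected = max(float(expected), 1e-3)
    r = expected / (1.0 + expected)          # decay rate of a geometric law with that mean
    p = (1.0 - r) * r ** np.arange(nsyms)
    return p / p.sum()                       # renormalise after truncating to nsyms symbols

# A symbol coded with these probabilities costs about -log2(p[k]) bits, and a
# multisymbol arithmetic coder consumes one such symbol per step instead of one
# binary decision per step.
```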
So all the patents around converting large numbers into individual binary decisions basically don't apply, because we're doing something different. We also often take a bunch of binary decisions that would have been coded one at a time with a binary arithmetic coder and group them into one symbol; that's where we get some of our parallelism, and people doing binary arithmetic coding don't do that at all.

There are many more differences between our codec and what traditional codecs do; I don't have time to go through all of them, but I do want to give you a brief summary of how we're doing so far. This project is not done, we still have at least a year of development left. But currently, using one of our perceptual metrics, PSNR-HVS-M (HVS stands for "human visual system," so this is something that tries to be a little smarter than plain PSNR), run over 19 different sequences, the red curve there is us and the other two curves are x264 and x265, and you can see that above bit rates of about half a bit per pixel we're actually already winning according to this metric. We find this particular metric generally tells you how well you do at coding clean edges. We have another metric I mentioned earlier, Fast SSIM, which tells you how good a job you do at preserving textures, and because we have features like activity masking built in, we do a fantastic job there: you can see we're winning at almost all rates. The curious thing is that if you look at the blue and green curves, you can see the relative positions of x264 and x265 flip between these two metrics. One reason for that is that the x264 developers spent a lot of effort trying to preserve things like film noise and preserve energy, given the tools their codec had, so they're currently doing a better job of that than the x265 people. I expect that will change as x265 becomes more mature, but it's an interesting observation.

We have a website called Are We Compressed Yet, where we will run all these metrics on any git commit. If you want to go hack on a repository, we will add your personal git repository and you can run them on your changes. This is all run off a node.js server on AWS, so you get results back within a few minutes, with some nice detail.
It was all put together by one of our interns this summer. If you would like more information about how Daala works, we have lots and lots of demo pages giving technical details. Some of them are a little outdated; for example, the part on frequency-domain intra prediction describes the stuff we eventually decided didn't really work, but the ones near the end are certainly relevant to the current design. And since I work for a browser company, we actually have the entire codec implemented in JavaScript, built with Emscripten. So this is the decoder running in JavaScript; we haven't done any special optimization for it, and you can see it doesn't quite hit 24 frames per second on this laptop, but there is a lot of work we can do to make it faster, and there are new developments coming in the browser, like SIMD.js, which will let us use SIMD operations to accelerate this kind of thing. This is something you can run yourself in your own browser by visiting the URL at the top there.

So that's the end of my talk. Are there any questions? Yes? They'll go around with microphones; that saves me from having to repeat the question, which I forget to do all the time.

Q: So, I do work at Google, though I don't actually know much about what we're doing with codecs beyond WebM and VP9, so I was curious. Obviously your codec is supposed to be better than the existing ones, so that's one reason you're doing this. But apart from that, patent-wise, is VP9/WebM an issue as far as you're concerned, or is it just that you wanted to do something new? And has Google offered to help you? Because after all, I think Google just wants a free codec too.

A: Right, so yes, Google does want free codecs. For VP9, their development strategy has been very much the traditional incremental one: they did a bunch of incremental improvements over VP8. We at Mozilla don't have an issue with believing that that stuff is royalty-free, but the problem is that the rest of the world is not so sure, given that there were lawsuits against VP8 in Germany, all of which, as far as I could tell, were completely bunk. But that's the sort of thing that gives people pause, and this is what I mean when I say you not only have to do the work to make sure you're royalty-free, you have to convince everybody else of it.

Q: Yeah, okay, so I get that part. And then you're also hoping to be better, while hopefully being more trustworthy for people.

A: Yeah. So my understanding is that Google has started work on VP10. We have been talking with them about trying to collaborate in the IETF to come up with some kind of joint standard, and I'm hopeful that that will get started as a project sometime in the next month or so. There's certainly a lot of interesting stuff they did in VP9, and that they have on the table for VP10, that we would like to be able to use, so I look forward to seeing contributions from them.

Q: Okay, and are they interested in putting your codec into Chrome, Android, and all those platforms, or haven't you had time to tell?

A: Sorry, would they be interested in adding our codec to Android, the Chrome browser, and all that? Yeah, they'll be interested in that when it's done.
Q: Well, you're not there yet.

A: Yeah, and there isn't a lot of point in getting it shipped in all these places until people can actually use it, in the sense that we change it every day. Something deployed in JavaScript like this makes some amount of sense, because at least you're the one shipping the codec and you can make sure it will always work. But until we have a stable version, it doesn't make sense to put it natively into the browser.

Q: Thank you. You mentioned something about hardware implementations. Do you already have some sort of beta hardware implementation for this, and do you have power envelope estimates for how it compares with the existing ones?

A: Right, so we have not built our own hardware implementation. We have paid a lot of attention to the hardware design to make sure that doing such a thing is feasible and will fit into a reasonable amount of complexity. We've prototyped some things; we had someone show up and basically implement the transform stages on an FPGA and start to do some of that work for us. We do not have a complete hardware codec implementation, and we're probably not likely to produce one ourselves, just for reasons of resourcing. If someone wants to show up and make one, that would be great.

Q: This work is all really interesting; who pays for it?

A: The bulk of the core developers are funded by Mozilla, so we have a team of about six people right now, plus one manager. We also have a varied group of volunteers who contribute things as well; the hardware implementation of the transforms I mentioned was done by a volunteer, and we've since hired him.

Q: You mentioned the Opus audio codec, which I haven't much experience with, but it looked quite impressive, except at the very low bit rate end. I was just wondering whether the work you're doing on Daala is going to feed back into a new, improved audio codec in the future?

A: Right. Honestly, at this point I think Opus is done. It took a lot of effort to get a standard through the IETF, and I don't think we're going to go back and revise it. There is still an opportunity to write better Opus encoders to help bring up the performance at the low end, but if you actually look at the graph there, if I can find my mouse, that's at around eight kilobits per second. If you're talking on the internet, with RTP on top of UDP on top of IP, you're talking about 40 bytes of headers on 50 packets a second, which is another 16 kilobits per second, so at that point you're spending twice as many bits on network headers as on the actual audio. So we weren't really worried about that small dip at that end of the curve.

Q: Yes, but GSM is dominated by people who are very, very interested in selling patented codecs.

A: So I suspect that if we get into GSM, it will be because of the network effects of everybody else using Opus, not because they want to use Opus.

Q: Just on that: I think 3GPP is coming out with a new wideband codec because they don't want to use Opus, so I think that kind of answers the question. Even those people don't really care about low bit rates anymore, and they are not going to use Opus. Probably what they're going to have will be worse, but they want to have patents.

Okay, we don't have any other questions.
We do have a bit more time for some. Okay, then: I'd like to ask you to accept this gift from Linux Australia. Thank you.