So yeah, my name is Nathan Egge. I'm from Mozilla, and today I'll be talking to you about the Daala video codec project, what we are affectionately calling the next, next-generation video codec. I'll start by giving the motivation for why free codecs matter. When we talk about free, we're talking about control, not cost. The idea is that you should be able to do anything with this video codec that you want, apply it to any application you want, and not have to ask permission from anybody. If you look at the current video codecs, they're a billion-dollar toll tax on communication tools. What this means is that for every cell phone that has an audio or video codec in it, there's a small cost associated with licensing that codec. If you look at the price of the components of that device over time, they all go down, but the cost of licensing the codec stays roughly constant. And of course, as these devices are sold everywhere, that cost is multiplied a million-fold. So there's a heavy cost there. If you look at the licensing terms on these codecs, they're really used as a kind of competitive weaponry. Commodity hardware manufacturers wield these weapons by holding patents on some portions of a codec and having reciprocal licenses with other hardware vendors. What this means is that new entrants into the market, who don't have any of those advantages, have to pay a little bit more, those few cents per device, and that makes them uncompetitive and unprofitable. So this is a tool used to keep competitors out of the market. And finally, the success of the internet is based on not having to ask permission. Having to license a codec is already burdensome, and if your idea has to begin from that point, it may be a non-starter.
And of course, you usually don't have to beg for forgiveness either. I work at Mozilla, and we ship a browser you might have heard of. For many distros, we do this through a volunteer network of distributors: people who run FTP sites, who host the source code, or who host binaries. If you're a small open source project using a codec and distributing in the same fashion, we all have the same problem: we can't count how many people are using our product. Many of these licenses carry a per-user cost, and even keeping track of the number of users is burdensome; you can't possibly even begin to pay that license cost. So a lot of people just ignore these costs, and for small projects you're perfectly fine doing so, under the assumption that you're too small to be sued. But what happens is that once you're successful, you become a target. A famous case is Skype, which started by using VP7 and VP8, and then when they became larger and successful, they had the revenue to license other codecs like H.264. So there's a tax on success that shows up when you use codecs without being mindful of the licenses. And finally, the cost of the license really isn't the largest cost in deployment; the largest cost is compatibility. Think about the lifecycle of a product for some hardware manufacturers: they might spend orders of magnitude more just ensuring compatibility across all their devices and all their deployments, so the licensing fee is really not so much a concern. You might have seen the webcomic: you think, well, okay, we have all these codecs, we should just make a new codec that covers all these use cases, and then we can get rid of all these compatibility issues. And now you have one more codec that you have to be compatible with. So developing a new codec just to fix compatibility is kind of missing the point.
Compatibility alone, though, is kind of missing the point here; there are other really good reasons, and these are mostly around cost and licensing. You can't license an encumbered codec if there's no acceptable license. For certain video codecs, if you call them up to license the codec, they'll give you option A or option B, and maybe neither one of those applies to you, so there's no acceptable way for you to even get a license. An example of this: for H.264, if you are deploying to a large group of people on the internet, there is a license cap, so you can just pay the cap and not have to worry about counting the number of people. But for other codecs, like AAC, there's no cap, so you're back to the same kind of problem: there's no license that works for your distribution model. In some cases, building a new codec may then be cheaper than the licensing terms. The Daala development team within Mozilla costs far less than the license cap for H.264, and we're not sure what the licensing will look like for H.265 exactly. So it makes sense for Mozilla, which has this distribution problem, to invest in the development of a free codec that everybody can then use. And of course, this adversarial licensing is a huge risk in a competitive market. FRAND is often none of fair, reasonable, or nondiscriminatory. There was an FTC hearing in June of 2011 where the intellectual property advisor for a large networking company said that FRAND meant she had to call and sign an NDA before she could get licensing terms. And if she's under NDA, she can't talk to other people about the licensing terms for the same technologies. How is this possibly reasonable and nondiscriminatory? All right, so what we're trying to do with Daala is change the competitive market here. Creating good codecs is not an easy problem, but we really don't need that many; we really just need one. And many of the best implementations are already free software.
If you look at the open source community, many of the best implementations of these commercial codecs are already open source software. You can use them; if you have any kind of deployment at scale, you may have to go and license the appropriate patents, but the implementations are already out there, and they work great. And then, of course, network effects decide the market. There has not been a case where a royalty-free codec has taken over in a particular niche and then been displaced by a royalty-bearing codec. Take JPEG: JPEG is a pretty old image standard. Lots of new image standards have come out; some of them have been patent-encumbered and perhaps offer better performance in some very specific cases, but nobody's been able to displace JPEG. But being royalty-free is not enough. Different people care about different things. There are some in the video space who care about the cost per bit, the compression of the codec in terms of bits per pixel, but don't care about price: they'll pay anything, they just want the very best codec, whatever the licensing terms. There are people in the mobile space who say, well, I need the best performance per watt. There are people with other needs. So in order to really win this game, you have to be royalty-free and you have to be good on all fronts. At Xiph, we shipped other codecs: we shipped Theora, and we shipped Vorbis. These were not best in class for those use cases at that time, and they didn't see great adoption, even though they were royalty-free and in some cases significantly better. So what we did at Xiph was make Opus. Opus is an example of a royalty-free audio codec that we did at the IETF standards body, and Opus is better almost across the board for every use case.
And it basically made something like 10 other codecs obsolete. You could just use Opus in one place and handle all use cases, from very low-bitrate voice communication all the way up to low-latency, high-quality, stereo, high-bit-depth music. So what we want to do is the same sort of thing. But the strategy is essential here; these things are necessary for us to be successful in deploying a royalty-free video codec. We need to design alternatives to avoid the worst patent thickets. It's not enough to just avoid existing known patents by working around them; we have to actually have a compelling story that we can tell people about why we're royalty-free. Just saying, well, we read these patents and we navigated them, isn't sufficient, because the people you're talking to, who may use your technology, aren't going to also read those patents; they don't want to invest the time in that. So we have to have a compelling story. We'll read and analyze patents and publish the results. The usual advice is that you should not publish your patent analysis, because it gives your competitors a blueprint of how you might defend yourself in, say, a patent court. But the point is that when we analyze and publish these results, we can then defend ourselves against IPR claims, as we did with Opus, by simply pointing at specific parts of these defenses and saying: your claims to this technique we're using do not apply, for this specific reason. We're not actually giving away any patent defense as a result; we end up with a very defensible statement. And then, of course, we're going to patent new technology that we develop. We've done this already with Daala. The idea is that by patenting specific technologies, we can then go to other partners in the industry and get them to listen to our claims as to why we believe we're royalty-free.
Until Opus had patents, it was very difficult to have that conversation. But towards the end of the Opus development, we filed some patents and were then able to speak with industry partners, because we could take those patents and use them to grant reciprocal licenses. So the fourth bullet point here, which is what we did with Opus, is that we partnered with other people in the industry and granted a license on reciprocal terms, which meant that you could use our patents for a deployment of Opus so long as you did not sue anybody else who was deploying Opus over those patents. If you did go after someone else who was deploying Opus, you would lose that defense, and any of the other partners deploying Opus who wanted to sue you then had the right to do so without losing their license. So this became a lot like the GPL, in a sense: a technique to encourage good behavior. And then finally, we're targeting the next generation. We're not targeting H.265; we're looking past that. The codec development cycle is a pretty long cycle, and we believe that to be competitive, we need to take the time to actually develop something that is significantly better than H.265, because they've already come to market. If we were to deploy something that was merely equal to that, they might have an advantage in the hardware space or in other spaces. So we want to be maybe 30% to 50% better than what H.265 is doing. And finally, we have to document all of this and make it abundantly clear that Daala is royalty-free, that Daala is better in all these use cases, and let people know what we're all about. All right. There are other parts of this strategy that are going to be very difficult. We have to be the best in all cases: best in compression per bit and in bits per watt for the mobile case. We have to be good for archival use cases. We have to be good for streaming.
We have to be good for real-time communication. We have to be able to speak with our competitors and our critics in the other camps and get them on board with what we're doing. There's a huge number of developers who work on royalty-bearing codecs, and there's great mind share there, and we want to encourage those people to contribute parts of their technology to the codec development process, knowing that they can get the benefit of using it on a royalty-free basis once it becomes available, say, after the next generation of codecs. One of the strategies that worked great with Opus was that we found a niche that was not currently covered by existing audio codecs, and we developed a strong use case around it. For low-latency, high-quality audio, there was nothing in that space, and Opus filled that niche. Until we started showing that we were very successful there, we couldn't get other people interested in Opus; but once we showed some success, everybody realized this was going to be something they could deploy. And then finally, the biggest problem is that the Daala development team is 10 people. We're not in a position to develop our own hardware. So what we'd like to do is create technology in a way that shows it's compelling, but is also something that other people will want to pick up and can easily convert into a hardware implementation. Some of the things we did with Opus that worked really well, which we're going to try to do with Daala: we're going to do all of our work in a public process at a recognized standards body with a strong IPR disclosure policy. The work on Opus was done at the IETF.
And in the IETF, there is a strong IPR disclosure policy, where anybody who shows up to contribute, or makes any comment towards the process of developing a standard, is required to disclose any patents they have or may know about that read on that standard. And in that specific disclosure, they're required to give the patent number. There's nothing in the IETF policy that says you must not use patent-encumbered ideas or technologies, but because they give us a specific patent number, we can evaluate that patent and say, well, we do or do not agree that this IPR reads on what we're doing. And if we believe it does, we can work around it: we can opt to use some other technology or find a different way of doing things. That worked very well with Opus. We're also going to question all of the assumptions around the conventional structure of a video codec. Basically, the Daala research project is a high-risk, high-reward approach. We're going to try new and radical techniques, with the idea that some of them should give us performance gains above and beyond what you get with traditional video techniques. The way H.264 and H.265 and VP8 and VP9 have been developed, they're sort of incremental improvements: you take an existing technique and you say, well, now we have more CPU budget, what can we do that's different? And maybe you'll refine that technique a little bit, and by applying a little more computational power, you'll get better motion vectors or better entropy coding. We're going to question all of that and see if we can find something else. We're also going to find applications where high flexibility is essential. We've targeted Daala at real-time communication. This is in line with the work on Opus, where the IETF adopted Opus as the mandatory-to-implement audio codec.
We would like to develop a video codec that fits that niche for real-time communication, with the hope that it will eventually get used at the IETF for WebRTC. And then finally, the process at MPEG uses PSNR to select the features they include in their codecs, and PSNR doesn't actually correlate well with what people perceive as video quality. So we will actually look at the videos and choose our techniques according to what gives better visual performance, rather than just an arbitrary metric. Okay, so very quickly, I'm going to give you an overview of how video codecs work. There are four main parts that pretty much all codecs have: prediction, considering what you already know about a scene when coding the current scene; transformation, rearranging the data into a more compact form; quantization, lowering the resolution of the transformed data; and then entropy coding. For prediction, there are two kinds in video codecs. There's intra prediction, where you predict portions of the current frame from already-decompressed portions of that same frame; lower in the frame, you can use references from above. And there's inter prediction, where you use decoded previous frames to predict the next frame. Here you can see, for this current frame, we've constructed a reference frame from the previous frames and subtracted it, and what's left is the residual. There's significantly less information in the residual, and that's how most video codecs get the bulk of their compression. Then transformation: most codecs use a 2D DCT, which takes spatial-domain image information, the image pixels, transforms it into a sparser domain, keeps just the largest coefficients, and codes those. This also gives great compression, and is responsible for some of the blocky edges you see with codecs like JPEG. And then the last two parts: quantization and coding.
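To make the transform stage concrete, here is a minimal sketch of the 2D DCT-II on an 8x8 block of pixels. The block size and every name here are mine for illustration; real codecs use fast factorized integer transforms, not this naive quadruple loop. The point it demonstrates is the energy compaction: a smooth block concentrates almost everything into a few low-frequency coefficients.

```python
import math

def dct2_8x8(block):
    """Naive, unoptimized 2D DCT-II of an 8x8 block (for illustration only)."""
    N = 8
    def c(k):  # orthonormal scaling factors
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

# A perfectly flat block: all the energy lands in the DC coefficient, and
# every other coefficient is (numerically) zero. That sparsity is what the
# quantization and entropy-coding stages exploit.
flat = [[128] * 8 for _ in range(8)]
coeffs = dct2_8x8(flat)
print(round(coeffs[0][0]))  # DC term: 128 * 8 = 1024
```

A real image block is not flat, of course, but natural image content decays quickly toward the high-frequency corner, which is why keeping only the largest coefficients loses so little.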
Quantization is where we take those transform coefficients and reduce the number of bits used to represent them, and then run the result through an entropy coder that converts it into a bitstream that is efficient given the probability distribution of the symbols. In Daala, we're going to do things differently in all of those areas. Instead of just a DCT, we apply a lapped transform, a technology from about 20 years ago, the early 90s, that was abandoned because of its computational cost; now that we've got faster computers, it's tractable. It also requires us to develop a bunch of other new techniques, because intra and inter prediction don't work with lapped transforms in exactly the same way, so we've had to innovate in that area. We're going to do multi-symbol arithmetic coding, where most existing codecs do binary arithmetic coding, context-adaptive binary arithmetic coding. We use a great technique we borrowed from Opus called perceptual vector quantization; if you were at FOMS yesterday, Jean-Marc gave a great talk describing that technique and why it's going to work really well for us. There are some interesting ideas around prediction where you no longer take the difference between frames, and in doing so, that removes our exposure to a number of patents that begin by saying: take the difference of two frames and then do this additional processing. We also do Chroma from Luma prediction, which is different from other codecs; Tim Terriberry will be talking about overlapped block motion compensation next month at a conference; and we do clever time-frequency resolution switching. All of these are new techniques not currently used for video coding, and we're going to use them in Daala to have a believable story for being royalty-free.
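As a sketch of the quantization step described above, here is the dead-simple uniform scalar version: divide by a step size and round. The step size and the coefficient values are made up for illustration, and Daala's perceptual vector quantization works quite differently; this only shows why quantization creates the long zero runs that make entropy coding effective.

```python
def quantize(coeffs, q):
    """Uniform scalar quantization: divide by the step size q and round.
    A larger q gives fewer distinct levels (fewer bits) but more distortion."""
    return [int(round(c / q)) for c in coeffs]

def dequantize(levels, q):
    """The decoder's inverse: multiply back. The rounding error is lost for good."""
    return [lev * q for lev in levels]

# Hypothetical transform coefficients: a big DC term, rapidly decaying AC terms.
coeffs = [1024.0, -37.2, 12.9, 4.1, -1.8, 0.6, 0.2, -0.1]
levels = quantize(coeffs, q=16)
print(levels)  # [64, -2, 1, 0, 0, 0, 0, 0]
# The run of trailing zeros is exactly what the entropy coder then squeezes
# down to a handful of bits.
recon = dequantize(levels, q=16)
```

This is also where rate control lives in practice: the encoder trades quality against bitrate almost entirely by moving that one step size up and down.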
If you'd like to follow the work we're doing, I've got a link here to some of the demos we've put together. These are all online, and I think the slides, which have the link, are online too. I encourage you to take a look if you have any interest in those specific coding techniques. This is a chart we made that describes our progress over the previous year. What you're looking at here: the red line is what H.265 does on a certain video set as of, I believe, November 20th of last year, and these other lines are the Daala code base over time. So we're making progress. There's been additional work, actually: Jean-Marc landed code yesterday that gave another 5% improvement at low rates. So we're moving closer to that red line of H.265, and we believe there are many more techniques we can apply that will get us closer still. And of course, the techniques I talked about before are really not the end of it; there's new and innovative work being done in this space. For example, there's a very interesting technique that lets you take a center frame that is a composite of two other images and separate them. This is the result of doing sparsity-based prediction, where you're actually able to separate two frames that were overlaid, and it has a computational cost that may become tractable in the next few years. So there are new ideas that haven't yet been deployed that we believe we can get a lot of gains from. So what does the road ahead look like? As that slide showed, we've been making progress with these new techniques. We've had to innovate and find new ways to work around problems introduced by using lapped transforms, but there's still a long way to go. The industry is currently looking at deploying H.265, HEVC, and VP9, so they're not really focused on the next next generation.
They've got a lot of work ahead of them, but we'd like to take that opportunity to innovate and make Daala something that will be competitive. And we'd love to get help from people. In particular, we're looking for application domains where there's some novel use case that isn't currently covered by video codecs. If there's something you think would be interesting, maybe around video conferencing or any of those areas, please talk to us and let us know; we'd like to accommodate that. All right, so I'm open for questions. Thank you very much.

One or two questions. "Are you going to propose Daala to a standards body when it's finished?" Absolutely. So to repeat the question: will we be proposing Daala to a standards body at completion? The answer is that we're actually talking with the IETF about forming a working group, hopefully soon, to do development of video codecs in a public process. So even before we've gotten everything finalized, we'd love to engage a standards body, maybe bring in other partners to give us contributions, maybe involve the community, and get that done long before we're complete.

Right, so to repeat the next question: how does the CPU usage of Daala compare to H.265? We run benchmarks against H.265, and currently we run significantly faster than their reference model. Our goal is to develop not just a reference model but also production-quality code that we can release under the Xiph/Mozilla heading, something that could actually be used to do this in software at a usable performance level from day one. We're not going to just have ideas and hope that CPU performance will improve over time; we're designing techniques with this in mind.
Our entropy coder, for example, is designed to work really well with SIMD. No, the techniques we're using have not been finalized, and there's a lot of optimization we have not done. x264 has gone through and done assembly versions of some of the transforms and the motion search, and we haven't done anything like that at all, so there's still lots of room for additional performance. Thank you very much. Okay, thank you. You did fill the room, so maybe next year we need a bigger room. Hopefully. All right, so the next talk will be...