Hi, yeah, much better. So I've been on a quest these past few years to try to replace some C code, and this is what I want to talk to you about. And also that Macs are really shitty platforms. OK, yeah, technical issues.
So my name is Geoffroy. I'm a freelancer. I work at Clever Cloud, which is a platform-as-a-service company. I'm really happy to say we have Rust in production right now that I did not write. Some colleagues just said, "We have Bash. We don't like Bash. Let's write some Rust." And then they came to talk to me: "OK, we're replacing some Bash and we have issues."
So what I want to talk to you about today is another French project that you may have heard about, the one with the cone. VLC media player is a very interesting project. The basic idea is that whatever the format, whatever the network protocol you throw at it, it should just work. The basic usage is you get a video file, you drag and drop it on the software, and it should just play. Most of the time it works, but it comes with a catch. We get a lot of issues, a lot of vulnerabilities caused by buffer overflows, that kind of thing: memory leaks, crashes, double frees, anything. So very common issues that we would like to solve. They come from the fact that we write everything in C, because it's old software and there was no good alternative at the time. Every format is parsed manually, with handwritten parsers in C and manual memory management, and the formats are all very weird. The other thing in video is that everybody has an opinion on how a video format should be, and they're all wrong.
No, seriously, you have specifications that are very unclear, that make some choices that can appear dubious because the one who wrote the specification just wanted to reuse some old code from before. Then people interpret them in different wrong ways, and you have multiple competing implementations of the same format, and you have to support all of them, because there's that one big Adobe tool that makes those videos and does it in a bad way, and there's this other one, and you have to play everything. So you have lots of different cases, memory stuff, and it's all in C, and it's a nightmare.
So about two years ago, I set out to find some way to fix that. I had some requirements. It should be easy to write the parser correctly, because it's a nightmare to maintain that kind of code, it's a nightmare to test, and there should be an easier way. It should be memory safe. It should be easy to embed in C. One of my first suggestions was, "OK, let's do some Haskell," and then they took a look at the runtime and said, "OK, no way, no garbage collection. We don't want that inside VLC."
OK, so maybe with that you see where I'm going. There's this nice little language that I started testing two years ago. It was really nice. There were all of those tildes and at signs everywhere, and the compiler was breaking everything every two weeks. It was a really, really nice language, and it really kept me on my toes because I had to rewrite my code all the time. I'm not giving shit to the language team. I know it had to evolve, it has come a long way since that time, and it's really amazing work.
Just on the side, I did something fun that maybe some of you saw at the time. It was a project called RustFix, which was basically a small Python script that I could point at a GitHub project. It would clone the project and try to build it.
If it did not build with the latest compiler version, it tried to fix the code with some very, very ugly regexes, and then pushed and sent a pull request. It was some of the ugliest code I have had to write, but it was very fun to do because people got pull requests and went, "Wait, wait, how did you do that so fast?"
So with that, we now have a language we can use for writing some code in VLC that's memory safe, but is it enough to write parsers correctly? A few of you may have already written parsers manually in Rust, and that is still not very easy. So I started to work on another little project that's called nom. I called it nom because there are a lot of great puns I can make, like "it eats your data byte by byte." Yeah, it's very bad, sorry.
So nom is a parser combinators library. Parser combinators are just a technique based on simple deterministic functions: a function gets an input and generates an output. That's it. And you combine them, like you apply one and then another one, or you alternate between different ones until you get one that parses correctly. And it's written with macros, lots of macros, very, very large macros. Why? Because in 2014 I tried to do stuff that would have required the impl Trait that we're getting right now, and it was a mess. So I said, let's write macros, it should be a good idea. And basically it works.
So the design is, as I said, very simple. We have functions that take an input and return an output type that's an IResult. The IResult can be Incomplete, to say, "OK, we need more data." It can be an Error, with additional information like which part of the input caused the error. Or you can have Done, which will return the output value and the remaining part of the input. That way you can just get the Done part of the IResult and continue from the remaining input. It's a very, very simple design. And it uses macros. OK, so this one is simple enough.
We have named!, which just creates a function, and we have terminated!, which is a combinator that takes two parsers, applies them one after the other, and takes the result from the first. So the alpha parser will recognize alphabetic characters and return the recognized string, but it must be terminated by digits. OK. When it's generated it's a bit hairy but simple enough: you have a function with the signature we saw before, and it just matches the result of that parser on that input. You get a result, you get the remaining input, you match on that and then again on the second parser, and I return Done with the remaining input and the output of the first. Most of nom's code is just matches everywhere. It's very, very simple, very dumb code. I chose to really make it easy to hack, easy to understand. But in fact it's not generated like that. It's more like this, because reasons: you need to have a full path somewhere so that it is imported correctly. But it's really manageable.
Just quickly, the features: it can work on strings, on byte slices, on bit arrays, and it has a lot of the combinators that are present in other libraries. I took a lot of inspiration from Parsec, the Haskell parser combinator library. It can use the regex crate. There's no syntax extension. It has worked on Rust stable for two years now, well, for as long as there has been a Rust stable, basically. It can be quite fast, and we have some really nice stuff with error management: you can do a hex dump and show which part of the input corresponded to which combinator. It's a gimmick, but it's very, very nice to see.
And coming soon, nom 2.0. There's a whitespace-separated formats combinator, which is implemented in the dumbest way you could think of. You apply the ws! combinator on your parsing tree, and it will just intersperse the space combinator everywhere inside the macros. It looks very ugly but it works really, really well.
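The three-way result and the "terminated" combinator described above can be sketched in plain Rust without any macros. This is only an illustration of the idea, not nom's actual code: the names `IResult`, `Done`, `Error` and `Incomplete` follow the talk, while `take_while1`, `alpha`, `digit`, `terminated` and `alpha_then_digit` are written here as ordinary functions for the example, and the `Error` variant is simplified to carry only the input position.

```rust
/// Three-way parser result, modelled on the cases described in the talk.
#[derive(Debug, PartialEq)]
enum IResult<'a, O> {
    /// Success: the remaining input and the parsed value.
    Done(&'a [u8], O),
    /// Failure: the input position where the parser gave up (simplified).
    Error(&'a [u8]),
    /// Not enough data to decide.
    Incomplete,
}

/// Helper for this sketch: take one or more bytes matching `pred`.
fn take_while1(input: &[u8], pred: impl Fn(u8) -> bool) -> IResult<'_, &[u8]> {
    let n = input.iter().take_while(|&&b| pred(b)).count();
    if n > 0 {
        IResult::Done(&input[n..], &input[..n])
    } else if input.is_empty() {
        IResult::Incomplete
    } else {
        IResult::Error(input)
    }
}

/// Recognizes one or more ASCII alphabetic bytes.
fn alpha(input: &[u8]) -> IResult<'_, &[u8]> {
    take_while1(input, |b| b.is_ascii_alphabetic())
}

/// Recognizes one or more ASCII digits.
fn digit(input: &[u8]) -> IResult<'_, &[u8]> {
    take_while1(input, |b| b.is_ascii_digit())
}

/// Applies `first`, then `second` on what is left, and keeps only the
/// output of `first`. As in the talk, it is just nested matches.
fn terminated<'a, A, B>(
    first: impl Fn(&'a [u8]) -> IResult<'a, A>,
    second: impl Fn(&'a [u8]) -> IResult<'a, B>,
    input: &'a [u8],
) -> IResult<'a, A> {
    match first(input) {
        IResult::Done(rest, out) => match second(rest) {
            IResult::Done(rest2, _) => IResult::Done(rest2, out),
            IResult::Error(e) => IResult::Error(e),
            IResult::Incomplete => IResult::Incomplete,
        },
        IResult::Error(e) => IResult::Error(e),
        IResult::Incomplete => IResult::Incomplete,
    }
}

/// Alphabetic characters that must be followed by digits.
fn alpha_then_digit(input: &[u8]) -> IResult<'_, &[u8]> {
    terminated(alpha, digit, input)
}
```

Calling `alpha_then_digit(b"abc123;")` yields `Done` with `abc` as the output and `;` as the remaining input, while `alpha_then_digit(b"abc")` reports `Incomplete` because the digits have not arrived yet.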
And there are also performance gains and everything coming soon, so it should be really, really nice to use. So we have the language. We have the parsing library. It's about time we go to work. Because from when I started working with Rust and nom until I really got into VLC, it was like a year, and they just came to me: "OK, when are you done?" "Not started yet." OK, let's get to it.
So first let's see how it works. VLC, like most media applications, is just a pipeline. Data comes in as input through the access module, which is HTTP, FTP, file access, everything. So it gets data from somewhere. It passes that to a demuxer, which is basically where we're going to work. A demuxer is a parser that will extract the video, audio and subtitle streams and pass them to other parts of the software: the decoders, then the filters that will be applied to the video, and then it goes to the output, or it's re-encoded and put into another format for transcoding to a file or to the network. It's always a pipeline. All media software works like that, because basically you're just plugging stages one after the other. The big issue you have in there is how you synchronize everything, because you have the audio and video streams that are decoded and filtered at the same time, and you want to make sure that the audio and the video are correctly synchronized, because otherwise it's really annoying.
OK, so the way it works in VLC is that you have client applications like VLC media player or VLMC, which is a movie editing application. They call into libVLC, which is a public API for all the things that work in VLC, and libvlccore, which is the big part that manages everything: the threading, the synchronization, the module loading, all the APIs, how you access files on different platforms, everything. And where we work is there, in one of the modules. So the ones we saw in the previous slide, everything in there is a module.
They all link to libvlccore to get data, but libvlccore loads them and tries to do stuff with them. So if we want to integrate something in VLC, we have to make a dynamic library. All right, it should be easy enough. We have cargo, everything, it should be all right. Let's see how it works. At startup, libvlccore will just look in a folder, see a lot of libraries, try to load them, see if they have one symbol in them, and call that function, which is vlc_entry followed by the version. Then the module will just say, "OK, I am this module, I can do that, and here are the callbacks you can use to talk to me." OK, so now I think it will get a bit hairy to do that in Rust, because I have to emulate a lot of stuff that's very C-specific.
So how do we integrate a Rust project in a C project when we're not in a self-contained project? If I want to rewrite a library in Rust, I can more or less easily make a completely compatible C API, so you can just drop in the DLL and it should work correctly. But here we have something where you call C code and you're called by C code, and it's interspersed all the way, and it's really annoying to use. But maybe we'll be able to do that.
So we have a plan. First, as in all rewrite projects, we need to import some stuff from the C code, like the structures and the functions we will need to use. Then we will need to make ourselves pass as a VLC module. Then actually write a parser. We chose an FLV parser, because FLV is a fairly easy format to parse. I could have chosen some very annoying stuff like MP4, but I really didn't have the courage. And then we can actually start to parse stuff.
So, reproducing structures. Maybe some of you have worked with bindgen and that kind of thing. I really wanted to use it, but I couldn't, because the C code in VLC has a kind of object-like structure. So you see the VLC_COMMON_MEMBERS stuff.
It's a macro that just expands to some common attributes of a structure. And there's also the union, which was, I don't know why, not well supported. I've not checked if it works right now, but well. We cannot generate our structures in Rust automatically, so let's just write them manually. It's just brute force: take some time, write everything. And you can see that you have to convert everything. You have a const thing somewhere, and you need to transform it to mut where you need it. It's linear work. It's something that really should be easy to automate, but I think it will take some time before we get very automated tools. But it's all right. That is something that I can do manually.
Then we start importing the functions from libvlccore. Again, this is easy enough. I think I could have automated part of it, but really, since I only did 10 or so functions, it was quite easy to do. This is where we can start to get smarter, because we have those functions, but we don't want to call them directly and interact with C code. We don't want to be a C developer in Rust. So we make safer wrappers. You know there's a big unsafe in there. It's not in the function signature, but you know there's some unsafe happening somewhere. What you want is to have the guarantees everywhere else in your code. And this is how I wrote that. So you start importing stuff.
And then, OK, let's make a module. So yeah, you would say, macros again, and in C. This is how a VLC module is declared. Basically, it's a macro that lets you declare some stuff. It expands to something like that. It's very annoying to write. Basically, libVLC gives you a callback, and you call that callback again and again and again with all the data you have. And the most important part is the callbacks, the open and close functions that will be called afterwards. Oh yeah, I have to write this in Rust.
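The "safer wrappers" idea mentioned above, with the unsafe kept inside and not in the function signature, might look roughly like the following. This is a sketch under loud assumptions: `RawStream` and `raw_stream_read` are hypothetical stand-ins invented for this example, not libvlccore's real types or functions. In a real module the C-style function would be declared in an `extern` block and provided by libvlccore. It is implemented in Rust here only so the example runs on its own.

```rust
use std::marker::PhantomData;

/// Hypothetical C-side stream state (not a real VLC structure).
#[repr(C)]
struct RawStream {
    data: *const u8,
    len: usize,
    pos: usize,
}

/// Hypothetical stand-in for an imported C function: copies up to `len`
/// bytes into `buf` and returns the number of bytes copied, or -1 on error.
unsafe extern "C" fn raw_stream_read(s: *mut RawStream, buf: *mut u8, len: usize) -> isize {
    if s.is_null() || buf.is_null() {
        return -1;
    }
    // SAFETY: the caller promises `s` points to a valid RawStream.
    let s = unsafe { &mut *s };
    let n = len.min(s.len - s.pos);
    // SAFETY: `data + pos .. data + pos + n` is in bounds, and the caller
    // promises `buf` has room for `len >= n` bytes.
    unsafe { std::ptr::copy_nonoverlapping(s.data.add(s.pos), buf, n) };
    s.pos += n;
    n as isize
}

/// Safe wrapper: the lifetime ties the raw pointer to the borrowed data,
/// so the rest of the code never touches a pointer directly.
struct Stream<'a> {
    raw: RawStream,
    _borrow: PhantomData<&'a [u8]>,
}

impl<'a> Stream<'a> {
    fn new(data: &'a [u8]) -> Self {
        Stream {
            raw: RawStream { data: data.as_ptr(), len: data.len(), pos: 0 },
            _borrow: PhantomData,
        }
    }

    /// Reads into `buf` and returns how many bytes were read.
    /// The unsafe call is hidden here, not exposed in the signature.
    fn read(&mut self, buf: &mut [u8]) -> Option<usize> {
        // SAFETY: `self.raw` is valid for the lifetime 'a, and `buf` is a
        // valid, writable buffer of exactly `buf.len()` bytes.
        let n = unsafe { raw_stream_read(&mut self.raw, buf.as_mut_ptr(), buf.len()) };
        if n < 0 { None } else { Some(n as usize) }
    }
}
```

A caller writes `let mut s = Stream::new(data); s.read(&mut buf)` and gets an `Option<usize>` back. The raw pointers and the error code convention stay inside the wrapper.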
So this is the point where I went up to JB, the VideoLAN project leader, and said, "Hey, can I just write C code that will call the Rust code? Because it would be easier that way." And he looked at me and said, "OK, no, it will be less fun that way." So yeah, it's a specific kind of fun. Let's write everything manually. Yay.
So, a few annoying things. We have to do `as` casts everywhere. We have to have strings that are null-terminated, and byte strings, and it's a lot of fun stuff. There's unsafe everywhere because we get a callback and everything. Again, with enough brute force, with enough time, you get it working. Well, I did not get it working in five minutes. It took me a few days to see how I could do it. Some of the strings did not come out quite right when loading. That's all right. Maybe you can do better.
Yay, macros. OK, this one is a very small, very easy one I wrote last week, which basically generates the same code. There's an interesting part there: I have to pass the function name. In the C code, the macro just builds the function name by itself. But since concat_idents does not work exactly as I would need it to, I have to pass the function name manually. It's all right. It's not that bad. OK, so now we have something that actually loads in VLC. It took some time to get there, so it can be a bit hard.
So now let's begin parsing some stuff. FLV is a simple enough format. First there's a header. It begins with the tag F, L, V. Then a version number, one byte. Another byte with flags indicating if you have audio, video and everything. And there's an offset that shows where you should start parsing the rest of the data. It's interesting, because that means you can put it very far and just hide data before the packets. I don't know why they do that in some video formats, but it's really fun sometimes.
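The header layout just described can be sketched as a small hand-written parser over a borrowed byte slice. This is an illustration, not the parser from the talk: the names `FlvHeader`, `HeaderError` and `parse_flv_header` are made up for the example. The field widths and flag bits (a 4-byte big-endian offset, 0x04 for audio, 0x01 for video) are taken from the published FLV format, not from the talk itself.

```rust
/// The 9-byte FLV file header (type and field names are illustrative).
#[derive(Debug, PartialEq)]
struct FlvHeader {
    version: u8,
    has_audio: bool,
    has_video: bool,
    /// Offset from the start of the file to the first packet.
    data_offset: u32,
}

#[derive(Debug, PartialEq)]
enum HeaderError {
    /// Fewer than 9 bytes available.
    Incomplete,
    /// The input does not start with "FLV".
    BadSignature,
}

/// Parses the header and returns it with the remaining input.
/// The input is only borrowed; nothing is copied or freed.
fn parse_flv_header(input: &[u8]) -> Result<(&[u8], FlvHeader), HeaderError> {
    if input.len() < 9 {
        return Err(HeaderError::Incomplete);
    }
    if !input.starts_with(b"FLV") {
        return Err(HeaderError::BadSignature);
    }
    let flags = input[4];
    let header = FlvHeader {
        version: input[3],
        has_audio: flags & 0x04 != 0,
        has_video: flags & 0x01 != 0,
        // Big-endian 32-bit offset.
        data_offset: u32::from_be_bytes([input[5], input[6], input[7], input[8]]),
    };
    Ok((&input[9..], header))
}
```

A header with `data_offset` equal to 9 means the packets start right after the header. Anything larger leaves a gap, which is the "hidden data" the talk jokes about.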
And this is where we see that it's annoying to be called by C code and to call C code, instead of having a completely self-contained project. Because we have a function that must pass as a C function, we want to write good Rust code inside, but we still have to write code that calls other C functions everywhere. So the code has mixed feelings: you want to write safe, good code, but we don't really have good tools to do that, except by making really, really great wrappers for the C functions. But again, it's something that's manageable.
Here, the first thing we do is peek at the data, because VLC core will call every module and say, "OK, try to parse this and tell me if it's all right." If it's all right, you will be the module that will be used. Otherwise you give the data back and let another module take over. So this is an interesting part. We don't own the data. This was a big design decision when I wrote nom: it's made to work on byte slices, on immutable byte slices, because most of the time you don't own the data that will be parsed. If you want to be a good citizen in the C world, you have to assume that you will not manage the memory. You can try to take over a lot of things, but at some point you have to make compromises. But it's all right.
So we call our parser in nom. This returns Done with the header, and then we get the offset, we seek to the offset, and then we start parsing. We give some functions that will be called by the VLC code, and we store some data we want to use. OK, it was easy enough. Then you have to continue with the data. FLV is basically a lot of packets. You have something indicating the type of packet, the size, the timestamp, because you want to synchronize them, and the stream ID, saying, "OK, this ID is the audio stream, this ID is the video stream."
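The packet header just described can be sketched the same way. Again this is an illustration, not the code from the talk: `TagType`, `TagHeader`, `be_u24` and `parse_tag_header` are names invented for the example. The layout follows the published FLV format, where each packet ("tag") starts with an 11-byte header: one type byte (8 for audio, 9 for video, 18 for script data), a 24-bit size, a 24-bit timestamp plus one extension byte holding its upper 8 bits, and a 24-bit stream ID.

```rust
#[derive(Debug, PartialEq)]
enum TagType {
    Audio,
    Video,
    Script,
    Other(u8),
}

/// The 11-byte header in front of every FLV packet (names are illustrative).
#[derive(Debug, PartialEq)]
struct TagHeader {
    tag_type: TagType,
    /// Size of the payload that follows the header.
    data_size: u32,
    /// Timestamp in milliseconds, used for synchronization.
    timestamp: u32,
    stream_id: u32,
}

/// Reads a 24-bit big-endian integer from the first three bytes of `b`.
fn be_u24(b: &[u8]) -> u32 {
    (u32::from(b[0]) << 16) | (u32::from(b[1]) << 8) | u32::from(b[2])
}

/// Parses one tag header and returns it with the remaining input,
/// or `None` if fewer than 11 bytes are available.
fn parse_tag_header(input: &[u8]) -> Option<(&[u8], TagHeader)> {
    if input.len() < 11 {
        return None;
    }
    // The low 5 bits of the first byte carry the tag type.
    let tag_type = match input[0] & 0x1f {
        8 => TagType::Audio,
        9 => TagType::Video,
        18 => TagType::Script,
        other => TagType::Other(other),
    };
    let header = TagHeader {
        tag_type,
        data_size: be_u24(&input[1..4]),
        // Byte 7 extends the 24-bit timestamp to 32 bits.
        timestamp: (u32::from(input[7]) << 24) | be_u24(&input[4..7]),
        stream_id: be_u24(&input[8..11]),
    };
    Some((&input[11..], header))
}
```

After a successful parse, the next `data_size` bytes of the remaining input are the payload that a demuxer would hand over to the rest of the pipeline as an audio or video block.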
When you start parsing those, again, you have to pass as a C function, and you do hairy stuff at the beginning to have cleaner code afterwards. We read some data, OK. This is an interesting part with nom: I have a function with which I can do a hex dump anywhere, if I have an error or something, to just see what happens. It's really, really useful when you do that kind of project. And then you match on your header and try to get audio data. And again, calling C code everywhere is just a complete mess. Here is the part where we ask VLC to create a block with the data. You know that you have that much data, that it's an audio block, and you tell VLC, "OK, you will pass that block to the rest of the code."
And there we have a very, very uncool hack. There's a function that takes a va_list. A va_list is a way in C to convert a variable number of arguments you got in your function into some kind of variable. It's very specific to C compilers, it's very hard to support correctly in Rust, and there's a very big hack that's available in the va_list crate. But most of the time you have to use hacks like that. And again, isolate the unsafe, annoying part and keep the nice code everywhere else.
So, yeah, maybe I should show that it actually works, because otherwise you'll say I'm just bullshitting everybody. OK, so where is it? Where is the screen? No, OK. So, is it big enough? Yeah, it should be. So first we will do a cargo build. Yeah, so you see va_list, you see flavors, everything. Now there's a small bash script I use, because I need to slightly modify the DLL to point to the right one. I don't think it's something we can do with cargo yet, but since there's already a tool to do that, it's all right. OK. So now I've copied my DLL inside VLC, and I have another script to start it. And then, yay, we have some kind of video working. Which is a very old advertisement.
It's really hard to find good samples in FLV because it's such an old format. So it's an old advertisement for Zella. And you can see that I really like logging stuff all the time, but yeah, it's all right. OK, so we got something working, but I worked a bit in isolation. I make my DLL and just copy it into VLC, and yay, it works. Now can we actually get that into the tree? OK, there, there.
So basically this is the biggest issue you will get in any project where you want to rewrite anything in Rust, because, as I said earlier, every build system has its own opinion, and it's made mainly for one language, and cargo is the same. Cargo wants to build Rust and wants to manage everything, and the autotools want to build whatever, I don't know, and they want to manage everything too. But we can still be good citizens and try to do it correctly.
So first, autoconf. We have to check that cargo and Rust are there. All right, there's something for that in the configure script. OK, easy enough. Well, I say that, but I did not write it. I'm really not an autotools guy. The funny part is this one, which took an entire day to get working. Basically, when you build C code, you make object files, and at some point there's libtool. Libtool knows how to build dynamic libraries for the platform you're targeting, and it takes the object files and makes the libraries. But cargo also knows how to make dynamic libraries and wants to tell libtool, "I know how to do it," and libtool wants to do it itself. So what we do is kind of a hack. We make an object file. Yeah, you can see there's this emit obj something. And then we give that to libtool, and it should mostly work.
It's very hard to be a good citizen in someone else's build system, and it's an issue you will get over and over in that kind of project. I have another project where I worked on mobile, and getting cargo and everything to work with CocoaPods and making libraries for all the platforms we needed was really a pain.
And getting Apple bitcode working is just no way. But whatever. So here's how a rewrite project works. First, you have to spend time on the build system. Here I could just work in isolation at first, but most of the time you will have to work on that from the beginning, because this is where you will have all of the ergonomic issues. If every time you want to test something you need to copy stuff and edit things manually, it will take too much time. You need to have everything automated. Second, you need to isolate the unsafe APIs, because you know you're interacting with C, and at some point there will be some unsafe stuff, but you have to trust that the C code does what it says and does it correctly. Most of the time it won't, but you have no choice in the matter. You have to be, again, a good citizen. The parser is the easiest part, really. It took a few hours to write correctly, but integrating it in the rest, that's where it gets a bit hairy. And very, very important: we don't own anything. The data is in the C code. The pointers, the callbacks, everything is passed by the C code. We have to play by the rules, and it's not ergonomic yet, but it's getting better. We can make safe and useful wrappers. If you take a look at rust-openssl, it got very easy to use. With closures in the APIs these days, it's getting really nice.
So from there: this was a prototype. It's not actually in VLC yet. So don't announce to everybody, "Yay, VideoLAN is doing some Rust." No, it's not there yet. It takes some time. Like with Firefox, I know it took some time before it actually got into the tree, with lots of build system issues and everything. Downloading dependencies is something that I could really use. I tested cargo vendor a bit. Basically, for VLC, we have this big archive for Windows and macOS where you pre-download libraries, because you don't want people to rebuild everything all the time.
And then I need to complete the bindings. All of the imports and everything are now in a separate crate that can be imported in a new project to make new VLC modules, if anyone wants to test it.
So, just a thanks to the people who helped. Guillaume Gomez is maybe someone you know, because he has helped a lot on documentation and that kind of thing for the Rust team. He helped on the parser, on writing a plugin, and on making the early code I wrote better, because I really didn't care about indentation and that kind of thing. And Luca Barbato really did all of the autotools stuff.
So doing a rewrite is hard, but right now in Rust it's doable. It's something we can start to do almost anywhere. You take some C code, you start replacing it, and you see if it works. And then you try to convince the C developers that you will remove their code. Let's see how it goes. Guillaume, Guillaume, if you want it. The slides are not there yet. Here's the FLV parser, the helper library to write modules, and the test code I wrote. So if you have any questions, if you want to try that kind of thing, shoot. Otherwise, thank you for listening.