Hello everyone. This is what we're going to talk about: mixing these three languages, C, Rust, and Go. This is the URL to the repository where you can find the working code for the stuff that I'm going to talk about; you can download it later, or not. I didn't introduce myself: my name is Andrea, I work for Red Hat in virtualization, and we have a lot of C code. Like, a lot of C code. So we've been thinking that maybe we don't want to have that much C code, so let's play with other stuff.
The goal that we have in mind for this experiment is to take some existing C library and rewrite the logic in Rust or Go. Basically, what we have is this: the library code, with all the logic written in C, some client code which is using that directly, and then bindings for Rust and for Go, with client code in those languages as well. What we want to have instead is this, where the library code is implemented in Rust, the Rust client code is calling that code directly, and then we have bindings for C and for Go. Or, alternatively, we want to have the core library written in Go, with bindings for C and Rust. The complexity does not really increase, because we have the same number of bindings, but ideally we will be able to move most of our logic, the complex stuff, to a language other than C.
What we're going to cover today is some generic information that applies to any language bindings, and then some stuff that is specific to the languages that we are dealing with. I will show a lot of code snippets, and those are mostly not valid code. I'm going to be skimming over error handling, and at some points I'm going to do stuff which is not even syntactically accurate, just for the purpose of making it understandable quickly. The code in the repository actually works, trust me.
Some disclaimers before we start. Many choices have been made during this project, and some of them could have been made differently, so this is just one way to implement these things; you could pick a different way. I'm not that good at either Rust or Go, I would not call myself proficient in either of those languages, and some people would say the same about my C, but whatever. None of this has been used in production, so maybe this seems to work and then you use it in production and it explodes. I don't know, I cannot tell you that.
We have this library that we are going to be using as an example throughout the presentation, and it's a file name builder. You have a base name for the file, and then you can have multiple extensions: for example, you could have foo.rs, foo.go, or foo.c, based on what language you're using, and you could decide that some file names are not acceptable, like foo.php. We don't want that. The way this is implemented is that you instantiate the builder by providing a base name and a function that can be used to accept or reject names. Once you have built this object, you can use it multiple times, each time passing a different file extension, and you will get some output, which is going to be either a full file name if the input is acceptable or an error if it's not.
Let's see an example of this in action. In this case we're using Go because it's less verbose. Our filter function will accept the Rust extension, rs, and the Go extension, go. We create a toy object, which is our file name builder, passing foo as the base name along with the filter function. Then we pass the go extension, check whether there's any error, and print the result. In this case go is one of the accepted extensions, so we print success, foo.go. Perfect.
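As a rough illustration of the example just described, here is a minimal pure-Go sketch of such a file name builder. It is not the code from the repository: the names Toy, NewToy, Build, Filter, and ErrRejected are hypothetical, and it assumes that the filter receives only the extension.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrRejected is a hypothetical error reported when the filter refuses a name.
var ErrRejected = errors.New("file name rejected by filter")

// Filter decides whether an extension is acceptable (assumed signature).
type Filter func(ext string) bool

// Toy is the file name builder: a base name plus a user-provided filter.
type Toy struct {
	base   string
	filter Filter
}

// NewToy creates a builder for the given base name and filter.
func NewToy(base string, filter Filter) *Toy {
	return &Toy{base: base, filter: filter}
}

// Build returns "base.ext" if the filter accepts ext, or an error otherwise.
func (t *Toy) Build(ext string) (string, error) {
	if !t.filter(ext) {
		return "", fmt.Errorf("%w: %s.%s", ErrRejected, t.base, ext)
	}
	return t.base + "." + ext, nil
}

// acceptRustAndGo accepts only the rs and go extensions.
func acceptRustAndGo(ext string) bool {
	return ext == "rs" || ext == "go"
}

func main() {
	toy := NewToy("foo", acceptRustAndGo)

	name, err := toy.Build("go")
	if err != nil {
		fmt.Println("failure:", err)
		return
	}
	fmt.Println("success:", name) // prints "success: foo.go"
}
```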
We try again with the same code, but we use php as the extension, and this time it is not accepted by the filter function, so we get a failure instead.
The interface for this library in the different languages looks like this. In C, we have a bunch of possible errors, an opaque structure, and a typedef for the callback: pretty standard stuff. In Rust it also looks like what you would expect, and in Go as well. So why this specific interface? It's pretty silly, and that's why the name of the object is toy: it's not supposed to be something that useful. But it is useful for us because it covers a bunch of scenarios: primitive types, objects, error reporting, and user-provided callbacks. All of these can be problematic when dealing with language bindings, so we have a single interface that covers all of them with a couple of functions.
One last detour before we get into the thick of it: how we are doing error handling in C. We use this API, which is heavily inspired by GLib's GError, if you're familiar with that API. Basically, each error contains three pieces of information: the domain, the code, and the message. The domain maps to the native error types of the language, which could be Go or Rust. The code maps to the specific error that happened, and the message is just for printing out. From C it looks like this: we have our structure and the list of domains, and we can get the domain, the code, or the message. You will notice that there is no constructor for these values: because we get them from the other languages, we don't need to construct them ourselves. Any function that can fail will take a double pointer to an error, and after the function has returned, you check whether the error is null. If it's null, then everything is fine.
If it's not null, then that means that an error occurred and we can look at the details. In action, it looks like this. This is the definition, and below we check whether the error is null or not. Those of you with a keen eye for this kind of detail will notice that this is basically the same code that you would write in Go, and kind of what you would write in Rust as well. So this maps really well to the semantics used by those languages for error reporting.
Okay, so let's get into the various implementations. First we start with the Rust implementation: this is our library code implemented in Rust. You just do the thing that you would expect to do; there is nothing particular that we need to do in this case, so we're just going to leave it as an exercise for the reader. For the Go part, which is again the library code written in Go, same thing: we don't really need to do anything special. You just write the interface that you would expect for a Go library.
Things start getting more interesting when we start writing bindings. There is this notation that I did not explain: Rust -> C means that I'm talking about the Rust bindings for the C library. The client code is written in Rust, the library code is written in C. So it's this part. The first thing to note is that you cannot just use C symbols from Rust as they are. In order to do so, we use this tool called bindgen, which will create the bindings for us. And in order to use it, we need some plumbing. In our Cargo.toml we have the build dependency on bindgen, and then we have build = "build.rs", which tells Cargo to use our build script. The build script is not particularly complex. We basically just have to tell Cargo where to look for the native library, so we give it the search path and the name of the library.
And then we tell it where to write out the generated bindings, which is that bindings.rs file in some output directory. This will automatically be executed when you run cargo build. We have this wrapper header, which just drags in the main header file for the C library so that bindgen can find the definitions. Then, in our Rust code, we can take this generated bindings file and just pull it in, so everything goes into our Rust source. And then we can decide to selectively expose just a subset of the symbols that were generated by bindgen to our Rust code.
For objects, meaning C objects that we want to expose to Rust client code, we take a wrap-and-proxy approach. That means that the Rust object contains a reference to the C object, and whenever you call a method on the Rust object, the corresponding C function is called on the underlying C object. It looks like this. We have our toy error structure, a Rust structure that contains a pointer to the corresponding C structure, and we have this from function, which basically just creates the wrapper. When we drop the Rust object, the C function that frees the memory is executed, so there is no memory leak. And when we want to call any of the functions, we just unsafely call the corresponding C function on the C pointer, then we convert the value that we get back to a Rust value and return that one. Now, this convert function: you're going to see it again, so let's pretend that it is doing magic. It's just doing whatever you need to convert; the details are not important. It's more verbose if you go and look at the actual code, but here it's a shorthand.
So this is pretty simple. It gets more interesting when we get to callbacks that are provided by the user, because in this situation we receive a Rust callback from the user of the bindings.
But what is ultimately going to execute this callback is the C library, and when the C library executes the callback, it will pass C types to it, which our Rust callback cannot use. So we need some sort of type conversion, and in order to achieve that we use a special wrapper that will do it for us. This is our implementation of the constructor for our object. We do all the usual conversions using our magic function; this is all fine. But notice that when we call the C function toy new, we pass this invoke callback as the filter, and the actual filter that we got from the user is passed as the data argument, which is kind of unexpected. The reason is that our invoke callback is this generic function that has a C signature with C types: it takes the C arguments and converts them to Rust types, then it obtains the actual user callback from the C data argument and invokes it. So we have to do this little dance, and we kind of invert the arguments, in a way. We can do this because Rust is more high level than C, so we don't actually need the data argument for its usual purpose: if we need to capture some data, we can do it natively with Rust lambdas. So we use it to transport this other data instead. And notice that the type conversion happens twice, each time the language boundary is crossed: we take the data from Rust and convert it to C, and then from C we convert it back to Rust. But that's how it works.
Then we talk about the Go bindings for the C library. So this is that square, or rectangle actually. Contrary to what happens with Rust, the C symbols are automatically available to Go. There is no third-party tool that we need to use, but we still need to provide the necessary flags, and that happens with these specially formatted cgo comments, which serve the same purpose as the bindgen directives that we've seen before.
We deal with errors a bit differently here. Instead of wrapping them, as we've done with the Rust bindings, we convert them into native Go values. This is a bit of a compromise, because it requires us to do the conversion upfront, but it is much better, because otherwise we would have to call the free function for every error. In Go this is not really suitable because of the way you usually write Go code: you have a lot of errors that you need to check, so we could not do that. So whenever we get a C error, we just grab the code and the message out of it, convert them, and generate the Go structure that we return to Go code.
With callbacks, we have additional challenges compared to Rust. The basic idea is still the same, but there is a problem. Whereas previously we could use the data argument to pass the callback around, here we cannot really access Go objects from C code. This is not allowed: if you try to do it, the Go runtime will notice and crash your software immediately. And by objects, I also mean functions. So we need to find a way around this limitation, and our solution is lookup tables. Basically, on the Go side we create a big array, and whenever we need to make an object available to C code, we put the object inside that array. The index that it ends up at is its unique identifier. It's sort of like a pointer, but it's just a number, so we can pass it back and forth between C and Go with no problems. Our interface, which we only use internally, looks like this. We have this add function that allows us to store a new pointer to be accessible from C, and we get its index back. Then with that index we can get back the original object. And once we don't need an object to be accessible from C anymore, we can remove it from the lookup table and it will be gone. So that takes care of that. It's weird, but it works.
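The lookup table idea described above might be sketched in Go roughly as follows. This is an illustration under stated assumptions, not the repository's code: the registry type and its Add, Get, and Remove methods are hypothetical names, and it uses a map with an increasing counter rather than the array mentioned in the talk, so that handles are not reused and zero can stand for "no handle". No cgo is involved here; the point is only that the handle is a plain number that could safely cross the language boundary.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical lookup table mapping integer handles to Go values.
type registry struct {
	mu      sync.Mutex
	next    uintptr
	entries map[uintptr]interface{}
}

// newRegistry creates an empty table; handles start at 1 so 0 is never valid.
func newRegistry() *registry {
	return &registry{next: 1, entries: make(map[uintptr]interface{})}
}

// Add stores v and returns the handle under which it can be found again.
func (r *registry) Add(v interface{}) uintptr {
	r.mu.Lock()
	defer r.mu.Unlock()
	h := r.next
	r.next++
	r.entries[h] = v
	return h
}

// Get returns the value stored under h, if it is still registered.
func (r *registry) Get(h uintptr) (interface{}, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	v, ok := r.entries[h]
	return v, ok
}

// Remove forgets the value stored under h.
func (r *registry) Remove(h uintptr) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.entries, h)
}

func main() {
	callbacks := newRegistry()

	// Register a Go callback; only the numeric handle would be handed to C.
	filter := func(ext string) bool { return ext == "go" }
	h := callbacks.Add(filter)

	// Later, a wrapper invoked from C would receive h and look the callback up.
	if v, ok := callbacks.Get(h); ok {
		fn := v.(func(string) bool)
		fmt.Println(fn("go"), fn("php")) // prints "true false"
	}

	// Once C no longer needs the callback, drop it from the table.
	callbacks.Remove(h)
	_, ok := callbacks.Get(h)
	fmt.Println(ok) // prints "false"
}
```

Since Go 1.17 the standard library provides runtime/cgo.Handle (NewHandle, Value, Delete) for this same purpose, so a hand-written table like this one is mainly useful for understanding the technique.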
Now that we have found that way, we can do basically the same thing that we were doing with Rust, so the code looks basically the same, right? The only difference is that the base is just converted, but for the filter we use the callback add function, which gives us the Go reference. In the same way, when we get to our wrapper function, the conversion between C and Go this time means extracting the original object back from this lookup table. Everything else works just the same.
Okay, so now we're getting to the more interesting stuff, let's say, which is calling Rust from C. That would be this part of the graph. Dealing with dynamic linking in this context would be complicated, so we just want a static library. We tell Cargo to build a staticlib, and it gives us a .a archive that contains all our code. However, that is not enough to make it work with Autotools. For Autotools to be willing to link against this library, we need to provide a libtool manifest that contains the extra information, which is basically just the linker flags that we need. And so we fake it. We take a template libtool manifest and we replace the two values that we care about, which are the name of the library and the library flags that we need to use when linking. This is our template, and it's very simple. I love the fact that libtool is very picky about the contents of the file, so the first two lines need to look pretty much exactly like this. You have to say that it's generated by libtool even if it's not, otherwise it will not accept it. So there's a little comment saying "actually not", because I didn't want to lie too much. But once you have that, you can take this .la file and libtool will happily use it, and so will Autotools.
As for objects, Rust has pretty good native interoperability with C.
The Rust objects can be exposed to C just as regular pointers, but we need to make sure we box them first, which means that they need to be heap allocated. Then we can take these pointers and use them back in Rust by just casting them, which is very unsafe of course, but we know what we're doing, so that's fine. This is the simple version of it. We build the Rust object, we box it, we get the raw pointer out of it, and we return it to C. No problems. And when we want to use this pointer in Rust, we just do this unsafe cast and use it. So that is pretty simple.
For dealing with errors, we use the same GError-inspired interface that we mentioned earlier, and this implementation is basically identical to what we've seen when we were writing Go bindings for the C library, except it's the other way around. It's not very remarkable; we're just calling into a Rust object. For the callbacks, also, there is really nothing very different. It's the same thing done the other way around: we convert data in the other direction, nothing to it.
Now, calling Go from C. This is probably the one that the fewest people have attempted, I would say. Certainly, if you go on the internet and look for information, that's the one that you will find the least about. One thing that is tricky and weird is the way that you have to structure your project. All the other chunks of the project have been structured in nested directories and so forth, but when you're calling Go from C, cgo will only accept a flat directory structure. So you end up with a bunch of files in the same directory, which is strange. You will end up with a lot of small files, and a lot of these will be Go files that actually contain C code in them, as comments.
Similar to how we instructed Go to link to the C library by passing these specially formatted comments before, we can do the same thing and just shove a bunch of C functions in there. That will allow the Go compiler to take everything as a single compilation unit and manage it for us, so we don't need to call out to the C compiler manually. But we need to compile our project in two steps. The first one is to generate a C header that provides an interface to the glue code that we will build between C and Go. The second step is to build the library itself. That's because some parts of the library, the bindings, are implemented in C, and they need to call into the glue code, so they need to know what its interface looks like. This sounds complicated, but it's actually two commands in the Makefile. You just need to make sure that you call them in the right order, and that's it.
Objects still have the same limitation that we talked about when we were discussing bindings in the opposite direction: Go objects cannot be accessed directly from C. So we use the same trick, the lookup table. This time, though, instead of just passing an integer around, which we did before because it was internal stuff and we didn't care, we want the C API to be good. So we take this reference and wrap it in an actual C structure, so that we have some sort of type safety. It looks like this: you just have a specific structure, and then you have an unsigned int inside it.
Another thing to be mindful of when dealing with this stuff: it's true that Go functions can be made callable from C, but the interface that is produced is rather poor and you don't have much control over it. One example is that there are no const pointers; everything is just plain pointers, which for a C library is not good enough.
So in order to have a decent C API that a C programmer may want to use, we're going to create some small C wrappers on top of whatever is generated by Go. And since we already need these wrappers, we're going to take the chance and use them to deal with our wrapped objects in a nice manner. What it looks like is this. This is the glue code: a Go function that is exported to C. We do the usual stuff; after a while, it all just looks the same. You convert the types, you call the actual Go function, then you collect the result, convert it to either a value or an error depending on what happened, and return it. But we also create this wrapper that I just mentioned. This is a better C API because, for example, the extension is a const char pointer, since we don't need to modify it, and the toy is const as well, because we know that this function will not modify the contents of the object. You can also see that we get the Go reference out of the toy object and, if needed, we wrap the reference to the Go error into a C error. You can see that this is inside a comment block, with the import "C" statement at the bottom, but it's inside a Go file. This is what I was mentioning earlier: you can stuff entire C functions as comments in a Go file and those will be compiled just fine.
The errors are handled in the same way as when we were writing the C bindings for the Rust code. Just identical: we still wrap them. And the callbacks are handled pretty much the same as when we were writing Go bindings for C code. The same tricks and the same caveats apply.
So now that we have done all of this, what did we achieve? We achieved that the client code can basically not care whether it's calling out to native Rust code, or Rust bindings for some C code, or Rust bindings for some C code calling down to some Go code.
If we diff the C examples that we have, and these are using the C bindings calling into the Rust and the Go implementations, they are identical. There is no difference. If we diff the Rust examples, the only difference is that the name of the crate is different, so the code is the same. If we diff the Go examples, the only difference is that when we are dealing with the native implementation we have garbage collection on our side, and when we are using bindings we have to manually free the objects that are bound. But everything else is identical; all the logic is identical. So from the user's point of view, we can basically switch the implementation from underneath them, mostly without them noticing.
Conclusions. Taking a C library and converting it to Rust is possible. This is probably not a surprise to anyone, because one of the main design goals of Rust was to be used by Firefox, and Firefox is written in C++, so they needed to be able to call Rust code from C and C++. What is perhaps more surprising is that you can do the same thing with Go, which, as I said, is not something that you find a lot of documentation about. Not a lot of people seem to be doing that, but it is possible. It works.
Another thing that we learned is that glue code is ugly and repetitive. It's really not the prettiest code, and the version you've seen is the one that removes a lot of the details. This is the pretty version, and it was already ugly. But the techniques that you learn while binding one language to another can be reused when you do the opposite bindings, or when you bind different languages. Once you learn this, it's a tool in your toolbox and it's very reusable. I have not personally tried but, if I wanted to write, for example, Python bindings or something like that, I assume that a lot of these techniques would still apply with some variation.
And because of that, all of this code is clearly a good candidate for code generation. Now, I've done everything manually, but if you wanted to do this for real, you would need to generate most of this code. You cannot write it manually, it's just too error-prone. But to learn, it's good to write it manually.
All right, so I'm going to thank a few people, slash entities. Red Hat, my employer, for letting me work on this. My good friend Martin, who is not in the audience, but okay, who really helped me a lot with the Rust part, so he deserves a shout-out in particular. And all the random strangers on the Internet who wrote a lot of articles about this stuff, so I could just Google whenever I got stuck. Otherwise I would not be presenting this stuff. We're really early. Questions?
Question about code generation: which tool do you suggest for code generation? So the question is what tool to use for code generation, and the answer is that I did not get that far. What language will you use to write it? The question is what language I would use to write the code generation tool. I don't know. I know that at least for Rust there is cbindgen, for example, if you want to create C bindings for a Rust project. That already gets you some of the way there, but I haven't used it in practice, and I'm sure that there are other tools that generate this kind of code. I just really haven't had time to look into it. Maybe you can find something off the shelf, maybe you will have to build it yourself, but just don't write the code manually. It's not sane to do that at scale.
What do you think about debugging these general mixtures? So the question is about debugging. Debugging is surprisingly easy. At least, the example itself was easy, right? When you get to more complex scenarios, the difficulty of debugging will increase accordingly.
But when you're dealing with C and Rust, Rust is very well integrated in GDB, so you just use GDB and you get your stack traces and everything. Go is also very well integrated. The only problem with Go comes when you have goroutines in the mix, which require a specific set of commands to be used in GDB. But when I was hitting issues, I was just using GDB to get the stack trace and figure out where I was, and it was not an insane experience where I had no idea what was going on. You can figure out what's going on, especially if you program defensively and put in a lot of panics and asserts for when you're fed invalid data, which I think is fair.
Yes? Yeah, GitLab. We do free software, you know. GitLab. Any more questions?
So the question is: where do you start if you want to convert an existing project to a different language? I don't have practical experience in actually doing it. My guess would be that you start from the part of the project that is most self-contained. When you have a self-contained module that does a single thing, you can rewrite that single thing in Rust or Go, whatever, and then provide an internal C API to the rest of your C code. Then you gradually expand that horizon of what parts are in Rust or Go and what parts are in C, and you just move along. That would be my guess, but I haven't done it in practice. So, all right. Thank you.