Hi everyone. Welcome to the Rusty Boat. My name is Taylor Thomas, and I'll let Matt Butcher introduce himself. Yeah, I'm Matt Butcher. I lead an open source team at Microsoft. I've been working on all kinds of cloud technologies over the years, including Helm, Kubernetes, Brigade, various PaaS platforms, and even going back to OpenStack. You can find me all over social media, as always, at @technosophos. So there you go. And this is Taylor. Yeah. And like you said, I'm a little bit less consistent with the social media. So there are all my social media handles. And that's just because, you know, common first name and common last name, so it's kind of hard sometimes. So first off, I'm one of the Krustlet core maintainers. We'll talk about that project a lot today. And I'm an emeritus Helm core maintainer. I've also been doing containers and Kubernetes for a long time, which makes me quite old in container years, as we like to joke. I've been doing it for a while, not as long as Matt has, but for a while. And I am a Rustacean, which is what we call Rust developers, by way of Go, which makes sense given my background. So we're going to go ahead and kick off and talk about the different things we've learned from Rust in cloud computing. This all started with an aha moment, or maybe a couple of aha moments. I was doing a one-on-one with one of the people on our team, Brian. We were wandering around Boulder, summertime 2019, with a couple cups of iced coffee, just walking around talking. It was a couple of days before we all got together for an onsite. And as happens, topic A led to topic B, and all of a sudden we were talking about, hey, remember when asm.js was going to be the next big thing? And we both sort of vowed to go off and look at this. Well, unbeknownst to us, other people on the team had stumbled onto the WebAssembly world at the same time.
And when we got together about a week later, we had this kind of brainstorming session, and we all sort of blurted out at the same time: wouldn't it be fun to work on WebAssembly? Who's doing what with WebAssembly? Here are some ideas. And if you take a look at the WebAssembly ecosystem these days, there are really two prongs at the cutting edge. There's the Emscripten and JavaScript side. And then there's a whole bunch of work, much of which started out of Mozilla, that's really oriented more around Rust. And so we came to Rust. We'd all dabbled with it here and there, but really our first production endeavors with Rust all had to do with WebAssembly and Kubernetes. That's what originally got us started on it. But we feel really good about that selection, and here's why. Yeah. So these are the overarching reasons why we picked Rust, and then we'll go a little deeper into each of them. First off, we have safety. If you're here at a Rust conference, you probably know that, but just to mention it: safety is a huge thing with Rust, and that was an excellent benefit we picked up basically for free for using the language. The developer experience in Rust is quite amazing, which we'll also dive into. And like Matt mentioned, there's Wasm support. Wasm support in Rust is probably among the best of all the languages, at least for the side of Wasm that's meant for the server. And then there's an extensibility story with Rust APIs that's just really elegant and beautiful, which has been very helpful as we've exposed external APIs and also consumed other internal APIs, which we'll talk about here. Yeah. And that really transformed the way we did a lot of development, especially having come from Go, where extensibility isn't at the same language level as it is in Rust.
And that has been a big and compelling reason why we've moved a lot of our development there. Rust is usually considered a systems development language, but we've been using it almost exclusively on our team for cloud development and have found it to be very much a good fit. Cloud development, of course, is a type of systems development, but often with a lot more emphasis on HTTP and networking and things like that, so our colleagues originally were saying, but why did you choose Rust for that? Wouldn't you just use language X or language Y? How come you didn't just stay with Go? Yeah. And that's what the first part of this talk is going to be about. We're going to go over how Rust looks in a cloud native environment, and then we're going to go into a very specific example of how we reverse engineered the Kubelet and a bunch of Kubernetes things, and show the different components of Rust that we used. We're going to talk about this as the good, the bad, and the ugly. So let's go ahead and break into the good stuff. One of the good things we love is traits. Traits are pretty much amazing. There's not really a better way to say that. They are flexible. They're expressive. We love the conversion and reference traits like the ones displayed here: From, TryFrom, FromStr, Deref, all those different traits, because they allow such flexible things compared to other interface types. We consider traits to be better than pretty much most other interface-style types because the type itself doesn't even have to implement an interface to be used as another type. Deref is really nice for picking up methods from the underlying type. It's just pretty awesome.
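A minimal sketch of this conversion-trait style (the Id type and function names here are illustrative, not the actual Bindle API): implementing TryFrom once means any caller can pass anything convertible, thanks to the blanket TryInto impl.

```rust
use std::convert::{TryFrom, TryInto};

// A hypothetical ID type, loosely in the spirit of the Bindle example.
#[derive(Debug, PartialEq)]
struct Id {
    name: String,
    version: String,
}

impl TryFrom<&str> for Id {
    type Error = String;

    // Parse "name/version" into an Id, rejecting anything else.
    fn try_from(s: &str) -> Result<Self, Self::Error> {
        match s.split_once('/') {
            Some((name, version)) if !name.is_empty() && !version.is_empty() => Ok(Id {
                name: name.to_string(),
                version: version.to_string(),
            }),
            _ => Err(format!("invalid id: {}", s)),
        }
    }
}

// Because TryInto is blanket-implemented for anything implementing
// TryFrom, this function accepts &str (or any other convertible type)
// without the caller doing the conversion themselves.
fn lookup<I>(id: I) -> Result<String, String>
where
    I: TryInto<Id, Error = String>,
{
    let id = id.try_into()?;
    Ok(format!("{}@{}", id.name, id.version))
}

fn main() {
    assert_eq!(lookup("mybindle/1.0.0").unwrap(), "mybindle@1.0.0");
    assert!(lookup("not-an-id").is_err());
    println!("ok");
}
```

The caller never names the Id type at all; the trait bound does the work.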
This is an example from our Bindle project, which we won't talk about here, but it allows us to pass in pretty much any type that can be parsed as a string, plus a few others, and convert it into our internal ID type, which is really, really powerful compared to other languages we've used. Yeah. And we really found that we were misusing traits originally, that we were thinking of traits more like Go interfaces or Java interfaces. And Taylor put this one up here because I think this was representative of his aha moment: traits are far more powerful if you think about them from the Rust perspective instead of some of these others. Another feature that we've really liked, and that took us more than a few moments to realize how powerful it was, is enums. Here we've got an example of error handling with enums. Yeah. Now, for this error handling, people always ask, well, why don't you just use the thiserror crate? This actually is using the thiserror crate; the slide is already pretty packed, so I didn't want to include any more. But using some of the other crates and just leveraging these enums allows for really, really expressive errors. They're not just single values; they can carry associated data. Each variant can have different data structures, like a discriminated union. And then you can work with these using pattern matching. The pattern matching here is converting some types over for us in real code. We're able to make sure that we handle every single kind and get the data out, all in one beautiful statement. Coming from any language, you can read this and go, oh, I see what this is doing. You might go, what's this unwrap_or_default? Some of it might look weird, but the basic structure of what it's doing is really elegant and powerful. And we see this all over the place.
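A sketch of the pattern being described (these variant names are made up for illustration; real code would typically derive Display with the thiserror crate rather than writing it by hand as done here):

```rust
use std::fmt;

// Each error variant carries its own data, like a discriminated union.
#[derive(Debug)]
enum FetchError {
    NotFound { name: String },
    InvalidManifest { line: usize, reason: String },
    Io(std::io::Error),
}

impl fmt::Display for FetchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Pattern matching handles every variant and pulls out its data;
        // the compiler complains if a variant is missed.
        match self {
            FetchError::NotFound { name } => write!(f, "module {} not found", name),
            FetchError::InvalidManifest { line, reason } => {
                write!(f, "bad manifest at line {}: {}", line, reason)
            }
            FetchError::Io(e) => write!(f, "i/o error: {}", e),
        }
    }
}

fn main() {
    let err = FetchError::InvalidManifest {
        line: 3,
        reason: "missing field".into(),
    };
    assert_eq!(err.to_string(), "bad manifest at line 3: missing field");
    println!("{}", err);
}
```

The exhaustiveness check is the key: adding a new variant forces every match site to handle it.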
We've just had this aha moment, like Matt said: wow, you can use these for some amazing cases, in particular errors. Now, macros. Let's just say we absolutely love macros. Writing them can be a little finicky, but we're not going to worry about that right now; just be aware that that happens. We've learned that quite a bit. But macros are great for cloud development, because there are so many external things we're consuming. Seeing that this is a Kubernetes-adjacent event, you probably know what a CRD is. If not, it's a custom resource definition, a hook into the Kubernetes API. In Go, even with all the libraries people have generated, you still have to auto-generate code, commit that code, and have all these different things to make it work. Here, I literally have the data I care about specified, and then I can derive CustomResource and pass it basically some configuration. When the thing is built, that code automatically expands out to all the code needed to implement it. I don't have to commit anything, and it builds correctly every time. You can also see this inheriting — not inheriting, that's the wrong word in Rust — taking and deriving a JSON schema. So we're attaching a whole JSON schema to this object, basically for free. As Matt and I were talking over this code sample, we realized that this block of code in Go, where you would generally write something like this, is probably at least 200 lines, if not more, and a lot of it is just committed or auto-updated code. And this is 10. That is so powerful. We're deriving serialization and deserialization, and the macros do all that for you for free. Yeah. And all the things being hidden from you here are things that we don't really, as developers, care all that much about. It's what would normally be boilerplate code.
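The CustomResource derive is a procedural macro from the kube ecosystem, but even a tiny declarative macro shows the idea of boilerplate expanding at compile time. This sketch is our own illustration, not anything from the kube crate:

```rust
// A trait that would otherwise need a hand-written impl per resource type.
trait Resource {
    fn kind() -> &'static str;
    fn api_version() -> &'static str;
}

// A declarative macro that stamps out the impl from a one-line declaration,
// the same way a derive macro expands into the full implementation.
macro_rules! resource {
    ($ty:ty, kind = $kind:literal, version = $ver:literal) => {
        impl Resource for $ty {
            fn kind() -> &'static str {
                $kind
            }
            fn api_version() -> &'static str {
                $ver
            }
        }
    };
}

struct SimplePod;
resource!(SimplePod, kind = "Pod", version = "v1");

fn main() {
    assert_eq!(SimplePod::kind(), "Pod");
    assert_eq!(SimplePod::api_version(), "v1");
    println!("{}/{}", SimplePod::api_version(), SimplePod::kind());
}
```

Nothing generated here ever gets committed; the expansion happens fresh on every build, which is exactly the property being praised above.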
Another thing we've really enjoyed, when it comes to reducing the number of lines of code you have to write, is the way you do error handling and iterators in Rust. Again, this particular example would probably be 200, 300, 400 lines of Go code, and the cyclomatic complexity of it would be fairly deep, because you'd have a lot of nested ifs and a lot of nested loops. It's just so elegant to look at code like this and see everything laid out linearly. And again, we find the readability of this to be very high. I remember, Taylor, I don't think you liked this style at first. Has that changed? It can still sometimes be a little hard to read, especially with these kinds of examples, but I have come around to really liking some of the magic that happens here, because this example in particular is doing a whole fan-out/fan-in asynchronous compute task in 20 lines of code. That is really cool to see. So I know it can be a little hard to read coming in, but once you see how it's working, it starts becoming a lot clearer. Now, dependency management. Okay, this is very high on our very-impressive list of Rust features, and we have to make a confession: we pretty much love Cargo. Like, love it. It is flexible. It tells you exactly what's wrong when a version can't resolve. It's another thing that has completely changed how we look at dependency management. This just shows some examples, but you can patch in dependencies from other places, and these show four different ways you can do stuff. In the top example, we're pulling in only a very small portion of what's basically a big auto-generated OpenAPI spec. We're pulling in exactly what we need.
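A sketch of the manifest patterns being described — crate names and versions here are illustrative placeholders, not our exact Cargo.toml:

```toml
[dependencies]
# Pull in only a small slice of a big auto-generated API crate by
# turning off defaults and picking exactly the features you need.
k8s-openapi = { version = "0.13", default-features = false, features = ["v1_22"] }

# Optional dependency: only downloaded and compiled when the
# "cli" feature below is enabled.
structopt = { version = "0.3", optional = true }

# While developing, the local path wins; consumers who pull the
# published crate from crates.io resolve the version instead.
bindle = { version = "0.1", path = "../bindle" }

[features]
cli = ["structopt"]
```

This is a config fragment, not runnable code; the point is that feature gating, optional dependencies, and path overrides all live declaratively in one file.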
We're not pulling in this massive code base and compiling it all in. With structopt, we expose in Krustlet an optional command line flag, which is what's shown below. If a user enables it, it pulls in all the command line management stuff and all the structs for managing the options. But if you don't want to use a command line tool for it, you don't have to include it: the dependency doesn't get pulled down, and the structs don't get compiled in, which is amazing. And then you can also do stuff like the bottom example, where you're pointing to a local path. When you're working locally, it uses the local path, but when you publish it to crates.io, it uses the proper version. These are just a couple of examples, but we love Cargo. And this is coming from people who have both worked on multiple dependency management and package management systems, like Helm and CNAB and Glide and things like that. We look at Cargo and we're like, ah, this is just better than all of those things that we wrote. The last thing we really wanted to highlight in the good section is the community. We found the Rust community to be very open, very easy to work with, kind of exuberant about collaboration and improving the ecosystem in a deliberate, rational, and also good-hearted way. Anything else you want to add to that before we go on? No, not really. That covers it. We do want to be honest and talk about the bad and the ugly. So we'll dive into this section, which admittedly is shorter, and some of these things might surprise you. Others of them, I doubt will. Yeah. So let's start off with docs and clarity. This is an interesting one that people go, oh, I didn't really think about that.
A pattern we've seen in lots of crate documentation — something to really be aware of as you consume things, especially if you're new to Rust, and especially Rust with cloud native development — is that the docs are sometimes very unclear about what is happening in the actual code. They describe the functionality well, but then you have to go digging through the code to find out whether it's a zero-cost abstraction or whether there are side effects. Like, is something flushing a file? Oftentimes the docs don't tell you, and I have to dig into the code and figure out what's going on. But you also get what's displayed here. This is something from the tonic crate, which I'll talk about a little later. As a new developer, when you come in and see this code, you just go, what is going on here? You click through — what's this MakeConnection? — and it's linking to another crate, and another crate, which links to two more traits in another crate. And you're like, whoa, what am I supposed to implement here? Turns out all we needed was the little code snippet below, which is quite simple. What had happened is there was an example in the examples directory of the actual repo, but the first thought wasn't to go look there. Now, granted, they've since updated these docs, and they point to that new example, which is great. But this is something we saw quite a bit as we started out. So just remember to keep your docs clear as you write stuff, and be aware that you might have to dig a little to understand what a crate is asking you to implement. Yeah, and we'll go a little faster through the remainder of the bad and ugly section so we can get to some of the more exciting stuff. But one of the things we did run into is that a lot of times, when you're in a new ecosystem, you have to write more code yourself.
There are many crates out there already, but sometimes we just don't find the things we need and we have to implement them ourselves, either to glue different crates together or to provide an implementation that simply doesn't exist. Yeah. For example, here: gRPC over sockets on Windows. We're just going to skip over that unholy code and talk about the learning curve. The learning curve is something people often mention, but what we want to point out is that the learning curve is actually logarithmic. It has a very steep initial climb and then flattens out. But there's one other little bump we ran into, and that is how to design a proper API. That first example with traits was using TryInto. That was when I finally got it, but it took a lot of looking at other crates and understanding what was going on to learn what makes a good and flexible Rust API. Once you design them well, users love them, but sometimes getting there is a bit of a hiccup. So with that, let's go into the ugly, which is just one thing: async. Now, to be clear, this is not bashing on async. I have worked with these people. I've talked to these people. They've started a working group. I'm just pointing out what you're going to run into in the cloud native space. First off, there are competing and incompatible runtimes. You have multiple runtimes — Tokio, async-std, smol — and while it's possible to use them interchangeably, a lot of times once you buy into a stack, you're kind of stuck there. You have a little bit of lock-in. It's not perfect, and it depends on what you're doing. Sometimes there are shim layers you can add in, but just keep that in mind. Here we have our ugly code. I hope these code examples show you exactly what we're talking about, but there are so many chained calls.
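For readers who haven't hit it yet, the clone-then-move dance looks roughly like this. This is a sketch using std threads as a stand-in for async tasks (the real pain shows up with async runtimes, but the ownership mechanics are the same):

```rust
use std::sync::Arc;
use std::thread;

// Fan out `n` tasks that each need shared read access to `cfg`.
fn run_tasks(n: usize, cfg: &str) -> Vec<String> {
    let cfg = Arc::new(cfg.to_string());
    let handles: Vec<_> = (0..n)
        .map(|i| {
            // Clone *before* the move so each closure owns its own
            // handle to the data: this is the "clone, then move" dance,
            // repeated once per spawned task.
            let cfg = Arc::clone(&cfg);
            thread::spawn(move || format!("task {} saw {}", i, cfg))
        })
        .collect();
    // Fan back in, collecting each task's result.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let results = run_tasks(3, "node-config");
    assert_eq!(results[0], "task 0 saw node-config");
    println!("{:?}", results);
}
```

We know exactly where the data is going, but the compiler can't, so each spawn gets its own clone of the handle before the move.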
And that one on the left was one that we called move clone, move clone, move clone, because, I mean, we know exactly where the data is going, but to make Rust happy, we had to move something, then clone it, then move it again, then clone it, then move it again, to make it so that it was all in the right place. And again, all this is in the context of async, right? Yes. All of this is related to async. Outside of async, this doesn't happen, but it's something to be aware of. And then that also adds some cruft and bloat where you're re-implementing async methods for a type that's already async. I'd be rich if I had money for every single time I had to re-implement AsyncRead or AsyncWrite for a type that's async underneath the hood. But it's constantly getting better. So for this last part of our presentation, we wanted to switch and focus on what happened when we decided we wanted to build this project, Krustlet. It was going to be a Kubernetes Kubelet written in Rust. We'll walk through the initial stages, and then some of the things that caused us to have to really dive deeply into the underbelly of Kubernetes, and how Rust ended up making this — I don't know if I'd call it a pleasurable experience, but definitely one that was manageable by a little small team like ours, and that didn't require hundreds of engineers to sort out. When we first wrote Krustlet, we wrote a proof of concept, and the proof of concept was maybe a few hundred lines of code. That was all it took us to take a very basic pod definition from Kubernetes, start up a WebAssembly runtime, and report back to Kubernetes: yeah, we're running this thing, here's the output. And so the proof of concept went well and was deceptively simple, because we went, oh yeah, from here we've got it. We're going to move on to a minimum viable product. We're going to get this thing to a 1.0 that people can actually use in production.
And then it wasn't easy. Yeah, it really wasn't. So this is just a list — I don't think it's even a complete list, but it's close — of all the worst things we ran into and had to figure out. We had to figure out the gRPC plugin system. We had to figure out how to make state machines and controllers work in Rust. There's also the question of how bootstrapping works underneath the hood — what's the proper way to do the credential exchange and do it correctly? There are things like how we handle OCI image pulls, what API Kubernetes expects of a Kubelet, and then the various Kubernetes subsystems: volume mounting, networking, what are we supposed to do? All of this became very, very hard. So let's dive into that a bit. First off, OCI image pulls. This is just something where, like we mentioned, there are sometimes gaps in the ecosystem, and out here on the frontier you just have to do it yourself. So we partially re-implemented the OCI spec to be able to pull modules, because we assume that even the Wasm modules we're using here would be stored in an OCI registry, just like a container would be. Now, the API contract. This was interesting, because there's no documentation here, which makes sense — it's not really a public API many people consume — but it turns out there are two parts. If you're interested in the underlying internals of the Kubelet, this is a great slide to reference later. There are these three endpoints that we see, and then there are also the duties of the Kubelet itself: watching for new pods, handling the entire pod lifecycle and resource management, and updating the node status and heartbeat. So let's go into some of the code that's right here, which Matt already previewed. It's not necessarily documented what the duties are, given the API endpoints. So a lot of this was going, okay, given this information, what are we going to do?
And that's really where a lot of this code came from. Yeah, and we can actually see some of the power of Rust here. Right in this example, we see first off why generics — especially Rust's generics and trait system — are so useful. We're using a client from the wonderful kube crate, which is basically the de facto Rust client for Kubernetes, just like you have client-go in Go. And in here, you can see that it doesn't matter what the underlying type is — a pod, a custom resource, or whatever. As long as it implements a Kubernetes resource definition, this code will work. But it also has the matching and unwrapping that we talked about loving as well. And inside, we're going to dive into the handle-event piece, where it's actually easy there as well. What we did have to do here — something to be aware of with Kubernetes, and probably a lot of other systems — is plumb the events through into our own system. Getting the events and starting essentially a reflector, if you're familiar with Kubernetes, was really easy; it was just that little snippet of code. But then we had to do the plumbing to handle those events properly. Rust still makes that easy, because an event is expressed as an enum, which we mentioned before. This shows how we handle resyncs, shutdowns, and event dispatching, all in a fairly concise function. Lastly, we have our patch status code. We also have to talk about the Node, but for actually managing the lifecycle of things, you have to patch the status of the pod. We're leveraging the awesome kube crate again, which lets us do this patch-status operation pretty easily. We're able to get an API client, talk to it, and then unwrap the error.
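The enum-based event dispatch described above can be sketched like this. These are illustrative types of our own, not the kube crate's actual watch-event definitions:

```rust
// A simplified stand-in for the watch events a reflector yields.
#[derive(Debug)]
enum Event {
    Applied(String),        // a pod was added or modified
    Deleted(String),        // a pod went away
    Restarted(Vec<String>), // a resync: the full current set of pods
}

// One concise function dispatches every kind of event; the compiler
// guarantees no variant is forgotten.
fn handle(event: Event) -> String {
    match event {
        Event::Applied(name) => format!("schedule {}", name),
        Event::Deleted(name) => format!("stop {}", name),
        Event::Restarted(names) => format!("resync {} pods", names.len()),
    }
}

fn main() {
    assert_eq!(handle(Event::Applied("web-0".into())), "schedule web-0");
    assert_eq!(
        handle(Event::Restarted(vec!["a".into(), "b".into()])),
        "resync 2 pods"
    );
    println!("ok");
}
```

The Go equivalent would typically be a type switch plus channel plumbing; here the enum carries the data and the match does the routing.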
But one thing to know is that patching utilities for pods are non-existent, and we had to write our own. That JSON patch method, which is in our next slide, is where we had to manually assemble the patch. This is easier for custom types, but we had to put the whole thing together. So something to be aware of: you don't have the JSON patching methods that come kind of built in with some of the extended Kubernetes libraries in Go. Just be aware, when you're doing anything in Kubernetes, or really anything in the cloud, that you might have to do some plumbing work, just like here. And to finish up, we're going to leave out what each of the functions inside of here does. But in case you're curious, what happens with a Node is that you update the lease and you update the status — those are the two things you do to keep it registered with the Kubernetes cluster. And for all of this, we had to go figure out, okay, how long does it expect? When is it expected to be updated? What's the documentation for this object? All those things. Another cool use of macros there, too. Yes, where we wrote our own. Then we have the gRPC plugin system. Here there be monsters — I cannot stress this enough. So it turns out that to be able to enable CSI support, we needed to have some sort of plugin system available that matches the one Kubernetes expects. Now, the plugin manager — there are actually two of them, if you go look at the code underneath the hood, but one is only for device plugins and one handles the others. So we implemented the one that handles the CSI interface. There's no way to figure out how this works except by reading code. But the cool part of all this, and you'll see some of it, is that Rust shrunk the amount of code needed from several thousand lines — I mean, the original Kubernetes version is a gnarly thing with interfaces and indirection everywhere.
It's about 800 lines with tests, 300 without them. So, yeah. And just so you know, cross-platform support for sockets is a nightmare, which we'll only cover briefly, but we wanted to mention it. So this is big, I know, but what you see here is the run function. This is how we start things inside the plugin manager. It watches a configured plugin directory for new sockets to appear, and when it discovers them, it tries to register them. We can see here some good examples of using iterators to run multiple futures at the same time, and of how to consume a stream. We're able to take these and turn them into an async object that just returns something every time it detects a change. There's nothing super crazy here with Rust, but instead of having to do all the channel management stuff that you saw in the Go version, it's really concise. We shrunk it down to a single thing that runs it all. And then we have this handle-create function that actually does the work when it sees the creation of a new plugin. These are the actual steps that are performed, in case you're curious. I tried to include some stuff in here that people can go back and look at; as we created these samples, this is the exact process that goes on underneath the hood. But more interesting, on the next slide, is what actually goes on underneath with watching the file system. This is where things get really exciting. This is a demonstration of the awesomeness of conditional compilation, which we've run into several times at this point, right, Matt? Yep, all over the place. And this shows off another useful crate called notify. We also wanted to show the crates that we're consuming here. notify is the thing that tells you when something changes on the file system; it uses the proper underlying utilities to do it. But it's not async by default.
So we had to do some wrapping and adaptation into a stream interface that yields a Result, which is what we're doing here. But you can see that we actually had to do a hacky workaround for macOS. It turns out that the underlying libraries on Mac do not detect when a socket is created, only when it's modified — I opened a bug for that, in case you're ever curious. So we had to basically write our own hacky version for Mac. But now, instead of having to include that everywhere, we have specific versions that are called depending on the operating system. If it's a macOS target, it uses the hack; otherwise, it uses the great notify crate, which gives us all of this the proper, non-hacky way. All right. Now we're up to my very favorite slide, because this replaces tens of thousands of lines of auto-generated code. Yes. We love build.rs. I mentioned the tonic crate earlier: tonic is a gRPC library for Rust. If you haven't used it — we use gRPC libraries and interfaces all the time in the cloud native space, not just in Kubernetes, but everywhere, so this is something you'll likely run into no matter what you're writing for the cloud. And it's awesome. This is 10 lines. It takes the protobuf definition and builds everything; you can tweak how you want it to build. And you don't even have to commit the code. There's no checked-in auto-generated client. It is generating code underneath the hood, but because of the use of macros plus the build file, it all gets built at build time and included for you, which is awesome. We just get stoked — this is where we nerd out about it. In contrast, the typical Go repository that does the same thing includes dozens, if not hundreds, of files that you have to remember to manually generate and keep in sync with your protobuf definitions. And it's all incumbent on the developer to do the generation.
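The per-OS selection for the macOS socket workaround uses cfg attributes. Here's a minimal sketch of the mechanism; the function name and strings are illustrative (the real code wraps the notify crate rather than returning labels):

```rust
// Only one of these two definitions exists in any compiled binary;
// the other is removed entirely at compile time.
#[cfg(target_os = "macos")]
fn watcher_kind() -> &'static str {
    // macOS doesn't report socket creation, so a polling hack is needed.
    "polling workaround"
}

#[cfg(not(target_os = "macos"))]
fn watcher_kind() -> &'static str {
    // Everywhere else, the notify-backed watcher does the right thing.
    "native notify-backed watcher"
}

fn main() {
    // Callers never branch on the OS themselves; they just call the
    // one function that happens to exist for this target.
    println!("using: {}", watcher_kind());
    assert!(!watcher_kind().is_empty());
}
```

Unlike a runtime if/else, the unused branch isn't even compiled, so a macOS-only hack adds zero weight on Linux and Windows.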
And then all that code gets checked in, and every time somebody goes to your source code repo, they have to read through all of it. We just love the fact that this hides all of that from the developer. We don't have to see those ugly auto-generated files and can just use the protobuf definitions. Now, last off — and this will be pretty quick, because there's actually going to be a talk on all of this by one of our fellow co-maintainers of Krustlet at this same conference, and he's going to talk more about it — we had to discover how to do controllers and state machines in Rust. Now, Kubernetes is best represented by a state machine, but under the hood it doesn't really use a very traditional state machine, if we're being honest, based on all the code I was reading. It is still a state machine, but a little bit different. And we were taking something that had a very established pattern in Go and doing it in an entirely different language, which was difficult, because we had to combine this Kubernetes way with the Rust way. Our fellow maintainer Kevin will talk about that more. But when we combined all this, we created something called Krator, which is an operator crate for Kubernetes. This took several false starts and different versions, and about three months of building on other people's work and trying things again and again. The resulting API is actually quite flexible, which I'll show right now. This one is the simplest case, where it basically does nothing, but you have to implement two methods. If you wanted to use the same kind of state machine pattern in something that isn't Kubernetes, you can just take away the status one, and you just have this next method that does the work. And so that's the simplest example.
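For a feel of the shape, here is a heavily simplified, synchronous sketch of the state machine idea: a trait with a next method that hands off to the following state. The real Krator State trait is async and carries pod state and status reporting; everything below is our illustration, not its actual API:

```rust
// Either move to another state, or stop.
enum Transition {
    Next(Box<dyn State>),
    Complete,
}

trait State {
    fn name(&self) -> &'static str;
    // Do this state's work, then say where to go next.
    fn next(self: Box<Self>) -> Transition;
}

struct Registered;
impl State for Registered {
    fn name(&self) -> &'static str {
        "Registered"
    }
    fn next(self: Box<Self>) -> Transition {
        Transition::Next(Box::new(Running))
    }
}

struct Running;
impl State for Running {
    fn name(&self) -> &'static str {
        "Running"
    }
    fn next(self: Box<Self>) -> Transition {
        Transition::Complete
    }
}

// Drive the machine to completion, recording the path taken.
fn run(start: Box<dyn State>) -> Vec<&'static str> {
    let mut path = vec![start.name()];
    let mut current = start;
    loop {
        match current.next() {
            Transition::Next(next) => {
                path.push(next.name());
                current = next;
            }
            Transition::Complete => return path,
        }
    }
}

fn main() {
    assert_eq!(run(Box::new(Registered)), vec!["Registered", "Running"]);
    println!("ok");
}
```

Krator's derive macro goes further by checking at compile time that each transition edge is legal, which this sketch doesn't attempt.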
Now, the next one is a more complex example — as you can see, a lot longer — but it shows what happens when a pod is running inside Krustlet. There are multiple things that happen, but you're able to pass state around and just go next. And you'll see at the very top there's this derive TransitionTo. We wrote a derive macro that allows you to automatically put a transition on a struct and make sure that it is compile-time checked. If you don't have the proper things in place, it'll tell you that your state machine is invalid, which is incredibly powerful, because then you're not accidentally checking in something that you think works but that has a missing edge in your graph, essentially. So that's pretty much the stuff we learned from Kubernetes. Yeah. And this has been a very pleasurable journey for us. We've learned a lot about Rust and a lot about Kubernetes. Of course, we're really invested in the WebAssembly space, so if you're around for WebAssembly Day, take a look at some of the stuff there as well. But we wanted to end with a slide with a couple of resources. Kevin's talk on Krator is also in Rust Day today, so we encourage you to watch that one. And then here are several links that will give you some reading material if you'd like to catch up on how we did the state machine, or the good, the bad, and the ugly section we went through — a written version of that, or maybe the slides, I don't remember. And then, of course, the async story, which was our only entry in the ugly. There's a lot of work going on there, and we're really excited about it and looking forward to the future of Rust, because we know it's just a matter of time before everything falls into place.
But there's a place you can go and take a look at what everyone is working on at the cutting edge of that. Anything else to add? No, I think that's it. Thank you, everyone, for listening. Hopefully it was helpful and gave you some good starting points. Feel free to reach out to me or Matt any time; we love talking about this, and we can pass on any knowledge. Thank you very much. Bye.