All right, we're starting a little late. I managed to tweet out the wrong URL, so we'll see. It's fine. So the idea today is that we're gonna build a new Rust crate, and I have very intentionally not done any design work ahead of time. I have built things that are sort of related to this in some projects of my own, but this is the first time that I'm building this as a standalone crate. So hopefully the use of your brains and mine will help us figure out how we're gonna design this. Feel free to use the chat. I'll try to monitor it while I go. But what we're gonna start out with is just a brief description of what it is we're trying to solve. So Amazon has this thing called EC2 that you may or may not have heard of. EC2 is a way of spinning up lots of machines in the cloud that you can get to do things for you. And one of the really neat things they have is something called spot instances. Spot instances are very short-lived machines that they run for you at a severely discounted price. This turns out to be really useful if you want to run a benchmark or a really short job of some kind, because you get to run them for like 90% off compared to the regular price. So in general, they list some stuff here. What we're gonna look at is a little bit different. Spot instances by default can be interrupted at any point by Amazon. And that's a little bit annoying if you're trying to run a benchmark with 10 machines. So we're gonna use these things called defined-duration spot instances. These are sort of special spot instances, also known as spot blocks, where you declare how long they're gonna run for, and then Amazon guarantees that they are not shut down for that time. And we essentially want a crate that allows the user to declare that they want a bunch of different sets of machines and then be able to run a bunch of stuff on them.
Ideally we just give them sort of a handle that gives them an SSH connection of some kind so they can run whatever commands they want. And then once that finishes, they can tear it all down. And ideally this crate that we write, whenever you use it, would only run spot instances for as long as your job is running, which means that you could spin up 10 VMs to run video encoding, or you could spin up 10 machines to do a benchmark of some program you've written, and then you can tear them all down and you're only charged for the time that they ran. So we're gonna use a Rust library called Rusoto. Rusoto is Rust bindings for the Amazon Web Services. One of them is EC2, but there are a bunch of others. Rusoto is built up of a bunch of other crates. So if we look at the Rusoto website, you'll see that it has a bunch of helper crates. Here we go to their GitHub. You'll see that, where is this? Yeah, so the overview, and then they have one crate for every service. And this particular setup will only look at the EC2 crate, mostly because we're just gonna spin up a bunch of VMs. And then ideally what we want is to spin up all of these virtual machines and then establish an SSH session to each one so that whoever uses our crate is gonna be able to issue commands to it. So we're gonna use libssh2, which also has Rust bindings. It lets you authenticate, so we're gonna have to find some way to authenticate to the running VMs. And then you can just run commands like you would through whatever normal shell you have. So this is gonna be a somewhat long process in theory, but I think we can figure it out. Yeah, I know, it's a great name, right? So we also have to come up with a name for our crate. We're gonna run relatively short jobs on lots of machines. So I sort of thought that we might wanna use the name burst, also because it sort of has RST in it. So maybe it's Rusty, unclear.
But all right, so we're gonna start a new crate and we're gonna call it burst for now. I don't know if that's visible here, maybe this might help. Ah, that's a little much, there we go. All right, so we've made burst. So initially all we're really gonna depend on is rusoto_ec2. We don't really care what version, but I guess we'll be nice and put the right one. So what version is rusoto_ec2 at now? Cheating. And then we're gonna start with an empty src/lib.rs. That's pretty exciting. All right, so when we start out, what we first of all want is to think about how we're gonna design the API for the crate. Like, what do we want users of this crate to be able to do? Now, we sort of have two choices initially. We could set this up as a binary that users download, and then they pass some scripts that we're gonna upload to the machines and run, but that sounds a little inconvenient. So instead we're gonna build this as a library. And so the idea would be that the user of the library describes the machines they want to boot up. Maybe we're gonna let them give us a closure that is handed an SSH connection they can launch commands on. And then at some point they're gonna be able to say launch or boot or something, and that's not gonna return until they issue some kind of termination command. So there are many ways to go about starting to design a library from scratch. The way that I like to do this is to start thinking about the API by just starting to implement the structures. So in this case, I'm gonna have some base struct, I don't know what we're gonna call it. Let's call it Burst, why not? And it's gonna have some things we don't really care about yet. And then on Burst, what is the user going to be able to do? Well, there's gonna be a new method of some kind, right? It's just gonna give you a Self. Don't know what that's gonna do yet either.
We're also gonna have to have some way of declaring a new set of machines. So say the user wants to run a benchmark that has a client and a server. They're gonna want to spin up some machines that are servers and some machines that are clients, and they are necessarily different, right? So we're gonna have to have something like creating a set of machines. So let's do create_set. We can deal with the details later. Now notice here that this is gonna take &mut self because you're gonna be modifying the set of machines. And we're gonna have to have some kind of description of a set of machines. So that's gonna be a MachineSet. We're gonna need a MachineSet. Don't know what's in it yet, but we can figure that out. Let's have this not return anything yet. In addition, at some point, you're gonna need to be able to say that you've created all the sets you want and now you want to run something on them, right? So there's gonna be something that's like, let's call it run, why not? Now at this point, this sort of looks like it's really a builder. So one of the things that's really nice, that you're gonna come across whenever you want to build your own libraries in Rust, is the Rust API guidelines. You can Google these, they're pretty easy to find. They provide a checklist of things that you want to think about when you design different patterns in Rust. Now one of these patterns is the builder. Builders are for when you let the user set up a bunch of configuration stuff first, and then at some point the user's gonna commit to that setup, and at commit time you're gonna start doing a bunch of stuff. So as you see here, the preference is to have a non-consuming builder. In this case, the idea is that you have a builder for a command where you can set the program, you can add arguments, et cetera. And then at some point you have a spawn method.
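To make the non-consuming builder idea concrete, here is a minimal sketch using std::process::Command from the standard library, which is the kind of command builder described above. The program name "ls" and its flags are placeholders chosen for illustration, and the sketch deliberately stops before spawning anything.

```rust
use std::process::Command;

/// Configure a command step by step without running it.
/// `arg` takes `&mut self` and returns `&mut Command`, so the builder is
/// not consumed and calls can be chained or issued one at a time.
fn build_command() -> Command {
    let mut cmd = Command::new("ls"); // placeholder program
    cmd.arg("-l").arg("-a"); // chained configuration calls
    // A later `cmd.spawn()` would be the "commit" step; it is left out here.
    cmd
}
```

The burst builder discussed in the stream would follow the same shape: configuration methods that take &mut self, and one final method that actually does the work.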
And so I think what we're gonna do for our library is a similar kind of thing. So instead of this being a Burst, it's gonna be a BurstBuilder. So you have a BurstBuilder. No idea what's in it yet. Now with builders, it's pretty convenient if they implement Default so that you can start one without having to declare anything. And in fact, for us, that's fairly straightforward because an empty BurstBuilder would be, I want no virtual machines, right? So we're gonna implement Default for BurstBuilder. Default, if you're not aware, is a Rust trait that lets you say how to construct something without any arguments. In our case, the BurstBuilder is probably gonna have some list of descriptions, right? So notice down here where we have this create_set method, that's probably gonna be something like add_set now that it's a builder. So we're gonna have to keep a list of these machine sets, or machine set descriptions. And initially that's gonna be empty, but we're gonna have descriptors, we can figure out a better name later. That's gonna be a Vec of MachineSet. And of course, when we create a BurstBuilder by default, it's gonna have no descriptors. So just an empty vec, right? So then we don't really need the new. So let's think for just a second about how we imagine user code to look. That's another good way to approach this. I sort of want a user to be able to say they're gonna have a builder, b, which is gonna be BurstBuilder::default(). It's gonna be mut. And then they're gonna do something like add_set with MachineSet::new, maybe. And MachineSet::new is gonna have to describe one set of machines. So for example, describe the machines that I want my servers to run on, or describe the machines that my clients are gonna run on. So we're gonna have to take some arguments here. One of the things that we're gonna need is a description of the instance type.
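As a rough sketch of where the design stands at this point in the stream, the builder with a hand-written Default could look like the following. The single field on MachineSet is a stand-in chosen for illustration, since the stream has not decided what goes in it yet.

```rust
/// Placeholder description of one set of machines (contents are an assumption).
struct MachineSet {
    instance_type: String,
}

/// Collects machine set descriptions; nothing is launched while building.
struct BurstBuilder {
    descriptors: Vec<MachineSet>,
}

impl Default for BurstBuilder {
    /// An empty builder means "I want no virtual machines".
    fn default() -> Self {
        BurstBuilder {
            descriptors: Vec::new(),
        }
    }
}

impl BurstBuilder {
    /// Takes `&mut self` because it modifies the list of sets.
    fn add_set(&mut self, set: MachineSet) {
        self.descriptors.push(set);
    }
}
```

Because Default covers the empty case, a separate new method is not needed.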
So Amazon has a bunch of different instance types that vary in the amount of CPU and memory and whatnot. So that's definitely something we're gonna have to take. So let's do t2.micro because it's a small instance. We don't really care. And then they're gonna have to start from some base image. Whenever you run something on EC2, you say what the machine should look like when it first starts up. This is usually known as an AMI ID. I think I have one somewhere that we can use. If not, there are a bunch of public AMIs. Let's just pick one. Public images. Amazon Linux. Amazon has their own Linux, of course, because why not? Let's do this one. So notice how they all have an AMI ID. Chances are users will want to start from somewhere because they don't want to install the OS and stuff themselves. So they're gonna start from some kind of AMI. And then they're gonna want to customize this. So this is where it gets a little bit interesting. How are we gonna let the user customize the machine that we boot up for them? So there are probably gonna be two things. There's gonna be a setup phase where we give them access to some machine and then they modify that machine in some way. So install a bunch of packages, maybe clone some source, run some build scripts, who knows, right? So one of the things that's attractive here is we can actually take a closure that gives them an SSH connection. And then you can imagine that this closure gets called for every machine we start up in the set, and they can do whatever they want. So let's say there's an exec method, and they could do sudo yum install or whatever they want, right? And the idea would be that for every machine in that set, that closure is called with an SSH connection to that machine. And then once you've set up all the sets, you're gonna do something like b.run. And when you run it, you're gonna want to run something on those machines.
So this is after you've done all the setup. You're gonna want something like, for the clients, you're gonna need all the IP addresses of the servers and whatnot. So in a sense, run should take something like a closure that gets a, I guess, set of machine descriptors, right? You can imagine that you're gonna want to know all of the servers, all of the clients, all of whatever other sets. And that includes meta information about them, such as what instance types they have or what their IP address is. But you're also gonna want some kind of connection to them. So maybe what we give to this closure is, for every set, a descriptor and an SSH connection. In fact, maybe we just give them one descriptor for every set where each descriptor contains that stuff. So this is gonna be something of type, let me make it a Vec, why not? It's gonna be a Vec of, like, MachineSetHandle. We can get back to the names of these later. It's not a great name, but it's a start. So this is gonna be like vms. And inside here, they might do, actually, maybe these need names so you can refer back to them. So maybe this is gonna say server and this is gonna say client. And then you're given a map from string to MachineSetHandle. So here you could now do something like vms["server"] dot ip, maybe. So the server IP is gonna be this, and this is gonna be index zero because there may be multiple machines in a set, and then for client in vms["client"], client.exec. Maybe you wanna run these in separate threads, so you do a thread spawn or something. It's unclear exactly how this API is gonna look. We can figure that out later. Yeah, so the idea, well, the idea is that a machine set can be just one machine. So for example, imagine that you want to set up one machine as your server and you wanna set up many client machines that are all the same, right?
So then you want just one way of setting up a client machine and one way of setting up a server machine, and then you're gonna launch multiple instances, right? So, oh, I see. So maybe the way to do this is that this describes a machine setup, and then you say how many of them you want to set up, right? Something like this. And then it makes more sense for this to be a MachineSetup, right? So this is describing how you set up one particular machine. You say how many you want of it and what it's gonna be named. And so here you're gonna have one of these server machines and two of these client machines. Of course, the setup here is probably gonna be different. Maybe this one does some kind of git clone thing and maybe this one just runs yum install apache or something. All right, let's try that. Okay, so there's obviously a lot of error handling missing. We're gonna have to do that sometime later, but for now I just wanna get a sense of how people might wanna use it. So as you've observed, we already have an idea of what the API is gonna look like. All right, so for this run method, the user's probably gonna want to run things in parallel on the machines in a set. So wouldn't it be nice if you could do something like clients dot for_each, and I'll give you a client handle. And maybe every client is run in parallel in different threads or something. Because imagine that you're running a benchmark and you have all these clients. You don't really wanna connect to one and have it run a benchmark against the server, and then connect to the next one and have that run something against the server, which is what we would get with just a regular for loop like this. However, if we have something like for_each_parallel, it's a little bit verbose, but we can look at that later, then you can imagine that the library takes care of doing all of these in parallel for you and then waiting for all of them to finish.
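One way the "run on every machine in parallel, then wait for all of them" idea could be sketched with only the standard library is with scoped threads. This is an illustration, not the crate's eventual API: the Machine type, its public_ip field, and the free function for_each_parallel are all hypothetical, and the closure returns a String instead of running anything over SSH.

```rust
use std::thread;

/// Hypothetical machine handle; only carries an address in this sketch.
struct Machine {
    public_ip: String,
}

/// Call `f` once per machine, each call on its own thread, and wait for
/// all of them. Results come back in the same order as `machines`.
fn for_each_parallel<F>(machines: &[Machine], f: F) -> Vec<String>
where
    F: Fn(&Machine) -> String + Sync,
{
    let f = &f; // share one closure between all threads by reference
    thread::scope(|s| {
        // Spawn every worker first so they all run concurrently...
        let handles: Vec<_> = machines
            .iter()
            .map(|m| s.spawn(move || f(m)))
            .collect();
        // ...and only then wait for each of them in turn.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

A plain for loop would instead finish each machine before starting the next, which is the sequential behavior the stream wants to avoid for benchmark clients.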
So this could then exec something like a ping. This is gonna be something like ping. Of course, this is shell injection and all sorts of bad things. This is probably not what the final interface is gonna look like, but it's a good point for us to start. So each of these is gonna run a command, right? That doesn't look too bad. And then I guess at some point, presumably what's gonna happen is once the run closure exits, that's when it's gonna tear down all the machines again, right? So if we just walk through what we imagine is gonna happen here: you create a BurstBuilder, which is this builder pattern that we talked about earlier. We use the builder pattern to let the user describe the machines they want and how many they want of each. And then at some point they're gonna call run. And run is actually gonna spin up all the machines and wait for them all to be ready. For every machine, it's gonna run the closure for that machine type. So for the server machines, it's gonna run the server closure, and for the client machines, it's gonna run the client closure. And then once all of the machines are ready and all the setup closures have finished, it's gonna run the closure passed to run. And then when the closure passed to run exits, it's gonna tear down all the machines again, right? So that seems like a decent place to start. Okay, I think I like that interface. So what do we have now? We have a MachineSetup. And this means that, for run, we actually don't need this Burst type, maybe, at least for now. Okay, so now there's gonna be add_set, and it's gonna take a name. So here we have another thing where Rust generally forces you to choose whether you want references to strings or owned strings. In this case, the performance is not super critical. So we're gonna take, I guess, no, we're gonna take a String because it's easier for now. And then we're gonna take a number.
So there's a lot of debate about what number type you should use. In general, usize should be used for anything that refers into memory, so we should not really use usize here. u32 is probably fine. u32 is basically equivalent to an unsigned int in regular C. And then we're gonna take a, no, we're gonna take a description, which is gonna be a MachineSetup, right? So this describes the machine we're setting up, and it's not gonna return anything. It also isn't gonna run anything, so it's not gonna access the Amazon API or anything. So we don't actually need it to return an error. Now, we're probably gonna move MachineSetup into its own file at some point. But let's, for now, implement on MachineSetup. We're gonna need this new method. That's not what I meant to do. So new is gonna give you a Self, which is the MachineSetup. And it's gonna take, what did we decide on? So we can't actually use the word type because type is a reserved keyword. We're gonna do instance_type. It's a little bit verbose, but it's fine. We can make this API a lot more ergonomic than taking strings everywhere. So if you go back to the checklist here, I think there's a good explanation for strings, if I remember correctly. Well, apparently not. How about that? Well, we can look at that later. It's not terribly important for now. So we're gonna take an instance type. We're gonna take an AMI. And then we're gonna take a closure, which is gonna be something like setup, of some type F. Now, with closures, we can do this a couple of different ways, but I think we want the user to be able to pass in whatever fits this particular pattern. So we're gonna be generic over the closure that we're given, F. And we're gonna say where F is an Fn. So here we're gonna have to decide what kind of function this is. Our options are Fn, FnMut, and FnOnce.
FnOnce we only get to call once, which is not okay because the number of machines you set up of this type could be more than one. So we have to choose between FnMut and just Fn. With FnMut, we would only be able to call this closure once at a time, which basically allows the contents of the closure to access some mutable variable outside. But this would be a little bit unfortunate because it means that we'd have to do all the machine setups one by one. We couldn't do them in parallel. Whereas Fn means you can't mutate anything, but from the way we've set up this closure, it looks like the user is really just gonna handle each machine in isolation. So I think we'll be fine with an Fn. They could always have an Arc of an atomic or something mutexy if they wanted to mutate some state outside of this closure. So we're just gonna take an Fn, and that Fn is gonna be given an SSH connection. We're not actually using SSH yet, so let's just for now have a dummy SSH type. So it's gonna be like an SshConnection, which we haven't really said what is yet, but you're gonna be given one of these. And in fact, they might be given a &mut SshConnection. But maybe they get to own it. It's unclear. All right, and it has to be able to fail. So that's the other thing. Imagine that this closure is doing an SSH exec. What if the connection fails? What if the machine fails? It needs to be able to fail somehow. So I think what we're gonna do for now is just use io::Error. Now, realistically, what we're probably gonna do slightly later on is use the new failure crate. The failure crate is a really neat way of getting consistent error handling both across crates and within your own crate. So we're probably gonna start using that eventually, but just for now, let's have this be an io::Result. And we don't actually care about the return value.
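The choice of Fn, together with the "Arc of an atomic" escape hatch, can be sketched as follows. Everything here is a stand-in: SshConnection is a dummy that only records commands, and setup_all and count_setups are hypothetical helpers that call the closure sequentially just to show that one Fn closure can be invoked for several machines while still updating shared state.

```rust
use std::io;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Dummy stand-in for a real SSH session; it only records commands.
struct SshConnection {
    log: Vec<String>,
}

impl SshConnection {
    fn exec(&mut self, cmd: &str) -> io::Result<()> {
        self.log.push(cmd.to_string());
        Ok(())
    }
}

/// Call the same setup closure once per machine. An `FnOnce` could not be
/// reused like this; an `Fn` could also be shared across threads later.
fn setup_all<F>(machines: usize, setup: F) -> io::Result<Vec<SshConnection>>
where
    F: Fn(&mut SshConnection) -> io::Result<()>,
{
    let mut conns = Vec::new();
    for _ in 0..machines {
        let mut ssh = SshConnection { log: Vec::new() };
        setup(&mut ssh)?; // stop at the first failing setup
        conns.push(ssh);
    }
    Ok(conns)
}

/// Shared state from inside an `Fn` closure via `Arc<AtomicUsize>`.
fn count_setups(machines: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let c = Arc::clone(&counter);
    let _conns = setup_all(machines, move |ssh| {
        c.fetch_add(1, Ordering::SeqCst); // interior mutability, no `&mut` capture
        ssh.exec("sudo yum install -y git")
    })
    .expect("the dummy connection never fails");
    counter.load(Ordering::SeqCst)
}
```

The closure stays an Fn because fetch_add only needs a shared reference to the atomic.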
We only care about the fact that it could be an error. And what is this going to do? So a MachineSetup is gonna have, well, an instance type. It's gonna be a String. So remember, when you create a MachineSetup and give it to the BurstBuilder, nothing actually happens, right? We don't actually do anything until we run the BurstBuilder, or whatever we end up calling this method. So here, we can really just store the arguments we're given, and then we'll deal with them when we actually spin up the machines. It is, however, gonna have to be generic over F. The other thing that needs to be decided here is whether we want to enforce these trait bounds on the struct. So do we wanna say that the struct can only store Fs that are closures? Or do we wanna say that it can store whatever F it wants, and then only put the trait bounds on F wherever we have methods that use F in that way? There's a little bit of debate on this, although there's an RFC on implied bounds, so hopefully this is gonna be a lot better when that RFC lands. So this is gonna let us do things like, yeah, see, this is exactly the issue we're running into, where you can put the bounds on the type argument to the struct, and then for anything that impls on that struct, those bounds are implied. So this would mean that we could put this where F onto here and then not have to also put it on new. I think this is available in nightly now. Apparently not. All right, well, we're gonna do it this way for now. Yeah, that's fine. All right, so it's just gonna store some F, and then whenever we use it, we're gonna essentially have a where clause that asserts that that F is an Fn. And given that you can only construct this through new, so notice how these fields are not pub, like this is gonna have to be. But these are not pub, so you can't actually construct a MachineSetup except by calling new on it, right?
And if new has this bound, then you can't ever have a MachineSetup without that bound. We're still gonna have to put it there. All right, so this is really just gonna give you a new MachineSetup. So there's very little here. Notice that Rust doesn't require you to give both the field name and the value when you construct a struct if the field name and the variable name are the same. So this could be typed as this. We're allowed to just do that. In fact, my auto-formatting is not working, which is a little bit annoying, so avert your eyes. Ooh, it still doesn't. Did I do something stupid? Probably. Let's try that. There we go. Maybe I don't even need this. Don't mind that I'm using nightly. This crate should compile on stable too. I'm not expecting to use any nightly features, but it's fine. It's a little bit faster to compile, but it's also probably not gonna matter. All right, so we have a MachineSetup. All of this seems pretty fine. And then you add this to a set. So notice that this set now has this name and number property. So we can no longer just store a Vec of these machine setups. Instead, we're gonna have to store something like a map from the name of the set to the machine setup and the number of machines of that type. We could do this a little bit nicer. Oh yeah, yeah, yeah. You caught the same thing I did. So we could have a new type that wraps this. I don't think it's important in this particular case. This particular type seems straightforward enough. And now of course this becomes very straightforward. This is just gonna take self.descriptors. Notice here that I could put HashMap::new, but instead I'm just gonna put Default::default, because that means that if we change this later to some other type, we don't have to change the code again. So here we're just gonna insert with this name. We're gonna insert the description, and I guess this could be called setup now, and the number.
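A minimal sketch of the struct described here: the struct itself stores any F, the Fn bound lives on the impl that provides new, and the constructor uses field init shorthand for setup. Taking &str arguments and the dummy SshConnection type are choices made for this illustration only.

```rust
use std::io;

/// Dummy stand-in for the SSH connection handed to the setup closure.
struct SshConnection;

/// Stores its arguments only; nothing is launched at construction time.
/// There is no trait bound on the struct itself.
struct MachineSetup<F> {
    instance_type: String,
    ami: String,
    setup: F,
}

impl<F> MachineSetup<F>
where
    F: Fn(&mut SshConnection) -> io::Result<()>,
{
    /// The fields are private, so outside this module `new` is the only
    /// way to construct a value, and `new` carries the `Fn` bound.
    fn new(instance_type: &str, ami: &str, setup: F) -> Self {
        MachineSetup {
            instance_type: instance_type.to_string(),
            ami: ami.to_string(),
            setup, // shorthand for `setup: setup`
        }
    }
}
```

A call might look like MachineSetup::new("t2.micro", "ami-00000000", |_ssh: &mut SshConnection| Ok(())), where the AMI ID is a made-up placeholder rather than a real image.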
And already here we're gonna have to make a choice, because insert returns the old thing that was set for that key if something existed. So we're gonna have to decide what we wanna do in that case. We could assert that there's nothing there with that name. Let's just for now add a TODO saying: what if name is already in use? The user will probably not be super confused if they override a name, but it would be nice if we could provide them some kind of feedback. Maybe we return an error here. We can look at that later. It's not terribly important. Okay, so the bane of our existence is gonna be this run method. This run method, if you recall, is going to set up all of the machines, connect to all of them over SSH, run the closure, and then tear down all the machines. So all the magic is gonna happen here. We don't really want this to be just one function, so we're probably gonna divide it into some helpers elsewhere, but let's at least get the basic building blocks in place. So run is gonna take a closure, right? Remember, that's this closure down here describing what we're going to do once all of the machines have been set up. So here we're gonna want something like where F, and this can be an FnOnce because we only expect to do the main part of the benchmark, or whatever job you're running, once. Specifically, there's no reason for us to execute this closure more than once, right? It's only gonna be executed once, and that is once all the machines are up. So we're gonna take an FnOnce, and we're gonna give that a HashMap of String and MachineSet. Now, we don't have MachineSet anymore, so let's just, for now, have a MachineSet. Okay, so it's gonna be given a map from the name that the user provided to a machine set. Now MachineSet here could just be a Vec of Machine. What do we think? There's not really a reason to have a wrapper there. So we're gonna just have that get a Machine.
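The behavior of HashMap::insert that forces this decision can be sketched like this. In the stream add_set returns nothing and just carries a TODO; returning a bool that reports whether a name was overwritten is only one hypothetical way to surface the problem, and the non-generic MachineSetup is a simplified placeholder.

```rust
use std::collections::HashMap;

/// Simplified placeholder; the version in the stream is generic over a closure.
struct MachineSetup {
    instance_type: String,
}

/// Maps each set name to its setup and the number of machines wanted.
#[derive(Default)]
struct BurstBuilder {
    descriptors: HashMap<String, (MachineSetup, u32)>,
}

impl BurstBuilder {
    /// Returns `true` if an earlier set with the same name was replaced.
    /// (Hypothetical return value; the stream leaves this as a TODO.)
    fn add_set(&mut self, name: String, number: u32, setup: MachineSetup) -> bool {
        // `insert` hands back `Some(old_value)` when the key already existed.
        self.descriptors.insert(name, (setup, number)).is_some()
    }
}
```

Returning a Result instead would let the caller treat a duplicate name as an error, which is the alternative mentioned above.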
In fact, maybe it even gets a, gets this. That seems nice. So yeah, we can figure out exactly how we want this to work. But this is saying that for every set name, you're gonna get a list of machines, of mutable handles to machines. And then for every machine, we're imagining the Machine here is gonna have some kind of SSH connection, which is probably gonna be on this in some way. And it's gonna have some other descriptions, things like maybe instance type, and maybe AMI, oh, and IP address, and probably also DNS name or something. We can get back to exactly what ends up in here. But just to have a general idea: for every name that you registered a machine set for, we're gonna give you some descriptors of machines that are of that type. All right, and this function also needs to be able to fail. So again, for now, we're just gonna use io::Result. Later, we're gonna move to using the failure crate, which makes error handling a little bit more ergonomic. Right, so we get to the most painful part. So let's just write out what we're gonna do here. The first thing we're gonna do is issue spot requests. So just to give you the general flow of what Amazon EC2 does when you want to spin up a bunch of machines: for every machine type you want, you issue a spot request where you say how long it needs to live for, what AMI it's gonna be based on, what instance type you want, and other kinds of configuration parameters. And then Amazon is gonna take a little while to spin up those machines, and you can check back on every spot request to see whether you've been given the machines you asked for. And then at the end, when all the instances are ready, you can query those instances for their DNS name or their IP address or any other information you might want. So we're gonna issue all those spot requests.
Then we're gonna wait for instances to come up, and sort of a second step to this is gonna be: once an instance is ready, run the setup closure. So remember how each MachineSetup has a closure that lets you do things like set up the machine, right? This is gonna be this closure here where they can install packages and clone things and whatnot. And then here it's gonna be something like, after all instances are up, and I guess we can do: wait until all instances are up and setups have been run. And then we're gonna invoke the F closure with the machine descriptors. And then once that F returns, that's where we're gonna do all our teardown. So at that point we're gonna terminate all instances. So there's one thing here where Amazon is a little bit weird, in that if you issue a spot request for more than one machine, or even for one machine, that spot request normally persists. So if Amazon starts up an instance for me on behalf of a spot request, and then that instance goes down, Amazon will spawn a new one for me because my spot request is still there. So what we can actually do is, once the instances are all up, stop the spot requests. If we did not do this, then Amazon would just keep spawning new machines, which seems a little bit unfortunate. Okay, right, so now we need to do this. And in order to do this, we're gonna have to go through Rusoto. Now, one of the reasons why Rusoto is a little bit annoying to work with is that it's a mostly auto-generated crate. You'll find some of these in the Rust world. libssh2 is not quite that bad. It's also a little bit auto-generated, but this one bears a lot of that feel. And if you just scroll through, you see that there's just lots and lots and lots of types, lots of enums.
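The control flow just written out as comments can be sketched as a skeleton in which every EC2 interaction is stubbed out. Nothing below talks to Amazon: the Machine type, the sets parameter, the Vec<Machine> map values, and the fabricated addresses (taken from the 192.0.2.0/24 documentation range) are assumptions made so the ordering of the steps is visible and testable.

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical handle for one running machine.
struct Machine {
    public_ip: String,
}

/// Skeleton of `run`: `sets` lists (name, number of machines) pairs.
fn run<F>(sets: &[(&str, u32)], f: F) -> io::Result<()>
where
    F: FnOnce(HashMap<String, Vec<Machine>>) -> io::Result<()>,
{
    // 1. Issue one spot request per set (stubbed out).
    // 2. Wait for instances to come up and run each set's setup closure
    //    (stubbed: machines are fabricated with documentation-range IPs).
    let mut vms = HashMap::new();
    for &(name, number) in sets {
        let machines: Vec<Machine> = (0..number)
            .map(|i| Machine {
                public_ip: format!("192.0.2.{}", i + 1),
            })
            .collect();
        vms.insert(name.to_string(), machines);
    }
    // 3. Stop the spot requests so no replacement instances get launched (stubbed).
    // 4. Hand the machines to the user's closure exactly once.
    let result = f(vms);
    // 5. Terminate all instances, even if the closure failed (stubbed).
    result
}
```

Holding on to the closure's result until after the teardown step is one way to make sure machines are terminated before any error is propagated.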
And there's one trait that is, and watch this, just all of the functions that are defined in the EC2 API along with all of their documentation. We're not actually gonna read through all of this because that would be a pain. So we're gonna, ooh, why doesn't that work? It's not gonna work. All right, let's try that. I guess not. All right, so I guess we'll do this the old-fashioned way. So we're gonna have to do something like request_spot_instances. Okay, so we're gonna need that method. We're gonna need describe spot instance, somewhere, describe_spot_instance_requests. And then we're gonna have to do describe_instances. So I've cheated a little bit and looked these up in advance because otherwise this would take me forever. I hope you'll forgive me for that. All right, so step one: actually talking to EC2. So here we're gonna first have to describe a spot instance request. And notice that basically all of these methods have the same kind of structure to them. They take a self, where self is a connection to EC2, and they take an input, where the input is some kind of request, and then you get back some kind of result. And the types for the request and result are essentially just the name of the API call suffixed by Request or Result. So we're gonna have to build one of these DescribeSpotInstanceRequestsRequest, but we're also gonna have to construct the self, right? So let's see, how do I do this in the clearest way? I just want another tab somewhere over here that we're gonna use in a different fashion. So Rusoto does not have very good overarching documentation, but if we go here: usage examples. All right, so we're just gonna authenticate in the most straightforward way. Amazon has a lot of different ways you can authenticate to them. I'm just gonna use the one where you set environment variables in your shell or wherever, and those contain a secret Amazon key, like an access token.
And what this does, the default credentials provider, or they even have another one but it's not terribly important. They have various credential providers and one of them is just: read from the environment and then connect. So we're gonna use that one. This is another thing that we can improve later in the library, to provide support for more advanced EC2 authentication. I'm gonna not do that initially and then we can deal with it later. Okay, so we're gonna need to include rusoto and also rusoto_ec2, right? And it's very convenient. They're gonna provide us with most of the things that we need. So here in run, we're going to want to factor this function out at some point because it's probably gonna get pretty large. I'm gonna just not deal with that right now. All right, so we're gonna use the default credentials provider. That seems great. That probably does the right thing. All right, and a region. We're not gonna use DynamoDB. And then they're saying in order to authenticate you do this. Great, copy-paste coding. All right, so this is gonna be the credentials that we end up using to connect to EC2. It's great. And then you see that this provider is passed, in this case, to a DynamoDB client. We are not gonna use that. We're gonna use EC2. Let's see if they have some. That's very unhelpful. There's definitely a, where's the GitHub repo? There we go. They've changed the example. That's pretty unhelpful. So there used to be an EC2 example here too. But it looks like they got rid of that. That's pretty sad. It's fine. All right, so we have one of these default credentials providers. You can notice we could just make an environment provider ourselves if we really wanted to. In general, whenever you start using Rust crates, you'll find that there are usually many ways of doing things. This is one of the reasons why it's really encouraged to have examples in the root of your crate. And we're gonna end up writing some documentation for this crate too.
And that's gonna include examples to show exactly how you use the crate. And of course, Rust is then gonna run those as tests for us, which is really nice. So in this case, we're gonna have a default credentials provider. Notice that this type signature is entirely unhelpful to us. But in general, let's just use this one. All right, so instead of this song and dance, we're just gonna do: provider is going to be an environment provider. In fact, we don't even need the use. Great. Good job, team. We have a provider. All right. And now for rusoto_ec2, where's the tab that I opened? So EC2. So here, in order to authenticate to EC2, we need this type, which is hidden in the middle of this entire giant list. There are examples somewhere. I think they have an examples directory that we could use. But instead, we're gonna go here. We're gonna notice that we can create a new EC2 client. Simple. Simple sounds promising. What does simple do? Default credentials provider and TLS client. That seems nice. Let's use that. So maybe we don't even need that. Okay. let ec2 is gonna be rusoto_ec2, Ec2Client, simple. And I guess rusoto Region. Something like that. Actually, let's check. None of this code is gonna compile, but I'm gonna make it download all the dependencies and such so that I get completion for them. US East 1. And it's gonna give me an EC2 client. I have some code elsewhere. I'm gonna take a peek just to see that I'm not doing something stupid. That's true. What do I use for completion? So in this case, this is RLS, which you can now get by adding it as a component with rustup. And when you use that, you get auto-completion. So I'm using Neovim as the editor with the language client plugin for Neovim, and that ties into RLS pretty perfectly. I can share the config later. That's helpful. Where is this code? Yeah. I do want it to use the default TLS client, the provider, and region US East. All right.
That looks like it's probably going to work once this finishes. So I'll just let that run in the background. All right, so this is gonna give us an EC2 client. Great. And now we can start issuing requests to it. So we're gonna have to issue these spot requests. In order to do that, we're gonna use the, where was that, where did that go? Here. Describe spot instance requests. Okay, so eventually we're gonna do something like ec2.describe_spot_instance_requests. And then we're gonna give that one of these. So we're gonna need one of these. All right. So we can come up with a better name for that later. It's gonna be one of these. And one thing that is nice about Rusoto in particular, and this is usually the case whenever you use decently thought-through crates: most of the types implement Default. And so especially things like this that are configurations or descriptions. This is the same thing we ended up doing for the builder, right? It implements Default. And so you can just create an empty builder, and then you can modify it before running it. So we're gonna do the same thing here. We're gonna create a default request request. And notice here, it has a bunch of different flags. We don't actually want to do dry runs. We probably should, but we're not gonna do that. We don't need, oh, sorry. We need to first of course request spot instances. We're gonna do that first. This is gonna happen here somewhere. We're gonna do let mut req equal to one of these. So notice how the pattern is exactly the same. You give it a structure that describes the request we want to issue, and then you get back a result that tells us what actually happened. So in this case, I'm gonna make one of these defaults. And we're gonna eventually do ec2.request_spot_instances. And we're gonna give it a req. And then everyone's gonna get mad at me, but I'm gonna unwrap this.
Eventually, again, we're gonna do some kind of error handling in here, probably with the failure crate. But just for now, we're gonna unwrap this. I can tell you already that these things do fail. The Amazon API gives you a lot of errors. And so this is not actually gonna work long term, but we'll start somewhere. All right, so what are we actually gonna put in this? So notice how this has a lot of stuff. We don't actually care about most of these, which is why Default is nice. We are going to set the instance count, right? So this dot instance_count is going to be, ooh, actually, we're gonna have to do this. We're gonna have to issue a spot request, remember, for every machine type, right? So for, what do we call these? It's gonna be the ones in self, right? So for each of these, we're gonna have an instance count which is gonna be equal to the number. Notice that this is an i64 and our number is a u32, so we're gonna do this, as i64. Actually, this is probably gonna yell at me, so we're gonna do this. These are basically the same. The only difference is that i64::from(number) will not type check if you try to use a type that can't safely be turned into an i64. So for example, you can't turn a u64 into an i64 this way because there are some u64 values that are not possible to represent as an i64. Whereas on the other hand, if you use the "as" cast, then it will just turn one into the other, which can go wrong in interesting ways. What else do we want? We want a type. So type here, notice, is the spot instance request type. No, one-time is fine. So the other thing we're gonna need is block duration minutes. So if you remember back to the very beginning, for the spot instances that Amazon has, normally Amazon is allowed to interrupt them at any time, which is pretty inconvenient. So instead we're gonna use these things called spot blocks, which are not interrupted ever, but you have to say how long they will run for. So for, I have too many tabs now.
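The difference between i64::from and an "as" cast described above can be seen with the standard library alone. This is a minimal sketch; the function names are made up for illustration and are not part of the crate being built.

```rust
use std::convert::TryFrom;

/// Widening a `u32` into an `i64` can never lose information,
/// so the standard library implements `From<u32> for i64`.
pub fn to_instance_count(number: u32) -> i64 {
    i64::from(number)
}

/// `From<u64> for i64` does not exist, so `i64::from(some_u64)` would not
/// compile. The fallible `TryFrom` conversion has to be used instead.
pub fn checked_u64_to_i64(value: u64) -> Option<i64> {
    i64::try_from(value).ok()
}

/// An `as` cast always compiles and silently reinterprets values that are
/// out of range for the target type.
pub fn cast_u64_to_i64(value: u64) -> i64 {
    value as i64
}
```

With these, a value that fits converts the same way through all three routes, while an out-of-range u64 is rejected by the checked conversion and wraps around under the cast.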
All right, so we're gonna have to just define some kind of block duration. Now here this has to be in increments of hours, with at most six. So this is something we're gonna really want the user to provide. But for now let's just say they're not gonna run longer than an hour. Usually when you build crates this way it's useful to just start out with some semi-sensible defaults and then make things configurable later. So for example here you could totally imagine that one of the things you set on the builder, in fact maybe we should just do this now. set_max_duration, self, and it's going to take some length, and it's gonna take hours. That's gonna be a u8, it can't be a very high value anyway. We could take a usize or something here, or something that implements Into u8. I'm just gonna leave it as u8 for now. And that means that our burst builder is gonna have a max duration, and that's gonna default to one hour. max_duration, I guess technically 60. We can store this directly as an i64 because that's what the API is gonna want. This is gonna be hours as i64 times 60. All right, so now we can do that here. We can use self. This is gonna take self of course. This is gonna take max duration. Great, so now the user can configure how long they want these to run for. Now of course this does mean that there can be human error here, right? They can set too short a duration and then they'll inexplicably have their things fail, but hopefully the errors we end up propagating are gonna inform them exactly what happened. Now the last thing: remember that the user in this machine setup also describes the instance type and the AMI, and in this struct there is nothing that fits that. So there's this launch specification. The Amazon API is not well designed. So we're gonna have to give them a launch specification, which is a RequestSpotLaunchSpecification. So we're gonna have to make one of these. All right, so we're gonna do launch. It's gonna be one of these.
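The max-duration setter described above could look roughly like the following. This is a sketch under assumptions: the builder here is pared down to a single field, and the accessor max_duration_minutes is hypothetical, added only so the stored value can be inspected.

```rust
/// Pared-down stand-in for the builder discussed in the stream.
/// Only the duration field is modelled here.
pub struct BurstBuilder {
    /// Maximum lifetime of the spot block in minutes, stored as an `i64`
    /// because that is the type the EC2 request field expects.
    max_duration: i64,
}

impl Default for BurstBuilder {
    fn default() -> Self {
        // Default to one hour, expressed in minutes.
        BurstBuilder { max_duration: 60 }
    }
}

impl BurstBuilder {
    /// Set the maximum duration in whole hours.
    pub fn set_max_duration(&mut self, hours: u8) {
        // Equivalent to `hours as i64 * 60`; a u8 always fits in an i64.
        self.max_duration = i64::from(hours) * 60;
    }

    /// Hypothetical accessor, used here only to observe the stored value.
    pub fn max_duration_minutes(&self) -> i64 {
        self.max_duration
    }
}
```

Storing minutes rather than hours means the value can be copied straight into the request without another conversion at the call site.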
It also implements Default. And then what are we gonna set here? So for the launch, notice here there are just even more options that we get to set. We don't actually need very many of these. We need image ID, which is gonna be the AMI that we wanna use. So in this case, we can get this from the setup. So remember how we have a machine setup. So we have instance type and AMI. So we want the AMI from here. We're gonna get all sorts of ownership errors. We're gonna look at those later. And we want instance type. So instance type sort of dictates how many cores and how much memory is gonna be used. So we're gonna do setup.instance_type. All right, so now we've set all the parameters that we've been given by the user. There are some other things you could set, like you could try to ensure that the VMs all start up in the same region of the world or in the same rack locality. We're not gonna deal with any of these things now because it would just take too long and make the API more complicated. But that is something that we could do potentially in a second video or something like that, to let the user do somewhat more complicated things there. Okay, so we've now made this request, right? So we've constructed the request and then we need to issue it. We're just gonna assume that they work out for now. What language is that? This is Rust. This is, oh, that's interesting. I wonder how you got here. I'm excited. So this is, ooh, I've turned off JavaScript, of course. It's this language. So Rust is a language that does not have a runtime but that still has a lot of really nice high-level features, but also very low-level control. It's become one of my favorite programming languages very quickly. Let's see. Where were we? Oh, yeah, here. Okay, so we are also going to have to set up security groups later. So this is like allowing SSH access to this particular machine. I'm gonna just not deal with that.
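Putting the pieces above together, building one spot request per machine setup might look like this. The structs are simplified stand-ins, not the generated Rusoto types: the names LaunchSpec, SpotRequest, and build_spot_request are assumptions for illustration, and only the fields mentioned in the stream are modelled.

```rust
/// Simplified stand-in for the launch specification type.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct LaunchSpec {
    pub image_id: Option<String>,
    pub instance_type: Option<String>,
}

/// Simplified stand-in for the spot request type.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct SpotRequest {
    pub instance_count: Option<i64>,
    pub block_duration_minutes: Option<i64>,
    pub launch_specification: Option<LaunchSpec>,
}

/// What the user declares for one set of machines.
pub struct MachineSetup {
    pub instance_type: String,
    pub ami: String,
}

/// Start from `Default` and fill in only the fields we care about.
pub fn build_spot_request(setup: &MachineSetup, number: u32, max_duration: i64) -> SpotRequest {
    let mut launch = LaunchSpec::default();
    launch.image_id = Some(setup.ami.clone());
    launch.instance_type = Some(setup.instance_type.clone());

    let mut req = SpotRequest::default();
    req.instance_count = Some(i64::from(number));
    req.block_duration_minutes = Some(max_duration);
    req.launch_specification = Some(launch);
    req
}
```

Cloning the strings sidesteps the ownership errors mentioned above at the cost of a small allocation per request.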
We're gonna need key name and security groups, as a reminder to myself. And we're gonna set that later. A nice to-do. And also key name. Yeah, so if your background is Java and C++, Rust is more like C++ than Java. It does not have a runtime. And so it does not have garbage collection either. It's similar to C++ except, at least I find it, a much nicer language to work with. It also has what's known as the ownership model. So it lets you track who has access to what data. So for example, the compiler will guarantee that you don't have two threads that both have mutable access to the same data at the same time. So it basically guarantees at compile time that there are no data races at all. This is not something C++ does. And if you try to have two threads operate on the same data, it enforces that you have locks of some sort, for example. Okay, so we're gonna set these later. This is basically how we get access to the VM. And I wanna deal with that slightly later. All right, so we're gonna issue all these spot requests. When you request a spot instance, let's look at what you get back. You get back a RequestSpotInstancesResult. What does it have? It has all these Options. All right, so this is gonna get a response, and we really only care about this part of the response, and we're gonna unwrap it because why not? We don't do errors yet. Okay, so that gives us a vector of spot instance requests. And what is a spot instance request? Well, it has potentially an instance ID and it also has a state, right? So essentially what happens here is... Future Python stream? I'm not planning to do a Python stream. Although, the Rusoto library here is based very much on the boto library that's available for Python. I've actually come to prefer Rust to Python as well. I basically only use Rust now. It's a really, really fun language. Okay, let's see. So we're gonna have to now notice where we are in the code.
We're now at the point where we need to wait for all the instances to come up. However, in order to... "I'm planning to do a master's in data analytics." Cool. So we're gonna have to wait for them to come up, but in order to do that, we need to know the IDs of all the requests that we've put in, right? Essentially all we need to extract from this is gonna be this spot instance request ID, right? So we're gonna issue all the spot requests. So here we're gonna keep a spot_req_ids. And then here we're gonna do spot_req_ids.extend, res dot... So remember how this was a Vec of these? So we're gonna do into_iter. We're gonna map; for each spot instance request, we're gonna get out the... We want to avoid these. All right, so this is gonna take out all of the spot instance request IDs and then gather them up into the vector that we have up here. And this allows us to sort of give all the requests to Amazon first, and then we're gonna wait for these instances to come up one after the other, right? So the way you do that is you issue this describe spot instance requests RPC. So that's what we looked at over here. And here you just give a list of all the requests you want to monitor, and then what it gives you back is a description. So notice that these are the same type as what we got back from request spot instances. So we're just gonna keep looking at each of the spot requests and then wait for the spot instance status to become, what's it called? This is another thing that's not very well documented. So I'm just gonna cheat a little. We're gonna wait until their state here... is it "open"? Yes. All right, great. This is documented in the Amazon API, but in really weird ways. This is why I had this particular thing checked out in advance. It's not particularly well documented in the crate either.
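Gathering the request IDs with extend and an iterator, as described above, can be sketched as follows. SpotInstanceRequest here is a simplified stand-in with only the one field that matters for this step, and collect_request_ids is a hypothetical helper name.

```rust
/// Simplified stand-in for one entry in the response to a spot request.
/// As in the generated API types, the field is optional.
#[derive(Debug, Default, Clone)]
pub struct SpotInstanceRequest {
    pub spot_instance_request_id: Option<String>,
}

/// Append the IDs of all returned spot requests to `spot_req_ids`.
/// `filter_map` skips entries without an ID, which avoids an unwrap.
pub fn collect_request_ids(spot_req_ids: &mut Vec<String>, requests: Vec<SpotInstanceRequest>) {
    spot_req_ids.extend(
        requests
            .into_iter()
            .filter_map(|sir| sir.spot_instance_request_id),
    );
}
```

Calling this once per issued request accumulates every ID into one vector, which is the list the later describe call needs.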
So when we wait for these instances to come up, what we're gonna do is, in a loop, we are going to keep asking about all of the spot requests that we've issued and see what state they're currently in. So in this case, this request argument that we give to describe is pretty straightforward. It has dry run, filters, and the IDs. We don't have any filters and we don't wanna do a dry run. Let's do this. And we already have all the spot request IDs because we collected them up in spot_req_ids. This requires an owned String for each one. So we do have to clone it, unfortunately, but that's probably gonna be fine. And then we're gonna issue the request. So we're gonna give that a req. In fact, here's what we can do even better. So notice that this just borrows the request, and so we don't actually need to keep reconstructing the request every time; that would just be unnecessary. And in fact, this means we don't have to clone the IDs either. Probably doesn't matter at this level. All right, so we're gonna describe the spot instance requests. We're gonna get back a bunch of results, and those results are gonna have vectors of spot instance requests. And we could choose how advanced we want the crate to be here. I'm gonna make it a little bit slower and a little bit stupid. How would you recommend learning Rust? So I'm gonna do the questions one at a time. I would recommend starting by reading the Rust book. The second edition of the Rust book is actually really good and good to follow. I would probably read the first few chapters, not the entire book, and then try to code something up yourself. And the compiler is gonna yell at you in some way. And then when it yells at you, go read the chapter corresponding to the thing that broke, rather than just reading the book from top to bottom. Are my dotfiles for Vim available? Yes, they are. So, go here, and then I think it's just called dotfiles. No, config, configs, nice.
Yeah, so all my configs are here and my vimrc is over here somewhere. There's a bunch of stuff there for different languages. I can help point you to something more particular. Generally, this is Neovim with the language client plugin. So that's this one. And then with the Rust RLS running in the back. Where were we? Yeah, so we're gonna describe all those spot requests, and then in theory we could be a little bit more efficient here, in that once one instance becomes ready, we could immediately do setup for it. And then once another instance becomes ready, immediately do setup for that. Instead, what we're gonna do is just wait until all of the instances have been spawned before setting up any of them. This is gonna introduce a little bit more delay, but it's gonna be a lot easier to code. So we're gonna do that. Now, there are a bunch of ways we can do it. This response, whoa, that's not at all what I meant to do. So this response that we get back from here, remember, is the same response that we got up here. So really what we're just gonna do here is we're gonna check if any of the requests are still open. If they are, that means no instance has been spawned yet. So we're just gonna do res dot this dot unwrap, which is gonna give us this Vec. And then we're gonna ask that Vec whether any of the spot instance requests, whether their state is open. And I could make this a while loop. I'm gonna not do that. So if not any_open; it's not the most rustic way of doing this, but it works fine. Notice however that one thing this doesn't give us is the actual IDs of the machines we start up. "The learn-it-the-hard-way approach." Yes, you definitely learn a lot by the Rust compiler yelling at you. It's definitely the right way to go. So in EC2 there are two sorts of IDs we're playing with here.
One is for the request that we shoot to Amazon to boot up some machines on our behalf. And the other is the IDs for the machines themselves. And if we want to get the IP addresses or the DNS names, we're gonna need those IDs. And so what we're gonna do is, if none of the requests are open, that is, they've all been satisfied, then we want to get the IDs of all of the instances that have been spawned. So we're gonna have to do something like instances. And in this case, none of them are open anymore. In theory, they could be in an error state. We're just gonna ignore errors for now and deal with them later. So if we look back into the spot instance request, notice how there's an instance ID, which is the thing that we want. So we're gonna say that instances, so Rust lets you assign to a variable that is not mutable if it has never been assigned to. So this is declaring a variable but not assigning to it. And then the compiler is smart enough to realize that the only way you can get to the point after this loop is by breaking inside the loop. And if you break here, then instances must have been set. So we can do res.spot_instance_requests.unwrap().into_iter(). I guess if we want to avoid the unwrap, filter_map, this. So this is gonna be a vector. And we can just omit the type and the compiler will infer that from the type of instance ID in here. All right, so after this loop, we know now that all of the instances have been spawned, or there was an error; let's ignore that for now. Now we're gonna stop the spot request because all the instances have been started. And remember what I said earlier, that if you start a spot request and then an instance is taken down, then the spot request is just gonna spawn an instance again. So if we want this not to happen, we have to stop the spot request. This is luckily pretty easy to do, in that we just make a new request, cancel, which is gonna be, where is my API list here? Cancel spot request.
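The polling loop with the deferred assignment to instances, as described above, can be sketched without any AWS dependency. The describe closure below is an assumption that stands in for the real describe call, SpotInstanceRequest is a simplified stand-in type, and a real loop would presumably sleep between polls.

```rust
/// Simplified stand-in for the entries returned by the describe call.
#[derive(Debug, Clone)]
pub struct SpotInstanceRequest {
    pub state: Option<String>,
    pub instance_id: Option<String>,
}

/// Poll `describe` until no request is in the "open" state, then return the
/// IDs of the instances that were spawned. Error states are ignored here.
pub fn wait_for_instances<F>(mut describe: F) -> Vec<String>
where
    F: FnMut() -> Vec<SpotInstanceRequest>,
{
    // Declared but not assigned, and not `mut`: the compiler can see that
    // the only way out of the loop is the `break` right after the assignment.
    let instances: Vec<String>;
    loop {
        let requests = describe();
        let any_open = requests
            .iter()
            .any(|sir| sir.state.as_deref() == Some("open"));
        if !any_open {
            // `filter_map` drops requests without an instance ID, avoiding an unwrap.
            instances = requests
                .into_iter()
                .filter_map(|sir| sir.instance_id)
                .collect();
            break;
        }
    }
    instances
}
```

Passing the describe step in as a closure is purely a choice for this sketch; it lets the loop be exercised with canned responses instead of a live EC2 connection.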
Cancel spot instance requests. So we're gonna issue this. You start to notice a pattern with these API calls. It's gonna take one of these cancel structs. We're gonna have to make one. Let's default. And it also takes spot request IDs, just like, remember, our describe spot instance requests. So this thing also takes a vector of strings. So it's the same set of strings. So we're going to do just cancel dot that. And then we're gonna take that out of the request, because we're no longer using the request up here. So we're gonna steal the one that's in there. Like so. And then we are going to tell EC2 to cancel all those requests. So ec2.cancel_spot_instance_requests. And again, we're gonna unwrap and not care about errors. Let's just see how badly this breaks if we try to compile it. Good, that's interesting. What version should I be using? So, 24.1. So the EC2 one is different, apparently. All right, 24.1. A standing desk setup? No, it is not. I have one at work, but I don't have one here unfortunately. I wish. Sitting tall. I'm gonna let that keep running in the background. So this is gonna cancel our spot requests. So we're actually doing this a little bit out of order, but that's fine. Let's just pretend that that's what we intended to do all along, shall we? All right, so now we know that Amazon has started up all our instances. So in theory, if you were to look at the Amazon dashboard, there would now be a list of instances, and now we want to connect to all of them. This is gonna be a little bit of a pain, but we can do it. All right, so really all you need to do in order to connect to a server is you need to get its DNS name. And then in theory, you can just SSH to it, assuming that you have a key there. Maybe we'll do some key generation later. I really want to avoid it because it's a little bit painful, but we'll find out.
Okay, so we sort of want to collect, remember how the run method that we have down here is given a machine set, and remember how a machine set is gonna describe a bunch of machines, and it's gonna have the instance type, the IP, and the DNS, right? So the instance type we already know because it was given to us by the machine setup, but the IP and the DNS we need to figure out. And so that's what we're gonna do next. So here, we're gonna ask EC2 to describe all of the instances that belong to us. So we're gonna have like machines. In fact, so remember that this is already inside of a, ooh, so remember how we have descriptors for each name. So we have like the server machines, the client machines. "I like to use version specifiers." So, okay, so the proposal here is that I should do this. This should not be necessary. In general, if you do this, cargo will do the right thing. It will pick a newer version as long as it's semver compatible with what I've given. So it should not be necessary to do this. Yeah, so this is gonna launch all of the instances, but remember that we sort of want to tie each instance to a name, where the name is what the user gave to that particular set. Now there are a couple of ways we could do this. We could start all the instances for each name separately, but I don't actually want to do that, I think. So I think what we're gonna do is be a little bit lazy. That's also really unfortunate. That's a good question. Okay, let's write the code first and then deal with it. So we are going to loop over all, or we're gonna ask EC2 to describe all of our instances. So no longer need this, no longer need this, no longer need this, or this, or this, or this. Great. So many fewer tabs. Fantastic. Or this. Great. "We've seen that Rusoto is deprecated." Really? That's interesting. The claim is that Rusoto is deprecated. Oh yeah. What are they intending?
The crategen release. Although this is sort of already what we're doing. Oh, so I bet you the only change I really need to do is this. Yeah, that seems all right. Seems fine. Yeah, that's fine. We're gonna go through these errors later. Who needs their programs to compile? We just wanna write them. Yeah, this all seems pretty straightforward. Great. So back to this. So EC2 has now spun up a bunch of instances for us, and now we want to get their IP addresses and their DNS names. So we use this describe instances method and, guess what, it takes as input a DescribeInstancesRequest struct and returns a DescribeInstancesResult struct. We know how to use these. So we're gonna do, let me, right. Actually, yeah, sure, why not. One of these, let's default, and then it takes instance IDs. Oh, how lucky, we already have that in instances because we collected those up here. Great. So now we're gonna do ec2.describe_instances and we're gonna give it a req. We're gonna unwrap the results for now. Again, that will change. And this gives us a result, and the result has this next token. So this is for pagination, if you were to list lots and lots and lots of instances, but we only care about the reservations that we're given. So we're gonna do for reservation in, oh. Great. So that gives us a bunch of reservations, and we know that there will be some instances. We could technically assert that the number of instances we requested in a spot request is greater than zero, but I think the API would error anyway. So what we're really iterating over here is all of these reservations, and then hey, look, a reservation can have multiple instances. So for instance in reservation.instances, so many unwraps. I agree with you, the unwraps should go away; they will. All right, so that finally is gonna give us an instance.
So now for all of the instances we've spawned, we now have access to one of these instance things, and let's see what we can find here. So it has things like image ID, which is the AMI that we used, instance type, key names and whatnot, but crucially, public DNS names, public IP addresses, private DNS names, and private IP addresses. So we have a couple of choices here of how we want to expose this to the users. So remember that the run closure is given a description of a machine set. Yeah, failure would be good, but we can't really use failure until, well, we could, but not until we know what's gonna fail and how. All right, so we need to figure out what we want to expose here. I'm gonna make this very opinionated and say that we're gonna expose the private IP and the public DNS, because those are in general the most useful things. Your clients, when connecting to your server, might want to use the private IP, whereas when you SSH to a machine you probably want to use the public DNS. So we're gonna do that. We are now gonna construct these machines that we're gonna put in here. So for every instance, we're gonna create a machine, and initially the machine is not gonna have an SSH connection, so let's just make that an Option. So it's gonna have no SSH connection, it's gonna have instance type, which is gonna be the instance type, which of course is an Option because everything in AWS is. "TODO: remove unwraps", two years later. Yeah, you're totally right, this totally happens. Also the really frustrating thing with the way this API is currently structured is that there are a bunch of things that are in Options that can never be None in the regular operation of the API. Like you would get an error or you get a Some, but you never get an Ok(None). So this is, I think, one of the sad things about using crates that are sort of auto-generated: they can't really deal with this.
Like, they're decoding JSON replies, so they have to assume that everything could be None, right? All right, so we want the private IP address. So private IP is gonna be instance dot this dot unwrap, and public DNS is gonna be instance dot public DNS name dot unwrap, but I don't know. So there are many things that are a little bit sad here. First of all, an instance could have been started but not yet been assigned a DNS name. I only know this because I experienced this firsthand. So what we're actually going to do is, if this is None or if this is None, then we are going to, ooh, it's a good question. We're gonna loop for it. Again, all of this is gonna get a little bit better with error handling. Specifically, we're gonna do something like, if no errors happened, then, so we're gonna do machines.clear(). I'm gonna do this, then continue. any_not_ready, right. So now down here, technically there are a bunch of ways we could do this, so we could avoid the unwraps by doing a match over instance. So we can do this and then say that if it is one of these, gee, this is probably the nicest way to do it, where private IP address is Some(ip) and public DNS name is Some(dns). We don't care about the other fields. Then we're gonna do this. Otherwise, then we know that one of them is not ready. Kind of neat, although apparently it doesn't compile. We're gonna do machines.push(machine). Then this now becomes, ooh, I guess actually we want Some(instance_type), and now look at how pretty this can be. Wow. All right, fantastic. And now this is gonna be while any_not_ready. Exactly how we structure this is not terribly important. But so the intuition here is we're gonna keep sort of probing for all of the instances, and then only when all of the instances are fully up and they all have a private IP and they all have a public DNS, only at that point do we end up breaking from this loop. So at that point, any_not_ready is gonna be false.
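The match over instance described above, which only accepts an instance once all the needed fields are present, might look like this. The Instance and Machine structs are simplified stand-ins (the optional SSH connection on the machine is left out), and try_collect_machines is a hypothetical helper that covers one pass of the polling loop.

```rust
/// Simplified stand-in for the described instance; everything is optional,
/// as in the generated API types. The real type has many more fields,
/// which a pattern would skip with `..`.
#[derive(Debug, Clone, Default)]
pub struct Instance {
    pub instance_type: Option<String>,
    pub private_ip_address: Option<String>,
    pub public_dns_name: Option<String>,
}

/// What gets exposed to the user once an instance is fully up.
#[derive(Debug, Clone, PartialEq)]
pub struct Machine {
    pub instance_type: String,
    pub private_ip: String,
    pub public_dns: String,
}

/// One polling pass: returns `None` if any instance is not ready yet,
/// so the caller can clear its state and ask again.
pub fn try_collect_machines(instances: Vec<Instance>) -> Option<Vec<Machine>> {
    let mut machines = Vec::new();
    for instance in instances {
        match instance {
            // Destructure the struct and require every field to be `Some`.
            Instance {
                instance_type: Some(instance_type),
                private_ip_address: Some(private_ip),
                public_dns_name: Some(public_dns),
            } => machines.push(Machine {
                instance_type,
                private_ip,
                public_dns,
            }),
            // At least one field is missing: this instance is not ready.
            _ => return None,
        }
    }
    Some(machines)
}
```

Returning an Option instead of tracking a flag is a small departure from the flag-based loop in the stream; the pattern match itself is the part being illustrated.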
This negation is sad. all_ready is false; while not all_ready; all_ready is equal to true. So until all the machines are ready, we're just gonna keep iterating through this loop. Notice at the top, we clear all the machines out. And so we're not gonna get here until all of the machines are ready. So now all the machines are ready. Now we need to connect to them and run the, all right. So now we're getting to this point where we need to run the setup closure. So remember that for every machine, the user got to write this closure that's gonna get to execute some commands on the machine that we just booted up. So we're gonna establish an SSH connection to each of the machines. And then once we do, run the user's closure. So now we get into using this other library, which is gonna give us an SSH connection. There are multiple bindings for SSH. They're all varying degrees of uncomfortable to use. In this particular case, ooh, how am I gonna do that? I'm gonna cheat a little bit, and I'm gonna set up a security group and a key pair. Oh, great, I already have a key pair. And the security group: create a security group, I'm gonna call it everywhere. Actually, let's not do that, maybe that's a bad idea. I want "My IP". Add rule. We're gonna allow SSH from my IP address. My SSH, great. Okay, so what I did here is basically, whenever you set up a new machine on Amazon, you both have to give it an SSH key that it should allow access to by default, and you need to set up sort of firewall rules for that machine. And in this case, I'm allowing myself remote access through SSH. Technically, we could make the crate do this for us, and we might do that in a later part of this. But for now, we're just gonna not worry too much about this. So we're gonna do: key name is gonna be X1C, security groups is gonna be low. So later, we're gonna want to, remember, this security group is just something I set up manually and it only contains my IP address.
That's not particularly helpful. Ideally, what we want is that the library should figure out what this machine's IP address is. "Add a sleep between the describes." Yeah, I mean, that's sort of true. Although remember, we're describing a lot of these. I have not generally found this to be a problem, but you're totally right. Yeah, so these being hard-coded is really bad, because it's unlikely that anyone else has a security group called `hello` and a key called `x1c` that actually match the setup they want to use. So this is something we're gonna have to deal with, but just so we can test the rest of the code, I'm gonna leave this in for now. All right, so now we're gonna SSH to all of the machines. Remember, we now have this set of machines. So we're gonna do `for machine in`, I guess, `machines`. Notice here that instead of `machines.iter()`, which some people write, you can just write `&machines`; they're the same. This is because Rust implements `IntoIterator` for `&Vec` as the same thing as calling `.iter()`. Clippy is a great tool that will warn you about all these things. So if you haven't tried Clippy, you should definitely try Clippy. `cargo clippy`, that is, not the Word Clippy, that's useless. Okay, so for every machine, we're gonna have to connect to it. So let's see how that would work. SSH, how do we connect to something? Well, we're gonna need these things. Back to the top. And in our `Cargo.toml`, we're gonna need `ssh2` at version 0.3.3. Great, and we're gonna `extern crate ssh2`. So down here, we're gonna establish a connection to the server, right? So we're gonna do this. Again, copy-pasting code, it's great. Notice there's another unwrap here. Now, we do actually know where we want to connect, which is nice. So in this case, we're gonna connect to the public DNS name of the machine, port 22, so `machine.public_dns`, right? And in fact, it's gonna be, yes, we're gonna connect to that and establish a session. So this is an SSH session.
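Going back to the point about looping over a reference to a vector versus calling `.iter()`, here's a tiny sketch showing that the two loops visit the same items. The function names and the host names are made up for the illustration.

```rust
// Explicit version: call `.iter()` on the vector.
fn names_with_iter(machines: &Vec<String>) -> Vec<&str> {
    let mut seen = Vec::new();
    for machine in machines.iter() {
        seen.push(machine.as_str());
    }
    seen
}

// Shorter version: `&Vec<T>` implements `IntoIterator`, yielding `&T` items,
// so the `for` loop can take the reference directly.
fn names_with_ref(machines: &Vec<String>) -> Vec<&str> {
    let mut seen = Vec::new();
    for machine in machines {
        seen.push(machine.as_str());
    }
    seen
}

fn main() {
    let machines = vec!["a.example".to_string(), "b.example".to_string()];
    // With an owned vector in scope, the second form is spelled `&machines`.
    for machine in &machines {
        println!("would connect to {}:22", machine);
    }
    assert_eq!(names_with_iter(&machines), names_with_ref(&machines));
}
```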
The way the SSH library is set up is that you establish a TCP connection and an SSH session object, and then you tell the SSH session object to do a handshake over the given connection. Why it's structured that way, I'm not entirely sure. It seems a little bit unfortunate, but it's probably because this is the closest to the underlying libssh2 bindings. All right, so the handshake just establishes an SSH connection, but then of course we need to authenticate to that machine. Now, we do know that for all the, you can try this. For all EC2, for the Amazon Linux 2 image that we're using, the default SSH username is `ec2-user`, so we're just gonna use that. In this particular case, I'm saying use my SSH agent and try to connect as this user. I don't know if this is gonna work. We're gonna find out. Here again, what we probably want to do, and I'm gonna note that as a to-do here, is not require the users of our library to run an SSH agent. Instead, what we'd probably do is tell Amazon to generate a key pair for us, take the private key that Amazon gives us and store it in a file, and then we can use, where is this, `session.userauth_pubkey_file`, and give it the private key that Amazon gave us. That way we guarantee that we can always connect to the VM. I'm gonna skip that part for now as well, just because skipping it gets us to somewhere where we can actually try to run this code and see if anything happens. All right, so in theory, we're now gonna be authenticated. We can check this with, is `authenticated` somewhere? Yeah, `authenticated`. I'm gonna not even do that at this point. All right, so let's see how we run a command. Notice that this is already what we're doing. We're connecting to the Amazon machine, we're establishing a session, authenticating as the user, and then SSH has this notion of channels. So over a single SSH connection, you can open multiple channels, and you can sort of think of them as independent shells or independent sessions to the server.
So for example, we could run `ls` in one and, like, `top` in the other and `tail` in a third. In this case, all we're going to do is `ls`. Remember, really what we're gonna do here is run the closure that the user defined for this particular machine, but this is just to see that things are working. In fact, maybe what we do is `ip addr`. Sure, why not? So we're just gonna show the interfaces that are up on each machine. We're gonna read the output of that command, then we're gonna print it, and then we're gonna just close the session. Again, this is just for our own sanity, to see that the machines indeed booted up and that we can run commands on them. Here, in theory, we'll run the run command, and then of course we get to the point where we're gonna terminate all the instances. So down here, we're gonna do a similar kind of thing where we iterate over all the machines and call into the EC2 API. So of course there's also a `terminate_instances`. So we're gonna terminate all the instances that we started, which sort of completes the whole life cycle here. So in this case, surprise, it takes a terminate instances request object. How convenient. Ooh, we can kill multiple ones at once. Great. So it implements `Default`, and `req.instance_ids` is gonna be equal to, where did we use these last? So this can also be shared. So this we can move out here. Right, so up here where we're describing all the instances to get all the IP addresses out, we sort of kept all the instance IDs we wanna look at. So we now wanna extract those out of here and say this is gonna be our termination request. And for that we're gonna take from our describe request up here. Let's give it a better name. So we're gonna just reuse the IDs from the describe request that we had. So we don't need a for loop anymore. And now we just want EC2 to terminate the instances.
All right, so in theory, this should now issue the spot request which starts up all the instances, wait for all the instances to have started, SSH to all of them, and see that we can run this particular command. I guess we can make it a less verbose command. `cat` something in `/etc`. And then it's gonna terminate all the instances again. So currently it doesn't really do anything interesting. In particular, it doesn't run any of the user's closures. But in theory, we should at least get sort of the right kind of stuff out. So we're gonna make an example. `examples`, gonna give it a better name later, as we always promise. All right, let's see what happens if we now try to compile this library. It's probably gonna break. Seems unlikely that I wrote everything right on the first try. But notice how we now have this code, which we wrote, remember, in the very, very beginning, for how we want the library to look. All of that code we've written, in theory, should just run. We shouldn't need any modifications to this code, because we've decided what the API is gonna be like. Whereas all this code that we wrote in here, in theory at least should, cannot find `req` in scope, line 76. That's because this should say `launch` and this should say `launch`, line 24. Machine setup takes a path, it doesn't need. "Are you German?" I am not, I am Norwegian. But close call, you almost did it. Line 38, yeah, so, oh, that's interesting. Okay, so this is an error that you're gonna run across a lot in Rust, where, notice how it's saying that we need to give a type argument for `MachineSetup` here, right? So remember, when you use this builder, you're setting up a bunch of machine setups. But notice that `MachineSetup` is parameterized by `F`, the type of the setup function we're gonna call. And we don't really want to put an `F` here, right?
We could do this, but if we did, we'd actually be forcing all of the machine setups to have the same closure type, which is not really what we want. So instead, what we're going to do is we're gonna have this, ooh, this is gonna solve some other problems too. We're gonna have this store a `Box` of this. So what this is gonna do is dynamic dispatch, which is fine because we're not particularly performance sensitive here. But now this is just gonna store function pointers, essentially, for the functions that we're gonna call when setting up this particular machine setup, which means that it no longer needs to be generic over `F`, which means that this no longer needs to be generic over `F`. This, well, this still does, but `setup` is gonna be a `Box` of setup. Does that roughly make sense? "That's one more reason to use this, as it's written in Rust." In a sense, it's a little bit like Ansible. However, at the same time, we're not really trying to replace Ansible here. You can totally imagine that the commands you run here in setup are the Ansible setup commands, like install Ansible, and then this is gonna run Ansible locally to set them all up. This is more about spinning up all these spot instances and then tearing them down, which I don't think Ansible does for you, but I may be wrong. Another big difference here is that this is intended for relatively short jobs, right? So the intention is that you want to spin up something, run it to completion, and then tear it all down, rather than Ansible, which is sort of more for managing deployments of servers over longer periods of time. But thanks. It is pretty cool. All right, so `new` is gonna take some kind of function that we're gonna use to do setup. We're gonna box it, so the machine setup does not have to be generic. And this means that now we're allowed to store a list of them without forcing them all to have the same closure type. And now let's see what happens.
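To show what boxing the setup closure buys us, here's a minimal sketch. The `Session` type is a made-up stand-in for an SSH session, and the AMI string is a placeholder, so this is an assumption-laden illustration of the pattern and not the crate's actual code.

```rust
use std::io;

// Hypothetical stand-in for an SSH session: it just records the commands it was given.
struct Session {
    log: Vec<String>,
}

impl Session {
    fn cmd(&mut self, command: &str) -> io::Result<()> {
        self.log.push(command.to_string());
        Ok(())
    }
}

// Not generic: the closure lives behind a trait object, so setups with
// different closure types can sit in the same Vec.
struct MachineSetup {
    instance_type: String,
    ami: String,
    setup: Box<dyn Fn(&mut Session) -> io::Result<()>>,
}

impl MachineSetup {
    // Only `new` is generic. The `'static` bound is there because a boxed
    // trait object defaults to `'static`, so the closure cannot borrow from
    // a shorter-lived scope.
    fn new<F>(instance_type: &str, ami: &str, setup: F) -> Self
    where
        F: Fn(&mut Session) -> io::Result<()> + 'static,
    {
        MachineSetup {
            instance_type: instance_type.to_string(),
            ami: ami.to_string(),
            setup: Box::new(setup),
        }
    }
}

fn main() {
    // Two different closures, hence two different types, in one list.
    let setups = vec![
        MachineSetup::new("m5.large", "ami-placeholder", |s| s.cmd("install server")),
        MachineSetup::new("m5.large", "ami-placeholder", |s| {
            s.cmd("install client")?;
            s.cmd("warm up")
        }),
    ];
    let mut session = Session { log: Vec::new() };
    for m in &setups {
        println!("setting up a {} from {}", m.instance_type, m.ami);
        // Dynamic dispatch through the box.
        (m.setup)(&mut session).unwrap();
    }
    println!("{:?}", session.log);
}
```

The cost is one heap allocation and an indirect call per setup, which should not matter when each call is followed by seconds of SSH traffic.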
Ooh, errors. Yeah, so here, oh, it's also a little bit unfortunate, but it's fine. So the `new` method takes a closure and returns a `Self`, but it puts the closure into the thing that's returned. "Yeah, I think so too." I think so too. So it returns a `Self`, but that `Self` contains the closure that was given, which means that we need to indicate that this `F` is not, so imagine that someone called `new` and gave it a closure that borrowed something, like this, right? Ended up being more than I wanted to type, but if some users of our library did this, which they're actually allowed to do given the current signature, then the returned value would contain something that's on the stack of the caller, which means that the `Self` we return would need to have a lifetime that's tied to the input variable. We could do that and enable some more interesting types of closures. I'm gonna just not do that for now and instead require that `F` is `'static`. So this means `F` here is not allowed to borrow anything from some shorter scope. "Do you use Rust at work?" I do. So I do research at a university, and so I sort of get to pick my own language for the projects that I work on, and the current project I'm working on is entirely built in Rust. It's about 40,000 lines of code now, so it's grown a lot, but it's pretty fun. All right, let's see what happens here. What's it complaining about now? No `simple` found. Oh, that's too bad. We're not allowed to use `simple` anymore. Rusoto EC2, `Ec2Client`. This one, why am I not allowed to use client? That sounds like a lie. But okay, I guess we're gonna have to do this `new` business. Luckily, I cheated and I have that line somewhere. So I'm gonna just magically pull that out of nowhere, and then I'm gonna also magically pull this out of nowhere, like this. No one saw that. Oh, we're getting some other errors. That sounds good. Line 81. This needs to be a `String`. Yes, indeed, it does.
This also needs to be a `String`. Ooh, that's right. So another thing that's sort of interesting about the EC2 API is that `Ec2Client` only has these two methods, and all of the other things are in a trait that's implemented on that client. So we need to `use` the trait in order to be able to call any of its methods. So we're gonna do that here. Getting closer. "Did you mean `spot_instance_request_ids`?" Yes, compiler, I did. Thank you very much. Expected `Option`. See, so some of these take options, some of them do not. It's great fun. We all love it for it. Which means we're gonna have to wrap it there. Oh, it's getting pretty close. `read_to_string`. Yes, indeed. We do need that in here. That looks awfully like it's gonna compile pretty soon. That's not bad. It's not bad. Written a lot of code and, well, "value moved here". I don't know if you caught that, but. So here, this any-open check only really needs to look at the individual requests; it doesn't need to move anything out of them, right? It's just checking whether their state is open. However, if you unwrap an `Option`, it's gonna, well, move out of that `Option`, so we no longer have access to the underlying object. So you have this `as_ref` method on `Option`, which turns a reference to an `Option` of a value into an `Option` of a reference to that value. And so now when we unwrap it, we just get a reference to the thing that was inside, and we don't actually move out of it. Private type in public interface, we're gonna deal with that later. Line 108, cannot move out of borrowed content. Yeah, so this is the same thing. We're not allowed to unwrap, so we have to `as_ref`. It's pretty exciting. Lots of warnings. We're gonna deal with those later on. But now let's see if our example works. Compile times are sadly one of those things that Rust needs to get better at. Progress is happening, slowly but surely. While we're waiting for the compiler to run: does the rough API that we have sort of make sense, like why this is roughly what we wanna do and the way we've gone about doing it?
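Here's a small sketch of that `as_ref` trick, in case the move-out-of-borrowed error is new to you. The `SpotRequest` struct is a made-up stand-in with a single optional field; the real request type has many more.

```rust
// Hypothetical stand-in for a spot instance request with an optional state.
struct SpotRequest {
    state: Option<String>,
}

// Checks whether any request is still open without consuming the requests.
fn any_open(requests: &[SpotRequest]) -> bool {
    requests.iter().any(|req| {
        // `req.state.unwrap()` would try to move the String out of a borrowed
        // request and fail to compile. `as_ref()` turns `&Option<String>` into
        // `Option<&String>`, so unwrapping only hands back a reference.
        req.state.as_ref().unwrap().as_str() == "open"
    })
}

fn main() {
    let requests = vec![
        SpotRequest { state: Some("active".to_string()) },
        SpotRequest { state: Some("open".to_string()) },
    ];
    println!("any open: {}", any_open(&requests));
    // The requests are still fully usable afterwards, since nothing was moved out.
    println!("first state: {:?}", requests[0].state);
}
```

The unwrap still panics on a `None` state, same as in the stream; only the ownership problem goes away.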
Anything that you would change, or that you think is missing? VPC, mm, yeah, VPC is a good point. So VPC is EC2's notion of, like, a virtual private network that you can set up. And one thing that would make a lot of sense is for the library to set up a VPC for all the machines that you spawn, so that they're spawned in the same VPC. In fact, that's a good idea, let's put that in. The other question is whether we want that to happen automatically. So I think here we're gonna want to do VPC. It's sort of related to the same kind of thing: if we were to polish this crate more, we would want to have it do more setup of the context in which we launch these VMs. So that would be launching them in the same VPC, maybe launching them in the same availability zone, setting up the keys for you, setting up the security groups for you. So the other thing that we're gonna need is the ability for the virtual machines to all talk to each other. In fact, maybe that's a good call, I should do that now. So normally the firewall for EC2 instances basically blocks everything, and nothing is allowed to talk to anyone. And realistically, we at least want the VMs that we spin up together to be able to talk to one another. So we're gonna use this `hello` security group that we set up. I'm gonna abuse it a little to, ooh. It's a lot of CPU usage. There we go. Great. EC2. Yeah, so by default, the VMs can't actually talk to each other. And so we're gonna modify the security group, this `hello` group that we set up, and we're gonna edit the inbound rules. So normally by default, all VMs can make outbound traffic, but none of them can accept any inbound traffic. This is why we added the rule for accepting SSH connections from me. And then we're also gonna allow them to accept incoming connections from any other machine. I'm gonna do TCP, can I do, like, all ports? Well, TCP. That's so unhelpful. It's hijacking my right click.
So we're gonna say local traffic, local TCP. We're gonna also do the same for UDP, although I'm not imagining we're gonna actually send much UDP traffic over this. In fact, I sort of wanna just allow everything. Sure, why not? Oh, SSH, can I do that? SSH is zero. I mean, remember, this is still running an SSH server. So even though the port is public, it does not actually mean that anyone can connect. So we're just gonna do that for simplicity for now. Again, realistically, the library would actually create a key pair, create a security group, and only allow connections from the IP of the thing that's running the library. All right, so here, right. So our example is of course gonna have to `extern crate` our crate, and then it's gonna use `BurstBuilder`. "Stats and metrics." Yeah, stats and metrics would be really nice. In fact, ideally what you want is for `run` to return information about the run it just had, which is information it can pretty easily extract. You could even imagine that it ran some commands on the servers, or on the VMs, before tearing them down, right? `MachineSetup`, I missed something. No `MachineSet`, because this we ended up making a `Vec` of `Machine`. This is `Machine`, and then it's gonna complain that all of these things need to be `String`s, and I really don't wanna force the user to do that. So now we're gonna tidy up our API a little bit, and we're gonna say that we're gonna take a string, and it's gonna be anything that's `Into<Cow<'static, str>>`. That's a good question. Actually, you know what, let's just do this. Take a `&str`. I actually still don't know what the best way to accept string arguments is. There are lots of different ways using traits. "Return JSON to be processed by some other script." Yeah, exactly.
So it would be really nice if you had `run` or something else just gather up a bunch of statistics about the run and then extract them, because then you could imagine that you run a benchmark using this tool, and then some other tool consumes the output and plots nice charts and whatnot. Of course, the user could do this manually in the run method. It would be nice if the library did this for them. All right, `MachineSetup::new` needs to do the same thing. Yeah, so what I was saying was, it's not entirely clear how you take something that's either a `String` or a `&str`. You get into some slightly unfortunate, oh no, business with traits. I don't actually know what the best way to do this is. I've seen some places use anything that can be borrowed as a `str`. I've seen anything that can be turned into a `Cow`, or anything that implements `ToString`, but that might be too general. And also `String` implements `ToString`, but then you end up cloning the string, and you don't really wanna do that if you already have a `String`. So I think `Cow` is the right thing to do, but then you need to add an extra lifetime. It becomes a little unpleasant. So we're gonna just take `&str`s for now. "Consider removing the semicolon." Oh right, this won't actually work yet. So we're just gonna return `Ok` here, `Ok` here, and to do any of this, yeah, I need a `HashMap`. I feel like every Rust file I have starts with a `use std::collections::HashMap`. `BurstBuilder` is private. `BurstBuilder` should not be private. `Machine` should also not be private. Takes zero parameters, false. It's gonna take an `F`, found signature. Oh right, that's it, good slices. All right, let's see what happens. Expected a result, yes, this is the same thing, because we want the user's closure to be able to fail. What's that? "I know that it might start a war, but have you tried using Spacemacs with Vim bindings?"
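To make the string-argument trade-off a bit more tangible, here's a small sketch comparing two of the options mentioned: taking a plain `&str`, and taking anything that converts into a `Cow<'static, str>`. The function names are made up, and this is only one way to lay it out.

```rust
use std::borrow::Cow;

// Simplest option: take a `&str` and always allocate a new String.
fn name_from_str(name: &str) -> String {
    name.to_string()
}

// Alternative: accept anything that converts into a `Cow<'static, str>`.
// A string literal is stored as a borrow with no allocation, and an owned
// String is moved in without being cloned.
fn name_from_cow<S>(name: S) -> Cow<'static, str>
where
    S: Into<Cow<'static, str>>,
{
    name.into()
}

fn main() {
    let owned = String::from("server");

    println!("{}", name_from_str("client"));
    // With `&str`, a caller who already has a String borrows it, and the
    // function copies it anyway.
    println!("{}", name_from_str(&owned));

    // With `Cow`, the literal stays borrowed and the String is moved in.
    println!("{:?}", name_from_cow("client"));
    println!("{:?}", name_from_cow(owned));
}
```

The `'static` in the bound means a borrowed argument has to be a literal or otherwise live for the whole program; supporting shorter borrows is where the extra lifetime parameter would come in.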
I have not. I have a friend who used it, and after some time he decided that Spacemacs was really annoying because it pulls in like a million dependencies. I've actually been super happy, so I'm not really using Vim, I'm using Neovim, although it is based on the same code base. And I've actually found it to be really, really nice to work with. I also basically live in the terminal, so I don't want to have new windows spawned, which Emacs generally does. You can run it in the terminal, but it's not as good in the terminal. SSH connection is private, I should have just made this `pub`. And what else is private? What do you think of this private? Great, do we think it's gonna run? Do you think it's gonna spawn anything? It's gonna be pretty interesting. Looks like it's compiling. Oh, it crashed, great. Yeah, so here it's complaining that it doesn't have my secret key, so I'm gonna just do this, and then it's gonna, let's do this. I mean, it's definitely accurate, I do live in the terminal, it's not far from it. Missing parameter: the request must contain a launch specification. That is totally true, didn't we specify one? Oh, `req.launch_specification` is `Some`. So I just want to point out that normally crates are not quite as painful to use as Rusoto is, in that all of these `Some`s should be unnecessary. I sort of want someone to build a nicer library on top of the EC2 bindings for us. Arguably, that's what we're doing, but for a very limited scope. And part of the reason no one has is because it's a pain, it's a lot of work. Oh, it's doing something. Max spot instance count exceeded. That sounds terrifying. "Is that the terminal?" Yes, this is Alacritty. Alacritty, I don't know how it's pronounced. I've also been really happy with it. Max spot instance count exceeded. Oh, that's kind of interesting. I may need to, no, false. Oh, yeah, no, I've been super happy with it. It works perfectly fine. Why does Amazon not let me spawn spot instances? Am I allowed to spawn one instance?
That's the real question. Let's see if my spot request even went through here. You currently have no spot requests. That's fine. Huh, is it not letting me do this? Let's see. "Your limit may be lower than 20 to start." Let me, I'm not allowed to spawn one. That seems like it's false. How about now? Oh, it looks like it's running something more. I guess arguably we should have added some logging. Would not have been a bad idea. Let's see if it now gives me some spot requests. Ooh, how about that? Capacity not available. That's probably because, so in general, the better the instance type, the more likely it is that you'll get a spot instance. So I'm gonna go all out here and look at the spot instance types. EC2 spot instance types. Here, we're gonna go to pricing. Here, we're gonna look at, sure, why not this one? Nope, that's not a link. That's a lie. Why does it think that's a link? All right, we're gonna do an `m5.large`. `m5.large`. We're gonna see if it then spawns one for me. Although notice that already, it is adding spot requests. So that's a cool start. Active, fulfilled. Ooh, do we have an instance? We do have an instance. Let's see if I can SSH to that manually. Takes a little while to boot up anyway. So let's see if it generally set the right things. It booted as an `m5.large`, so that's good. It's attached to the Hello World security group. It has my key. All right, and it's set to be a spot instance. Okay, that seems promising. But am I allowed to SSH to it? That's the critical next question. I mean, I think that the .05 dollars for an hour is gonna be fine. Also, Amazon does bill per minute now, rather than per hour. And so you don't, ee, look at that. Ah, the program got further. Pooled stream disconnected, but hey, look at that. That's a host name right there. Great, so this unwrap happened somewhere else. So where did it actually fail? So it did succeed in SSHing to the machine and doing the print that we wanted it to do.
So this is a great start. And now, let's see here. At `src/lib.rs`, line 190. All right, let's see what happens at `src/lib.rs` line 190. Oh, so the termination failed. Yeah, okay, so I've run into this when I've written code using Rusoto before. Basically, the Amazon API only gives you access tokens for a very short amount of time. So if you try to issue another request, it tells you that you need to reauthenticate. So this is gonna seem a little stupid, but while we get an error, look, we got rid of an unwrap. Yay. So here, we're gonna just keep doing it while there's an error. And then we're gonna do, okay, you're all gonna hate me for this, but bear with me, because the only errors that the API exposes to us, the Rusoto API that is, are, so the errors for all this API stuff are, ooh, maybe not. Maybe this is better than I remember. Terminate instances. So the terminate instances error is generally just gonna give us, like, a credentials error, which is just a message, which is entirely unhelpful. Yeah, so the credentials error is just a message. So we can't actually check what type of error it is. So instead, we're gonna do this really, really ugly hack. And you're all gonna hate me for it, but it's fine. I know, I know, I know, I know. You see how bad that is? You see how ugly this hack is, but it's fine. All right, let's do it again. And have it not, ooh. So this is one of those cases where a library does not expose errors in the best way, and then you're forced to do stuff like this on top of it. It's a little bit sad. We'll get better error management later, I promise. Cannot infer type for `T`. Oh yeah, fine. We're just gonna panic, because in theory we shouldn't get there unless there's an actual error. "It's a real life situation." It's true. It's good if the hacks are educational. I mean, I have been through this enough times that I know how to hack around things.
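For the curious, the shape of that retry hack is roughly the following. This is a sketch under assumptions: the `retry_while_disconnected` helper is made up, the operation is a plain closure returning a `String` error rather than a real API call, and the matched message is the "Pooled stream disconnected" text we saw in the output.

```rust
// Retries `op` for as long as it fails with the transient error message,
// and panics on any other error. Matching on the message text is the ugly
// part: the error only gives us a string, so there is no variant to match on.
fn retry_while_disconnected<T, F>(mut op: F) -> T
where
    F: FnMut() -> Result<T, String>,
{
    loop {
        match op() {
            Ok(value) => return value,
            Err(msg) if msg.contains("Pooled stream disconnected") => continue,
            Err(msg) => panic!("unexpected error: {}", msg),
        }
    }
}

fn main() {
    let mut attempts = 0;
    // Stand-in for the terminate call: fails twice with the transient error.
    let terminated = retry_while_disconnected(|| {
        attempts += 1;
        if attempts < 3 {
            Err("Pooled stream disconnected".to_string())
        } else {
            Ok(vec!["i-placeholder".to_string()])
        }
    });
    println!("terminated {:?} after {} attempts", terminated, attempts);
}
```

A less hacky version would cap the number of attempts so that a permanently broken connection cannot spin forever.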
The biggest problem, as someone pointed out earlier, is when you come back to it two years later and think, I probably should have fixed those sins. "Stringly typed errors", yeah, that's right. I mean, a panic is a fine way to error, right? It's like, all right, let's see. So our spot requests, ooh, capacity. No, that's the old one. This one we can just get rid of, cancel. Well, I guess we're running without release mode, which is probably part of the reason why it's being slow. Or maybe my internet is being sad about me uploading 1080p video. Is the audio and video good, by the way? I should probably have asked before I started rather than now, but I'll excuse myself. How about that? Isn't that beautiful? Look, no crashes. We just swallowed that pooled stream error like it was nothing. Oh, great, glad to hear it. All right, so now we're gonna have to actually execute this user code, right? We're gonna have to pass the SSH connection to setup, and we're gonna have to pass the machine stuff to run. We're also gonna have to figure out how to divide these into the same categories the user originally gave us. Because remember, we sort of flattened everything here, right? We just looped over all the descriptors, and then this name just gets dropped. And in fact, the compiler complains about this. It says the name is an unused variable, because we just flattened all the requests into one. So we're gonna have to deal with that at some point. But let's go with, ooh, let's also shut this down so that I don't pay all the money. I mean, it'll be fine, but, instance, terminate. So another thing that we probably want to do is, I think you can wait too, it's fine. So one thing that we want to do is, if the program panics, we sort of still want to tear down the instances that we've set up.
Now, I'm gonna deal with this slightly later, but notice that if any of the code in between here panics, in particular if any of the code here panics, we still want to execute the termination at the bottom. One way you can do this is to create something further up that then drops all of those at the bottom, right? Yeah, so here it terminated because it got to this piece of the code, right? Which terminates. But imagine that the code here, for example, panics; then it won't execute this code at all. Therefore, our instances won't be terminated. So the way to do this is to use, what's this called, `scopeguard`, `scopeguard`. This thing. So this is essentially similar to the Go keyword `defer`. And the way it does this in Rust is that it creates an object that contains a closure, and that closure is called when the object is dropped, which also happens when you panic. Yeah, the one where it panicked, I terminated it manually, that's right. Also, yeah, yeah, yeah, that's exactly what happened. And so we sort of want it to be better than that, right? So we're gonna end up using this crate later. So let's see here. The things that we're missing are that we need to call the setup closure and we need to call the run closure. We're gonna start with the setup, because now we have SSH, so that's nice. So we're establishing all these connections and ideally, we sort of wanna tag the, how are we gonna do this? I think we're gonna do this, because we're gonna have to store this SSH connection, right? So this is gonna be an `ssh2::Session`. Now, are we even allowed to do that? Is this `Session`? Yeah, okay, so notice how when you do a handshake, it does not take ownership of the socket provided. So we need to ensure that the socket persists, because otherwise, if the TCP socket was closed, the session would not automatically disappear.
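Here's a hand-rolled sketch of the idea behind that kind of guard: an object whose `Drop` runs a closure, so cleanup happens on a normal return and during a panic. The `Defer` type and `run_job` function are made up for this illustration and written against the standard library only; they are not the scopeguard crate's API.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

// Holds a closure and runs it exactly once when the guard is dropped,
// which also happens while a panic unwinds the stack.
struct Defer<F: FnOnce()>(Option<F>);

impl<F: FnOnce()> Drop for Defer<F> {
    fn drop(&mut self) {
        if let Some(cleanup) = self.0.take() {
            cleanup();
        }
    }
}

fn defer<F: FnOnce()>(cleanup: F) -> Defer<F> {
    Defer(Some(cleanup))
}

// Stand-in for the launch / setup / run sequence. Setting `terminated` plays
// the role of the terminate-instances call at the bottom.
fn run_job(should_panic: bool, terminated: &AtomicBool) {
    let _guard = defer(|| terminated.store(true, Ordering::SeqCst));
    if should_panic {
        panic!("setup closure failed");
    }
    // Normal return: `_guard` is dropped here and the cleanup still runs.
}

fn main() {
    let terminated = AtomicBool::new(false);
    // Catch the panic so this example can keep going and inspect the flag.
    let result = panic::catch_unwind(panic::AssertUnwindSafe(|| run_job(true, &terminated)));
    println!(
        "panicked: {}, cleanup ran: {}",
        result.is_err(),
        terminated.load(Ordering::SeqCst)
    );
}
```

This only helps when panics unwind; if the binary is built with `panic = "abort"`, drop code does not run and the instances would still leak.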
Normally you don't see this in Rust land, because there would be a lifetime connection between the two, but for various reasons, this is not the case with `ssh2`. So there are some ways we can work around this. And the way I'm gonna propose that we work around it is that we add another module called `ssh`. And we're gonna have this include an SSH session. Now, I'm gonna show you a trick. So we're gonna have a struct called `Session`. Oh, that's too many S's. So we're gonna use `ssh2::Session` and we're gonna use `std::net::TcpStream`. And then this is actually just gonna be great. And then we're gonna have this `Session` thing contain both the session and the TCP stream. And so this means that as long as we only use the `Session`, we know that as long as that session is active, the TCP stream will also keep living. So it basically consumes the TCP stream as well. And so now we can do things like, and it's gonna take a, let's do, essentially have this method. So we're gonna take all the SSH connection stuff and sort of put it in a wrapper, so that we can remove that code from our now fairly large `src/lib.rs`, right? So all this stuff that we were doing here to authenticate and whatnot, all of that is gonna go in here. So this is gonna connect to the given address. And then it is going to establish an SSH session to it over the stream that we connected. Then it's gonna do the handshake; the authentication we're gonna figure out later. And then it's gonna return a `Session` that's gonna contain both the SSH session it established and the underlying stream. And so now we give this guarantee that was required by `ssh2`, that the socket persists for the lifetime of the session, right? And then we can do things like implement `Deref` and `DerefMut`. So these are the traits that the dot operator, for example, uses.
So that if you have a `Session`, you can call methods that are defined on `ssh2::Session` directly without having to do, like, `.session` or `.ssh` in this case, right? So we're gonna implement `Deref` for `Session`, and that's gonna deref to an `ssh2::Session`, right? And then we're gonna do the same for `DerefMut`, so you can also call mutable methods, right? So you see what I did there. So now here, instead of TCP, we're just gonna do, I guess we called it `sess`. So we're gonna do `ssh::Session::connect` to that and unwrap, right? So now we don't need to talk about any part of the TCP stuff here. And this also means that now this machine can hold the SSH session, and that will encapsulate both the socket and the session that we established on top of it, right? And in fact, now this does all this `exec` and `read_to_string` and then `wait_close` stuff. And we could now add convenience methods to our sessions if we just wanna execute some commands quickly. For now that's probably fine, because in this case, now we have an SSH session, and this means that the setup method that we give here is really gonna get an SSH session right there. So I guess `Session` actually should be public. So now in theory, all of this stuff we should be able to move into our example code, because now this will be called with an SSH session, right? And in fact, now because these are all IO errors, watch all of these go away. How happy does that make you, huh? All of you complaining about errors. There are no errors anymore. They're all gone. So now of course the issue is that we need to call these closures. And this is where we get back into the business of the closures being defined per machine setup, right? But down here, all we have are instances, right? So we're gonna have to figure out a way to carry through which machine set each machine came from. I don't actually know the nicest way for us to do this.
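To show the wrapper-plus-`Deref` trick without dragging in a real SSH connection, here's a sketch with stand-in types. `RawSession` plays the role of the library's session and `Stream` plays the role of the TCP socket; both are invented for this illustration, as is the `connect` body.

```rust
use std::ops::{Deref, DerefMut};

// Hypothetical stand-in for the library's session type.
struct RawSession {
    authenticated: bool,
}

impl RawSession {
    fn authenticated(&self) -> bool {
        self.authenticated
    }
    fn authenticate(&mut self) {
        self.authenticated = true;
    }
}

// Hypothetical stand-in for the TCP socket the session runs over.
struct Stream {
    addr: String,
}

// The wrapper owns both, so the socket lives exactly as long as the session.
struct Session {
    ssh: RawSession,
    stream: Stream,
}

impl Session {
    fn connect(addr: &str) -> Session {
        Session {
            ssh: RawSession { authenticated: false },
            stream: Stream { addr: addr.to_string() },
        }
    }
    fn peer(&self) -> &str {
        &self.stream.addr
    }
}

// With Deref and DerefMut, method calls on the wrapper fall through to the
// inner session, so callers never have to write `.ssh`.
impl Deref for Session {
    type Target = RawSession;
    fn deref(&self) -> &RawSession {
        &self.ssh
    }
}

impl DerefMut for Session {
    fn deref_mut(&mut self) -> &mut RawSession {
        &mut self.ssh
    }
}

fn main() {
    let mut sess = Session::connect("host.example:22");
    sess.authenticate(); // resolved through DerefMut
    println!("{} authenticated: {}", sess.peer(), sess.authenticated()); // through Deref
}
```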
But I suspect what we'll do, okay, we're gonna have id_to_name, and then here we're gonna do .map, and then we're gonna do id_to_name.insert with the request ID, and carry that along. All right, so this is now gonna keep a map from the machine set name to every spot request that was made for that setup type, right? And so this means that down here, we can now map the spot instance request IDs into a name. So we can now go further. Remember how the things that are returned from this come from describe spot instance requests. So what we get back here is a bunch of these spot instance requests, which eventually contain an instance ID, this, which is the actual machine, but they also contain a spot instance request ID, right? So now what we can do is use id_to_name. We're gonna look up the spot request ID in that map, and that's gonna be the name. In fact, we can remove it here, right? Because once all the spot requests have been set up, we no longer need the mapping for the spot requests. However, we do want to track that name for the instances. So then we can do id_to_name.insert, and now the mapping is from instance ID to the same name, right? So here we're saying that the mapping should no longer be from spot request ID to machine set name, but instead from instance ID to machine set name. And so now in theory, down here, as long as we know the instance ID of a given machine, we also know which group it belonged to, right? So now instead of machines here being a Vec, it can be a HashMap. And then when we insert the machine, what we're really going to do is get its name. So here there's an instance, an EC2 instance, there we go. Yeah, so the instance ID is gonna tell us what machine set this particular machine is in. So we can get the name of the machine by doing a lookup of the instance ID, right?
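A minimal sketch of that re-keying step, using only the standard library. SpotRequest and its two fields are hypothetical stand-ins for what the EC2 API returns, and rekey_by_instance is a name invented here; the map goes from request ID to set name on the way in, and from instance ID to set name on the way out.

```rust
use std::collections::HashMap;

/// Hypothetical, trimmed-down view of a fulfilled spot request.
pub struct SpotRequest {
    pub request_id: String,
    pub instance_id: String,
}

/// Turn a "request ID -> set name" map into an "instance ID -> set name" map.
pub fn rekey_by_instance(
    mut id_to_name: HashMap<String, String>,
    fulfilled: &[SpotRequest],
) -> HashMap<String, String> {
    for req in fulfilled {
        // The request ID is no longer needed once the instance exists,
        // so remove it and reuse the name under the instance ID.
        if let Some(name) = id_to_name.remove(&req.request_id) {
            id_to_name.insert(req.instance_id.clone(), name);
        }
    }
    id_to_name
}
```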
And then we're gonna do machines dot, ooh, that's a good question. This is gonna be fun. So we're gonna use the entry API. We're now gonna group them the other way, right? We want every machine to be put into the HashMap indexed by the name of its group, right? In this case, the name of its group we can find by looking up the ID, because, remember, we keep the mapping from ID to group name. So we get that group name and then we want to look for the entry of that name, so the entry for that group. And for every group, we're gonna keep a vector, which we can do with the, all right, let's see if that roughly makes sense. So here machines is the HashMap. We look for the entry for that machine set, or that group, with that name. If an entry does not exist, then we insert an empty vector. Otherwise we just get the reference to the vector that's already there, and then we push the machine that we just created, right? So now machines is gonna be a map from the group name String to a vector of these Machine things, which is exactly what we wanted. So now when we iterate over machines, what that actually gives us is the name and the machines, right? And now because we have the name, we can use that name to keep track of the setup. So setup_fns is gonna be a HashMap::new. This is up where we initially launch the spot requests. We're gonna keep track of all the setup closures for every group. So setup_fns.insert, we're gonna use the name and the setup closure, which was called setup, that's on our poll, but all right. Right, so now we're tracking the closure for every named group in this setup_fns map. And so down here, we can now get the fn by looking it up in setup_fns, which means that we can now call f with that session. And this is now gonna return an IO error here, and there, in the example.
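The grouping step with the entry API could look roughly like this. Machine here is a cut-down, hypothetical version of the crate's struct, and group_by_set is a name made up for the sketch; instances without a known name are simply skipped rather than unwrapped.

```rust
use std::collections::HashMap;

/// Hypothetical, cut-down machine record for illustration only.
#[derive(Debug)]
pub struct Machine {
    pub instance_id: String,
    pub private_ip: String,
}

/// Group machines by the name of the machine set they were launched for.
pub fn group_by_set(
    id_to_name: &HashMap<String, String>,
    machines: Vec<Machine>,
) -> HashMap<String, Vec<Machine>> {
    let mut groups: HashMap<String, Vec<Machine>> = HashMap::new();
    for machine in machines {
        // Instances we have no name for are skipped in this sketch.
        if let Some(name) = id_to_name.get(&machine.instance_id) {
            groups
                .entry(name.clone())
                // Insert an empty vector the first time a group is seen...
                .or_insert_with(Vec::new)
                // ...and push onto whichever vector is there now.
                .push(machine);
        }
    }
    groups
}
```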
So we're gonna pass the SSH session that we established to the closure for that machine's group, right? So we're gonna have, for each machine in the machines in that group, like so. And then that returns a result, so we unwrap. I know that was a very long, somewhat involved set of steps. Do you roughly follow what's happening? We now keep track of all the machines in each group. And that allows us to get back to the setup closure that the user gave for that group in the first place. So we're gonna try this now. In theory, it should produce exactly the same output as last time, which is a little bit less exciting. But that output is now determined by the user's closure and not by code that we hard-coded into the library. Let's see what happens. It's probably not gonna compile. And feel free to let me know if things are moving too quickly or you want something explained. I'm perfectly fine slowing down and going over things. That's better than if I start losing people. String can't index into that, because we have mismatched types. Expected Result with io::Error. Oh, right. The SSH connect needs to return an Ok session when it establishes correctly. And this can now also stop unwrapping. That's great. Oh, the SSH crate has its own error type. So this is where we start getting to the point where failure becomes sort of important, because you want consistent error handling across the error types that are used by different crates. I still want to hold off a little bit on doing it, because the library still does a lot of hacks internally, and I think it's gonna be easier to have a consistent error story later on. Let's look at what the error that we get back from SSH is like. It implements Error. This map_err, once we move to failure, you'll see will actually map very nicely onto what failure does.
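Here is a small sketch of the per-group setup closures, under stated assumptions: Session is a dummy type that only records commands (it stands in for the SSH wrapper), and Setup and run_setup are names invented for the example. It shows a HashMap from group name to boxed closure, looked up by name and called once per machine in the group.

```rust
use std::collections::HashMap;
use std::io;

/// Dummy session that just records what was "run" (not a real SSH session).
pub struct Session {
    pub log: Vec<String>,
}

/// One setup closure per machine set, as a boxed trait object.
pub type Setup = Box<dyn Fn(&mut Session) -> io::Result<()>>;

/// Call each group's setup closure on every session in that group.
pub fn run_setup(
    setup_fns: &HashMap<String, Setup>,
    groups: &mut HashMap<String, Vec<Session>>,
) -> io::Result<()> {
    for (name, sessions) in groups.iter_mut() {
        // Report a missing closure as an error instead of unwrapping.
        let f = setup_fns.get(name).ok_or_else(|| {
            io::Error::new(io::ErrorKind::NotFound, format!("no setup for {}", name))
        })?;
        for sess in sessions.iter_mut() {
            f(sess)?;
        }
    }
    Ok(())
}
```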
Essentially, map_err is also what the question mark operator does, except all it does is call into. In this particular case, we want to take an SSH error and turn it into an IO error, which we can do in a slightly hacky way by doing this. So this is just wrapping the SSH error in an IO error, which in failure would just be chaining the errors up, which is what you really want to do and what the IO error sort of tries to do. Expected mutable reference to the session. Yeah, so this, right? Because we're gonna want to use the SSH session after the user is done working with it. Oh, no method map_err found for Option. Why does ssh2's Session::new give an Option? That seems weird. Oh, it's probably if you don't have libssh2 or something. Ah, we're getting closer. Expected an ssh::Session, found an ssh2::Session. Ah, no, this is gonna return a Self. We have our own wrapper for the session now. Only this error left. What is it complaining about? Machines. This takes the name, which is a String, so why, oh. Okay, so the problem here is that setup_fns, its key is String, and we're trying to index it by a str, but that should work fine. Am I missing something obvious that someone else sees? That's a very good question. I can't do this, right? That should not be necessary. It really does require it. Someone in chat needs to generate a pretty diagram from an existing Postgres database. Not off the top of my head. But I'm sure there are database visualization tools. I would look into something that uses Graphviz, that's probably what you'd want for this, but I don't have a good recommendation for you, I'm afraid. Okay, you know what we're gonna do? We're gonna do this. It's a little bit more allocation, but it means that I don't have to move the value name. That seems about right. Where is name moved? Name is moved here. Eh, maybe the better way to do this is to not consume self.descriptors, and then instead do this and this. Take these by reference. Oh, then I can't move that.
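The wrapping trick mentioned here, sketched with the standard library only. SshError is a made-up error type standing in for the SSH crate's own error, and exec and run are hypothetical functions; the part that matters is map_err turning the foreign error into an io::Error with io::Error::new.

```rust
use std::error::Error;
use std::fmt;
use std::io;

/// Hypothetical stand-in for an SSH library's error type.
#[derive(Debug)]
pub struct SshError(pub String);

impl fmt::Display for SshError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "ssh error: {}", self.0)
    }
}

impl Error for SshError {}

/// Pretend remote exec that fails on an empty command.
fn exec(cmd: &str) -> Result<String, SshError> {
    if cmd.is_empty() {
        Err(SshError("empty command".to_string()))
    } else {
        Ok(format!("ran {}", cmd))
    }
}

/// Same call, but with the SSH error wrapped inside an `io::Error`.
pub fn run(cmd: &str) -> io::Result<String> {
    exec(cmd).map_err(|e| io::Error::new(io::ErrorKind::Other, e))
}
```

The original error is kept as the source inside the io::Error, so its message still shows up when the wrapped error is printed.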
All right, let's see what that does. What does the error actually mean? Oh, okay, so the error that we got back here. What it's saying is that a String cannot be borrowed as a &str. So a String can be borrowed into, let's see, how do I explain this easily? If you have a String, you can turn it into a &str, because it implements Borrow<str>. That's one of the ways in which this is true. However, for some reason it wants Borrow<&str>, which would mean that you take a String and turn it into a reference to a &str. I don't actually know why it's asking me to do that. The most likely explanation is that somewhere in here, I'm not dereferencing something that I should, and I end up with a double pointer to a string. I don't actually want to walk back through that code and try to figure out where that was, because I don't think that's a useful use of your time or of mine. But if I find out, if it pops into the back of my head, I'll tell you immediately. Let's see, use of moved value. Yes, that's also true. Let's see, okay, so in theory, oh, this is the same thing where the SSH methods return an SSH error. Yeah, so you often have these kinds of weird ownership and reference battles with the compiler. It gets better over time. It is certainly true that when you start out, you get a lot of them. Usually the compiler is pretty nice about this, and it has gotten better. The one thing I've found is often useful is to check that the variables have the type that you expect them to have. So I don't know if you noticed, but that's one of the things I did up here for a second, which was I did this, right? And then tried recompiling the code, because in my head, my mental model is telling me that this key is a String.
But by leaving this out, I'm sort of telling the compiler, you infer what that type is, which means that if I did something stupid, like down here I did this, the type of that key is not String, it is &String, right? And so by annotating the code in that way, I'm sort of forcing the compiler to be explicit about whatever it is giving me. It would be nice if you could hover over variables and RLS could tell you. I think you can do this, but I don't think I have it set up. So the RLS does have this information and there's a command you can give it to inspect it, but I don't think I have it set up in my editor yet. It's a relatively recent feature that actually works well. Sure, I have this, but apparently not. But it doesn't matter. There's a thing where you can show the definition for a variable and it will tell you everything it inferred, which is the way you would actually deal with this. I think I also have editor bindings for checking my code, but I don't actually use them very often. This is what I mean by I live in the terminal. I just have two tabs in tmux instead of having everything inline in my editor. Yeah, so the problem we're observing now is that in our example code, all of these methods are SSH methods. So they return this SSH error type rather than the IO error type, which then again means that either we need to use the failure crate or we do this whole map_err business again. The one thing that would be nice is if the SSH error type could be converted into an IO error, but it can't, and that's really unfortunate. Usually error types are better about this. But yeah, hopefully this will be solved when we move to failure. I think what we'll do for now is we'll do map_err, the same trick essentially, of wrapping the SSH error in an IO error, which is a little bit silly. And again, my hope is that we make this go away if we have time, which hopefully we do.
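Going back to the type-annotation check from a moment ago, a tiny sketch of the idea. The function longest_key is made up for illustration; the only interesting line is the explicit annotation, which makes the compiler confirm (or reject) what you believe the type is.

```rust
use std::collections::HashMap;

/// Length of the longest key, written to show an explicit type check.
pub fn longest_key(map: &HashMap<String, u32>) -> usize {
    let mut longest = 0;
    for (key, _count) in map {
        // Iterating over `&HashMap<String, _>` yields `&String` keys.
        // Writing `let key: String = key;` here would fail to compile,
        // which is exactly how the annotation exposes a wrong mental model.
        let key: &String = key;
        longest = longest.max(key.len());
    }
    longest
}
```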
So read_to_string is already fine, but not wait_close and exit_status. Let's see. So remember again, the goal here is for this code to produce exactly the same results as we got the last time the code compiled, but the code that's run now for every machine that's set up is actually going to be code that's written by the user. So in this case, this is the example of someone using our library, and they are the ones that are deciding to cat /etc/hostname rather than the library doing so. Let's see. Unwrap on a None value. Oh no, the unwrap came back to bite me. Should have expected this. 123, that's unhelpful, because line 123 has multiple unwraps. All right, so this either means that an id_to_name mapping did not exist, or that the instance does not have a spot instance request ID. Let's have a look at what the real output here should be for spot requests. So fulfilled is what we really look for. Oh, why is this even on? This should not be here. Let's make that go away. Let's make sure I don't have lots of instances just running in the background. This is the risky part about writing an EC2 crate and doing automated testing. So really what we want is this to be all_ready. And then, if all_ready, then we do this, and then we check that all of them are fulfilled. So this is probably a better metric, because this deals with things like errors. And then I guess we also want to error out, but I'm gonna ignore that. So let's do some very simple print debugging here. We're gonna print out id_to_name, and we're gonna also print out the spot instance request ID. It's looking promising, at least it's running something. See in our little cheat sheet over here. Okay, so that request has been fulfilled and it has spawned an instance that's running. And now presumably it's just waiting for this instance to boot. So it's trying to SSH to this machine, right? And the machine is not yet booted.
One of the things that has somehow not bitten us yet is the fact that we could try to connect and then get connection refused, because we try to connect right after the machine has booted up but right before the SSH server has started, in which case this connect will fail. We should really try again rather than just error out, but this is not something we currently deal with. So there's a bunch of this kind of cleanup that we would have to do with the crate: dealing with failures, retrying on failures, knowing which errors are long-term and which are more temporary. So this is an example of the termination error that we decided to ignore. This pool stream disconnected and broken pipe are both errors that you can easily get for a short period of time, but they don't indicate that you can't progress. It just means you have to try again. Can I SSH to this machine manually? That's sort of the next question. Yes, I can. So why is this not doing it? So we're gonna do some GDB debugging, because why not? gdb -p. It has one thread. It's in receive. So this is waiting in, oh, interesting. It's still waiting for the spot request to be described. Oh, there we go. Broken pipe, great. That's too bad. All right, so something is preventing that from working. I guess we can kill this now. All right, try that one again, shall we? Eh, let's do this with release. That's part of the reason why it would be slow. It's gonna have to compile some things. In the meantime, we can look at the last missing part of things that the user needs to be able to configure, which is what should happen after all of the machines have been set up. So notice, there are a bunch of things we could do better here, like set up machines in parallel, which we don't currently do. Here we could use something like rayon, which would let us do this pretty neatly.
But down here, this is the final point where we want to call the closure that was given to run, which is the thing that should run after all of the machines have spawned. That's really what we wanna do next. And in fact, this should be fairly straightforward, because here, all that we really need to do is call f with this thing, right? So remember, this is what we promised the API would be. So really we should be able to just do something like f of machines in theory, right? And I guess we're gonna unwrap, because why not? However, notice that the sessions we establish here, we also want to store inside the machine. So currently, remember how we set the SSH session of the machine to be an Option? Initially we set it to be None. And so now, once we've established this connection, we want this machine.ssh to be Some of the session. All right, so in theory now, we've set up all the things that should be necessary for the run closure to also work. So we're gonna try writing a run closure. We're actually gonna change this a little bit so that this takes the machines, just because it avoids us having to split it up and recollect it. There's no reason not to give away ownership of these at this point. It could be, if we later on wanted to, like someone suggested, run a bunch of commands on the server after the experiment has been run and collect statistics and stuff, that we might not want to give away ownership of the SSH connection to the machine, because we want it to be preserved for after. But for now, this should be fine. And then in our example, of course, let's try to also run a client machine, which is really gonna be exactly the same. The only difference is gonna be, just so we can tell them apart, to see that it does indeed actually start two machine setups: this one is gonna cat /etc/hostname, and this one is gonna run date.
So in theory, we should get one output that shows the hostname and one output that shows the date. And that way we'll know that we have one machine from each set. And then finally in run, we're gonna print out the server's IP, which is gonna be under server, hopefully, if we did everything right. This is, of course, gonna be the machines now. The zeroth server's private IP, we should be able to print that. In addition, unstable, why is it saying unstable? Weird, oh well, I think it's still fine. So in the final run, we wanna print the server's private IP and the client's private IP. I guess in theory, we could use the SSH session here, but we're just gonna check that at least it runs and gets roughly the right parameters. This is getting pretty close. Oh, it's probably sad because I'm compiling things. Yeah, but my thing should make it not use all the cores. Oh well, I'll just let it run, this goes away. So another thing that we might wanna consider adding to this API, rather than force all of our users to go through the song and dance of creating a channel and then executing, then reading to string, and then waiting for close, is that we could imagine just providing a run-and-read method on the session, right? Maybe that's something we can add while this is compiling. It's really just gonna execute this code, right? So on our SSH thing, we're gonna have a pub fn, it's gonna take a string, and it's gonna give you an IO result. And really, all it's gonna do is all the same things that we did already, except this exec is gonna take the actual command, and it's not gonna print anything. In theory, it should check the exit status of the command. We're gonna ignore that too. And now all of this can be replaced with a single command call with cat /etc/hostname, like so. Isn't that much nicer? We can do the same here, and now this is gonna run date. All right, compiler, finish up. I wanna run my code. Look how neat this code is turning out though, right?
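A hedged sketch of that run-and-read helper. To keep it free of external crates, the Channel trait below is a made-up, minimal view of an SSH channel (exec, read, wait_close), and FakeChannel is an in-memory implementation for trying it out; the name cmd is also an assumption, not necessarily what the crate ends up calling it.

```rust
use std::io::{self, Read};

/// Hypothetical minimal interface of an SSH channel.
pub trait Channel: Read {
    fn exec(&mut self, cmd: &str) -> io::Result<()>;
    fn wait_close(&mut self) -> io::Result<()>;
}

/// Run `command`, collect everything it prints, then wait for the close.
/// Like the version on stream, this ignores the command's exit status.
pub fn cmd<C: Channel>(mut chan: C, command: &str) -> io::Result<String> {
    chan.exec(command)?;
    let mut out = String::new();
    chan.read_to_string(&mut out)?;
    chan.wait_close()?;
    Ok(out)
}

/// In-memory channel so the helper can be exercised without a server.
pub struct FakeChannel {
    output: io::Cursor<Vec<u8>>,
}

impl FakeChannel {
    pub fn new() -> Self {
        FakeChannel { output: io::Cursor::new(Vec::new()) }
    }
}

impl Read for FakeChannel {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.output.read(buf)
    }
}

impl Channel for FakeChannel {
    fn exec(&mut self, cmd: &str) -> io::Result<()> {
        // Pretend the remote side echoes what it ran.
        self.output = io::Cursor::new(format!("ran: {}\n", cmd).into_bytes());
        Ok(())
    }

    fn wait_close(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```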
Providing the clients, so that's still compiling. Yeah, so notice also that what we're doing and what's gonna eventually let us transition into using the failure crate here is that we're like building up these, like sort of trying to propagate errors wherever we can and just mapping them into some error type that in this case just happens to be IO error new. In the failure, when we start using failure, then really all that's gonna happen is we're gonna change all of these things that map into an IO error to instead use the failure error type, sorry, the fail type. So we're gonna keep returning like boxed failures that we're gonna propagate back up. So here, we should accumulate many kind of failures should return this type. All failures can be converted into it. And so this is essentially what we're gonna end up doing that our library is gonna just keep exposing errors and then the errors are gonna keep track of the trace that gets them back to where they came from. How many years of experience do you have? Well, it's hard to say. I started playing around with Rust like four years ago now and then since I've been working on a research project for my PhD that is essentially all in Rust and that's now a lot of code. So yeah, I do have a bunch of experience but as you saw the string errors earlier, like I still make really silly mistakes and that's okay, like more fun that way. The other thing that's neat about Rust is it can be pretty daunting when you start out but it's not actually that hard once at least sort of to do the sort of normal things is not that hard once you get started. And once like you start to grok the borrow checker and why it's yelling at you, from here, I guess. Oh yeah, so it's complaining that the stream is not being used, which is totally true. We have it there just so it doesn't get dropped. Private IP is, oh yeah, so here, certainly, that's gonna be a little bit annoying. I guess we can make these pub for now. 
We might wanna add getter methods for these but at the same time, once we give out a machine it's because we've finished constructing it. So we don't actually mind if the user like starts clobbering over these because we've just like, remember the F method that they give us is one of the last things we invoke, right? So at that point, if they want to start like overwriting this HashMap and this VEC in these machines, that's all fine with us. So there's not a big reason for us to really hide the internals of the machine struct. Now let's see what happens. Not found. Damples test one. Oh, I guess we can make this bssh. Reads a little bit nicer. And let's see what happens. What do we think? Is there hope? I think there's hope. Do we have any spot requests currently running? Fulfilled 12 minutes ago. Oh no, gotta watch out for these whenever my program crashes. We should probably like do something about these panics and have it make sure that it cleans up even if something panics. Probably not a bad idea. All right, so here's our spot request. So that one, no, that's the one I just canceled. Oh, now it's running, great. So remember what we're expecting now is we're setting up two different machine sets. One that prints, et cetera, hostname and the other that runs date. So in theory, we should get the output of both of those and then we should get the output of the private IPs of both of them which we got in the run method. Is sort of the hope at least. So let's see. Few seconds ago. Okay, so this is me being stupid. So see how there are two spot requests. One is for an M5 large and the other for a T2 micro. This is because this is what I started out with in the experiment and it turns out that Amazon does not like to give out spot instances for T2 micro. So it's basically never gonna be fulfilled. It's gonna always say capacity not available. So we need our example here to also use the M5. All that money flying away. 
It's gonna be a full, maybe it'll even cost me a dollar. That sounds unlikely. It's probably gonna be significantly less. All right, let's see. So we should see a spot request any time now. The hope, maybe instances here. Yeah, it has started two instances for us. So that's nice. There we go. There are the spot requests. Okay, so they got fulfilled almost immediately. There's now running. The instances are also running. These are their public IP addresses. So notice that the things that should be printed out is the public DNS address of one. The current date by the other. And then this private IP and this private IP. Let's see if that happens. Yes, we're gonna find out. Let me try SSH to the other. Can I SSH to the other? So I was interested in using, oh, use Dbeaver. Oh, never heard of it. That's interesting. So what is this waiting for now? That's a good question. I wonder if it's, yeah, it probably tried to connect to the machine before the machine had booted. That would be my guess. And so it's still like not quite able to connect, but we can probably check this with, so it's actually talking to the Amazon API that's hanging for some reason. So here's another neat trick. We're gonna add to our cargo.toml. Is it gonna make it recompile everything? How can we do that? You can add a profile.release, debug equals true. And this will include debug symbols, even in release builds. Not gonna do that, because it forces me to recompile the entire crate. So instead I'm gonna cancel that. I'm gonna cancel these spot requests manually. Huh. So why does it get stuck there? That's a good question. Oh, maybe it's starting to rate limit me. That's not entirely impossible. Someone did warn me that this might happen. Let's see. So we're gonna add some print lines, because they're the best kind. So we're gonna print line requesting spot instances, requesting this many spot instances for the number and name. And then we're gonna do, it's gonna be good old print line debugging. 
The core term. Oh, does this ever break? Yeah, it should break. And terminate all instances. All right, so let's see how far it gets. Let's see this way, yep. The fact that the spot requests stayed here probably means that it didn't get past this point. Because remember, we wait for the instances to come up and then we stop all the spot requests. It's probably stuck in this loop somewhere. Yeah, waiting for spot instance requests to be fulfilled. Which is weird, because if we look at these, they do both say fulfilled. How about we, oh, I'm gonna have to kill these again, aren't I? Really wanna fix that. So this is where we're gonna use the scopeguard crate, which is really nice for this. So let's have it print out everything it gets back while it's waiting. So let's see. Well, that says fulfilled. And that says, wait, don't these both say fulfilled? Am I missing something? Oh, they both say state active and status fulfilled. Oh, what does that mean? So I guess we are looking for state active then, not fulfilled. Well, that's tricky, because it says fulfilled here. So state and status are different, in case you ever wondered. Let's see what happens now. Three hours, that's another good question, whether we should split this into two or keep going. Invalid spot instance request ID. Yeah, so this is an example of an error that we can totally get from describe here. So have you ever done something with web and Rust, some Rust API maybe? I have built some things using Rustful and Hyper. It's not generally what I've been working on. The research prototype I'm building is for a database. It does have a webby API, but I'd be happy to look at that sometime too. So I think what we're gonna do here is, let's save that error, and then we're gonna do, let's rest on the app. And specifically, what's the error we got? So this is an error that can happen if we ask the API too quickly after we first sent the request: it might say that that request ID does not exist, right?
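To make the state-versus-status distinction concrete, a very small sketch. SpotRequest and its field names are hypothetical simplifications of what the describe call returns, and all_active is a name invented here; the check looks at state being active rather than at the status code.

```rust
/// Hypothetical, simplified spot request as seen while polling.
pub struct SpotRequest {
    /// Lifecycle state of the request, e.g. "open" or "active".
    pub state: String,
    /// More detailed status code, e.g. "fulfilled".
    pub status_code: String,
}

/// True once every request is in the "active" state.
pub fn all_active(requests: &[SpotRequest]) -> bool {
    requests.iter().all(|r| r.state == "active")
}
```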
So it's like a race here. In theory, one thing we could do here is a thread sleep, but that seems a little bit hacky. So instead, we're gonna do the other hacky thing, that's sort of similar to what we did down here. That's gonna make a lot of you sad and angry. So that's why I'm gonna do it. So we're gonna format the error, because again, this is because this particular crate, yeah, Rocket is also really nice. This is again because this particular crate, the Rusoto crate, does not expose very well-defined error types. This is the reason why we have to do this song and dance. I wish we didn't have to. So if it contains the invalid spot instance request ID text, then we continue, otherwise we panic. Great. It's not really better. Well, it is better in the sense that it won't actually crash. Let's try it. How about now? Oh, that's better. Hey, it unwrapped a None. How lucky. Oh, that's interesting. I've put the mapping the wrong way. See how it goes from the name of a group to the request ID? Whereas what we wanted is for every request ID, map it to a group. So id_to_name is just the wrong way round. Should be sir.clone, name. How about now? I guess now we have more spot requests we need to kill. I really should just add this scope guard, shouldn't I? But it's so close. It's so close to just working. Oh, we're in the setup closures. What do we think? Hostname and date, and then the private IPs. I think it's gonna work. So remember, we have to wait for the VMs to boot. That's why it takes a while before you can SSH into them. And you can see the mapping here now too, right? So you see how this instance is mapped to that group. This request is mapped to that group. Come on, servers. Who thinks this will work? Who thinks this will not? Run faster as you do. I wanna see. I wanna see. Yeah, I know, right? You're totally right. So you're suggesting that I should use a newtype to distinguish between the keys and values.
The reason this happened, of course, is because the keys are strings and the values are strings. If we had a new type wrapper around string, this problem would go away. So in theory, I should have a separate type for group name. And that way the map would be from string to group name or from instance or request ID to group name, which in case this problem could never have occurred in the first place. But, oh well, we found it in the end. It wasn't that bad. It's not really running these closures though, which makes me a little bit sad. Oh, connection timed out. Yeah, so this is, my guess is what happened here is the race that we talked about earlier, how if you try to connect right after the machine booted but before SSH starts, you can sometimes get this issue. So I think what we really want to do here, this is gonna also make everyone a little bit sad, but it's not that bad. So for TCP stream, you can set the read and write timeout. And for SSH, you can set this timeout here somewhere too. Session, set the timeout. Yeah, so notice how this happened in, yeah, specifically for the TCP connect up here. So we're gonna do something like, okay, here's a, we're gonna try this pattern. Is you running into many errors? No, that's true. It is good that we're running into errors. It's also weird to live stream coding because it means that I have to like think about speaking also in addition to coding. It's a good exercise in like multitasking. But yeah, I mean, in a sense, the goal here is for this to be useful. So if it's useful, that's good, even if it like takes longer to get through it, right? So what we're going to do is we're going to have, oh, can I use do catch here? I think so. Okay, so do catch for those of you not aware of it. It is really neat. It lets you take a block that returns an error. Let's see if they have examples here somewhere. Where is a do catch example? Is this one maybe? Click me. Well, it's the worst I could happen. Where is a do catch example? 
Come now. Essentially what it lets you do is use the question mark inside a scope, inside some block. The question mark normally returns from the entire function, whereas with catch it will only return to the start of the do catch. So it will only return an error up here. And so this means that I can, ooh, what do I do now? That's something stupid. Yeah, so we can basically do something like, let res is gonna be a result of something, or actually this can just be an IO result. And then we can say that that's equal to a do catch of this. And so this means that if any of these things fail, actually I guess we really only want the connect, so we might not need this. But sure, we'll do it anyway. So what this is gonna do is, say that this statement here errors, then this question mark, which would normally just return an error from the outer function, is instead gonna return an error from this do catch block into this variable. And so now we can say, if res is an error, then if i is greater than or equal to three or something, return the error e, otherwise increment i and continue, right? So this now means that we can keep retrying, but for a limited number of times, and we can still use the question mark syntax, which is pretty neat. Now in this particular case, we know that it's always the connect that will fail. We don't actually wanna retry a handshake, because we know that if it failed the first time, it's gonna fail again. So the connect is the only thing we really care about. So in this case, I'm actually not gonna do this. I'm gonna do it this way instead, just to make the code a little bit simpler. So we're gonna say tcp is a loop of this. If it's Ok, then we break with s. If it's an Err with anything and i is less than or equal to five, eh, three, then i += 1. Otherwise, if it's an error, then return Err(e). See if that works.
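The do catch syntax was a nightly-only feature at the time, so as an alternative, here is a sketch of getting a similar effect on stable Rust with an immediately invoked closure: a question mark inside the closure only returns from the closure, not from the surrounding function. The function doubled_or_zero and what it parses are made up purely to show the control flow.

```rust
use std::io;

/// Parse and double a number, falling back to 0 on any error.
pub fn doubled_or_zero(input: &str) -> i32 {
    // `?` in here bails out of the closure only, so the error lands
    // in `res` instead of returning from `doubled_or_zero`.
    let res = (|| -> io::Result<i32> {
        let n = input
            .trim()
            .parse::<i32>()
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
        Ok(n * 2)
    })();

    // The outer function keeps going and decides what to do with it.
    match res {
        Ok(v) => v,
        Err(_) => 0,
    }
}
```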
So this is now, we're gonna keep trying to connect to the given address. And if we connect, then we're just gonna break out of this loop and we're gonna give the stream that we got out. If we get an error, then if we've only gotten a few errors, then we're gonna just keep trying. Like if I is less than three, then we're just gonna increment I and not actually do anything. So we'll go back to the top of loop and try to connect again. However, once we're beyond three errors, then we'll return error from the outer function. And now this does not need to be a loop anymore, but I've messed something up here somewhere. Right, so in theory, this is now gonna retry connecting three times and if it fails, then we get sad. I do think we want to also put a timeout on that connect, maybe. Unix connect. I don't remember whether it's the read timeout or the write timeout for connecting. An unspecified timeout, well then. So it sounds like I can't actually do that. Well, that's fine. All right, so we're just gonna do it this way. I'm gonna have that run and then I'll kill these. Great. Huh, value moved here in a previous iteration of the loop. Totally true. I'll let you do that. I think I'm allowed to do that. Two socket. All right, let's see. All right, we got some instances. And now of course it's, now we're in the same state as we were before and now it's all these instances have booted up or are booting up. So if we refresh this, it should be two instances here. Ooh, and two old instances. Where did those come from? That's a good question. I'm gonna go with these are the old ones. Terminate. Okay, so these are the machines that are starting up. And so in theory it's now waiting for those to boot and then trying to SSH into them. I guess we can sort of check manually that we're actually able to SSH into them. Hey, how about that? Hostname on the first. So that was one machine set. Date of the second, that's the other machine set. Private IP of both, that's the final closure. 
How about that? This means that our entire example now works. And so in theory, of course, we could do things like run with multiple servers or with multiple clients. And then we should now see date show up three times and hostname show up one time. And then of course we would want to print, say, for c in the client VMs, c.private_ip. And of course, we do also here have c.ssh, so we could issue SSH commands as well. We're gonna not do that for now. Well, let's just double check. We're gonna get rid of some of these print lines, because technically we could turn these into logging statements. And so there's a great logging crate called slog that we might even wanna use here. I'm gonna just not do that for now. I guess arguably we should make a git commit as well, just so that we don't lose all our progress. Compiles. Yeah, and then terminate all these, this is great. All right, let's make a GitHub repo for this. So we're gonna make a new repository and we're gonna call it burst. It's gonna be a Rust crate for spawning short-lived clusters of EC2 spot instances. So if you wanna see the code, I guess you can go there. We can also do some more meta stuff about cargo: what things to put in your Cargo.toml, useful crates for managing READMEs. Of course, one of the other big things that, ooh, failed to look up address information. Name or service not known. Yeah, it's gonna be on the connect. This is when connecting to the clients. So this is the DNS lookup taking too long. This is why all this unwrap stuff is something we're gonna have to deal with, and why many of these operations sort of need to be retried: they are transient errors, so they're not actually gonna persist. So I guess now we have a choice. Let's make a list of things that we wanna do. The failure crate for error handling, with scope guard for tearing down even during a panic.
We want to deal with transient errors, for example if the API is unreliable, or DNS lookups, et cetera. We probably want a nicer SSH interface, so we want the SSH object that we pass to the closures to be easier to use, because currently it's a little bit of a pain. And cargo infrastructure: we want Travis, obviously we want cargo, we want all the meta information, and we want documentation. Now all of these are pretty important. I think realistically, given that we're at like three and a half hours, I'll probably only do some of them this time. And then maybe we'll do a second video where we go over some of the others. What would be most interesting to do for the people who are here? Oh, that's right, Rayon. Yeah, yeah, yeah, yeah, that's a good point. Rayon for parallelism, that's a good call. So I'm happy to deal with any of these, and I think all of them are things that we're essentially gonna have to do for this crate. The question is just which ones we want to do first. Looking at the time, I think realistically we'll do one or two of these this time around. Sure, we can do failure first. Failure is a sort of nice self-contained thing. I mean, okay, so how about we do failure and cargo infrastructure then? The cargo infrastructure is very straightforward in terms of setting things up with Travis. If we also want code coverage, it's a little bit more involved, but we can do that too. At least one thing I can point you to is cargo-tarpaulin. So this is a re-implementation of kcov, entirely in Rust, and it's very easy to set up with cargo. Essentially you copy-paste this as your .travis.yml file and then it will give you full coverage using either Coveralls or Codecov, depending on which one you want. If you wanna see an example of a repository that has this, this one for example does, except it's currently zero because Travis is being stupid.
Specifically, Travis has this bug where you can no longer do code profiling on Travis unless you set sudo: required, and this is a result of their Meltdown and Spectre mitigations. It's really painful. We also have this on factory RS, I think I added this too. Yeah, so factory RS is the other crate you can look at. This has a lot of infrastructure as well for running code coverage. So if you're curious, you could sort of look into that. All right, let's play with failure. Failure is failure. Failure, great. I'm gonna close some other stuff here. All right, so failure. Let's add the to-do, and also let me kill these instances just so we don't end up, ooh. What? It's lying to me. Oh no, there's already a crate named burst? Oh, that's terrible. Well, do we have a better name? I'm happy to rename it. It's not like it's, wait, what is the current crate? Oh, there is a crate named burst. What does it do? A disassembler. Well, that's very different. That's fine. Let's call this something else then. Let's call it, let's see, what does it do? EC2 spot blocks for short jobs and clusters. I mean, we could call it clusterfuck, but that would be maybe a bit on the nose. What's something that there are a bunch of in the cloud that run for a short amount of time? Oh, tsunami or tide related is good. What's the name of that storm, El Nino? I think that's what it's called. Yeah, but it's not really a storm in and of itself. I mean, to be honest, I'm fine with calling it tsunami. I think tsunami is a good name. Great, let's do it. Tsunami, is that how you spell it? Tsunami, great, set. And I guess we will move burst to be called tsunami. I'm gonna cd to that, and then we're gonna, ooh, I did not like that at all. Normally I'd organize my git commits better than this, but oh yeah, naming things is great. I'm very happy with many of my names, but that's a topic for another day. All right, fantastic. All right, so we're doing failure, huh? Seems appropriate.
Okay, so the failure crate, where do we have the failure crate? Here. So we're gonna need Cargo.toml. We're gonna need failure at version 0.1.1. Extern crate failure. Then we're gonna, okay. So, contrary to what the old error-chain crate did, failure does not actually require you to declare your own error type. The idea is essentially that, because errors should be relatively rare, errors are generally just abstracted through this failure Error type. So you very rarely have to set up a big enum that you keep adding stuff to. Now, there was a blog post recently where he suggested some changes. But in general, we're gonna have one enum that talks about the kinds of errors that we can produce, and then we're gonna take everything else that comes from other places and just have them be Error, right? So in our case, this is getting pretty long, but it's actually not that bad. Okay, let's not have an error mod for now. Enum. We're gonna have an enum called Error. I guess this is no longer called burst, is it? So not BurstBuilder; this is gonna be TsunamiBuilder. That sounds pretty severe, but okay. Why not? I guess the example is gonna use TsunamiBuilder too. Let's see. So, we're gonna have an enum TsunamiError. Tsunami does make everything sound a lot more extreme. We're gonna macro_use failure. So, there aren't actually that many errors that we might provide ourselves, because in theory we could just propagate up the Rusoto errors. That's a good question. All right, let's hold off on this for now. Because essentially, the problem here is that the underlying crate we're using, the Rusoto crate... Earthquake? That's a great, great name for the builder. But see, the problem is getting too carried away with the names, to the point where you don't know what anything does anymore.
As a users of your library, you have to like read all the documentation in order to understand they can no longer guess based on the names of the types. But earthquake is a great name for that builder. All right, so what we're gonna do is all the things that currently return errors, or I guess none of them return errors, because we just don't wrap everywhere, but run in particular, sort of our main method. And if you look at what, if you look at here, the error type of failure functions, which accumulate many kinds of errors, should return this type. Great, we're gonna do that, gonna return error. And now the other interesting in this observation is that basically everything, like failure implements into error for most kinds of things. And so this means that we can use the question mark operator for most of these. So default TLS client, you can probably do that. Now, it doesn't really encourage you to do this because if you just put question marks, it means that you don't give an explanation for why this error happened. So your caller is just gonna immediately be given this particular error and not a causal chain for why that error happened in the first place. Like why were we trying to create one? And so the way you do this is, well, there are a couple of different ways. You can use context. Yeah, so if you see here, if you have a lower level error, such as IO error, you can provide additional context about what that error means, the context of your function. Right? Six, eight type. Yeah, that's fine. So from memory, I have to, it's like make life a little bit easier for ourselves. Is it better to use expect if possible? So expect is basically unwrap. The only difference with expect is that if the unwrap fails, you're given a more helpful message than was none. So we don't actually want to use expect either because it's still panics, right? The whole goal is that we want this to not panic. I wonder if they've given an example of this. 
Yes, you see, if you return a result that returns a failure error, then you can just use question marks everywhere. But at the same time, that's not really what we want to do because we want to provide this context, right? So this is why I want, I think what we have to do is map error from. So this maps it into a failure error. And then we can say dot map error, which takes the E. Right, yeah. And then with that E, we can now do E.context and then we can do a question mark. So notice that the code does become a little bit more verbose, but keep in mind that we could do this, right? That is totally legal, but it means that the error is not particularly helpful to our caller because they're just gonna get like, couldn't create default TLS client with no reason for why. Whereas this way, so context takes anything that implements display. So we can like give it a static string that should be fine. And now we can tell it, do they want this to be active voice? It's not terribly important, but fail to create TLS session or EC2 API client. And now let's just do a cargo check. So cargo check is a much faster way of compiling that just gives you the errors and does not actually compile your program. And of course we can keep doing this down. So where else do we have unwraps? This to do, we're also really gonna have to do it because otherwise it's not gonna run on anyone's machine but my own. Create, yeah, good bunch. Good catch. Yeah, because currently this will only work on my setup. So that's, we should have that to do. Construct VPC for cluster set of new key. This one's actually pretty important because otherwise only I can run this great. But you know, it's fine for now. I would have been faster just to release compile actually. The check in general is a lot faster. It's just in this particular case it had to recompile all the dependencies and that's not super happy. Let's see unwrap. Yeah, so here I guess we should think about how can this fail? 
So the, actually maybe the AWS API. Let's see what it says. EC2 API reference, fantastic. List of actions by function. Aphabetical list of API actions, sounds great. Request, thanks Amazon, super helpful. Yeah, so sorry, the video gets laggy whenever I compile a lot because it takes up all my cores. I thought I'd fix this, but apparently not. In theory it shouldn't be using all my cores but it looks like it is. So yeah, it's just whenever I do a big compile. Let's see. Oh, right. Run, we can't actually, we don't actually want it to return a failure error. We want it to result nothing failure error because it could also not crash. All right, so let's see what happens if I do this. Now it's gonna say, right. So this function, of course, if everything goes according to plan, it returns an okay. Great, so see how this code compiled? So notice that the mapping of errors we have to do error is a little bit painful, but it's not like, this is basically exactly the extra context we need to give it. Now I think, just from memory, I don't know if this is actually true, but the macros that provides yeah is this thing, format error. Yeah, let's just, if you wanna construct an error from scratch, it doesn't actually help me chain them. So I guess, ooh, that's a good question. Yeah, so context implements fail, right, yep. Okay, I think we're just gonna keep it this way. Right, so we're looking for unwraps. So that's why we were looking to error codes. So these are all the ways in which Amazon things can fail. Makes me a little bit sad. Cause realistically, most of these we're expecting not to see, right. Cause we're using a library that should format the API request the right way. There are some that can fail like off failure, which because of the way the API structure, we're not actually given an error here when we create the client. We would be given an error on every single API request. 
So one way you could imagine this happening is if you have a user that's authorized to do some actions but not others; that's why you can't just check it on connect. I don't think there are any of these that, specifically I don't think there are any we wanna catch here. The things we're looking for are transient errors, right? If we hit a transient error, we don't want the entire run function to exit. But it's probably fine here. So this one we're gonna do, I guess we should get used to just keeping this around. This really should have a, right. So could not, I guess we could do format. We could be even nicer to the user and say which group it was for. Where else do we have unwrap? Okay, so this unwrap is different. So here, this is the response for this request spot instances call. Yeah, so for this method, notice how the thing that's returned has an Option of a Vec of spot instance requests. If we look at the Amazon API for that method, notice how the response says one or more spot instance requests. So not only will it always be Some, it will always be a vector of length one or greater. So in fact, here, this unwrap is fine, but as was pointed out elsewhere, it's better for us to use expect and say request_spot_instances should always return Some, or something. Exact phrasing is not super important. Oh, I missed one. And then we're essentially just gonna walk through looking for unwraps. So here, we sort of already handle errors. So this one is a little bit special. Remember, this is where we want to retry in the particular case of it being the race message. So this error is: if we try to describe a spot instance request that we just issued but that Amazon hasn't registered yet, then we don't actually want to return an error straight away. So here, what we're gonna do is continue in that case, and else we're gonna return Error::from(e) and then map it with a context.
And in fact, no, we don't even have to do that. We can just do this and then we can just do, right? To see what this is doing. So if we have to do this so that we can do string matching on the result because the error type of result are not well-formed. And then we do some matching. And if it's not one of those, then we reuse the original error we were given and propagate that up with some context of fail to, fail to, sure, why not? Here, we have a slightly different issue which is that this loop could loop forever because what if these enter a failure mode, right? So remember how the state, if we look at describe, spot instance requests. Yeah, so the response we get back is responses and request object and the state is one of these, right? So currently, we're just waiting for them all to become active, but it could be that they never enter the active state. And if that's the case, then of course it's not good for us. So what we're gonna do here is we're gonna say that if any of them failed, I think it's the way to do this. Yeah, well, okay, so first of all, this unwrap is safe, right? Because we already checked for the error case above. So this, we can just leave that way, that's fine. This unwrap is safe because describe, if we look back at the describe here, this also always returns one or more spot instances requests. So describe always returns at least one spot instance. This is okay because this, ooh, required no. What does that mean? Oh, this isn't the thing we sent to the server. So I think this means that we can expect this to always be some coming back from the server. Like Amazon will always tell us the state of something we asked it to describe. But if we send a spot instance request to them, it doesn't have to contain a state, obviously. So I think this is just gonna be a spot request that did not have state specified, which like should never happen. So I guess really what we want here is, we wanna know if they're all active. 
If they're not all active, first of all, I'm gonna take the suggestion of someone else from further up in the thread. We're gonna sleep a little while just so we don't like totally spam the API. But also we're gonna see whether any of them failed. So I guess we're gonna do, we have to do this again. That's a little bit sad. But here, it's actually okay for us to use unwrap because if that unwrap would have failed, it would have failed with the expect further up, right? So these unwraps are fine. And then we wanna see whether any of them are failed. And I wanna see whether state is equal to failed or state is equal to canceled. Because if any of them are failed or canceled, then of course there's no point in us continuing any bad. And so if any of them are bad, there's no reason for us to continue any of these so we can break it, right? So I don't know if you remember, this is a long time ago, but this instance is VEC that is actually empty and unassigned to. And that's okay because when we break from this loop, remember how we know that at this break, which was the only break in the loop, instances would be defined. And so therefore it's okay for us to have this not be mutable. This means that down here, we are required to set instances before breaking, which has its own set of problems because I think what we're gonna run into here, what we're gonna run into here is that imagine that we issued two spot requests, one for the servers and one for the clients, right? And the server starts up fine and the client fails to start. If that's the case, then we still wanna terminate the server when we give up, right? So I think this will actually be wrong. I think really what we're looking for is any pending. So if any of them are open, which is funny, because this is going back to what we originally had. So if any of them are open, specifically what we want here is if the state, only if the state is active, do we record this instance as something we need to deal with? 
Right? If it's not active, then there is no instance to terminate. So notice the filter map here, right? So now what this is gonna do is it's gonna keep looping through this loop of describing all of the spot requests until all of them are either in a case of active or in the case of having been failed or canceled. All the ones that are active will have collected the instance IDs for and all the ones that are not active, we will not have done so with. And so this means that the one thing we do have to track is whether any of them failed. Cause if they did, there's no reason for us to continue. Like there's no reason for us to do any of the actual machine setup, right? We should just go straight to terminating. So what we're gonna do here is, I'm gonna have a variable. It's gonna be, let me, all active. All active is true. And if any of them are not active, then all active is false, right? And then what we sort of wanna do is put all of step four and step five, the things where we actually execute the closures, put like all of that inside an if, but that's gonna make our code go far to the right and that's sad. So instead, I wonder, should we just do spot guard while we're at it? No, I'm gonna not do that. In fact, we can still do four. It's just a little bit silly. So we're just gonna say, if all active we do these, there are cleaner ways to do this, but they require a bit more code, so I'm just not gonna do them. So now notice that only if all of the machines came up correctly, only then do we actually do this like SSH business. Otherwise, we just like terminate them once they're all ready. It does mean that we sort of, we wait for them to come up, but this process is fairly inefficient. It's already fairly fast, so it doesn't really matter. So now we're gonna keep hunting for unwraps. I'm just gonna bring with us these two lines. Okay, so this unwrap we know is okay, because it's the same unwrap as above there. 
This unwrap we know is okay, because it's the same unwrap as above. So these unwraps are okay because they are the same as the expects above. So here, this unwrap: a spot request must have a spot request ID. Seems pretty reasonable to assume. This is a good point. Are we guaranteed that it has an instance ID? I'm not sure. That's a very good question. So this is one of the reasons why I'm happy that Rust discourages unwrap, because once you start walking through these, you're like, I don't actually know why this unwrap is okay. So let's divide these up. So this one is okay, because every spot request ID is made for some machine set, right? So in the ID to name mapping, we know that we mapped every request that we made to a name. So we know that we're gonna get one out when we look up any given request ID. But this one is really tricky. This is trying to find the instance ID of this particular machine. So the question is, why is that unwrap okay? We're making a request to the AWS API. We told it to boot up a machine, or rather we gave it a spot request. It came back to us and said that spot request is now active. Are we guaranteed that the API response that says it's active includes an instance ID? I think we are, but I'm not sure. It's a very good question. I think what we're gonna do is consider anything where it's active and the instance ID is not yet set as pending as well, which is gonna give us this behavior for free, right? So anywhere it's open or where it's active, let's map this to sir.state with an expect. So if the state is open, then of course it's pending, or if the state is active and the instance ID is None, then it's also pending, right? So now we're gonna keep issuing this until, for all of the spot requests, either they are fulfilled and all of them have instance IDs, or they've failed in some way.
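The polling rule worked out above (open is pending, active without an instance ID is also pending, and only active requests contribute instance IDs) can be sketched as follows. This is a simplified stand-in, not the crate's code: the `SpotRequest` struct, the plain-string states, and the `poll_outcome` function are assumptions for illustration, and the real API types carry these fields as options.

```rust
/// Simplified, hypothetical view of one spot instance request as returned
/// by a describe call.
#[derive(Debug, Clone)]
struct SpotRequest {
    state: String,
    instance_id: Option<String>,
}

/// A request is still pending if it is open, or if it is active but the
/// API has not yet reported which instance fulfilled it.
fn is_pending(request: &SpotRequest) -> bool {
    request.state == "open" || (request.state == "active" && request.instance_id.is_none())
}

/// Returns `None` while anything is pending (keep polling). Otherwise
/// returns the instance IDs that must eventually be terminated, plus a
/// flag saying whether every request became active.
fn poll_outcome(requests: &[SpotRequest]) -> Option<(Vec<String>, bool)> {
    if requests.iter().any(is_pending) {
        return None;
    }
    let all_active = requests.iter().all(|r| r.state == "active");
    let instance_ids = requests
        .iter()
        .filter(|r| r.state == "active")
        // Not pending and active implies the ID is present, so nothing is
        // silently dropped here.
        .filter_map(|r| r.instance_id.clone())
        .collect();
    Some((instance_ids, all_active))
}
```

With this shape, the setup and main closures would run only when the flag is true, while termination always covers every collected instance ID, including the case where the servers started and the clients failed.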
And now we can say here that we know that if it is active, then this must be Some, okay? Because active implies instance_id.is_some(), because not any pending. That make sense? It's a little bit elaborate, but this is what you end up doing if you try to fix all the unwraps in your program. Here, this unwrap is okay. So this is just us reusing the spot instance requests from the previous request that we made, so that we don't have to clone all of them. And that should be fine because we're the only people who can modify that request. So we set this to Some above. So that just can't fail. Cancel spot instance requests, though, can fail. And this one is a little bit trickier. I guess we can probably just do this. But this is one of those things where if there's a transient failure here, it makes us really sad, right? Because if that call fails, we won't cancel the request that we made. That would be unfortunate. But all right, we're just gonna keep that as a cancel for now. Failed to cancel. It's also like, if we fail to do this, maybe we want to keep running the experiment. But I'm gonna ignore that for now. I'm also gonna add a to-do here, where this is where we create the scope guard. So if we wanted to have a thing where, if we ever panicked, we would terminate all the instances: at this point, all the instances have been started, and so it's okay for us to create the scope guard. Essentially, all the code that's in step six would become the code that goes inside the scope guard and is executed when the scope guard goes out of scope. Where else do we have unwraps? Describe instances. Is the compiler smart enough to elide jumps in some of these unwrap cases? If you can work out that it won't ever panic, then I assume in many cases the compiler can do so too. I'm not sure. I know that there's some discussion about introducing a new macro for this.
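The scope guard idea mentioned above, where the teardown code runs whenever the guard goes out of scope, including during a panic, can be sketched with a small `Drop` type. This is a hand-rolled illustration rather than the scopeguard crate's API; `ScopeGuard` and `teardown_runs_on_panic` are hypothetical names, and the atomic flag stands in for the real "terminate all instances" call.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Runs the stored closure exactly once when the guard is dropped, which
/// happens on normal scope exit and also while unwinding from a panic.
struct ScopeGuard<F: FnOnce()> {
    cleanup: Option<F>,
}

impl<F: FnOnce()> ScopeGuard<F> {
    fn new(cleanup: F) -> Self {
        ScopeGuard { cleanup: Some(cleanup) }
    }
}

impl<F: FnOnce()> Drop for ScopeGuard<F> {
    fn drop(&mut self) {
        if let Some(cleanup) = self.cleanup.take() {
            cleanup();
        }
    }
}

/// Demonstrates that the cleanup still runs when the guarded code panics.
fn teardown_runs_on_panic() -> bool {
    let terminated = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&terminated);
    let outcome = panic::catch_unwind(move || {
        // Stand-in for "terminate all instances".
        let _guard = ScopeGuard::new(move || flag.store(true, Ordering::SeqCst));
        panic!("user closure failed");
    });
    outcome.is_err() && terminated.load(Ordering::SeqCst)
}
```

The guard would be created right after the instances have started, so that everything from that point on is covered. Note that this relies on unwinding; a build configured to abort on panic would not run the destructor.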
You could also imagine that you did something like, essentially an unwrap is if some, then return the thing, otherwise unreachable. The problem is that unreachable is actually a bunch of code. Unreachable does a print line and then terminates your program or does stack unwinding. And so there's actually a bunch of code there. I don't know whether the compiler is smart enough to realize that that particular path can never be taken. In, for example, in the cases we have of these sums, yeah, I don't know, it doesn't sound unreasonable because I don't think this is particularly hidden information. Some of them are a little subtle. Like the things that depend on like the exact semantics of the AWS API. If it doesn't, that sounds like something that ideally should be fixed. Let's see, fail to cancel spot instances. This might seem redundant, by the way, giving this context, but it's because the error that you get back from the EC2 API, so the describe instances error, is not guaranteed to say anything about it being related to canceling spot instances. Now this unwrap here, so this is in describe instances, I think that's returned from this as an array of reservation. It's a very good question. So here's another one, right? If we describe instances, what we get back is a option VEC of reservations. Can that option ever be none? I'm gonna go unwrap or VEC new. So this is a little bit tricky, this is basically saying that if I got no reservations, then just like skip this one. Cause it's saying like allocate a new empty vector and then iterate over it. So this means that we don't have to deal with this unwrap case. And then I can actually do the same thing here. Like if it is none, we think that's weird or if the vector is empty, we think that's weird, but it's not really an error per se. The one thing we might want to do though, is if, let's see. So I wanna keep track of, pretty silly. Let's do it this way maybe. No, it's not worth it. 
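The "unwrap or empty vector" trick from above can be sketched like this. The two structs are simplified, hypothetical stand-ins for the describe-instances response (the real types hold full instance records rather than strings), and `instance_ids` is a name made up for the example.

```rust
/// Hypothetical, trimmed-down response types: both levels are optional,
/// mirroring the Option-of-Vec shape discussed above.
struct Reservation {
    instances: Option<Vec<String>>,
}

struct DescribeInstancesResult {
    reservations: Option<Vec<Reservation>>,
}

/// Collects every instance ID, treating a missing list as an empty one
/// instead of panicking on `None`.
fn instance_ids(result: DescribeInstancesResult) -> Vec<String> {
    result
        .reservations
        .unwrap_or_else(Vec::new)
        .into_iter()
        .flat_map(|reservation| reservation.instances.unwrap_or_else(Vec::new))
        .collect()
}
```

An empty `Vec` does not allocate, so the fallback is cheap. The trade-off is that a missing list is silently treated as "nothing there", which is why a later check that each set has the requested number of machines still matters.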
So we sort of wanna assert here that the number of machines in each group, the number of instances in each set, is the same as requested, right? Because it could very well be that the API here is being stupid, and we request three instances and we get back one, and there's nothing that really checks that in our code currently, right? Because we just iterate over the things we're given back and put them into a vector somewhere. But we sort of want this to-do here, because otherwise if the user requested 12 clients for a benchmark and we give them two, that seems bad. All right, here. Oh, this is gonna be all sorts of painful, but actually not though. So here, this is where we try, ooh, we're gonna do this though. Failed to SSH to this machine. See how helpful we can be? Do this: machine.public_dns. That's kind of nice, huh? All right, so now we're gonna try to SSH to all the machines. And if we fail to connect to one, we're gonna return an error saying which one we failed to connect to. Now here, this is a little bit tricky, because if something fails here, we really want to run this code down here. But again, that's gonna be fixed by the scope guard. But it also means that we're not gonna run the closure for any of the other machines, which is maybe fine. So I guess returning an error here is okay. And then this is the user's closure. So remember how we tell the user to give us something that can return an error? What we really want to say is that the user can give us anything that returns an error, right? And in fact, the same for this and the same for, actually, no, it's not even Error. That's a good question. If I take a closure, does it want me to use Fail? Because that's what I think it does. Yeah, ideally we should use impl Fail here, but because we're trying to work on stable, we're gonna use Error. So this is saying that the closure the user gives us is allowed to return any kind of error, as defined by the failure Error type.
We could here also make it slightly easier to write. I guess this should really be Result<(), Error>. We could be nicer here and do something like allow the user to return anything that implements Into<Error>. Because that way you could return an error that's just a string. Ooh, in fact, but then it would have to be generic. Do I want it to be generic? Yeah, we're just gonna leave it this way for now. This could be made more ergonomic. Like this could be impl Error or impl Into<Error>. Which is like, give me any closure that returns something whose error type implements Into<Error>. Because that's really what we want to say, right? Your error can be anything that we can turn into an Error, which includes things like strings. But we're gonna keep it simple for now. So this means that what this gives us back is a Result with an Error, which of course means that if it is an error, then we don't even need to map the error, but we do want to add a context to it. So we're gonna map_err and then question mark. And the context here is gonna be: setup procedure for this machine, or for this type of machine, failed. And then we're gonna give the name of that group. And then of course we get to the master closure, the one that determines what to run at the end, the benchmark itself. And that thing is now gonna be a closure that also returns a Result with an Error. And here we're gonna do: main procedure failed. And I guess we could say tsunami failed. But and termination can fail. Yeah, that's true. I'll be really sad. Yeah, so here we're gonna do the same thing as we did further up, where we're gonna retry if the stream was disconnected. Otherwise, we're gonna return an error of Error::from(e).context. This doesn't need format there. You're totally right. Yeah, cool. Let's see. So here this is just gonna be: failed to terminate instances. Tsunami instances.
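The more ergonomic closure signature considered above, where the user's error can be anything convertible into the crate's error type, can be sketched with the standard library. This is an analogy under stated assumptions: a boxed `std::error::Error` stands in for the failure crate's `Error`, and `run_setup` and the message wording are hypothetical.

```rust
use std::error::Error;

/// Stand-in for the crate-wide error type (the stream uses failure's `Error`).
type BoxError = Box<dyn Error + Send + Sync>;

/// Runs a user-supplied setup closure for one machine group. The closure may
/// fail with anything that converts into `BoxError`, including a plain string.
fn run_setup<F, E>(group: &str, setup: F) -> Result<(), BoxError>
where
    F: FnOnce() -> Result<(), E>,
    E: Into<BoxError>,
{
    setup().map_err(|e| {
        let cause: BoxError = e.into();
        // Attach which group failed, since the user's error will not say.
        BoxError::from(format!(
            "setup procedure for {} machines failed: {}",
            group, cause
        ))
    })
}
```

The cost is the one noted above: the function becomes generic over the closure's error type. Flattening the cause into the message is a shortcut taken here for brevity; it loses the causal chain that a real context wrapper would keep.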
And this unwrap is okay because it's set to Some further up. So this is the same argument, that we're reusing the instance IDs from the describe request. And that describe request that we made up here, we set to Some and it's never modified below. Yeah, thanks. I saw that shit. All right, so now that unwrap is okay. That unwrap is okay. That unwrap is okay. This is not really an unwrap. This is not really an unwrap. This is the same unwrap as above. And so there are no more unwraps. Great. And then I guess we had a few in here that we're gonna get rid of. So use failure::Error, and a Result with Error. Here we're gonna return Error::from(e.context(...)). And the context here is gonna be failed to connect to SSH port. And remember, here we're gonna get this really nice chain of errors, right? Where TcpStream::connect is gonna return an io::Error. We're gonna wrap that in a message that says failed to connect to SSH port, which again is gonna be wrapped in an error that says failed to SSH to this particular machine, right? So we get this backtrace of errors, which is one of the really nice things that failure gives you. I still don't know why session returns this, but I guess we can do a, and then we say to not available. I have no idea why new would fail there. Oh, I guess we also don't need, can I find value source, 149. Oh, sir. Oh, that's giving me a lot of 192. There's no name. Failed to cancel spot instances, which means no format. If you're arriving now, you're a little bit late to the game, but at least the whole video is here. I'll also upload the video on like YouTube or something later, for people who wanna watch the whole thing. My guess is we'll end up doing a second part to this. Cause like realistically, I'm not gonna finish the rest of this today. We'll probably do a part two where we go through, essentially, the things in the to-do, which is gonna include things like rayon for parallelization, a bit more cargo infrastructure, those kinds of things.
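The chain of errors described there (an I/O error, wrapped in "failed to connect to SSH port", wrapped in "failed to SSH to this machine") can be imitated without the failure crate. The sketch below uses only `std::error::Error::source`; the `Context` type, the `context` and `chain` helpers, and the host name are all hypothetical stand-ins for what failure's context mechanism provides.

```rust
use std::error::Error;
use std::fmt;
use std::io;

/// Hypothetical wrapper: a message plus the error that caused it.
#[derive(Debug)]
struct Context {
    message: String,
    source: Box<dyn Error>,
}

impl fmt::Display for Context {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.message)
    }
}

impl Error for Context {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(self.source.as_ref())
    }
}

/// Wrap any error in one more layer of context.
fn context<E: Into<Box<dyn Error>>>(source: E, message: &str) -> Context {
    Context {
        message: message.to_string(),
        source: source.into(),
    }
}

/// Collect the messages of an error and all of its causes, outermost first.
fn chain(err: &(dyn Error + 'static)) -> Vec<String> {
    let mut messages = vec![err.to_string()];
    let mut current = err.source();
    while let Some(cause) = current {
        messages.push(cause.to_string());
        current = cause.source();
    }
    messages
}

fn main() {
    // Stand-in for the error a failed TCP connection would produce.
    let io_err = io::Error::new(io::ErrorKind::ConnectionRefused, "connection refused");
    let port = context(io_err, "failed to connect to SSH port");
    let ssh = context(port, "failed to SSH to machine ec2-host.example");

    for (depth, message) in chain(&ssh).iter().enumerate() {
        println!("{}: {}", depth, message);
    }
}
```

Each layer only knows its own message; walking `source()` is what recovers the full story of where the failure started.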
205, reservation, forgot a question mark. 271, forgot a question mark. SSH 19, forgot a question mark. Oh, this map_err no longer has to be there. Now that's gonna turn into our other magical map_err location. This thing over here, this little monster. So this is not gonna say failed to establish SSH, failed to perform SSH. So this is gonna be failed to authenticate SSH session. Yeah, so this to-do we're gonna have to deal with too. Although that will come under this point actually. I guess we should remove that one for now, given that we're dealing with it. Command. Yeah, command is gonna return this in an error. I guess here we could be really nice and do a four command. If we want it to be nice status, it's like maybe we do, I guess. The failed to execute command. This is gonna be never, and I guess to-do: check channel exit status. How about that, compiler? Take that. Ooh, no more IO. Three. Oh, this is where the bail thing comes in. Is that channel session supposed to be after? No, okay, so the way SSH works is you first establish a session, and that session has multiple channels, and on each channel you can execute one command. Or you can spawn a shell, in which case you can execute multiple commands. But so we create a new channel for the command we're about to execute, and then we execute a command on that channel. Then we read the response of that command, which I should also have here. And then we wait for the command to finish. Right, yeah, so we were up here. So the question is why does Error::from, I think it's a compile error. Compile error. Where? Oh, that thing, yay, I pasted too many lines. You're totally right. Okay, so it's not liking this Error::from. Interesting. Can I just like make up a cause? Apparently not. Can I make up a context? Great. We're just gonna do Context::new. Let's see, 132. That's not what line 132 is. Oh, right. It doesn't realize that the question mark forces the method to return. We'll do this, which is the same thing.
Oh, my side of the error, let me do that. Great. And then 120, never read. All active is never read. That's interesting. Oh, that's true, because all active is only assigned later. Yeah, okay, so that's fine. We can just not assign this a value. I like the structure of response. Yeah, I agree with you actually. I do think the structure, so think of this, for example, where we sort of started everything just in one file and it mostly is still in one file, but it just has a nice flow to it, I think. Right, so this all ran. So now let's see if we can run our example and how that still works. Oh, it's compiling all the extra dependencies for the failure crate. For some errors, I think the AWS API is gonna give us good enough errors, just from observing it so far. I don't think we need to invent our own. Watch it like crash on the first line or something silly. Our example is now gonna, so the one thing that's a little bit awkward about this is that now nothing is ever gonna crash. We actually need to do an unwrap here. That's the only point at which errors will be exposed, because nothing else is gonna panic, right? They'll just keep propagating errors. This is why in normal Rust programs, if you only use the question mark operator, it's actually really annoying to figure out where that error started from, because they just keep re-throwing the error but you don't keep track of a backtrace, whereas adding this context sort of forces that step of the backtrace to be kept, which is really nice. So this unwrap will in theory actually tell us where the error occurred. Well, let's see what happens. Slowly, but surely. This ended up being longer than I anticipated, but at the same time, it's fun. Come on, compiler. You think we can implement scope guard while it's compiling? We'll try, I think we can do it. Let's see. Okay, so scope guard is basically you make a scope guard.
You give it a closure, and that closure will be called when that scope guard goes out of scope, even if there's a panic, which is really nice. So we're gonna, over here, we're gonna do scope guard new. Probably a good idea to not do anything while compiling. Ah, it's fine. It's just compiling dependencies. Compiling this crate is not actually really a problem. Oh, I see what you mean. Yeah, that's fair. I really wanted it to, what's the thing for making it not use cores? Let's see if I can, here. Let's see if this helps. Dash C03. Ooh. How's this? Is the stream now still laggy? Yeah, probably. So in theory, this should only run cargo on two cores. So maybe that's better. Okay, it's a little bit weird. I'm not entirely sure why it's better, because it still seems to be trying to use all my cores. But, yeah, that seems to be better. Why is it spawning multiple instances of cargo? That seems wrong. All right, great. We'll try with that command and see if that makes everything happier. Twitch is very confused about the quality of my stream. All right, I'll try to resist the programming. Is the audio fine though? Cause then I can sort of talk through what the plan is. That's fine. I just really want to implement scope guard, you see, but I'll resist, that's fine. The audio has always been fine. Okay, that's good. I also wonder whether, so I'm also recording the stream. So I wonder whether the recording will also be laggy. I guess we'll find out. Why is it recompiling Rusoto? That does not make a lot of sense. So the idea with scope guard at least is to use the scopeguard defer, which, yeah, I think this helped with the video stream. So the defer for scope guard is essentially very similar to what you get with the defer keyword in Go. So you can do this. You can declare a scope. I think there's an example here somewhere, which might be more helpful to watch.
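A scope guard of the kind described here can be hand-rolled with `Drop`. This is a minimal sketch using only the standard library, not the scopeguard crate's implementation; `Guard`, `run`, and the `terminated` flag are hypothetical names, with the flag standing in for "terminate all instances".

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical minimal scope guard: runs its closure when dropped.
struct Guard<F: FnOnce()> {
    on_exit: Option<F>,
}

impl<F: FnOnce()> Guard<F> {
    fn new(on_exit: F) -> Self {
        Guard { on_exit: Some(on_exit) }
    }
}

impl<F: FnOnce()> Drop for Guard<F> {
    fn drop(&mut self) {
        // Drop runs on normal exit, on early return, and while unwinding
        // from a panic, so the cleanup closure runs in all three cases.
        if let Some(on_exit) = self.on_exit.take() {
            on_exit();
        }
    }
}

/// Stand-in for the main procedure; `terminated` records that cleanup ran.
fn run(terminated: &AtomicBool, fail: bool) -> Result<(), String> {
    // Bind to a named variable: `let _ = ...` would drop the guard at once.
    let _guard = Guard::new(|| terminated.store(true, Ordering::SeqCst));
    if fail {
        return Err("main procedure failed".to_string());
    }
    Ok(())
}

fn main() {
    let terminated = AtomicBool::new(false);
    let result = run(&terminated, true);
    println!(
        "result: {:?}, cleanup ran: {}",
        result,
        terminated.load(Ordering::SeqCst)
    );
}
```

One caveat with any drop-based guard: it does not run if the process aborts instead of unwinding, so it reduces, but does not eliminate, the chance of leaving instances running.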
Yeah, so you declare a scope, and then inside that scope, you declare something that's gonna happen at the end. And in our case, really what we want to happen is this entire thing, right? Like if everything ever ends, then we want to terminate all of these instances. We want to stop the spot request immediately, right? So here we want to terminate all instances whenever we exit, even on panic, right? Oh, I guess that's running now. That's, well, that's fine. It'll do something. It's fine. You must CTL cargo check. Yeah, this is gonna get sad, isn't it? So I think we're gonna, basically what happens is we need a separate instance of the instance IDs for termination to use now. So it used to reuse the ones from the describe. So we're just gonna do, well, that's promising. That looks like it's SSHing. Here we're gonna do this even. So I think I've read about unwrapped too. How about that? So this is with one server, three clients. And notice we get one server that prints its hostname, three clients that print the date. And then we get the private IP of the server and the private IP of each of the clients. No errors though. So we didn't actually get to show any of the work that we did on errors, but hey, it's fine. Let's see. So this is probably gonna be awkward for a couple of reasons. I don't actually use the scope guard. Oh, that's right. I can't, mm. Yeah, I'm just gonna not do the scope guard stuff now, because it's gonna distract us. Okay, so we're gonna do this and we're gonna do cargo, we're gonna do that. All right, but this is a pretty good result. Let's run it again and see whether we might get it to give us an error. So that'd be nice. But notice how painless it was to add all the failure stuff. It's really just walking your crate and looking for unwraps and then turning them into this, like, I mean, normally you can just add a question mark.
In reality, what you usually want to do is add some context to it, but it's not technically required. But I think this puts us in a pretty good spot. Like now we have context for all our errors. So in theory, even if something goes wrong, we'll get useful error output, right? It is true that we still want to do this deferred scope for execution, because currently if something goes bad, I have to manually go into this list and kill the servers. But in practice, we have pretty decent error handling now, I think. So let's just see what happens if it runs one more time. Let's see. Oh, that's the other thing we wanted to add to the to-do: logging. So just to remind all of us, this is the client program that we're trying to write. So it spins up a server that's gonna run from one AMI on an m5.large. So we're gonna run one of them and we're gonna just print the hostname. And then we're gonna run another machine set that's gonna be sort of named the clients. We're gonna run three of those. We're gonna run on the same AMI with the same machine setup, and they're gonna print the date. Very, very fancy. And then that's just gonna be the setup for those different classes of machines. Once all of the classes of machines have been set up, then we're gonna run this closure. It's gonna be given access to all of the machines from all of the types, and specifically it has access to a .ssh on a machine, and .ssh is an SSH connection that's already been established. Hey, it worked again. There are no errors anymore. It's pretty frustrating. And you'll be able to SSH in and do things on all of these machines. And then once this run closure returns, then we're all done. Any ideas when you will stream next time? I'm not entirely sure. My guess is the next stream will also be a decent amount shorter, just because now we have most of the infrastructure in place. So most of these are sort of more about how do you polish a crate?
Which may be more interesting actually for those of you who have more experience with writing crates already. But my guess would be in like a week's time probably. I'll post on Twitter again when I've determined when the time will be. But hopefully in like not too long. I think that's all I wanted to do for this time round. Let's just check that everything's cleaned up. Thanks to you all for joining me and having fun. I hope this was useful in some way. And then I'll post somewhere and announce when I decide to do another stream. Also, ooh, that's right. I will push this, all of the error. All right, I will also post a... Oh, sorry, yeah. My Twitter handle is jonhoo, like so. I will also post the recording of this on YouTube for those of you who missed the beginning. I'll probably archive the Twitch stream as well, which I think should also save the video. But that way we'll sort of have both. And also if you have any feedback about today, I would really like to hear it, whether you post it here or whether you post it to me on Twitter or however you wanna get it to me, that's fine. This is the first time I'm doing a live code stream. So if you feel like something could be better or it could be different, or there's something you would like to see, or something you feel like I did too much, feel free to let me know. Yeah, cool. Thanks for hanging out with me and writing code. It was fun. So long, farewell.