Hey folks, this time we're gonna try something a little different. It'll be sort of a Rust implementation stream, but this time we're not implementing an algorithm or some kind of useful crate or anything. Instead, we're gonna take on a set of basically programming challenges. This particular set was one I was linked to not too long ago, and it's a set of distributed systems challenges that use a platform called Maelstrom, which is basically a distributed systems testing framework, or exercise framework, that can orchestrate message passing between nodes in a distributed system and emulate things like delayed messages, reordered messages, dropped messages, nodes coming and going, that kind of stuff. And it's written alongside the author of Jepsen. If you haven't looked at Jepsen, it's a really cool effort: they're doing correctness research for distributed systems. They have this framework for exercising a lot of the interesting corner cases of distributed systems, and they've found a bunch of bugs in real systems, like Redis-Raft and Postgres, and I think they did something with etcd. So they're studying real systems and finding real distributed systems bugs. Now, I haven't actually looked through all the exercises yet, but I assume this is going to be a sort of "let's build up an increasingly sophisticated distributed system", run it through Maelstrom, and see whether what we implemented is actually correct, or what kind of additional mechanisms we might need to introduce in order to make it correct. I'll link the website in the chat so that you can take a look. Now, the exercises they give here, or the documentation for them, is in Go, because I assume that's what Fly.io uses or something.
And I took a look at the Maelstrom repo, and in the Maelstrom repo they have demo implementations of the node code. And when I say node, I mean a node in a distributed system, not like Node.js. They have demo implementations of that node stuff in Ruby, Go, JavaScript, Java, and Python. There isn't one in Rust as far as I can tell, so we're gonna have to write a little bit of the connecting code to get all this to work. Now, looking at it, I think it starts with an echo challenge, yeah. So Maelstrom basically requires that each node is just a binary, and all the nodes are running the same binary. They receive JSON messages from standard in and send JSON messages to standard out, and these JSON objects that you send and receive correspond to network messages. Like, if we look at the protocol spec, which we're gonna have to implement here: the messages are of the form source, destination, and body, where source is the identifying string of the node that sent the message, and destination is an identifying string for the node that is the target of the message. You could think of these like IP addresses, but because we're using a thing that essentially emulates the network here, they're gonna be node names, like n1, n2, n3, et cetera. The message bodies have the following reserved keys; message IDs should be unique. Each message has additional keys depending on what kind of message it is. I see, so these are fields that can be set on any message, sort of reserved keywords, and then you can set other stuff in the message too: they can have any body structure you like. So we're basically gonna have to implement this protocol, but it seems like a pretty straightforward protocol to build on top of.
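As a concrete sketch of the shape just described (the field names follow the Maelstrom protocol spec; the specific node names, message IDs, and echo text are made up for illustration), an echo request and its reply might look like:

```json
{"src": "c1", "dest": "n1", "body": {"type": "echo", "msg_id": 1, "echo": "hello"}}
{"src": "n1", "dest": "c1", "body": {"type": "echo_ok", "msg_id": 1, "in_reply_to": 1, "echo": "hello"}}
```

Note how src and dest swap between request and reply, and how everything besides the reserved keys (type, msg_id, in_reply_to) depends on the message type.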
Okay, so we'll have to think about exactly how we wanna model this and how accurately we wanna do it. I kinda want to just get started with the real distributed systems part of things rather than spend too much time on implementing the protocol, but we're gonna have to do that just to get set up. In fact, how about we just do the echo example here? This is just to see that it works: you get an echo message in from this orchestration system, Maelstrom, and your job is to send a message with the same body back to the client with a message type of echo_ok. So I guess we're just gonna start writing some code. I already grabbed maelstrom, so I have that over here; once we wanna test this, we can actually do that. Great, so let's do cargo new, and what are we gonna call it? What's a fun name for a distributed system that runs on top of maelstrom? What is the center of a maelstrom called? Vortex is the proper term. Yeah, but I want something that has to do with Rust, ideally. Turbulence is pretty cool. Vorticity, ooh, that's a cool word. The Naruto whirlpools, that's fantastic. Alternatively, I'm thinking like a ship, right? Something that might navigate a whirlpool. Hmm, eddy is also cool. Moloch, the name of a huge ship in a story about The 13½ Lives of Captain Bluebear. Rustengan is fantastic. Yes, let's do rustengan. Huh, yeah, yeah, rustengan. Nice, great, I love this already. My girlfriend would be very proud. Okay, so Rasengan is, I'm gonna call it a battle spell even though that's not really what it is, from the anime Naruto, where you create a swirl of wind in the palm of your hand. So it's kind of like a whirlpool, but also it's rustengan because we're doing it in Rust. Okay, so what are we gonna need? Well, we're gonna need serde, and we're gonna need serde_json.
And we're gonna need those because the protocol here is in JSON. We're gonna define something like a struct Message. These are gonna be passed everywhere, so I'm tempted to make the name short, but I'm not going to. And I'm probably gonna make this generic, but we'll do that down the line. So there's gonna be src, and there's gonna be the destination, which they call dest, but I want it to actually be called dst so that the two field names are the same length. And I guess I will derive: Serialize, Deserialize. I also want Debug and Clone here, which means I'm gonna need serde's derive feature. And I want body. And body is going to be one of these things. So what is body? Let's look at this in a second. There's type, which we can't have as a field name, because that's a reserved keyword in Rust. There are a couple of ways to get around this: either you call it typ or ty, or you call it kind. I'm gonna go with ty, because why not, plus a serde rename. It is mandatory, and it's a string identifying the type of message this is. It would be nice if that was an enum, but let's keep it simple. Then there's the id, which is called msg_id in the protocol, a unique integer identifier, so this is gonna be an Option<usize>. And there's in_reply_to, which is Option<usize>. And then they say it can also have arbitrary other key-value pairs. So we have a couple of options here. We could do something like a rest field that is a HashMap of String to serde_json::Value; that's one option. The other option is to make this generic over B and then use #[serde(flatten)] rest: B. I don't think we wanna do this: it's tempting, but it requires that you know the type of the message ahead of time.
Like, at the time of deserialization, which we're not generally going to know, because the payload type is gonna depend on the type field up here. Now, we could make this an enum, actually, and then say that it is internally tagged by the type here. That's also kind of tempting, although it means we're gonna have to explicitly list out all of the enum variants. Actually, I kind of like that; maybe we do that. So maybe we do this and then we say — I don't know if this works. Here's what I want to do; I don't know whether serde supports this, but I wanna do enum, I guess, Payload. (Sorry about the bright mode, there's no dark alternative for serde.rs.) #[serde(tag = "type")], and then, in theory at least, we should be able to just explicitly enumerate these and say the message types here are: Echo is gonna be one of them, and this is gonna be one of those serde renames. There's a #[serde(rename_all = ...)], there we go, rename_all = "lowercase" — and I think it's probably not even lowercase. I guess we'll see this soon. It's gonna be echo_ok, so they're using snake case, so we're gonna have serde turn all the variant names into snake case for us. And now we should be able to have these be variants. So if we go back here: for echo, there's gonna be a field called echo, which is a string. Now, the thing I don't know whether it's supported is this business, which is: I wanna flatten the payload in here. I guess we'll see. The flatten bit here is like: the body has the type, which is the tag, but it also has these fields that are shared across all the possible enum variants, and I don't know whether flatten works in this regard. We'll find out. All right, so now let's just write the logic for actually doing the deserialization of inputs. We're basically constructing the state machine driver here, right?
So we're going to say stdin is std::io::stdin, stdin.lock, and we're gonna do stdout. The idea here is that we really want the inner state machine to just deal with messages in and out, and not have to deal with things like IO. Actually, I kinda wanna make some of this a library; we might do that down the line. So here we want standard in to be a StreamDeserializer. In serde there is this thing called a Deserializer — nope, that's not what I want; I want the docs.rs serde_json Deserializer. So you can construct a Deserializer, and what's neat about it is that it can be turned into an iterator if you know that there are gonna be multiple things you're going to deserialize. So what we do here is: inputs is serde_json from_reader of standard in. I guess we can do question mark here; I don't love it, but it's fine. For the return type we'll probably do anyhow::Result — feels fine to bring in anyhow for this. So we do from_reader, question mark, and then we do .into_iter just to make the compiler happy. And what is it complaining about here? It wants R and T. R we're gonna let it infer, and T is gonna be Message. Why? Oh, I actually need to do Deserializer::from_reader(...).into_iter(), and now we don't need this and we don't need this, but into_iter is now the thing that gives Message. Okay, so now we should be able to do while let on inputs, or alternatively we can just do, I guess, for input in inputs. The way this deserialization works is that if you get an error during deserialization, it's still an iterator overall, but the items that it yields are Results, and a deserialization error would propagate up through the item the iterator yields. So that's why we do this unwrap over here. I also wanna use anyhow's context to give some context to this thing, which is gonna be: Maelstrom input could not be deserialized.
Input from standard in could not be deserialized. And then this is where we get to the state machine. So we'll do something like struct EchoService, or I guess we could call it EchoNode, why not? Initially EchoNode is gonna hold nothing, and on EchoNode we're gonna have — this is a pretty common way to model state machines; there are crates that let you write state machines too, but I'm gonna assume we don't need that, at least not quite yet. Then we're gonna do something like handle. It's gonna get a mutable reference to the state of the node in the distributed system, and it's gonna get the input, which is gonna be a Message. And then it's also gonna get a mutable reference to, I think, a serializer for the output stream. The idea here being that as the node is executing, it might want to send messages as well, right? Seems pretty reasonable. Those might be responses, but it might also trigger messages to other nodes. And so the state step function here — let's call it step — needs a way to send messages. Now, arguably it actually also needs a way to wait for a message before it replies to the current message, so there's a tricky interleaving of things that can happen here; it's not clear we actually wanna model that in the full state machine. Well, we'll see how it turns out. So here what we'll want is a &mut, and it returns an anyhow::Result, so that it has the ability to just fully error. Chat says the Maelstrom protocol newline-separates the JSON objects, so you don't need to use the fancy StreamDeserializer thing. That's true, but this gives a nice interface anyway; I don't think it's all that much additional complexity, really. Great. Yeah, so I don't know whether we're gonna want to express this through a step function, but for now this seems probably fine.
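The step-function shape being described can be sketched with std only. A plain String stands in for the real serde-typed payload, and a Vec<u8> stands in for stdout, so this only illustrates the interface, not the wire format:

```rust
use std::io::Write;

struct Message {
    src: String,
    dst: String,
    body: String,
}

trait Node {
    // `step` may write zero or more outgoing messages through `output`,
    // not just one reply -- which is why it writes rather than returns.
    fn step(&mut self, input: Message, output: &mut dyn Write) -> std::io::Result<()>;
}

struct EchoNode {
    msg_id: usize,
}

impl Node for EchoNode {
    fn step(&mut self, input: Message, output: &mut dyn Write) -> std::io::Result<()> {
        // Reply by swapping source and destination and echoing the body.
        writeln!(output, "{} -> {}: {}", input.dst, input.src, input.body)?;
        self.msg_id += 1;
        Ok(())
    }
}

fn main() -> std::io::Result<()> {
    let mut node = EchoNode { msg_id: 0 };
    let mut out = Vec::new();
    node.step(
        Message { src: "c1".into(), dst: "n1".into(), body: "hello".into() },
        &mut out,
    )?;
    assert_eq!(String::from_utf8(out).unwrap(), "n1 -> c1: hello\n");
    Ok(())
}
```

The driver loop then just feeds each deserialized input into step and lets the node decide what, if anything, to send.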
At the moment, actually, we can just implement the logic right here, because the logic they want is pretty straightforward, right? The reply is gonna be a Message, and the message is gonna have a src which is the input's dst, and it's gonna have a dst that is the input's src. It's gonna have a body, and the id of the response we're gonna have to generate; so that's the one bit of state in here for now: id. So we're gonna do self.id, and then we'll do self.id += 1 down here. The id is gonna be Some, the in_reply_to is the id of the input here. And the payload is going to be Payload::EchoOk — not Echo. And what was the response they wanted? Okay. Yeah, so here you see they've built a library that does the reply mechanism for you; we haven't done that yet, so we manually swap the source and destination here. So this is gonna be an EchoOk, and the echo is gonna be the echo field, which we haven't extracted yet. So this is one of those: we're gonna do a match on input — actually, a match on input.body.payload. If we receive an echo_ok, we just do nothing; and if we get an echo, that's when we wanna send this echo_ok reply, and this is gonna have in_reply_to set. I thought the id was required — am I misremembering? In the protocol, only the type is required. Okay, so we generate an in_reply_to only if the input had an id. And so for this reply, we're then gonna do output dot — oh, what's the way you use the serializer again? Right. I thought there was an output.serialize. Can I use serialize_any here? Reply. Failed to serialize response to echo. Oh right, it's reply.serialize, and you give it the serializer. That's right. Great, so down here I guess we're going to construct this output channel as well: serde_json::Serializer::new(stdout).
And we're gonna do: state is EchoNode. So we construct this EchoNode — it's the state for the echo node, it starts at zero — and then we do state.step, give it the input, and this is gonna get context: node step function failed. And we need to pass in the output serializer. There's obviously a lot more framework we could build here, but this is the basic motion of the system, right? You get messages in, those messages are gonna cause other messages to be sent, and you need to deal with those as a result. So I think now we have a binary that, in theory, does what they ask for. I'm sure there's gonna be stuff that breaks here. Great. So if I now run cargo r — great, it's running, because it's just waiting for messages. And so if I now run maelstrom, and the binary we're gonna use is target/debug/rustengan, see what happens? Well, it's doing something. Oh, it crashed. Unknown variant init, right. So it says this somewhere: initialization. At the start of a test, Maelstrom issues a single init message to each node, like so. The node_id field indicates the ID of the node which is receiving this message; the node_ids field lists all nodes in the cluster, including the recipient. In response to the init message, each node must respond with a message of type init_ok. Great. So in other words, there are two more variants here: Init and InitOk. This is one of those cases where it would be nice to split this enum between requests and responses, so that we don't have to list them together, but it doesn't really matter here, at least not at this stage. So node_id is gonna be a String, and also node_ids, which is a Vec of String. And in response, we have to respond with a message type of init_ok, which has no fields. Great.
So now we should be able to say: for Init, we don't actually care about any of the fields, but what we're gonna do is respond with the required message, which is InitOk. And if we receive an init_ok message, that should never happen. We might receive an echo_ok, because remember, if some node sends us an echo, then we're gonna send it an echo_ok back, and so we might receive an echo_ok back. The init_ok, I assume, goes to the Maelstrom servers. If you saw up here somewhere — yeah, so there are nodes n1, n2, n3, et cetera, that are our nodes, and then there are the c nodes, which are the Maelstrom-internal clients that trigger these to be sent in the first place. So our nodes should never receive an init_ok, because we never send an init to one of ours. So this is: received init_ok message, should never happen. And I guess here we can do — I want bail in here too. Great, let's try that. Yeah, I need to build it. See what it does. "What do you mean by split the enum? Have two unrelated enums, or is there a way to specify that the two enum types are somehow related?" It was the former: I was thinking split the enums, but then have them sort of flattened into the same actual enum that gets used by serde. And I think there is a way to do that, but it seems probably unnecessary. Expected node n0 to respond to an init message, but the node did not respond. That's interesting. It might be a matter of flushing here. Oh really, can I not get at the underlying type? That's gonna be annoying. I think the problem here is that standard out is buffered, so when we write out here, we're not flushing standard out as well, which means that the message isn't actually getting out to the server. It might be enough to just print a newline, because I think this is a line-buffered writer.
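The flushing problem being diagnosed here comes down to this pattern: write one message per line, then flush, so a consumer reading our stdout (like Maelstrom) sees the message immediately rather than when the buffer fills. A std-only sketch, with a Vec<u8> standing in for locked stdout and an invented init_ok message as the payload:

```rust
use std::io::Write;

fn send(out: &mut impl Write, serialized_msg: &str) -> std::io::Result<()> {
    out.write_all(serialized_msg.as_bytes())?;
    // The protocol is newline-separated JSON objects, so end the line...
    out.write_all(b"\n")?;
    // ...and flush so the message doesn't sit in a userspace buffer.
    out.flush()
}

fn main() -> std::io::Result<()> {
    let mut out = Vec::new();
    send(&mut out, r#"{"src":"n1","dest":"c1","body":{"type":"init_ok"}}"#)?;
    assert!(out.ends_with(b"\n"));
    Ok(())
}
```

A Vec never needs flushing, of course; the flush matters when `out` is a real, buffered stdout handle.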
The challenge is that in order to print a newline, we need to get at the inner part of this output stream, which Serializer doesn't let us. All we can do here is unwrap it and then put it back together, but that feels unfortunate; it might just be what we have to do here. Actually, I wonder if we could do this with pretty — like, if we made this a pretty serializer, is it gonna print newlines for us? Oh, I need to do pretty. Chat points out the protocol itself also requires you to print a newline, because it's newline-separated JSON objects. Makes sense. Wait, why is it — oh, it's a pretty formatter. Fine, fine, fine. Where's the 'a here? This comes from serde_json. "Can you explain again, when would the node receive echo_ok?" So if one of our nodes sends an echo request to another one of our nodes, then that node is gonna respond with an echo_ok to our node, so it's reasonable for us to receive an echo_ok. In this particular case, I don't think it would happen, because I think the initial echo messages come from the c nodes. Let's see if pretty saves us here. "Did you mean to encode this line as JSON?" Okay, so it actually requires them to be one per line. Now, there is technically a way to do this, which is a newline writer, where anything that gets written to it gets a newline appended too. But I think the way we're gonna do it then is to not construct the Serializer here, which is a little sad, but it is what it is. And instead say this is gonna get the standard out, and we're gonna do — no, not reply.serialize; we're gonna do serde_json::to_writer with output and reply, and then we're gonna do output.write_all a trailing newline. Let's see if it's happy about that. And we need to do the same down here, and this should be standard out. Yeah, great. Nope: borrowed value, output. Why does it not want to be helpful to me here? Do I really need to do a re-borrow here? That's pretty stupid.
So this is just saying: when I do this, the ownership of the mutable reference gets transferred to to_writer, which means that I'm not allowed to use it anymore here. Whereas what I really want is for the mutable reference to be re-borrowed: I want a mutable reference to go into to_writer, but I don't want that to mean that I no longer have it. So I do a re-borrow here, where I dereference and create a new mutable reference, which I'm allowed to do because I have a mutable reference. And at the end of that call, it's no longer mutably borrowed, so I can use it again up here. I think what I'll do, once we get to the next challenge, is move some of this stuff into lib, and then have each binary just define the bits of the protocol that it uses. "Why can't the step function return a message and have the surrounding code serialize and print it?" It could. The reason I haven't done that is that in this particular exercise it's just request-response, but you can imagine that when a node receives a particular kind of message, it actually sends a bunch of messages. For example, it might send a message to all nodes, in which case it's not sufficient for it to return one message. Furthermore, it might be that when I receive a message, I have to send messages to, like, three hosts and wait for them to respond before I can respond to mine. So there actually has to be a mechanism for sending messages that is separate from returning from the function. We could have it return a message for convenience, but that's sort of separate. Yes, here we go, that seems promising. Everything looks good. Wow — well, my terminal is not that good at printing Unicode symbols, it seems. Okay, so we basically passed the first exercise, right? We now have this working.
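The re-borrow being explained can be shown in a std-only sketch. The hypothetical write_msg takes its writer by value, the way serde_json::to_writer does, so passing a `&mut` moves it; `&mut *out` creates a fresh, shorter-lived borrow instead:

```rust
use std::io::Write;

// Takes the writer by value, like serde_json::to_writer does.
fn write_msg(mut w: impl Write, msg: &str) -> std::io::Result<()> {
    w.write_all(msg.as_bytes())
}

fn main() -> std::io::Result<()> {
    let mut buf = Vec::new();
    let out = &mut buf;
    // Without the re-borrow, `write_msg(out, ..)` would move `out` into the
    // first call and the second call would fail to compile. `&mut *out`
    // dereferences and re-borrows, so the borrow ends when the call returns.
    write_msg(&mut *out, "first")?;
    write_msg(&mut *out, "second")?;
    assert_eq!(out.as_slice(), b"firstsecond");
    Ok(())
}
```

(The compiler inserts this re-borrow implicitly in many positions, but not when the parameter is a generic `impl Write` taken by value, which is why it has to be written out here.)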
We can look at maelstrom serve, too. There's a mechanism in maelstrom that lets you basically look at all the stuff that was sent. So we can look at this execution of echo, and we can look at all the messages that were sent; we can look at the latency for echo, the rate, the throughput; we can look at all the messages that were exchanged, and between which nodes. This is gonna be handy for debugging and stuff later, but for this echo server it's not all that interesting. Okay, so now we have distributed system one done. Next is unique IDs: we need to implement a globally unique ID generation system that runs against maelstrom's unique-ids workload. Your service should be totally available, meaning that it can continue to operate even in the face of network partitions. Okay, so here now comes the next question: do we want to tidy this up a little before we start generalizing? And I think we do. So what we're gonna do here is lib.rs, and we're gonna take all this stuff and move it over here. And we're gonna say that the payload here is going to be — I wish there was a nice way to — I wanna see if this works. Yeah, I was worried that might not be the case. So this means that the payload needs to be fully defined by the caller, which is a little unfortunate, but it's okay. So the payload here is going to be generic. And the reason it's okay to make the payload generic here is that the payload is entirely based on what node service you're implementing. I'll commit the code. All right, fine, fine, I'll commit the code first. That's fine: git add. Actually, I wanna gitignore store as well, which is maelstrom's thing. So I gitignore that and add the cargo files and source main. So, the reason it's okay to make Message generic over the payload here: remember, I mentioned earlier that the problem with doing this is you need to know the generic type at the time of deserialization.
Now, if we say the payload here is all of the message types that are used by a given service, then we do know that at deserialization time. Like, the echo service knows that only echo messages should be exchanged in this particular messaging network. So this should be okay. What we can do is say struct Init, just because we know what those fields are. So this is gonna be pub, this is gonna be pub, these fields are going to be pub, and same with the node_id and the node_ids. Now, there are some things we could do here that are kind of interesting, like have a pub struct Node that we implement on behalf of the user, basically, which would have the ID management — the message ID management, that is — and would have mechanisms like reply. I kind of wanna avoid doing that yet; I wanna build a second service first to see what it is going to need. The main loop here, though, I think is kind of tempting to expose. So we could have a pub trait here called something like Node instead, and it is going to define the step function over the payload. And in order to implement main — and this main is really, like, a main_loop — it is going to take a state S, and S implements Node over Payload, and Payload needs to implement DeserializeOwned. So that's gonna be the main loop here. And we're not using bail anymore; we're not using write. So now we have a main loop that echo can reuse, and we can make changes here so that it doesn't use a step function, for example. But if we now go back to Cargo.toml, what I want to do is say there is now a bin. Actually, we don't even need to do this: we can do mkdir src/bin, and then we can git mv src/main.rs to src/bin/echo.rs.
And then we can go over here and say this is gonna use rustengan::*. So now: impl Node<Payload> for EchoNode. This can mostly stay the same, and our main is going to be just main_loop, and pass in this. So now we've split this up a little bit, and at least in theory, if we now run maelstrom again — and instead of rustengan, the binary is now gonna be echo — hopefully it should still just work; we haven't really changed anything. "Will the init and echo bits still be required in challenge two?" Init, I think, will be required: init is a global setup step, but echo is not. So now we have split out the shareable parts, and so we should be able to copy echo to unique-ids. For unique IDs, we need to support a generate message, and we need to return a generate_ok message with an id, which is gonna be... oh, IDs may be of any type: strings, booleans, integers, floats, arrays. Okay, so realistically it's gonna be a string. There are a bunch of ways to do this, right? You could run a full consensus algorithm to decide what ID to generate — you could basically run Paxos for everyone in the network, or really a quorum of the network, to agree on what the next ID that should be generated is, and then it could actually be an integer. The other way to do this is you generate a unique string, and you make it unique enough that there just aren't collisions. One is very easy; one is very slow and hard. But the very easy one is also more costly in that the IDs are gonna be longer, and there's a somewhat higher risk of collisions, though that all depends on how you generate the IDs. But let's see if we can get by with unique generation here. So this is gonna be UniqueNode. It's going to, for the init message, still have to respond, and for the generate message, respond with generate_ok. We actually want this here as well.
This is another one of those where I don't think our nodes will actually receive any of these, because we're not sending any generate messages, but it doesn't seem like a problem if we get one. Okay, so if we're told to generate, we need to respond with a generate_ok, which is gonna need to have an id — although id is not really the right word here. I actually want this to be guid, because that makes it a little clearer in our code, so that we don't have to refer to the id everywhere. And so when we generate this response, we're gonna have a guid. And now the question is: okay, how do we generate this guid? This is the server's response to generate; so that's the missing part here. And then this is going to be a UniqueNode. Okay, so how do we generate the guid? Well, there are a bunch of different ways. If you go look at something like the ulid crate — UUID is sort of the standard way to do it, but ULID is a little nicer because it's lexicographically sortable. That doesn't really matter most of the time, but it has the nice property that the time field appears earlier in the identifier, so if you sort them, they end up sorted roughly by the time they were generated. And a ULID — I wish they had an example of this in the docs — looks like this, where part of it is a timestamp in milliseconds that gets turned into a string, and then part of it is randomness at the end. Now the question is: is it random enough? Unclear. Yeah, so here: randomness bits. So you see, it's a timestamp that's encoded using letters, and then randomness, and so at least in theory we could use this. It is true that this is not guaranteed to be globally unique, but it's close enough. "Aren't we supposed to use the info in the init message to set the node ID?" No; let me do the ULID thing first, and then I'll explain. So Ulid::new().to_string().
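The "timestamp prefix, then randomness" layout is why ULIDs sort by creation time. Here's a std-only toy illustration of just that property — it uses hex rather than ULID's Crockford base32, a plain counter in place of real randomness, and assumes the system clock doesn't step backwards, so it only demonstrates the layout, not a usable ID scheme:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

fn ulid_like(counter: u128) -> String {
    let millis = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("clock before epoch")
        .as_millis();
    // Zero-padded, fixed-width, most-significant-digit-first: string
    // comparison then agrees with numeric (i.e. chronological) comparison.
    format!("{:012x}{:020x}", millis, counter)
}

fn main() {
    let a = ulid_like(42);
    std::thread::sleep(std::time::Duration::from_millis(2));
    let b = ulid_like(7);
    // Later IDs compare greater even though 7 < 42, because the
    // timestamp prefix dominates the comparison.
    assert!(a < b);
    println!("{} < {}", a, b);
}
```

A real ULID additionally packs 80 bits of actual randomness into the suffix, which is what makes collisions between independent nodes astronomically unlikely.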
So the question was: shouldn't we use the info in the init message to set the ID inside the struct? And the answer is no: this id is really the message ID, not the node ID. We could arguably have named it as such, but it's the ID that we assign to each outgoing message, and the requirement from Maelstrom is that the message ID is locally unique. So it has to be unique for any message that's sent from this node; it does not need to be globally unique. Think of it as a sequence number. So this is something that we just track locally for the node, and we increment it every time we send a message. "You're guaranteed that the message IDs are unique, so you could use those." That's also true. It's not even the message IDs alone: it's the combination of the node ID and the message ID that is guaranteed to be globally unique. So that's the other thing we can do here, if we assume that the overall system never reuses a node identifier. But let me do the ULID version first so we can test it, and then I agree, we could do the much easier thing. I just want to see that ULID actually passes. And this is going to be target/debug/unique-ids. Yeah, so here you can see all the unique IDs that are being generated. You see there's a bunch of randomness, and you can see the timestamp at the beginning here keeps clocking up, and I would be surprised to learn that these weren't actually unique. The proposal that came in chat is a good one: like I just said, we need to guarantee that the message ID is unique per message from a given node, so within a node, you never generate the same ID twice. And separately, we know our own global node ID, right? n1, n2, n3, et cetera. And that combination necessarily has to be globally unique. And so when we respond to that message, right?
This generated ID could just be the combination of our node ID and our message ID, because that combination is always guaranteed to be unique. So let's first see that this looks good. Okay, everything looks good. Then, instead of using ULID, we say this is going to be format!("{}-{}", self.node, self.id). We haven't stored self.node yet, so we'll add node as a String. Now, really, the state here should be constructed from init, if we wanted to make this a little more reasonable. So this is gonna take an Init and return a Self; it can return an anyhow::Result<Self>, that's fine. The where Self: Sized here is just saying that this method cannot be called through dynamic dispatch, which is fine; we're not expecting to use dynamic dispatch here anyway. So now the state is actually not gonna be passed in. The downside of a constructor like this is that it gets pretty annoying to write code where you want to construct part of the state in advance and then the rest of it when you get to from_init, because from_init now lives in the trait definition, or your impl of the trait definition, and so it doesn't have access to any state that might have been set up previously. There is a way around this, which is to make this generic over some S. You say that N implements Node<S, Payload>, and then the init state is an S, and when we construct it, we do let mut node = Node::from_init(...). But we can't use the step function for that; it gets a little annoying. All right, fine, let's leave this the way it was. The thing I was about to say was: constructing the node from init here, you can then pass in the init state, and you would also pass in the data that you get from the input.
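The node-ID-plus-counter scheme from chat fits in a few lines. This is a minimal sketch, assuming a node struct holding the node name from the init message and a locally unique message counter (field names are mine, not necessarily the stream's exact code):

```rust
// Globally unique IDs built from a locally unique counter plus the node's
// own name, e.g. "n1-0", "n1-1", ...
struct UniqueNode {
    node: String, // node ID handed to us in the init message, e.g. "n1"
    id: usize,    // locally unique message counter
}

impl UniqueNode {
    fn generate_guid(&mut self) -> String {
        // Unique across the cluster as long as node IDs are never reused
        // across restarts (the assumption discussed later in the stream).
        let guid = format!("{}-{}", self.node, self.id);
        self.id += 1;
        guid
    }
}
```

The nice part is that this needs no randomness and no clock: uniqueness falls directly out of the protocol's own guarantees.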
The place where this gets annoying is, first of all, you can't use the step function, because that function is called on a node and you haven't constructed the node yet. But you need to, oh, actually no, you don't. You can just get the input. We need to get at the init message, but in order to do that, we need to deconstruct the message, which is generic over the payload, and the init is inside of that payload. But Payload here is generic, so we don't know how to get at the init variant. This is one of the reasons why it'd be nice to have Payload be a sort of combination enum, where we define some of the variants in the library and the remaining ones are defined by the caller through generics. But I guess one way we could do this is an extract_init method that has to return an Init. So the node needs to tell us, actually, we could do this on the payload. Now that I think about it, we could have a separate trait, Payload, and what it has to define is extract_init(self), which returns an Init. I guess we could make that an Option, and then we panic in our code where we know it should be present. This is also going to require Self: Sized, because it names Self. I'll make the type parameter P so that we don't have overloading here, so P has to implement Payload. And now we should be able to say: we know the very first thing that's going to happen is the init, so inputs.next().expect(...), "no init message received", or I guess really "init message should always be present". And then we can say P::extract_init, with no self here, extract_init from that init message. So this is really the init message: extract_init, and expect("first message should be init"). Then we can call from_init with that Init, and now we have a node, and for all the remaining messages we can run the state machine.
This also might fail deserialization: "init message could not be deserialized". And this should be P. And this from_init is allowed to fail: "node initialization failed". Then I guess we don't actually need to pass that message on to the node; we can send the init response here now too, the init_ok. Right, so that's the other thing we want from Payload: gen_init_ok, which should generate a payload, a Self, I suppose. Again, the reason we need these two methods is that we don't have insight into the enum variants of the payload, and we need the init and init_ok ones. This is why I wish we could pull them out. So now what we should be able to do is: the init_ok is P::gen_init_ok again, and we should now be able to construct a reply message. Oh, we're gonna have to increment the ID too; that's pretty annoying, isn't it? This is actually gonna take a self, and we're gonna pull out just the payload, because the other fields we're gonna want to use here: destination and source, swap those. Right, fine, this is the init message, so this is just us generating the init reply on behalf of the underlying node. Now this ID is gonna be a little bit annoying. I guess we could just say the node's own IDs should always start from one, and zero is reserved for this init_ok response. This is definitely going to come back to bite us, but we're gonna do it anyway. in_reply_to is the init message's body ID, and this is the init_ok. And we're gonna require here that P is Payload and also Serialize. Great. And it doesn't like this because I cannot borrow inputs as mutable; that's fine. Okay, so to talk through what we just did: we changed the main loop so that it will now also handle the init message, by reading the first message from standard in and parsing it as a Message whose payload is P, the generic.
Then we use the implementation of the Payload trait on P to extract just the init information from the payload, through whatever variant inside the payload the consumer of the library is using. We pass that Init into the node initialization step, which gives us back an N, the node implementation that the user of the library is providing. Then we generate the init_ok, and there too we need to rely on the implementer of the Payload trait to tell us what that variant is. We write that out, and then we do the normal operations. So now if we go back to echo, for example, this should make it so that we can implement Payload for our payload type. This is probably gonna yell at me, right; this is gonna be rustengan::Payload. And extract_init here is going to be, oh, we can use let-else: let Payload::Init(init) = input else { return None }, and then Some(init). And we could do this even nicer, actually; I suppose we could do it as an if-let instead, but I like the let-else. And gen_init_ok is gonna be Payload::InitOk. So it's a pretty easy trait to implement, and this is what the implementation is gonna look like for basically every case. Oh, my chat did not catch up, let's see. Chat says: you can just set up the state in step when you receive the init. Yeah, so the whole reason I got into this business is that for unique IDs, we want the node ID to be part of the state, but that field we only have access to once we receive the init message. If we wanted to set up the state before we get the init message, then node would have to be an Option, because we don't know whether or not it's been set yet. But in practice, the way the system works, we know we get the init message first, so node should never be None by the time we're handling any message that's not init. Hence, all of this setup is specifically to avoid that Option.
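The trait shape being described might look something like this. It's a sketch under my own names (the stream's real code also involves serde derives and the message wrapper): the library can't see the caller's enum variants, so the caller hands it a way into and out of the init-related ones.

```rust
// Data carried by Maelstrom's init message.
struct Init {
    node_id: String,
    node_ids: Vec<String>,
}

trait Payload: Sized {
    // Pull the Init data out if this payload is the init variant.
    fn extract_init(self) -> Option<Init>;
    // Construct the payload variant representing init_ok.
    fn gen_init_ok() -> Self;
}

// Example implementation for a concrete payload enum, mirroring the
// let-else pattern used on stream:
enum EchoPayload {
    Init(Init),
    InitOk,
    Echo(String),
}

impl Payload for EchoPayload {
    fn extract_init(self) -> Option<Init> {
        let EchoPayload::Init(init) = self else { return None };
        Some(init)
    }
    fn gen_init_ok() -> Self {
        EchoPayload::InitOk
    }
}
```

Every user of the library would repeat this boilerplate, which is exactly the annoyance that motivates handling init inside the library instead.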
I don't know why my interview with ThePrimeagen is no longer on YouTube; that's weird. Chat asks: does the node need to know about the init message at all? It could be handled by the message loop entirely. This is where it gets tricky, because in theory it could. In theory, we could deserialize the first message without using the generic message type, or rather, with our own message type. And in fact, that's not a bad idea. It's a little weird, but it is doable. If we go to lib here, currently we construct this StreamDeserializer, and the type of the things it yields includes this generic parameter. What we could do instead is first deserialize a single thing from standard in, using our own concrete enum type instead of P, one that just holds init. In fact, maybe that's the way we should do it. It's not a bad idea. And then we construct the StreamDeserializer only after we've handled init. I like that; we can do that instead. Give me a second, I'm just catching up on chat. Chat points out: in case of restarting the node, you will generate the same IDs, which means the IDs are not unique anymore. And that is the downside here: if nodes can be restarted, and when they restart they keep the same node ID, then yes, it's a problem. That's already a problem with the way we generate IDs here, though. So the assumption in this system is that if a node were to restart, it gets assigned a new node ID; it doesn't reuse its existing one. Okay, let's tidy this up some more. Instead of doing this, we're not gonna construct inputs until down here. Instead, what we're going to do first is: the init message is going to be serde_json::from_reader(std::io::stdin()). And this is now gonna be a Message, and it's gonna be an init. And now Init doesn't even need to be pub. What we're instead gonna have is this InitPayload. And actually, I don't even think we need Init to be its own type; we could just have it be like this.
We're not gonna need the Payload trait anymore. Oh, actually, I think we do want Init as its own type, just so we can pass it to the constructor here, but we aren't gonna need the Payload trait. Instead, and it doesn't need to implement Serialize either, what we're gonna say is: we're gonna deserialize a Message where the payload is InitPayload. And here we can do a let-else: let InitPayload::Init(init) = ... else panic. And the response now can be InitPayload::InitOk. Then, down at this point, P no longer needs to include init, because we've handled init previously. So this means we're deserializing using two different types, but I think that's fine here, because for the first one, we know it should be init. This now should indeed be pub, but it doesn't need to be Serialize. Fine, great. And now if we go back to echo, these can go away; this implementation of the trait can go away. This doesn't need any state, so from_init takes a state and an Init and returns an anyhow::Result<Self>, and that's just gonna be Ok of one of those, ignoring both the state and the init. And now this node can't possibly receive an init message or an init_ok, so it doesn't need to think about them. And main_loop now is just gonna be the echo node and no initial state. That does look a lot nicer. And if we go to unique-ids, it has the same property: we can get rid of the init bit here. The initial state we care about is just gonna be empty; there's no context from the surrounding environment we wanna bring in. from_init takes an Init and returns a UniqueNode where the id is 1, see, it almost bit us already, and the node is init.node, or node_id rather. There's no longer an init payload variant, no longer an init_ok variant. main_loop is gonna be the unique node, nothing, and no initial state.
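The two-phase main loop being described can be sketched dependency-free. Everything here is a stand-in: real messages are JSON decoded with serde_json, while this sketch uses plain strings and a toy "init " prefix purely to show the control flow, where the first message is decoded with a concrete init-only type, the node is built from it, and only then does the generic loop start.

```rust
// Data carried by the init message (node_ids elided for brevity).
struct Init {
    node_id: String,
}

trait Node<S> {
    fn from_init(state: S, init: Init) -> Self;
    fn step(&mut self, input: &str) -> String;
}

fn main_loop<S, N: Node<S>>(state: S, mut lines: impl Iterator<Item = String>) -> Vec<String> {
    // Phase 1: the very first message must be init; decode it with a
    // concrete type so the library never needs to see the caller's enum.
    let first = lines.next().expect("no init message received");
    let node_id = first
        .strip_prefix("init ")
        .expect("first message should be init")
        .to_string();
    let mut node = N::from_init(state, Init { node_id });
    // Phase 2: every remaining message goes through the node's state machine.
    lines.map(|line| node.step(&line)).collect()
}

// A node that never has to think about init at all:
struct EchoNode {
    node: String,
}

impl Node<()> for EchoNode {
    fn from_init(_state: (), init: Init) -> Self {
        EchoNode { node: init.node_id }
    }
    fn step(&mut self, input: &str) -> String {
        format!("{}: {}", self.node, input)
    }
}
```

The payoff is visible in `EchoNode`: `node` is a plain `String`, not an `Option<String>`, because the type system now guarantees init was handled before the first `step` call.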
And I messed up my thing here: this is gonna be always Ok, and we don't use state, and we don't need bail. That is a lot nicer. Right, so the way we got here is that we want the unique ID generator to use the node ID and the message ID combined, and now it knows the node ID, and notice there's no unwrap here: the node ID is always set, because it's set as part of init. Chat asks: we're still inverting the source and destination node IDs in the response? Yeah, I mean, that's what we want to do, right? Because it's a response. Okay, so let's see if this builds. Let's see that echo still works too. That doesn't seem very promising; we broke something. "Expected node n0 to respond to an init message, but the node did not respond." What do you mean it did not respond? Why? It didn't crash with anything, as far as I can tell. I guess we can do cargo r --bin echo and just feed it input by hand, that's fine. Show me one of these init messages so I can copy-paste it. Source is going to be, I don't know, c1; destination is going to be n1; body is going to be this. It doesn't print the response, that is certainly true. Oh, I know why: it's because from_reader waits for standard in to be finished. It waits for end of file, not for newlines. So we're actually going to want to construct a StreamDeserializer here: .into_iter::<Message<InitPayload>>().next().expect("no init message received"). The difference is that when you use the StreamDeserializer, it checks whether it can deserialize a complete value at each newline, rather than only at the end of the file.
It is true that we could do better here. Someone, I think Gala, made the point in chat that since we know the format is newline-separated, we could do better than this sort of guessing deserializer, which is what the StreamDeserializer is. It doesn't know, until it's parsed, whether a given newline is the termination of the object or just something in the middle of the JSON object, so it's a little more costly when we know the format is actually line-based. One way we could do this instead is stdin().lines(), and then deserialize one line at a time. And maybe we should just do that; it's not a bad suggestion. So in that case, what we would do is stdin.next().expect(...), and then serde_json::from_str. I forget what lines() gives you; I think it gives you a String. And this would then be a Message<InitPayload>, with context, because here: "failed to read init message from stdin". Consider borrowing here, because this gives me a String; that's fine. So what we're doing instead is splitting by lines and then straight-up deserializing the entire string of a line, which we know is gonna be exactly one message. Then we can do the same thing down here, or we could continue to use the StreamDeserializer, but let's do: for line in stdin, where the line is line.context(...), and I guess we can reuse most of the same context, "could not be read". Then we do serde_json::from_str(&line), and this now is gonna be a Message<P>, and we don't need this mut; that's true. So the code difference is fairly minimal here. And now it's gonna yell at me if I do this, so I'm going to make my test input all one line. Yeah, and now we get the response right away. All right, let's see that echo still works. That looks promising. Great, everything looks good. And if we now go to the unique-ids one, make sure we build all the binaries.
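As an aside on what `lines()` actually gives you: on anything that implements `BufRead`, it yields one `io::Result<String>` per newline-terminated line, with the newline stripped, which is why each line can then be handed to `serde_json::from_str` as exactly one message. A tiny dependency-free sketch of that read side:

```rust
use std::io::{self, BufRead};

// Collect each newline-delimited chunk as its own String; in the real loop,
// each of these would go through serde_json::from_str on its own.
fn collect_lines(reader: impl BufRead) -> io::Result<Vec<String>> {
    reader.lines().collect()
}
```

Note that `collect` on an iterator of `io::Result<String>` gives back an `io::Result<Vec<String>>`, stopping at the first read error, which is the same shape the per-line `.context(...)` handling gives in the stream's loop.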
See what it does. That seems promising. I mean, it is generating exactly the string we wanted it to generate, and those will be globally unique. Again, as someone pointed out, only assuming that node IDs are not reused when nodes restart. Amazing, everything looks good. Okay. Actually, I want two commits: move init logic into lib, and then add solution to unique-ids challenge. Okay, next challenge, please. Continue on to the broadcast challenge. Okay, single-node broadcast. In this challenge, you'll need to implement a broadcast system that gossips messages between all nodes in the cluster. Gossiping is a common way to propagate information across a cluster when you don't need strong consistency guarantees. This challenge is broken up into multiple sections so that you can build out your system incrementally. First, we'll start with a single-node broadcast system. That may sound like an oxymoron, but this lets us get our message handlers working correctly in isolation before trying to share messages between nodes. Your node will need to handle the broadcast workload, which has three RPC message types: broadcast, read, and topology. Okay, let's start encoding this. So we're gonna copy unique-ids into broadcast, that's fine. And this is going to be BroadcastNode, and the payloads we'll get to in a second. So, BroadcastNode, BroadcastNode; this can just be Self. Great, so there's no longer a generate. There is now a broadcast, there is a read, and there is a topology. Okay: your node will need to store the set of integer values that it sees from broadcast messages, so that they can be returned later via the read message RPC. The Go library has two methods for sending messages. Send sends a fire-and-forget message and doesn't expect a response; as such, it does not attach a message ID. RPC sends a message and accepts a response handler; the message will be decorated with a message ID so the handler can be invoked when the response message is received.
Okay, so this is starting to look more like a sort of service: they actually want an abstraction where I can send a message and attach a closure to it, and that closure gets called when we get a response to that message. Which is interesting. I mean, that requires a little bit more mechanism in our library, if we actually want to support that kind of callback mechanism. Basically, there are two ways to handle systems like this. One is to have an interface for sending a request and attaching a response handler. The other is to say it's just a flat state machine. When you send a message, you might record in the state machine that you've sent it, and really you're just updating the state machine to now be in a state where it expects a response. Then, when the response comes in, it's just handled by your step function: well, I got this response, what do you want to do with it? So there's no closure being called. You don't register, at the time of sending the request, what to do with the response; you just encode it as another step in your state machine. These both have merits; they're a little bit of a different programming model. I want to try to stick with the state machine here, but we'll see if it gets too annoying. It might also be that we need to turn all of this async. We'll see how it pans out; it might not be necessary. The broadcast message requests that a value be broadcast out to all nodes in the cluster; the value is always an integer, and it's unique for each message from Maelstrom. Your node will receive a message body that looks like this. Okay, so a broadcast has a message, which is a usize. It should store the message value locally so it can be read later, and in response it should send an acknowledgement: a broadcast_ok message. Okay, so there's a broadcast_ok thing. Read: this message requests that a node return all values that it has seen.
Okay, so there's a read, and then there is a read_ok. The read doesn't actually include any data; the read_ok returns all the messages, and the order of the returned values does not matter. It could be a set, I guess, given that we're guaranteed the messages are unique. Topology: this message informs the node of who its neighboring nodes are. Maelstrom has multiple topologies available, and you can ignore this message and make your own topology from the list of nodes in the node IDs. All nodes can communicate with each other regardless of the topology passed in. Ooh, interesting. Okay, so topology informs us of the topology, and it is a HashMap from node to a Vec of nodes. In response, you should return a topology_ok. All right, so the setup for this is pretty straightforward. In fact, we might need the node ID, we're gonna need the message ID, and then we're gonna want a messages, which is a Vec<usize>, and initially messages is empty. When we get a broadcast with a message, then we're going to send a broadcast_ok. And here, you know, we could easily say we wanna make it easier to construct something like this. One way to do that would be an associated method on Message, a sort of prepare-reply, which just does these bits: in particular, it inverts source and destination, sets the ID to one that's passed in if you have one, and sets in_reply_to as necessary. So let's go ahead and do that; that seems like a potentially useful thing here. In lib, over on Message: impl Message for any payload, this should be payload-agnostic. There is now a reply method which consumes the message. I wanna say it consumes the message; well, unclear whether that's what I want, actually. into_reply is a nicer name. The ID is gonna be an Option of a mutable reference to a usize, because we wanna increment the ID whenever we prep this reply, and I think that's all we want.
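Putting the workload description together, the payload enum might look something like this. It's a hand-written sketch of the shape just described; the real code would also carry serde derives and rename attributes for the on-the-wire `type` tags.

```rust
use std::collections::HashMap;

// The broadcast workload's message payloads, as described above:
// three requests, each with a matching ok response.
enum BroadcastPayload {
    Broadcast { message: usize },
    BroadcastOk,
    Read,
    ReadOk { messages: Vec<usize> },
    Topology { topology: HashMap<String, Vec<String>> },
    TopologyOk,
}
```

Matching on this enum is the whole of the single-node step function: store on `Broadcast`, return the stored values on `Read`, acknowledge `Topology`.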
We could say that this also takes a closure that maps the payload, but I don't think I want that here. It returns a Self, and what it does is exactly this. Now, it doesn't actually need to construct a new Self, technically; we could just set the fields instead, but I actually feel like this looks nicer. So, ..self, fine; id.map, with mid as the closure argument, mid; and the response payload. The question is whether it should leave the payload in place. I guess maybe it can just take the payload, but what I'm thinking here is I kinda wanna be able to take the old payload out and return it. Like, I sort of want it to be a swap of the payload. And the way to do that would be something like, actually, no, I think this is actually what I want: I want to not take the payload in, and set the same payload for the reply, and so now this is going to be, no, that's not what I want. I want --bin broadcast. Are you drinking? You're a very loud drinker. The cat was drinking; she was very loud. So when we get a broadcast here, we should be able to say input.into_reply, and, so here's why that won't work: it's because we're already consuming the payload here. So I think what I want, in some sense, is mem::replace of the payload with a payload that is going to be empty. But I kinda want what I replace it with to depend on what's in there. There's a couple of ways we could go about this, right? Like, we could say reply payload; I don't want to do that either. The other way to do this is, actually, here's what we do: we do let reply = input.into_reply(...), and then we match on the reply's payload, and then we say reply.body.payload = this. Boom. Beautiful. That's pretty nice. And then we have to do the same thing for read, although read doesn't take any arguments, and we have to do the same for topology. And we ignore anything that is broadcast_ok, read_ok,
and topology_ok. And topology_ok is also this, and we can just give them the same handler. Okay. So here, what we want is self.messages.push(message). For read, we want read_ok, where messages is self.messages.clone(). And topology, I guess we're doing nothing with at the moment, except responding with a topology_ok message; we can ignore this field for now. And we're not using the node ID at the moment, that's fine. I suspect we're going to start using it once we need to know what our neighborhood is. Right, this is single-node for now, so there's no need to know yourself or your neighborhood, but I think we're going to need it next. All right, let's see what we get here. So I want maelstrom, and I want target/debug/broadcast. Chat asks: why is the ID optional? The ID is optional because for messages where you don't expect a reply, there's no need to put an ID on the message in the first place; there's no need for the other node to identify the message it is responding to. That said, usually in these systems it's valuable to put an ID on messages regardless of whether you expect a reply, because it can help with things like idempotency. Let's say you end up retrying a request: the recipient can tell whether a message is one it's already seen by looking for two messages that have the same ID. So usually, even for broadcast messages in a real distributed system, you would often assign them IDs regardless of whether or not you expect a response. Okay, that seems to have worked. So, git diff. I guess we could also now go back and make the earlier handlers a little nicer too. So let reply = input.into_reply(Some(&mut self.id)), and then we say reply.body.payload = this. Does make things a lot nicer, doesn't it? And I guess we'll do the same here: reply, reply.body.payload equals this, right, and this needs to be reply. Beautiful.
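The into_reply helper being built can be sketched end to end. This is a minimal version with field names assumed to mirror the Maelstrom protocol (src, dest, msg_id, in_reply_to), not necessarily the stream's exact struct:

```rust
struct Body<P> {
    id: Option<usize>,          // this message's msg_id
    in_reply_to: Option<usize>, // msg_id of the message this replies to
    payload: P,
}

struct Message<P> {
    src: String,
    dst: String,
    body: Body<P>,
}

impl<P> Message<P> {
    // Turn a received message into the skeleton of its reply: swap the
    // endpoints, stamp a fresh locally unique ID (incrementing the caller's
    // counter), and point in_reply_to at the original's ID. The payload is
    // carried over for the caller to match on and overwrite.
    fn into_reply(self, id: Option<&mut usize>) -> Self {
        Message {
            src: self.dst,
            dst: self.src,
            body: Body {
                id: id.map(|id| {
                    let mid = *id;
                    *id += 1;
                    mid
                }),
                in_reply_to: self.body.id,
                payload: self.body.payload,
            },
        }
    }
}
```

Taking `Option<&mut usize>` is what makes the counter bump automatic, which is why incrementing again at the call site, as noted above, would be unnecessary.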
And I guess, just as a sanity check, if we go back and run echo, it still does fine. Chat asks: why don't you use a match-all but list the other variants explicitly? Where? Oh, you mean here, for the okays. So I don't love doing that, because I want to know if there are variants that I've forgotten to handle. Does that answer your question? I think that's what you're asking. In theory, I could use an underscore arm and just say do nothing for those, but if I, for example, added another payload variant, I want a compile error telling me I'm not handling that variant, which I wouldn't get with an underscore there. Everything looks good. You're probably right: I don't need to increment the ID here, because into_reply does that for me. Totally correct. Doesn't matter, there's no requirement that they increment by one, but it is unnecessary. That is true. Okay, let's commit: add Message::into_reply helper. Whoop. And this one solves single-node broadcast. Okay, fine. On to multi-node broadcast. Your node should propagate values it sees from broadcast messages to the other nodes in the cluster. It can use the topology passed to your node in the topology message, or you can build your own topology. The simplest approach is to simply send a node's entire dataset on every message; however, this is not practical in a real-world system. Instead, try to send data more efficiently, as if you were building a real broadcast system. Values should propagate to all of the nodes within a few seconds. Okay, so the idea here is that every node in the system should know about every broadcast message, and so what we're gonna do is gossip them around. If you're not familiar with gossip protocols, the basic premise of what we're setting up today is: let's say we have three nodes, and let's say that an operation comes in here saying broadcast 34. What we want is for 34 to be known to this node and for 34 to be known to this node. And the question is, how do we get there?
Like, what messages do we have to exchange in order for 34 to make its way over there? And one answer, of course, is that this node knows about all the other nodes in the system, so it sends messages to all of them. Let's name these n1, n2, and n3. It could send this message to every node in the system. The challenge with this is that it doesn't really scale well. Imagine you have lots and lots and lots of nodes: you really don't want a system in which every node sends a message to every other node any time it gets a broadcast. So instead, there's this notion of gossip. The idea with gossip is that rather than every node sending a message to every other node, you take your nodes and you have every node have a topology that tells it about its neighborhood. And the neighborhood you can define however you want. It could just be: pick two random nodes. That's a valid neighborhood. The neighborhood could be the nodes that are closest to you in terms of the network, right, like the ones you have direct network links to, for example. There are all sorts of valid ways to define a topology. But let's say that the topology of this node is this, and this, and this. Now note that topologies do not have to be symmetrical. So, let's say this is n6: it doesn't have to be the case that just because n6 is in n1's topology, n1 is in n6's topology. It doesn't have to be symmetrical. And so instead, it could be that n6's neighborhood actually includes n2, for example. What happens, though, when you do gossip, is that you send the message to everyone in your topology, but you don't send it to anyone else. They are gonna gossip to their topology. So n6 might send it to this one.
So it might send it to a node that's already received it, but it might also send it outside of the previous set. And then this node is gonna send it to here and here, maybe. This node is gonna send it to here and here; this node to here and here. And as a result, that 34 that came in over here is actually gonna end up propagating throughout the system, right? This 34 is gonna first go here, then it's gonna go here, then from there to here and to here, and from here to here, and so on outward. So this node is generally gonna hear about it last, if we assume all these hops have roughly the same latency, but they will all eventually hear about it. And of course, this is true for any set of topologies, as long as every node can reach every other through the transitive closure of the neighborhood links. And it should also be the case that no matter which node you send the broadcast to, the broadcast will eventually make it to every node. So that's the basic essence of gossip protocols: you regularly talk to all of the nodes in your topology to learn about data that they have that you do not. You can think of this more as a sync. What I drew here was essentially a sort of limited broadcast: when I hear about a message, I immediately tell my neighborhood. You don't have to implement gossip that way. Instead, you could say you regularly gossip with the rest of your network. So, for example, n6 and n2, every now and again, they're just gonna talk together. It doesn't have to be the case that they do it immediately when someone has a new value; that might lead to a lot of messages. Instead, you say every 500 milliseconds, nodes are gonna talk to their topology and do a sync. And a sync could be something like: hey, I have these values, which values do you have?
And so you do a two-way sync rather than a one-way sync. And you do it in a sort of batched fashion rather than do it in terms of single messages you send. And so when they sort of give this specification here, you'll note that they say values should propagate to all other nodes within a few seconds. And the reason why they say that I presume is because you might choose to do scheduled gossip. But once you do scheduled gossip, you run into this weird problem where once you have many hops, the time it takes for the 34 here to reach the node all the way on the right can take a really long time, right? So N1, let's say that it receives the broadcast message and then its next gossip window with N6 isn't for 500 milliseconds because it just did it. Okay, so it waits 500 milliseconds and then it gossips with N6. And N6 just gossips with its network. So N6 is gonna wait another 500 milliseconds before it gossips again. And then you have the same for N2. And that means that the time it's gonna take for the value to get from N1 all the way to the node on the right, this one over here, right? It's gonna be 500 plus 500 plus 500 plus like the link distances of each one. So the latency of doing the sync itself. So suddenly now you're adding up to a bunch of seconds before the nodes at the edges of the graph actually have all the information necessary. You don't usually run into loops here because again, this isn't actually forwarding. What you do with gossip is you exchange information with your neighbors about data that one has but the other does not. So it's not as though like what's really gonna happen is N6 and N2 every now and again are going to talk together and compare notes. But it's not a blind forwarding. Blind forwarding is where you run into trouble with loops, right? Where I get the message so I send it to you, I'm in your network so you send it to me and then I send it to you and then you send it to me. That's where you need like TTLs or something like that. 
But in a gossip protocol, that doesn't really happen. What happens is when I get a message, at some point later, I contact you and I say, I have these messages, and it includes the one that I just got, and you say, oh, I don't have that one. Please send it to me. And then our sync is done. So I don't take any action as a result of learning this new value, except that I'm gonna gossip at some point in the future, too, and compare notes with my neighbors. So there's not actually a forwarding in the way that you might think. And now there are a bunch of questions here, like when you do the sync, how do you do it in a minimal fashion, right? So as they point out here, the simplest approach is to simply send the node's entire data set on every message. However, this is not practical in a real-world system. And in fact, even in a non-real-world system, this gets problematic pretty quickly. So let's imagine that we have just two nodes. And there's all sorts of network that they have on either side. So they get new messages over time. And now let's say that this node over here has the messages 24, 36, and 48. This one has 12, 13, and 24. Now let's say that they have to do a sync. One of them decides that it's about time to do a gossip. Let's call them A and B. So A contacts B and says, hey, I wanna do a sync. So it's sending a message. What does it include in that message? Well, it can include all of the messages, right? So it could say 24, 36, and 48. And then B goes, okay, that's great. Let me tell you about the ones I have. I have 12, 13, or, let me tell you about all of the ones that I have that are not the ones that you have. So it knows to eliminate 24 because it sees that A already knows 24. So this is already an optimization, right? If we didn't have this optimization, it would say 12, 13, 24, 36, 48. Those are all of the messages that B has.
But it can do at least this sort of obvious optimization of I'm not gonna tell you back the things that you have told me. So it sends 12 and 13. Okay, that's pretty good. Now imagine that, again, let's assume that our time here is 500 milliseconds. So 500 milliseconds pass and then they decide to do another sync. Or at this point, let's say B initiates a sync. So it doesn't have to be 500 milliseconds. It's just another sync happens. And this time B sends a message. What does B send? Well, there's no message for it to reply to. So it doesn't on paper know what A has. So the only message it can send is 12, 13, 24, 36, 48. Those are all of the messages that B knows about. And then when A now responds, its response here is gonna be, well, I don't know of any values that you don't have. Because it sees all of these messages. So it is now in the same position as B was previously of being able to eliminate anything that it was already sent. But they just talked together. So there's no need for B to send any of these because it knows that A already has them. Because it knows that A sent these here and it knows that it sent A these here. So there shouldn't be any need. And so as a result, you run into this weird situation where B could remember what it has synced with A in the past and just not send any of those either. So now B needs to remember not just which messages it has, but also which messages it knows that A has. And now we get into the sort of really wonky world of distributed systems. So you might say, well, B here knows that all of these numbers are known to A, because A told it that it knows 24, 36 and 48, and B previously told A that it has 12 and 13. So all of these can be removed. B doesn't have to send anything. That's not true though, because in a distributed system, what if this whole message got lost? So A sent 24, 36 and 48 to B, and B responds with 12 and 13, but A never gets that message. A doesn't know.
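To make the two-node example concrete, here's a minimal sketch of B's half of the sync, assuming the values are plain `u64` IDs; `sync_reply` is a hypothetical helper name, not something from the stream's code.

```rust
use std::collections::HashSet;

/// B's reply to a sync: every value B holds that A's message didn't already
/// claim. This is the "don't tell you back what you told me" optimization.
fn sync_reply(mine: &HashSet<u64>, theirs_claimed: &HashSet<u64>) -> HashSet<u64> {
    mine.difference(theirs_claimed).copied().collect()
}
```

With A claiming {24, 36, 48} and B holding {12, 13, 24}, B's reply works out to {12, 13}, matching the walkthrough.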
It could in theory detect that, oh, I never got replied to this message and then sort of tell B again. Alternatively, it could just like go, maybe it didn't have anything to tell me it's not gonna send a message. That's also fine. We don't even need acknowledgments here. The challenge is B can't assume that A knows 12 and 13 until it hears 12 and 13 from A, right? And so therefore the only safe thing for B to send to A here is actually 24, 36 and 48. The only thing to eliminate is these. It still has to send 12 and 13. The question then of course becomes, well, how will B ever realize that A now knows 12 and 13, right? And the answer to that is, well, A, when it does the next sync to B, it's going to say naively 12, 13, 24, 36, 48. The question is, what does it know that B knows? Well, it knows that, it knows that B knows 12 and 13. So it doesn't need to send 12 and 13. But it doesn't know that B got this message. It only knows that B got this message if it gets B's reply. So whether it can eliminate these depends on whether it saw this. So there's an implication here between these two. It is the two generals problem, right? Like this is the problem of consensus, is it is really hard to know whether someone else knows something if arbitrary messages can be dropped. And to be clear, there is no solution to this problem. There is no finite set of messages you can send that ensures that these two are in consensus. If you allow for arbitrarily dropped messages, you cannot solve this problem. Like there's a mathematical proof saying you cannot solve this with a finite number of messages. The moment we know the messages are received, you're fine. So you know when you're done or if there are no drops, you know when you're done. The challenge is you don't know that there are no drops. So the question is, well, what do we do? And the answer is really, this is all an optimization. 
Right, what we're doing here is saying we want it to be the case that if messages aren't dropped, then we're able to eliminate messages from the sync. That's all. And so it's okay for this to be imperfect. It's okay for us to send some extra values if some messages happen to be dropped, as long as the recipient has a way to detect that it already knows something. And in this case, that's fine because the messages all have unique IDs. All the messages in the system, like all the 12, 13, 24, 36, are guaranteed to be globally unique. And so as a result, we can just keep a set. And so when we hear things from a neighbor, we just add them to the set. And if one is already there, it doesn't matter, it's a set. Okay, so how are we gonna do this? Well, what we're gonna want here is, messages is going to be a hash set instead. And we're also gonna need to keep some state about known, let's see, known. This is gonna be a hash map from a node identifier to the things that we know that they know, right? So this is gonna be the set of, I know that N1 knows these values. And the hope is that this makes the gossip protocol more efficient. And the real question is gonna be when can we add something to known? Okay, and one of the things that I suspect we're gonna have to keep here is something like a message communicated. And I'll talk about what this means in a second. You'll see why we need this a little bit later. Okay, so messages here is gonna be a hash set new. Known is gonna be a hash map new. And in fact, in init here, there's the assumption that we know all of the node IDs. That's not always true in distributed systems, right? It could be that new nodes are gonna be added and removed. And currently there's no mechanism to support that here. But we can actually do a little bit better here by saying we're gonna do init.node_ids, into iter, map each nid into a (nid, hash set) pair, and sort of pre-allocate the hash sets here. Collect.
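As a sketch, the state being set up here might look like this; the field names are the ones used in the stream, but the types (and `u64` for the message values) are my guess at what it ends up as.

```rust
use std::collections::{HashMap, HashSet};

/// Sketch of the broadcast node's state as described above.
struct BroadcastNode {
    /// Every broadcast value this node has seen (values are globally unique).
    messages: HashSet<u64>,
    /// For each peer, the set of values we *know* that peer knows.
    known: HashMap<String, HashSet<u64>>,
}

impl BroadcastNode {
    fn from_init(node_ids: Vec<String>) -> Self {
        Self {
            messages: HashSet::new(),
            // Pre-allocate an empty known-set for every node in the cluster.
            known: node_ids
                .into_iter()
                .map(|nid| (nid, HashSet::new()))
                .collect(),
        }
    }
}
```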
And then message communicated is gonna be a hash map new for now. Okay. So now let's get to this step function. Broadcast is gonna be easy enough. We're just gonna insert the message. Read is gonna be fine. We're just gonna do self.messages. But actually I think a set gets printed as a vector. Let's just work under the assumption that it does. I think it gets encoded as a sequence for JSON. So let's keep it a hash set. And if it ends up with the wrong encoding, that's fine, we'll fix that later. Okay, so the topology is gonna tell us about what nodes we wanna communicate with. We know the total set of nodes, right? It's known to us by virtue of the init field here. That tells us all the nodes in the network. But realistically, we want something like a neighborhood. And the neighborhood is gonna be a vector of usizes? No, a vector of strings, which are gonna be the nodes that we should gossip with. So when we get a topology, we're gonna say self.neighborhood is equal to topology.remove of ourself, unwrap or else panic: no topology given for node. Right? So the topology here is basically a suggested topology, of what our neighborhood should be, from Maelstrom. If we weren't given one, we could pick randomly in this case. Like we don't actually know the network topology. So we just pick some random subset of the nodes in the network of size, let's say two or three. The challenge with doing it randomly is you don't actually know that you end up with a connected graph. You could, if you pick randomly, end up in a state where, purple. Let's say that these are the nodes in the system. And let's say all of them choose their neighborhood randomly, but they all choose at least two nodes. Okay, so this one picks a neighborhood that's here. All right, so one picks this one. Two picks the same one. Three picks the same one. I need one more node in the system for the problem to be apparent. Whereas these, this is four, this is five, and this is six.
And four, five, and six all pick this neighborhood. Well, now there's no way for the gossip to disseminate broadcasts from the left to the right or right to the left partition. This is what a network partition ends up being. In this case, it's really more of a node partition because the network, there are network links here. We're just not using them. And this is where you get into, like there are solutions to this problem such as the number of nodes that you choose is one more than half of the number of nodes. So if you required every neighborhood size to be of at least four, including yourself, so three additional nodes, then there must be an overlap here. Now, the overlap might only be in one direction because depending on how we do sync. So if sync is just I send you my stuff and not a bidirectional sync, then you still end up with partitions here. But if it's a bidirectional sync, then one of, like the nodes in four, five, six have to include one more node, which means there's no way for you to end up with a partition because there's always going to be overlap between the circles because here there's six nodes. So if every partition contains four nodes, you can't partition the network. There's always going to be overlap. So here you end up with two clusters, but if you change the rule for gossiping to be a broader topology, you don't have that problem. Now, I'm going to assume that the topology that we're given from Maelstrom is one that's guaranteed to be connected. If that's not the case, then we have to basically compute the topology ourselves in a smarter way, but let's assume that it is for now. Okay. So the neighborhood is going to tell us that the topology, read and broadcast are easy enough, but there's obviously the actual gossip part. That's where we get into trouble next. And the question becomes, well, how are we going to do that? Oh, right, a neighborhood. 
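The overlap claim can actually be brute-force checked. Here's a small dependency-free sketch treating neighborhoods as bitsets over 6 nodes: any two size-4 neighborhoods (more than half) must intersect, while size-3 neighborhoods can form the two disjoint cliques described above. The function names are illustrative.

```rust
/// All subsets of {0..n} with exactly k members, encoded as bitmasks.
fn all_size_k_masks(n: u32, k: u32) -> Vec<u32> {
    (0u32..(1u32 << n)).filter(|m| m.count_ones() == k).collect()
}

/// True if every pair of neighborhoods shares at least one node.
fn every_pair_overlaps(masks: &[u32]) -> bool {
    masks.iter().all(|a| masks.iter().all(|b| a & b != 0))
}
```

By the pigeonhole argument, 4 + 4 > 6, so two size-4 neighborhoods can't be disjoint; 3 + 3 = 6, so two size-3 neighborhoods can be.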
And this is where our current model of a step function becomes a little weird, because the gossip isn't really a step function. There's no message that we receive that tells us to do a gossip. There are ways to model this, right? So you could start up a separate thread, and that thread generates input events every, like, 500 milliseconds that say you should go gossip now. That's totally a thing that can happen. The other way would be to make this main loop know about more than the outer loop over standard in, and say that the outer loop is actually going to be a select over a timer and reading from standard in. So this would be a sort of, like, you modify the input loop instead to say it knows how to select over multiple input sources. And for that, you don't quite need async, but you're going to want to do it in async. Now, this is where we can either go down the route of just writing all this code in an async style. The other way we could do it is that we could do this with essentially an actor system. So, and this is basically how we've modeled it now with a state machine, is to say every node is an actor and it's not going to be internally concurrent at all. It's going to handle one event at a time, and we're going to generate the inputs to that event, and they're all going to be handled synchronously. I don't know which one I prefer here actually. I think I want to avoid making this async for now. And if we're going to avoid making this async, that means that the outer loop needs to have a way to inject additional input messages. And how do you do that? Well, in synchronous programming, you don't have a lot of ways to do select. Like if you want to say, I want to wait for whichever comes first of another network message or a timeout, you can do a read from standard in with a timeout. It's not super pretty and it's annoying to do it through libraries like Serde, but it is possible.
The other way you do it is you introduce a channel, and then you clone the sender side of the channel and you give one clone to each thread, and that thread is going to be blocking and doing the operation that you want to select over. This is going to be a little clearer if I demonstrate. So rather than have this loop be for line in stdin, what we're going to do is we're going to do (tx, rx) is std sync mpsc channel. What is sync_channel? Oh, it's the one with a bound. Do I want this to have a bound? I don't think I want this to have a bound for now. And did I get the (tx, rx) order right? Yes, okay, great. And then what we're going to do is for input in rx, and then we're going to do, actually that's not what we're going to do. We're going to do a thread spawn. So we're going to have one clone of the sender for standard in, like so. So this is going to be for input in rx. This is just receiving from the channel. We're going to do the step function. And then in this thread is where we're going to do the standard in work. That's probably not going to work because I'm locking standard in here. That might become a little annoying, but it's going to loop over standard in. It's going to parse out the messages, and then, instead of doing the step, it's going to do stdin_tx.send of the input. And if that fails, so if for example the channel has been closed, then we want to return from this thread. And we now need P to be Send. That is true because it's going to be sent along with the message. Yeah, the mutex guard can't be sent. That is true. That's a good question. What I'm worried about here is, if I drop this and drop the standard in lock and take it again inside of this thread, we're going to get into a weird position where there might be buffered lines inside of lines. The question is whether that is true. Buf is self. What's self here? The standard in lock is just a lock over the inner one. The buf reader is inside the mutex. Okay, great. That makes a lot of sense.
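Here's a condensed, std-only sketch of that pattern: the reading thread owns the input and forwards each line over the channel, and the main loop just iterates the receiver. It's generic over the reader so it isn't tied to real stdin; the stream's version parses each line with serde_json rather than forwarding raw strings.

```rust
use std::io::BufRead;
use std::sync::mpsc::Sender;
use std::thread;

/// Spawn a thread that forwards each input line over the channel.
/// When the receiver is dropped (main loop exited), the send fails
/// and the thread stops reading.
fn spawn_input_thread<R>(reader: R, tx: Sender<String>) -> thread::JoinHandle<()>
where
    R: BufRead + Send + 'static,
{
    thread::spawn(move || {
        for line in reader.lines() {
            let line = line.expect("failed to read input line");
            // In the real node this line would be deserialized into a
            // Message before being sent.
            if tx.send(line).is_err() {
                return;
            }
        }
    })
}
```

Because the sender is moved into the thread and dropped when it finishes, iterating the receiver in the main loop terminates cleanly once the input ends.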
So dropping this lines and dropping the lock is not going to mean that any buffered data from standard in is going to be dropped. That's the thing I wanted to check. So this means we can now just reconstruct this in here. Drop standard in. We're going to have to drop all the standard in. This is going to be annoying. This is going to be standard in lines, and this is going to be anyhow Error. And P also has to be 'static. That's fine, cause it needs to live in a different thread. Oh, actually, if this consumes self, then I don't need to worry about that. Great. That is even better. Stdin dot lines. There we go. And once this channel closes and the node has nowhere else to go, I guess we can be nice and we can wait for this thread. Unwrap. So when you join a thread, the first unwrap, or the first layer of Result, is whether the thread panicked, and then the second layer is whether it returned an error. So now we have this main loop. Now, the reason why this matters: currently I haven't actually fixed the problem we were talking about, which is we want the ability to generate additional input events. And I think what we'll want here is actually the ability to say, hmm, it's going to be annoying, isn't it? I want to give away this tx handle to the node. So over here, I actually want to construct this first. And when I initialize the node, I want to give it the tx handle, which lets it inject its own messages. And so now inside of from_init, for example, we could just choose to spawn a thread that generates messages or events every so and so often, like on a time schedule. And then those would end up being surfaced in this main read loop that calls node.step. And so this is now going to be just tx. And then we go to node from_init. That's now also going to be passed a sort of inject handle, which is going to be a sync mpsc Sender. And the question becomes, what does it send? We could say it sends messages of payload.
It's a little misleading though, because the payload messages are source/destination things, which is not actually what we generate here, because there's no source or destination for a generated event. We could say that there should be. You could imagine you want to generate an event for saying specifically, you should now send a message to this node, but I don't think we want to represent it as a message. Instead, I think what we want to do here is something like an enum. Event, and the event is either going to be a message, in which case it's a message payload, or it's going to be a body, or it's just going to be a direct payload, which is payload. This isn't going to be serialized or deserialized. And what we'll do now is say that this is going to be an event. And we're going to say that step is going to take an event payload. So it may be an injected message, or it may be a, actually let's call this injected, I guess, or it may be an actual message that we got from standard in or from the network. And so this is now going to be an event message here. And so now we can differentiate between injected events, or injected payloads, and actual messages we got from the network. And now we're going to have to go fix echo. It doesn't care about the sender. Sender of payload, right, event payload, which is then, yeah. And the input here is an event payload. And we can actually here say, let Event::Message of input equals input, else bail, or this is really a panic: got injected event when there's no event injection. We should be able to do the same up here. So this is going to be an event. This is also going to be handed one of these, but it's not going to use it. And crucially for our broadcast, we will use it. So this tx business over here is in fact something we're going to use. We're going to say something like, inject is going to be this. What's going to be awkward about this is it's actually never going to exit, because we're in this state.
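A sketch of what that enum might look like, with a simplified stand-in for the Message type and the end-of-file variant that comes up shortly; the exact shape here is my guess at the stream's eventual definition.

```rust
/// Simplified message envelope: who sent it, who it's for, and the body.
#[derive(Debug)]
struct Message<P> {
    src: String,
    dst: String,
    payload: P,
}

/// What the step function receives: a real network message, a locally
/// injected payload (e.g. a gossip tick), or notice that stdin closed.
/// The default of `()` means nodes that never inject don't have to name
/// an injected-payload type.
#[derive(Debug)]
enum Event<Payload, InjectedPayload = ()> {
    Message(Message<Payload>),
    Injected(InjectedPayload),
    EndOfFile,
}
```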
If you go back to look at the lib, we keep looping over our inputs for as long as there are messages in the channel or might be messages in the channel, which means that we keep going until all of the transmit handles have been closed. And one of the transmit handles is held by standard in and will be dropped when standard in is closed. But the other transmit handle in the case of broadcast is held by the broadcast node. So it's held by this node right here, which we know still exists because we're holding onto it because we're going to call it in the loop. And so this loop will never terminate for broadcast. So there's arguably a sort of when standard in ends, we might actually want to send a message saying standard in ended. And we can do that down here by saying that there's an additional event here, which is end of file. And we don't actually care about that result. And so now there's at least a way for the node to learn that it should exit. Great. And inject here is going to be TX. This is going to be event. And we're going to now match on input. And if it is a message, then we do what we were doing previously. But if it is end of file, then we're going to do something different. And if it is injected, then we're also going to do something different. And we'll figure out what this is. Now currently we're using the same enum for injected events and for messages. And it might be that we want these to be different enums. It's not clear that you're always going to inject a payload. Might be a thing that we want to do. I haven't decided yet. I'm almost certainly, almost certainly we're going to need to do that. Okay, so in that case, the question becomes, how do we inject a message in the first place? What do we do? Oh, why can't I, oh right, mute topology. Well, what we're going to have to do here is when we construct from in it, we're actually going to start a new thread. And this thread is just going to, this is going to generate gossip events. 
And we'll do something like, it's going to be a loop. It's going to be a forever loop. And it will do, actually we can do a gossip_tx is tx.clone. And in fact, we don't even need to hold the injection in the node itself, I think. Because we're not going to inject anything except for through the separate thread. And this is, we're going to have to want to find a way to make this loop terminate when the node itself gets the end-of-file event. And we can do that with an atomic bool or something, but it's not super important right now. So this is just going to loop. It's going to do sleep. It's going to sleep for Duration::from_millis, 300 milliseconds. And then it is going to tx.send Event::Injected Gossip. And if sending fails, then it is going to break, and Duration needs to be imported. So currently this has to be a variant of the payload. And that sounds nice because it means that all of our payloads are gathered in one place. But the downside of it is, A, it now needs to be serialized and deserialized as well, which seems unfortunate. There's no actual requirement, because it never gets serialized or deserialized. And the other reason this is unfortunate is because down here in our match, when we match on an event message input, when we match on the payload in here, we now also need to match on Payload::Gossip, even though that can never come in as a message. And all this makes me think that we should actually have injected payload be a separate thing. And we can give it a default here, just so that for things that don't need to inject payloads, they don't need to specify it. I forget whether you can set this in traits. Yes, you can. A lot of nice generics here. So it should now be a separate thing. So it should now be the case that here, found IP, expected this. That's because, ah, this should be N, that's fine. From init, ah, this should be injected payload. And injected payloads need to be Send and they need to be 'static. Actually, do they?
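The timer thread being described could look roughly like this; the names are illustrative, and the real code sends the node's full Event type rather than this stripped-down enum.

```rust
use std::sync::mpsc::Sender;
use std::thread;
use std::time::Duration;

/// Stand-in for the node's injected-payload type.
enum Injected {
    Gossip,
}

/// Spawn the gossip timer: every `interval` it injects a Gossip event.
/// When the receiver is dropped (the main loop exited), the send fails
/// and the loop breaks, so the thread terminates on its own.
fn spawn_gossip_timer(tx: Sender<Injected>, interval: Duration) -> thread::JoinHandle<()> {
    thread::spawn(move || loop {
        thread::sleep(interval);
        if tx.send(Injected::Gossip).is_err() {
            break;
        }
    })
}
```

Letting the send failure terminate the loop is a simpler alternative to the atomic-bool shutdown flag mentioned above, as long as the receiver actually gets dropped.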
They need to be Send, and 'static, because they're going over a channel, right? Like, the injected payloads, even though they're not necessarily going across thread boundaries, they are going over a channel, and the channel requires that the types you send over it are Send. Okay, great. So if we now go back to Echo, Echo shouldn't need to change, because injected payload has a default value of the unit type, but it will be needed here. And unique IDs, similarly, should only need to change because we use the turbofish down here. But broadcast now, we can have it infer that too, but we're gonna have an enum InjectedPayload, which is gonna have Gossip, nothing else. So this is now gonna be an InjectedPayload::Gossip. Injected payload, and this is gonna have injected payload. And so now we should be able to here match on payload, and the only injected payload is Gossip. And we don't need to change anything about the matching on message further down, because we know that message will never include these. Okay, so what do we do when one of these gossips triggers? Well, at least in theory, all we should need to do is for n in self.neighborhood. We really want to do something like a, actually, here's another helper we can have on message. This business? Where's our message helper? Right here. pub fn send, a reference to self, and a w, which is an impl Writer, or impl Write, I suppose. Like so, like so, and like so, where payload is Serialize. There we go. And so now, this over here, we should be able to just do reply.send to output. And this is going to be reply to broadcast. So that's already a lot nicer. And this is reply to read. And this is reply to topology. And I guess we could also do this, serialize response message, and this is write trailing newline. Right, so back to gossip now. What we can do is for n in neighborhood, we should be able to do message.send to &mut output, the gossip.
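A std-only sketch of that send helper. The stream serializes with serde_json; here the payload is just a Display type and the JSON is hand-rolled so the example stays self-contained. The trailing newline matters because Maelstrom messages are newline-delimited JSON on stdout.

```rust
use std::io::Write;

/// Simplified message envelope for the sketch.
struct Message<P> {
    src: String,
    dst: String,
    payload: P,
}

impl<P: std::fmt::Display> Message<P> {
    /// Write this message to `w` as one JSON line, newline-terminated.
    /// (The real helper bounds P on serde's Serialize instead of Display.)
    fn send(&self, w: &mut impl Write) -> std::io::Result<()> {
        writeln!(
            w,
            r#"{{"src":"{}","dest":"{}","body":{}}}"#,
            self.src, self.dst, self.payload
        )
    }
}
```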
And we could here do with_context, gossip to n. The question becomes, how do we fill out the message? And the message is easy enough here, right? The source is self.node. The destination is n. We're gonna have to clone these. Just a little sad, but it's fine. The ID here is gonna be self.id. And so that means we're gonna have to increment the ID. It'd be nice if that wasn't as error-prone if we forgot this. It is not in reply to anything. And the payload is going to be Payload::Gossip. So actually it's a good thing that we split up this event, because we are going to want gossip and gossip_ok messages. And those messages are actually gonna have to hold the whole data, right? So remember, we're exchanging which messages we've seen. So this is seen and this is seen. So these are messages that I have seen. These are messages that you have seen, in the response. So what we're gonna generate here is, let's do the sort of stupid version first, self.messages.clone. Which is, I'm gonna just send everything that I have, and what you're gonna send me in response is everything that you have. And what's also interesting here about gossip is we don't actually need to have responses. Currently the way that we've set this up, you know, when we were drawing this, was that when A and B gossip, they do a sync, like A sends a bunch of messages and B responds with something. But it doesn't actually need to be a reply. In fact, maybe we should just get rid of the response here. Because it's sufficient to just sort of fire and forget them, which also means they don't even need IDs. This can be None. Because no one needs to identify the response. If it gets dropped, it doesn't really matter. And so here we could probably prune out anything that we know that the other side has. In fact, we can write this right now. So we could do, iter, copied, collect. Iter, filter, I want to do that after the copied. Filter, n. known_to_n is going to be self dot known of n.
Filter, only things that are not, where not known_to_n. Or we could do this better. We could say known_to_n, not known_to_n dot contains, this is m for the message. So we're gonna take all the messages that we know about and we're gonna filter them so that only the ones that are not known to n are sent. And now the question is, how are we gonna update known_to_n? And we can leave that for later. It's fine for it to always be empty for now. And just to check my logic here, it's always weird with Boolean operators. So known_to_n, we're expecting that we're gonna send all of the messages. Let's double check that that's true. known_to_n is empty. Therefore, for any given message, the filter closure here is gonna return: empty contains is false, this is gonna turn into true. So the filter is gonna return true. And filter removes anything where it's false. And therefore, all messages will be sent. Great. So whenever it's gossip time, we're gonna send a message to everyone in our neighborhood, telling them about all the messages that we have. And over here, when we get a gossip message, we're just gonna do extend with seen. And we're not gonna reply. So that's all we have to do. Whenever someone tells us about a message, we just add it to the set that we have. So let's see if this works. And we might actually not need message communicated anymore. I was adding that because over in this space, right? When I get a response from you, then I know that you have seen these. But that means that I need to remember those for when I see this response. But if we're not doing responses, then this doesn't matter anymore. And we'll see how well that works. Okay. So if I now run, what's the command they wanna use here? This, maelstrom, target debug broadcast. See if it all fails. Notice that it only prints here the gossips, and only the messages that get exchanged between the Jepsen, like the Maelstrom clients, and our clients.
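Pulled out as a standalone function, that filtered gossip payload might look like this; the names are illustrative and `u64` stands in for the message values.

```rust
use std::collections::{HashMap, HashSet};

/// Everything we know, minus what we already know the target `n` knows.
/// With an empty known-set for `n`, this degenerates to sending everything,
/// which is the sanity check above.
fn gossip_payload(
    messages: &HashSet<u64>,
    known: &HashMap<String, HashSet<u64>>,
    n: &str,
) -> HashSet<u64> {
    let known_to_n = &known[n];
    messages
        .iter()
        .copied()
        .filter(|m| !known_to_n.contains(m))
        .collect()
}
```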
And not what our nodes are telling each other. Tearing down, everything looks good. I'm curious here to see what this actually looked like. So let's head over here to our localhost business, broadcast. Let's see the message history here. Okay, so that's only the Jepsen messages. History.edn. I wanted to see if our gossip messages showed up here, but it doesn't look like they did. Which is interesting. Hmm. Broadcast messages are very fast. The reads are slower because they need to transfer data. That seems reasonable. It's a bunch of topology messages. Ah, yeah. So here we see the gossip messages. Okay, so initially the topology got sent and we responded with topology okay. Then there was a broadcast, broadcast, broadcast, broadcast, broadcast. And then you see all of our nodes started gossiping. And you see the sort of seemingly weird interactions between them, because they're just sending to basically random other nodes in the network. But you see these gossip messages start to get pretty long. All right, like down here we're sending giant chunks and they just keep growing. And we should see that then if you look at the timeline. No, I don't want the timeline. I want the latency raw. That's not really what I want either. There's sort of a view here that I'm looking for, which is, what I really want to see is like the delay between when a message is broadcast and when it's visible to all peers, which it doesn't look like it's recording here. This is the rate of broadcast, but I want like the visibility delay, which doesn't seem to surface here, which is too bad. Because what we should see is, when the gossip messages become longer and longer, that means that the gossip between two nodes gets slower. And as a result, it takes longer for any given message to propagate across the network. And as a result, the propagation delay for messages is going to be longer.
So let's stop this one and say implement the naive gossip. And so now let's see if we can do a little bit better, and actually this should be really easy. It's just, when we receive a gossip message, then we know that the messages that the sender knows about include all of the ones that they sent us. If A told me about three, then I know that A knows about three, and I never need to send it three again. get_mut of this, expect: gossip from unknown node. And that's in theory all it takes for this to be smarter. So if we now run this, at what point, well, isn't that the broadcast rate? I think the broadcast rate is the rate at which the broadcast operation succeeds. And broadcast is a trivial operation because it doesn't trigger any work. We trigger all the work in gossip. So broadcast won't actually get slower. At what point will copied produce copies of values being iterated on? Will it copy those that end up filtered out? It depends whether you put dot copied before or after the filter. If you put it before the filter, then every value will be copied regardless of whether they match the filter, because the copy happens before the filter. If you put the copied after the filter, then only the ones that aren't filtered out get copied. The reason I put it before is because filter otherwise operates on references to the items being iterated over, and that just makes it more annoying to write. And in this case, the copy is basically free because the references are to numbers. So there's not a huge clone. That succeeded. Let's go ahead and look at this and just verify that the gossip messages are in fact shorter now. This broadcast. Messages. If we scroll down pretty far. So they're still kind of long. I'm just trying to gauge whether they're getting shorter. Like, you know, we're pretty far down the tree here and you see this list of messages, for example, there aren't that many. Like it's not all the messages that have been sent up to this point. Like let's go maybe to the end here.
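To the chat question about `.copied()`: a tiny example of both orderings. They produce the same result for Copy types like integers; the difference is only what the closure sees and, in principle, when the copy happens.

```rust
/// `.copied()` first: every element is copied, the closure sees owned u64s.
fn copy_then_filter(v: &[u64]) -> Vec<u64> {
    v.iter().copied().filter(|&x| x % 2 == 0).collect()
}

/// `.filter()` first: only survivors are copied, but the closure sees
/// `&&u64`, hence the double deref in the pattern.
fn filter_then_copy(v: &[u64]) -> Vec<u64> {
    v.iter().filter(|&&x| x % 2 == 0).copied().collect()
}
```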
Well, these are still pretty long. It's hard to tell whether this is because there are so many messages to exchange or whether it's because the logic isn't working. What we can do here is actually give ourselves a little bit more data and say, pop, pop, pop, pop. let notify_of, this is a HashSet. I just want to print it to see whether it ends up eliminating any, because that's really what we're after here, right? So there should be a store, latest, node-logs, and n0. Yeah, so it is only notifying of a subset of them. Let's let it run for a little bit longer and see how much it reduces. "Stable latencies in the results.edn link has the time to propagate." All right, let's go look at that. I'm gonna let this run first. The reason it's destination and not source is because this is after calling into_reply, which swaps them. So if we go to the end here. Yeah, so it's definitely sub-setting them. Right? I guess the observation here is that you will only eliminate messages that that other node has told you about in the past. And they won't tell you about ones that you have already told them that you know. So this is the problem we described over here, right? Which is: if I know that you know these values, I won't tell you that I know those values. So I think the thing we need to do here is be a little bit smarter. We want to add to notify_of a couple of extra messages to let them know that we know them. And so we're gonna say notify_of.extend(self.messages.iter().copied()). Trying to decide how I wanna, I kind of wanna pick these randomly, but then I need actual randomness. Let's do partition, already_known and notify_of. Whoa. HashSet. And then what I want here is to extend with already_known, but I only want some of them. And the way that I'm gonna pick only some of them. Yeah, exactly.
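The split into already-known values and values to notify of can be sketched with `Iterator::partition`. Again a sketch, not the stream's exact code; the parameter names are assumptions:

```rust
use std::collections::HashSet;

// Split our full message set by what the peer `n` is known to have:
// values the peer already has go left, values we still need to send
// go right.
fn split_messages(
    messages: &HashSet<u64>,
    known_to_n: &HashSet<u64>,
) -> (HashSet<u64>, HashSet<u64>) {
    messages
        .iter()
        .copied()
        .partition(|m| known_to_n.contains(m))
}
```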
If A tells B one, two, three, B will know that A knows one, two, three, and will not tell A that it knows one, two, three. And therefore A doesn't know that B knows them, which is exactly the problem I'm trying to solve. And the way we're gonna solve it is essentially with more of this gossip idea: I'm also gonna tell you a couple of extra ones, not too many, but just enough that over time you're gonna learn more about the ones that I have. And so we're gonna bring in rand here. And rand is version 0.8 now. And what we're gonna do is, iterating a HashSet could be seen as semi-random. The problem is it's not actually random enough, because we want it to be different every time we send messages to someone. And the HashSet iteration order is random, but it only changes between iterations if the hash map gets resized, which doesn't happen that often. So I want the prelude. I don't care too much about the performance here. .filter(). And I don't actually care about the value. And instead what I'm gonna say is random. What is the, I want this. I want an RNG. And what I want is rng.gen_bool. But I forget what the, returns a bool with probability p of being true. 0.1. There's another argument here, which is: instead of this being a fixed percentage of the total number of things that are already known, it should actually be like, we always send an extra 10. So what we'll do here is gen_ratio. And I want gen_ratio, 10 out of already_known.len() as u32. And actually this shouldn't be 10, this should be 10.min. Oops. Because if there are fewer than 10 that are already known, I don't want this to fail. Okay. Let's see how that does. I assume it passes; it'd be weird if including additional things in the gossip made a difference. What I really want to see here is, yeah, so now we see the gossips at the end here, there's basically nothing to exchange. That's more like it.
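Roughly, the "send a few extras" step looks like the sketch below. The stream uses rand 0.8's `rng.gen_ratio(n, d)`, which returns true with probability n/d; here the random decision is injected as a closure so the sketch is dependency-free and deterministic to test. The function name and signature are assumptions:

```rust
use std::collections::HashSet;

// Extend `notify_of` with a few values the peer already knows, so the
// peer learns that *we* know them. `include(n, d)` stands in for
// `rng.gen_ratio(n, d)` from the rand crate: true with probability n/d.
fn add_extras(
    notify_of: &mut HashSet<u64>,
    already_known: &HashSet<u64>,
    mut include: impl FnMut(u32, u32) -> bool,
) {
    let d = already_known.len() as u32;
    if d == 0 {
        return;
    }
    // Aim for ~10 extras per gossip; the `min` keeps the ratio valid
    // (gen_ratio panics if the numerator exceeds the denominator) when
    // fewer than 10 values are already known.
    let n = 10u32.min(d);
    notify_of.extend(already_known.iter().copied().filter(|_| include(n, d)));
}
```

On average each already-known value is included with probability 10/d, so about 10 extras get sent regardless of how large `already_known` grows.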
And so if I now go up and do serve, we go over here, go back, refresh this one, and we go to messages. Then now the gossips, as we go down, should still stay pretty short. There's a point in the middle where there are a lot of gossips. But if we go farther down, like around here, see, these gossips, even pretty far out at the end, are relatively small. Some of them are longer, but it's random, right? Beautiful. Okay. And then someone said, if you go to results.edn, stable latencies. Ah, stable latencies. Okay, stats. What I really want is like a, yeah, I guess maybe it's just in text form. Stable latency 861, 771. So the worst ones are like, the 99th percentile of stable latencies is 771 for this one. If we go to our first implementation, stable latencies was 800. So we didn't actually reduce it by that much. I'm not terribly surprised, actually, because these executions are pretty small anyway; it only matters once the gossip gets very long. So the 99th percentile was about 800. And then as we started to make our gossip smarter, it went down a little bit: when we made the gossip without the randomness, it went down to 777, and when we added the randomness, it went down to 771, so statistically insignificant. Okay, yeah, I think I'm happy with that. Interesting. Okay, I think that's actually where I wanna stop for today. So, "let them know that we know them". Let's make this a little bit more clear, and we don't actually need the notify here. If we know that n knows m, we don't tell n that we know m, so n will send us m for all eternity. So we include a couple of extra m's, so they gradually know all the things that we know, without sending lots of extra stuff each time. And you could tune the 10 here, right? It doesn't have to be 10. You could say that we're willing to always include a sort of overhead of at least 10% of the message. So instead of 10, this could be notify_of.len() as u32.
It could be like, for example, whoa, that's not what I wanted to do. This is one way to do it, right? So say, oops, actually that's not what I want. Well, that is what I want, but I just don't want it there. let additional_cap is, I don't want the message size to grow by more than 10%, right? So the message size is gonna be basically the number of things we have to tell them about, and I'm saying 10% of that is as many additional things as I'm willing to tell them about. So it doesn't hardcap it at 10, but it hardcaps it as a ratio of how much I'm already telling them about. And this is 10%, and we could of course do this in floating point instead, but I think it's fine this way; the rounding isn't really gonna matter. "We cap the number of extraneous m's we include to be at most 10% of the number of m's we have to include, to avoid excessive overhead." Now I kind of want the println back. That's fine. And so the idea is, if you have to gossip 100 things to them, then you're gonna send 10% of 100, so 10 extra values. If you're telling them about five things, I guess now it's gonna end up being zero, huh? That's not great either. Maybe 10% is too low. Let's go back here and look. What? Let's just show me the end. Some of them are still pretty short. So I think this is probably okay. You don't want the overhead to be a lot more than 10% anyway. So it still means that if you send 10, you're gonna send one extra. Okay, I think I'm happy with that. Let's stop it there. "Make gossip smarter." Sweet, I think that's all I wanted to touch on today. There are more exercises, right? If you go here, there's fault-tolerant broadcast, which actually, I wonder, let's just run our debug broadcast binary to see whether we already handle this. Are we already resilient to network partitions? So this is basically gonna make it so that certain nodes can't talk to each other for some period of time, and we'll see whether it still works.
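The 10%-cap variant described above can be sketched as follows. Same caveats as before: names are assumptions, and the `include` closure stands in for rand's `rng.gen_ratio(n, d)`:

```rust
use std::collections::HashSet;

// Variant: instead of a fixed ~10 extras, cap the extras at ~10% of the
// payload we have to send anyway, so gossip overhead stays bounded as a
// ratio of useful work.
fn add_capped_extras(
    notify_of: &mut HashSet<u64>,
    already_known: &HashSet<u64>,
    mut include: impl FnMut(u32, u32) -> bool,
) {
    let d = already_known.len() as u32;
    // Integer division: a payload of 100 allows 10 extras, but a
    // payload under 10 allows zero, the rounding quirk noted above.
    let additional_cap = (notify_of.len() / 10) as u32;
    if d == 0 || additional_cap == 0 {
        return;
    }
    let n = additional_cap.min(d);
    notify_of.extend(already_known.iter().copied().filter(|_| include(n, d)));
}
```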
See what it says? Everything looks good. Okay, great. So the gossip protocol we have actually works even if there are network partitions, right? If there's a permanent network partition it won't work, but as long as there's some path, the values will eventually make it there, as long as the topologies are updated so that you end up bridging that gap. Nice. So we actually do 3c as well. And then there's gonna be broadcast efficiency, which we can get to later. Are there any questions about what we've done so far, before I end the stream here? I think you could build on this and do the later challenges too. I just think this is a good place to end; we got to something pretty interesting. And I'll push this to GitHub too, so people can build on top of it. No questions. Everyone thinks everything made perfect sense. It's amazing. All right, in that case, thank you all for showing up, or for watching if you're sitting there at home afterwards. I hope this was useful. We'll see, I might do a part two of this with the later challenges. Don't know yet, but this was fun so far at least, and I like thinking about these distributed systems problems. All right, have a good rest of your Friday, or Saturday if you're in Australia and such, and I'll see you all later. Bye, folks.