 Okay, folks, welcome back. It's good to virtually see you all again. It's been a while since we did one of these implementation streams, but I'm pretty excited. I like writing code and I get to do more of them. This stream is actually inspired by a stream I did previously which was implementing the fly.io distributed systems challenges. So we didn't do all of these, but we sort of started from the beginning and worked our way through the challenges. It was a really fun way to just get into distributed systems a little bit and think about some of the problems that come up. And I think we got, did we get through one and two and maybe partially three? So it was fun. There's a bunch more challenges and I thought about doing a sort of part two where I continue those challenges. But I was sort of hesitant because I also want to leave some room for people to do it on their own without sort of being led through it by the nose. And so instead of doing those, I was like trying to figure out something else to do instead. And then shortly after I did that stream, a company reached out to me and was like, hey, we actually make coding challenges that are intended for like learning where you build real things. And they were like, well, because I've been fairly public about not really taking sponsorships, at least so far that's not really a thing I've been doing. And they were like, we don't necessarily want to sponsor you. We just think you might find these interesting and also, here's a referral code you can give people if they sign up. So I went and took a look and they actually have a catalog of challenges that are build your own X. So build your own Redis, build your own Git, build your own whatever, and they sort of give you a test suite kind of thing for each one. And in particular, I saw that they have a build your own BitTorrent. And I was like, okay, great. BitTorrent is a fun distributed system. Let's go ahead and build that. 
Now there are sort of two ways to go about this, right? One of them is to follow the actual BitTorrent spec. It's not too bad, actually. I've skimmed it a little bit and I feel like you could probably implement it just from the spec. But I wanted to do it through this CodeCrafters thing. So CodeCrafters is the company that reached out to me, because it has a sort of sequence of steps of like, do these challenges, roughly in this order, and eventually you'll end up with a working BitTorrent. And it starts from, like, just decode the strings that are encoded in this particular format that's used in torrent files, all the way down to, okay, actually download a whole file, with like difficulties and stuff. Now I wanna preface this with: I have never used this thing before. I don't know if it's any good. It might be that we start going through this and it's just absolute garbage. But I figured it might be useful, because if this works, then this is a teaching mechanism that I really believe in. I've told you this before on stream that one of the things that I really like is learning by doing, like building real things. And this feels like a thing that encourages that and hopefully makes it easier. So we'll go through this, and if we find the structure of this website too annoying, then we'll switch to something else. But otherwise we'll go through and see how it fares. Let me also send you a link, should you think this is interesting (verdict's still out on this one). You can also go join them from there. But let's see what happens if I click start building. Welcome to build your own BitTorrent. Great. I would prefer to work in Rust, thank you very much. Yes, please. Language proficiency: I'm gonna go with advanced. Great. Next question. Every day, sure, why not. Accountability? I'll pass. No accountability. Ooh, they give me a GitHub repo. All right, let's see what happens. git clone this thing. Okay. No, I don't want the whole thing. 
All right, let's see what's in here. Push an empty commit. All right, all right, fine. I'll do the thing they want me to do. Ooh, okay, so you do a commit, and when you push, I guess they run a test suite. Nice. Okay, great, I have a thing now. Great, I successfully pushed. Okay: Bencode, or B-encode, is a serialization format used in the BitTorrent protocol. It's used in torrent files and communication with trackers. Bencode supports four data types: strings, integers, arrays, and dictionaries. Focus on decoding strings. Strings are encoded as length, colon, contents. "hello" is encoded as "5:hello". Okay, this is pretty standard length encoding. Implement a decode command. It takes a bencoded value as input and prints the decoded value as JSON. It'll be invoked like this. All right, I wanna look at the actual code here before we just blindly accept it. Okay, src/main. They have a serde_bencode dependency. No, that doesn't sound like any fun. Oh, okay, great. So the thing they give me is really just: there's a sample torrent file, and there's whatever this your_bittorrent script is, which is just the thing to run. Okay, that's fine. It's fine. Oh, Rust 1.70. Okay, I guess we could pin our Rust to 1.70. Nah, I don't wanna do that. It's fine. What do they have in Cargo.toml? Don't edit this. Anyhow, bytes, clap, hex, regex, serde, tokio. Okay, this seems like a fine place to start. Okay, so they give you very little. That seems totally fine. main, args, if the command is decode, otherwise who knows what else. Uncomment this block to pass the first stage. Right, so this takes the second argument on the command line, calls this decode, and then prints the result. Okay, great. So now we gotta implement this thing. Oh, and they already did it for me. Why? I guess it's just to ensure that you actually have running code. Encoded value, so it iterates over the characters, takes the first character. 
So if the first character is a digit with radix 10, that is, a base-10 number, then find the first colon. This is not how I would do it. I wanna do it differently. I wanna do len and rest: encoded_value.split_once on the colon. if let Ok(len) = len.parse. I don't want all these unwraps. And they parse it as an i64, which I think is probably a lie. It probably can't be negative, because it's the length of a string. In fact, it's probably a usize. And now we don't need all this string offset indexing, so we can then just return, sort of, a String value of rest up to len. Now, this is also a little weird, because this assumes that there are no multi-byte UTF-8 characters in there. So this is one of those places where I kinda wanna look at the spec. I'm going beyond what the exercise really tells me to do already, which is arguably not what I should be doing, but you know, where is it here? Strings are length-prefixed base ten, followed by a colon and the string. Okay, so that's entirely unhelpful. When they say length-prefixed, is that length in terms of bytes, or length in terms of code points? Who knows? Who knows? All right. Fine. We'll just assume that this is basically what they mean. Can I write like cargo test? Is that a thing that they let me do? 'Cause that's what I wanna do. File contains an unclosed delimiter. Oh, yes, it does. And then I actually want this early return. Oh, I guess I can just, I think they do this. Okay. Here's how the tester will execute your program. All right. So I can execute it the same way. Right. This runs cargo run behind the scenes. That's fine. So it needs to be a string, length i64. Yeah, I know. I know. Yeah. I wonder whether it's different because I selected advanced as my difficulty level. I wonder whether that actually makes a difference or not. Okay. It printed "hello", which is the correct output. Okay. That's fine. 
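The split_once approach described above can be sketched as a small standalone function. This is my own sketch, not the starter code: it returns a plain String plus the remaining input (instead of a serde_json::Value) so it stays dependency-free, and it assumes the length prefix counts bytes.

```rust
// A minimal sketch of the string branch: "<length>:<contents>".
// Returns the decoded string and whatever input comes after it.
fn decode_string(encoded: &str) -> Option<(String, &str)> {
    // "5:hello" decodes to ("hello", "").
    let (len, rest) = encoded.split_once(':')?;
    // A string length can't be negative, so usize rather than i64.
    let len: usize = len.parse().ok()?;
    Some((rest[..len].to_string(), &rest[len..]))
}
```

Returning the leftover input is what makes this composable later, when lists need to decode element after element.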
What I'm also curious about here is, yeah, this is a byte offset, which I don't love, but it's fine. We'll leave it the way it is. So I guess if I go back here, what do they want me to do? I guess they want me to push. Okay. So "bencode: decode strings", git push. Expected "blueberry" at standard output. Oh, it's because this println has to go away. Let's make it an eprintln, maybe. Wonder if that's okay. That's interesting. Debug output goes on stderr. I wish they had that in the code itself. Like, this should not be a change I need to make. Ha, compilation successful. Great. Any logs from here will appear here. Test passed. Beautiful. All right. So we go back. Will this now have a next stage? Okay. In this stage you'll extend decode to support bencoded integers. Integers are encoded as i, number, e. Why? I guess e for end. I don't want to use the library crate. That's no fun. Okay. So here's a good question. I guess I can check whether the first character of encoded_value is equal to 'i'. I want that branch first, because the string one has to split and then parse, whereas if the first letter is 'i', then it's easier to figure out what we do. Then we go and, I guess, actually here's what I want to do: if let Some(rest) = encoded_value.strip_prefix('i'), because that way I don't have to index afterwards. Then if let Some(digits), I suppose, is rest.split_once on 'e'. And I don't actually care about what comes after the 'e'. And then if let Ok(n) is digits.parse. This time it is an i64, because it can be negative. Then return this. Now this is obviously not ideal. I don't love this nesting here, really. I think I can even just do .into here; I'm pretty sure that's supported. I'm tempted here to change this one a little bit, because what I want, I guess these two conditions are actually exclusive. So what I'm worried about is that we go in here, we don't find an 'e', and therefore we might have to take the other case, but I don't think that's possible. 
I think if we go in here, then this really is the only valid pattern. And so if we don't hit any of these ifs, then we're gonna hit the panic. The nested ifs here are a little sad. I wish I could avoid those, but I'm okay with leaving this here. I could also implement this whole parsing in nom, but that feels maybe excessive. So if I now run, I guess, i25e. Okay. i-25e. Okay, that seems fine. And if I don't have an 'e', then it panics, because it says unhandled encoded value. Beautiful. All right, "parse ints". Yeah, so the other way we could do this is we could do .and_then, right? So I could do if let Some(n) is .strip_prefix .and_then digits. But I don't know if this is really better. Right? Wait, no. That's not what I want. I want .and_then, so this is gonna be: I ignore the first part, and then I take the rest, and then I do, I mean, we can look at it and see whether we actually think this is more readable. I don't know that it's going to be, but we can try. So that's gonna be this, and then digits.parse().ok() so that it turns into an Option, and then, I guess, yeah. Is that better? I don't know if I think that's better. Where did I mess this up? Oh, strip_prefix just gives you rest, right. It is true that now it will fall through to the other cases if it doesn't go through these. I suppose that's nicer. I mean, we can do the same thing here, right? So we could do .and_then, if we do actually think it's nicer. So len and rest, and then we do len.parse. Okay, well, this one's more awkward, because we actually also want to keep the rest part here. So this has to return len and rest. And so here, well, we're gonna have to go: let (len, rest) is this, and then Some((len, rest)). And the reason we have to do that is so that we keep the length here as well. To find out, go back and do 5:hello. Great. And if I do like 4:hello, then it just skips the o. Yeah, great. 
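The .and_then chain just discussed can be sketched as a standalone function (my own naming; the stream does this inline in the decoder):

```rust
// Integers are encoded as i<number>e, e.g. "i-25e" is -25.
fn decode_integer(encoded: &str) -> Option<i64> {
    encoded
        .strip_prefix('i')                      // must start with 'i'
        .and_then(|rest| rest.split_once('e'))  // everything up to the 'e'
        .and_then(|(digits, _)| digits.parse::<i64>().ok())
}
```

Each step turns a failure into None, so the caller can fall through to the panic branch without nested ifs.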
I don't know if that one's nicer for the string case, actually, especially because we need to have this block in the middle here. So I think that one I actually want to keep the way it was. The integer decoding I think is okay. "Avoid deep ifs", git push. Amazing. In progress, no, completed. View next stage. In this stage you'll extend decode to support bencoded lists. Lists are encoded as l. I see. So this is, okay. So this is where we get into nested handling. So I actually think I want to change this a little bit now. I want to match on encoded value. I forget whether you can match on strings this way. So with slices, you can do this. But I don't know whether you can do the same thing for strings. It didn't complain about it, but that might be my rust-analyzer being broken. Yeah. Pattern cannot match with input type. That's awkward. Can I do like this? 'Cause that's a range. I don't think I get to do this. I guess I can just do, is there a split_first? I don't think there's a split_first for strings. Yeah. That's fine. All right. Well, we'll match on index zero. And then we'll say, okay, if we get 'i', if we get 'l', and if we get, I guess, anything else, this is really anything. So the string case is anything between '0' and '9'. Chai is very sad. I don't know if you can hear her. Are you very sad? Yeah, come here. Why are you so sad? Do you want to say hello? Are you so disgruntled? Are you a disgruntled employee? Yeah. I'm sorry, I couldn't help you. But see, this is where it makes me sad, because arguably the strip_prefix, like, we know that the prefix is 'i'. That's fine, I suppose. So then this will be a serde_json Value::Number. And then for the 'l' case, we'll do, or rather, for the number case, we'll do here: if Some(n) is this, we'll return n. We could actually do the same thing here. We could do .into. The reason I want to match here is just so that this is more like a table you can cross-reference more easily. 
So what do they say? With l, it's encoded as l, then a bunch of valid elements, and then e. Okay, so this is where it gets trickier, right? This is where arguably you want an actual parser, because we're gonna have to recursively call decode_bencoded_value, which I think means we're going to want to either take a mutable reference to a string here, so we could move the string along past everything we've parsed, or we could have it return a value and a &str to say, okay, I parsed out this value, and this is the remainder of the encoded value. Either of those are okay. Let's maybe go with this one for now. And this is the same kind of thing you see if you use a parser combinator like nom. What you end up doing is you pass in an input, and it gives you back the thing you parsed out and then the remainder of the input. So we can do the same thing here. So this is then going to be rest, rest. And then really we want to parse that rest, so everything that comes after the 'e' we'll do here. And now this gets into the ugly: and rest, and then here we'll then do this and len plus one and onwards, like so. And once we have that, then this now becomes, oh, I guess there is a split_at, isn't there? encoded_value.split_at. Yeah. And is this, for split_at, the two slices returned go from the start of the string slice to mid, and from mid to the end of the string slice. So I think that means it has to be split_at(1). And then I can do 'i' and rest, and I can do 'l' and rest. Now, where this gets awkward is that split_at doesn't know that it's returning a single character. So this won't work. It'll have to be this. And I don't know if you can do this. I don't know if you can have a range of strings. I guess we'll find out. All right, so at this point, that means we no longer have to strip that, and we no longer have to do this. 
And so now for the rest here, I suppose what we'll do is, we can't actually search for the 'e', because you could have nested elements, like nested lists, for example. So you can't look for the first 'e', because that might terminate an inner list. You can't look for the last 'e', because there might be an earlier 'e' that we should use. So I think what we'll do is actually just call the decoder over and over. We'll do like let mut values = Vec::new(). And then we'll do loop, we'll do, actually, I suppose we'll do while rest. So we'll make this a mutable string reference: while not rest.is_empty(). It's not gonna be empty, actually. It's gonna be while not rest.is_empty() and rest[0] is not equal to 'e'. Actually, realistically, we could also not have the empty clause here, but we don't want it to panic if we have an unterminated string, or rather, we would want it to panic with the unhandled encoded value. So we want to terminate the loop cleanly here, and then what we'll do is, as long as we see that there are still values, we'll do let (v, remainder) = decode_bencoded_value of rest. And then we'll set rest equal to remainder, and we'll do values.push(v), right? So we call the decoder. The decoder tells us both what value it got out and what the remainder of the string is. Then we stick that there. And then at this point, we then do rest dot, is there a shift? 'Cause I wanna remove the, I guess I can do return values.into() and rest from one and onwards, right? Because the one is gonna be the 'e'. Sweet, see how that works. So I guess we can make up one of these: so l to start a list, i for a number, 25, e. We could nest the list. Then we could have a three-length string, foo. There's no need for an end there. And then we could have another integer, which is minus 43. We could end that, and then we end the inner list, and then we have another value, like 5:hello, and then we end the outer list, and see what that does. 
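The (value, rest) recursion described above can be sketched as a self-contained decoder. To keep it dependency-free I use a tiny stand-in enum instead of serde_json::Value; the names are my own, and the stream's decode_bencoded_value plays this role.

```rust
// A stand-in for serde_json::Value, just for this sketch.
#[derive(Debug, PartialEq)]
enum Value {
    Int(i64),
    Str(String),
    List(Vec<Value>),
}

// Decode one bencoded value, returning it plus the unconsumed input.
fn decode(encoded: &str) -> (Value, &str) {
    match encoded.chars().next() {
        Some('i') => {
            // i<number>e
            let (digits, rest) = encoded[1..].split_once('e').expect("unterminated integer");
            (Value::Int(digits.parse().expect("invalid integer")), rest)
        }
        Some('l') => {
            // l<elements>e: keep decoding until we hit the 'e' closing THIS list.
            let mut values = Vec::new();
            let mut rest = &encoded[1..];
            while !rest.is_empty() && !rest.starts_with('e') {
                let (v, remainder) = decode(rest);
                values.push(v);
                rest = remainder;
            }
            (Value::List(values), &rest[1..]) // skip the closing 'e'
        }
        Some('0'..='9') => {
            // <length>:<contents>
            let (len, rest) = encoded.split_once(':').expect("missing colon");
            let len: usize = len.parse().expect("invalid length");
            (Value::Str(rest[..len].to_string()), &rest[len..])
        }
        _ => panic!("unhandled encoded value: {}", encoded),
    }
}
```

The test input built in the transcript, li25el3:fooi-43ee5:helloe, exercises the nested case.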
Oh, and this thing down here would have to be .0, because that's what we actually want out. String indices are ranges of usize. I could just do and rest starts_with 'e'. If it's not empty and it doesn't start with an 'e'. Now, yeah, I was worried about this. See, one option is to do the split_at and then map the first part, the string of length one, into a char. I suppose I can do lead and rest, or, if you will, tag and rest. And then I could match on the tag, and I'll keep this mut for the list case. And now we just need to turn the one-length string into a char. And I think we should just do let tag = tag.chars().next().unwrap(). Or this can actually be an expect. split_at, I guess split_at probably panics if, yeah, split_at panics if mid is too far. It did not panic. So tag is length one, tag equals tag. All right, this no longer needs to have an end then, because the rest of the outset here is not an Option anymore. And it's a non-exhaustive pattern, which is fine, do nothing in the case of unknown here. It did not like that one. Unhandled encoded value: 3:foo. So that's here, that's in the nested list case. Unhandled encoded value. So this is where we gotta start to give ourselves a little bit to go on here. So eprintln tag is tag: tag is l, tag is i, tag is l, tag is 3. So why is it unhappy about that? So it goes into this case, and that's where it breaks. eprintln: len is len, rest is rest. I could arguably just use the dbg macro here, which would probably have been easier. len is empty. Okay, so we, oh, damn it. Okay, so actually I take this all back. We don't want to split out the tag, because the tag is part of the length of the string. So I actually take this all back, and this has to be, we don't wanna split. We want encoded_value.chars().next(). 
And then we can do Some of this, Some of this, which also means now we're gonna have to skip the value again, which is a little awkward. So rest now indeed has to be rest.split_at(1).1. And I guess here we do let mut rest is, this is actually now encoded_value. This is gonna be rest.split_at(1).1. And this is gonna be encoded_value, no, encoded_value.split_at(1).1. See how it likes that. No field rest. Right. How does chars().next() differ from indexing zero? So you're not allowed to index into a string. You can't do encoded_value[0]. That's not a thing that you're allowed to do. If I do this, it says string indices are ranges of usize. So you can take a slice, but you can't take a single character out of a string that way. And it's because it's UTF-8 encoded. So the indexing into here is not a character index, it's a byte index. And an arbitrary byte index into a UTF-8 encoded string is not necessarily a valid character. Like, you might be in the middle of a multi-byte encoded character, for example. And so that's why you can't do this. Okay. So we still get into this case. Oh, that's because here we don't wanna split_at, is why. Great, len is 3, rest is this. So len is 3. Rest is foo, so it also eats up the 'i' for some reason. Is that because we, oh yeah, that's because this actually needs to be not plus one. Great. So now we get back a list that parses out all our values. Great. And now we can get rid of this eprintln. "Parse lists", git push. Amazing. Next stage: extend decode to support bencoded dictionaries. Dictionaries are encoded as d, key, value, key, value, key, value, et cetera, followed by e. Great. So this hopefully should be a trivial extension of the l case. Right, so we say, great, we should also handle 'd'. This is going to be a HashMap. Or, I think, technically a BTreeMap, because I believe that's what serde_json uses internally. So we skip the 'd' while we haven't encountered an 'e'. 
Decode the key, decode the value. values.insert, I guess this is a dict in serde_json speak, no, in Bencode speak. And then we set rest to be the remainder after parsing out both the key and the value. And then we return the rest at one, because we want to also skip the 'e'. Okay, let's see how that does. So if I do, now the whole thing is going to be a d with a 3:bar as the key and this list as the value. Oh, right, From<BTreeMap> is not implemented for Value. So serde_json Value::Object. Interesting. So what is the Map? Okay, let's just do serde_json::Map::new(). Can I do this? Is it going to be mad at me, or is it going to let me do it? Right, now I don't need the BTreeMap anymore. It requires that this is a String. Okay, so this gets a little bit awkward. So I guess, I think what we want to do here is match on k, and if it's a serde_json Value::String, then, because in serde, sorry, in JSON, keys have to be strings, right? The decode that we do here could return any arbitrary type. Like, for example, the key might happen to be an integer or a list or something, or another dictionary. And in JSON, you can't represent that. So what we want to do is something like panic here. Really, this should use proper error handling. One of the reasons I'm not doing that is because I think, realistically, I'm going to swap this entire thing out for serde_bencode the moment we get to implementing a more interesting part of the torrent. So this is sort of temporary parsing code, if you will. Dict keys must be strings. Great, so that worked. But if I tried here to use, say, i42e: dict keys must be strings, not Number(42). Nice. So now we can do "parse dicts", git push. These are indeed very easy. It's funny, when we first started this, these first few steps were listed as very easy and then easy. And I agree, I think this is straightforward. Now we get to the fun part. Okay, you'll parse a torrent file and print information about the torrent. 
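The dictionary loop just described (decode key, decode value, insert, repeat until the closing 'e') can be sketched on its own. This is my own cut-down version: to stay self-contained it only supports string keys and string values, where the real decoder recurses into the full decode for the values.

```rust
use std::collections::BTreeMap;

// d<key><value>...e, with keys and values here both restricted to strings.
fn decode_dict(encoded: &str) -> (BTreeMap<String, String>, &str) {
    let mut dict = BTreeMap::new();
    let mut rest = encoded.strip_prefix('d').expect("not a dict");
    while !rest.is_empty() && !rest.starts_with('e') {
        let (key, r) = decode_string(rest);   // keys must be strings in JSON
        let (value, r) = decode_string(r);    // real code: full recursive decode
        dict.insert(key, value);
        rest = r;
    }
    (dict, &rest[1..]) // skip the closing 'e'
}

fn decode_string(encoded: &str) -> (String, &str) {
    let (len, rest) = encoded.split_once(':').expect("missing colon");
    let len: usize = len.parse().expect("invalid length");
    (rest[..len].to_string(), &rest[len..])
}
```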
A torrent file, also known as a metainfo file, contains, oh nice, they link to the spec, that makes me happy. It is a bencoded dictionary with the following keys and values: announce, info, length, name, piece length, and pieces. Torrent files contain bytes that aren't valid UTF-8 characters. You'll run into problems if you try to read the contents of this file as a string; use &[u8] or Vec<u8>. See, okay, this is where I wanna now switch over to not using our own decoder, but instead use the serde implementation that they give us. Because actually implementing this in a way where we can get nice structs out of it is not the thing I wanna spend time on. It's not that it's not interesting. It's just not the thing that I wanna spend time on. So if we go to Cargo.toml, do I have serde with derive here? I do, so use serde::Deserialize. And then I can do a struct where they say the keys are announce and info. Let's grab, in fact, I'm guessing there's more in here. Announce, info, so struct Torrent. So announce is a URL. Did they give me a URL library I can pull in here? Yeah, probably. Maybe reqwest's Url, beautiful. And we'll keep the documentation from the spec in there as well. I like to have the spec text more easily available to me, which this would do. And then we'll derive Debug, Clone, Deserialize. And then the info dictionary. So we'll have a struct Info. name maps to a UTF-8 encoded string. Great. Piece length, piece length. Oh, no, I guess I can close that one. Yeah, so I think it's just length. piece length maps to the number of bytes each piece of the file is split into. I guess we can also just look at the torrent file here. This is d8. So announce, I guess now we can read this format, which is nice. So if I look for info, info has the keys length, right? This is length as a key with this value. Then it has name. It literally is piece-space-length as the name of the dict key. Wow. All right. It maps to the number of bytes. 
Okay, so this is gonna be a serde rename equals "piece length". That's wild. That's a chaos move, to have a dictionary whose keys have spaces in them. pieces maps to a string whose length is a multiple of 20. Wild. Okay. So it's subdivided into strings of length 20, each of which is the SHA-1 hash of the piece at the corresponding index. This I gotta see. So pieces here. Oh yeah, how about that? pieces is a 60-length byte string that is gonna be subdivided into substrings of length 60, okay, of length 20. This one we can also do a little bit better. We can actually split this up when deserializing. So we'll get to that in a second. There's also a key length or a key files, but not both or neither. Okay, so then we can use a serde flatten to say keys, and that's gonna be a Keys. And then, so serde flatten here. Let me, I'll show you in a second what that does. So we're gonna do enum Keys. And this one is gonna use serde untagged. I'll explain that one in a second too. So there's either SingleFile, which has a field length, which is a usize, or there's MultiFile. I guess instead of keys, this should arguably be files for the multi-file case. In the single-file case, length maps to the length of the file in bytes. We'll stick this one up here. If length is present, then the download represents a single file. Otherwise it represents a set of files, which go in a directory structure. For the purposes of the other keys, the multi-file case is treated as only having a single file by concatenating the files in the order they appear in the files list. The files list is the value files maps to, and is a list of dictionaries containing the following keys. So that suggests that there's a files, which is a list of dictionaries, which means those dictionaries are just a type for us. File. And then I guess we're gonna have a struct File. Okay. 
And we have a struct File, which has length, okay, usize, and has path. Oh no. Which is a list of UTF-8 encoded strings corresponding to subdirectory names. Okay. So it's a Vec of String, the last of which is the actual file name. A zero-length list is an error case. Okay. That's fine. This single-file case seems very specialized. Okay. Let's tidy this up a little bit so that it actually, you know, gets formatted. Great. Let's see if we can make any sense of this. So metainfo files are, we can do this. A metainfo file has a URL and has an info. An info, they don't really say anything about here, but maybe this one says, nope. Okay. Info. Oh right. name is a string. It disappeared into the comment. The name key maps to a UTF-8 encoded string, which is the suggested name to save the file or directory as. It is purely advisory. piece length is the number of bytes in each piece the file is split into. For the purposes of transfer, a file is split into fixed-size pieces which are all the same length, except for possibly the last one, which may be truncated. Piece length is almost always a power of two, most commonly 2^18, which is 256K. That's fine. pieces maps to, all right. So this one is just batshit insane. So we actually have to make a decision here on how we wanna decode this, because we could just decode it into a Vec<u8>, or we could write a custom serde deserializer that will turn it directly into something like a Vec of [u8; 20]. That's not too bad, actually. So the way that we would do this in practice, yeah, I mean, why not? That seems like fun. serde.rs. So if we look at implementing Deserialize, deserializing a, no, that's not what I want. I think this guy is all I'm gonna need. So we'll do, yeah, we'll grab these two up here, and we're gonna have, I guess, a hashes visitor. And the value it's going to produce is a Vec of [u8; 20]s, each of length 20. 
A byte string whose length is a multiple of 20. And the only things we support here are visit_bytes and visit_borrowed_bytes. And notice it can't be visit_str, because this is not necessarily a valid UTF-8 string. And so now we're gonna do: if v.len() modulo 20 is not equal to zero, then we can return a serde error. Oh, I forget, E::custom is the one I want. Or E, yeah, I guess custom. format!, just to be a little helpful: length is v.len(). And so now we should be able to say let mut values = Vec::new(). And in fact, we should be able to do with_capacity here, which is v.len() divided by 20. And we're gonna return Ok(values). And then there's a v.chunks, that's what I want, of size 20. Returns an iterator over chunk_size elements of the slice at a time, starting at the beginning of the slice. The chunks are slices and do not overlap. If chunk_size does not divide the length of the slice, then the last chunk will not have length chunk_size. See chunks_exact for a variant of this iterator that returns chunks of always exactly chunk_size elements. Okay, so chunks_exact(20). Returns an iterator over chunk_size elements of the slice at a time. If chunk_size does not divide the length of the slice, then the last up to chunk_size minus one elements will be omitted and can be retrieved from the remainder function of the iterator. Perfect, great. So this one now, I think we should be able to just do, in fact, we could just do collect, right? At least in theory. I guess it depends on what exactly this returns. If I go to chunks_exact, no, it's not nightly, it was stabilized in 1.31. The question is what it produces. Where's the implementation of Iterator? Oh, it gives me a slice. That's lame. chunks_exact should be able to give me a reference to an array. Is there an array_chunks? v dot, I don't think there is an array_chunks. There is a, oh, there is. Yeah, but that one's nightly only. See, so: TODO, use array_chunks when stable. That makes me sad. 
What's the source of array_chunks? Oh, that's entirely unhelpful. Okay, fine. So we have a couple of options here. I think what I wanna do is actually, we can just do a map of slice_20, and we should be able to do slice_20.try_into, with an expect: guaranteed to be length 20. And we can do the same thing here. Or actually, we don't even need visit_borrowed_bytes here, because we're not gonna hand out sub-slices in the first place. So now that we have this visitor, what's the, ah, here, yeah, implement Deserialize. So now we can do this: implement Deserialize for, I guess I technically want this to be a struct Hash, or Hashes, which holds a Vec of [u8; 20]. HashesVisitor, and then this now is gonna be an Ok of Hashes of this. And now we can do deserializer.deserialize_bytes(HashesVisitor). A little bit of custom serde here, but what that lets us do is now we can say Hashes here. And now we know that this will only deserialize things that are actually a multiple of 20, and afterwards we can just index into them directly as [u8; 20] arrays. So this now should be able to say: each entry of pieces is the SHA-1 hash of the piece at the corresponding index. And now Keys: there is a key length or a key files, but not both or neither. So serde flatten here really means treat this enum, or treat this type, as though it were here. So basically ignore this keys field, like, this key will not be represented in the serialized format. Instead, just inline this definition as though it were placed here. And the reason we wanna do this is because we want an enum, and you can't really put an enum in here, at least you wouldn't want to. So that way it's like these fields are as though they were placed directly here, but we get to express them through a separate type. And then the enum Keys here is untagged, because there's no tag that tells us whether we're in the single-file case or the multi-file case, right? 
There's no tag = "single file". That's not a thing that's there. It's just — it's a single file if it has a length field. And if it doesn't have a length field, then it should match this other type. And that's what serde's untagged does. Now serde untagged comes with some performance implications, like we looked at in the Decrusting the serde crate stream — basically it has to try to deserialize twice, once for the single-file case and once for the multi-file case. So it's a little slower, but it at least lets us express this pretty nicely. If length is present, then the download represents a single file. Okay, so we can stick that over there. Okay. If length is present, then the download represents a single file. In the single-file case, length maps to the length of the file in bytes. Okay, so this goes here. Otherwise it represents a set of files which go in a directory structure. For the purposes of the other keys — as in these ones, like piece length, pieces, et cetera. For the purposes of the other keys in info, the multi-file case is treated as having only a single file, by concatenating the files in the order they appear in the files list. So we actually do need the order here to be preserved, because the pieces come in the order that the files appear in this files list. Okay, so this is a vector of files. The files list is the value files maps to. Okay, so basically they say nothing useful there. So a file is the length of the file in bytes and a list of UTF-8 encoded strings corresponding to subdirectory names — okay, names for this file, the last of which is the actual file name. In the single-file case, the name key is the — wait, there's a name key as well? All right, where is our spec? Let's make this a little larger. Thank you very much. Length, path — but which name key? Oh, that name key. Okay, so the top-level name key of info.
Great, so that means this really is documentation on name. In the single-file case, the name key is the name of a file. In the multi-file case, it's the name of a directory. Ooh, what did I do? Great. And I guess we will derive — I actually wanna move this Hashes thing somewhere else because it's in the way. Let's do, at 35, mod hashes, and then I can grab this and this and stick them in mod hashes, and then I can use hashes::Hashes. And then we can derive this on all of these ones. And now the question is what do they want us to do with this in this thing? info followed by sample.torrent. Okay, so time to pull out clap here as well. I can never remember the overall structure for clap, so I always just do this, which is fine. Ooh, that's not what I meant to do. So instead of — currently we have this "if command is equal to decode" — I wanna do a little bit better. Where's the subcommand example? I want the derive reference, subcommand. Ooh, I guess maybe the tutorial is nicer — the derive tutorial. Okay, so we want the clap Parser and we want clap Subcommand. And we wanna derive Subcommand for this, which has Decode, which just takes a value, which is a String. And we now also have Info, which has a torrent file, which is a PathBuf. And we don't take any other arguments, and then it's just a command field of this Command subcommand type on the args struct. Great. And now if I go down to my main, instead of having this be just random string parsing — I can get rid of my std::env here as well — I can do match args.command, and if we get Decode, then we'll do the same thing that we previously did, right? Which is: let v: serde_json::Value = serde_bencode from the input. So this is where we start using that serde implementation of bencode that they gave us — value, unwrap, and println v. If we got Info, we don't know what to do yet.
Now, just to see that this still works with our old decode things. Oh, right, Hashes should be pub. To use serde Deserializer and Deserialize. And I did something silly here, which is this should be self.clone(), right? So this is Hashes, which should really derive Debug and Clone. Url: Deserialize is not implemented. All right, we'll keep this a String for now, that's fine. URLs are strings. Command cannot be formatted with Debug. Okay, that's fine. We'll derive Debug for you. Amazing. Right, and this is the one that we intentionally broke, because integers can't be keys — but we can do, you know, 3:bar — invalid type: byte array, expected a string key. Ooh, also there's a whole lot of things it's upset about, about these structs. Oh, that's because we never parse one. Okay, so I guess we can set this code up too. So here we're gonna do let mut f = File::open(torrent).unwrap(). We can stick in some anyhow now, I guess, just to make this a little bit nicer. anyhow::Result. And I also wanna bring in use anyhow::Context. And so now we should be able to do .context("open torrent file") here. And then I should now be able to do: let t: Torrent = serde_bencode::from — oh, they only give from_bytes. That's sad. Fine. Read instead — read the torrent file into bytes, then parse the torrent file. Okay, so that should give us a Torrent; we don't really know what to do with it yet. And this whole thing can return Ok. Right now it still says all of these fields are never read, and unused variable — that's fine, that's just noise — but this one's interesting: invalid type: byte array, expected a string key. RUST_BACKTRACE=1, line 87. That's interesting. Did I do something weird in this one to make it not valid? Lists — that should be fine. List of foo, i43ee. That's invalid type: byte array, expected a string key.
This is a dictionary with a key bar, and the value is a list here that ends — that's an inner list, inner list ends here, outer list ends here, dictionary ends there. Wait, so why is this invalid? Just to see that I'm not doing something obviously stupid — okay, that works. Invalid type: byte array, expected a string key. So for whatever reason, the serde_bencode thing they have isn't happy with this input. I think their decoder is wrong. No, but this is the decode. So someone pointed out in chat that this is because of from_bytes versus from_str, but this is the decode option. So this is not the torrent files. The torrent files we read as u8s and then from_bytes, but we're not executing that code path right now. We're executing this code path, which is just from_str on the value that we get in on the command line, which is this value right here. And this value is just straightforward. Like, no, the b prefix doesn't do anything. Dictionary item keys must be alphabetically sorted. Yeah, but there's only one dictionary key and that's bar. So it is alphabetically sorted. All right, let's try a simpler version of this. So let's just do d3:bari25ee. So the key is bar, the value is 25, and that's the end of the dictionary. What if I just do 3:bar? Invalid type: byte array, expected any valid JSON value. I think I know why this is, actually. I bet you if we go to crates.io, the serde-bencode crate, GitHub — I wonder if someone's reported this as an issue. Oh no, oh no. So serde flatten isn't gonna be working, is it? A byte array. This seems like something that might come in handy in a second. I wonder, if we look at the source here: deserialize tuple_variant, struct_variant, enum access. Deserialize, parse int. Yeah, so here, if you squint at it, you can sort of see the remainders of the code that we wrote to parse the integers and stuff, right?
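Just to double-check that structure by hand, here's a minimal std-only sketch of bencode integer and string decoding (a toy, not the serde_bencode implementation; parse_int and parse_str are hypothetical helper names) walking through d3:bari25ee:

```rust
// Minimal bencode tokenizing, enough to check d3:bari25ee by hand:
// "d" opens a dict, "3:bar" is a length-prefixed string key,
// "i25e" is an integer value, and the final "e" closes the dict.
fn parse_int(s: &str) -> Option<(i64, &str)> {
    // integers look like iNNNe
    let s = s.strip_prefix('i')?;
    let end = s.find('e')?;
    let n = s[..end].parse().ok()?;
    Some((n, &s[end + 1..]))
}

fn parse_str(s: &str) -> Option<(&str, &str)> {
    // strings look like LEN:bytes
    let colon = s.find(':')?;
    let len: usize = s[..colon].parse().ok()?;
    let rest = &s[colon + 1..];
    Some((&rest[..len], &rest[len..]))
}

fn main() {
    let input = "d3:bari25ee";
    let inner = input.strip_prefix('d').unwrap();
    let (key, rest) = parse_str(inner).unwrap();
    let (value, rest) = parse_int(rest).unwrap();
    assert_eq!(key, "bar");
    assert_eq!(value, 25);
    assert_eq!(rest, "e"); // closing 'e' of the dictionary
}
```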
You know, I bet you that — yeah, this crate parses anything that has zero through nine as bytes, not as strings. Just always. So when it hits "do not delegate this to deserialize_any, because we want to call visit_str instead of visit_bytes on the visitor to correctly support adjacently tagged enums", right? So I think what happens here is that when you try to deserialize into serde_json::Value, serde_json::Value doesn't tell you which type you're supposed to get back, right? Because serde_json::Value supports both bytes and strings. So what actually happens is, when the deserializer runs, it doesn't know whether it's expected to produce a string or to produce bytes. And then it just errs on the side of producing bytes. So I bet you it actually works if you have a concrete type here, because when you try to deserialize into this type, then we know that this key is supposed to be an actual string. And similarly for the value, we know it's supposed to be an actual string. And so then the deserializer knows to deserialize into a string rather than into bytes. But when we try to do it into serde_json, then serde_json doesn't dictate what we should do. Although it should, because the keys of maps are known to be strings. Hmm, that's fascinating. I mean, I guess we can do our torrent to see what happens. Oh, that does nothing, does it? Okay, but it didn't error either. So println — I guess eprintln — see what it got out of that file. Okay, so it actually parses the torrent file correctly. So I think the problem really is trying to turn this into just an arbitrary JSON value. I think that's what's going on. Which is pretty frustrating, because that's what the earlier exercises tell us to do. I guess we should just do unimplemented, right? I guess I could require that the top level is a Map of String to serde_json::Value. Oh, right — so that it's okay with. So if I tell it that it's a map — what about my bigger, fancy one? That it's not okay with.
So I think the actual problem here is that when deserializing into serde_json, it will always try to parse strings as bytes, and that just doesn't work. I don't really know what to do about that, but it's also not important, because this is just the decoder they're using for the earlier exercises. So what if we just do unimplemented — "serde_bencode to serde_json::Value is broken". Right, what were we supposed to do? What was even the challenge here, right? It's just: parse the torrent file, and it asks us to output this stuff. Okay, that's easy enough. Now that we have it all parsed — no, wrong thing. Okay, so we want to println "Tracker URL", which is just t.announce, right? And then we also want to do "Length", which is supposed to come from t.keys. So I think they're probably focused on the single-file use case, right? Also, is there a dark mode for this? No? Oh. The length of the file. So if let single file is t.info.keys, then print out length, else todo!, cause we don't know what they want us to print there yet. So if I now try to give it an info file — t. — oh, did I make this files instead? What did I make this? Oh, no, it's t.info.keys. Yep, Tracker URL and Length. And that's what it expects; that's indeed what we print. Okay, "parse torrent file", git push. Application didn't terminate correctly. Oh, so they still call us with the old test cases, I suppose. I mean, all right. I guess what we can do is — why is it not happy with this? Is that the last one we had? This one — you're gonna hate this. Get rid of this. And then we'll take our little comment that this is broken. So keep our manual impl too. Yeah, I know this is terrible. I know it makes me a terrible human being. So then we do let v = the decoded value .0 and then print v. And now I can go back to my old decode, probably working. Yep — "restore manual impl for old stages of tests". Git push. Ooh, that seems promising. All tests ran successfully, fantastic.
Okay, next stage. The info hash is a unique identifier for a torrent file. It's used when talking to trackers or peers. In this stage, you'll calculate the info hash for a torrent file and print it in hexadecimal format. Extract the info dictionary from the torrent file after parsing — okay, we already did this. Bencode the contents of the info dictionary and calculate the SHA-1 hash of the bencoded dictionary. Okay, so this suggests that we will need to be able to not just deserialize, but also serialize. Not so important for the torrent, but we might as well. This also means that we're going to have to implement Serialize for Hashes. Where's my sort of implement Serialize? Luckily, Serialize is a lot easier. Implement Serialize, this one. Don't be afraid of making wrappers for this sort of thing. It's actually not that bad, especially once you get used to it, and it can give you much nicer types for parsing. So we want to do self.0.len(), so 20 times that, because we want to serialize it as all the bytes basically flattened, right? And I wonder whether we could — wonder if serde flatten would work on Vec, probably not. And then I suppose what we want here is — it's a little awkward, because we're kind of serializing them one at a time, which would make me sad. But I suppose that's really what should happen here. So we're going to do self.0.iter().flatten(), right? That's really how this is supposed to be serialized, right? Like it's a — oh, I guess actually no, it's supposed to be a string, huh? It's supposed to be encoded as a string and not as a list. So we actually want serialize_str — but no, serialize_bytes is what we want, which is awkward, because serialize_bytes wants a single slice. So I guess, single slice — well, we'll do it the stupid way. It's going to be a Vec with capacity 20 times self.0.len(). Isn't there a — I think we can actually just do self.0.concat(). I want to see whether that is true. So I think we can do self.0.concat(), yeah. And then a single slice.
See if it lets me do this. Yeah, nice. So concat is just — you can call it on a vector, and it takes all the elements of the vector and basically iterates over them and collects. So it's sort of like an iter().flatten().collect() type thing. You see the example here for hello and world: if you have an array of "hello" and "world" and you call .concat(), what you get back is those things concatenated together, which is indeed what we want here. So now we implement Serialize. I'm actually interested — if we go back to the spec, what does info hash say? The 20-byte SHA-1 hash of the bencoded form of the info value from the metainfo file. This value will almost certainly have to be escaped. Note that this is a substring of the metainfo file. The info hash must be the hash of the encoded form as found in the .torrent file, which is identical to bdecoding the metainfo file, extracting the info dictionary, and re-encoding it, if and only if the bdecoder fully validated the input, such as key ordering and absence of leading zeros. I see. So this is really saying the info hash should be computed after you've validated that the input is actually accurate. Such as, for example, for path: this path should never have zero strings in it. Key ordering and absence of leading zeros. Keys must be strings and appear in sorted order, sorted as raw strings, right? So there are a bunch of these requirements that we could verify. And I wonder whether the serde_bencode implementation will do that verification for us. Cause you could imagine that it doesn't, right? That when it deserializes a dictionary, it's fine with the keys being in non-alphabetical order. That could very well be. It'll make me sad, but it could be. So just the fact that we were able to deserialize into this does not necessarily mean we're able to serialize back out the same way.
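A tiny std-only sketch of what concat does to a Vec of 20-byte arrays — the shape Hashes holds — producing the single contiguous slice that serialize_bytes wants:

```rust
fn main() {
    // concat flattens a Vec of fixed-size arrays into one contiguous Vec<u8>
    let hashes: Vec<[u8; 20]> = vec![[0xaa; 20], [0xbb; 20]];
    let flat: Vec<u8> = hashes.concat();
    assert_eq!(flat.len(), 40);
    assert_eq!(&flat[..20], &[0xaa; 20][..]);
    assert_eq!(&flat[20..], &[0xbb; 20][..]);
}
```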
I also worry about the Serialize implementation — I wonder whether it's gonna serialize the fields in alphabetical order, the way the format requires. Like, if we look at serde_bencode's serializer — and I wanna look at serialize_map. serialize_map, serialize_key, serialize into a Vec, keys are not allowed to be empty. And then it pushes the key and the value into entries, and self.end ends the map. And I'm guessing end sorts the keys. Okay, great. So it does take care of sorting the keys for me when I write them back out. So I guess the argument here, then, is — about the file that comes in — let me just manually do this, I guess, because it's being annoying. HTML. Let's go ahead and do background black. No, let's do 222. Ooh, I guess they have a color here, huh? Where is this background color coming from? Here, go away. And then let's do color DDD. Dark mode. So one of the advantages, I suppose, of deserializing the torrent file and then serializing it again before hashing is that we ensure that even if the torrent file had, say, dictionaries whose keys were in the wrong order, when we serialize it again, we know we'll be serializing with the keys in alphabetical order — which means we're more likely to get a hash that matches what everyone else has, because they've also sorted their keys. All right, I think that's okay. I think this also doesn't need to be fully exact, so I'm okay with this just doing its own thing. This one I can't make dark as easily. Oh, maybe I can. Let's see, this is gonna be annoying, isn't it? Background color? Yeah, I figured. This one maybe. Background color. Black. Bulk. Aha! Okay, let's make this one 222, because this one I mostly don't care about. And let's do this one. Aha, great. So now I can do here, I can say background 323, and I can also maybe do color DDD. Oh no. This is gonna be real annoying, huh?
Oh, because the highlighting and everything is gonna be super annoying. All right, I think this one just has to be freaking light mode. I apologize in advance, but I'm afraid this one is not gonna do what we want. All right, we'll leave it light. Okay, so we have to bencode, and then we have to calculate the SHA-1 hash. Okay, that seems fine. So let's do: let info_encoded = serde_bencode::to_bytes of t.info — "re-encode info section". And then we want the info hash, which is gonna be SHA-1. SHA1 — oh, I can never remember the SHA-1 interface. Do they also use hex-literal, or do they use hex? They use hex, that's fine. Back to here. So the actual way is you make a hasher. I think there's a faster way to do it — I remember there being a shortcut for "hey, I just wanna hash a single piece of content" — feed it the encoded bytes and hasher.finalize(). So this is gonna be the info hash. And then we do println. How do they want this printed? They want "Info Hash", okay. Info hash, and they want that to be hex encoded. So we'll do hex::encode(info_hash). See if we can get this to run on the sample torrent file. Serialize is not implemented for Info. What do you mean it's not implemented for Info? Ah, it's because I did not import serde Serialize. How about now? Amazing. So we have a hash that ends in a7f — a7f, great. Git diff, commit. That means we are also correctly serializing Hashes here, which is nice. "Compute info hash." Let's try git push again. See what it does. It's running a bunch of tests. All tests ran successfully, congrats. Okay, so we successfully computed the info hash. See if this is gonna be happy. It's listening for a git push. Oh, I might have to refresh this too because of the outage, which means I need to apply my little trick again. Where's the HTML tag — HTML tag, element, this. Thank you very much. Completed, amazing. Next stage. Wow, these smileys look pretty terrifying in inverted colors.
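The stream uses the hex crate for the final step; just to show what that encoding actually does, here's a std-only equivalent using `{:02x}` formatting (hex_encode is a hypothetical helper name, not the hex crate's API):

```rust
// std-only equivalent of hex-encoding a digest:
// format each byte as two lowercase hex digits and concatenate
fn hex_encode(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

fn main() {
    let hash: [u8; 4] = [0xde, 0xad, 0xbe, 0xef];
    assert_eq!(hex_encode(&hash), "deadbeef");
    // 20 digest bytes always become 40 hex characters
    assert_eq!(hex_encode(&[0u8; 20]).len(), 40);
}
```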
Okay, piece hashes. A torrent file is split into equally sized parts called pieces. Each piece is usually 256 kilobytes or one megabyte in size. Each piece is assigned a SHA-1 hash value. On public networks, there may be malicious peers and fake data; these hash values allow us to verify the integrity of each piece that we download. Piece length and piece hashes are specified in the info dictionary of the torrent file, in the following keys: piece length and pieces. Yeah, so we already parsed these. The BitTorrent protocol specification has more information about these keys. In this stage, the tester will expect your program to print the piece length and a list of piece hashes in hexadecimal format. Oh, I mean, this is gonna be easy, right? Because we already have all this stuff parsed. So we should be able to now just do — oh, I guess, yeah, this should just be println. Piece Length is gonna be t.info.piece_length, and then for hash in t.info.pieces — I guess .0. We could implement some convenience functions for Hashes here, but .0 is fine. Println. Oh, I see. It wants us to print "Piece Hashes" and then it wants us to just print each hash hex encoded. Okay, so hex::encode, easy. Easy. "Print piece hashes." Git push. I guess we should get rid of some of these warnings; they're getting annoying. We don't need serialize_map. We don't need serialize_seq. What else is it complaining about? That's just output stuff. And then it's also the debug print of the torrent, which we don't really need in here — just to make the output a little less ridiculous. Let's commit: less noise. Okay, so this one's happy. Next stage: discover peers. Trackers are central servers that maintain information about peers participating in the sharing and downloading of a torrent. In this stage, you'll make a GET request to an HTTP tracker to discover peers to download the file from.
You'll need to make a request to the tracker URL you extracted in the previous stage and include these query parameters: info_hash, peer_id, port, uploaded — ooh, this is a whole lot of things. Okay, so I wanna grab these, and then — we should really start to split this up a little bit more, arguably. Let's go ahead and make our lives a little bit easier and open a lib.rs, which is gonna have this stuff and also this stuff, and I want pub — I'm gonna make all of these pub for now. They're gonna be useful to main. This is mostly just to allow us to split things into more files; I wouldn't need those to be pub otherwise, but it's not as easy to have submodules for binaries. I mean, you can, but it's kind of annoying. And then I guess we'll do, up here, pub use hashes, and then we can grab this one. This may be the main one that we want. And then here — what do they call this crate? bittorrent-starter-rust. Okay. With underscores. And I guess we'll just grab all the types from there. And now I don't think we need serde Deserialize and Serialize in main, because we just have the torrent types right here. So if I run this again, right, it'll complain about hashes because that's now in this. Still works, that's great. And now that we have these in a lib, we could even imagine splitting this up into torrent.rs, which is gonna have — I guess actually all of this, all of that goes into torrent.rs. So we do pub mod torrent. And then we also want a pub mod, I guess, tracker. And you'll see why in a second — it's because I want this type to actually be defined in here too. pub struct TrackerRequest. Debug, Clone. We only really need Serialize — I don't think we need to Deserialize these; we're not running a tracker server. And then in main, I suppose we'll do torrent and TrackerRequest. They're gonna be the main two we need. And so for this one now, we can say info_hash, which is gonna — let's see.
So, the info hash of the torrent: 20 bytes long, and this is not the hexadecimal representation, which would be 40 bytes. Oh, this is the actual [u8; 20]. Okay. peer_id is a unique identifier for your client, and it's a string of length 20 that you get to pick. All right. We'll make all these pub too. The port — so that's a u16. I just happen to know from experience that ports are u16. Uploaded — okay, so that's an amount of bytes, which sort of by definition is usize. Left, also usize. Compact: whether the peer list should use the compact representation; for the rest of this challenge, set this to one. So it's a u8, but really it's a boolean encoded as an integer. Okay, that's fine. All right, so now we have our tracker request. So this is the info hash of the torrent, and it's 20 bytes long. It'll need to be URL encoded — that's fine, the library we're gonna use for this is gonna do that for us. Unique identifier for the client — and so here, I guess we can just use this value over in our main when we make this request, which we'll do somewhere down here shortly. Port — I'm just sort of setting these up for when we later write the code to actually instantiate one of these. Great. The total amount uploaded so far — so that's gonna be zero. The total amount downloaded so far is gonna be zero. The number of bytes left to download — this will be the length of the file. That's fine. Whether the peer list should use the compact representation. The compact representation is more commonly used in the wild; the non-compact representation is mostly supported for backwards compatibility. So we're gonna set it to one; that seems reasonable. So just record that here too. Okay, so now we have a TrackerRequest that implements Serialize. And so let's just see what that request — or that URL — would look like. So, a request to the tracker: it's gonna be a TrackerRequest. Did I not save lib.rs? It might be mad at me.
Rust-analyzer seems like it doesn't really like it when you change the module hierarchy under it. There we go. Fill struct fields, thank you very much. The info hash is gonna be the info hash that we computed — not the hex encoding. That's fine. The peer ID, we're gonna set to what they told us to set it to, with String::from. The port — they told us 6881, so we'll do that. Uploaded is zero because we haven't uploaded anything. Downloaded is zero because we haven't downloaded anything. Left is gonna be the entire size of the file, which we extracted up here as length — so we can stick the length here — and compact we should set to one. Okay, so now we have one of these requests that they asked us to build. The tracker's response will be a bencoded dictionary with two keys. Okay, so we go back to tracker here, and then we do struct TrackerResponse, which is gonna derive Debug, Clone, and Deserialize. Deserialize — no, I wanted all of this. Interval is an integer indicating how often your client should make a request to the tracker. How often? What's the unit for this? Let's go to the spec. Tracker responses are bencoded dictionaries. If a tracker response has a key failure reason — okay. Otherwise it must have two keys: interval, which maps to the number of seconds the downloader should wait between requests. In seconds — that feels important. So that's gonna be a usize. Realistically, it's much less than a usize, right? Like, you're never gonna wait usize::MAX seconds. Arguably this could probably be a u8 or something, but let's do usize. Peers is a string which contains a list of peers. Oh no, one of these encodings again. Each peer is represented using six bytes: the first four bytes are the peer's IP address and the last two bytes are the peer's port number. This is gonna be another one of those fun implementations of a custom data format. Let's do it. It's more fun that way.
So we grab our mod hashes thing and we'll do something similar. So, mod peers. Peers is going to be a vector of SocketAddrV4 — and I know this is V4 because it says four bytes for the IP address. So it's IPv4 and not IPv6. So we're gonna have a PeersVisitor that's gonna produce a Peers. And I guess for the reason why we expected something: six bytes, the first four are a peer's IP address and the last two are a peer's port number. That's what we expect to get. visit_bytes — I assume it's gonna be bytes, because if it's an IP address, then the bytes here are basically arbitrary. It can't be a string; it wouldn't be a valid UTF-8 string, almost certainly. So that means this should be modulo six. chunks_exact(6) — slice_6. And so we try_into [u8; 6] — we actually don't even need to convert it that way, because we know that it is six bytes here. And what we actually want to slice this into is a SocketAddrV4. What are the fields in here? Oh, SocketAddrV4 — I think we need to say ::new. Really? It doesn't wanna give me that. Oh, right, cause I have all these other things here. Okay. PeersVisitor, Serialize for Peers — it's actually gonna be a little bit annoying, this one, because we're gonna have to serialize this as bytes. Yeah, we'll do that one a little bit later. What's it unhappy about over here? Aha. How about now? Oh, I don't have — that's why. I need to import this. Now, am I allowed to do this? Yeah. Okay. So the IP here is an Ipv4Addr. And I wonder — ha, great. slice_6[0], slice_6[1], slice_6[2], slice_6[3]. The port is a u16. So that's gonna be u16::from — that's a good question. The port: is it big-endian or little-endian? Does the high byte come first or last? "And port number, respectively." Does the spec really not say whether the port should be big-endian or little-endian? Why? Why would they not specify the byte encoding of the port?
It's probably network byte order, I agree, which is big-endian — but like, why? But like — [4], and slice_6[5]. And I'm not allowed to use that constructor. Fine. So the reason I use chunks_exact here, and the reason I do modulo — it is a list of peers, yeah, which contains a list of peers, and each peer is six bytes. And so that's why we want length modulo six not equal to zero here, and why we chunk it this way. Yeah, but I don't want ne bytes. Oh, yeah, because ne is native, not network. And we wouldn't want which order it uses to be dependent on the architecture of the machine we're compiling for. That wouldn't be right. All right, I mean, we'll see. Luckily, this is something where, at least in theory, the tests will tell us whether it does it correctly. Okay, so this slicing is gonna be a little bit annoying, because we can't use concat here — it's not a slice of slices of u8. So instead, what we'll do is: single_slice is a Vec with capacity of six times self.0.len(), and then for peer in self.0, we'll do single_slice.extend(peer.ip().octets()), and then peer.port().to_be_bytes(), and then return single_slice. And now main is complaining at 57, because the info hash we get back from finalize is actually this weird GenericArray thing. But I think we can just do this, maybe. Okay, so the type of info_hash over here is — okay, where's my sha1 crate? The result that you get back from finalizing the hash is a GenericArray of u8 and Self::OutputSize. Now the question is, how can I turn — first of all, where does GenericArray come from? That's an excellent question. crypto-common, maybe? What? The file that it claims has this type does not have this type. Someone pointed out — oh, you can just unpack it in the function argument here. So b1, b2, b3 — unfortunately, that doesn't work, because it's not an array, it's a slice.
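The decode and encode just described can be sketched std-only, without the serde visitor plumbing (parse_peers and encode_peers are hypothetical helper names; the assumption, as discussed, is that the port is big-endian):

```rust
use std::net::{Ipv4Addr, SocketAddrV4};

// Decode a compact peer list: each peer is 6 bytes —
// 4 for the IPv4 address, 2 for the port in network byte order (big-endian).
fn parse_peers(bytes: &[u8]) -> Option<Vec<SocketAddrV4>> {
    if bytes.len() % 6 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(6)
            .map(|s| {
                SocketAddrV4::new(
                    Ipv4Addr::new(s[0], s[1], s[2], s[3]),
                    u16::from_be_bytes([s[4], s[5]]),
                )
            })
            .collect(),
    )
}

// Encode back: ip.octets() then port.to_be_bytes(), into one flat Vec<u8>
fn encode_peers(peers: &[SocketAddrV4]) -> Vec<u8> {
    let mut out = Vec::with_capacity(6 * peers.len());
    for peer in peers {
        out.extend(peer.ip().octets());
        out.extend(peer.port().to_be_bytes());
    }
    out
}

fn main() {
    let raw = [192, 168, 0, 1, 0x1A, 0xE1]; // 192.168.0.1:6881
    let peers = parse_peers(&raw).unwrap();
    assert_eq!(
        peers[0],
        SocketAddrV4::new(Ipv4Addr::new(192, 168, 0, 1), 6881)
    );
    assert_eq!(encode_peers(&peers), raw); // round-trips
    assert!(parse_peers(&raw[..5]).is_none()); // not a multiple of 6
}
```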
I think maybe I could do, like before, p1, p2, dot dot. But the problem is this is a fallible pattern, because if the slice is shorter, this pattern wouldn't match. So that will only work when we move to array_chunks — and then we can also pattern match in closure args. So, unfortunately, that doesn't work. Oh right, this has to be torrent::Keys now. Yeah, so the problem is: the thing you get back from finalizing a SHA-1 hash is this GenericArray type. And what we want is an actual array, not this weird wrapper type GenericArray. And doing that is a little bit annoying, in part because I can't find the GenericArray implementation. It makes me wonder whether — can I write [u8; 20] here? Is it gonna let me do that? No, it's not, is it? But I don't understand where this GenericArray thing is, because that's from crypto-common. All right, give me crypto-common then, please. Does this have a GenericArray? That comes from the generic-array crate, okay. Core Rust can't be used generically with respect to N — so, I see, this crate preceded const generics? Okay, is there a way for me to do it now that that's no longer true? Yeah, there's not, is there? So, 0.14.5, 0.14.7 — please tell me one of them added a way to turn it into an actual array. generic-array — no, this is all slice stuff. See, my guess is generic-array version one does allow you to expand one of these. If I do this, surely this uses const generics to give you back the actual — yeah, into_array. I want this one, but we're not using the latest one, are we? generic-array, generic-array. Yeah, they're using generic-array 0.14, and 0.14 does not have this method. Okay, I'm surprised that they haven't added a converter to this old version of the crate. Like, that's fine, it's fine. It's fine, it's fine, it's fine, it's fine. We'll just do .try_into().expect — this is the GenericArray, 20 bytes. Expected [u8; 20], found a reference. Oh, that's because this needs to go away now.
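The try_into trick works because the GenericArray from the sha1 crate derefs to a slice, and a slice converts fallibly into a fixed-size array; a std-only sketch of that conversion:

```rust
use std::convert::TryInto;

fn main() {
    // A &[u8] slice converts to a fixed-size array with try_into,
    // failing at runtime if the length doesn't match — the same move
    // works on a hash digest once you view it as a slice.
    let slice: &[u8] = &[1, 2, 3, 4];
    let arr: [u8; 4] = slice.try_into().expect("slice is exactly 4 bytes");
    assert_eq!(arr, [1, 2, 3, 4]);

    // Wrong target length: the conversion returns Err instead of panicking
    let too_long: Result<[u8; 20], _> = slice.try_into();
    assert!(too_long.is_err());
}
```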
Great, what do they actually want us to do? All right, this is just so that we can correctly parse peers. And so now we have a new peers command. It presumably also takes a torrent file and expects to get a list of all the peers. All right, peers. Command::Peers, torrent. I guess just for our own sanity's sake, we could now do impl Torrent and give it a method to compute the info hash. pub fn info_hash(&self) returns you a [u8; 20]. And so now I should be able to do info_hash is t.info_hash() over here. And as a result, I can do the same thing now over here. Why is it t? Oh, right, because we still need to actually parse the file. That's fine. And then the request, I guess I don't need to do yet, but they're gonna ask me to do it at some point. And I guess I don't need the info hash either in this command. There's gonna be some later one where I'm gonna need this. That's fine. And so here what they want me to do is for peer in t.... Oh no, they do want me to make a request actually. Nevermind. So the request is gonna be this, the, I guess the tracker response is going to be, request URL from, and I want to use t.announce, right? So I'm basically constructing a URL to the tracker. And so that's based on the announce URL in the torrent file. .query_pairs_mut, query_pairs_mut, tracker_url is gonna be this. And then I guess what I want to do here is pairs is this. I'm pretty sure there's a better way to do it. I don't think this is actually what I wanna do. I think what I wanna do is tracker_url.set_query. And I want the URL params to be using the serde_urlencoded crate, which lets you take any type and URL encode it, which is what we have, to_string of request. Wrap, I guess, context, URL-encode tracker parameters. And now I should be able to set that to url_params. And then I should be able to do a reqwest get of tracker_url. Oh, this is, let's use the sync one for now.
Let response is equal to that, and then if we have the, oh, why don't I get context anymore? Use anyhow::Context, but I have Context. Something's wonky. Oh, it's cause, okay, that's fine. This guy needs to go over in this file. Oh, right. This guy arguably now needs to start returning a Result. I'm gonna just expect here. Should be fine. Some shortcuts are legal. And right, and then this is gonna be try_into().expect, GenericArray, I guess I can just steal the same one I wrote over here. Like so. Oh, I guess they haven't brought in the, fine. So we're gonna make this whole thing now be tokio::main and an async fn. Because now this has to do a, it should be reqwest. So await, context, fetch tracker. Expected Url, found String. Cannot find length in this scope. Oh, I see. Okay, fine. Still need to do this. Cause they're still assuming that the file is single, or that the torrent is single file. That's easy enough. Expected Url, found String. Do they have a new instead? They do not. docs.rs, reqwest. Yeah, that's fine. But how do I create a, can I just make myself a Url instead? Parse, parse. Parse tracker, now the URL. Expected &str, found String. All right, reference to that. Beautiful. So I guess now we want peers. Unsupported value. Interesting. Time to dig into that one. So this is saying when trying to URL encode the tracker parameters, this one it's not happy with for whatever reason. Huh, why? Why, why, why, why, why, why, why, why? I wish it would give me more context here. Unsupported value is not particularly helpful. It's probably the byte array, yeah. So if we do serde_urlencoded, urlencoded, my bad. to_string. Actually, I want to see, give me, give me, give me, give me the repository. Then I want to search for, let's see here, I got unsupported value. Okay, I think that's just anything that, right, this gets called for anything that's not one of these. So it can serialize strings, static strings and strs and some. Really? No, it specifically said it's not hex encoded.
So look over here, note this is not the hexadecimal representation. It is a URL encoded version of the bytes, which I would hope that the serde_urlencoded crate would do for me. But I suppose not. Which I guess means it does not encode byte values for me. Does it support serializing bytes? It only deals with UTF-8 strings. Look, someone else is implementing a BitTorrent HTTP tracker, which have already been percent encoded. This sounds like we could actually just, I mean, we could just URL encode it ourselves, I suppose. I just kind of was hoping that we didn't have to. Cause it gets real messy because now, like what even should the type of this be? Here's what I think we have to do. We have to do, what's the serde trick for this? Field attributes, serialize_with. serialize_with equals a path, copy. serialize_with equals urlencode. And so then we're gonna do this. And I think this has to be a module. Oh no, a function, okay. So we can do this. And it has to take a serializer and it has to take a T, which is going to be one of these [u8; 20]s. And the way we're gonna do that, I suppose, is serializer.serialize_str. And we're gonna serialize_str. Encoded is gonna be URL. Do they even give me a handle for this? They just kind of claim that I can use serde_urlencoded for this, but I don't think that's actually true. I think what we have to do here is, yeah, I suppose we can just manually encode it. It just feels really awkward, right? Like, okay, mut encoded is a Vec with capacity. So every byte is going to end up being two bytes plus a percent. So three times t.len(). And then, you know, for byte in t, encoded.push percent, encoded.push the hex-encoded byte. I guess String::with_capacity. And encoded.push_str the hex-encoded string, which just feels dirty. The trait to ask for if it's not implemented for you. That's fine, okay. Like, this just feels gross. Oh boy, what? Oh right, this should be fine. That's gross, that's gross is what that is.
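The manual encoding being typed out here comes down to something like this sketch, using `format!` in place of the hex crate the stream reaches for:

```rust
// Manual URL encoding of the raw info hash bytes: percent-encode every
// byte as %XX. serde_urlencoded only handles UTF-8 strings, so we build
// the string ourselves; three output characters per input byte, hence
// the capacity of 3 * len.
fn urlencode(bytes: &[u8; 20]) -> String {
    let mut encoded = String::with_capacity(3 * bytes.len());
    for &byte in bytes {
        encoded.push('%');
        encoded.push_str(&format!("{:02x}", byte));
    }
    encoded
}
```

Percent-encoding every byte, including ones that would be legal unescaped, is always valid, just slightly longer than strictly necessary.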
But it works, like we get the response back, and then I guess we want the actual response is gonna be a TrackerResponse. It's gonna be response.... Oh, forgot the question mark. response.bytes(). And then we're gonna do serde_bencode from_bytes of the response. Ooh, Parseltongue over here. Parse tracker response. And that's gonna be a TrackerResponse. And then they want for peer in response.peers, println! peer.ip and peer.port. Wait, did we not? Did I not mark it as pub or something? I did not, did I? Pub, pub, pub. And this is a Peer, this is a Peers. So pub use peers::Peers. Great. And we're almost there because this peers has to be response.peers.0. Missing field interval. Is interval required? I feel like the response we're getting here is wrong somehow. Failure: provided invalid info hash. Yeah, I mean, of course it's the invalid fucking info hash because I guess the encoding did not actually work. All right, what is the actual URL we end up setting here? info_hash. Yeah, see, it ends up URL encoding the thing that we outputted. So we URL encode the bytes, and then the URL crate URL encodes our URL encoding of the bytes. But we can't give it just the bytes because then, okay, here's the way we're gonna do this. Note the info_hash field is not included. This is very stupid. This feels very stupid. So there's not gonna be an info_hash field. Instead we're gonna do this, and then we're gonna do tracker_url.query_pairs_mut().append_pair. info_hash. urlencode. And this is no longer gonna be taking a serializer. It's just gonna return a String. And it still makes me very sad. And the thing it's gonna do is urlencode t.info_hash(). All right, I guess we already have it, so info_hash. Yeah, thanks, I hate it, is right. And it's no longer generic over S either. Provided invalid info hash. It's still getting URL encoded. Why? Is there like an append? What do we have here? Extend, encoding_override. Oh boy. Am I really gonna do this? Is there really not just a, just a, okay.
reqwest Url, show me what you got. So we have query_pairs_mut, which gives us back a Serializer. Manipulate this URL's query string, viewed as a sequence of name/value pairs in application/x-www-form-urlencoded syntax. Yeah, I can append Unicode, but that doesn't help me. Query, these are, these all require strings. And Serializer is not a public type. Okay, where does, that comes from form_urlencoded. form_urlencoded, Serializer. Set the character encoding to be used for names and values before percent encoding. I don't understand, why isn't there a way for me to just, I don't know, why isn't there a way for me to just set bytes in the URL? Bytes. Allow non-UTF-8 key and value pairs. Okay, that's not super helpful. No, the encoding override is used instead. I guess the question becomes, okay, okay, okay. encoding_override, Some. No, because the input to this still has to be a string and it's not a valid string. That's sort of the whole point. Okay, so that's not gonna work. Yeah, I suppose the way we have to do this actually is then just tracker_url is going to be format! of t.announce and url_params. Yeah, url_params and this. Now, the reason this is awkward is because if there are query parameters in t.announce, this won't do the right thing, because you would end up with multiple question marks in the URL. It cuts off the entire query string. That's wild. Okay, so I guess we can just not go through parse. Just don't parse it, just pass it straight to reqwest::get. Okay, so if you pass it to get, which presumably has to turn it internally into a Url, then it's okay with it. But if you pass it via Url::parse first, then it doesn't work. This is just wrong. Like this is just wrong and this should not be the case. Oh boy. I... Yeah. Yep. Okay, fine. Well, it does the thing. It does the thing. This, that's just, this, okay. Great. Thank you. All right. Well, we did the thing, at least in theory. 178.62, with port ending in, wait.
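The workaround the stream settles on, spliced together as a sketch; the helper name and parameters are mine, and it carries the caveat discussed above that an announce URL which already has a query string would end up with two question marks:

```rust
// Build the tracker URL by hand, since going through the url crate's
// query_pairs_mut would percent-encode our percent signs a second time.
// url_params is assumed already serialized (e.g. by serde_urlencoded) and
// info_hash_encoded already percent-encoded.
fn tracker_url(announce: &str, url_params: &str, info_hash_encoded: &str) -> String {
    format!("{}?{}&info_hash={}", announce, url_params, info_hash_encoded)
}
```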
The port is wrong. The port is fucking wrong. Look at it. The port here, five, one, four, seven, zero. The port here, five, one, four. Wait, the first, what? Oh no, they're just in a different order. Why are they in a different order? Okay, so we are parsing the ports right. So it is big-endian. Okay, I was about to say, but why are they in a different order? Are they in a different order each time? Split files and split modules and contact tracker. Contact tracker. See what happens if I push. The values are random in the instructions. Okay. Well, I guess we'll see. I would just like assume that they shouldn't need to be ordered. All right. Well, I guess we did it. Nice. Okay. Peer handshake. In this stage, you'll establish a TCP connection with a peer and complete a handshake. The handshake is a message consisting of the following parts, as described in the peer protocol. So that's also further down here, the peer protocol. Okay, let's close some of these just because otherwise they're gonna continue to make me sad. Okay. The length of the protocol string. Peer connections are symmetrical: messages sent in both directions look the same, and data can flow in either direction. The peer wire protocol consists of a handshake followed by a never-ending stream of length-prefixed messages. The handshake starts with character 19 followed by the string 'BitTorrent protocol'. The leading character is a length prefix, put there in the hope that other new protocols may do the same and thus be trivially distinguishable from each other. All later integers sent in the protocol are encoded as four bytes big-endian. So here they specify endianness. Okay, great. After the fixed headers come eight reserved bytes, which are all zero in current implementations. If you wish to extend the protocol using these bytes, please coordinate with Bram Cohen, okay, to make sure all extensions are done compatibly. Next comes the 20-byte SHA-1 hash of the bencoded form of the info value from the metainfo file, okay.
This is the info hash, except it's raw instead of encoded, okay. If both sides don't send the same value, they sever the connection. Okay. So this is fine. So the handshake, let's go back into here. We'll go to lib. We'll do a pub mod handshake, or peer. Peer is good. Like peer, peer, peer. pub struct Handshake. Now this one I actually think is going to be kind of interesting. So I think what I want to do here is actually maybe I want to do something quite unsafe here, which is I want repr(C), and then I'm going to say this is going to be a length, which is a u8. It's bittorrent, which is a [u8; 19]. It is eight reserved bytes, [u8; 8]. Then it is a SHA-1 info hash, info_hash, which is [u8; 20]. And then it's a peer_id, which is a [u8; 20] also. Ooh, what did I even do there? Okay, these are all pub. Nope, semicolon. Okay, so now in my main, what's the, okay, so there's a new handshake command now, and handshake is supposed to take a torrent and a peer, where the peer is peer IP and port. That's fine, I'll leave it a String. So if we get a handshake, we do this, we do that. We do peer is peer dot split_once on a colon. Actually, we can just do, I think SocketAddrV4 lets you parse the peer address. Oh, there's a Url parse_with_params. Interesting. That's fine, I'll leave it the way it is now. Okay, so we parse out the peer and then we just do a straight TCP connection to it, okay? So peer is, I guess, tokio::net::TcpStream. Connect to peer, connect to peer. And then we wanna do peer dot send, and this is where it all gets tricky. So handshake is gonna be a Handshake, like so. rust-analyzer really doesn't like when you make new modules in the middle of writing. Now it's gonna tell me where it is, I think. lib, oh, it's cause I don't even import it, peer::Handshake. There we go. The length is gonna be, it should always be 19.
So I guess actually to make this a little bit nicer for ourselves, we'll do impl Handshake, pub fn new, and it's gonna only take info_hash, which is a [u8; 20], and peer_id, which is a [u8; 20]. It's gonna return a Self, and the length is gonna be 19. bittorrent is gonna be the string 'BitTorrent protocol'. And to be clear, like the reason why they do this, right, the reason why this length here is hard coded but is still present, is so that you could have other strings of different lengths and you can still decode them like this. Now, given that we know this is always 19, we could just do it that way for right now. Zero, eight. We could write a full parser for this that like does a length-encoded string decode, but because we know it's 19 and it's fixed length, how about we just do it nicely? So we do Handshake::new, and what's the, right. So the peer_id, we can use this thing, and the info hash we already have. The reason I make it mut here is because I actually want to use this to read the thing back out into it. You'll see in a second. So I want to write, and I guess I gotta use tokio::io::AsyncWriteExt. write_all, and I want to do a &handshake as a [u8] of length size_of Handshake, right, handshake. Now this is almost certainly unsafe. It should be at least, all right. A star of that, something here, a star of that. mem::size_of, this is not a runtime argument. This is a compile time argument. Non-primitive cast, yeah. So this is where it'll actually be. This as const, to this, and then it'll be unsafe. Well, all right, let's do this a little bit nicer. So handshake_bytes is going to be this, and then this is going to be unsafely turning that back into handshake_bytes. Casting reference to Handshake as, all right, I need to do as const Handshake to cast it all the way. And it's upset about my SocketAddrV4. I thought there was a parse. Is there really not? I think it's, then I think it's peer.parse. I'm pretty sure this supports like FromStr. Yeah. write_all dot await.
So what I'm doing here is I'm constructing one of these Handshake things, and because the struct is repr(C), that means that I can treat it as a byte array. It's not really what it means directly, but indirectly it means I can treat it as just a series of bytes. Like I just take the bytes that are in the in-memory representation of the struct, and then I take that and just cast it into a byte array of the appropriate size. And that thing I can then cast back into a reference to a byte array instead of a raw pointer or reference. And that in turn I can then pass to write_all, which takes a slice. And now it's upset that this needs to be mut, which is fine. And I guess, actually I guess that's a good point. I need to find a valid peer. So handshake to this thing. Oh right, okay, great. So I wrote it, but I didn't get anything back. So now here's what I wanna do. I want to do peer dot read. handshake_bytes is gonna be a mutable reference to a [u8] of this length. So I'm just turning handshake_bytes into a mutable reference over the same thing. I'm passing that into write_all, and then I'm gonna pass it into read as well. So I want, I guess, ReadExt as well. AsyncReadExt, and then I wanna dot read_exact into handshake_bytes. Dot await, dot context, read handshake. And so now I want to do some assertions on this thing. Like I wanna assert that the, I guess I'll drop the handshake_bytes thing, it's only valid within here. And now I wanna check that the handshake, which remember I passed in as a mutable reference into this thing. So after the read, the contents of this struct should now be the bytes that we got from the other end, the other side's handshake. So it should still be the case that the length here is 19. It should still be the case that bittorrent reads whatever bittorrent was supposed to read, this thing. And it should also be the case that reserved is equal to [0; 8], right? Casting, oh right, this has to be mut now because I want a mutable.
Can't compare [u8; 19] with [u8; 19]. That seems like a lie. One, two. Oh, I forgot parentheses. Assertion failed. Oh, they do set reserved bytes. The reserved bytes are supposed to be all zero. Why are the reserved bytes not all zero? That's kind of interesting. We certainly set them to all zero. I know extensions use them, but this Code Crafters peer presumably doesn't, but that's fine. Okay, we can just ignore them, I suppose. And then what they want us to do is print out the peer ID hex encoded. Okay, so println!. And I guess this actually means that we don't need the length in this particular exercise. Peer ID is to be hex::encode handshake.peer_id. Now, the only reason why this casting here is actually okay, I should probably write a safety comment here. So Handshake is a POD with repr(C). A POD here means plain old data, which means that any byte pattern is valid. So you see the types in here are all valid for any byte pattern. So there is no sequence of bits that would make any of these types have an invalid value or a value with undefined behavior. All their values are valid. Okay, so basically no matter what the other end sends us back, any bit string is a valid interpretation of this struct. Great, so if I now run this, I get back a peer ID. I wonder if it's happy with that, like print peer ID. Push, let's see. Yes! All right, so we correctly handshook with a peer. Next stage, download a piece. Oh, this one's rated hard. All the other ones have been easy, supposedly. In this stage, you'll download one piece and save it to disk. In the next stage, we'll combine these pieces into a file. To download a piece, your program will need to send peer messages to a peer. The overall flow looks like this: read the torrent file to get the tracker URL. Perform the tracker GET to get a list of peers. Establish a TCP connection with a peer, perform a handshake, okay, we've done that. Exchange multiple peer messages to download the file.
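The repr(C) handshake struct and the byte-array view from the handshake stage can be condensed into a std-only sketch; field names follow the stream, and the cast is sound precisely because the struct is plain old data with no padding:

```rust
// repr(C) with only u8 fields: defined field order, no padding, and every
// bit pattern is a valid value (POD), so viewing it as bytes is sound.
#[repr(C)]
struct Handshake {
    length: u8,
    bittorrent: [u8; 19],
    reserved: [u8; 8],
    info_hash: [u8; 20],
    peer_id: [u8; 20],
}

const HANDSHAKE_LEN: usize = std::mem::size_of::<Handshake>();

impl Handshake {
    fn new(info_hash: [u8; 20], peer_id: [u8; 20]) -> Self {
        Self {
            length: 19,
            bittorrent: *b"BitTorrent protocol",
            reserved: [0; 8],
            info_hash,
            peer_id,
        }
    }

    fn as_bytes(&self) -> &[u8; HANDSHAKE_LEN] {
        // Safety: Handshake is a POD repr(C) struct of u8 arrays, so it has
        // no padding and the same size as the byte array we cast it to.
        unsafe { &*(self as *const Handshake as *const [u8; HANDSHAKE_LEN]) }
    }
}
```

Returning a `&[u8; 68]` rather than a raw pointer means the result can be passed straight to anything taking a slice, like `write_all`.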
Okay, peer messages consist of a message length prefix, message ID, and a payload of variable size. Here are the peer messages you need to exchange once the handshake is complete. Wait for a bitfield message from the peer indicating which pieces it has. The message ID for this message type is five. You can read and ignore the payload for now. The tracker we use for this challenge ensures that all peers have all pieces available. Send an interested message. Wait until you receive an unchoke message. Break the piece into blocks of 16 kilobytes and send a request message for each block. Interesting. Okay, so this doesn't seem too bad. This doesn't seem too bad at all. So there are a couple of ways we could do this. We could either actually write out a codec. Maybe that's actually what I want to do here. Cause presumably there's like a format for each of these messages here too. I guess interested has no payload, but we can actually use a tokio codec for this. So tokio codec basically lets you, this crate is deprecated and has moved into tokio-util's codec. Okay, that's fine. Yeah, so this is a way to convert between an AsyncRead and AsyncWrite and a Stream and a Sink. So a Sink is a thing where you can send values in, and a Stream is a thing where you can take values out, that are well typed. Whereas AsyncRead and AsyncWrite only work on the sort of byte level. This does tend to make the protocol a little less efficient, because you go via this Sink thing. So you don't get a, it makes memory management a little bit less nice, but at the same time, it can be very, very convenient. You get a much nicer interface on top of the protocol. So I think what we'll do here is, so let's go to peer. So there's going to be something like serde, we can do this with serde, that'd be cool too. And I think we probably can. So what I really want to express here, right, is something like enum Message, or PeerMessage or whatever it is, but Message because it's in the module peer.
And then we have, what was the example? So bitfield is one message type. And what I really would like to express is something like, I guess, MessageType. And then we do something like repr(u8). And Bitfield is message ID five. All right, so we can encode all of these. Like Interested is two. Unchoke is one. Request is six. Piece is seven. And who knows what the other things are. Okay, so we have one, two, five, six, seven. So those are the sort of message, I guess MessageTag is really the appropriate name. And then I want to say enum Message. And then I suppose what I really want to say, right, is that the, where's the serde? docs.rs is bright. I'm apologizing a little bit too late. So bitfield, that, no, container attributes. Tag, enum representations. So for tags, you can say, this is like the name of the field that holds the tag, for which variant you have. Now, this is a little weird in this protocol, because there's not a named tag, really. And I think this is going to be a little bit weird in serde land. So I guess then what we really have is a struct Message, which has a tag, which is a MessageTag, and a payload, which is really just a Vec of, you know, u8s. Yeah, I think that's, I think, I think what we'll do here is we'll do a codec and just see how that turns out. So let's write a little encoder and decoder. So we'll grab both the encoder and decoder in the same thing here. This all goes at the top. Am I bringing in the same thing multiple times? No, I'm not. Okay, so what I want is MessageDecoder. So implement Decoder for MessageDecoder. And the thing MessageDecoder is going to produce is a Message. So it's going to be given a bunch of bytes. And we know that in this protocol there's always a length prefix of four bytes and a message ID of one byte. So it should be at least five. And then I guess the length, let's read the length marker first. Let's do five, marker plus tag.
So then we return Ok(None) to say there's not enough data available for us to decode a full message. And then we read the length bytes out of the slice. Does it say whether they're big-endian or little-endian this time? It did, right? Peer messages. Oh, here are the other ones listed. Let's stick those in there while we're at it. NotInterested is three and Have is four and Cancel is eight. Yeah, I think it said, or, all later integers sent in the protocol are encoded as four bytes big-endian. Okay, great. So that means over here, you could say from_be_bytes. So that's the length. I'm not too concerned about this one, but I guess we know that the payloads you can get, I guess piece is probably the thing that I'm just gonna derive, and choke. So pieces are, I think they said somewhere the pieces are not larger than, yeah, 16 kilobytes, usually two to the power of 14 bytes. So max here is actually gonna be 1 << 16. Or I guess 14, but it's 1 << 14, but it's plus the headers. And I also don't know that we want to use the max here. So let's do 1 << 16. Great. If the source length is, does the length include the mandatory tag? Okay. Yep. Messages of length zero are keep-alives and ignored. Okay, so there are messages of length zero, which means the messages of length zero have no tag, which means that the tag length is included in the length. Great. And here we can just keep the stuff from the example encoder, that's fine. So this is, if the bytes, so we're given a byte buffer here, right? If the byte buffer's length is shorter than how long the length field says the package should be, then we reserve some more space and then we say we don't have enough data yet. Otherwise, that means we have enough data. We have all the data that's supposed to be there according to the payload length in the header. And so therefore we extract out everything following the length. And I think actually what I want to do here is, I want to grab the tag separately. So I want tag is source[4].
And then I want data is five to five plus length minus one, which is four plus length — not four plus length minus one. The reason I want to do this is because the Vec that we extract here, otherwise that Vec would have the tag at the beginning, and then we would need to allocate a new Vec for all the actual data bytes, which we would then need to copy all the bytes into, which just seems unnecessary. So we advance it by four. We don't want to convert it to a string. Instead, what we want is at this point, an Ok of a Message with tag, and the payload is data. So this is just the framing protocol really. Expected Option. Oh right, Some, because we successfully decoded the next thing. Oh, did I grab the wrong code for the encoder? I think I did. Ah, example encoder, that's what I wanted. The payload data contains the tag. No, it does not contain the tag. We pull that out first. Oh, Choke is equal to zero? Did I miss one of the? Oh, Choke is equal to zero. How about that? Choke is equal to zero. Four. So this is then down here. This is gonna be MessageEncoder. And really we could just call this whole thing MessageFramer, right? Because that really is what it is. It frames for both decoding and encoding. And this can encode anything that is of type Message. So Item here has to be Message. And really we could do even better here actually, which is to say, this is gonna make our lives a little bit easier in the future. So we could have a trait here, a message trait that we implement for multiple different types. Because, let me backtrack a little bit. So the awkward part here, right, is that the tag that we pull out is separate from the payload, which is a Vec<u8>. So we're gonna need to have some code somewhere that like matches on tag. And depending on the tag, deserializes the payload. In reality, which tag you get dictates what type the payload should have. So it would be kind of nice if there was a way to express that.
But there's not an easy way to do this with serde, because it sort of expects fields to have names, which isn't really true here. I suppose we could make Message generic over T, and then have T be some kind of enum, doesn't feel quite right. Yeah, I think we're just gonna keep it this way and then we'll just, because this is also a sort of test implementation, if you will, I think it's okay for us to end up with that match. It's okay, I won't do the trait thing here either. Don't send a message if it's longer than the other end will accept. That's fine. If item.payload.len() plus one, because of the tag. Or did we count the tag here when we check the length? Yeah, we do. So if this plus one is greater than the max, we're gonna refuse to send it. The length is gonna be item.payload.len() plus one, as u32, to_be_bytes. Four, so that's the length; plus one, that's the tag; plus the length of the payload. So we want to reserve space in the buffer for all the bytes we're trying to send, and then we extend it with the length slice. We also, is there a push or a put? Put, yeah, put_u8 of item.tag, and then the payload as bytes. So that's our encoder, that's our decoder. So now we have this framed thing. So we should now, we should now over in, no, over in main. Oh, what's the actual command they want us to add? So the command is download piece. All right, download piece. It's supposed to be arg short, output is a PathBuf. Torrent is a PathBuf. What's the zero here? Oh, this is, and then which piece, I guess, great. So if we get a, it's a little annoying to split these into so many sub parts, but it's okay, download piece. We could also pretty easily start to refactor this so that these pieces of code are just like contained elsewhere, but I don't wanna do that yet. Torrent, output and piece. So we still need to parse the torrent file. We still need to grab the info hash. We still need to do the handshake. We do the handshake.
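The encode half of the framer just described, as a std-only sketch that builds a `Vec<u8>` instead of writing into tokio's `BytesMut` (the function name and error type are mine):

```rust
// Upper bound on a frame, per the stream: blocks are at most 2^14 bytes,
// so 1 << 16 leaves plenty of headroom for headers.
const MAX: usize = 1 << 16;

// Encode one frame: a 4-byte big-endian length that counts the tag byte,
// then the tag, then the payload.
fn encode_frame(tag: u8, payload: &[u8]) -> Result<Vec<u8>, String> {
    // the length field counts the tag, hence the + 1
    if payload.len() + 1 > MAX {
        return Err(format!("frame of length {} is too large", payload.len() + 1));
    }
    let mut dst = Vec::with_capacity(4 + 1 + payload.len());
    dst.extend_from_slice(&((payload.len() + 1) as u32).to_be_bytes());
    dst.push(tag);
    dst.extend_from_slice(payload);
    Ok(dst)
}
```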
I suppose we don't actually know which peer to connect to. Should we just connect to a random peer? I guess we can connect to as many peers as we want, really, or we could connect to a random peer. Yeah, all the peers have all the data. So I think instead, what we'll do here is, we'll just connect to a single peer for now. So we'll do t.info dot, wait a second. No, we need to, right, we need to string all this stuff together. Like, it's awkward that we have to support all the previous test cases still, because otherwise I would just not keep this old code around and have to copy paste it, but fine. So we do this: parse the torrent file, compute the info hash, send the tracker request. That gives us back the response. And then I want this to have a better name than response. This is really tracker_info. And then peer is going to be tracker_info.peers.0. We're just gonna pick the first one, then we connect to the peer, we do the handshake. Now we have the peer ID. Okay. And so now the next step arrives, which is, when you have one of these codec things, what you can do is, where's the example of calling it here? So now I can do this. I can create a buf. I can take the peer connection, read into the buf. Wait, no. Yeah, I don't want the equivalent of the following loop. I want the actual framed, where's Framed? Here, this is what I want. So I want a, I don't want all this ugly code. The tokio codec takes care of that for me. So peer is now going to be tokio-util. Can I add tokio-util? Any changes here will not reflect when Code Crafters tests your code. Really? I don't get to use the tokio codec here? I can add things, but it says changes here will not reflect when it tests your code. We added support last week, you can edit and save. All right, I will trust this random person in chat. So tokio codec, no, tokio-util version 0.7. If they added this, it makes me very happy. Are there feature flags? Yeah, let's do full, that seems fine.
All right, so tokio_util::codec::Framed::new. And I want then the peer, and then I want handshake, and MessageTag, MessageTag, Message and MessageFramer. And where's my MessageFramer? Over here. There's no real need for it to have fields. Now I have a Framed thing over the peer, and now I should be able to just use that as a Stream and a Sink. As you see, it implements, now, Framed implements Stream and Sink. And this means that I should be able to now go back to, let's see, where's the, so the Stream trait and the Sink trait, it gets from futures-sink and futures-core. Okay, so let's pull those in here. So futures-core 0.3 and futures-sink 0.3. And now, if I remember correctly, I should be able to just do while let, I guess I know that I'm first supposed to wait for a message. So I'll do a message is peer.next().await. Let me see if this even does anything useful. Cargo.toml, tokio-util. Peer, let's go bytes, bytes, BytesMut. Okay, this as u8, that's easy. Payload and, on len. Right, this doesn't need to be to_vec. And this should now be, right, this is where this gets awkward, because any arbitrary byte is not valid as a MessageTag, but any MessageTag is valid as an arbitrary byte. So what we really need to do here is we need to match on this byte, and we need to say that, really what I want is like a MessageTag TryFrom here, but I don't think you automatically get one. So I'm actually going to do this mapping manually, which is, there's a crate that helps with these that basically lets you derive the TryFrom here. And having this mapping in two different places is obviously not ideal, because it means that you might accidentally update one but not the other, and they get really hard to debug problems as a result. All right. And then if I hit anything else, I guess I will return a one of these. Oops, unknown message tag, message type tag. So over here too, it's unhappy. It doesn't need as_bytes because this is already a byte slice.
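The manual tag mapping being written here looks roughly like this sketch; `repr(u8)` pins each variant to its wire value, and the hand-written `TryFrom` is the match the stream does manually (crates like num_enum can derive this, at the cost of the duplicated mapping living in a macro instead):

```rust
use std::convert::TryFrom;

// repr(u8) so each variant's discriminant is exactly its wire value.
#[derive(Debug, Clone, Copy, PartialEq)]
#[repr(u8)]
enum MessageTag {
    Choke = 0,
    Unchoke = 1,
    Interested = 2,
    NotInterested = 3,
    Have = 4,
    Bitfield = 5,
    Request = 6,
    Piece = 7,
    Cancel = 8,
}

// Any MessageTag is a valid byte, but not any byte is a valid MessageTag,
// hence TryFrom rather than From.
impl TryFrom<u8> for MessageTag {
    type Error = String;
    fn try_from(value: u8) -> Result<Self, Self::Error> {
        Ok(match value {
            0 => MessageTag::Choke,
            1 => MessageTag::Unchoke,
            2 => MessageTag::Interested,
            3 => MessageTag::NotInterested,
            4 => MessageTag::Have,
            5 => MessageTag::Bitfield,
            6 => MessageTag::Request,
            7 => MessageTag::Piece,
            8 => MessageTag::Cancel,
            t => return Err(format!("unknown message type {}", t)),
        })
    }
}
```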
My StringFramer — this should be MessageFramer. Aha, we're getting there. We're getting there. This needs to be pub. This needs to be pub. This needs to be pub. This needs to be pub. This needs to be pub. Main, 196. Next. All right. So now I want futures-util, because I want use futures_util::StreamExt and SinkExt. Really, I'm pretty sure there's supposed to be a SinkExt too, but let's do StreamExt for now at least. Cannot borrow peer as mutable — that's easy to do something about. Amazing. Okay, so now we get a message, and we should be able to do, like, an eprintln! of message.tag. So let's see what message tag we get. Right, so this is gonna be — it's an Option. So we're gonna unwrap, because we know we're supposed to receive a message. I guess, yeah, we're gonna do expect: peer always sends a first message — always sends a bitfield. And then context as in: if we do get a message, it should be a valid one — peer message was invalid. MessageTag cannot be formatted with a default formatter; that's easy to fix. Message can derive Debug and Clone. And MessageTag can even derive Copy, but not Display. That's fine. See what we get. Oh, right. This is because I'm calling handshake. I need to call download_piece. I don't know, foo. Actually, do they give, like, a — oh, they just use a temp file. Okay, so if I do this — okay, they give you clap to work with, but then they use a different subcommand naming convention from the one clap uses by default. So I guess I need to do clap rename equals download_piece. All right, clap derive reference, rename. I guess I will have to do rename_all snake_case. That's fine. Peer message was invalid: unknown message type 224. It does indeed seem like an invalid message, doesn't it? eprintln!. Let's see what length it claims this first message has. First message has length two. Interesting. That doesn't seem right at all, because they say: wait for a bitfield message from the peer. Like, there's not supposed to be any 224.
Wait for a bitfield message from the peer indicating which pieces it has. The message ID should be five. You can read and ignore the payload for now. Ooh, you say. Can they send you wrong messages until you get a bitfield message? I assume not. Like, it'd be very weird if the peer just sent you random garbage messages, right? At least a length of two is not entirely unreasonable — it's the tag plus one byte. But 224 as a tag seems odd. And we know that we're parsing out the peer ID correctly, because that's what the previous step tests, right? So the next 20 bytes is the info hash. After the info hash comes the 20-byte peer ID. That's fine. We're supposed to validate peer IDs, but I guess we don't do that, that's fine. That's for the handshake. Next comes an alternating stream of length prefixes and messages. Messages of length zero are keep-alives and are ignored. Keep-alives, right. This is actually something we should implement, which is that this check needs to be four, because we have to allow for there to be heartbeats. And so now, if the length is equal to zero, this is a heartbeat message: discard it. Now there's an argument here about whether heartbeat messages should be discarded or whether we should emit them as, like, a no-op message. I think I actually want to just discard them here. Realistically you might actually want to bubble these up so that the caller can keep track of which peers are actually alive, but for our implementation it's going to be annoying — we would have to wrap message here in another Option, right? And I don't really want to do that. We can't easily add anything to MessageTag, because it's supposed to be valid for all u8s. Like, I don't want message tag 255 to mean heartbeat and then some protocol extension decides to put semantic meaning on that value. So we don't really have a way to represent a heartbeat message up the stack. So I think I just want to discard it here.
The way that we have to do that is: we do this here, and then what I want to do — so we need to advance the source past the heartbeat, and now this is where it gets awkward, because imagine that the source buffer here actually had two messages in it. We're supposed to either return None if there's not enough data or Some if there is. So there's actually a case here where you would need to recurse, or you would need to loop — you would need to try again in case there's another one. And I guess we can just recurse here. This is an unlikely case to happen very often, and then we try again in case the buffer has more messages. So we do return self.decode(src). So we advance past the heartbeat and then we just try to decode the next message. And then we have to do: if the source length is less than five, then there's not enough data to read the tag, in which case we've got to return Ok(None). That's not going to be the cause of this, but it's just — interesting. Okay. So I guess we've got to print out these bytes then. So we get the length, and then here I want to eprintln! a hex encode of the bytes from four to four plus length. 05 e0. Why 05? 05 should be parsed as five — why does it say 224? I think e0 is 224. I think we just have an off-by-one here. We actually do, because I'm completely stupid: this should be four. The tag is at index four, the data is five and onwards. Okay, good. Great. Fantastic. Great. So it was just me being stupid. Beautiful. Great. Okay. So we got a bitfield. We print it out. It is indeed a bitfield. Okay, well, we're not done yet. We just got a bitfield. You can read and ignore the payload for now. The tracker we use in this challenge ensures all pieces are available. I see. So this is going to be, I suppose, bitfield. We assert_eq that bitfield.tag is MessageTag::Bitfield.
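The decode logic just debugged — four-byte big-endian length prefix, zero-length keep-alives skipped with a recursive retry, tag at byte index 4, payload from 5 onwards — can be sketched as a plain function over a byte buffer. This is a stand-in for the tokio_util `Decoder` impl (the real thing works on `BytesMut` and returns the codec's `Message` type; the names here are mine):

```rust
// Try to pull one frame off the front of `buf`.
// Ok(None) means "not enough bytes yet"; Ok(Some((tag, payload)))
// is one complete, non-keep-alive message.
fn decode(buf: &mut Vec<u8>) -> Result<Option<(u8, Vec<u8>)>, String> {
    if buf.len() < 4 {
        // not enough bytes to even read the length prefix
        return Ok(None);
    }
    let length = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    if length == 0 {
        // keep-alive (heartbeat): discard it, then recurse in case the
        // buffer already holds the next message
        buf.drain(..4);
        return decode(buf);
    }
    if buf.len() < 5 {
        // length is non-zero, so we need at least the tag byte too
        return Ok(None);
    }
    if buf.len() < 4 + length {
        // the full payload hasn't arrived yet
        return Ok(None);
    }
    let tag = buf[4]; // the tag is at index four...
    let payload = buf[5..4 + length].to_vec(); // ...the data is five and onwards
    buf.drain(..4 + length);
    Ok(Some((tag, payload)))
}
```

Note how the off-by-one from the stream shows up here: the length prefix covers tag plus payload, so the frame ends at `4 + length`, not `5 + length`.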
Bitfield covers all pieces. Send an interested message. Okay. Payload is empty. Okay. So now we can do — again, because we have this codec — peer.send, and we can send a message which has a tag of MessageTag::Interested and a payload of nothing: an interested message. And I think now it's going to yell at me because send isn't found. Oh. Oh, this is another thing we should do over in peer: MessageTag should implement PartialEq, just to make life a little easier for ourselves. Send doesn't work because futures-util is supposed to be — yeah, SinkExt. That's what I thought. But why doesn't it let me do that? Unresolved. "Found an item that was configured out." Thank you, compiler. Feature flag sink. Yeah, right. Version is this, features equals sink. Well, it didn't crash on us, so I guess that means it compiled. So this send is now going to send that message, and it will be encoded using the length encoding through the framer. And so now: wait until you receive an unchoke message back. Okay. So now we should do unchoke, unchoke. And here we're sort of hard-coding the protocol, right? We sort of assume that every time you send an interested, you get an unchoke. Really, we should encode this as a full state machine so that we could reuse it when we want to request different pieces from different peers. But this is just to see that the basic flow works, right? We should get an unchoke. Payload should be empty. So this should be unchoke. Unchoke payload is empty. Break the piece into blocks of 16 kiB and send a request message for each block. Okay. So now we need to do this splitting thing. Break the piece. Okay. So we need to pick a piece first. t.info.pieces. The number of bytes in each piece the file is split into. I wanna see what the spec says here. This is not "break the piece" — which piece? I guess the piece we're going to download.
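For reference, the encode half of the framer — what `peer.send` ends up producing for a payload-less message like interested — is simple: a four-byte big-endian length (tag plus payload), then the tag, then the payload. A minimal sketch, with names I'm assuming:

```rust
// Length-prefixed encoding for one peer message.
fn encode(tag: u8, payload: &[u8]) -> Vec<u8> {
    // the length prefix counts the tag byte plus the payload
    let length = (1 + payload.len()) as u32;
    let mut out = Vec::with_capacity(4 + 1 + payload.len());
    out.extend_from_slice(&length.to_be_bytes());
    out.push(tag);
    out.extend_from_slice(payload);
    out
}
```

An interested message (tag 2, empty payload) therefore goes out as five bytes: `00 00 00 01 02`.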
All current implementations use two to the power of 14, and close connections on requests for an amount greater than that. Okay, that's fine. Downloaders generally download pieces in random order, which does a reasonably good job of keeping them from having a strict subset or superset of the pieces of any of their peers. Okay, so I guess we're just gonna pick a random piece, which means we might as well just pick piece zero. Does it — there's nothing in the argument here that — oh no, there is, okay, we do know which piece to pick. It gives us the argument, right? Okay, so the piece hash is going to be the pieces — I guess we should have an assert at the beginning here that piece is less than t.info.pieces.len(). Oops, so that here we can do piece, and let piece_size is gonna be: if piece is equal to the last piece... else. In fact, do we even need to send the size? Yeah: this will be two to the power of 14 for all blocks except the last one. The last block will contain fewer; you'll need to calculate this using the piece length. Yeah, that's what I figured. Okay, so the full length we're gonna download is the one we have up here, because we still assume a single file. So if the piece is the last piece — otherwise, I guess we'll have a constant up here. So this is something like PIECE_MAX is two to the power of — so then it's PIECE_MAX, or if it is the last piece, then it will be length modulo PIECE_MAX. Is that right? Yeah, the remainder after dividing by PIECE_MAX. Okay, and that means we should now have: let mut requests is gonna be a Vec. In fact, we have a couple of options for this. I think what I actually want here is — I wanna do the struct thing again, and I'll be nice and stick it over here. So we're gonna pull the same thing that we did with handshake; we're gonna do a sort of cast here. So a Request is going to be the index, which is a zero-based piece index. And remember, all of these are big-endian encoded four-byte integers.
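The piece-size reasoning here can be pinned down in a few lines — a sketch with names I'm assuming, and with one guard the on-stream version skips: when the file length divides evenly by the piece length, `length % plength` would be zero, so the last piece must fall back to the full piece length in that case:

```rust
// Size in bytes of piece `piece`, for a single file of `length` bytes
// split into `npieces` pieces of `plength` bytes each.
fn piece_size(piece: usize, npieces: usize, length: usize, plength: usize) -> usize {
    assert!(piece < npieces);
    if piece == npieces - 1 && length % plength != 0 {
        // last piece: the remainder after dividing by the piece length
        length % plength
    } else {
        plength
    }
}
```

Exactly the same shape recurs one level down for blocks within a piece.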
And so what I wanna do here is [u8; 4], and I actually don't think I want these to be pub. I want these all to be set. So it's index, begin, and length. And then I wanna impl on Request: pub fn new, and I'll take an index, which will then be a u32, a begin which will be a u32, and a length, u32. The last param was the piece. Oh yeah, we haven't gotten to piece yet. Okay, so this will then construct a Self, and it can now basically enforce the fact that these will be big-endian encoded. So we can do u32 — or actually we can just do index.to_be_bytes(). We do the same for begin, and we can do the same for length. And similarly, we can do index and have that return a u32 by doing u32::from_be_bytes(self.index). And that way we sort of can't use this incorrectly from the outside. So now I should be able to — let request, I should really just grab this from here, I think — is Request::new. Index is going to be the zero-based piece index, okay. Begin is gonna be the zero-based byte offset within the piece. Oh, I see, because you're downloading multiple blocks of a piece. Okay, let's try to download just the first block for now. So this is really then not the number of pieces but the number of blocks. Okay, so we're gonna have to do piece hash. Let the number of blocks is going to be — I guess the piece size is this. The number of blocks is going to be the piece size divided by PIECE_MAX. That's also not quite true. That'll be an off-by-one, right? Because we want — if the piece size perfectly divides PIECE_MAX — then I guess this is actually not PIECE_MAX, it's BLOCK_MAX. The pieces is the hashes. But does it — I see, I see, I see. This is actually not quite right. This is going to be t.info.plength. So plength is the field that dictates, yeah, basically the size of each piece, and then the number of blocks is going to be the piece size divided by BLOCK_MAX. And will that work even if they don't divide evenly?
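The struct being built here — three big-endian u32s, with the encoding enforced at the boundary so the struct can never hold a wrongly-encoded value — looks roughly like this (field and method names are my guesses):

```rust
// A `request` message payload: index, begin, length, each a big-endian u32.
// Fields are private [u8; 4]s so the only way in or out is through the
// to_be_bytes / from_be_bytes conversions below.
#[repr(C)]
pub struct Request {
    index: [u8; 4],
    begin: [u8; 4],
    length: [u8; 4],
}

impl Request {
    pub fn new(index: u32, begin: u32, length: u32) -> Self {
        Self {
            index: index.to_be_bytes(),
            begin: begin.to_be_bytes(),
            length: length.to_be_bytes(),
        }
    }

    pub fn index(&self) -> u32 {
        u32::from_be_bytes(self.index)
    }

    pub fn begin(&self) -> u32 {
        u32::from_be_bytes(self.begin)
    }

    pub fn length(&self) -> u32 {
        u32::from_be_bytes(self.length)
    }
}
```

`#[repr(C)]` with all-byte-array fields also means the struct is exactly 12 contiguous bytes with no padding, which is what makes the later "cast it to bytes and send it" trick workable.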
I think this has to be — this is like an old trick: you add block max minus one before dividing, so this basically ends up rounding up. And so now I guess we want, for all the blocks we want to request in the piece — the block we want to request is i here, or if you will, block. So we have: the index is the piece index. The offset is going to be block times BLOCK_MAX. And the length is going to be the length of the block. And the length of the block is going to be: if block is equal to nblocks minus one, then it will be piece size modulo BLOCK_MAX; otherwise it will be BLOCK_MAX. And now the length is going to be the block size. So that's the request we're going to want to send. Send a request message for each block. Now, let's start by downloading these serially, I think. And then we could do — yeah, there's a nightly function for doing the equivalent of this. I think we'll download the blocks serially first, and then we can download them in parallel later. So we send a request. So we do peer.send(message). And this is where it would actually be really cool to do zero copy. Doesn't really matter here, but, you know, from &request as *const Request as *const u8 — I actually want, where's my handshake trick? I kind of want the cast we did here too. So let's go back to the main handshake bits. Let's make this a little less bad. as_bytes_mut. And then I'll do the same for the other one. This will be, like so. This will be, like so. And then we can do the same thing here for Request. Self, self, self. Bytes, bytes, bytes, bytes. And in fact, now that it just talks about Self, it can be identical up here. And so now we can do handshake_bytes is handshake.as_bytes_mut(), and get rid of the unsafety from up here. Oh, I misspelled mut. So now I should be able to do the same thing down here: let request_bytes = Vec::from(request.as_bytes_mut()).
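The rounding trick plus the per-block sizes can be written out as a small function — a sketch assuming `BLOCK_MAX = 2^14` and my own names, again with the even-division guard (if a piece is an exact multiple of the block size, the "last block" is a full block, not zero bytes):

```rust
const BLOCK_MAX: usize = 1 << 14; // 2^14 = 16384, the block size everyone uses

// Sizes of the blocks that make up a piece of `piece_size` bytes.
// Block `b` starts at offset b * BLOCK_MAX within the piece.
fn block_sizes(piece_size: usize) -> Vec<usize> {
    // old trick: adding (divisor - 1) before an integer division rounds it up
    let nblocks = (piece_size + BLOCK_MAX - 1) / BLOCK_MAX;
    (0..nblocks)
        .map(|block| {
            if block == nblocks - 1 && piece_size % BLOCK_MAX != 0 {
                piece_size % BLOCK_MAX // the last block may be shorter
            } else {
                BLOCK_MAX
            }
        })
        .collect()
}
```

(Newer Rust has `usize::div_ceil`, which does the rounding-up division directly.)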
This Vec::from is gonna end up doing a memcpy, which is a little sad, but we'll survive. Wait, context, send requests. We can even do this to make it: send request for block. Block. And then when we send that, we're supposed to get a piece message back. Piece is supposed to be the message we get. piece.tag is supposed to be Piece. assert!(!piece.payload.is_empty()). The payload of the piece message we get back is supposed to be — whatever this thing is. Okay, so we'll go back here. We'll say send requests. Wait, did I not do that right? I did not do that right. So this should be a Piece. And a Piece is supposed to be an index, a begin, and a block. And the block here is of unknown size. So it'd be interesting to see whether we're allowed to do an unsized last field. Almost certainly not. I can't construct a new one of these, but I can grab the index and the begin and the block. So I should now be able to do: that piece is piece.payload. So what I wanna do here, right, is — the payload is just the raw bytes, and I wanna cast it into this Piece type. So I want a reference to that to be — this might not work the way I want it to — *const u8 as *const Piece. See how happy it is about that. Like, if I run this, does it compile? Oh, I accidentally copied some code I didn't intend to, somewhere. I copied MessageTag up here; that's not on purpose. piece.payload.as_ptr(). So what I'm trying to do is: I'm casting the payload to a raw pointer to one of these structs, and then I'm turning that raw pointer into an actual reference to that type so that I can access its fields. And the trick I'm trying to pull — and we'll see whether it works — is that you're generally allowed, when you have repr(C), for the last field to be unsized. So you notice here, this doesn't have a size. It's just a u8 slice, which normally is not Sized — you don't know how long this slice is, and it's also not an array.
Normally it's allowed for the last field to be that, because you make the type sized by storing it behind a reference, and the reference becomes a fat pointer, so it also stores a length — which then stores the length of the last field. So at least in theory, this should be okay. Can I cast a thin pointer to a fat pointer? How are you supposed to do this, though? Let me just get rid of some of the other compiler warnings here. And yes, I'm doing some type crimes here with the overflows and such, but, you know, such is life. I just want to get it to the point where only that part is the one that makes us sad. 53, pieces len, info.pieces.0.len(). This is all just because we have Hashes, which is this newtype. That's why I need all these zeros. I could implement Deref for Hashes to make that problem go away, but I don't want to. And the reason I say I don't want to is because it's a bunch of reorganization of the code that's not super interesting, really. So I want to focus on the relatively interesting parts. Okay, so here it's now giving me just that one piece where it's complaining. It wants me — I'm wanting it to cast a thin pointer, so a pointer to a u8, to a fat pointer, which is a Piece. But I think there is a way to do this cast the way that I want. Yeah, so fat pointers: for dynamic dispatch, the extra word is the pointer to the vtable. For a slice, it is the length. And for a reference to an unsized struct whose last field is unsized, it is the length of the last field. And that last part is the one I'm trying to take advantage of. And there is a way. Yeah, I mean, I can create a raw slice, but that's not quite what I want. I actually think the way I have to do this is — this, so now it's a slice, as this, as that. It's not telling me I can't. That's not terrible; it compiles. It's complaining somewhere here, peer at 182. Slice index starts at five but ends at four. Oh, I see, it only has a tag.
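The thin-to-fat pointer cast being worked out here can be written as a self-contained sketch. All names are mine, and this is unsafe code: it leans on `#[repr(C)]` with all-`u8` fields (so there's no padding and the two header arrays sit at offsets 0 and 4), and on the fact that a raw slice pointer and a raw pointer to a slice-tailed struct carry the same kind of metadata — a length, which for the struct means the length of its unsized last field:

```rust
// A `piece` message payload: two big-endian u32 headers, then the block
// bytes. The block's length lives in the fat pointer, not in the struct.
#[repr(C)]
pub struct Piece {
    index: [u8; 4],
    begin: [u8; 4],
    block: [u8], // unsized last field
}

impl Piece {
    pub fn index(&self) -> u32 {
        u32::from_be_bytes(self.index)
    }

    pub fn begin(&self) -> u32 {
        u32::from_be_bytes(self.begin)
    }

    pub fn block(&self) -> &[u8] {
        &self.block
    }

    // Reinterpret a raw payload as a &Piece without copying.
    pub fn ref_from_bytes(payload: &[u8]) -> Option<&Self> {
        if payload.len() < 8 {
            return None; // not even room for the two u32 headers
        }
        // Build a raw slice pointer whose length metadata is the length of
        // the *last field* (payload minus the 8 header bytes), then cast it
        // to a fat *const Piece carrying that same metadata.
        let n = payload.len() - 8;
        let piece = std::ptr::slice_from_raw_parts(payload.as_ptr(), n) as *const Piece;
        Some(unsafe { &*piece })
    }
}
```

This is the "type crime" in question: the cast compiles because both pointer types use a plain length as their metadata, and the borrow stays tied to the payload's lifetime.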
And so this is: let data = if src.len() is more than five, then this, else that. Can you — that worked. I got a piece. Well, I got a block of a piece. Interesting. Yeah, so now I can start actually checking the integrity of the pieces by bringing all the blocks together. So I guess now I should be able to do: let all_blocks = Vec::with_capacity(piece_size). And then I should be able to do all_blocks.extend(piece.block()). And at the end now — where's my SHA-1 code? Where's my SHA-1 code? Oh, it's in tracker? No, it is in torrent. Yeah, I have ten minutes left, so I need to be fast here. Let's see, so if I update this with all_blocks, hash is this. And then I guess assert_eq the hash with the piece hash, right? In theory, if I haven't completely messed up — use sha1::Digest. I'm definitely doing something wrong. 229. Oh, piece size. Why can't I? Oh, right, [u8; 20]. Can't compare [u8] with [u8], that's a lie. Can't borrow all_blocks as mutable, that seems — well, we get the wrong hash. That's awkward. Why do we get the wrong hash? I guess we should also, like — oops — assert_eq that all_blocks.len() is actually equal to piece_size? Aha, we don't get all the pieces. Well, we don't get all the blocks. In fact, we get one, but not two. Piece size divided by — like, am I only downloading one block? No, two. piece.block().len(). Something's not right here, because look: we're downloading first a block of that size and then a block of that size, which doesn't seem right, given that the total size is this. What if I just make that three? Then now I have too much data. I don't think this one's wrong. I mean, I could make it smaller, I suppose. No, there's something wrong — I think maybe in our modulo computation here. Oh, that's because this should be assert_eq that with block size. Yeah, we're not getting back blocks that are the same size as what we request. And that could be because this one's wrong. They're seven too short, in fact.
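The reassembly-and-check step being debugged here is, in outline, this (my names; the actual integrity check then SHA-1 hashes `all_blocks` via the sha1 crate's `Digest` trait and compares the 20-byte result against the torrent's piece hash):

```rust
// Reassemble a piece from its downloaded blocks and sanity-check the
// total length before hashing.
fn assemble(blocks: &[Vec<u8>], piece_size: usize) -> Vec<u8> {
    let mut all_blocks: Vec<u8> = Vec::with_capacity(piece_size);
    for block in blocks {
        all_blocks.extend_from_slice(block);
    }
    // if this fires, a block request or a block-size computation is off --
    // exactly the kind of bug being hunted on stream
    assert_eq!(all_blocks.len(), piece_size);
    all_blocks
}
```

The length assertion is the cheap check that caught the problem here: it fails before you ever get to a confusing "wrong hash" error.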
Seven is — like, is this just stupid? Like, do I need to tell it to also include space for the headers? No. Well, okay, here's the problem: I have an actual meeting in five minutes, which means I have to stop, which is very frustrating. So there might be a part two to this one. But it seems like we're very, very close, and we have an off-by-one here somewhere. I guess I could try one last thing, which is one to 15, and see what happens. Oh, or — where's our encoder? Peer, where's our const? No, encode, max, where's our max? No. We have a bug here somewhere that's, like, off by seven. I'm a little unclear why, but I really, really do have to stop. And so — I think this was pretty fun. I think this was a good way to guide you through something interesting. Having the tests and having the steps was pretty useful in guiding you through the actual protocol. Yeah, I think I would recommend this. I've talked about this in the past: I think it's a really good way to learn, building real things. And this certainly forces you to write real code to do interesting things. And afterwards, you get to have a real BitTorrent client, which I think is pretty cool. So I'll give this an endorsement, if you will. So what I'll do is I'll put the link in chat for where you can try this on your own. So yeah, this was really fun. I might just continue this on my own. So CodeCrafters gets a plus one from me. And again, I don't actually get paid to tell people to do this, but this actually seems like a good idea. I like this way of teaching. And with that, I'm gonna stop. Thank you all for watching — so much fun. I hope you had as much fun as I did doing this. Thank you all for coming. I'll see you all later.