Hi folks, welcome back to another stream. This one's a follow-up to the previous one, where we implemented our own BitTorrent client by following the sequence of challenges from a website. When we were doing that, I thought it was a lot of fun, but when we got to the end, it felt like the code was written as though it was following a sequence of challenges. And I started thinking about how I would restructure this code if I were to build this for real, right? If I actually wanted to build a real BitTorrent client, how would I restructure the client? I'm not necessarily talking about implementing all of the features of the protocol, that could be its own kind of interesting exercise, but even just the code we have so far for splitting a file into chunks, identifying how to download each chunk, which sub-blocks exist within each piece, that sort of stuff. All of that code was very top-to-bottom imperative. And I wanted to think a little harder about how we download things in parallel, how we prioritize which things to download first, and in particular, how we structure the application in such a way that doing that is easier, right? That's what I wanna take some time on today. So this time it won't be guided as much by the challenges, because there's only one challenge left, which is to download a file. We might do that accidentally as part of this, but there are no steps guiding us there; this is really just a restructuring exercise. The existing code is on GitHub, and if you watch this after the fact, I guess you'll end up seeing the code as it is at the end of the stream. So go back to this commit, B7624AE, and that's where we're gonna start the stream. My drawing tablet is not working for some reason at the moment.
It blinks, it shows up as detected in Linux and everything, but for whatever reason, if I do anything on it, nothing happens. So where we need diagrams here, I'll try Excalidraw and see how well that holds up. Okay, so let's close this one. I'll leave this one open just in case we wanna actually do the challenge, but I don't feel too compelled to. So let's dive straight into the code here. If you remember from last time, let's do a quick little recap. We have a main here, which is the entry point for the CodeCrafters BitTorrent challenge. So it basically asks you to make a binary, and they call the binary in different ways depending on which challenge you're on. You see under command here, we have, I think this is challenge one through five, I don't remember the exact mappings, but each one of these corresponds to a challenge. The last one that we did last time was download piece. So a torrent file, logically, tells you which files are in the torrent and how many pieces it's split into. In fact, maybe this is a good drawing candidate. So if you think about a torrent, the torrent is like a little pointer thing over here. Ooh, no, I want another square over here. So the torrent file is a little thing over here, and logically what it does is describe this bigger box, which is simultaneously a bunch of files and a bunch of pieces. So what actually happens is that this might have a file hierarchy internally, like slash, ooh, I want this to be left-aligned.
So logically, inside of it, at slash it might have foo.txt, maybe it has slash bar slash baz.txt, whatever, it doesn't really matter, but it might have a bunch of files. Logically, though, it is encoded as just a single giant blob, and blob here in the sense of a sequence of bytes. It is just a single blob of bytes. And then what actually happens is that this is subdivided. I don't think I can actually use that one, so I'll draw it out instead. In reality, this file is split into pieces that are all of the same size, except for the last one, which is sort of the remainder. And then what this file list actually does is store a mapping into the bytes of the giant blob. So for example, it might say somewhere that foo.txt is stored at bytes 500 to 1089, baz.txt is stored at 1090 to 4096 (depending on whether we call these exclusive ranges or not), and xt.txt is stored at 4096 to 8000, or whatever, right? And so what you really need to do if you wanna download a particular file from a torrent is look at, okay, what is its byte range in the overall blob, and make sure that you download the pieces that cover it. The diagram is a little weird at the moment, but if you imagine byte one is here, then byte 500 might be somewhere around here, and let me draw what that might look like. So let's say this is byte 500, and let's say further that over here is byte 1089. Then in order to download foo.txt, you would need to download this piece and this piece, and then read the end of this piece and the start of that piece.
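To make the file-to-byte-range mapping concrete, here's a minimal std-only sketch. The `File` struct and the `byte_ranges` helper are illustrative stand-ins, not the actual types from the stream's codebase: each file starts where the previous one ended in the concatenated blob.

```rust
// Illustrative stand-in for the per-file metadata stored in a torrent.
struct File {
    path: String,
    length: usize,
}

/// Compute each file's (path, start, end) byte range in the single
/// concatenated blob: files are laid out back to back, in order.
fn byte_ranges(files: &[File]) -> Vec<(String, usize, usize)> {
    let mut offset = 0;
    files
        .iter()
        .map(|f| {
            let range = (f.path.clone(), offset, offset + f.length);
            offset += f.length;
            range
        })
        .collect()
}
```

With ranges in hand, downloading one file means downloading exactly the pieces whose byte ranges overlap that file's range.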
So that's the mapping from files into a giant blob, right? And it's a really fun puzzle, in a sense, to figure out which pieces to grab. Of course, if you wanna download all of the files in a torrent, then it's much easier, because you just download all of the pieces, stick them one after the other, maybe write them to disk, and now you have the entire torrent contents, and if you wanna extract any particular file, you just pull those bytes out of the giant blob by indexing. It gets more interesting if you wanna only download foo.txt, because in that case you don't wanna download these two pieces, because they are unnecessary. And then of course this gets further complicated by the fact that each piece consists of multiple blocks, looking like this. So every one of these pieces is subdivided into blocks, and all the blocks are of the same size. And the only thing that's guaranteed is that when you connect to any given peer in the BitTorrent network, if they say they have a particular piece, that means they have all of the blocks in that piece. But what you can do in order to speed up your downloads, say you wanted to download this piece, is get this block from one peer and simultaneously get this block from another peer. And so in theory you get higher performance, because you're now pulling in parallel from multiple hosts, so you're not limited by the upload speed of any one peer. So that's the rough idea of why there are these two levels of division. And currently, what we built last time was the ability to download a particular piece.
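The piece-to-block subdivision described above can be sketched the same way. This assumes the conventional 16 KiB block size; the helper name is made up for illustration:

```rust
// The conventional BitTorrent block size: 16 KiB.
const BLOCK_MAX: usize = 1 << 14;

/// Returns (block_count, last_block_size) for a piece of `piece_size`
/// bytes: every block is BLOCK_MAX bytes except possibly the last,
/// which is whatever remains.
fn blocks_in_piece(piece_size: usize) -> (usize, usize) {
    let n = (piece_size + BLOCK_MAX - 1) / BLOCK_MAX; // ceiling division
    let last = piece_size - (n - 1) * BLOCK_MAX;
    (n, last)
}
```

Since any peer that advertises a piece has all of its blocks, the blocks of one piece can be requested from different peers in parallel and stitched back together.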
But of course, all of this infrastructure that I've talked about, like you might need to download multiple pieces, the mapping from file names to parts of pieces, even just downloading multiple pieces in parallel, downloading multiple blocks in parallel, which pieces do you download first? We've implemented none of that. And if we tried to implement it the way the code is currently structured, I actually think it would be fairly annoying, because when we look at the code here, like the download piece one, which is the most complete one we have, it's really just an imperative piece of code. It reads the torrent file, decodes it into a torrent descriptor, which has information about what pieces there are and what files there are, for instance. And then it talks to the tracker. There are basically two modes for torrents to operate in: one where you download all of the information about who has which piece from other peers, and one where you talk to one server and ask, hey, who has this piece, who has that piece? It's that second mode that is all we support for now, and we're not gonna change that this stream, I think. We're just gonna stick with the tracker-based approach. So you contact the tracker, and you get information back that tells you which peers have the pieces. That's this tracker response bit; that tells you what peers there are. Then we do a handshake with a basically randomly selected peer among those peers. And once we do the handshake, we have a connection with a single peer, we establish this encoded channel with them over TCP, and then we send a message saying which pieces we want to download, like which pieces we're interested in. Then we tell that peer that they should start sending us data, and we start reading out the data one block at a time.
And after all of that, we stick all the blocks together, hash them, and write that out to the output, right? Which is all fine, like, that is the sequence of steps. But it's not really how you want to describe this the moment you want to say download multiple pieces, right? Or map files to pieces. That's what we're going to do today: figure out how to structure this in a less insane way. And the way I actually want to go about doing this is I'm going to add a new subcommand here. That's going to be download. It's going to take an output, it's going to take a torrent, and it's not going to take a piece. Like so. And where do I find download piece, that's this one, yes. So download, like this. I've done this on a couple of streams before: I want to write the code that I would like to exist here by the end, right? What I would like this to look like is probably something like let torrent = Torrent::read of the torrent file, right? That's probably going to return an error of some kind that we'd want to lift. Torrent::read, sure, we can let that be async, that's fine too. And then I would want something like torrent.tree, or maybe this even looks like print_tree, right? So I want some way to basically print out, at this point, what all the files are, right? And then I should be able to do something like torrent dot, so I think I either want the ability to do download_all, which is going to download everything into output, assuming that output is a directory, or the ability to say download_some, where I'm going to give something like, I guess, an iterator. We kind of have to decide what we would like download_some to look like. I was imagining something that takes an input iterator, so in this case it could be something like a Vec of outputs. Or, no, a Vec of maybe tuples, actually. Maybe that's the way it should look.
And there could also be a download_single, if we really wanted that to be the case, to output. Right, so all of these modes should be supported by whatever this torrent type is. And you can imagine that internally, these are all going to require, in fact, download_single could return bytes, now that I think about it. Like, there's a way here in which we could say download_all_to_file, right? But you could also imagine that we have something like this, where what it internally stores is the entire byte structure and the file list separately, and then in files, you can do something like for file in files, right? So this is really an iterable. And actually, maybe we just want to support download_all for now, and then we could improve that with some kind of filtering later on, like, only download the following ones. Maybe this is a nice interface, actually. Maybe I'm happy with that. So here, you could imagine that you can pass in a set of filters over the files that you want, and you could do that based on the output of print_tree, for instance. And then download_all_to_file would basically be a variant of download_all. Great, so then we could, you know, std::fs, or I guess tokio::fs, write to output, files.iter().next() or something, right? So files.iter() is gonna give us an iterator, next() is gonna be the first of the files in there, expect, always one file, and I want the bytes for that file, as opposed to, for example, the name, right? And then you could imagine that download_all_to_file is really just gonna walk this iterator, and there are some optimizations here, like maybe you write directly to disk rather than buffering it all in memory, but that kind of optimization I don't think we need to worry about too much.
So maybe download all to file actually internally just does this, it iterates over files and for each one, it writes the corresponding bytes to disk, in which case, download all to file is not that important because download all is the one that does all the heavy lifting. So I think that's what I would like this to look like. And so now we can actually start to construct it so that it looks that way. If you remember the code structure that we have. Yeah, so like the reason why you wouldn't actually, the reason why you might wanna optimize this further, you could imagine that there are downloads that are many, many gigabytes large and you can't actually store them all in memory in which case you might actually have to stream the individual pieces to disk. You actually kind of always have to just because if you want to be able to resume seeding them later, for example, chances are you want the pieces on disk anyway. So we could make it so that what download all will do is it will always store the pieces to disk and it might cache some of them in memory. That's also something that's totally fine for us to do here. But I'm gonna treat that as a sort of implementation detail. So inside of source, you see we have a main which is this binary. We have lib and if you look at lib, it's really just a bunch of sub modules. Peer here mostly holds the data types for interacting with a peer. So things like the handshake message and what fields are in there. But it also holds definitions for message, like the kinds of messages that you can send as well as the encoding protocol for sending and receiving messages from a given peer. Torrent mostly has the information that's actually stored in the torrent file which is primarily the URL of the tracker. So this is where we get information about what peers are connected as well as this info thing which holds information about basically which pieces there are in the file and which files there are. 
So whether it's a single file or a sort of multi-file packed thing. And so this is all just information about a torrent, and we can reuse this type as the outer type of torrent here. We just need to add a read method, which should be straightforward enough. And then in addition to lib, we have tracker, which has all the types that define how we interact with the tracker in particular: what does the request we send look like, what does the response look like, and that's mostly it. It has some special serialization and deserialization logic, which we don't need to talk too much about here; we sort of figured that out last time. Okay, so before I go on, are there questions around this structure, like, this is what we want to get to? Does the implementation of each type of download function differ? Well, so the hope, if we did have download_all_to_file, download_all, download_some, download_single, the hope is that behind the scenes they all invoke the same logic, and really they just differ in avoiding downloading pieces that aren't necessary for that particular download, and massaging the result of what we download into a more convenient format for that call. So they're really mostly convenience functions. Disk sizes should be u64? I don't think there's a disk size here. Have you tried any other challenge? No, I haven't tried any of the other CodeCrafters challenges. We're not directed by what CodeCrafters asks now. I don't think they have multi-file torrents at all, that's right. Yeah, so at this point, the goal is not necessarily to meet any particular challenge. The goal here is just to structure the actual implementation of this crate the way I think it should be done if we were doing this more for real. Okay, so let's go ahead and start with the pub fn, or I guess async fn, read.
So this is going to take a path to the torrent file, and it's going to return, I suppose, an anyhow::Result of Self. And I actually don't know, oh, we did bring in anyhow, great. And for most of these, at least in the beginning, the hope is that we can just take parts of the original code that we had here and stick them in there, right? So this is going to be, this is async, right? Yeah, okay, great. So we can just take the code from download piece, which has a lot of the sub-parts here, and stick them into read. In particular, we're going to read the torrent file, parse the torrent out of there, and then return the torrent. Great. And then, what was the next thing we wanted? We wanted print_tree to be something that you can do on a torrent. So we'll do fn print_tree of &self. And what that's going to do is, we'll probably need a helper function here which takes a subtree, and we'll figure out what goes there in a second. So the things that we have inside of self: we have the name, which is the name of the top-level thing. So I guess here we could say if let Keys::SingleFile = self.info.keys. If it's a single file, then all we really need to print here is the name, so self.name, which is the name of that single file. And I guess we could actually match here instead. If, on the other hand, this is a multi-file torrent, then it's the files list that we need to operate on. And if we go down to files here: for the purposes of the other keys in info, the multi-file case is treated as only having a single file, by concatenating the files in the order they appear in the files list, right? So this is how we get to that mapping of byte ranges: in the single overall blob, the files are laid out in the order dictated by this part of the torrent file. So they must be in this order, and for each file, we're told what its length is.
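The single-file versus multi-file split can be sketched like this. The `Keys` enum and field names here follow the discussion but are approximations of the crate's actual types, not the real definitions:

```rust
// Stand-in for the torrent info dictionary's two shapes: a single file
// with a length, or a list of files, each a path of components.
enum Keys {
    SingleFile { length: usize },
    MultiFile { files: Vec<Vec<String>> },
}

/// Collect the printable file list: the top-level name for a
/// single-file torrent, or each file's joined path for a multi-file one.
fn file_list(name: &str, keys: &Keys) -> Vec<String> {
    match keys {
        Keys::SingleFile { .. } => vec![name.to_string()],
        Keys::MultiFile { files } => files.iter().map(|p| p.join("/")).collect(),
    }
}
```

A print_tree would then just print each entry of this list.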
And so what we can do up here, given that we're just gonna print the files, we might not even need subtree, because there's no notion of subtrees here. We're just gonna do for file in files, eprintln, file.path. And you can imagine turning this into a more structured format, right? To actually store it as a tree, basically. But given that the torrent really stores it as a linear sequence of paths, it feels reasonable to just print out that list of paths here too. Okay, so that's print_tree, easy enough. And so now we get to download_all. Okay, so obviously download_all is going to be somewhat complicated. It's gonna take a reference to self, and it's gonna return a Downloaded, I suppose. Yeah, that feels fine. Okay, so what would download_all look like? We're gonna need a Downloaded type here too. And I think, actually, I want all the logic for downloading to be somewhere else. So what we'll do is we'll use super::download, and then I want this to actually say download::all. And then I want to make a new module here called download, and I want all the logic that has to do with downloading to live over in that module. Okay, so we need an async fn all, which is gonna return an anyhow::Result of Downloaded. And this is obviously where a lot of our logic is gonna live, right? And we're gonna have a pub struct Downloaded. And what do we want that to support? Well, we want to implement IntoIterator for a reference to Downloaded, where the Item type, so the thing this produces when you iterate over it, is a reference to a file. IntoIter is gonna be a DownloadedIter, yeah, like so. And into_iter itself, this is just to satisfy the last bit of this, right, the ability to iterate over the files in there and grab each file's bytes, returns DownloadedIter::new(self).
And then we're gonna need a pub struct DownloadedIter, which has a lifetime reference to the Downloaded. So what Downloaded is gonna hold at the end, right, not pub, is a thing that holds all the bytes. And we might optimize this beyond having it be a Vec of u8, right? You could imagine this being references to files instead. You could imagine us using the Bytes type, from the bytes crate, so that we can easily grab out or merge together pieces without having to do a lot of memcpys. And we have the files list, which is a Vec of File, which we're gonna get from crate, whoa, there's a lot of these, I wanna use crate::torrent::File. So this is gonna hold a Vec of files. And a DownloadedIter is going to hold: a downloaded, which is a reference to the Downloaded, so that we can get at the bytes; a file_iter, which is going to be a Vec, I guess actually it's a slice Iter, of File, right, so an iterator over this files list; and also an offset, which is how far we are through bytes. Because remember, for any given file, the location of that file in the bytes is the sum of the lengths of all the files that came before it, so we need to keep track of that. In theory, we could instead keep an iterator over bytes, but I actually think offset here is gonna be nicer. And then we'll impl DownloadedIter, we'll impl new, and that's gonna take a Downloaded and return a Self: downloaded is gonna be d, file_iter is gonna be d.files.iter(), and offset is gonna be zero. And then we're gonna implement Iterator for DownloadedIter. The Item here is gonna be, I actually don't think it's gonna be a File, it's gonna be a DownloadedFile, which is a type we don't have yet. And next is going to be, let Some(file) = self.file_iter.next(), let-else, man. Instead of this, actually, I can do let file = self.file_iter.next()?, because Option implements Try.
So the next file we're gonna get at is this one, and the bytes for that should be self.downloaded.bytes, starting at self.offset and ending at self.offset plus file.length. Those are the bytes for this file. And then we wanna return Some(DownloadedFile) of file and bytes. And then, of course, that means we're gonna need a DownloadedFile type, which is gonna have a pub file, which is a File, and a pub bytes, which is gonna be a slice of u8. It's gonna return a DownloadedFile, like so. Same thing up here, it's gonna return DownloadedFile like this. Great. And we could make these accessors instead, if we really wanted to. So we could impl DownloadedFile, and then, if we look back at our main, what do we want here? Well, we wanted bytes at the very least, so we probably want something like path, which returns a &str, which is self.file.path. Why is that a Vec of String? Oh, the paths are stored as, that's fine. So that's something that will actually be different in our torrent here: this files list, what file.path actually is, is a vector of subdirectory names. So we'll actually want to .join this by path::MAIN_SEPARATOR_STR. So we're gonna join it by slash, basically, which is fine. And we also want bytes, which is gonna be the &[u8], which is self.bytes, like so. Oh yeah, the trick with byte offsets is nice, so we can do this instead, which has the same effect, this one: we first slice from the beginning, and then we slice to the end, just so you don't have to repeat self.offset. It's a nice trick. Okay, so now we have roughly what we want the returned things to look like. And you see, the main thing that it needs is this file list, which we get from the torrent, and the bytes, which we get from downloading the pieces. You could imagine that this is actually a Vec of Piece, which is really where the Bytes type comes in.
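Putting the last few steps together, here's a std-only sketch of the Downloaded/DownloadedIter idea: one big byte buffer plus a file list, with an iterator that tracks its byte offset and uses the double-slice trick mentioned in chat. Field and type names follow the discussion but are approximations, not the crate's real definitions (the item here is just the byte slice, rather than a full DownloadedFile):

```rust
// Per-file metadata; only the length matters for slicing.
struct File {
    length: usize,
}

// All downloaded bytes as one blob, plus the ordered file list.
struct Downloaded {
    bytes: Vec<u8>,
    files: Vec<File>,
}

struct DownloadedIter<'d> {
    downloaded: &'d Downloaded,
    file_iter: std::slice::Iter<'d, File>,
    offset: usize, // sum of lengths of all files yielded so far
}

impl Downloaded {
    fn iter(&self) -> DownloadedIter<'_> {
        DownloadedIter { downloaded: self, file_iter: self.files.iter(), offset: 0 }
    }
}

impl<'d> Iterator for DownloadedIter<'d> {
    type Item = &'d [u8];
    fn next(&mut self) -> Option<Self::Item> {
        let file = self.file_iter.next()?; // Option implements Try
        // Slice from the offset first, then take the length, so we
        // don't have to repeat self.offset: the byte-offset trick.
        let bytes = &self.downloaded.bytes[self.offset..][..file.length];
        self.offset += file.length;
        Some(bytes)
    }
}
```

Each file's bytes fall out of the iteration order alone, since files are laid out back to back in the blob.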
Because if this was a Vec of Piece, then suddenly the iteration logic becomes more complicated, because it becomes sort of obvious that the bytes for a given file might actually span multiple pieces. So then how do you bring them together? And that's where something like the bytes crate would come in. But for now, let's just stick them all in a single Vec<u8>, and then we can improve afterwards. Okay, so it still raises the question of how we actually do this download. Well, let's assume first that we're gonna download everything, and then we can refine the code afterwards to support filtering which things you actually download. This is where we'll go back to our main and grab the other bit of code here from download piece. So, here. The first thing all is gonna have to do is grab the information from the tracker. And I don't think we actually need that connection to stay open, the connection to the tracker, that is. So this is actually something that could go in tracker, where, on TrackerResponse, we could implement a pub fn, or async fn, doesn't even need to be pub, query. And so this code is gonna take a torrent, and it has to send a request that has to give some peer id for ourselves, which is fine. Port, uploaded, and downloaded aren't relevant here; compact one is fine. Left here is the number of bytes left to download, which in this case is really the entire length of the torrent, and I forget whether we actually need to compute that or whether the torrent will say. Use crate::torrent::Torrent. So if I go over here and go to info: the number of bytes in each piece, so that's just the size of the pieces. But I think we actually need to compute it over the keys case; I think it actually needs to be the sum. So what we'll do is we'll have another little helper function here on torrent, which is length, which then matches on self.info.keys.
And if it's a single file, then that's just the length. Otherwise, it's files.iter().map(|file| file.length).sum(). So the sum of the lengths of all the files is the total length of the bytes in this torrent, and that's what we're gonna tell the tracker we need to download. Now, obviously, this would change a little once we start seeding as well, because then you might query the tracker and tell it that you already have a bunch of the data. So we're not really dealing with resumes yet, but I think it should be relatively easy to modify this to allow resumes later on. Okay. So, right, this is where we URL-encode the stuff to the tracker; I remember this was a bit of a pain last time. We decode the tracker response, and then the tracker info is the thing that we give back. Great. So this can probably be pub(crate). We don't want it to be fully public because, I think this type isn't even public. Oh, it is public. Yeah, but I don't think we actually want this to be. We might start using this from main, actually, just to get information about a torrent, but for now let's keep it internal to the crate. So now that we have that, we can say peer_info is going to be a TrackerResponse. Or you could even imagine this being a freestanding function rather than being tied to TrackerResponse, but I don't think it's super important. So we'll use tracker::TrackerResponse. This is going to be query of, oh, all needs to take a torrent. So it does a query; we'll give some context here, we'll say, query tracker for peer info. All right, so if we now go back to torrent, download_all here, we're going to pass in self. And this, we said, is going to be pub(crate) instead, so the way that you access it is through the download_all method on Torrent. Okay, so we have the peer info. Good, let's go back to our main and see what we do next.
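The length helper just described can be sketched like so. `Keys` mirrors the single-file/multi-file split; the names are assumptions for illustration (here the multi-file case carries just the per-file lengths):

```rust
// Stand-in for the info dictionary's two shapes, reduced to lengths.
enum Keys {
    SingleFile { length: usize },
    MultiFile { files: Vec<usize> },
}

/// Total length of the torrent's data: the single file's length, or
/// the sum of all file lengths in the multi-file case. This is what
/// gets reported to the tracker as `left` on a fresh download.
fn total_length(keys: &Keys) -> usize {
    match keys {
        Keys::SingleFile { length } => *length,
        Keys::MultiFile { files } => files.iter().sum(),
    }
}
```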
Now that we have the peer info, this is information about actually connecting to the peers and requesting the particular piece that we were after, and all of the blocks in that piece. Okay, so this logic is very linear at the moment. And in reality, what we want here, especially if we know we're going to download all of it, is we sort of have two decisions to make. One of them is which piece do we download next, and the second one is which peers do we download from? Once you've decided to download a particular piece from a particular peer, or a particular set of peers, then that part is easier, because you just enumerate all the blocks, you request all the blocks from some number of the peers, and then you're good. So I think what I want to do here is work sort of inside out: let's assume that we have picked a piece to download, and now we want to do the download of that particular piece. So let's encode what that might look like. Let's do something like async fn download_piece. And so if you're told to download a particular piece, what information do you need? If we look up here, this is just grabbing out the peer info, which is not the bit I want to look at. I want to look at, okay, Interested is the thing that we send, we wait for Unchoke, right? Then we send a request for a particular block of a particular piece from a particular peer. And so the only thing that we really need in order to download a given piece is sort of a list of candidate peers. And the peer information here, if I find it right, a peer here is just a SocketAddr, great. So, candidate peers that we could download a given piece from. And I think that's really it. Oh, no, and we need the piece length, which we probably extract here somewhere, right? The piece hash and the piece size are the two things we need.
So the piece hash, which I think is just a [u8; 20], and the piece size. And we'll figure out what this returns. It might change the signature a little later, but that's all the information we should really need in order to download a particular piece, right? And then, let's extract this further: download piece block from. So this is even more, we've picked a peer, and we want block i, and we know the block size. What does that look like? And I don't actually think we want these functions as such; I'm more trying to break this down into smaller pieces. The reason I don't think we actually want quite this structure is because you are probably gonna have persistent connections to a given peer. You basically want a state machine that owns the connection to a given peer, rather than connecting to it each time you want a particular block from one. So, do we have a peer type here? We have a peer module, right? We have Piece, which is information about the piece, we have MessageTag, but we don't actually have a Peer type. So I think what I want here is a pub(crate) Peer type. It has an addr, which is the SocketAddr from before. And it's a state machine that we're gonna want to keep track of, I think. But let's do impl Peer, new. It's gonna have a stream, which is gonna be, in main here, when we connect to a peer, we get one of these things. The peer here is the actual connection: it's a Framed TCP stream with a MessageFramer. So we're gonna have some first-class primitive of an ongoing connection to a given peer. Great, and that's this bit that we can just grab. And so this is gonna look like, oh, right, we actually have to do the handshake. This is gonna say, you know, peer is gonna be, we could be even nicer here, but let's say SocketAddrV4 for now.
So in order to construct a new peer connection, we pass in the peer address, connect to it, and do the handshake. And for these checks we can relax things a bit: these bits currently — like, if the handshake length is not the expected length, or if this is not "BitTorrent protocol" — rather than assert_eq! here, we can use anyhow's ensure!, which is a macro that's basically like an assert, except instead of panicking it returns an error if the condition doesn't hold. Okay, that's fine. Then we establish the connection here too — ensure!, ensure!. And assuming that's all good, we return Self, which holds the addr — which we might not even need — and, well, I guess this is peer_addr, and the stream here is peer. There's something else missing, which is the info hash: it has to be passed in here, just to make sure the peer actually has the same content we expect. Okay, that's too much spamming chat, bye. Great — so we now have a thing that can establish a connection to a given peer. And the way to think about one of these Peers is that it keeps track of the connection we have to that particular peer, and it will do things like download a block if we tell it to do so. Now, we basically need to think about whether we want to allow a given peer to be told to download multiple things, or whether a given peer should only be allowed to download one thing at a time. I think we want one thing at a time for now. So there's going to be an async fn download, and it takes a mutable reference to self. It takes — I think it has to take a piece_i, a block_i, and — does it even need the block size? It does need the block size. And it returns, hopefully, an anyhow::Result of Vec<u8>. Typo — "socket adder v4"? No, I think that's right. Oh, "suck it, addr v4" — nice. So what will download look like?
Well, we already kind of have download here, which is this, right? So again, I'm just splitting up the code that was one long imperative mess in main, and turning it into a more structured way of talking about a persistent connection to a peer, downloading a given block over that connection, and so on. So, constructing the request here — why does this say BLOCK_MAX? Oh right, because the way you actually frame the request is you say: I want to download starting at this byte offset, for this many bytes. And so the block_i gets multiplied by the size of the blocks. And the block size argument is needed because most blocks have size BLOCK_MAX, but the last block is smaller, so we need to know the actual block size; often this will just be BLOCK_MAX. Great. Then we send the message, then we await the next message from the peer. And I guess for all of these we can also use anyhow's ensure! like this. And then I suppose this can really just be piece.block(), and it's really a Vec::from of this. So this Piece is basically the payload that we get back from the peer. That payload is a vector of u8, but it's structured — there's a bit of header information and such — and we ultimately want to get out just the block that holds the real data, and turn that into a vector. Technically we might be able to do this slightly more efficiently: we could return a thing that skips all the header bytes and lets you iterate over the bytes that follow, just so we don't have to do the memcpy. But I'm going to allow the memcpy here, because it's for a single block, so it's fairly small, and it makes the interface a lot nicer. I think this is probably okay. Okay — so a given Peer, we can now tell to download something, and it will do so. I guess the other thing we need to decide here is around this Interested and Unchoke, and I think — oh right.
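The request math just described can be sketched on its own. BLOCK_MAX here is the conventional 16 KiB request size used by most clients; `block_bounds` is a hypothetical helper, not a function from the stream's code:

```rust
const BLOCK_MAX: usize = 1 << 14; // 16 KiB, the conventional request size

// Hypothetical helper: the (begin, length) pair for block `block_i`
// of a piece that is `piece_size` bytes long.
fn block_bounds(block_i: usize, piece_size: usize) -> (usize, usize) {
    // The request frames "start at this byte offset, for this many bytes".
    let begin = block_i * BLOCK_MAX;
    // Most blocks are BLOCK_MAX bytes; the last block of a piece may be shorter.
    let length = std::cmp::min(BLOCK_MAX, piece_size - begin);
    (begin, length)
}
```

This is why download needs the actual block size passed in: every block except possibly the last one is BLOCK_MAX.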
Right, so this is also something we have to think about, which is this bitfield. The peer tells us which pieces it has available — it might not have every piece — and that's represented in the bitfield message here. So I think we actually want to keep track of that one. That's not right... all right, where is the BitTorrent specification? I don't remember the URL — it's this one. No, it's not 0000, it's 0003. All right, and then we'll also do this to make people happy. So: trackers, connections, peer messages, bitfield. "Its payload is a bitfield with each index that the downloader has sent set to one and the rest set to zero. Downloaders which don't have anything yet may skip the bitfield message. The first byte of the bitfield corresponds to indices 0 to 7 from high bit to low bit; the next one, 8 to 15." Okay, so the bitfield is really a — we're going to have to structure this one a little bit — but the bitfield that we get back is bitfield.payload. Like so. Why is my completion not working? Whoa, it's very unhappy about a bunch of things. Let's do a cargo update here and a cargo check and see if it gets happier. Maybe a rustup update too — is there a new Rust version? Let's see, come on, rustup. So my thinking here was, over in peer, that we're going to take this actual bitfield and turn it into a type that lets us inspect it in a less inconvenient way: to figure out which pieces a peer has — so which pieces it's a candidate for downloading from, and also whether it has a given piece; the two are sort of synonymous here, right? And then we can make use of that in download to figure out, for each piece, which peers are candidates to download from. Let's now see if we can get this to do something useful. Now is it happy? It's happier — download, okay. It's not happy about this one.
That's fine, but at least now I'm getting error messages, so that's a start. Oh, this should be not Context but a Result. And then yes, I need to import this. Yes, I need to import this. Yes, I need to import this. And Bitfield we don't have yet. And bitfield.payload is the one that I wanted — why does it not understand what type this is? Interesting. BLOCK_MAX is stored in main? It should not be in main; it should be in peer. Okay, and this one here should say self.stream, and same here, because it's this peer's stream. Okay, so: pub struct Bitfield. Now, there is a bitfield crate — I don't know whether it supports this, because it's a slightly weird mode. The payload here is a vector of u8s, right? If we look at the spec, the contents of a bitfield message is a bitfield where each bit corresponds to one piece, "with each index that the downloader has sent set to one". So I assume the downloader here is the peer — as in, the peer that you're talking to is saying: these are the pieces that I have. "The first byte of the bitfield corresponds to indices 0 to 7, from high bit to low bit respectively." Okay, so what that means is: if we implement here on Bitfield, we can have a pub(crate) fn has_piece, where piece_i is a usize, and it returns a bool. For piece_i, we want the ith bit. So: let byte = piece_i / 8, and let bit = piece_i % 8. The byte is which of the chunks of eight bits we're in, and the bit — the remainder modulo eight — is the bit within that byte, and the bit here counts from the high bit, right? Because the spec said indices 0 to 7 from high bit to low bit. And so I guess we can just say: let Some(byte) = self.payload.get(byte) — because we, or rather that peer, certainly doesn't have the piece if there's no such byte. Self — oh right.
And then to get the bit, what we'll do is AND the byte with 1 shifted left by bit — no, I want it shifted right. So if we have a byte — one, two, three, four, five, six, seven, eight bits — and we want to know whether the nth bit is set, then if it's the nth bit from the right, the mask is 1 << n, right? Say n is three: then what we want is the third bit from the right, which would be 1 shifted left by three, so the number this generates is this one. And if we AND this with the byte, then what we end up with is, in this case, one — as in, it has the thing, because these two bits are both one. I'm lying — this is shifted left by three... now I'm confusing myself. All right, all right. Themes — let's just do Solarized Dark, why not? Why am I opening the playground when I have Rust locally? Yeah, fine. Is it {:b}? 1 << 3 — yeah, so one shifted left by three leaves three zeros behind; it shifts it three over, right? So for n equal to three, we AND with 1 << 3 — in this case that's zero, which means we don't have the piece. But in our case, what we want is counting from the left, not the right. And there are a couple of ways to do this. One of them is 1u8.rotate_right(bit + 1). What rotate_right does — let's see — "shifts the bits to the right by a specified amount, wrapping the truncated bits to the beginning of the resulting integer". Right, so it'll take 1, which has only the rightmost bit set — in fact, if we wanted to avoid the plus one here, we could make this 1 << 7 instead, but it's not actually that nice. So we rotate right: 1 has the last bit set, right? If we rotate that by one — that's the plus-one bit — the set bit wraps around to the leftmost position, and then we rotate it another bit more places, where bit is the bit we're looking for.
So if we're looking for bit zero, we don't rotate any further, and the one is in the right place — the high bit — to do the AND. If the bit is one, then we want the second bit from the left, in which case we rotate one more time and get the right mask, right? So this is going to AND the right bit — at least I think it will. And then we're casting that to a bool, and the way this cast works is: if any of the bits are one, then the resulting bool is true. And the only way the resulting byte is non-zero is if the one bit from our shifting here aligns with a one bit in the input, which is the pieces the peer has within that byte. So this should tell us whether it has a given piece. Similarly, we can also write a pieces function — yielding usize — which is: tell me all of the pieces that you have. And for this one, we can just do for piece_i in... Yeah, "this is almost done" in chat. "Why not just 7 >> bit?" Well, you could, but I actually think this one's easier to read. Like, I think that one's also right, but this one is easier to reason about — so that's why. We have a couple of ways we could do this. I think what I want here is for (byte_i, byte) in self.payload.iter().enumerate() — and then, I actually don't know whether that's nicer. So the piece_i of the leftmost bit in this byte is going to be byte_i * 8. And then the mask is going to be 1u8.rotate_right(1) — and again, this is probably just going to be optimized by the compiler to be equal to — no, not 7 — 128, right? The leftmost bit of a single byte being set. Unless I'm — now I'm confusing myself, but I think that's true, right? So if I print 128: that's one, two, three, four, five, six, seven, eight — right, so 128 is just the leftmost bit set, which is what we want. That's the leftmost bit of a byte.
So we could write 128 here, but I actually think it's clearer to make it 1u8.rotate_right(1) — as long as this is a u8, to be clear. And so then we're going to AND here — to be clear, we could do this; they are equivalent. And then piece_i += 1. And in fact, this could just start at zero, in which case we don't need the byte_i anymore, in which case we don't need the .enumerate(); we can just do this. And so — it wants an into-iterator, doesn't it? This is where I really want generators, right? Like, this should be a generator, and it is not. That's fine, I suppose. Fine, we'll write this as a damn iterator: .iter().flat_map(|byte| ...), and then we'll do (0..8).map — ugh, 0..8 makes me so sad. I want generators so bad. All right: .enumerate().flat_map(|(byte_i, byte)| ...). We could have a manual implementation of Iterator here, and it might actually be nicer, but it's fine. So this means we move in the bit_i, and now the piece_i is going to be byte_i * 8 + bit_i, and the mask is going to be 1u8.rotate_right(1 + bit_i). And to be consistent with the code above, let's at least make these consistent within the same part of the file: rotate_right(bit_i)... and then the result of this is going to be the byte ANDed with the mask — not bit one, bit_i. "Is there any way to avoid the magic digit eight in has_piece?" Oh, this one? Yeah — u8::BITS. Fine, I'll use u8::BITS. Fine, fine, fine. The associated constant BITS on a given integer type tells you how many bits are in that integer type. So, sure, now it doesn't say eight — I understand. Okay, so we want to do this. And now this is the kind of thing where I really want to test that it actually does the right thing: bitfield_has. So we're going to do: let bf = Bitfield, and the payload is going to be a vec of 0b1010_1010, 0b0101_0101.
And then I want to assert that it has — right, are all of these — one, two, three, four... yeah, eight. One two three four, one two three four — yeah, great. So we want to assert that it has piece 0, and that it does not have piece 1. We also want to assert that it does not have piece 7, because this bit would be piece 7. We also want to assert that it doesn't have piece 8, because this would be piece 8, and we also want to assert that it does have piece 15. Okay, so that's this one. And then we can do the same for bitfield_iter, where I want to do bf.pieces(), like so. And here's what I want to do with this one: in this case it's short enough that we can assert all the way through it. Normally what I would do here is write something like an assertion over the iterator — that it alternates or something — but it's short enough that I kind of just want to write them out. So we should expect to get Some(0) — and then, right, this is byte zero: one two three four five six seven — so this is seven. What — oh, damn it: this is number 0, this is number 7, right? So the first byte gives 0, 2, 4, 6, and then the second continues: 9, 11, 13, 15. Let's see what that does. "I checked godbolt and the optimizer gets rid of the modulo, because rotate_right doesn't actually care, since the rotation is modulo eight as well." Yeah — I generally just assume that the compiler gets rid of a bunch of these, and I would therefore rather write it in a way that's easier to read than optimize for what the compiler might produce. Like, I sort of assume the compiler is smarter than I am. "Can't find function urlencode" — interesting. Oh right, forgot we had to do an ugly thing with urlencode; use anyhow::Context, that's fine. It wants a path — that's fine, you can get path.
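Pulling the bit math just worked through into one place — a compiling sketch of the Bitfield discussed above, assuming the spec's high-bit-first layout (the method bodies are my reconstruction of what the stream is writing, not the exact code):

```rust
pub struct Bitfield {
    payload: Vec<u8>,
}

impl Bitfield {
    pub fn from_payload(payload: Vec<u8>) -> Self {
        Bitfield { payload }
    }

    // Piece index 0 is the HIGH bit of byte 0 (per the spec), hence the
    // rotate_right(bit + 1) mask rather than 1 << bit.
    pub fn has_piece(&self, piece_i: usize) -> bool {
        let byte_i = piece_i / (u8::BITS as usize);
        let bit_i = (piece_i % (u8::BITS as usize)) as u32;
        match self.payload.get(byte_i) {
            // No such byte: the peer certainly doesn't have the piece.
            None => false,
            Some(&byte) => byte & 1u8.rotate_right(bit_i + 1) != 0,
        }
    }

    // Yields the index of every piece the peer claims to have.
    pub fn pieces(&self) -> impl Iterator<Item = usize> + '_ {
        self.payload.iter().enumerate().flat_map(|(byte_i, &byte)| {
            (0..u8::BITS as usize).filter_map(move |bit_i| {
                let mask = 1u8.rotate_right(bit_i as u32 + 1);
                (byte & mask != 0).then_some(byte_i * (u8::BITS as usize) + bit_i)
            })
        })
    }
}
```

With payload [0b1010_1010, 0b0101_0101], has_piece is true for 0 and 15, false for 1, 7, and 8, and pieces() yields 0, 2, 4, 6, 9, 11, 13, 15 — matching the test being written here.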
What else have we got? Read — so I want to pass in that actual path: self.info.name. Use anyhow::Context; this needs to .await; and "downloaded" — I want to import that; this is an .await. All right, we're getting there. Can I cargo test now? Okay, not quite yet. "No method context found for Result" — because we need to use anyhow::Context, that's fine. All right, and I need these bits for this to be happy; this needs to do this; this needs to pull in SinkExt and StreamExt. Okay, this is in fact a function that we're going to need over here — from_payload, which is Self { payload } — so that constructor is easy. We're getting pretty close. "Can't multiply u32 by usize" — as usize, that's fine. "Can't compare" — oh right, that's fine, these can all be usize arguments. And I realize that these maybe should be u64s, because they're file offsets, and those are usually u64 rather than usize, so that if you were to compile this on a 32-bit platform you'd actually get the right behavior. We can do that later — we already have usizes in a bunch of other places — so I would do that separately. "has" — I should just say has_piece. "Cannot add u32 to u8" — that's fine. What — "cannot multiply u8 by u32"? Okay, thank you. "Expected u8, found u32" — is that because it's very confused about the types here? as u32, as u32 — I guess actually as usize, because piece_i here is usize. "Expected bool" — right, so this is going to be bool::from — "expected u32, found usize". byte here is a u8 and this is a u8, right? So — I'm pretty sure I thought bool implemented From<u8>; is that not the case? Because if so, that's kind of silly. From — okay, I want the opposite set of implementations. I guess not, huh. All right, I guess we'll cast it — fine: != 0, that's fair. bit_i is now a usize — yes, u32, that's fine, as u32. Cannot — yeah, I did do that, didn't I? See, here's what I want, okay: piece_i % u8::BITS as usize, and then I want that whole thing arguably as a u8, but
it has to be a u32, because that's the argument rotate_right takes — that's fine. payload.get — actually that's fine, so this one can just stay in usize land. Now, this one — byte_i here — why is that a reference to a u8? It's a move closure... u8, u8. I guess actually this is a usize, because I think that's what enumerate produces — oh, it's because I have these backwards; glad I checked. So enumerate produces pairs of (i, val), where i is the current index and val is the value returned — glad I checked that. "Cannot add u32 to usize" — so byte_i here is a usize; this says u32, which really means it could arguably be usize::from — as usize — and bit_i here is u32, that's fine. Okay, now can I test it? Oh — "expected a closure that returns usize, but it returns u8". If byte & mask != 0 — and in fact, if we want to be real ugly here, we do... then — because this needs to be a filter_map: if the mask result is non-zero, we want to produce piece_i; if it is zero, we want to produce None. "Hidden type captures lifetime that doesn't appear in the bounds" — yeah, because this iterator actually continues to reference the payload until the iterator has been consumed. And yes, I know I gave a talk recently where I told everyone that this is wrong; it still works, so it's fine. Okay — mismatched types in download; that's fine: downloaded bytes, todo!, and files, todo!. We're almost at the point where we can run those tests to see if I got it wrong. How about now? Torrent — "cannot move out of self", that's fine; same thing here. Great — lib; don't care about main. All right, we're getting there. So let's see this Bitfield — how's it doing? The iter test complains at line 135; the first one's good — that's always good. It gave a zero where it should have given a one. Okay, so we fucked up something in pieces: we iterate through the bytes left to right, we iterate through the bits left to right... that's because I'm stupid — the things that are yielded here
are the ones that have a one — so the output here is not the bit, it is the index of the piece. So the pieces that have ones are pieces 0, 2, 4, 6 — no, not 6 next — 9, 11, 13, and 15, and then we should get None. Sweet. Okay, so this Bitfield thing is right; that makes me happy. And so now we should also be able to say here: anyhow's ensure!(self.bitfield.has_piece(piece_i)) — so if you now try to download a piece from a peer that doesn't have that piece, we return an error. Beautiful. Okay, so that's the stuff we want to do with the Peer. And there's one more thing, which is around this — if we look back at our main, back to where we originally were: when we connect to a peer, we send it an Interested and we send it an Unchoke. So the question is, should we just do that when we first connect to the peer? Let's go look. "Downloaders generally download pieces in random order, which does a reasonably good job of keeping them from having a strict subset or superset of the pieces of any of their peers." "Choking is done for several reasons. TCP congestion control behaves very poorly when sending over many connections at once. Also, choking lets each peer use a tit-for-tat-ish algorithm to ensure they get a consistent download rate." "The choking algorithm described below is the currently deployed one. It is very important that all new algorithms work well both in a network consisting entirely of themselves and in a network consisting mostly of this one." ...unchoking the four peers which it has the best download rates from and which are interested; peers which have a better upload rate but aren't interested get unchoked, and if they become interested, the worst uploader gets choked. Okay — so this whole algorithm here is about how you choose whom to choke. So the question then becomes: how do we want to represent this? I see — Interested. So what's the actual meaning of interested here? "Connections contain two bits of state on either end:"
choked or not, and interested or not. "Choking is a notification that no data will be sent until unchoking happens. The reasoning and common techniques behind choking are explained later. Data transfer takes place whenever one side is interested and the other side is not choking." Okay — so I think what this means is that we can always send Unchoke, because what we're saying with Unchoke is — and this is going to be bad for us, but it's still okay for us to do it — we are willing to send you things. That's what an Unchoke does. Which doesn't matter, because we don't have seeding implemented at the moment, so we'll send an Unchoke saying we're willing to send you things. Interested we should only send if we actually want data transfer: "one side is interested and the other side is not choking". "Interested must be kept up to date at all times — whenever a downloader doesn't have something they currently would ask a peer for if unchoked, they must express lack of interest, despite being choked." Okay, so it sounds like what we want to do here is: on a given peer connection, if there's something that we want from that peer, and they're willing to give it to us, then we should mark ourselves as interested on that connection — which means we shouldn't mark ourselves interested unless we're willing to download something from them. Okay, I think I know what I want to do here. But I'm going to write the code starting slightly from the other end. So we're going to go back to download here, and what we want download to do is figure out which pieces to download next, and then figure out how to get each such piece — by marking all the peers that have that piece as "we're interested", and the moment they send us an unchoke, setting up the download from that peer. It'll be clearer in code than in words, I think. So this is in the all function, right — this is assuming we're going to download everything. We get the peer
info, and what we'll want to do is dig out all of the pieces and decide which piece to download next. And I think the way we want to decide that is: we're going to keep a sort of need_pieces, and I think this is going to be a BinaryHeap, so that we can prioritize which pieces we try to download next. And initially, what we're going to do is: for piece in — this is the logic in main. And I guess, do we already have a Piece? No, we don't — great. So I want to go to lib, and I want to create a Piece thing: a pub struct Piece. It has a list of peers — and what are the peers indexed by? The peers are — I think it's by their peer ID... yeah, so the peer list is just the vector of addresses, right. So peers — these are the peers that have this piece. And it has i, and it has length — I know this should be u64, I'm ignoring that for now — and it has hash. And then, I think what we'll want to do is derive Debug, PartialEq, Eq, Ord, or PartialOrd — no, actually, I don't want to derive those. I want to implement PartialOrd for Piece — in fact, I want to implement Ord for Piece. And the reason I want to do this is because we're going to keep a heap of which pieces we're going to download next, and I think we want it so that you generally pick random pieces, but you pick the pieces with the fewest number of peers first. And this is sort of a distributed-systems-thinking kind of thing: if few people have a piece, then you should add to the list of people who have it, by downloading it yourself and then sharing it sooner — because the fewer the peers, the higher the risk that that piece essentially goes missing. So you should participate in the network and sort of do good here. In which case — what is BinaryHeap? BinaryHeap is a max-heap, so by default, the next thing you get out of the heap is whichever value has the highest ordering — the
greatest ordering. But when you derive PartialOrd and Ord, it orders the fields in declaration order, which is not actually what we want here. I think the order we want is: self.peers.len() compared with other.peers.len(). Then, if there's the same number of peers, I kind of want us to give a random ordering — but that's not a thing Rust really likes for you to do, because if you order randomly every time, you get into this really weird situation where something might sort an array and assume that the comparison function is deterministic, because it might compare the same element multiple times. So if you sort randomly, a bunch of algorithms just aren't going to work anymore. So I almost wonder whether we want a seed here — a random number, let's say a u64 — which we're going to use to ensure that we randomly select among pieces if they have the same number of peers. And this is basically to distribute load: if everyone chose to order by, let's say, number of peers and then by the hash — if every implementation did that — then everyone would choose to download the same piece next, and you'd end up with unnecessary contention in the network. So I think what we want to do here is then self.seed.cmp(&other.seed). And then, you know, we can compare the remaining fields — they basically aren't interesting anymore, because the seed is probably going to be different each time — so we could do hash, length, peers; it doesn't really matter. And then we can also implement PartialOrd, which we can trivially implement by saying Some(self.cmp(other)) — anything that's Ord is trivially also PartialOrd with the same ordering function. Okay, so what we now want to do is something like impl Piece: a pub(crate) fn new, taking piece_i as a usize, and — I guess this is really going to take the torrent... no,
it's going to take a reference to the peer info and a reference to the torrent — which is the tracker response — and it's going to return Self. "What is the difference between Ord and PartialOrd?" PartialOrd is a partial ordering, so with PartialOrd you're allowed to say that some elements just cannot be compared to each other — like, neither is greater than the other. And Ord is a total ordering, so every element has a well-defined ordering with respect to every other element. For example, integers implement Ord, because every number can be compared to every other number, and it definitely produces a greater-than, less-than, or equal. A partial ordering is something like logical timestamps: if I do a thing, and then I do thing number two, then thing number two happened after thing number one — so thing number two, in some sense, is greater than thing number one. But if you and I do two things concurrently, without talking to each other, then those two events, time-wise, are not ordered relative to each other: your event is not greater than my event, and the timestamp of your event is not greater than that of mine, and vice versa. Logical time — there's a very brief description of partial ordering versus total ordering. Okay, so the things we're going to have to pull out of here in order to construct one of these Pieces: the piece hash, the piece size — the length here is t.length... oops, self. So: piece_i we have, length is piece_size, hash is piece_hash — this doesn't need to be a reference. seed is going to be a random number, and here we could use something like — what's fastrand? Is that the — oops, nope. fastrand, I think it's called — no, that's not the one. rand — this one, per recent downloads... or maybe it is fastrand. I specifically want something that doesn't have a bunch of dependencies. Yeah, that's fine. I thought there was another one too, like the one used
for — yeah. What's the one quickcheck uses? Doesn't really matter, I suppose; that's fine, I'll take fastrand. cargo add fastrand. So seed is going to be fastrand's u64 over the full range. And the peers — peers here is going to be peers.0.iter().map — no — .filter_map, in fact .enumerate().filter_map(|(peer_i, peer)| ...). We're going to filter by whether the peer has_piece(piece_i), then Some(peer)... Bitfields — oh, this is where this gets awkward: we don't know which peers have which pieces until we start talking to them. Which means it's hard for us to tell in advance how many peers have a given piece, because the only way we can know that is by connecting to them — and it might even change over time, right? Peers might gain pieces as they themselves download things. So this raises an interesting question — the peer list. I think what we need to do is pick a random set of peers that we initially connect to. And that's going to be something like peer_info.peers.0.iter().take(3) — this is basically how many peers we're willing to connect to at once; let's say five. Map — no, I think I actually want this to be a Vec::new, and then: for peer in ..., peer_list.push(Peer::new(peer)) — I guess this is technically an addr, which might make our life a little easier — .await, with context "connect to peer". Actually, we can do even better than this: we can match on this, because if we fail to connect to a peer, we don't actually want to exit the program, right? And if we get an error, then we can just log "failed to connect to peer {peer_addr}". And this is realistically something where we would probably let users indicate the setting — so if peer_list
.len() is greater than or equal to five — TODO: user config — then break. Import Peer. And we're also supposed to send in the info hash, peer_info.info_hash, so we can do that here too. And I think Peer::new probably only needs a reference to it... I guess not, that's fine, it can take it. And this can take the actual address. So — the problem with this is we're not actually connecting to these in parallel. TODO: in parallel. In practice, I guess we can do this a little better, because we're already in an async context, so we could do peer_info.peers... Where it gets awkward with doing this concurrently is that you don't know when you've had enough, right? So if you do this concurrently, I guess you might just drop the extra connections, which would be okay. That's fine — so here's what we'll do. I think in Tokio there's a — where is this thing — it's in task: JoinSet? No, a JoinSet — yes, you can stick a bunch of futures in there and they all get run. The thing here, though, is we want to limit the concurrency, so that we don't simultaneously try to connect to all of the hosts and just pick whichever first five respond — because that is probably going to get us banned in a bunch of places, right? I don't know whether this has a limit, like a max-concurrency kind of thing — I don't think it does. So we might actually have to use — in futures_util there's a tool for this, which has its own set of problems, but we can use FuturesUnordered here — no, not FuturesUnordered, sorry — where do we have it — StreamExt. So if you have a Stream, which is arguably just an iterator, you can do — where are you — for_each_concurrent is not the one I want; I want buffer_unordered: "an adapter for creating a buffered list of pending futures". Actually — yeah: "the returned stream will be a stream of each future's output. No more than n futures will be buffered at any point in time, and less than n may also be buffered depending
So this lets us do basically what we want. And did I — I already have futures-util here? I do, yes. Okay, so in that case, if I just pull in futures_util::stream::StreamExt, then I should now be able to do .iter — and then we're going to have to do futures_util::stream::iter to turn the iterator into a stream — and then we map each peer addr into a Peer::new, so it becomes a future, and then we buffer_unordered(5) — user config again. And then what I want here is let mut — so that's the stream — and then while let Some(peer) = peers.next().await. I guess I could do copied here, but it's fine. So what this will do is: we're creating a stream over all of the addresses. For each one, when it gets pulled into the buffer_unordered — so we construct a future for each one, but constructing the future does nothing, because this just constructs a TcpStream::connect. We could enforce that it really does nothing by making it an async block — that's going to guarantee that this future does nothing until the first time it's polled. And we say that at most five of these futures should be running at any given point in time. We could make that less, too, so that we don't connect to too many more than we need, but we can keep it at five, that's fine. Then we're just going to read the outcomes of the stream, which are going to be whichever futures complete first — which is whichever peers we connect to first — and we match on that connection result. If we fail to connect to one, that's fine, we don't really care. We could also say here: let peer = ..., and return (peer_addr, peer), so that we can give error messages about which peer we actually failed to connect to, and then we can print e, the actual error we got. If we got a peer out of there that we successfully connected to, we add it to the list; if the list is long enough, then we break. And when we break here, we also drop the peers list of futures, so that all of those pending connections are just thrown away. So now we have a peer list — an actual peer list with open connections — and now we can use the piece thing and pass in the list of peers. So this will not actually be the tracker response; it will be a list of peers. And in fact, if we go to Peer here, I think this has_piece thing is actually something that can be a function on the whole Peer and not just on the bitfield subfield — which is not public. Right: if you have a Peer, you can ask whether it has a given piece. Great. So now, when we construct a Piece, we can have it know how many peers have it, and that can go here — expected Vec capacity, right, so this needs to .collect(). Excellent. So if we now go back to download, now we can do need_pieces.extend — actually, I guess what we really want to do now is the same thing we did in main, which is figure out how many pieces there are. Actually, we know how many pieces there are, so for piece_i in that range, piece is now going to be Piece::new(piece_i, &t, &peers) — oh, peer_list; I guess here we could also do peers = peer_list. So many ways to spell "peers". And then — we could even do here: if piece... Well, what's tricky, too, is that if something has no peers, then we don't have a way to download it, and the only way to get it is to connect to more peers, which is a little bit awkward. So we're going to want something here that sort of, in the background, maybe randomly reaches out to new peers. But okay: in theory we'll do need_pieces.push(piece). And now that we've added all of these, what I was thinking is that we kind of want to track pieces that have no peers.
Something like no_peers — and I guess this is something we can do on Piece here, something like pub(crate) fn has... So we could do here: if piece.peers().is_empty(), then no_peers.push(piece); else, need_pieces.push(piece). So what we could do is stick this whole thing in a while loop — or in fact it would have to be this whole thing out here — we basically need something to make sure that we have at least one peer for every piece. But I'm going to skip over that for now, because I want to get to something more complete, and then we can refactor it to take that case into account. So for now we're just going to assert that no_peers is empty, and this is obviously a giant TODO. And then what we'll do is: while let Some(next_piece) = need_pieces.pop(). So at this point we now need to figure out where we download each blob — or each block, sorry — from. So it's going to be this bit — piece dot — "piece" and "peers" are messing me up real bad. So here's another helper I want: length, which is going to tell me piece.length. Oh, where did we stick BLOCK_MAX? It's in peer — why is it in peer? I think this actually goes in lib, that's what I think. So here we're going to import crate::BLOCK_MAX, and then we can do the same thing in download — BLOCK_MAX comes from the crate root. Great. And the piece size — yeah, we can track that in here: let piece_size = piece.length(). So, you know, every piece consists of a bunch of blocks, and here we have a list of all of the blocks, and we also know all of the peers we could possibly iterate over here, right? So: let mut peers — actually, I don't need it to be mutable — I just want to do piece.peers().iter().map — so this is the peer_i, right, and what I want to grab out is a mutable reference to — and it's not going to like this at all. What I really want to do, right, is peers[peer_i] — and this is going to complain about multiple borrows. It doesn't yet, because there are other errors, but it's going to complain about the fact that I'm borrowing peers multiple times, because it doesn't know that the peer_is are not overlapping. There are a couple of ways to deal with that; the easiest one, I think, is actually for Piece to hold a HashSet of peer indices — and you'll see why in a second. So if this holds a HashSet of — what's it complaining about — "HashSet<usize> is not an iterator"? I mean, I agree with that. Wait, what does HashSet not implement — does HashSet not implement Ord? Why can you not order HashSets? That just feels like an unnecessary restriction. Can I compare their iterators? I can — okay, great, that's good enough for me; we're basically never going to get to that part of the comparison anyway. I guess it could be a BTreeSet instead of a HashSet. It's complaining a bunch in download — that's fine, because now what we can do is instead say peers.iter_mut(), which gives us a mutable reference to each element, then .enumerate().filter_map — so this has an i, and then it has a mutable reference to the peer — and now we can do: if piece.peers().contains(&i), then Some(peer), then .collect(). So what this is doing: iter_mut on vectors knows that it's allowed to yield mutable references to all the elements separately, because it knows that everything it yields is an independent element — it's okay to give out a mutable reference to the first element and to the second element at the same time, which is effectively what the iterator ends up doing. And so then we can filter that iterator down to only the peer_is that appear in the piece's peer set. And the reason I wanted this to be a HashSet instead of a Vec is that otherwise we'd need to search the
vector every time for the element, whereas with a hash set we can just do contains. Oh — it's because iteration isn't defined. You're right: iteration is random for HashSets. Which, now that I think about it, might actually be the only randomness we need, right? That just means we can order by that, and we don't need the seed, because iteration order of hash sets is random — "HashSet to avoid deterministic contention." So that way we no longer need the seed here, which means we no longer need fastrand. Nice, I'll take it. So now we have a mutable reference to each of the peers that have this particular piece, and then what we want to do is download all of the blocks. And I want to map block_i here, and then map that again — I guess I can just map it once. So we're trying to download all of the blocks, and what we actually want to do is not quite this. This is where the unchoking comes in: we want all of the peers that have this piece to mark themselves as interested and request some subset of the blocks, but if a peer is choked, then we want another peer to take over that block instead. So one way we could do this, right, is we pick a random peer — rand's choose or whatever — from peers, send the request to that peer, and do all of these in parallel. But I don't think that's actually quite what we want to do. Instead, we kind of want these Peer instances to cooperate, right? Because they need to keep track of who's responsible for getting which block. So in a sense it is like a work-stealing pool: we basically enqueue all of the blocks, and then whenever a peer gets unchoked, it takes the next block and tries to download it. I think I'm liking that. I think I'm liking this.
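Backing up to the borrow issue above, the iter_mut plus filter_map trick can be made concrete with std alone; Peer is reduced to a String here, and peers_with_piece is a hypothetical helper name:

```rust
use std::collections::HashSet;

// Yield mutable references to exactly the peers whose index is in `has`.
// iter_mut proves to the borrow checker that the references are disjoint;
// indexing `peers[i]` in a loop would instead borrow the whole slice
// mutably each time and be rejected.
fn peers_with_piece<'a>(
    peers: &'a mut [String],
    has: &HashSet<usize>,
) -> Vec<&'a mut String> {
    peers
        .iter_mut()
        .enumerate()
        .filter_map(|(i, peer)| has.contains(&i).then_some(peer))
        .collect()
}

fn main() {
    let mut peers = vec!["a".to_string(), "b".to_string(), "c".to_string()];
    // Indices of the peers that (say) have this piece.
    let has: HashSet<usize> = [0, 2].into_iter().collect();
    for peer in peers_with_piece(&mut peers, &has) {
        peer.push('!'); // mutate through each disjoint &mut simultaneously
    }
    assert_eq!(peers, vec!["a!", "b", "c!"]);
}
```

HashSet::contains is O(1), which is why the piece holds a set of indices rather than a Vec it would have to scan.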
We would create a work queue — and a work queue in this case is really just a channel. So we want an async channel here, and I think we want a work-stealing channel, which really means we want an mpmc channel. Which channel do I want to use for this? I like thingbuf, which I think gives you mpmc — ah, but I think thingbuf does not give you an async version of this. There's an mpsc channel built on top of it, but that's not quite what we want either, because what I really want here is an async mpmc channel. Let's see what we get. async-channel — well, this is a smol-based one, which I don't want; I'm on Tokio. flume — yeah, we'd use its async API, which I don't really want to use. kanal — higher is better; kanal: async, mpmc, big. All right, sure, why not — gotta start somewhere. What I actually want is a single-producer one — actually, no, I need multi-producer. The reason I'm hesitating here is: imagine that you have two peers, and we have a channel where we send all of the block indices — all the block_is — shared between these two peers. Imagine one of the peers is really slow and the other one's really fast. They both initially grab a thing from the queue and request it, and then the fast one completes, takes another from the queue, completes, takes another one, until eventually the fast one is sitting idle and the slow one is still sitting on that one block it initially took. What would be really nice is if there was a way for the fast one to steal the block from the slow one — or, alternatively, for the slow one to time out, and then basically deregister itself: if it times out, stick the block id it failed to get back onto the queue and remove itself as a receiver. That's what I think I want — in which case you need an mpmc channel, because you need a receiver to be able to send back to the others. Is Tokio's broadcast good enough for this? "Sends many values from many producers to many consumers; each consumer will receive each value" — that's the problem: when you send one thing, it's received by everyone, which is not actually what I want. I want: when you call receive, only you get that one item, because you are responsible for downloading it. So it's not quite the same. I know there's also — no, it's none of these — in crossbeam there's a work-stealing deque, but it's not async. All right, kanal, see what you can do. So here's what I want: a tx and an rx — let's call them submit and tasks — from kanal. I think this can actually be bounded — bounded_async — because we know how many blocks there are. Great. And then here's what I want to do: a JoinSet here. If you remember from the Tokio docs, JoinSet is under task: "a way to await the completion of some or all of the tasks in the set; the set is not ordered, and the tasks will be returned in the order that they complete." Now, we don't actually care about the completion results for these. Instead, here's what I'm thinking — how do I connect into the — okay, I create the set first: JoinSet::new — why is this not giving me completions anymore — there we go. For peer in peers, I want to do something like participate, and I'm going to give it submit and tasks, and I want join_set.spawn(async move) — in fact, I don't think the move here is going to matter — so, join_set.spawn. I want to spawn all of the peers participating in this, and then, for each block, I'm going to basically enqueue a job: submit.send(block).await.
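The property being reached for — each submitted block index goes to exactly one receiver, unlike broadcast — can be sketched without any async machinery, using a mutex-guarded queue as a conceptual stand-in for the kanal mpmc channel:

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

// Fan `blocks` work items out to `workers` threads through one shared
// queue. Each pop hands a block to exactly one worker -- the mpmc receive
// semantics we want, as opposed to broadcast, where every receiver would
// see every value.
fn run_queue(blocks: usize, workers: usize) -> Vec<usize> {
    let queue: Arc<Mutex<VecDeque<usize>>> =
        Arc::new(Mutex::new((0..blocks).collect()));
    let done: Arc<Mutex<Vec<usize>>> = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let queue = Arc::clone(&queue);
            let done = Arc::clone(&done);
            thread::spawn(move || {
                // Drain the queue; a real peer would also requeue a block
                // here if its download timed out or it got choked.
                while let Some(block) = queue.lock().unwrap().pop_front() {
                    done.lock().unwrap().push(block); // "downloaded"
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }

    let mut got = done.lock().unwrap().clone();
    got.sort();
    got
}

fn main() {
    // 100 blocks, 4 cooperating workers: each block downloaded exactly once.
    assert_eq!(run_queue(100, 4), (0..100).collect::<Vec<_>>());
}
```

A fast worker naturally drains more of the queue than a slow one, which is the load-balancing behavior the stream is after.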
And then this is what the receiver is going to do — something like, so, on the peer side — and I guess I can generate this from here — apparently not. pub(crate) async fn participate takes a mutable reference to self, a submit, which is a kanal::AsyncSender<usize>, and a tasks, which is a kanal::AsyncReceiver<usize>. Inside of there, it's going to do something like: while let Ok(block) = tasks.recv().await — right, so it's going to continuously receive tasks to download from this channel, then construct a request, send that request on its own stream, wait for the response, piece it together, and then do something with the result. And what it does with the result I haven't quite decided yet — it might just be a Vec of these full responses. In fact — in fact, I think I want another one, which is finish and done. This is going to be a finish channel here, which takes a piece, right? So the idea is that if you actually get this — this assert is just used for sanity checking, so that's fine — then you send the piece over finish, and I think what we'll do is probably .await this. So here we should have a timeout, and return the block to submit if we timed out. And then, on the receiving end: we're going to send out all the blocks, and then we're going to have a loop here, which is going to be a tokio::select over join_set.join_next() and done.recv(). And done can actually be a Tokio mpsc channel — tokio::sync::mpsc — so we can go up here and make finish be a tokio::sync::mpsc::Sender. What is the type of piece here? I'm confused — the thing that we get back from next here — something's lying. Right, so this is going to need piece_i and nblocks, that's fine; piece_size — okay, it also needs the piece size, that's fine. Okay — this is not important, I want to get to the good stuff. This is self.stream, and this is self.stream. And now, what's the type of piece? It's a Message — Message is what I want to send here. And I suppose this can return anyhow::Result, so if one of these fails — like if a peer completely fails in the middle of downloading blocks from it — we want to, like, eliminate it from the set of peers. So that's fine. If we get down here, then — okay, great, this one's now happy. tasks.recv() returns a Result<usize> — okay, that's fine. We don't currently use submit, because of this TODO, but the interesting part here should be — what do you mean, you can't infer the type — oh, peers here can be — it's fine, doesn't really matter. And this now needs piece_i, piece_size, and nblocks — piece_i, piece_size, and nblocks — I guess piece, that's fine. Oh — I guess we actually need to show which piece this is: so, length, and index, which is self.piece_i. Next — okay. So we're going to make all of the peers participate by running this loop, then we're going to send all of the blocks in as jobs, and then we're going to observe both the done list and the join set. The reason we want to watch the join set is: if a participant ends early, it's either slow or it failed. And here — you know, I guess this is like "message" — we keep track of the bytes in the message, and where this now gets interesting is: all_blocks is going to be a vec![0u8; piece_size]. And if we again go back to our main, right, we have this code where we figure out the begin and the end — so here we want to do this. All right, I guess piece is fine here — and this is actually a crate::peer::Piece, just to be clear. And so what we should now be able to do is: we find the block of bytes in here, and we should then be able to do all_blocks, sliced from piece.begin() onwards, copy_from_slice(piece.block()), and then we can get rid of the asserts. "[u8] cannot be indexed by RangeFrom" — oh: begin as u32, as usize. "next does not exist for JoinSet" — what is it, then? It's called join_next.
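The assembly step — blocks arrive in completion order, not index order, and each is copied into the right offset of a preallocated buffer — looks roughly like this (Block is a stand-in for the relevant fields of the real piece message):

```rust
// Stand-in for the fields of a received piece message.
struct Block {
    begin: u32,    // byte offset of this block within the piece
    data: Vec<u8>, // the block's bytes
}

fn assemble(piece_size: usize, blocks: Vec<Block>) -> Vec<u8> {
    let mut all_blocks = vec![0u8; piece_size];
    let mut bytes_received = 0;
    for b in blocks {
        // Copy the block into its slot; blocks may arrive in any order.
        all_blocks[b.begin as usize..][..b.data.len()]
            .copy_from_slice(&b.data);
        bytes_received += b.data.len();
    }
    // Mirrors the bytes_received == piece_size check in the select loop.
    assert_eq!(bytes_received, piece_size, "a block is missing or duplicated");
    all_blocks
}

fn main() {
    // Two out-of-order blocks making up a 6-byte piece.
    let piece = assemble(
        6,
        vec![
            Block { begin: 3, data: vec![4, 5, 6] },
            Block { begin: 0, data: vec![1, 2, 3] },
        ],
    );
    assert_eq!(piece, vec![1, 2, 3, 4, 5, 6]);
}
```

After assembly, the SHA-1 of all_blocks gets compared against the 20-byte hash from the torrent's piece list.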
And it's an Option of a Result. So if this is None, then that means there are no peers. If this is Some(Ok(...)), it means the peer gave up because it timed out. And if it's Some(Err(...)), the peer failed and should be removed later, right? So those are the cases for this one. recv just gives back an Option, so that's perfect: if let Some(piece) = piece, and the else arm means we've received every piece, or there are no peers left. So let's see — once we've told everyone to participate, we should drop our finish handle, and we should also drop our submit after we've sent everything, right, to indicate that there are no more inputs of this kind. "Any code following this expression is unreachable" — why? Oh, right. So if we get to this — let's break. Technically we're going to need to do a little bit more than that, but at the end of this we should now end up with all of the blocks having been filled into this vector, and so now we should be able to compare the SHA-1. What's the stuff we need to implement to get that? We need this, and this should match the piece hash, which I thought we got from — right, so in Piece we have the index, and we also have the [u8; 20] hash, and so at the end we should make sure this matches piece.hash(). Great. all_blocks does not need to be set up here. "Variable peers does not need to be mutable" — sure, we can fix that. "peers was mutably borrowed" — "returning this value requires that peer is borrowed for 'static" — why? Oh. This is because, when you iterate over the elements of a vector, in theory every element you yield is independent, right? They're independent in the sense that they point to distinct elements in the vector, and it's okay for you to have a mutable reference to the first element and the second element at the same time, because they're non-overlapping — they're non-aliased. And in theory this should be fine for iterators, too, because the iterator should be able to yield elements that are tied to the lifetime of the vector rather than the lifetime of the iterator — but the Iterator trait was stabilized long before generic associated types. Which means, if we now look at Vec — the slice IterMut — so iter_mut has a mutable reference to the overall slice — and it's going to be — interesting, actually — no, it should yield elements borrowing into the vector, so why is it unhappy about this? Basically what I'm claiming, right, is that if I have let mut v = vec![0, 1], and then I do let mut it = v.iter_mut(), then zero = it.next().unwrap(), one = it.next().unwrap(), then I should be able to do whatever I want here — for example, *zero = 1 and *one = 0, just to use both of them. Yeah, and that's okay. So why is this not okay? I wonder whether this requires 'static — that's why! JoinSet requires that the future you pass in is 'static, and participate takes a mutable reference to the peer, which we get from up here, and the mutable reference to the peer is therefore required to be 'static in order for the future that gets passed to spawn to be 'static — which then requires the borrows from peers to be 'static, which would require peers itself to be 'static, which it is not. So that's what it's actually complaining about here. I think we can use a LocalSet instead — in Tokio there's a JoinSet, but there's also a LocalSet, "a set of tasks which are executed on the same thread." Yeah, but does it require 'static, though? That's what I want to know. It still requires 'static. So we either need to move the peer into the future — the participate future would consume the peer and then, like, return the peer at the end — or we would need this to just kind of work. I thought there was a way to have one of these that's not 'static, because really what we're doing here is just a join. But I can't use the actual join macro, because that requires that you enumerate all the branches statically.
I basically want, like, a dynamic join, right? Which is what I was hoping JoinSet would give me. But I think JoinSet actually spawns the tasks, which is the problem, because the moment you spawn them, they do need to be 'static. And I thought LocalSet allowed you to do this, but for some reason it doesn't require Send but still requires 'static. Yeah — I mean, in futures-util there is — where are you — is it under future — join_all, which does not require 'static. And we're actually okay with using FuturesUnordered instead — it just makes me sad, is all, but fine. So this is going to be futures_util::stream::FuturesUnordered::new(), participants.push(...), participants.next() — okay, things are happier now, that's good. So: submit.clone() here, tasks.clone(), and finish.clone() — that's easy. This should be done — yes. Sending the blocks: unwrap — and drop tasks — no: expect "all peers already exited", which actually means we could even do this up here — expect "bound equals block count; channel holds all the items." Right: so we queue all the work, we make all the peers start their sort of work-stealing, collaborative journey — this is basically cooperative multitasking, right — and then we watch for things to finish. If we get a piece, then we're happy. If we've received every piece — I suppose what we could do here, too, is let mut bytes_received = 0, and then bytes_received += piece.block().len(), and if we get here we should assert_eq!(bytes_received, piece_size). And in fact — okay, so we're actually in a slightly awkward place here: we have to be a little bit careful around cancellation. Let's say that all the pieces are done here. Now, if all the pieces are done, then we're fine. The thing I'm concerned about is that we drop the future of a given participant while it is still active. So, for example, let's imagine that I just had a bug here, so I broke the moment we get any piece. What will happen is that we're going to drop participants without letting all of those futures finish. If we drop participants, that means we're going to drop this entire async block, right? This is a future, and we would drop that future mid-execution — which means we might drop it here, for instance: we've sent a message, we haven't received the response. So imagine that we drop the future right there, and then we go through another round, and now we're downloading a different piece, and we try to use the same peer. Well, it hasn't read that response out yet, so when it starts, it's going to send another message to request something — but then, when it goes to read, it's going to read the message that it didn't get around to reading last time, because it was cancelled. So that's the thing we have to be careful about here. What we really want to do is make sure that we always wait for all of the futures of all of the participants. And how do we want to do that? We have to think about the same safety here around dropping the future that we get back from participants.next() — but I think that's okay, because dropping the future from next does not drop the participant futures; it just drops the future that looks for the next participant that's ready. participants itself is not dropped, because this basically holds on to a mutable reference to participants, so that part is fine. So I think what we want to do here is: if !done.closed(), maybe — oh, where are the kanal docs — if I go here to AsyncReceiver, what do I have available to me? is_terminated — oh, right, this is actually a Tokio channel, an mpsc Receiver. Wow, there is no is_closed; that makes me sad. It's actually okay for us to break here, though — we could just repeat this loop underneath — but I would rather not do that if I can avoid it. No, there's a closed method, but that's not what I want.
I want is_closed: I want to check whether it's closed, I don't want to actively close it. But I think, actually — if we've received all of the done events, that has to mean that all of the participants are either — so, one of these triggers that, right? If we reach the end of done, either all of the participants are gone — all of the send handles have been dropped — or there are still some peers that are active and just waiting for more work, which they'll never get. And if a future is stuck there, then it's totally fine for us to cancel it, because the connection is in a clean state. So I actually think it's okay to break here: "this must mean that all participations have either exited or are waiting for more work; in either case, it is okay to drop all the participant futures." Great — so breaking here is fine, in which case we're going to drop participants. If a peer gave up, then that's fine — we don't actually need to do anything with that. In theory we could imagine: okay, this peer is probably slow, so we shouldn't ever try it again. But the fact is that it's no longer reading from the task queue, so we're kind of fine. The one thing to watch out for here is that we might be in this case — so this is really: if bytes_received == piece_size, then great, we got all the bytes; else, all the peers quit on us. So we don't actually have to handle this case here, because either there are still more peers, in which case they're going to continue to handle the traffic, or there are no more peers, in which case we'll still get into this case and just fall into the else branch down here. Nothing to do, except maybe: TODO, deprioritize this peer for later. If we learn that there are no participants left, this must mean we are about to get
None from done.recv(), so we'll handle it there, right? Because the transmitter for done is finish; finish is cloned into every participant, and then we drop our copy here. So when there are no more participants, that means there are no more send handles for finish, which means done will return None. So there's nothing to do in this clause, because it's the same as this clause — it's just racy which one we hit first. And I guess we only need to think about this branch if participants.is_empty(). Okay, so the last case is this: the peer failed. It already isn't participating in this piece anymore, so this is more of an indicator that we shouldn't try this peer again and should remove it from the global peer list — TODO. And again, if this causes us to have no peers left for this piece, it'll be handled by that case anyway. So these branches are really more about peers that we should now think about removing, rather than us actually needing to do something different from that branch. Great — so this is the only break case. And so what we'll do here is: if bytes_received is right, fine; otherwise, there are two things we could do. We can do the simple thing for now, which is: if we didn't get all the bytes because all the peers disconnected on us, we just give up — we just return an error. The other — realistically, what we would need to do, right, is — okay, we did connect to all of the peers. Even this is arguably a little overzealous, because — imagine all of the peers have a particular piece; we still don't want to download it from all of them. I guess we're already limiting how many peers we're connected to anyway, so I don't think we needed the take here. But what that means is: if it turns out that none of the peers that we originally connected to are available to us anymore — none of them are able to give us this piece — then the only thing we can do is connect to more peers, which sort of happens outside of this, right? It happens all the way up here, so we need to feed more peers into this in the first place. I think we're going to need a sort of data structure outside of this that lets us continuously populate more peers, and the way we might do that is — maybe we could actually do this with the tower Service trait and its load balancing; I don't have to think about that one now. But ultimately, recovering from this case is actually kind of complicated: we'd need to connect to more peers, make sure those additional peers also have this piece, and then download the pieces we didn't get from them — which also means we don't want to re-download the parts that we have now successfully downloaded, because it might be that, before we lost the last peer, we downloaded all but one of the blocks. So we actually want to be a little bit smart about how we do that recovery. Yeah — we'll have to think about that. So in this case, I think what we'll do is bail here and say "no peers left to get piece {piece_i}". And — the nice thing we can do now, now that we have this loop — I think we can expect here, "receiver should not go away while there are active peers and missing blocks" — is that here we can actually correctly deal with choking, too. Because down here, I suppose, right — what we kind of want to do is send Interested when we're asked to participate; this was previously in main, right? So we're going to send an Interested message, and now the thing that we're going to need to keep track of is whether we're choked — that's going to be a boolean. Initially, choked is true, so we can't expect to get one of those. So initially we're going to assume that we're choked, and initially we're not going to say that we're interested in anything. Then, when we're asked to participate, we're going to send: okay, we're interested now. And we actually need to be a little smart here, which is something like — loop; this has to become a loop. We want: if self.choked, then we need to wait for an Unchoke message, and I guess here we can match on the tag: if it's Unchoke, then self.choked = false, and then we're allowed to keep going. If it's anything else, that doesn't really help us — it means we're still choked, so we can't really send any other messages. There are other messages here that you can imagine we'd have to handle, but Unchoke is really the thing we're waiting for; if we get anything else, I think we can just ignore it, because we know we're not in a request, and if the other side is interested in something, we're not going to send them anything anyway. So I think Unchoke is really the only message that matters. We could look at the spec here: we could get a Choke, we could get an Unchoke — but we're in the case where we're already choked, so we shouldn't be getting another Choke message. We could get a Choke down here, though. So I guess here we should really match on piece.tag(): if the message tag is Choke, then we should set self.choked = true, and we should submit.send(block) — we should make someone else take this block instead, because we're choked — and then we continue with the outer loop. If, on the other hand, we get MessageTag::Piece, then we're fine — we fall through. And if we get anything else — I don't think we should expect anything else — right, so here, let okay block — if we get anything else, we're going to break. So: at the beginning of the loop, we make sure that we're not choked; if we are choked, then we wait until we are unchoked. And is there anything else we could get?
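A condensed, synchronous sketch of that per-block state handling — wait out chokes, hand the block back on a mid-request Choke, accept on Piece. MessageTag mirrors the real message tags, the message stream is a slice, and the channel send is stubbed with a return value, so this is shape only:

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum MessageTag {
    Choke,
    Unchoke,
    Have,
    Piece,
}

// Outcome of trying to download one block given a stream of peer messages.
#[derive(PartialEq, Debug)]
enum Outcome {
    Got,      // received the block
    Requeued, // got choked mid-request; block goes back on the queue
}

fn handle_block(choked: &mut bool, messages: &[MessageTag]) -> Outcome {
    let mut msgs = messages.iter().copied();
    loop {
        // Wait until we're unchoked before (re)issuing the request.
        while *choked {
            match msgs.next() {
                Some(MessageTag::Unchoke) => *choked = false,
                Some(_) => continue, // Have etc.: still choked, keep waiting
                None => panic!("peer closed connection"),
            }
        }
        // (real code: send the Request message here, then read the reply)
        match msgs.next() {
            Some(MessageTag::Choke) => {
                // Choked mid-request: let another peer take this block.
                *choked = true;
                return Outcome::Requeued;
            }
            Some(MessageTag::Piece) => return Outcome::Got,
            Some(_) => continue, // Have and friends: ignore, read on
            None => panic!("peer closed connection"),
        }
    }
}

fn main() {
    let mut choked = true;
    // Unchoked, then choked mid-request: the block must be requeued.
    assert_eq!(
        handle_block(&mut choked, &[MessageTag::Unchoke, MessageTag::Choke]),
        Outcome::Requeued
    );
    // Unchoked again, Have is ignored, then the block arrives.
    assert_eq!(
        handle_block(
            &mut choked,
            &[MessageTag::Unchoke, MessageTag::Have, MessageTag::Piece]
        ),
        Outcome::Got
    );
}
```

In the real participate loop, Requeued corresponds to submit.send(block) followed by continuing the outer loop, so another unchoked peer picks the block up.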
Interested: we don't care. NotInterested: we don't care. Have: Have means the peer should now be eligible for more pieces, so that's something we might want to handle in the future. So this is like a TODO: update bitfield; and TODO: update the list of peers, or add to the list of peers for the relevant piece. But it still doesn't let us break from this loop, because we're still choked. Request we're just ignoring, right, we're not allowing requests for now. Piece we shouldn't get, because we haven't requested anything, and Cancel we also shouldn't get, because we're not allowing requests. Actually, I think we might be able to get a Piece; I'll show you in a second. So the moment we break out of this loop, it means we're no longer choked, which means we can take a task off the queue. If there are no tasks on the queue, then we can break, right: it means there's no more work. And then we do the whole thing to send the request, and then we see what we get next. So if we get a Choke, then we have to continue, which is going to be continue 'task. If we get a Piece, then we can break, because we're happy. And I think it's impossible for this to have gone away; we still have a receiver, so it's not possible for this to fail. If we get a Piece, then we're good. What else can we get after sending a request? I don't think we can get anything else, because we know we're unchoked. I guess we can still get Have. So MessageTag::Interested and MessageTag::NotInterested, these we just ignore. What I actually think I want here is something like pub(crate) fn... well, I'll leave that for now. So these ones can all happen down here as well. I guess I might as well do this, right: MessageTag::Unchoke should be anyhow::bail!("peer sent unchoke while unchoked"), which shouldn't happen. And Bitfield is sort of the same.
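The dispatch after sending a request (Choke means requeue the block and go back to waiting; Piece means we got our answer; Have and the (Not)Interested messages are ignored) can be sketched roughly like this. `Msg` and `Outcome` are hypothetical names, and the real code on stream reads from an async stream rather than an iterator, so treat this as a shape, not the implementation.

```rust
#[derive(Debug, PartialEq)]
enum Msg {
    Choke,
    Unchoke,
    Have(u32),
    Piece(Vec<u8>),
}

#[derive(Debug, PartialEq)]
enum Outcome {
    /// Got the block we requested.
    Block(Vec<u8>),
    /// Choked mid-request: return the block to the shared queue and go
    /// back to waiting for an Unchoke.
    Requeue,
    /// The peer violated the state machine or went away.
    Gone,
}

/// After sending a Request, keep reading until something answers it.
fn await_block(msgs: &mut impl Iterator<Item = Msg>) -> Outcome {
    loop {
        match msgs.next() {
            Some(Msg::Piece(data)) => return Outcome::Block(data),
            Some(Msg::Choke) => return Outcome::Requeue,
            Some(Msg::Have(_)) => continue, // bookkeeping only; not an answer
            // Unchoke while already unchoked shouldn't happen per the spec,
            // so treat it (and a closed connection) as fatal here.
            Some(Msg::Unchoke) | None => return Outcome::Gone,
        }
    }
}

fn main() {
    let mut msgs = vec![Msg::Have(3), Msg::Piece(vec![1, 2, 3])].into_iter();
    assert_eq!(await_block(&mut msgs), Outcome::Block(vec![1, 2, 3]));

    let mut msgs = vec![Msg::Choke].into_iter();
    assert_eq!(await_block(&mut msgs), Outcome::Requeue);
}
```

Returning `Requeue` rather than retrying internally is what lets another participant pick the block up, matching the submit.send(block) behavior above.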
Something like "peer sent bitfield after handshake has been completed". So these are basically violations of the state machine by the peer. And now the interesting part here is: is Piece possible? While we are choked, can we get a Piece? And I'd assume that can't happen: "peer sent piece while choked". But what I wonder is: imagine that you are currently unchoked, then you send a request, and while you're sending the request, the other side chokes you. So you send the request before you realize you're choked, but the recipient receives the request while the Choke is still on its way to you. We then read that Choke, and so we decide to go back up and wait to be unchoked, but the request we sent was still on its way to the server. Is the server ever going to respond to that request? Like, if it decides to unchoke us, can it now decide to send us that Piece anyway? Yeah, there's something about the state machine that I don't quite like. I think it might need to be less one-at-a-time than it currently is; I think this actually needs to be more truly a state machine. Because it's a little weird, right: imagine we get a Piece that we asked for ages ago, back when we thought we were choked, so we thought it wasn't coming. If that Piece now arrives, I guess we could send it on the finish channel, but we already yielded that block for someone else to be responsible for, so we no longer really own sending that on finish. Yeah, something's not quite right here. But we can make it work: we can just ignore this, "a piece that we no longer need / are responsible for", and we can do the same down here. Wait, no, not download; I don't think we're going to need the download function anymore, we won't... participate. So down here, if we get a
Piece down here: if piece.index is not equal to piece_i, or piece.begin is not equal to the offset we're waiting for... those are actually the only two things that really matter; this can actually be an assert. So if either of those differ, then it's not the piece we were looking for; otherwise it is, and we can break. So if we happen to get some piece that we asked for in the past, we're just going to ignore it, because it's not the one we need now. The reason I say it feels a little weird is that having this twice, for example, reads a little odd, and you could also imagine that you want to support requesting multiple blocks simultaneously, though maybe that's an optimization we don't care about. The fact that we have to duplicate the Have handling is a little weird, but maybe this is okay; we'll see how it plays out. So if we now go back to download... actually we don't need to, I think we did that. So this now handles choking and unchoking, gets the block, sends it on. I guess "piece" here is arguably misnamed; this should be "message", and it's a little misleading for the message tag here to be Piece, because it isn't the piece: it's a block, a piece of a piece. But fine. Okay, so now that we have this, and we were ignoring those, and that's fine, shouldn't be a problem in this instance, we get all of this back. So we now have all the bytes... so bytes here is now going to be all_blocks, and files is just going to be... oh wait, no, that's not right. The all_blocks here is all the blocks for this piece, and this is where we're going to stick it all in memory for now, though obviously that's not actually what you'll want to do. Where's the place where we compute the length? I thought we added that to the tracker, did we not? I guess we didn't. Sorry, I mean to Torrent. No, we did, okay. Yeah, so t.length: we just create a giant buffer that holds all the bytes of all the pieces.
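The stale-block filtering described above can be sketched as a predicate over incoming Piece messages: drop anything that answers a request we abandoned after a choke, and keep only the block we are waiting for now. `PieceMsg` and its fields here are an approximation of the wire format (index, begin, block), not the stream's exact types.

```rust
#[derive(Debug, PartialEq)]
struct PieceMsg {
    index: u32, // which piece this block belongs to
    begin: u32, // byte offset of the block within the piece
    block: Vec<u8>,
}

/// Return the first Piece message matching the block we are currently
/// waiting for, silently dropping stale ones (answers to requests we
/// abandoned after a choke), as discussed above.
fn next_matching(
    msgs: &mut impl Iterator<Item = PieceMsg>,
    want_index: u32,
    want_begin: u32,
) -> Option<PieceMsg> {
    msgs.find(|p| p.index == want_index && p.begin == want_begin)
}

fn main() {
    let stale = PieceMsg { index: 0, begin: 0, block: vec![9] };
    let wanted = PieceMsg { index: 1, begin: 16384, block: vec![1, 2] };
    let mut msgs = vec![stale, wanted].into_iter();
    let got = next_matching(&mut msgs, 1, 16384).unwrap();
    assert_eq!(got.block, vec![1, 2]);
}
```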
So clearly you would not actually do this, right? Doing this is dumb. But what we can now do is, when we get back whatever this piece is, we should be able to index into all_pieces by... where do we have the piece index? Right, so this is going to be piece.index multiplied by t.info.plength, and onward from there, and then we just copy_from_slice from all_blocks. And in fact, maybe in the error case here we do something like: stick this piece back onto the need_pieces heap. Probably also stick this back onto the pieces heap. And after we've done all this, and after there's nothing left in need_pieces, then bytes should be all_pieces, and files should be... t.info... what did we say? This should be a Vec of File, which is going to be a match on t.info.keys, where SingleFile means this is going to be a Vec of one File, and otherwise it just maps over the files. And in the single-file case, the path is just going to be t.info.name. Great. And why can't I move out of this? "cannot move out of enum variant MultiFile which is behind a shared reference". Oh, it's because... fine, it's because we're taking a reference to the torrent, is why. So this is going to have to be a clone, and this is files.clone(). Great. And this is, yeah, fine, its length. And now we don't need download_piece, we don't need this bit in peer, we don't need any of the download method we wrote; that can all go away, because participate takes care of it now. So now I think all the remaining errors are in main. If we get rid of these unused things and tidy up a little bit: addr is never used, pieces is never used, that's fine for now. Go to main. What I want to do for download... oh right, src/lib is fine, just so that the old code keeps working. Nice. Okay, so this now builds with the code the way we had it, and I wonder if this will just sort of work.
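The single-file/multi-file match just described might look roughly like this. The `Keys`, `Info`, and `File` shapes are approximations of the types from the earlier stream (names are guesses), and the clone in the multi-file arm is exactly the fix for the "cannot move out of ... behind a shared reference" error hit above.

```rust
#[derive(Clone, Debug, PartialEq)]
struct File {
    length: usize,
    path: Vec<String>,
}

enum Keys {
    SingleFile { length: usize },
    MultiFile { files: Vec<File> },
}

struct Info {
    name: String,
    keys: Keys,
}

/// Normalize both metainfo modes into one Vec<File>. We clone in the
/// multi-file arm because we only hold a shared reference to the torrent.
fn files_of(info: &Info) -> Vec<File> {
    match &info.keys {
        Keys::SingleFile { length } => vec![File {
            length: *length,
            // in single-file mode, the torrent's name is the file's path
            path: vec![info.name.clone()],
        }],
        Keys::MultiFile { files } => files.clone(),
    }
}

fn main() {
    let info = Info {
        name: "sample.txt".to_string(),
        keys: Keys::SingleFile { length: 92063 },
    };
    let files = files_of(&info);
    assert_eq!(files.len(), 1);
    assert_eq!(files[0].path, vec!["sample.txt".to_string()]);
}
```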
We wrote a bunch of code that we haven't really tested, but at the same time, most of it is very similar to, like, sort of copy-pasted from what we had before. So the main question is whether the scheduling logic actually does the right thing. So let's do git add ., git commit: "first attempt at multi-peer file download", and I just want to see: if we push it up and run the test suite, does it just do the right thing? There's almost certainly a bug somewhere. Yeah, so someone in chat is pointing out that there's a different design here that is much more centralized: you have a central location that knows about the state of every peer and the state of every piece, and then you have a thing that's responsible for the connection to every peer, and rather than having each peer decide what it does, you just have the central thing say "you download this, you download that" and drive the whole thing, and the peers just blindly do what they're told. That's fine too. I actually don't think it would be that much nicer, but that could also be because I haven't built it that way, so it's hard to say. Let's see what it says for this last step, I'm curious. Build, build, build... thank you. Step 18... how many steps are there? I don't remember how many steps there were. What step is this? Step 24, somehow? How many steps are there? What is CodeCrafters? Oh, it's this. So, from the previous video: this is the site that has build-your-own-X style challenges, and I was doing their build-your-own-BitTorrent challenge. It basically guides you through the sort of set of problems you would need to solve if you were to build your own BitTorrent client. The goal isn't so much to build something that's a production-ready version of X, right? As you see from
here, this is clearly not a production-ready version of a BitTorrent client, but it does force you to think about some relatively real problems: go read the spec, implement some actual real code. And I think this is a good way to learn in general. If you want to try it out, you can do it through this site. All right, something failed. What failed? "source slice length does not match destination slice length", at download, line 223... no, 123. Okay, so we do actually need here to say that this goes until piece.block.len(), which I guess is really... wait, no: block_size, and block_size here is piece.block.len(). And then down here we're probably going to have to do the same thing. Basically, copy_from_slice asserts that the slice you're copying from is the same length as the slice you're copying into, so here this would be piece_size. Yeah, copy_from_slice is strict. I guess Peer doesn't really need to hold its addr, though; the real reason I did that was so that a peer could reconnect if it lost its connection. Stage 11... I think stage 12 is the last one. Stage 11, yeah. This feels an awful lot like it's just hanging, so the question is where it's hanging. I forget whether I can, like, get one of the torrents... yeah, there's this thing; download... well, something's doing something, nice. Okay, we get no output. Great. So let's do a little bit of debugging for our own sake: "start receive loop", see if it gets there. It does, okay. This is "participant finished", and this is "got piece", and this is "got pieces end". Now, if I remember correctly, there are only two pieces, so... oh, I know what's going on. Okay, okay, okay. So this is actually: if bytes_received is piece_size, then we're done here, we have received every piece. This must mean that everyone either exited or is waiting for more work. Great.
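The "source slice length does not match destination slice length" panic above is `copy_from_slice`'s length check firing. A minimal sketch of the fix, assuming a flat piece buffer and a hypothetical `write_block` helper: slice the destination down to exactly the block's length, since the final block of a piece (and the final piece of a torrent) is usually shorter than the nominal size.

```rust
/// Copy a downloaded block into the piece buffer. copy_from_slice panics
/// unless both slices have exactly the same length, so we slice the
/// destination to block.len() instead of assuming a full-sized block.
fn write_block(piece_buf: &mut [u8], begin: usize, block: &[u8]) {
    piece_buf[begin..begin + block.len()].copy_from_slice(block);
}

fn main() {
    const BLOCK_SIZE: usize = 4; // toy value; a real client uses 16 KiB blocks
    let mut piece = vec![0u8; 10];
    write_block(&mut piece, 0, &[1, 2, 3, 4]);
    write_block(&mut piece, BLOCK_SIZE, &[5, 6, 7, 8]);
    write_block(&mut piece, 2 * BLOCK_SIZE, &[9, 9]); // short final block
    assert_eq!(piece, vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 9]);
}
```

The same idea applies one level up, when copying a (possibly short) final piece into the whole-torrent buffer at piece.index * piece_length.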
And then, because none of the participants keeps track of how much work there is overall, what we actually run into is: every participant holds a submit handle so it can put tasks back in, which means the loop looking for a new task will never finish; it will never get None from the new-task queue. So they'll all be stuck there. And every task holds on to finish, which is the thing that lets you send blocks you finished downloading, which means we'll never get the None signal here from done unless all of the peers have failed. So this is "there are no peers left", so we can't progress. And in the else case here, this just means there are more blocks left. So here... that looks pretty promising, right: "actually exit when done". Now, this is arguably, or not just arguably, a simplified version, because in the CodeCrafters one I think they assume that every peer has every piece and that no peers ever fail, and that obviously makes it a lot easier to get things wrong in the error cases, or the not-everyone-has-everything cases, and still pass. And right: "all the tests ran successfully". Okay, this is pretty amazing actually, because again, remember, we wrote all of this code and didn't test any of it until the end, and we had one bug, right, which is this one. I'm not going to count the other one, because that was just copy_from_slice doing sanity checking; the logic was still correct. One bug. Nice, that's really cool. Now, there are obviously things missing here. For example, in the peer connection case, we're going to want to deal with things like timeouts: if the stream on the other end ends up taking a really long time, we actually want to terminate that peer, and that's not something we currently have any handling for. So we'll add that here: "TODO: if .next() times out, error and return the block to submit". So there is more work for us to do here.
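The hang described above can be reproduced in miniature with a plain `std::sync::mpsc` channel standing in for the async task queue used on stream: as long as any clone of the sender is alive, the receiver never observes that the channel is closed, which is exactly why each participant holding a submit handle means nobody ever sees None.

```rust
use std::sync::mpsc;

fn main() {
    let (submit, new_task) = mpsc::channel::<u32>();

    // Every participant holds a clone of the submit side so it can put
    // blocks back on the queue after a choke...
    let participant_handle = submit.clone();
    drop(submit);

    // ...which means the receiving end never observes "all senders gone":
    // it reports Empty (would block) rather than Disconnected (done).
    assert_eq!(new_task.try_recv(), Err(mpsc::TryRecvError::Empty));

    // Only once the last clone is dropped does the queue actually close,
    // hence the explicit "exit when done" check added on stream.
    drop(participant_handle);
    assert_eq!(new_task.try_recv(), Err(mpsc::TryRecvError::Disconnected));
}
```

Async channels (tokio's mpsc, for example) have the same sender-clone semantics, so the fix is the same: either drop the handles when done, or track completion explicitly.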
But I think this is most of the restructuring that I wanted to show. There's obviously more stuff to do to get this to production readiness, but hopefully you can see now just how different the structure is from how we started. If you look at download_piece, which we wrote last time, it's very imperative: linear, top to bottom, do these steps in this order. That worked fine for doing the challenge, but once you start thinking about how you actually want the system to operate, first of all you change the interface, but you also start changing it so that you can have this concurrency, so you can keep track of multiple pieces of state at the same time, and that just makes it better to work with. Oh right, and then obviously another improvement we'd want to make here: this is dumb, you wouldn't keep all of the pieces in memory in one linear sequence. So, "at the very least, use bytes::Bytes to avoid the single large allocation and having to memcpy into it". Really, Bytes doesn't even help you here, but Bytes is probably what I would use down here, because what Bytes lets you do is have multiple Vecs, multiple sequences of bytes, that are gathered separately, and then it lets you stick them together so that you have one thing that references the others, rather than having to copy them back and forth. In practice, I don't know that it matters within a piece, so I'm kind of tempted to just leave that one; it's not actually that dumb. But for this: "this is dumb, because all the pieces for a given torrent may not fit in memory; should probably write every piece to disk so that we can also resume downloads, and seed later on". And obviously, the case where no peers have a given piece at all is also a big TODO. There's more work we can do here on the seeding side.
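The gather-instead-of-copy idea behind `bytes` can be illustrated with a toy rope-like structure. This is a stand-in, not the `bytes` API: each piece keeps its own allocation, appending moves the buffer instead of copying it, and reads walk the chunk list.

```rust
/// Toy stand-in for what `bytes` gives you: each piece stays in its own
/// allocation, and we present them as one logical byte sequence instead of
/// memcpy-ing everything into one giant buffer.
struct Gathered {
    chunks: Vec<Vec<u8>>,
}

impl Gathered {
    fn new() -> Self {
        Gathered { chunks: Vec::new() }
    }

    /// Appending a finished piece moves its buffer; no bytes are copied.
    fn push(&mut self, chunk: Vec<u8>) {
        self.chunks.push(chunk);
    }

    fn len(&self) -> usize {
        self.chunks.iter().map(|c| c.len()).sum()
    }

    /// Random access walks the chunk list, like a (very naive) rope.
    fn byte_at(&self, mut i: usize) -> Option<u8> {
        for c in &self.chunks {
            if i < c.len() {
                return Some(c[i]);
            }
            i -= c.len();
        }
        None
    }
}

fn main() {
    let mut g = Gathered::new();
    g.push(vec![1, 2, 3]); // piece 0
    g.push(vec![4, 5]); // piece 1
    assert_eq!(g.len(), 5);
    assert_eq!(g.byte_at(3), Some(4));
    assert_eq!(g.byte_at(5), None);
}
```

As noted above, writing pieces to disk is the more important fix; this only addresses the single-large-allocation part.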
So if we look at Peer: ideally, up here, "peers should keep track of what pieces we have downloaded, and references to them, so that we can respond to requests from the other side; also choking and unchoking the other side". So there's obviously a bunch more stuff to do here, and I'm not going to claim that this is now a production-ready BitTorrent client, and the same goes for handling the Haves here; there's more dynamism that could be added. But hopefully this is a useful insight into how you would restructure this code. And as was pointed out in chat, there are other ways to restructure it; this is not the only way to do it. Rather than having each peer be somewhat smart about how to manage its connection, you can instead have a central entity and just do I/O parallelism: all of the messages that come from any peer go to a central point, that central point says "this channel, send this message; this channel, send this message", you keep track of the entire state machine in one place, and the peer connections are just dumb I/O channels. That's also a totally legitimate restructuring. Which one is better is hard for me to say, because I think you learn which one is better by building with one of them and finding that it doesn't really work well, or that the code ends up really messy with lots of interdependencies. We're already seeing a little bit of that here: for example, I think if you try to fill out the code for Have here, it would actually be pretty annoying, and it might have to tie into the code that's in the download loop, and that's not going to be nice. And that's an indication that maybe you do actually want the entire state machine encoded in one place. Now, that place is arguably download here, right? So you could say download should be that one entity,
and make the peers here dumber, rather than use participate. Also totally valid, and frankly could be better; I don't know, I haven't gone down that path. But at least we've gone over a lot of async tricks, a lot of channel stuff, and hopefully you have more understanding of the kind of state you need to keep track of and how to think about the state machine. I hope that was useful. I think that's where we're going to end it for today. "Are you reserving the disk space up front?" Of course... yeah, there's clearly a bunch more work that can happen here; I'm not at all claiming this is done. Hopefully that was useful, and have a great rest of your Friday, or Saturday if you're elsewhere, or I guess Thursday for some of you. I will see you all later; have a good weekend. See you, folks!