Welcome back to yet another Rust livestream. Oh man, that's a great name for the channel. Yet another Rust livestream. We are going to continue with porting Java's concurrent HashMap to Rust. We started this a while back, and it was really just under the observation that we haven't really done anything that deals with low-level concurrency and unsafe code. And so I wanted to do something like that. And in this case, if you haven't seen it before, we're porting Java's ConcurrentHashMap, which is a data structure that the Java standard library provides. It provides a map, a HashMap sort of like Rust's, but where you can do reads and writes concurrently without having to stick any locks around it. And at least in theory, it performs well as you scale up the number of readers and writers. And so I don't want to say it's a straightforward port, but the entire code is written, pretty well documented, in Java. And so we're just porting that code straight over to Rust. And we had like a six-hour stream on that a few weeks ago, and this is gonna be part two. Now in part one, if you remember, we did most of the read and write operations. And we got sort of halfway through table resizing, which is a part of the insert code. And so we're gonna finish that up today. And then we're gonna move on to one of the big differences between Java and Rust code when it comes to concurrency, which is that in Java you have the garbage collector. So if you're accumulating garbage in a concurrent setting, you can really just sort of let it go out of scope and not have to think about it anymore. And the garbage collector will ensure that whenever it's safe to free and drop the relevant memory, it will do so safely. In Rust, we don't have that guarantee. We need to make sure that we know when it's safe to drop a given element. Basically, we need to think of it as: we have a bunch of pointers, like raw pointers, all over the place.
And at some point we need to determine that this pointer is no longer being accessed by anyone else and it is now safe for us to free it. But we need to do all that tracking. Rust will not do it for us. And so that is the part that we're gonna have to fill in today once we finish the resizing stuff. Okay, fantastic. Okay, I'm gonna do a brief recap of what we did last time, just to refresh you all on how the Java map works, what we've done so far in the porting, and sort of the decisions that we've made. And then we'll dive into what we're doing next. For those of you who are relatively new to the channel: this is me, John. And I do a lot of these live Rust coding streams. This is part two of an existing stream. So if you haven't watched part one, go back and watch that. And there are also a bunch of other live streams on different topics on the YouTube channel. If you wanna support my work, I can't actually accept donations because international student visas are weird, but I do have an Amazon wish list. So if you wanna support, that's linked from my Twitter account. All right, let's dive back into the code. So what we have here, just so you're familiar with the setup: on the right-hand side here is the Java code, and on the left-hand side is the Rust code. And I'm gonna jump back there eventually. So the Java code has two primary sort of top-level comments. This is just the code from the Java standard library that's been pasted into a file here. It has an initial comment that explains what the API for the concurrent map is. And then a little bit further down, once you get into the class itself, there's a description of how it works internally. And I think it's useful to recap this, just because we're gonna be dealing a bunch with the invariants that this comment explains and that the code actually implements. So the way the Java code works, actually, let me try to paint this. That might actually help.
Let me see how well I can paint this. So, let's do like a red. Red is good. So in the Java map, we're gonna have a... ooh, that's so far from a straight line. I think there's a bend in my drawing tablet right here. We're gonna have a bunch of buckets, and each bucket... that's really, give me a second. Let's try that. That's better. So there's a bunch of buckets, and each bucket is really just gonna be a pointer for us. And initially, all of them are gonna point to nothing. And the idea here, like with most hash maps, is that for any given key, we're gonna take the key, we're gonna take its hash, and we're gonna do modulo the number of buckets, right? So this gives us a number into the buckets. Let's say some key hashes to seven: zero, one, two, three, four, five, six, seven. So seven is gonna go into here. And this is going to be a linked list of nodes. And the reason this has to be a linked list is because we're doing this hash modulo. You could have multiple keys that end up hashing to the same bucket. And so you might end up with multiple items, each with a different key. So for example, here, both seven and seven plus eight, which is 15, and then 23: all of those keys are gonna hash to seven, modulo eight. And so all of these are gonna end up in this bucket. And so if you look up a key, you look at its bucket, and you're gonna linearly walk this linked list. This is just basic hash map setup. Where the Java business is concurrent is in the following ways. First of all, why is this being difficult? First of all, there is this pointer. This initial pointer: when we want to do an insert, we're first gonna read this pointer and see whether it points to nothing. If it does, we're gonna try to just compare-and-swap in the first node here. So this is gonna be an atomic compare-and-swap. And if that succeeds, then we're done, right? Then we didn't have to take any locks, we just did the insert.
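To make that bucket math concrete, here is a tiny sketch. The function and variable names are made up for illustration, not taken from the actual port:

```rust
// Sketch of the bucket lookup described above: take the key's hash, then
// reduce it modulo the number of buckets to pick a bin.
fn bucket_for(hash: u64, n_buckets: u64) -> u64 {
    hash % n_buckets
}
```

With eight buckets, the keys 7, 15, and 23 from the drawing all land in bucket seven, which is exactly why each bucket has to be a linked list.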
And then we're done, we can move on with the insert. The second case is if there is already an element in there, or if the compare-and-swap fails, which also implies that there's now an element there. Then the first element in here, technically all of them, but especially the first, is gonna have a little lock inside of it. And if we need to add something later in the list, we're gonna first take this, which is sort of called the bin lock. And once we get that lock, we add to the list. Now, you might recognize this as something that's called a hash map with per-bin locking, which is sort of similar in that for every bin you have a lock, right? So this is a design that CHashMap, for example, uses, which is another Rust concurrent hash map. The advantage of this design is that in the common case where buckets are going to be empty, which is generally what you expect in a hash map, with most of the buckets being of roughly length one, you will not ever have to take any of these locks, because all you have to do is this atomic swap. Yeah. There's also this other thing that the Java hash map has implemented that is sort of a neat feature, let me do that color, which is that any hash map might have to be resized at some point, right? Imagine that you have some buckets that are getting really long. What you want to do is double the number of buckets, and then you wanna move all the elements from existing buckets over to the new map. And then you can free the old map. Of course, where this gets tricky is if you have one thread that's reading from this map, and then one thread that's trying to start a new map, which is gonna be logically twice as large, with twice as many bins in here. So now at this point, you have two maps, and you have some threads that are operating in this map and some threads that are operating in this map.
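That lock-free fast path can be sketched in a few lines. This is a hypothetical stand-in using std's `AtomicPtr` rather than the crossbeam types the port actually uses, and the `Node` shape here is simplified; it only shows the compare-and-swap-into-an-empty-bin idea:

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

// Simplified node: the real one also carries the hash and the bin lock.
struct Node {
    key: u64,
    value: u64,
    next: AtomicPtr<Node>,
}

// Fast path: if the bin is empty (null), try to CAS the new node in.
// On success no lock was ever taken; on failure the bin is no longer
// empty, and the caller falls back to taking the bin lock.
fn try_fast_insert(bin: &AtomicPtr<Node>, node: Box<Node>) -> Result<(), Box<Node>> {
    let raw = Box::into_raw(node);
    match bin.compare_exchange(ptr::null_mut(), raw, Ordering::AcqRel, Ordering::Acquire) {
        Ok(_) => Ok(()),
        // Reclaim ownership of the node so the caller can retry under the lock.
        Err(_) => Err(unsafe { Box::from_raw(raw) }),
    }
}
```

The first insert into an empty bin succeeds without locking; a second attempt on the same bin fails and hands the node back, which is exactly the "now go take the bin lock" case.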
And so the way the Java code does this is, if you're a reader, you're always gonna access the old map unless you find a thing that redirects you to a new map. So the way a transfer is actually gonna work is, if some thread decides it's gonna do a resize, it's gonna allocate this new map, and then it's gonna start replacing these lists with a sort of special forwarding node that points to the new map. Any thread that encounters this is just gonna move over to this guy and do any inserts and reads over here. Of course, there's a challenge here, because moving all of these bins over to the new table is gonna take a bunch of time. And so if a thread notices that this kind of resize is happening, it's going to assist with the resize. So we're gonna have a bunch of threads, and all of these threads are gonna be helping with this move. And then eventually, when the move is finished, we're gonna make sure that readers start accessing the new map entirely, and then we're gonna mark this map as garbage, right? And once this map is garbage, at some point in the future, we're gonna free this map. We can't do it immediately, because there might still be threads that are in the process of doing reads. But the idea is that this sort of transfer of bins from the old map to the new map is going to happen by having multiple threads help out with that process. And that's where we got to last time: this business of writing the helping-to-transfer code. And so that's where we're gonna continue. Before we do though, for those of you who are watching live, do you have any questions about the sort of high-level structure? I mean, I guess it's low-level, but the structure of the design and how the concurrency works, before we dive back into the code? Like anything you want me to explain in more detail. This is a very quick and dirty explanation of what we did last time.
So if there are details that you feel are important and you're not quite grasping, now's the time to ask. So I'll give you a little bit of time, in case there are things you want refreshers on. It is important that we're all on the same page as to where we're gonna start and what the structure of the code is before we dive into the low-level code again. 'Cause it's been a while since last time, right? At least for most of us. All right, I don't immediately see any questions. So if you have any, fire away, and then we'll jump back into this and I'll try to explain more. So the place we were was this put function. So put does, well, it does a put into the table. It inserts a given key-value pair into the table. And the code is pretty similar to, let me find here, putVal. This is basically a direct port of the Java version, and you can see that if you sort of squint at it. And you'll see that mostly what it does is: it first hashes the key. It finds the bin for that hash. It gets a reference to that bin. And this is where you see the linked-list business, where if the bin is a null pointer, then we're gonna just try to compare-and-swap in the first node in the bin. If that succeeds, we're done. We can just return. If there is already something there though, then either there's a node at the head, in which case we might have to take the lock, which is what we're doing down here, and then insert the key-value pair that we're inserting into that bin. But this is the tricky case, right? This is where what we find in the bin where we're trying to do the put is one of these forwarding nodes, which we've called Moved.
It's equivalent to this clause in the Java code, where it says: if you are trying to do a put and you discover that the bin you're trying to put into has moved, then what we want to happen is we want the thread that's trying to do that insert to help with the transfer. We want it to assist with the resize that's going on, so that that resize is gonna complete faster. And the reason why it will actually complete faster in our case is because we have the set of bins, and the bins are entirely separate, right? For each bin, one thread can just compare-and-swap out the entire list at once and then compare-and-swap it into the new map. At least that's mostly true. We went through some of the details of that in the previous stream. What we're missing is this help_transfer function. And actually, I think this is not gonna be quite like that. I think this is gonna be self.help_transfer(table). And if you remember from the drawing, a forwarding node has a pointer to the new map, the one that we're resizing into. And one of the reasons why it has to do that is because you could in theory have multiple resizes going on, right? Where one thread discovers that the current table is too small, so it does a resize. And then another thread starts doing inserts into the new table while the resize is still happening, and then discovers it wants to do another resize, because that one is already too full. So it allocates another table, and now there's a transfer happening from this table to that table, and there's a transfer happening from this table to that table. And so in theory, we need to make sure that this still works out. And so these forwarding nodes actually have, if you look here, this is our node, and the Moved bin entry is a pointer to a table. So this is, if we look here at the help transfer function down here somewhere. Yeah, so this is going to be next_table. Oops, that's not what I meant to do.
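The shape being described, a bin that is either the head of a node list or a forwarding marker, can be sketched as a small enum. This is a self-contained approximation, not the port's exact code: the real `Moved` variant holds a crossbeam pointer to the next table, and a table index stands in for it here:

```rust
// Minimal stand-in for the BinEntry shape: a bin holds either the head
// node of a linked list, or a Moved marker that forwards to the table
// the bin's contents were transferred to.
struct Node {
    hash: u64,
    // key, value, next, and the bin lock would live here in the real code
}

enum BinEntry {
    Node(Node),
    Moved(usize), // stand-in for the pointer to the next table
}

fn is_forwarded(e: &BinEntry) -> bool {
    matches!(e, BinEntry::Moved(_))
}
```

Any operation that lands on a `Moved` entry knows the bin's contents now live in the table the marker points at, which is what triggers the help-transfer path.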
So the pointer there is going to be a pointer to the next table. Now, one thing that's worth keeping in mind is the Java code actually has two things that track the current table and the next table. There is a sort of global pointer that's used for the entire map. Let me see if I can dig that up here. So this is our top-level struct. It has a next_table pointer, and then it also has the current table pointer. But notice that the next table here and the pointer that is in a given forwarding node might actually be different. I'm not entirely sure why the Java code has this, but I think the intention is because you can have multiple resizes, and the top-level next table is going to be the target next table, the one we're ultimately going to resize into, and this next table is where this entry, the one we're currently looking at, has moved to, and those might not be the same. For example, even if there aren't multiple resizes, imagine that one resize happened and completed. Then the old map, let me flip this for you, the old map right here has a bunch of... like, all the bins are just forwards to this table, which is the one that we resized into. So all of the things in this table are pointing to that table, and there might still be readers in this table, right? Readers that are really slow or something might still be in this table, and they are just going to encounter these forwarding nodes that point them to this table. Now imagine that there's another resize that happened. So this table is going to get resized. So we allocate a third table over here. Now we're going to start replacing the pointers in this middle table to be forwards to that table. But there might still be readers in this original table, right? There might have been readers that were so slow accessing that table that they are still stuck over here.
We need them to sort of move to the middle table, because they might have to do a read from there, or they might end up getting forwarded even further. But so that is why these pointers in the forwarding nodes are pointers to go look here instead. And that might not be the newest table, because there might be a subsequent resize in progress. So this is going to be next_table. So we're going to have to write this method. It's a good question whether we even can. I think we can. Yeah, we can call self here. Okay, so we need to write this. Let's write that down here. So we're going to have an fn help_transfer. It's going to take &self. It's going to take a table, which is going to be... it's probably going to take basically the same arguments as transfer, like so. So it's going to be help_transfer, and the way to read the signature is that it's going to help transfer from this table to this table. And the guard here, if you remember from last time, the guard is something we are going to use in order to do garbage collection later. The idea is that the guards track... think of the guards as tracking sort of epochs, where we collect garbage from one epoch once everyone has moved on past that epoch. I'll go through this in a lot more detail once we start doing the garbage collection itself. All right, so what does help transfer do? Well, okay, so it does a bunch of null checks, which are not necessary for us, I think. Although we can totally do them. If table is null or if next_table is null, then we just return table. Okay, so this actually returns a Shared table, although I'm not entirely sure why it does that. We don't need this instanceof ForwardingNode, because help_transfer is only called if the node is a forwarding node in the first place. Okay, and then we're going to do rs is resize_stamp of, that's going to be table.length. Where is our table down here? So it's going to be bins.len. And now we want, what's this?
While next_table is equal to self.next_table.load, which is going to take a guard and an ordering. So this is checking that... we're going to help transfer from this table to that table, but we're only going to do that if that table is actually the target of the currently ongoing resize, which is what this is checking. And the table is equal to self.table.load. And this is a similar sort of correctness check: we're only going to do it if the target table is the same as the current target of the resize, and the old table, the table we're transferring from, is the same as the current table as far as we are aware. Basically, we're only going to do it if the resize we're asked to help with is the current resize. And whatever this is... interesting. It then does sc is self.size_ctl.load(guard). All of these loads are ordering sequentially consistent. This is one thing I dislike about Java: these are actually all fields on self. Java uses this, but there's sort of an implicit this before basically anything that's not a local variable. So this is really this.nextTable. This is this.table. This is this.sizeCtl. And all of them are marked as volatile, which is roughly equivalent to doing a load with sequential consistency. Yes, the plan is to rewrite the tests as well. So if sc is greater than or equal to zero, then we're going to break. Or if sc is equal to rs plus MAX_RESIZERS, or sc is equal to rs plus one, or transfer_index is less than or equal to zero. In any of those cases, we're going to break, because that's what the Java code tells us to do. One thing that's weird about porting concurrent code like this is we're sort of blindly assuming that the code that's in the Java code is correct, which is probably a reasonable assumption, but it means that in some cases we're going to write stuff that we might not necessarily understand.
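Those break conditions can be pulled out into a pure predicate to make them easier to stare at. This is a sketch of the conditions as read off the Java code above, with `MAX_RESIZERS` written out under the assumption that it matches Java's `(1 << 16) - 1`; the atomics and the surrounding loop are left out:

```rust
// Hypothetical constant mirroring Java's MAX_RESIZERS for 16 stamp bits.
const MAX_RESIZERS: isize = (1 << 16) - 1;

// Stop helping with the resize if: it has already finished (sc >= 0),
// the helper count is saturated (sc == rs + MAX_RESIZERS), the resize is
// wrapping up (sc == rs + 1), or there is no work left to claim
// (transfer_index <= 0).
fn should_stop_helping(sc: isize, rs: isize, transfer_index: isize) -> bool {
    sc >= 0 || sc == rs + MAX_RESIZERS || sc == rs + 1 || transfer_index <= 0
}
```

In the real loop this predicate guards the `compareAndSetInt` on `sizeCtl` that registers the thread as a helper.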
We will only really dig into it if something turns out not to work, or if there's a line that's hard to port and we have to really dig into what it means. And this is going to be compareAndSetInt. Okay, so we're going to do a self.size_ctl compare-and-swap. So size_ctl, if you remember from last time, is this control value that's used to track how many threads are currently working on the resize and what state the resize is in. So this compare-and-swap is to add ourselves to the set of threads that are helping with the resize. If this is equal to sc, then the compare-and-swap succeeded, and then we're going to help transfer from table to next_table. Break. What else do they have here? This is going to return next_table. One thing that's interesting here is that this next_table is really a *const Table, if you remember, right? 'Cause it's the thing that's stored in the Moved. And what we're going to do down here is next_table is equal to Shared::from(next_table). And if you remember from last time, a Shared is really just a thin wrapper around a raw pointer, and it's unsafe to dereference it. And the idea is that you can only dereference it if you know that the target value is valid. So the question here is, do we know that? Do we know that the next table is indeed a valid pointer that hasn't been freed yet? The way that we know this is... let me see if I can illustrate this. Actually, let's leave that for when we add the unsafe annotations. We'll do that for now. Okay, so this should be fine. That I think is roughly the definition of it. And we wrote transfer last time. Let's see if there are any fix-mes or to-dos left from last time. counter_cells, figure out num_cpus, tree bins, which we ignore, and reservation nodes, which we ignore. Ordering can probably be relaxed, we're going to ignore that. Treeify, we're going to ignore that. Okay, let's do a cargo check. There's no way this compiles, but we're going to run it anyway.
What's the point of the const keyword if things are immutable by default? The const... do you mean for raw pointers in Rust? So the const keyword for raw pointers, it's just that they needed a way to name the raw pointer type that is not the mutable pointer type. And so const was the natural choice. Cannot find Owned. What do you mean, Owned is right there. Is the import not working or something? Use of undeclared type or module crossbeam. Really? But it's right there. Oh, crossbeam, try that. Oh, BuildHasher in hash map. We want Ordering, that's gonna need to be imported. Where is BuildHasher? Let's go to BuildHasher. Oh, it's in std::hash, hash::BuildHasher. 324. Attempt to use a non-constant value in a constant. Oh, that's awkward. Yeah. Okay, so what we're trying to do here is, when we're allocating a new table, we want to allocate a bunch of contiguous memory. These are gonna be our bins. And this is trying to do that as an array, which won't actually work. This is gonna have to be, I think we can do this with Vec and into_boxed_slice. So down where we have table, this is really gonna be a Box of this, because we want it on the heap. And that should be okay, I believe. We could make it a Vec too, it's just I don't want it to be resizable is all. That's a node, not a table. Cannot find n, on line 59. And that should be self. Why it won't work? Why what won't work? I don't use an LSP because I don't like noise in my editor. I would rather have the errors in a separate terminal. Expected trait, found derive macro. What? Oh, that's because I didn't import Hash. Seems fine. Absent. Oh, that's right. We renamed this parameter; it used to be called if_absent, and now it's called no_replacement, which means I now need to parse Boolean logic. if_absent and no_replacement, are they the same or are they opposites? no_replacement means that you will only do it if absent.
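The allocation trick being described can be shown in miniature. The element type here is a placeholder standing in for the real atomic bin-pointer type; the point is just that array lengths must be compile-time constants, so a runtime-sized table goes through a `Vec`:

```rust
// Build a runtime-sized, heap-allocated, non-resizable slice of bins by
// constructing a Vec and converting it with into_boxed_slice.
fn allocate_bins(n: usize) -> Box<[Option<u64>]> {
    // Option<u64> is a stand-in for the real bin type; all bins start empty.
    vec![None; n].into_boxed_slice()
}
```

The boxed slice keeps the "fixed size, on the heap" property without exposing `push`/`pop` the way a plain `Vec` field would.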
So yes, they are the same. So this should say no_replacement. What else do we got? 208, old value. This should say old_val. 390, finish equals true. I think that should say finishing. 425, oops, 425. bin_i. I see, this is just i, not bin_i. When would you want to use the channel feature in Rust? I'm guessing to jump across threads. I'm not sure I followed the question. 500, all right, we need K and V. That's easy to fix. 504, this is hash. These are just trivial fixes. I just want to get to the tricky parts. 49, wrong number of type arguments. That is true, there's no S parameter to Table. 219, wrong number. This is K and V, this is K and V, and that's K and V. 316, this takes K and V, and that takes K and V. Great, that's more of the errors I wanted. No method finish, line 78, why is that? Right, we need to use Hasher. Yeah, this is like lower-level cleanup, because in the previous stream we mostly just ported the code and didn't actually try to compile it. This is just tidying up a bunch of the shortcuts we made there, like mistyping variable names and stuff. What we'll eventually get to is the compiler complaining about the fact that, and you'll remember this from the end of the previous stream, Shared does not dereference into its inner type. Dereferencing a Shared is actually an unsafe operation. And so that's what we're gonna end up with once we get through some of these errors. Like here, for example, find is not found for Shared is an example of this. Is it safe to cast u64 to usize implicitly? I don't think the compiler will let you do that. Or it'll be a checked cast, I believe. Is it normal to have use statements inside functions? Well, I don't know about normal. Usually I do it if there's a trait method that I only call in one place. It's up to you. 27. I see, this is, yeah, this we're gonna have to figure out. This is definitely a fix-me. Actually, this has to do with Moved too. So let's do that before we continue.
So that's gonna be find. It's gonna be, yeah, so this is the class Node. That's the find that we already implemented. And then there's a special case for this, because Java overrides, which is really annoying. So ForwardingNode extends Node, and it overrides the implementation of find. And the idea here is that you're looking for a node in a linked list in a bucket that matches a given hash and key. And if you walk the linked list and you encounter a forwarding node, then of course you should stop looking in the current bucket you're looking in. And instead you need to switch to the other table. And so, oh, interesting. That's gonna be a little bit annoying to figure out. Because this obviously needs access to... actually, maybe not, maybe not. So what we're gonna do is let table as well as next table. This assignment is sort of useless, like it doesn't really do much, it's just renaming the variable. Actually, let's just do this is next_table. This is next_table. This is mut table. And here we're gonna do, let's name it just like it did here. And this is just gonna be a loop, because we might move to the next table and then discover that in that table, the list we need has also moved. So we need to iterate through these. Like it might have to be a cascade. If, interesting. So what is this if? This if is: if the key is null, which we can ignore in Rust because the key is a reference, so we know it's not null. If the table is null, or if, what's n here? I see, or table.bins.len is zero, which is the same as the table being empty. Or, right, then return null, which is probably not what we wanna do. For now, that's fine. And tabAt is gonna be the same as table.bin, I think. So you remember from our table, at the bottom here, we have this sort of convenience method for, given a hash, find me the corresponding bin. And so this is gonna be, why n minus one? Why on earth is it n minus one? That is interesting. Okay.
So what this is actually doing is it's looking up the last bin of the table. Why is it ANDed with h? What a bizarre lookup. Oh, this is just the inverse. n minus one is the mask from here. What a super strange way to do this. Because if you see here, let's look for another tabAt, a regular tabAt. Yeah, all of these are just doing at, and h here is the hash of the key. And n minus one is all of the bits that are currently being used for the... it's equivalent to doing a modulo of the table length, the number of bins, if the number of bins is a power of two. I was just surprised that they do it inline like that, but that is why. And so this is really the same as doing bin_i. So let bin_i be table.bin_i of hash. And then bin is gonna be table.bin of bin_i. And if bin is null, then return null. And then there's another loop. And this loop is... why? I think the outer loop is for iterating over tables. Like if we encounter another forwarding node in here, that forwards us to yet another table. And the inner loop is just walking the bin. Okay, so this does e.hash. I think this is just doing this business. So this is: if bin.hash is hash and bin.key is key, then we can just return that bin. So this is... I don't understand why this needs to be a loop. Oh no, it is iterating. Okay, so why doesn't this just call this find? That makes very little sense to me. All right, that's fine. So this is just gonna loop over the nodes in the bin. So if the current node we're looking at, so that means this is gonna be a mut. If the current node we're looking at matches the entry we're looking for, then we're at the right node and we can return. If e.hash is less than zero, that's their special marker. So this is really gonna be a match on bin. That's what that means. If the bin entry is a Node, then we're gonna do n.find(hash, key, guard). If it is a Moved with a next table, then we need to decide what to do.
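The mask-versus-modulo claim is easy to check in isolation. A small sketch, with a hypothetical helper name:

```rust
// For power-of-two n, `h & (n - 1)` keeps exactly the low bits of h,
// which is the same as `h % n`. That is why the Java code can index
// bins with `(n - 1) & h` instead of a modulo.
fn mask_index(h: usize, n: usize) -> usize {
    debug_assert!(n.is_power_of_two());
    h & (n - 1)
}
```

The equivalence only holds for power-of-two table sizes, which the map guarantees by always doubling.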
Then what we're gonna do is we're gonna say table is equal to next_table. And then we're gonna continue 'outer. And I think that means this doesn't have to be a loop, because the n.find is what's gonna do the actual loop here. That's gonna be this. It does suggest that maybe this find should turn into a loop instead of recursing. I don't think that's gonna matter. Yeah. So the idea here is if we encounter a forwarding node, then we follow that forwarding node. We look up the bin in the table we were forwarded to. If we find that there's a list there, then we're just gonna search in that list. If we find another forwarding node, then we move to the next table we're forwarded to, and then we continue there. All right, let's see what that gives us. Which also means, actually, that this can now just be a continue. We don't need the label anymore, now that we removed the inner loop. The mask, yeah, the mask is there for performance reasons. The reason there's a mask instead of a modulo, even though it probably doesn't matter that much, but that's why it's there. Is it good practice not to use parentheses in every if statement? In Rust, you don't use parentheses around if conditions. Yep, great. No method find for Shared. Okay, so this is a deref, right? Node has a find method, and in order to call find on a Shared of Node, we need to do the unsafe deref. So that is an error we want to be there. That's expected. Mismatched type on 45. Oh, find. Expected nothing... expected unit, found Shared. That seems false. On 45. Well, n.find should be this, which returns a node. So why is it saying that it's getting unit from this? Expected this. What is it expecting? The entire match to be unit. Why is it expecting the entire match to be unit? Its continue should continue for this loop. Oh, this needs to be a break. Great. No method bin found for Shared. Table, okay, so this is a deref. This is a deref. This is a deref. No variant... that's because I misspelled it.
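That table-hopping find can be sketched end to end with owned types. This is a self-contained approximation: `Box`es stand in for the shared pointers, and `Moved` carries an index into a slice of tables instead of a raw pointer; the control flow (follow forwards between tables, then walk the list in the bin) matches what was just described:

```rust
// Simplified node: a singly linked list within one bin.
struct Node {
    hash: u64,
    value: u64,
    next: Option<Box<Node>>,
}

enum BinEntry {
    Empty,
    Node(Node),
    Moved(usize), // index of the table this bin forwarded to
}

struct Table {
    bins: Vec<BinEntry>, // length is always a power of two
}

// Follow Moved entries from table to table; once a real node chain is
// found, walk it looking for a matching hash (keys elided for brevity).
fn find(tables: &[Table], mut t: usize, hash: u64) -> Option<u64> {
    loop {
        let table = &tables[t];
        let bin_i = (hash as usize) & (table.bins.len() - 1);
        match &table.bins[bin_i] {
            BinEntry::Empty => return None,
            BinEntry::Moved(next) => t = *next, // forwarded: retry in the next table
            BinEntry::Node(head) => {
                let mut e = Some(head);
                while let Some(n) = e {
                    if n.hash == hash {
                        return Some(n.value);
                    }
                    e = n.next.as_deref();
                }
                return None;
            }
        }
    }
}
```

Note there is no inner "restart" loop: if the walk inside a bin hits a forward, the real code re-enters the outer loop on the new table, which is the `continue 'outer` being discussed.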
498. Expected isize, found u32. Yeah, resize_stamp is a really weird function. So resize_stamp: it returns the stamp bits for resizing a table of size n. Must be negative when shifted left by RESIZE_STAMP_SHIFT. I think this is just gonna be isize::from of that. This is a deref. isize::from, it's a deref, yeah. From u32 for isize is not satisfied. Right, you're not guaranteed to be able to do that. So we're just gonna do this as isize and this as isize. We want that to be signed. store for Shared. Well, that should exist. So Shared... oh, that's right. That only exists for Atomic. It does not exist for Shared. So on 488, this is in transfer. Transfer, where are you? Down here. So what does this actually do down towards the bottom? It sets table dot store. These shouldn't be store, I don't think. These should be store_bin, because that's what we're trying to do down there. Is Shared equivalent to a shared pointer in C++? No, a Shared is very different. A shared pointer in C++ is more like an Arc in Rust. Shared is a very low-level type that the crossbeam library provides to us. A Shared is just a raw pointer with no guarantees associated with it. But it has the implication that the target of a Shared is accessible by many threads, and we don't quite know how many. I'll go into that in a little more detail when we start dealing with the actual garbage collection, and with the fact that dereferencing a Shared is unsafe. So this is deref, deref, deref. 479. Expected Atomic, found Shared. Oh, I remember we looked at this code last time. Okay, so this business is... let me go back to the drawing for this, actually, to figure out how this code is gonna work out. All right, so here is, let me get back to my... let's do like a nice blue. Let's make it also bigger. For some reason, the buttons on my drawing pad have stopped working, which is really annoying because it means I can't easily... oh, that's maybe overly aggressive.
Great, so the observation here is: we have our hash table, right? And in one of the bins, we have this business. So imagine that we have some really long linked list. And the new map is gonna be twice as long. So if this is n long, this is gonna be 2n long. And remember that the way we take a given hash and turn it into a bucket index is modulo n, right? Which means that in the resized table, to get to a bucket, it's gonna be h modulo 2n. What this means is that if you look at any given bucket, roughly half of the things that ended up in that bucket are gonna end up in a different bucket under modulo 2n. You can spend some time trying to convince yourself that that is the case, but it turns out to be true. Say this has hash one, this is hash two, hash three, hash four, hash five. For some of these, it's gonna be the case that h mod n is equal to h mod 2n, but for some of them, h mod n is not equal to h mod 2n. Let's take a trivial example: seven modulo four. Seven modulo four is three, but seven modulo eight is seven. Whereas three modulo four is also three, and three modulo eight is still three, right? And it turns out that basically half of the things that hash to a given bucket are gonna end up in the same bucket and half are gonna end up in a different bucket. Which means that we're gonna end up with some of these that are gonna end up in, let's say, bucket one, and some of these which are gonna end up in bucket two. And it might be that this is actually split, right? So it could totally be that, in fact, this guy over here, come on, this guy over here also ends up in bucket one. And what we would like to do, because we're gonna have to link them into their new places, right?
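To make the "half stay, half move" observation concrete, here's a small standalone demonstration (not code from the port): with power-of-two table sizes, an entry's new index is either its old index or its old index plus n, and which one it is depends entirely on the single extra hash bit `h & n`.

```rust
fn main() {
    let n: usize = 4; // old table size (a power of two, like the real map uses)
    for h in 0..16usize {
        let old_idx = h % n;       // bucket in the old table
        let new_idx = h % (2 * n); // bucket after the table doubles
        if h & n == 0 {
            // The extra bit is zero: the entry stays in the "low" bucket.
            assert_eq!(new_idx, old_idx);
        } else {
            // The extra bit is one: the entry moves to the "high" bucket, old + n.
            assert_eq!(new_idx, old_idx + n);
        }
    }
    println!("ok");
}
```

This is also why the real code can test a single bit per node instead of recomputing a full modulo during a resize.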
This is b1, this is b2, and this over here is n. We would like to amortize the cost of moving many things at once. We could just walk the entire list and move each element to its new bucket: do an insert for this in the new map, an insert for this, an insert for this. But that's sort of wasteful, right? In this case, these two elements, because they're both moving into here, this link is gonna remain valid, because that link is gonna be the same link in the new table: they both end up in the same bucket. The Java implementation calls this a run. And what the code is going to do is look for a run, move that entire run into the appropriate bucket, and then for the rest of the elements it's just gonna do normal inserts. And so that's the idea behind the code that we're about to fix. And here you'll recognize what it does: whether something is part of a run depends on the next bit up from n. So it looks for runs; this basically turns out to give you some trailing run of the bin that are all gonna end up in the same bucket. And if the run bit is zero, if the next bit after n is zero, the things that were in the run are gonna end up in the low bin, that is b1 in the drawing. Or if the run bit is one, they're gonna end up past n, so this is like seven, then they're gonna end up in the high bin. And this is moving everything before the run. And for anything that we're moving, we're gonna allocate new nodes, and the nodes that we insert need to be linked to the run that's gonna follow them. That's perhaps poorly explained. What we're gonna have to do is construct a linked list for each of the two bins, right? One for the low bin and one for the high bin.
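Here's a hedged sketch of the run-detection idea using a slice of hashes instead of a linked list (the function name and shape are mine, not the port's): the "run" is the longest suffix of the bin whose entries all share the same value of `hash & n`, and so all land in the same new bucket and can be moved as one chunk.

```rust
// Hypothetical helper: return the index where the trailing run starts.
// Assumes a non-empty bin. Entries from `start` to the end all have the
// same value of `h & n`, so they all move to the same new bucket together.
fn run_start(hashes: &[usize], n: usize) -> usize {
    let mut start = 0;
    let mut run_bit = hashes[0] & n;
    for (i, &h) in hashes.iter().enumerate() {
        if h & n != run_bit {
            // The distinguishing bit changed, so a new candidate run begins here.
            run_bit = h & n;
            start = i;
        }
    }
    start
}

fn main() {
    // With n = 4, the bits h & 4 are: 0, 4, 4, 0, 0 — the trailing run of
    // same-bucket entries (0, 0) starts at index 3.
    assert_eq!(run_start(&[1, 5, 6, 2, 10], 4), 3);
    println!("ok");
}
```

Everything before the run still gets a freshly allocated node in the appropriate new list, which is exactly the code we're fixing.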
Remember how we're splitting the bin into two bins, right? So we're constructing a linked list for the low bin and a linked list for the high bin. But whichever run we ended up taking, we're gonna have to append it to the appropriate linked list. And we can append it as one chunk; that's where we're saving the work. That means that the next pointer here is gonna be pointing to the thing that we wanna append at the end. And that may or may not be existing nodes, right? It might be one of these shared nulls, but it might also be the run that we found. And what Rust is complaining about is the fact that the next pointer should be an Atomic and it found a Shared. So why is that the case? Well, when we allocate a new node for the nodes we're adding to the new linked list, the next pointer needs to be set to something, and we're setting the next pointer to whatever link was. The problem is that if link was a run that already exists, we only have a Shared pointer to it. We don't have an owned pointer to it. And this is expecting... well, it's expecting these to be Atomic, which is not necessarily owned. But you can think of Atomic as a pointer and Shared as sort of the thing that's being pointed to. So if you have an Atomic, you can call load and then you get the target of the Atomic. Sorry, that's a lot of words. You can probably ignore the past like ten minutes of me rambling, but maybe it helped. The problem we run into here is that we're gonna keep appending to this linked list, which means we need to keep track of the start and end of the linked list that we're constructing. And the pointer to the start of that linked list is owned, because we're constructing a new one, but the pointer to the run, which we're gonna append at the end, is not owned, because it's part of some existing data structure. And Rust is complaining about the fact that these two types are mismatched.
And so how are we going to do this is a good question. The next pointer: what we're doing here is taking the run and sticking all the other things that belong in the same bin onto the start of it. And so the type of this next pointer is gonna be an Atomic pointer. So how do we get the Atomic pointer for the run that we extracted? Well, in theory, we could get that out right here. So what is the head? Yeah, we need to find a way to get... I'm trying to figure out how to draw this to explain it better. Here, let me try. So here's the problem we run into. Wish I had my buttons back. So this is our node type. Node has a next pointer, and logically there's another node that next points to, right? In reality, though, the type of this thing is Atomic, which really just means that it is a pointer. But this value, the address stored in the pointer, can change arbitrarily, right? Because it might be accessed from multiple threads. And so when we do an atomic load, what that means is: look at the pointer stored in this thing and do an atomic read of what the current pointer is. That's gonna return a Shared, and the Shared is gonna contain the value at the time of the read. So even though the value of the Atomic is this pointer and the value of the Shared is also this pointer, there's a distinction: the Atomic itself, its value, think of it as constantly changing. And when we load, what we get is a snapshot of the pointer value at some point in time, and that is what's stored in the Shared. So you might do an atomic load and get a Shared with a particular address, and you might then do a subsequent atomic load and get a Shared with a different address. When we are constructing our new node over here, right? And imagine that this has been identified as a run.
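The Atomic-versus-Shared distinction has a close stdlib analogue: `std::sync::atomic::AtomicPtr` is the mutable cell whose address can keep changing, and the raw pointer you get back from `load` is the snapshot. This sketch uses std types rather than crossbeam's `Atomic`/`Shared`, but the load-gives-you-a-snapshot semantics are the same idea:

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

fn main() {
    let mut a = 1i32;
    let mut b = 2i32;

    // The atomic cell: its stored address can change at any time
    // (here we change it ourselves; in the map, other threads do).
    let p = AtomicPtr::new(&mut a);

    // A load takes a snapshot of the pointer value at one point in time.
    let snap1 = p.load(Ordering::SeqCst);
    p.store(&mut b, Ordering::SeqCst);
    let snap2 = p.load(Ordering::SeqCst);

    // The cell's value changed, but the first snapshot still holds the old address.
    assert!(!std::ptr::eq(snap1, snap2));
    unsafe {
        assert_eq!(*snap1, 1); // dereferencing a snapshot is up to us to justify,
        assert_eq!(*snap2, 2); // just like dereferencing a crossbeam Shared
    }
    println!("ok");
}
```

Crossbeam's `Shared` additionally carries a lifetime tied to the guard, which is what makes the "valid for this epoch" reasoning checkable.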
What is its next pointer going to be? Well, its next pointer is gonna be pointing to this. But this business here is also gonna be an Atomic. It's not gonna be the same Atomic, right? These are different points in memory, but it does have to have the same value, which I think means that we can just use Atomic::new, because it doesn't matter that it's not the same Atomic, really. Which means that down here, I believe... let's just look up Atomic. Does that have a From&lt;Shared&gt;? Great. Yeah, so this is just gonna be Atomic::from. Expected node, found mutable reference: ah yes, this is gonna be a star. No implementation for u64 and isize. Well, okay, this is another bit that's a little awkward. Actually, no, this should be fine. Why is n here an isize? Why are either of these isize? What did it say? u64 and isize, so n is an isize. Why is n an isize? Where does n come from? It's a length, so it should be a usize. So why is it saying that it's an isize? Although, reading from the other code, we can just do as u64. But I'm not sure why it thinks it's an isize; it really shouldn't be. All right, binary operation cannot be applied to node, 465: p not equal to last_run. This is a pointer equality, is what we want here. Remember, in Rust, if you have x of type reference-to-node and y of type reference-to-node and you write x == y, what that will turn into is x.eq(y), the PartialEq implementation, right? That's what Rust is gonna turn it into. Whereas here, what we want to do is iterate until we reach the pointer that points to the place where we found a run. And so we actually want pointer equality here; we don't want it to use the equality implementation for node. So there's a pointer eq, I think. Let's just do this. Expected Shared, found reference. Okay, so last_run here. Why is last_run a reference? Why is head not a Shared? That's a great question.
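The value-equality versus pointer-equality distinction is worth pinning down, since mixing them up in this traversal would terminate the loop at the wrong node. A minimal stdlib example:

```rust
fn main() {
    #[derive(PartialEq)]
    struct Node {
        hash: u64,
    }

    let a = Node { hash: 42 };
    let b = Node { hash: 42 };

    // `==` goes through PartialEq and compares *contents*: a and b are equal.
    assert!(a == b);

    // `std::ptr::eq` compares *addresses*: a and b are different allocations.
    assert!(!std::ptr::eq(&a, &b));
    assert!(std::ptr::eq(&a, &a));
    println!("ok");
}
```

In the transfer loop we want the address comparison: keep walking until the cursor is the very node where the run starts, regardless of whether any earlier node happens to compare equal by value.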
So low_bin here, these are both Shared because they're shared pointers. last_run should also be a Shared. Not sure why it thinks that's not the case. Yeah, it's thinking that head, for whatever reason, is a reference and not a Shared, which sounds wrong. Where does head come from here? I think I'm blind. Oh, there. Yeah, so the trick here is this dereference. So bin here is a Shared to a BinEntry, but we're dereferencing here, which is gonna be an unsafe operation. That gives us what's inside, which is a node. So this is a normal reference, not a Shared, because of this unsafe deref. So I think what we wanna do here is say let head = Shared::from(head): I'm gonna turn it back into a Shared because it came from a Shared. No field next on Shared: so this is a deref. No field value on Shared: that's a deref, 467. Yeah, so this is gonna be... from a Shared, how do I get its pointer? Just as_raw, and p is also gonna be as_raw. Yeah, so this is a deref, deref, deref, deref, deref. This... oh, it's not letting me do this because I need to do that, which is fine. Um, deref, deref, this is not a deref. compare_and_swap: so this is gonna take an ordering, that's fine. This is fine, this is part of the garbage collection business that we talked about earlier, so that's also gonna have to do with a deref. Um, swap up here, that's missing something; it wants three parameters. Probably requires a guard, right? Atomic::swap takes a guard as the last argument. Okay, 382. Owned does not have a null. Let's look for... oh, interesting, there is no Owned::null. So if I have an Atomic, how do I set it to null? store just takes anything that implements Pointer, and Shared implements Pointer, so this can just be Shared::null. That's fine. 361, isize minus usize, that's fine. Actually, let's just do... where is stride? That's gonna be an isize; gonna make that easier. What else do we have? That is a deref. That's a deref. 334: garbage.
This is gonna have to be a swap, because we want the old value. We need to get the value that was there, because that is now garbage. That's fine. 333, and yep, that's fine. 330, method not found: into_boxed. So the notion here is: I have a vector, this is when we initially allocate the set of bins. We have a vector of bin entries and we want to basically remove the ability to resize it. And I believe there's a method into_boxed_slice; that's what I want. Which basically removes the Vec wrapper and turns it into a boxed slice instead. 309, transfer, that's gonna take a guard. 204, that's gonna take a guard. 291, max resizers, that's gonna be an isize as well. 288, resize_stamp, that's gonna be an isize, although that really should be a usize. I think what I really want here is... why is there a comparison between isize and usize here? "PartialEq for Shared uses pointer equality." Oh, nice. Then we can just do this. So why is it claiming that it can't compare those? I'm not trying to compare them. Is it getting confused by this somehow? All right, let's leave that for now. This is a deref. 270, that seems like count as isize, that's fine. 241, guard. 233, expected isize, found usize. Max resizers, isize plus usize. But max resizers is an isize, right? Yeah, that's weird. 229 is complaining about expected guard, found ordering; that's because the order of these arguments is wrong, which means the order of this is also wrong. 227, this is a deref. 207, expected Option, found integer: it should be Some(count). 198, what is this? This is gonna be the same thing as before, where here we need to say let head = Shared::from(head), as before. This is just to reapply the Shared wrapping after we undid it up here. Let's see what that gives us. Oh, there are certainly fewer now. 311, the guard comes from up here; this probably needs to take a guard, so that it just uses the guard it's given rather than making up its own guard.
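For reference, here's what that `into_boxed_slice` step does in isolation: it drops the `Vec`'s capacity bookkeeping, leaving a fixed-size heap allocation that can no longer grow or shrink, which is exactly the property we want for the table's bin array.

```rust
fn main() {
    let v: Vec<u32> = vec![1, 2, 3];

    // Box<[u32]> is an owned pointer to a heap-allocated slice with no
    // capacity field: the length is fixed from here on.
    let b: Box<[u32]> = v.into_boxed_slice();

    assert_eq!(b.len(), 3);
    assert_eq!(b[1], 2);
    println!("ok");
}
```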
All right, what else do we have? 209, this takes a guard. That's a deref, deref. That is not a deref. No field value on type Owned. Well, that is certainly interesting. Owned should implement Deref, though, right? Yeah, am I missing something here? Oh, BinEntry. A BinEntry can also be a forwarding node. So how do we know this is not a forward? Where is this code? This is in put. The existing value, old value. Why is this even using node? Ah, node is the thing that we construct up here, and so we actually know that it's a BinEntry::Node. So here what we can do is destructure it, because we know that that is what we constructed in the first place, and the other arm is unreachable. And then this can just use the value. 211, missing guard, or not missing guard: expected reference, found guard. Where's this guard coming from? Ah, reference to boxed slice, pointer to slice; in other words, a slice by itself is unsized. Yeah, a boxed slice is an owned pointer to a heap-allocated slice. Deref, deref, deref, deref. 186 is going to be Owned. Actually, can I destructure an Owned? I cannot. Hmm. Oh, but I can do into_box, which is what I want to do. Ah, no field key, 182. Oh, that's really awkward. Yeah, we need node's key and hash; it's because the code doesn't know that the value we've constructed to be put into the map, or into the linked list, is of type BinEntry::Node. It doesn't know that it's not a forwarding node, BinEntry::Moved, for example. So we need to tell it that that is the case, and once we've done that, we can use the hash and the key. And I guess we can actually do this so we don't have to do it each time. Ah, a lot of derefs. No field lock, that's also a deref. No field key, though: that is different.
So up here, this is the same issue, actually, where I think what we want to do is probably this: just hoist that all the way out. And then up here in this guard... this can probably be moved even further out, all the way up here. And I think we only need the key. Oops. This is basically to give us easy access to the key. We take the key that the user gives us and we stick it in this node, but we still want to be able to easily compare against the key, and so we just take a reference to it down here. And now in these places we can use the key. This is just gonna be h for hash. Then I think we're good. All right, let's try that. What else are we missing up here? 162, that one we fixed. 159 is gonna include the guard. 139 is gonna include the guard down here. Deref... oh, actually, let me do that. Deref, deref, deref, deref. init_table: not found. All right, so this is a function that we're missing. init_table. Interesting. So we do actually have to implement this function. Should be easy enough, though. init_table, and I guess it's gonna return a Shared to a table; that's the idea. This is probably pretty straightforward. This is just: you decide that you have to allocate a new table, which only one thread should be doing. So I think what it's doing here is double-checking that there is in fact no table, whether the table is empty, and then it checks that no one else is trying to do the table initialization. Yeah, it just allocates the new table. Great. So this should be straightforward. Here we can just do while self.table.load with an ordering... I guess this is probably also gonna have to take a guard... while this is null. It's gonna have to be a loop. It's gonna be: if the table is not null and its bins are not empty, so basically if there is a table that's not empty, then we can just return that table. Otherwise, try to allocate the table, which we're gonna do by reading sc from the size control with a load.
Take a guard. So we're basically gonna check the control bits to make sure that no one else is trying to initialize the table at the same time. If sc is less than zero, we lost the initialization race; just spin. That's gonna be a thread::yield_now and a continue. So here what we're really doing is: we know that someone else is allocating the table for the first time, so we're just gonna yield so that someone else gets to run if they can. Then we're gonna try to load the table again and presumably return pretty quickly. Otherwise, we're gonna try to become the thread that initializes the table, which we're gonna do with a compare-and-swap on the control value, to basically make us the initializer. And we're gonna do that by taking the old sc and replacing it with minus one, with Ordering::SeqCst. And if we succeed at that compare-and-swap, then we get to do it. Let's see, so what does this do? Oh, so at this point, we need to double-check this business. So: if the table is null or the table is empty, then we're gonna allocate it. Think of the size control as sort of a lock here, and so we need to recheck the conditions under which we initialize the table after we took the lock. So this compare-and-swap, if it succeeds, we sort of have the initialization lock, and now we need to check that initialization is actually still necessary. And down here, we're gonna do a size control store of sc; think of this as releasing the lock. And in here, what are we gonna do? n is: if sc is greater than zero, then sc, else the default capacity. And then this is going to be the allocation, which is similar to what we did under resizing with the new table; it's gonna be this. And now we're gonna do... let's see here, this is gonna return table. I guess actually we can just break here instead, given that this is a loop. In this case, table is going to be... I guess self.table.store, because we're gonna store this as the new table.
And we don't need to worry about deallocating the old one, because we've already checked that there wasn't anything there. So we're just gonna store it straight ahead. And I guess actually we could take a Shared from it as well. So if I have an Owned, can I get a Shared? Yes. So we're gonna do table = new_table.into_shared, and then this is gonna be store(table). Let's see if that's gonna work. And then sc. So remember, sc is a sort of weird value where, if it's negative, there's a resize in progress and it's used to count how many threads are helping with the resize; if it is not negative, it is the next capacity at which to resize. It's a really weird value. And so here, because we allocated the table, we set it to n minus n shifted right by two; not sure why at first glance. Oh, I see: n - (n &gt;&gt; 2) is 0.75 times n, so it's the threshold for the next resize, the load factor. All right. Well, in any case, this is what it does. I wonder why this is a try. Okay, but that is going to allocate our table for us. And we break with the table. And we return. All right. Now what's it going to complain about? It's deref, deref, deref, deref, deref, deref, deref, deref, deref. That's not a deref. "Couldn't that peculiarity be encoded through the type system to avoid future confusion?" So one thing that's awkward about this kind of highly concurrent code is that you're relying on what the CPU lets you do atomically. And in this case, usually the CPU will only let you do atomic operations on things that are word-sized. So basically, if you have a 64-bit processor, you can only do atomic operations on things that are 64 bits or sometimes slightly smaller, which means it's not arbitrary types we can do these atomic operations on. And the size control field is used heavily in an atomic sense; it's used basically as a lock.
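Putting the pieces of that initialization dance together, here's a toy stdlib model of the size-control handshake (field names and the default capacity of 16 are my assumptions; this uses `AtomicIsize` and `compare_exchange` rather than the map's real crossbeam types): minus one means "someone is initializing", and a positive value is the next resize threshold.

```rust
use std::sync::atomic::{AtomicIsize, Ordering};
use std::thread;

// Toy stand-in for the map's size-control field.
static SIZE_CTL: AtomicIsize = AtomicIsize::new(0);

fn init_table_sketch() -> isize {
    loop {
        let sc = SIZE_CTL.load(Ordering::SeqCst);
        if sc < 0 {
            // Lost the initialization race: yield so the winner can finish.
            thread::yield_now();
            continue;
        }
        // Try to become the initializer by swapping in -1 ("the lock").
        if SIZE_CTL
            .compare_exchange(sc, -1, Ordering::SeqCst, Ordering::SeqCst)
            .is_ok()
        {
            let n = if sc > 0 { sc as usize } else { 16 }; // assumed default capacity
            // ... allocate the table of n bins here (elided) ...
            // "Release the lock" by publishing the next resize threshold,
            // n - n/4, i.e. 0.75 * n.
            let next = (n - (n >> 2)) as isize;
            SIZE_CTL.store(next, Ordering::SeqCst);
            return next;
        }
        // CAS failed: someone raced us; go around and re-check.
    }
}

fn main() {
    let next = init_table_sketch();
    assert_eq!(next, 12); // 16 - 16/4
    println!("ok");
}
```

The real version also re-checks that the table is still null after winning the CAS, for exactly the reason discussed above: the "lock" and the condition it guards are checked at different times.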
And so while it would be nice if it was a type that we could control ourselves, like an enum with all these nice properties, if we did, we couldn't have these atomic operations on it. Or we would have to use something more heavyweight, a lock, to guard it, and because it's such a critical piece of the concurrency business, that would become a very highly contended operation. Now, it could be that we could just replace it with a lock and it would be fine, given that it's doing compare-and-swaps anyway. But I think we're gonna assume that the authors of the Java version were like: we don't want to lock the size control field, we want to do atomic operations on it. And because that is the case, we can't have it be an arbitrary type, unfortunately. 156, what's it complaining about here? BinEntry::Node has no field next. That is true; Node does. 55, that's just gonna complain about those fields. That's okay. 147, expected... okay, so this has to borrow the key, that's fine. 144, insert: this is gonna be self.put(key, value, false). False because we want to allow replacement for insert. And 136, expected isize, found usize: as isize, that's fine. 134, into_shared probably takes a guard. If and else have incompatible types: expected isize, found usize, the default capacity. Because sc can be negative. So what we're gonna do here is: this can be an as usize, because we're already checking that it's greater than zero. This is a deref, deref, deref, deref, deref. And 313, this business: interpreted as generic arguments. Yeah, it's parsing the as-cast on sc as if it started a generic parameter list, but that's not the case, so we need to make it not be confused. Great. And this is not going to use UnsafeCell anymore. All right. So now we need to deal with all these derefs. And so I'm gonna commit here, just so we can finish up the port except for garbage collection.
Push it as well, for those of you who are following along at home. Nice. Okay. So now we get to a real tricky part: now we're going to have to deal with the garbage collection part of this. And if you think about it, if we never deallocate any data, if we never free anything, then all of the places that have Shared pointers, it's always fine to dereference them; they will always point to valid data, right? And so if we just never collected garbage, we could add unsafe derefs to all the Shareds and all of them would always work. That'd be great. But that's not quite satisfactory, right? What we really want is to deallocate things. For example, if we do a table resize, we want to eventually deallocate the old table, the old set of bins. If we remove a value from the map, or replace a value in the map, we want the old value to eventually be deallocated. But once you do that, now you need to be really careful that if you ever have a Shared pointer, the moment you dereference it, you know that it's going to remain valid. And so that is what this next bit is going to be about. This is a great place for a short break for me to go pee and make tea. I'll also annotate this in the video itself so that people can jump to this point. So I'm gonna go make tea; everyone go pee and do whatever you need to do. If you have questions about what we've done so far or what we're doing next, now is a great time. We'll take just a short bit for me to go through any additional points of confusion that you might have. Right, so: short break time. I should really have like elevator music. But I'll be back in a second. We are alone. "Any way to specify an enum type to represent the 64-bit int and do the atomic ops with it?" You could make an enum, like a wrapper. One issue we would run into here is I don't think it would help us much.
Because here, like sc, we really want to say something like: it is either negative or positive. Actually, we wouldn't have to stick to that; it might be possible to do that and sort of use the high bit as the enum discriminant. I think we would basically end up operating on that enum as if it were a number anyway. And it wouldn't really help you, because you wouldn't be able to do things like pattern matching; the way in which you would know which variant you're in is you would have to read out numbers. What we could do is maybe make a newtype around sc and have that have some helpful method names for things, like "take control": methods that under the hood do these atomic operations on the number. So that might be something we could do. So, Erwin, I'm about to go through what the plan for garbage collection is, so I'll do that in a bit. I think the tea is done, and then we will dive head first into garbage. Let's see here. So, any further questions about the code we've written so far, what the plan is, anything about what we're about to do? Or does the explanation of why dereferencing a Shared is safe if you never collect anything make sense? I'm going to take the silence as a yes. Okay, so here's where it gets tricky. There are a couple of places in the code that we've annotated with... let me see if I can dig one up here, for example. Arguably, I can close this now. Let's close that. There are a couple of places where we generate garbage. This particular example is: we are doing an insert and there is already a value for the key that we're trying to insert. And so we need to drop the old value, or rather, we're going to replace that value, which means that the value that was there is going to go away. And if that value is going away, it should be dropped eventually.
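The newtype idea floated here can be sketched quickly (every name below is mine, not the port's): wrap the raw atomic and give the magic values descriptive method names, while the underlying operations stay plain word-sized atomics, so we give up nothing on the CPU side.

```rust
use std::sync::atomic::{AtomicIsize, Ordering};

// Hypothetical newtype around the size-control value.
struct SizeCtl(AtomicIsize);

impl SizeCtl {
    fn new() -> Self {
        SizeCtl(AtomicIsize::new(0))
    }

    /// Try to become the thread that initializes the table.
    /// Ok(prev) means we now "hold the lock" (-1 is stored); Err means we lost.
    fn try_take_init_lock(&self) -> Result<isize, isize> {
        let sc = self.0.load(Ordering::SeqCst);
        if sc < 0 {
            return Err(sc); // someone else is already initializing
        }
        self.0
            .compare_exchange(sc, -1, Ordering::SeqCst, Ordering::SeqCst)
    }

    /// "Release the lock" by publishing the next resize threshold.
    fn release(&self, next: isize) {
        self.0.store(next, Ordering::SeqCst);
    }
}

fn main() {
    let ctl = SizeCtl::new();
    assert!(ctl.try_take_init_lock().is_ok());
    assert!(ctl.try_take_init_lock().is_err()); // now held: second attempt fails
    ctl.release(12);
    assert!(ctl.try_take_init_lock().is_ok()); // released: can be taken again
    println!("ok");
}
```

You still can't pattern-match on the variants, as discussed, but at least the call sites read as intent rather than as bit-fiddling.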
The problem is we can't drop it immediately, because there might be other threads that are reading that value concurrently with us. And so the question becomes: when is it safe to drop that value? And there are many, many strategies for dealing with this. One such strategy is: if you have a garbage-collected language like Java, the language has a runtime that tracks whether there exist pointers to basically every object, everything that you ever allocate. The runtime is going to keep track of whether you have any pointers to it, and when there are no more pointers to it, so there's no way for someone to reach that value, only then does it get destroyed. In Rust, we don't really have this luxury, because we don't have a runtime; we don't have something that's running in the background and knows what pointers everyone has. Which means that we need some other strategy for knowing when it's safe to drop a value. There are many strategies to do this without a runtime, and there's all this research literature you could look into. What we're going to do is use the one that comes with crossbeam. We talked a little bit about this in the very early days of this port. Let me make that larger and easier to read. And actually, I recommend you read this text while I eat some pineapple, because this is basically exactly the strategy we're going to take. But the premise here is that any time some thread is going to start doing stuff with the map, and doing stuff can be a read or a write, basically from the point when it does its first pointer read from the map, and so now has a live pointer into that map or any descendant data structures: from that point, we're going to assert that that thread is in a given epoch. And when it releases the last of the pointers it has into the map, even transitively, deep down, then we're going to say that that thread is now done with that epoch.
What this means is you're going to have multiple threads, and all of them are going to be entering and leaving epochs. That's not technically the right terminology: they're going to be pinning and unpinning epochs is usually the way it's phrased. And as long as an epoch is pinned, any garbage that's produced in it will not be freed. So imagine that we start in epoch one, and in epoch one, some thread pins the epoch. So now we are not allowed to move on from epoch one; epoch one is going to stay around for a while. If that thread or any other thread generates garbage in epoch one, that's going to be stored in, like, the epoch-one garbage bin. And as long as anyone still has epoch one pinned, that garbage is going to stay around, so any pointers into it are going to be valid, because the objects haven't been deallocated. When there are no threads left that have pinned epoch one, epoch one is considered complete, or closed, or ended. At that point, we know that that garbage is no longer reachable by anyone, because if you entered after the garbage was generated, you would have entered in epoch two. And if you entered in epoch two, you can't have reached the garbage, because it was already unlinked. And so once all threads have unpinned epoch one, any garbage that was generated in epoch one is no longer reachable by anyone, because everyone who is in epoch two must have arrived after the removal happened, and so therefore it's safe to remove that garbage and free those objects. And so this is what's known as an epoch-based garbage collection scheme. We're going to generate garbage in epochs, and we're only going to free the garbage in a given epoch when every thread has moved on from that epoch. And because all reads — this is what the guard is for — in order to generate a guard, you pin the epoch, and so you know that nothing you read is going to be garbage collected.
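To make the pin/retire/collect lifecycle concrete, here's a deliberately tiny, single-threaded toy model of epoch-based reclamation. Nothing here is crossbeam's real implementation (crossbeam uses lock-free per-thread state, deferred closures, and more epochs-in-flight machinery); this just shows the core rule: garbage is tagged with the epoch it was retired in, and is only freed once no live pin is from that epoch or earlier.

```rust
// Toy epoch-based collector: all names and the data layout are illustrative.
struct Collector {
    global_epoch: u64,
    pins: Vec<u64>,                  // epochs currently pinned by "threads"
    garbage: Vec<(u64, &'static str)>, // (epoch retired in, retired object)
}

impl Collector {
    fn new() -> Self {
        Collector { global_epoch: 0, pins: Vec::new(), garbage: Vec::new() }
    }
    /// A "thread" starts reading: it pins the current global epoch.
    fn pin(&mut self) -> u64 {
        self.pins.push(self.global_epoch);
        self.global_epoch
    }
    /// The "thread" drops its guard.
    fn unpin(&mut self, epoch: u64) {
        let i = self.pins.iter().position(|&e| e == epoch).unwrap();
        self.pins.remove(i);
    }
    /// Something was unlinked from the map: it becomes garbage in this epoch.
    fn retire(&mut self, obj: &'static str) {
        self.garbage.push((self.global_epoch, obj));
    }
    /// Try to advance the epoch and free garbage; returns how many were freed.
    fn collect(&mut self) -> usize {
        // Advance only if nobody is still pinned in an older epoch.
        if self.pins.iter().all(|&e| e == self.global_epoch) {
            self.global_epoch += 1;
        }
        let min_pinned = self.pins.iter().copied().min().unwrap_or(self.global_epoch);
        let before = self.garbage.len();
        // Garbage retired strictly before every live pin is unreachable: free it.
        self.garbage.retain(|&(e, _)| e >= min_pinned);
        before - self.garbage.len()
    }
}

fn main() {
    let mut c = Collector::new();
    let reader = c.pin();       // a reader enters epoch 0
    c.retire("old value");      // an insert replaces a value: garbage in epoch 0
    assert_eq!(c.collect(), 0); // reader still pinned in epoch 0: nothing freed
    c.unpin(reader);            // reader drops its guard
    assert_eq!(c.collect(), 1); // now the epoch-0 garbage can be freed
    println!("ok");
}
```

The crossbeam API we'll actually use hides all of this behind `pin()` giving you a `Guard`, and retiring happens via the guard rather than by hand.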
Because I'm going to pin the epoch and then do a bunch of reads, and if someone deletes any of that stuff, it gets deleted in the current epoch, because I've pinned the epoch. So the garbage gets entered in my epoch, and that means that any pointers I read in that epoch are going to remain valid for as long as that epoch is open. And so I can traverse these pointers just fine, and I know that all of them are safe to access. Then at some point I'm going to give up my pin of that epoch. At that point I don't have any pointers anymore, so for me it's fine if any of that garbage is now freed. As long as that is the case for all threads, this should all be safe. Did that roughly make sense? Should I try to draw it? "Pedantic on my pronunciation." Sure, go ahead; both pronunciations of epoch are valid as far as I'm aware. "Does this mean that this strategy tends to accumulate more garbage on average over time than traditional GC, but less bookkeeping is necessary?" Sort of. Epoch-based garbage collection doesn't necessarily accumulate more garbage. The amount of garbage is independent of the strategy used to reclaim it. The question is: how long does it take for garbage to be reclaimed? Or, phrased differently, how long does it take between something getting deleted and it actually getting dropped? This is a product of two different things. One is, obviously, it's not going to get freed until it can be freed. The time between when I remove something and when it actually gets freed is going to depend on whether there are other readers that hold onto it. If someone is holding onto it for the duration of the program, it's never going to get freed. And that's fine; unfortunate, but fine. So that amount of time is not that interesting. What's interesting is: how long does it take from when no one has a pointer to a piece of garbage until that garbage is reclaimed?
And so that period of time is determined by your garbage collection scheme. If you use a garbage collected language, they have different schemes internally too. You can implement a runtime garbage collector using epochs. You can do it using sort of generational GC. There's stop-the-world GC, mark-and-sweep. There are all these strategies, and they all have different properties in this regard. Very often a garbage collected language, like a runtime garbage collector, is not going to free anything immediately, not at all. It's going to try to amortize the cost of collecting that garbage, so it will also introduce a delay. The other extreme of this is if you use something like reference counting. Reference counting is a garbage collection scheme. If I have a reference counted value, what that means is at some point I will know that no one has a pointer to it anymore, because the reference count went to zero. And the moment that happens, I'm going to free it. So in a reference counted scheme, garbage is destroyed as soon as it can be, but that can also mean that you don't get to amortize the cost of deallocating things, because you must deallocate them immediately. An epoch-based scheme does not have more or less garbage at any given time than a runtime garbage collector. The only thing that's tricky about epoch-based garbage collection is that it is sort of based on cooperation. So I mentioned that a thread can choose to pin the epoch and then release that pin. Well, imagine that some thread pins the epoch and then runs for a really long time with the epoch pinned. If it does, any garbage that's accumulated is just going to sit around waiting for that one thread. And so it requires that the pieces that are using your epoch-based memory reclamation are cooperative in some sense, that they don't hold on to a pin for longer than they need. This is also why, if you look at Guard, Guard has this method called repin.
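Backing up to the reference-counting contrast for a moment, this is easy to demonstrate in std Rust: with `Arc`, the destructor runs at the exact instant the last handle is dropped, with no batching. The `FREED` flag and `Noisy` type here are just illustration scaffolding.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Flag flipped by the destructor so we can observe exactly when it runs.
static FREED: AtomicBool = AtomicBool::new(false);

struct Noisy;
impl Drop for Noisy {
    fn drop(&mut self) {
        FREED.store(true, Ordering::SeqCst);
    }
}

fn main() {
    let a = Arc::new(Noisy);
    let b = Arc::clone(&a); // refcount goes to 2
    drop(b);                // refcount back to 1; nothing is freed yet
    assert!(!FREED.load(Ordering::SeqCst));
    drop(a);                // refcount hits 0; Drop runs right here
    assert!(FREED.load(Ordering::SeqCst));
}
```

That immediacy is the trade-off being described: no delay between unreachability and reclamation, but also no opportunity to amortize deallocation work the way an epoch or tracing collector can.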
And what repin does is it releases the pin and then immediately takes it again. And this can be useful because if you repin, you allow the epoch to move on, right? Because you take a mutable reference to the Guard, anything that was dependent on the Guard is now no longer valid; the borrow checker is going to check that. And you're going to release the current epoch you had pinned and then immediately pin whatever is now the current epoch. And so this is the way that if you have a thread that's, like, really busy doing some operations, it can occasionally repin in order to let garbage be collected. Do all previous epochs have to be unpinned? That is, if epoch 1 is still pinned by some thread, we are in epoch 3 and epoch 2 has some garbage and no one has pinned 2, could it be freed? Not quite. This depends a little bit on the exact scheme. Let me see if it says here. Yeah, so there is a global epoch and you are pinning the global epoch. So that means that if I pin epoch 1, no one is moving on. We are staying in epoch 1 until everyone has moved on from epoch 1. So it's not as though every thread has its own epoch. It's that either you are in the current epoch or you are in the next epoch. And once all the threads are in the next epoch, then the next epoch becomes the global epoch. If that makes sense? So there usually aren't many epochs. In theory you could do sort of generational epochs, but I don't think that's what Crossbeam does, because it gets really hard to keep track of. Does that make sense? Okay, in that case we are going to try to do this. So the gist of Crossbeam's epoch-based reclamation is that whenever you load a value, you have to give a guard to say that I load this value and it's protected by this guard, and the guard of course pins the epoch. So it's sort of like: the read I'm doing now is going to remain valid for the current epoch. And you'll see that there's this defer_destroy.
So defer_destroy says: this value is now garbage; whenever the epoch moves on, you should feel free to free it. And then Crossbeam will, under the hood, take care of that freeing, whenever all the threads have unpinned the previous epoch, so to speak. So what we need to do is use the guard's defer_destroy in the appropriate places where we generate garbage. And in theory that should be it. This pin has nothing to do with std::pin, by the way; these are separate concepts. So, any place that we generate garbage. Let's see. Yeah, so the phrasing here is important: there is no guarantee when exactly the destructor will be executed, if we do a defer_destroy. The only guarantee is that it won't be executed until all currently pinned threads get unpinned. Right. Okay, so this is really important. The moment we mark this as garbage, when I do guard.defer_destroy(now_garbage)... I like to annotate any unsafe code with a statement about why it is actually safe. So in this case, what we need to argue is, actually, let's double-check that I'm not being stupid. So for safety: the object must not be reachable by other threads anymore, otherwise it might still be in use when the destructor runs. And the value must be Send. That's going to be its own kind of interesting. Okay, so we're going to require, actually this is a little awkward, we're going to have to require that the value is Send. It's also going to have to be Sync, obviously. And I think the key is also going to have to be Sync. Okay, so the safety we need to promise here: we need to guarantee that now_garbage is no longer reachable. More specifically, no thread that executes after this line can ever get a reference to now_garbage. Well, let's think about whether this is true. A thread that already pinned its epoch might have a pointer to this garbage, right? Think about what happens if a reader reads this value and then we replace it. So we mark it as garbage.
That old thread still has a reference to it, so it's not safe to drop. And so the question is, is defer_destroy still safe? And the argument is that, well, we have pinned the current epoch. That means that anyone else who pins the epoch is also going to get the current epoch. So if someone else reads this value, that has to happen prior to the line where we swap it out. Once we swap it out, no one else can see it. So any future thread, anyone that pins in the future, is not going to be able to access this value. It's only about threads that executed before this line, because they are the only ones that might have seen that value and have references to it. So any previous thread must have pinned either before we took our guard or while our guard was active. That means that they must be in the same epoch that we are, or in an earlier one. And because that is the case, we know that this garbage is generated as of the current epoch, so it won't be freed until the next epoch. And we just said that any thread that has seen this value, that has a reference to this value, must be in our epoch or a previous one. And therefore the safety requirement is upheld. Let's see if I can phrase this for the purposes of the comment. No thread that executes after this line can ever get a reference to now_garbage. Here are the possible cases; it's like a useful exercise in working through safety issues. Another thread already has a reference to now_garbage. Well, then they must have read it before the call to swap, which means their guard must have been taken before the swap happened, either before our guard or while our guard was active.
Because of this, that thread must be pinned to an epoch less than or equal to the epoch of our guard. Since the garbage is placed in our epoch, it won't be freed until the next epoch, at which point that thread must have dropped its guard and with it any reference to the value. The other case is another thread is about to get a reference to this value. They execute after the swap and therefore do not get a reference to now_garbage; they get the new value instead. So freeing now_garbage is fine. So that is the argument for why freeing here is safe. Does that argument make sense? Don't you need to guarantee that the swapped pointer was unique, not stored in any other shared place? Yes, that is also true. We need to guarantee that this pointer, this now_garbage pointer, is distinct: there are no other ways to get to a value except through its node's value field, which is what we swapped. So freeing now_garbage is fine. So there's a requirement that there's no other path to that value, or there would be a real problem. All right, where's the other one? So that was for freeing values. And this one, we're missing a case here. So here, we've done a resize, and the resize is finished. So the next table becomes the current table, and the old table needs to be freed, right? This is the Vec that we did the into_boxed_slice on. So that needs to be freed somehow. How are we going to do that? Well, here we're going to have an unsafe guard.defer_destroy of the garbage. Safety: this one is a little trickier to argue, although actually we need to guarantee the same thing. Let me grab the safety comment from before. So this is the property we need to guarantee: that now_garbage is no longer reachable. No thread that executes after this line can ever get a reference to now_garbage. Right.
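The swap-then-retire pattern being argued about has a simple shape, which can be sketched with std's `AtomicPtr` standing in for crossbeam's `Atomic`/`Shared`. After the swap, the old pointer is unreachable through the slot, so no future reader can obtain it; in the real code it would be handed to `guard.defer_destroy` rather than freed on the spot. `replace_value` is a made-up helper for illustration, and the example is single-threaded, which is the only reason the immediate free is sound here.

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

// Sketch of the replace-a-value pattern. NOT the real map code.
fn replace_value(slot: &AtomicPtr<String>, new: String) -> String {
    let new_ptr = Box::into_raw(Box::new(new));
    // Swap the new value in; the return value is the old pointer.
    let now_garbage = slot.swap(new_ptr, Ordering::SeqCst);
    // Safety argument (mirroring the comment in the text): only threads
    // that read the slot *before* the swap can hold this pointer. This
    // single-threaded example frees it immediately, which is exactly the
    // step that epoch-based reclamation would defer.
    *unsafe { Box::from_raw(now_garbage) }
}

fn main() {
    let slot = AtomicPtr::new(Box::into_raw(Box::new(String::from("old"))));
    let old = replace_value(&slot, String::from("new"));
    assert_eq!(old, "old");
    let current = unsafe { &*slot.load(Ordering::SeqCst) };
    assert_eq!(current, "new");
    // Free the final value so the example doesn't leak.
    unsafe { drop(Box::from_raw(slot.load(Ordering::SeqCst))) };
}
```

The uniqueness point raised in chat shows up here too: this is only correct because the slot is the one and only path to the old allocation.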
So the safety property we need to uphold is the same. And so here are the possible cases; I'll just copy the cases as well. The argument isn't quite the same. Well, the argument is pretty similar. Another thread already has a reference to now_garbage: they must have read it before the call to swap. Right, so that is certainly true. Because of this, that thread must be pinned to an epoch that's less than or equal to the epoch of our guard. Since the garbage is placed in our epoch, it won't be freed until the next epoch, at which point that thread must have dropped its guard and with it any reference to the value. So that holds true here as well. If someone else has a reference to what is now garbage, it must be because they read self.table in the past, which means that they are tied to an epoch in the past, which means that as long as we drop in the next epoch, we're all good. Oh, you can indent in Vim with a number of lines followed by double angle bracket left or double angle bracket right, but in this case it's just rustfmt doing it for me. The other case, and this is where the safety question really comes up, is we need to guarantee that another thread that's about to get a reference to this value won't get one. So how can we guarantee that? Well, actually, let's do this. First, we need to argue that self.table is the only way to get to now_garbage, right? So this is the uniqueness property. Then we need to guarantee, well, we can't give the guarantee that no other thread is going to get to now_garbage unless we give the argument for why. Well, it is not accessible through self.table anymore; it cannot be. It is not accessible through self.next_table. What about forwarding nodes? So this is BinEntry::Moved. Well, okay, so the argument here is there might be some BinEntry::Moved entries
that point into now_garbage. Because remember, a BinEntry::Moved just contains a raw pointer to the table that you're going to continue your search in. Well, the only BinEntry::Moved entries that point into now_garbage are the ones in previous tables, or I guess let's call them earlier tables. And to get to those previous tables, one must ultimately have arrived through self.table, because that's where all operations start their search. Now, self.table has changed, so only old threads can still be accessing those tables; no new thread can get to past tables, and therefore they also cannot get to Moved entries that point into now_garbage. So we're fine. It seems like Rust is not helping as much as it could; using this allocation scheme, maintaining these invariants seems hard to reason about. That's true. For this kind of low-level concurrency, this is just unsafe code. And think of it this way: unsafe code is there in order to let you write code that relies on invariants that the compiler cannot check for you. And it's totally true that the Rust borrow checker just, like, does not understand what we're doing here. Remember, we're relying on really intricate concurrency properties here, right? We're relying on things like the relative ordering between different atomic operations, or control of, like, the size-control field, right? And these are things that the compiler doesn't know. This is why, for many of these data structures, people have written long research papers with formal proofs that they're correct. And so the Rust compiler doesn't really help us that much here. Now, it does help us in some regards, right? Like, this sort of guard-and-shared scheme does help.
It does mean that it's harder to write incorrect code, but it does not by any means make it impossible. And it doesn't mean that any code that might be incorrect is going to be marked as unsafe. There are also some things that the safety checking is completely removed from, like the restriction that the key and the value are Sync, and that the value is Send. In fact, I think the key also has to be Send, now that I think about it, because it might be deallocated by a different thread. So these properties, the compiler just has no chance of checking for you. But that's why it's nice that Rust provides this sort of escape hatch of: if you think you know better, like you know that the invariants ensure that this is okay, then go ahead. Okay, so actually we can make this argument in a perhaps slightly more useful way, which is we first say: first, let's talk about threads with existing references to now_garbage. Such a thread must have gotten that reference before the call to swap. Actually, no, here. Actually, yeah, this is the way to make this argument. This means that no future thread, i.e. one in a later epoch where the value may be freed, can get a reference to now_garbage. Next, let's talk about threads with existing references to the garbage. Such a thread must have gotten that reference before the call to swap. Because of this, that thread must be pinned to an epoch less than or equal to the epoch of our guard, since our guard is pinning the epoch. Since the garbage is placed in our epoch, it won't be freed until the next epoch, at which point that thread must have dropped its guard and with it any reference to the value. You'll notice that these safety arguments, the ones about when we free stuff, are really the safety arguments for both this and for the reads.
Because as we mentioned before, if we never deallocated things, then all of the reads would be safe; all of the dereferences would be safe. And so once we make the argument that this deallocation is safe, then that is sort of inherently also the argument for why the derefs are safe. There should be one more of these, which is when you free a bin, although I can't find that now, which is interesting. One challenge we have with the Rust code compared to the Java code is that the Java code never actually specifically says where it drops a value. When it goes out of scope, it just sort of overwrites the value, and then the old value will be garbage collected. There is definitely at least one memory leak here, which is here: the old P is now garbage, because it has been replaced with the node we allocated above. Right, so this is: you're doing a resize, and there's an existing linked list, and there's a new linked list which we're transferring into. For every node in the old linked list, we're going to create a node in the new linked list, and we're going to fill it in with the fields from the old one. And once we've done that, the nodes in the old linked list are now garbage, right, except for the run which we moved wholesale. But we can't free them yet, because there might still be someone reading from the old map and hence the old bin, the old linked list. So, but it is garbage. And so here we are going to do guard.defer_destroy. Safety... why is that indentation being weird? Strange. So what's the safety guarantee here? First, we need to argue that there is no longer a way to access P, which is not actually true; that's only true down here. Yep, that's awkward. So it doesn't actually go there, it goes down here. Everything up to last_run in the old bin's linked list is now garbage.
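The "free everything up to last_run" walk can be sketched with plain raw pointers instead of crossbeam `Shared`, and an immediate free instead of `defer_destroy` (fine here only because the example is single-threaded). `free_prefix` is a made-up name; the detail worth noticing is that `next` must be read before the node it lives in is freed.

```rust
// Toy raw-pointer list node; the real map's nodes hold keys, values, etc.
struct Node {
    val: i32,
    next: *mut Node,
}

// Free every node from `p` up to (not including) `last_run`, recording what
// was freed so the behavior is observable.
fn free_prefix(mut p: *mut Node, last_run: *mut Node, freed: &mut Vec<i32>) {
    while p != last_run {
        // Read next first, or we'd be reading out of freed memory below.
        let next = unsafe { (*p).next };
        freed.push(unsafe { (*p).val });
        unsafe { drop(Box::from_raw(p)) };
        p = next;
    }
}

fn main() {
    // Build 1 -> 2 -> 3, where 3 plays the role of the run moved wholesale.
    let n3 = Box::into_raw(Box::new(Node { val: 3, next: std::ptr::null_mut() }));
    let n2 = Box::into_raw(Box::new(Node { val: 2, next: n3 }));
    let n1 = Box::into_raw(Box::new(Node { val: 1, next: n2 }));
    let mut freed = Vec::new();
    free_prefix(n1, n3, &mut freed);
    assert_eq!(freed, vec![1, 2]); // only the pre-last_run prefix was freed
    unsafe { drop(Box::from_raw(n3)) };
}
```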
Those nodes have all been reallocated in the new bin's linked list. All right, so this is going to be basically the same while loop as up here. I'm going to do this business, and then we need to give the safety argument for why this is actually the case. First, we need to argue that there's no longer a way to access P. So why is that the case? Well, it is no longer possible to access P because the only way you would access P is through the table, and the table's entry is now a Moved. The only way to get to P is through table, or table[i] specifically. Since table[i] has been replaced with a BinEntry::Moved, P is no longer accessible. Okay, so that's pretty straightforward. Next, actually, we can simplify this argument. Any existing reference to P must have been taken before the table.store_bin, at which time we had the epoch pinned. So any threads that have such a reference, I don't know if this is better... Any existing reference to P must have been taken before the call to table.store_bin. At that time, we had the epoch pinned, so any threads that have such a reference must be pinned before or at our epoch. Since P isn't destroyed until the next epoch, those old references are fine, since they are tied to those old threads' pins of the old epoch. This is basically the same argument for each of these. It's the same argument about why destroying is fine, or rather why old references are fine. New references are sort of the key safety concern, really. I'm pretty sure there's at least one more missing, and that is when the whole table is dropped. So specifically, we're going to have to implement, what's that, down here: impl<K, V> Drop for Table<K, V>. I missed the word "fine". Where? Where did I miss the word "fine"? I mean, I'm sure I did, but... Okay, so this one is a little subtle. Here, we're going to have to do: for bin in self.bins.
Great, good catch, thanks. For bin in self.bins, let's head here. How do I easily get that? Oh, I see. So this is actually going to be: for bin_i in self.bins, bin is bin_i, if... ah, balls, I don't have a guard, do I? So, Erwin, you can use Valgrind with Rust as well. Valgrind works just fine with Rust code. Okay, so what I'm thinking here is when a table is dropped... Oh, when a table is dropped... Actually, this is much better than I thought. Debug assertions. For bin in... for this assert, crossbeam... assert that bin.is_null(), or... Moved here... I'm going to implement this... has_moved. So the idea here is: we don't actually need to do anything when we drop a table, because if it's dropped because the resize finished, then all of the bins should be moved anyway. is_moved... er, I guess has_moved. Table is... spelling. Bins should have been moved or freed by whoever is dropping the table. If the table is dropped due to a resize, all bins should have been moved already. If the table is dropped due to the whole map being dropped, the map's Drop impl should have done the work, checked is the bin null or is the bin moved, and done the work to destroy everything. So in fact, we don't need to do any of this work here. That's refreshing. We are, however, going to have to do: impl<K, V, S> Drop for FlurryHashMap<K, V, S>. Because when you drop the map, we want to make sure that you still end up dropping all the values and dropping all the bins. We have a huge advantage here though, which is that Drop takes a mutable reference to self. That means that we know that there's no one anywhere who has any pointer into the map. And there's a special thing in Crossbeam for this, which is unprotected. So this unprotected is used... the most common use of this function is constructing or destructing a data structure. And the reason, of course, is that
we can use the dummy guard in the destructor because at that point no other thread could be concurrently modifying the atomics. A real guard would just unnecessarily defer things. So what we can do here: guard = crossbeam::epoch::unprotected(). Safety: not concurrently accessed by anyone else, since we have &mut self. Nice. And now what do we want to do? Well, we basically want to walk the entire map and go through and free everything, or destroy everything. The question is, what's the best way to do that? I think we're going to assert here... Oh, there's some trickiness here, like if the map got dropped in the middle of a resize. I think for now what we're going to do is assert that self.next_table.load is null. Then we're going to load the table. Then we're going to walk all the bins in the table. Then we're going to do bin.load(guard). Can you resize to zero? No, I think the resize only ever grows; I don't think it lets you shrink the map. Although maybe. That would be nice actually if there was a neat way for us to do that, but not easily, I think. You see, this is like already a growth. I don't know that there's a way to shrink it after, which is perhaps unfortunate. But luckily this code is going to be fairly straightforward, because it doesn't really need to do that much. Specifically: if bin is null, continue. All it really needs to do here is: p is bin; while p is not null... And here the safety argument is actually pretty simple. We're dropping the map, the entire map, so no one else is accessing it. We are also leaving... technically here we could, we kind of want to leave in place a sort of "destroyed" marker here, although it shouldn't really matter. This bin.load... because here we can just do, in fact that's perfect, we can just swap this with a Shared::null(). We also replace the bin with a null, so there's no future way to access it either. One thing that's missing is
that this load needs to go before the drop, because otherwise the drop here would drop that piece and we wouldn't be able to access p.next. Oh, yeah, the same argument came in chat. There's another thing missing here, which is... where's the value? Yeah, the value itself is also an Atomic: p.value.swap(Shared::null(), Ordering::SeqCst, guard), and then we're going to guard.defer_destroy the value. First drop the value in this node. Although there's another thing missing here, which is... that's awkward. We're actually going to have to match on p here. Bear with me for a second: if let BinEntry::Moved = bin... actually this can just be a match. If it's a BinEntry::Moved, then we just want to drop the value immediately, and we don't actually need to do any of this recursing. So if it's a Moved, we just do this. And if it's a BinEntry::Node, then we actually need to walk the list. This is going to be... actually, we're going to do this, I think. So the drop here is: we walk all the bins; we swap each bin for null, because we're going to empty it out anyway; we have to look at what was in the bin. If the bin was just a Moved entry, then we can just destroy that bin without thinking any more about it. If that bin was actually the head of a linked list, then we're going to have to walk the linked list, drop the values as we go, and drop the nodes as we go. And what we want to do here, I guess... safety below. We replace the bin with a null; we own all the nodes in the list; then we move to the next node and drop the one we passed through. Alright, so that should indeed free all the things. So now, is there anywhere else where we drop a value? When you resize, you drop the old bins, the old nodes in the linked list, and then you drop the old table. But at that point it should be all empty. Actually, that's not true. There's one thing we're missing. That is
down here. We do have to drop... if the head of the linked list is one of those forwarding pointers, we still need to drop that head. So this is actually not entirely true. So this is going to be: unsafe, epoch::unprotected. Safety: no one else is accessing this table anymore, so we own all its contents, which is all we are going to use this guard for. So this is: we need to drop any forwarding nodes, since they are heap allocated. And we're just going to do a straight-up for bin in self.bins. We're going to do: bin is bin.swap. It's going to leave null pointers behind; there's no real reason not to. Sure, we can do a SeqCst here. And if bin has moved... I don't think we even need this has_moved function. If bin.is_null(), then we just continue. If let BinEntry::Moved = bin, then we want to free it. So this is going to be a defer_destroy of bin, and that is safe by the same safety property we gave above. Otherwise, this should be unreachable: dropped table with non-empty bin. That gives us... Alright, so the derefs are still missing, right? We just want to check that we haven't broken any of the things for... Here's one. I always get the ordering of these arguments wrong; it's annoying. These are all derefs. Deref, deref, deref. That's fine. 158 is missing lock, which is going to be... what did we say that this lock type was? It's a parking_lot Mutex. 60: it's an Atomic::new. Deref, deref, deref, deref. And 620: it's going to be crossbeam. Let's see what that does. Deref, deref, deref. 163: new, empty. Great. Okay, so I think all the remaining ones are derefs. So now let's think through it one more time: is there anywhere else where we drop values? You do a resize, and when you do a resize, eventually the old table is getting freed; we've dealt with that. And when the old table gets freed, its bin heads that are forwarding nodes must be freed; we've dealt with that. If a value gets replaced, the old value gets freed.
We've dealt with that. And if the map itself gets dropped, then we need to drop all of the entries and all of their values; we've dealt with that. Okay, so I think this means that we now have all of the destruction in place. And so now let's do a... save for that. All garbage collection logic. Okay, so the one thing that's now missing is that, as we've talked about before, Shared does not actually implement Deref, right? And so that's why we're getting all of these errors. It's because dereferencing a Shared, which is a raw pointer, into a reference to the inner value... let me pull that up here. Shared's deref. So you see, the deref function is unsafe, right? It does not implement the Deref trait, which is commonly the case for wrapper types like this. And the reason is because it could be pointing to invalid memory, right? A Shared is really just a raw pointer; it's nothing else. Okay, so whenever we want to deref a Shared, we're going to have to deal with this property. So this is the safety we need to guarantee when we want to call deref. Dereferencing a pointer is unsafe because it could be pointing to invalid memory. This is what we talked about with garbage collection: we need to make sure that we haven't freed the memory that it's pointing to, and that the pointer isn't null. Another concern is the possibility of data races due to lack of proper synchronization. For example, consider the following scenario: one thread does a store of an Owned value with Relaxed ordering, and another thread does a load with Relaxed ordering and unwraps it. The problem is that relaxed orderings don't synchronize the initialization of the object with the read from the second thread. One reason why this is less of a problem for us is that all of our orderings are currently sequentially consistent, but also the Java code is pretty careful about making sure that a load is going to check the value that it gets back; it's not just going to immediately dereference the value. Let's see, as_ref here. Oh, that's to get through the Owned.
I think the primary safety requirement is here. And in fact, for basically all of the places where we do this deref, the argument is going to be the same: because our destruction logic guarantees that it's safe, all these derefs must be safe. So let's just walk through these, I guess, bottom to top. 83. Hmm? 83? Okay. So this is going to be a deref, and it's going to be unsafe. Safety: next will only be dropped if we are dropped, and we won't be dropped until an epoch passes, which is protected by the guard. 88. Ooh. Table is unsafe: table.deref. I don't think this needs to be a Shared. I think this is just a straight-up... actually, no, we do want it to be tied to the guard, I think. No, we can just do this. Okay. So the argument here is, safety: we have a reference to the old table, which we got from self under the given guard. Since we have not yet dropped that guard, no collection has happened, so the table can't have been collected yet, and so neither has next_table. We have not yet dropped that guard; this table has not been garbage collected, and so the later table definitely hasn't. Safety: same as above. Great. Now what? 943. Bin. Okay, so here, what do we want to do? Here we want to say bin.deref. Bin is unsafe: bin.deref. Safety: the table is protected by the guard, and so is the bin. What? It takes two arguments. It takes the guard. Makes sense. 34: can't be null. 672: for bin in &mut self.bins. It's not an iterator. It is now. 677. Okay, let bin is unsafe: bin.deref. Safety: we... oh, the safety here is we own self, or rather we have exclusive access to self, so no one else will drop this value under us. 673: this is probably me being stupid; the guard needs to come last. 681: expected Shared, found reference. I see, it's just this. 646. All right, so here, this is going to be unsafe: p.deref. Safety: actually, the safety is just the same here.
The same reason why we're allowed to destroy P down here is the reason why we're allowed to deref it up here; it's the same safety argument. 642: no field `value` on type P. Why is that? That should give me a node, and node has a field value. That's interesting, actually. What does the next pointer here... the next here is an Atomic, which I think means that the head here... Freeing the first node is actually a little bit different from freeing the subsequent ones, because the first node in the linked list is a BinEntry, which contains a Node. So we have an Atomic<BinEntry>, which contains a Node. The subsequent ones are just straight-up Atomic<Node>s. So they're a little bit different, which means that the freeing has to be a bit different too, which is kind of awkward. I think what we want to do here is... Does it mean that the first node and the subsequent ones do not have the same type? Yeah, the first node is a BinEntry::Node. So it is a BinEntry::Node, which internally contains a Node, and then that whole thing is wrapped in an Atomic. That's, like, the head type. But the next pointer is just an Atomic<Node>; there's no BinEntry there. So they're actually different types. Normally this is fine, right? Because we're only ever calling methods on Node. We're not calling that many methods on BinEntry, except for find, which we're already using. The challenge is that when you want to free them, they actually have to be freed differently, which is kind of awkward. So I think what we're going to do here is: head.next.load... next.load, that's what I meant, with the guard. And then down here, now drop the head, which is going to be this business. It's going to be: safety, same as for the tail above. So that should do it. 660: this is me being stupid; this guard has to come last. Which probably means that I messed this up somewhere above too. Down here, this has to have the guard last, which probably means that this needs to have the guard last.
Alright, how about that? Alright, 625. "Shouldn't Pin fix the issue for self-referential structs?" I don't think this is really an issue with linked lists in general. This is specifically an issue with a linked list in this particular context, where you also have delayed dropping. It is true that the standard library's pinning will help with writing linked lists, but not this type of linked list, where you have a lock-free linked list. "Can the tail end of your loop unwrap the atomic Node to an Owned<Node> to match the first state?" The problem is that when you call defer_destroy, you have to pass it a type that was originally allocated as an Owned. And for the head of the list — this first node — we never constructed an Atomic<Node>; we constructed an Atomic<BinEntry> whose Node variant it is. So even though we could construct a Shared<Node> if we wanted to, that would be incorrect, because that's not the thing we're supposed to free. We're supposed to free a Shared<BinEntry> for the first entry; for all the others, we're supposed to free a Shared<Node>. That's why it actually has to be different. Note that we must do this separately, because for the head of the list we're dropping a Shared of a BinEntry::Node, not a Shared<Node>. Make sense? Alright, so 625. for bin in table.bins. I guess here table is going to be unsafe { table.deref() }. Safety: safe because same as above. 627. That's true. Come to think of it, where's the place where I do... this doesn't need to be that at all. "It might be easier to manipulate using Atomic::into_owned within the Drop impl, because an Owned pointer will drop for you when it goes out of scope, and you get safe refs." That might be true, although it's unclear it matters that much. Well, okay, sure. We can do bin is bin.into_owned — or whatever it's called for Shared. Yes, into_owned. So this is your proposal. "What if a bin head gets removed from the map? The next BinEntry should change type to replace the head."
That is a good question. I'm not sure. We might have to make the head an Atomic<BinEntry>, where BinEntry::Node wraps an Atomic<Node>, which would be awkward to say the least. Although it would mean that we could stick the lock right in that head. Maybe. Let me think about that. You might be totally right — although currently we don't implement removal, I think. But yeah, once we implement remove, I think you're totally right that that would be the case, and we'll have to look at what we do with it. That might actually change the underlying impl. I agree. Okay, so here the argument is actually... I see. So here we can actually do a Shared::null, and then we can do into_owned. And here we can do bin is unsafe { bin.into_owned() }. And here, let p — so here we can actually take ownership of the node: it's unsafe { p.into_owned() }. I agree, this does make it nicer. And now this can be into_owned, and then this can be node.next. "If you own the table, Atomic::into_owned skips the atomic load entirely." Oh, there's an Atomic::into_owned as well. What does that do? Is that like if you have a &mut self? Oh, that consumes self though, which I can't easily do. Oh, I see — you mean if I have this, then I can do this. And then here — we can actually do even better now, because we own this head: we can do a swap with Shared::null, and that gives us an Owned. Then this value — we can now do value.into_owned, and the bin is going to be dropped automatically. And I wonder whether we can do even better: because we own this value, this we can just do into_owned on, and similarly here, this we can just do into_owned on. Yep, that should do it. That's nice. And then maybe we don't even need the guard. In fact, we can do into_owned here because we own the head, and then we just do node.next.into_owned here. Right. We still need the guard for the top level though, I think, because we don't technically own the table — we only get &mut self.
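The viewer's proposal — re-own the raw pointers in the Drop impl so each node frees itself when it goes out of scope — can be sketched with plain Box pointers (crossbeam's `Shared::into_owned` plays the role of `Box::from_raw` here; hypothetical simplified types):

```rust
use std::ptr;

// Simplified singly linked list of raw pointers, freed in Drop by
// converting each raw pointer back into its owning Box — the stdlib
// analogue of crossbeam's Shared::into_owned.
struct Node {
    value: i32,
    next: *mut Node,
}

pub struct List {
    head: *mut Node,
}

impl List {
    pub fn new() -> Self {
        List { head: ptr::null_mut() }
    }
    pub fn push(&mut self, value: i32) {
        self.head = Box::into_raw(Box::new(Node { value, next: self.head }));
    }
    pub fn sum(&self) -> i32 {
        let mut total = 0;
        let mut cur = self.head;
        while !cur.is_null() {
            // Safety: nodes are only freed in Drop, and Drop takes &mut,
            // so no node is freed while this shared borrow is live.
            total += unsafe { (*cur).value };
            cur = unsafe { (*cur).next };
        }
        total
    }
}

impl Drop for List {
    fn drop(&mut self) {
        let mut cur = self.head;
        while !cur.is_null() {
            // Safety: we have exclusive access to self, so no one else
            // can drop this node under us. Re-owning it as a Box frees
            // it when `owned` goes out of scope at the end of the loop.
            let owned = unsafe { Box::from_raw(cur) };
            cur = owned.next;
        }
    }
}
```

As in the stream, the top-level table still needs the guard in the real code, because `&mut self` doesn't own the table allocation itself.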
But this does make it a lot nicer, I agree. Here — yeah, this is going to stay the same. Nice. That's much, much nicer. 680 — is this complaining about our `*`? That actually gives us an Owned back. No method found for Owned<Node>. Yeah, this is just p now. 34. Ah, if p.next is null — actually, that's awkward. It's all raw pointers now. Yes. But is there a way for me to check whether it's null without loading it? Nope. That's not great. That's really awkward. Yeah, I think we actually do need to do a load here, which is super awkward. It's fine — it's just sad. And table.bins. This is awkward too: we can't actually destructure this one. Well, actually, no, we can — we can do Vec::from on the boxed slice. Isn't there a... 632. Let's see here now. I'm going to `*` that. That's fine. 602. So now here we really need to do a defer_destroy. That's fine. And this is going to be an unsafe p.defer — and the reason that the defer is safe is the same reason why it's safe to defer_destroy. So that's all good. No method named defer — that's because it should be deref. Wow, so many derefs. 584. Okay, so here, next_table and table. So I think what we want to say here is that none of this changes table or next_table, right? No — great. So up here. Okay, so here we need to make an argument along the lines of: table.deref — why is that safe? And next_table.deref — why are these safe? It's a great question. I don't think that they necessarily are. I think that only if this is the case are they safe, right? So in this case, those Shareds were both constructed... What do I want to guarantee here? So for transfer, we want to guarantee that both table and next_table remain valid — specifically, that they're not destroyed. They will be dropped when the guard is dropped at the earliest, right? We're guaranteeing that with the signature, because if the guard was dropped, then these would no longer be valid.
So by having this constraint, we're saying that these Shareds are tied to the lifetime of this guard. Great, that's what we want. As long as that is the case — these were read while the guard was held, and we still hold the guard. The code that drops these only drops them after, A, they are no longer reachable, and B, any outstanding references are no longer active. These references are still active, marked by the guard, so the targets of these references won't be dropped while the guard remains active. So safe, with all that unsafe. Keep in mind that for each unsafe block here, we're thinking very carefully about why that unsafety is okay. And it is true that unsafe code is less safe than safe code. But the reason for that is that in unsafe code, you as the programmer are telling the compiler: I have checked that I maintain the necessary invariants. And the compiler cannot check those invariants for you — if it could, you wouldn't need to write unsafe code in the first place. So this is really us being careful about thinking through how we're sure we're upholding the restrictions of the unsafe code. Okay, now we're at 591: no method `as_raw` found on table. Oh, that's just this. That's fine. 591, expected usize, found isize. Oh, I see. Well, at this point, i can be a usize, because if it was negative, we would have entered this. 591, expected Owned, found Shared. So that is store_bin, and I think this can take anything that's a P where P is a crossbeam epoch Pointer. store takes a pointer to one of these guys.
"Now imagine this was C and you would have to go through all the code to check those invariants." Yeah, exactly. The idea here is that at least we only have to check the unsafe parts. And it's true, this code we're writing is highly concurrent, unsafe code — but that's what we chose to write. The idea is that once we get this code right, anyone who uses it will not have to think about that unsafety. That's the idea: we're encapsulating the unsafety in our code. And this is where things get tricky. Maybe we're going to run into this business already now. The challenge here — I just realized something else too. No, that's fine. One thing we're going to run into here, I think, is the point that came up earlier: here we're going to store a bin, but what we're storing is a Node, not a BinEntry — and the head needs to be a BinEntry. That's kind of awkward. So there are a couple of options here. One is we do this double indirection at the head. The other is that we merge the two — we just stick Node directly into BinEntry, remove the Node type, and just have the BinEntry type, maybe renamed. The problem there is that every node is now slightly larger, although that might not really matter that much. That might be what we have to do. "I wonder why these constructs are even needed. It feels like all of that validation can be done in a C++ compiler front-end. Can you show an example where it's absolutely impossible to do that?" I think you need to provide an example. I don't know what constructs you're talking about, or what validation you believe can be done in a C++ compiler front-end — and if so, which C++ compiler front-end. Saying that something is theoretically possible doesn't really help, right? You would need to show me one that can do this, whatever "this" is. I think we're just going to have to merge this into one type.
It's going to be necessary for the removal business later anyway. That's too bad. "Defer the wrapping of the node until it becomes another node's next pointer." The problem with doing that is that we're trying to do all of this atomically. If someone wants to add another node, they're going to basically do an atomic compare-and-swap, and so you can't easily do that. It's going to be awkward, and the types aren't quite going to work out either. "Merging Node into BinEntry will also make it tricky to follow a chain of nodes down. Right now we have the good property that a node pointing to a forwarding node can never happen." Yeah, it means that we could no longer quite encode that, which is certainly annoying. It's not great. The problem we'd run into is that we'd end up with a bunch of runtime assertions of the form "this pattern can't happen". For example, we know that if you have a node, all the subsequent nodes are going to be of the Node type — you can't have a node that's followed by a forwarding node — which is currently encoded in the type system. But if we merged Node into BinEntry, then there's nothing in the types that guarantees that that is the case. It would still be fine — the code would be fine — but it would mean there are more places where we'd have to match on the node type and then say it's impossible to be in the case where this is a Moved. "Can you explain the exact problem with the current types?" Yeah, okay, let me try this again. Actually, let me draw this, that might be easier. So here's the challenge. We have our table, and let's consider any given entry. Each entry is just a pointer, right? And this pointer has some type T. And what we're going to construct is either a linked list or one of these forwarding nodes. This head pointer is going to be either a pointer to one of these, or a pointer to one of these.
But if you are in this case — if you have a linked list — then this pointer, we know, is a pointer to a node. It cannot be a pointer to one of these forwarding types; that can never happen, right? So the question becomes: what are these types? Well, T is something like an enum of Node or Forward — an enum of one of these. So the question becomes: what goes here? There's going to be some type U, where this is U, this is U, and this is U. So in the definition of U, which is going to be a struct, what is the next type? It's going to be an atomic — and the question is what goes inside it. Well, either we stick U here — that's what we currently have — and then, in the type system, if you have a U, you know that the next thing is a U. All is good. But it means that the head pointer — this pointer right here — is of type Atomic<T>, because the head can be either-or. So the head is an Atomic<T>, but the nexts are Atomic<U>. And this means — how to phrase this — the allocation that happens at the head is an allocation of a T, and the allocations that happen for next are allocations of U. They are different things. It also means, for example — and this was the observation that was made earlier — that if the head of a list gets removed and some later node becomes the head, its type has to change from U to T. And it has to do so atomically, because remember, all of this is lock-free atomic stuff. Converting something from a U to a T is not necessarily trivial. I mean, it is just a wrapping, but that wrapping is a heap allocation plus an atomic pointer swap — that is tricky. So the alternative — and this was what was proposed; let me dig up some other color here.
One alternative to this is to say, well, this type is going to be T instead. Okay, well, now these two types are equal, right? So that's good. The problem now is that if I have a U and I do u.next, the type of u.next is a T. I know, as the programmer, that this is always a U — it's always a node. I know that, but it's not encoded in the type system. So everywhere I do u.next, I'm going to have to match on u.next: if it's a Node, do the right thing; if it's a Forward, panic, this is unreachable. That's not very satisfying, right? Forward has to be heap allocated because it needs to be an atomic pointer swap. "You're kind of using dynamic dispatch." Dynamic dispatch would not help here, because for dynamic dispatch you need a wide pointer, and you can't easily do an atomic pointer swap on a wide pointer. "One more layer of indirection: Node is an Atomic<U> in T." Yeah, so the third option — if I can get this to do the right thing; let's do that color, I guess — the third option is for this to contain an Atomic<U>, and for next to also be an Atomic<U>, because now the type of allocation at this point and at this point are both Us. So converting between them is simple. The downside is that we now have a double indirection for every lookup, because an Atomic is a heap allocation, and so this is also costly. "Why not a trait object inside the Atomic?" Atomics can't be trait objects, I believe, because a trait object is a fat pointer — you need the vtable pointer as well. You could box the trait object, but the moment you box it, it is no longer atomic; you can't do atomic swaps on it. Unless you have support for wide atomics, which I don't think we want to assume. So these are the three options.
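The T-versus-U options being drawn can be sketched as two type layouts (hypothetical names, hash/key/value fields elided). Under the current "split" design, `next` is always a Node; under the merged design, head and next share one type, but every walk of `next` must assert at runtime that the Moved case cannot occur:

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

// Option 1 (current split design): heads and next pointers have
// different types; "next is always a Node" is encoded in the types,
// but head and tail allocations differ.
mod split {
    use super::*;
    pub enum BinEntry {
        Node(Node),
        Moved(*const ()), // stand-in for the forwarding pointer
    }
    pub struct Node {
        pub next: AtomicPtr<Node>, // next can only ever be a Node
    }
}

// Merged design: one type everywhere, so head and next allocations
// match, but a Moved in `next` is only ruled out at runtime.
mod merged {
    use super::*;
    pub enum BinEntry {
        Node(Node),
        Moved(*const ()),
    }
    pub struct Node {
        pub next: AtomicPtr<BinEntry>, // could "be" a Moved, per the types
    }
}

/// Walking `next` under the merged design: the caller knows this entry
/// came from a next pointer, so the Moved arm is declared unreachable.
pub fn next_of(e: &merged::BinEntry) -> *mut merged::BinEntry {
    match e {
        merged::BinEntry::Node(n) => n.next.load(Ordering::SeqCst),
        // A Moved can head a bin, but can never *follow* a Node; the
        // merged design can only assert that, not prove it in the types.
        merged::BinEntry::Moved(_) => unreachable!(),
    }
}
```

The third (double-indirection) option would instead wrap the inner Node in its own atomic pointer, paying an extra pointer chase on every lookup.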
The only one that doesn't come with overhead, sadly, is the one where we just merge everything into the top-level enum — the red solution, basically. And it is definitely a little sad, because we're going to have a bunch of unreachables, but I think it's what we're going to be stuck with at the moment. All right, let me go pee and think about this. And you can think about it too, and pee if you need to. "Are the current types only inconvenient to use?" No — we don't actually have a way to... sorry, this is where the issue came up, which is what I was getting to explaining but didn't quite: here. The tricky part is when you do a resize and you have an existing linked list. You're going to split that linked list and place it in two different bins in the new table. Imagine that you're moving a run. The head of the run is going to be just a Node — it's not going to be a BinEntry, it's not going to be an enum, it's just the struct — but it's going to become the head of the new bin. So we would have to reallocate it for that to work. In this particular instance, we know that we own the target bin, so we could just redo the wrapping, but this is still going to come back to bite us when we do remove. "Red comes with some memory overhead." There's a little bit of memory overhead, just the tag of the enum. I'm more concerned with the effect it has on cache performance, but it should be small. Okay. Well, the only thing we really need to change is that this is going to become a BinEntry, which means that this is going to become a BinEntry, which means this is going to become a BinEntry — which is also awkward, because... yeah, it's not great. Oh, you're right, it's just going to be a BinEntry with the key and value. It makes me real sad.
The other annoying thing is that we can't have find on Node anymore, because it might have to return self. We're going to have to do: if n.hash is hash and n.key is key, then self. Oh, that's awful — it's going to have to go through the whole find, find, find, find. No — okay, this might be okay. It's just going to mean that this code is going to go up here, and this is going to be: if this, then return Shared::from(self). Yeah, so this is currently recursive, which might not be what we want — in fact, I think it's very specifically not what we want. This is going to be break. This is going to be break. This is one example of the thing where we would have to do this match to see whether it's a Moved each time, and just know that it's not a forwarding node. "So the Java implementation uses virtual function overloading." No, I think self-referential structs wouldn't really help here. So here instead, what we're going to do is — it's so ugly. Yep, then it's going to break with that. Otherwise, it's going to do node = next here, I think that's what it's going to be. And this whole find can go away. At that point, there's no need for the Node type anymore; that can just go straight into here. Although that would make this not work, so we're going to keep it the way it was, actually — see what that gives us. That's going to break the self.find — no, bin.find. Great. 655. No field next on Owned<BinEntry>. Yeah. So now we get into this pain: this walk is awkward. We own the bin, we get the head, and now we're constantly going to have to destructure this, because this is going to be: node is if let BinEntry::Node(head). At least now we don't need to drop the head separately. So now this can just be — this, p is head. This arm is now unreachable. So this is the ugly business I meant, right?
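The recursive-to-iterative find rewrite being described can be sketched like this (hypothetical simplified types: hash-only nodes, no keys or atomics). Instead of find calling itself on `next`, the loop breaks with a result, matching the "this is going to be break" edits:

```rust
// Simplified node: the real one also carries a key, a value, and an
// atomic next pointer rather than an Option<Box<...>>.
pub struct Node {
    pub hash: u64,
    pub next: Option<Box<Node>>,
}

/// Iterative find: walk the chain, breaking out with Some on a hash
/// match and None at the end of the list — no recursion.
pub fn find(head: &Node, hash: u64) -> Option<&Node> {
    let mut node = head;
    loop {
        if node.hash == hash {
            break Some(node);
        }
        node = match node.next.as_deref() {
            Some(next) => next,
            None => break None,
        };
    }
}
```

In the real code each step through `next` would additionally have to destructure the BinEntry enum and declare the Moved arm unreachable, which is the ugliness being lamented.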
This is going to drop the node's value, get the node... expected Node, found BinEntry. This should say bin. What's my typo? It's preventing all the formatting. 652. Right, no field next on BinEntry. Yes, this is going to be the same thing, where next is: node is if let BinEntry::Node(head); now n is unsafe { p.deref() }, and next is then n.next.load. And we have to do this whole unreachable business also. And then p = next. 591. Why is n here an isize? Where does n come from? n is a usize — n is definitely a usize. Let's do that later. 591, same thing: expected usize, found isize. But it's complaining about this line, and i comes from... i is definitely a usize, and n — right here, let n — double-checks that it is indeed a usize, so I don't see why it wouldn't be. Well, okay. "Make as_node an infallible method on BinEntry that panics if it's a Moved node." You could do that; it's not a bad idea. "Another is to implement Deref — have BinEntry deref into Node." I'm going to leave that for later. All right, we're back to where we originally were, which was to insert these unsafe p.deref business calls. The safety here is probably the same as in all the other cases. In this particular case, though — where's our safety argument nearby? All right, so here, this is kind of subtle. We need to somehow guarantee the given p that we are accessing. So let's do this a little bit nicer: up here, let node is unsafe { p.deref() }, because that way these all become easier to deal with. So what's the safety here? p is a valid pointer. And the reason why p is a valid pointer is that p is only dropped in the next epoch following when it was swapped to null. See safety comment near... where's the place where that happens? That's down here. Following when its bin — see safety comment near table.store_bin below.
Its bin has been swapped — replaced with a Moved node. We read the bin and got to p, so it is not a Moved node. So this bin has not yet been swapped with a Moved node. Therefore it will be dropped in a future epoch, and we have the epoch pinned, so the next epoch cannot have arrived yet. Therefore it will be dropped in a future epoch and is safe to use now. 580. if let BinEntry::Node(node) = that, then node, else unreachable. 555 — it's probably the same argument up here. Actually, at this point, do they need to be Shared still? I don't think they actually need to be Shared. Oh, last_run does. We could just construct Shareds from them, probably. We'll leave it this way. So we can do this here to save us a little bit of extra typing: if node.next is null. "Why is this while loop checking next first? Why is it not: if node is null?" No, I'm fine — stick to what it was. 565, 564. What? Oh, I see. We need to do: next is node.next.load; if next is null, break; p = next. 546. This probably doesn't have to happen here — this can happen here. 523, match bin. All right, so here we're going to do an unsafe on the bins. This is the same story. The safety here is very similar to the "safety: p is a valid pointer" down here. bin is a valid pointer. So when is bin dropped? Why would this only be dropped — that's not what I meant to say. Same thing here: bin is only dropped when the head of the bin is replaced with a Moved node. We read the bin and got... actually, the bin is only dropped when the table is dropped — that's when we drop all the heads. Actually, there are two cases. There are two cases when the bin pointer is invalidated. One: if the table was resized, bin is a Moved entry, and the resize has completed. In that case the old table is dropped, and that includes all of the bin heads, which are just forwarders.
In that case, the table and all its heads will be dropped in the next epoch following that. Two: if the table is being resized, bin may be swapped with a Moved entry. The old bin will then be dropped in the epoch following that swap. In both cases, we held the guard when we got the reference to the bin. So if any such swap happened, it must have happened after our read; since we did the read while pinning the epoch, the drop must happen in the next epoch, i.e., after the one we are holding up by holding on to our guard. "Can this unreachable matching business be encapsulated?" It can. We can do this if we feel strongly about it, and the way to do it is actually pretty straightforward. It's just going to be an as_node — or a Deref, to be honest, but as_node, I guess — from self to the Node with its key and value. And in fact, if we want to be real fancy, we do this: if let BinEntry::Node(n) = self, then Some(n), else None. Does that make you happier? It does look nicer, I agree. This is going to be as_node, and this is going to be the same. This is going to be next. Here we actually want to keep the current one, because we want to get the node as Owned. This we want to keep. Here we can do node.as_node().unwrap().key. That we want to keep, this we want to keep. 611, mismatched type: this needs to be a BinEntry::Node. 511: method not found — it's because next_table should just be this. 505: the function takes two arguments — it takes a guard. An easy fix. 491. I don't understand where this n is coming from. What am I missing? n is a usize — what's it on about? Expected isize, found usize. resize_stamp... that is definitely an isize. All right, fine. So if that's an isize, then i also has to be an isize. We're just going to keep it an isize, and we're going to cast it to a usize down here.
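The `as_node` encapsulation just sketched out loud looks roughly like this (hypothetical simplified types; the real Node also carries key, value, and next):

```rust
// Sketch of encapsulating the "must be a Node here" match in one place.
pub enum BinEntry {
    Node(Node),
    Moved(usize), // stand-in for the forwarding pointer
}

pub struct Node {
    pub hash: u64,
}

impl BinEntry {
    /// Returns the inner Node, or None for a forwarding entry. Callers
    /// that know they are past the head of a bin can `.unwrap()`,
    /// concentrating the "a Moved can never follow a Node" argument
    /// here instead of scattering matches with unreachable arms.
    pub fn as_node(&self) -> Option<&Node> {
        if let BinEntry::Node(ref n) = *self {
            Some(n)
        } else {
            None
        }
    }
}
```

The alternative floated in chat — an infallible method that panics on Moved, or a Deref impl from BinEntry to Node — would trade the explicit `.unwrap()` at each call site for an implicit panic.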
Actually, where's the place where we combine it with the +2? As isize. I think next_table... So does any of this code do anything with next_table? No. Just that: next_n is len. next_table is read while the guard was held. The code that drops next_table only drops it after it is no longer reachable and any outstanding references are no longer active. This reference is still active, marked by the guard. So the target of the reference... yup. 445. Expected isize, found usize. This is going to have to be as isize. 386. This is going to be the same argument as for next_table below. So what I'm going to do here — this should say table: same argument as for table above. 398. What do we got here? I think we need that to be a usize. Now we're going to let n = n as isize. How can n be negative? I don't think n can be negative. Oh, it's for the resize stamp business. Show me this resize_stamp. resize_stamp, okay, of size n. So n is definitely a usize, which means that my dear n up here can go back to being a usize. This can stay a usize. Great. 631. Now these can just be i + n, the way it was meant to be. 498. Why is i here... right, because i is an isize until here, because it can be negative. 41. Okay. If we get to this ||, then i must be positive, so this can be this. transfer_index — why can transfer_index be negative? Like, why is transfer_index an isize? We'll leave that for a second. Unless they use negative to indicate that... next_bound — yeah, unless next_bound can somehow be negative. But all right, fine, I'll store it as an isize. I still find it very hard to believe that it has to be signed. 573. Consider giving head a type. It is a BinEntry — no, not a BinEntry, I'm in fact entirely lying. This has to be a head. This is going to be bin now. Yeah, this is just going to be bin. This is going to be bin, this is going to be bin, this is going to be bin. Make sure we didn't overwrite bin anywhere in between.
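The resize-stamp business that forces the signed arithmetic can be sketched as follows. This mirrors Java's `resizeStamp` (leading zeros of the table size with a high stamp bit set); the constant values and the 64-bit adaptation are assumptions of this sketch, not confirmed from the port:

```rust
// Assumed constants, following Java's ConcurrentHashMap (RESIZE_STAMP_BITS
// = 16 there); the shift is adapted to the platform word size here.
const RESIZE_STAMP_BITS: u32 = 16;
const RESIZE_STAMP_SHIFT: u32 = usize::BITS - RESIZE_STAMP_BITS;

/// The number of leading zeros of n, with the top bit of the stamp set
/// so that the shifted stamp is distinguishable.
pub fn resize_stamp(n: usize) -> isize {
    n.leading_zeros() as isize | (1isize << (RESIZE_STAMP_BITS - 1))
}

/// The value size_ctl takes when a resize of an n-element table begins:
/// once the stamp sits in the top bits, the result is negative — which
/// is why size_ctl and everything compared against it end up as isize,
/// with negative meaning "a resize is in progress".
pub fn initial_size_ctl(n: usize) -> isize {
    (resize_stamp(n) << RESIZE_STAMP_SHIFT) + 2
}
```

That negative-means-resizing encoding is the reason transfer_index and friends keep getting dragged into isize.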
I don't think we did. It's not mutable, hopefully. 620. No Clone found for RawMutex. Why does it matter? So this is... we're moving. This lock is only taken if you want to overwrite the... I'm decently sure this can just create a new lock. But if we're unsure — then what does this code do? Oh, I see. This can definitely be a new lock, because in the Java world, it's using the per-object lock, so there's just a new lock every time. So that's fine. "Why does the key need to be cloned?" The key does need to be cloned, because the old key might still stick around in the map. That's real awkward. The observation here is that the old bin's linked list is still going to have a bunch of nodes, and there might be threads that are accessing those nodes and need to look at the keys. But we need to allocate a new node, and that new node also needs to hold the key. And so here we're going to clone the key. One alternative would be to not clone the key, but instead put the key behind something like an Arc. I don't really want to do that if I can avoid it. So instead, we're just going to require that the key is Clone, which definitely makes me a little sad. But such is life for now. Muscle team: yes, I am using coc, but I don't actually want stuff in my terminal — because very often, especially when doing development like this, I know that there are a bunch of errors, and if the errors just came as I typed, it would first of all take a lot of CPU cycles, and also many of the errors I know about. I don't want to see the errors until I'm ready to see them. I could have a shortcut for running coc or running RLS, but it's annoying to set up. And it's easier to browse through them this way too, because now I have them in a terminal rather than having them show up in the Vim quickfix. So that's why. Okay, so here is the argument. The argument here is probably the same as what we've given above, actually.
So the safety here is — do we have another safety further up? No. The table is only dropped on the next epoch change after it is swapped to null. We read it as not null, so it must not be dropped until a subsequent epoch. Since we hold a guard, we know that the current epoch will persist, and that our reference will therefore remain valid. You see, this is really just many ways of stating the same safety property, all of which relate back to when we choose to drop things. Same thing. Deref this. Ooh, what did I break? What? 286. I'm confused. 489. Oh, 286. I think also for this, we probably need to require these — or this. We read it as not null; since we held a guard at the time, the epoch continues to persist, and our reference is therefore valid. 259. This is another Node, isn't it? Yep, it is indeed. This is sadly going to be bin. And here, I guess, we're going to do n is — no, it's going to be p, because that's what we've been using — p.deref().as_node().unwrap(). It's the same pattern. And the safety condition here is probably just going to be exactly the same. Specifically: we read the bin while pinning the epoch. A bin will never be dropped until the next epoch after it is removed. Since it wasn't removed, and the epoch was pinned, that cannot be until after we drop our guard. All right, we're getting closer. So here, we can do another table.deref. Safety — let's see that that actually fixes a bunch of these. Yeah, it should. So what's the safety here? There are a couple of cases, actually. table is a valid pointer: one, if table is the one we read before the loop, then we read it while holding the guard, so it won't be dropped until we drop that guard, because the drop logic only queues a drop for the next epoch after removing the table. Two, if table is read by init_table, then... what's the last case? If table is set by the Moved below — are there any other things that change table? No.
All right, so — because these both overwrite table, right? We need to make sure that no matter which Shared table we ended up with, it is okay to do this deref. So if table is read by init_table, then the only way we break from this is either here, in which case we do a load after the guard is pinned, or here, in which case we did a store. So it must be valid. And if someone else were to drop it, they would have to wait for an epoch. If table is read by init_table, then either we did a load, and the argument is as in case one, or we allocated a table, in which case the earliest it can be deallocated is in the next epoch — and we are holding up the epoch by holding the guard, so this deref is safe. And if table is set by a Moved node — this is the tricky one. So we're just going to go through help_transfer. Then what? In all these cases, it's going to return next_table, except for there, where it's going to return table. It will either keep using table, which is fine by one or two, or use the next_table pointer from inside the Moved. In the latter case... okay, so here's the question. We have a Moved entry, and the Moved entry holds a raw pointer. For this deref to be safe, we need to guarantee that that raw pointer is still valid. How do we know that is the case? Well, the raw pointer comes from inside the Moved. How do we know that that is safe? Well, we got to the Moved node transitively through a load of table. That load of table happened in the current epoch, since the epoch is still pinned. So when is a table destroyed? A table is destroyed in the next epoch after the resize finished. Since that... how do we phrase this? A table is only ever dropped — it is only dropped in the epoch following... resized...
Actually, a table is only dropped in the epoch following... well, it's weird, because the `Moved` points to the table *after* the resize... following when its resize has completed. In the case of `Moved(t)`, `t` can only have been dropped if `t` was resized, that resize has completed, and an epoch has passed since. For `t` to be resized... this is a finicky argument.

Question from chat: does user code of this hash map need to concern itself with not holding guards for too long, to avoid growing memory? Because otherwise the epoch never advances. Yes and no. We haven't implemented iterators yet, and iterators are one place where this is going to be tricky. But if you look at the signature, or the contents, of `get` and `insert`, those internally construct a guard, so the user never passes a guard in. What this means is that the user simply can't hold on to the guard for too long, because the guard is only held for the duration of the `get` or the `insert`.

Either keep using `table`, which is fine by 1 or 2, or use the next-table raw pointer from inside the `Moved`. How do we know that that is safe? A table is only dropped in the epoch following when its... a table is only dropped if it is resized. This paragraph is getting too complicated. A table is only dropped if it is resized. For table `t`, and for `Moved(t)` to be dropped... I feel like there's a simpler argument here. The property we want to make sure we uphold is that if a `Moved(t)` exists, then `t` is still valid. Which I think we uphold by virtue of the destroy logic, but I can't quite figure out how to articulate it. We must demonstrate that if a `Moved(t)` is read, then `t` must still be valid. FIXME. Let's leave that for a little bit later; I believe it's true, we just need to... Expected reference, found... here. This is... it's going to be `p` instead of `n`. 285... `Node`. The value type for `Node` is wrong. `Atomic::new`.
I think maybe this should just be that. Great. 254: `into_owned`. Ah! That was not at all what I meant to do. And this is fine because we haven't given it out, so let's say the safety is: we own `value` and have never shared it. 249: this has to be a reference. Oh yeah, keys obviously need to be comparable. 229: no field `lock`. 214. Almost there. Expected reference, found `Shared`. Oh yeah, this should probably just be `t`, because we need to still be able to refer to the old `t`. Other tables... those are the only ones. Great. 212: match on `bin`. Oh, this is a simple one. This is just... safety... I think this is just the same argument as this one. Yup, it's just the same argument. 168: this is going to be `table.deref()`. Safety: see the argument below for the not-is-null case. What were we saying? 129: `init_table`. This is going to take the `&'g Guard` and give you back a `Shared<'g>`. Safety: we loaded the table while the epoch was pinned, so the table won't be deallocated until the next epoch at the earliest. Same argument as here.

So few errors left! We haven't gotten to the borrow-checker stage yet, I think, so there are probably still some challenges, but... This is going to be an unsafe deref. I think what we probably want here, actually, is to implement a guard API of our own. So why is this okay? Safety: we are still holding the guard... and `saw_value`... and `saw_v`... so it won't be dropped until the next epoch after we drop our guard, at the earliest. The table deref is unsafe: same argument again. 102. Safety: this is the same as the argument for our other bin deref. I sort of want to collect all of these in one place rather than have them spread out like this. Okay, `let node = node.as_node().unwrap()`. 116: this probably takes a guard. And 120: `node` is an unsafe `node.as_ref()`... I mean, deref. And here the safety is the same argument as before, where we deref the node. Huh? Deref? Nice. How many errors? Six. Oh man, that's not many at all. "Cannot move out of dereference"? What? Where? 778: this is `into_box`, so that's fine. 68.
68: `value` does not need to be mutable. 765: cannot move out of a dereference? Oh, `into_box`, I guess. Oh, that's awkward. So here we can't actually walk these bins and have them be owned, because we can't take ownership out of `table`, since `Table` implements `Drop`. And if we iterate over the array, the boxed slice, then we don't get very far either. What we can do here, though, is replace it: swap it out, I guess, back into a boxed slice, and then back from that. It's not pretty, but it works. 765: that should be it, indeed. 667: `run_bit` has to be mutable — that is totally true. 461: `count` has to be mutable — that is also true. 420: `table` does not have to be mutable. 237: no... a borrow error. Oh, that's going to be really awkward to fix, because we borrowed the key here to make life easier for ourselves, but that might have been a mistake. Well, you know, we can just do it here instead. What am I missing from that? There's no use of `key` further up, so we can just do it here. It compiles. It compiles! It is alive! It's alive!

`saw_bin_length`: we don't actually know what to do with that yet. `load_factor` is not used. Ooh. Implemented it all. How do we phrase this commit? It doesn't really matter for these kinds of commit messages, but perhaps something along the lines of "add initial safety". Yeah: "figure out most of the safety". And there it is.

Do you plan on implementing tests? Yes, indeed. So actually, your screen is going to go bright in a second. The Java ConcurrentHashMap has a bunch of tests, so the plan is to implement those as well, but we won't do that today. My guess is there'll be one more stream on ConcurrentHashMap. Now that we have the basics working, we need to actually test it and get it to work, and that's going to be the next stream. All right. So we now have a thing that compiles. We don't actually have anything that can do anything useful. So, just for the heck of it... do we even have a `new`?
We probably don't even have a `new`, do we? That's awkward. Okay, we don't even have an implementation of `new`, so that's going to be the FIXME for next time. Do I not sign my git commits? I used to; these days, not so much anymore. It's not clear to me that it carries that much value, especially if you turn on auto-sign. I might turn it back on; I had some problems with GPG for a while, but those should be resolved now.

Okay, I think we're going to call it there, because now we have something that actually compiles. The biggest things that are left now: actually running the thing; removal, which is going to be a big one; the tree bins, the balanced-tree stuff that they do, which I think we're probably not going to port; and the sharded-counter business they do. There's one more... oh yeah, and then of course figuring out that last safety invariant that we ended up leaving as a FIXME. We need to document it and convince ourselves that that code is actually correct. With that, though, I think we're in a pretty good position. My guess is there will be one more stream on the concurrent hash map, where hopefully we'll get to a point where it all works. That's going to be whenever the next stream is. I'm leaving for holiday on Monday and will be back mid-January, so my guess is that the next stream is going to be sometime at the end of January. I hope this was possible to follow. This is very hairy code, but hopefully talking through it helped a lot. And I've pushed all the code, so if you want to read through it at your own pace and compare it to the Java code, just have a look in the GitHub repo. And with that, I wish you all a happy new year, and I will see you in 2020. Bye, everyone. It's great to have you here, as always. ...That's not what I wanted to do. All right, bye.