Hi everybody. In the last video we talked about hash collisions and load factors and how they fit into the hash table data structure. Let's go back to our code now and deal with the fact that we are going to have some hash collisions. So here is the hash table code I was working on last time. The main bugaboo here is this hash collision section, where I'm trying to put a value into my hash table but I've run into a hash collision, meaning that I went to put it in its slot in the table, but that slot was already occupied by a different value. So for example, if we put the value 2 into our hash table, that should go into slot 2 (slot 0, slot 1, slot 2, there it is), because my hash table has size 20 and I'm using the simple modulo hash, where I mod the value by the length of the hash table, that is, the number of buckets or slots in the table. Now if I put in 22, I should get a hash collision, because 22 mod 20 is also 2. We still want to have 22 in the table, but the question is where do we put it? To answer that we need to implement one of our collision resolution strategies. I don't know if it's the simplest, but definitely the safest one to use is open addressing with linear probing, where we hit our collision and then we just walk down the slots, trying to find an empty one. So let's implement that logic.
In our put method, we've hashed the value to get our index in the list. Basically what we want to do is keep walking up the list until we find an empty slot. How do we know that we have an empty slot? We know a slot is empty when self.table at that index is None. So let's implement a probe that walks up the list. Well, if you're talking about walking or iterating up a list, it sounds like you're headed for a loop.
So we're going to use a while loop here. We use a while loop because we're not doing something to every item in our list; we want to loop until a condition is met. I'm going to start by just saying, while I'm not done. I'll have to define what done means in a minute, but I'm going to initialize done to False. I'm not done, so I'm going to do some probing until I find a slot. So let's implement the logic where we actually compute the new index. Linear probing means go to index plus i, where i increments until we find an empty slot. Let me define i, and I'm going to start it at one. We'll see why in just a second. So the new index where we want to look is index plus i. If I hash to slot two and I see that something is there already, but not the value I want, then i is one, index plus i is two plus one, which is three, so that gives me the next slot in my hash table. It's at this new index that I'm going to look and ask, hey, is something there, or is it an empty slot?
Now I need to consider the fact that I may actually walk off the end of the list here. So anytime I compute an index in my list, I always need to mod it by the length of the list. That will keep me within the range of values from zero to the length of the list minus one.
All right, so now I'm starting to probe. I've walked one step here, and now I need to ask myself the question, hey, can I insert here? There are two conditions where I would want to insert. One is, of course, if the table at this new index is None; that means I found an empty slot. This next slot here is None, so it's empty, so I can put it in there. Great. Why don't I do that?
self.table at new index, not the old index, gets the value I want to put in. Oh, and by the way, I am now done. I found the slot, I'm finished, done is True, so this loop will exit. Now there's another condition where I'd be happy for this logic to take place, and that's when the value at the new index is equal to the value I want to insert, just like we had up here when I initially tried to put it in. So if either of these things is true, put it in there. But if neither of these things is true, then what do I need to do? I need to go to the next step, and to do that I need to increment i. So else, i plus-equals one. So that's kind of it for this part. Let's give it a try. I'll pause here so you can catch up with the code if you need to; pause the video.
All right, so let's try this. Let's insert some values. Before, when I inserted 2 and 22, I got a hash collision. Now if I've implemented linear probing correctly, what I would expect to see is that 2 is still here, but 22 appears in this next slot. Okay, let's try it. Aha, there it is. I don't know if my collision is still being detected, but it did actually insert it here. Let's put in a different value entirely. Let's put in 4. Aha, so here's 4, in slot 4: 0, 1, 2, 3, 4. There was no collision when I inserted 4. But now what's going to happen if I try to put in 42? 42 will hash to 2, so it's going to hash here. This slot isn't empty, so if I've implemented linear probing correctly, it should go here. But that slot's not empty either, so it should go here. But is this slot empty? No, it's not. So it should go here. Let's run this and see what happens. Woo-hoo! All right, so I've got a couple of hash collisions, but predictably 42 is right here, and that's in fact where we want it to be. Let's check our mod on the size of the list. Let's insert 19.
And that should get us down here in this last slot. There we go. Okay, and now if I put in 39, that will hash to slot 19 as well, but that slot is taken. So I should walk off the edge, come back around, and insert 39 back at the start of the list, in this position. Let's see. Yes, there it goes. And I should have three hash collisions, and there they are. Okay, so while I have my collisions, it's not the end of the world, because I'm actually getting the values in here.
Now, if you're paying attention, you may see that something really bad could happen. I could fill up this hash table, and then there are no slots left. So since I'm doing linear probing, it might be a good idea in here to check and make sure that the table is not full. There's a really easy way to do that; I'll copy this down here. An easy way to do that with linear probing (this won't work with quadratic probing) is to ask the question, is the new index the same as the original index, the original hash value? If I've looped through everything, then I will have wrapped. I will have, say, started here, gone here, here, here, and so on until I wrap all the way around and come back to where I started. If I come back to where I started, this hash table is full, and that's a bad day. I can't do anything. So what I will do in this case is raise an error and just say the hash table is full, no room to insert value. So that's bad; we never really want that to happen.
So another thing that we could do, which we discussed, is to bring the load factor into this equation, where before we put anything into the hash table, we ask ourselves, is this hash table so full that it's worth my while to resize it? The way that we would do that is, let's first check the load factor. Will the table exceed an acceptable load factor? Well, what's an acceptable load factor?
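The put logic built up so far (probe, wrap with modulo, stop when the probe comes back to the original index) might look roughly like the sketch below. The class name HashTable, the helper _hash, and the choice to return the slot index from put are assumptions made for illustration; the code on screen in the video may differ.

```python
class HashTable:
    """Open-addressing table of integers with linear probing (illustrative sketch)."""

    def __init__(self, size=20):
        self.table = [None] * size

    def _hash(self, value):
        # Simple modulo hash: value mod the number of slots.
        return value % len(self.table)

    def put(self, value):
        index = self._hash(value)
        # Home slot is empty or already holds this value: no collision.
        if self.table[index] is None or self.table[index] == value:
            self.table[index] = value
            return index

        # Collision: look at index + 1, index + 2, ... wrapping with modulo.
        i = 1
        done = False
        while not done:
            new_index = (index + i) % len(self.table)
            if new_index == index:
                # The probe wrapped all the way around: every slot is taken.
                raise RuntimeError("Hash table is full; no room to insert value")
            if self.table[new_index] is None or self.table[new_index] == value:
                self.table[new_index] = value
                done = True
            else:
                i += 1
        # Returning the slot is an addition for this sketch, to make the demo visible.
        return new_index


table = HashTable(size=20)
slots = [table.put(v) for v in (2, 22, 4, 42, 19, 39)]
print(slots)  # [2, 3, 4, 5, 19, 0]
```

The printed slots follow the walkthrough: 22 lands one past 2, 42 walks past three occupied slots, and 39 wraps from slot 19 back around to slot 0.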
Usually the load factor is configured when you initialize the hash table. So let's just say that an acceptable load factor for us is 0.66. That's roughly the two-thirds threshold Python uses in its native implementation of hash tables. So we're going to check this load factor, which is the number of items in the hash table divided by the length of the hash table. If it exceeds this threshold, we need to resize the hash table.
So how do we keep track of how many items are in the hash table? Your first inclination might be the length of this list where we're keeping things, but that won't be accurate, because the length of this list is always going to be whatever the size is. A None still counts as an element. So we're going to need something else to keep track of how many items are in the hash table. The simplest way to do that is just to increment a counter variable every time we insert something new into the hash table. So let's do that down here; we'll increment in our put method. All right, so now self.count is going to tell us how many items are in the hash table, and from that we can calculate the load factor. Remember, the load factor is equal to the number of items in the table divided by the total number of slots. We have this information, so we can compute it as self.count divided by the length of the list that's storing our data. And now the question is, because we're getting ready to insert one more item, would the load factor with that extra item, that is, the count plus one divided by the number of slots, be greater than our threshold, which we said is 0.66? If so, it's time to resize the table. So how do we resize the table?
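As a small worked example of that check, here is a sketch in which the function name would_exceed_load_factor and the constant MAX_LOAD_FACTOR are made up for illustration. It computes the load factor as it would be after the pending insert, (count + 1) divided by the number of slots, and compares that against the 0.66 threshold.

```python
MAX_LOAD_FACTOR = 0.66  # threshold from the lecture; normally set when the table is created


def would_exceed_load_factor(count, num_slots, max_load=MAX_LOAD_FACTOR):
    """Return True if inserting one more item would push the load factor past max_load."""
    return (count + 1) / num_slots > max_load


table = [None] * 20
# len() counts the None placeholders too, so it cannot tell us how many items are stored.
print(len(table))  # 20

print(would_exceed_load_factor(12, 20))  # False: 13 / 20 = 0.65
print(would_exceed_load_factor(13, 20))  # True:  14 / 20 = 0.70
```

With 20 slots, the fourteenth insert is the first one that would trigger a resize under this threshold.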
Well, I have not defined that method yet, but the basic strategy is going to be this. First, let's copy the old table list; grab a deep copy of it. Second, let's initialize self.table to a new list that's bigger and empty. What do I mean by bigger? I don't know, two times the size; whatever the old size was, multiply it by two. So maybe we need to keep track of the size up here instead of just using it once, so self.table size gets size. This way we can keep track of it. So we make the table bigger, and then third, we re-put all the old values from our copy in step one back into the hash table by calling self.put with each value. We want to reset the inner table and just keep putting things back into it. The reason that we need to call self.put is that our new table is larger, and that affects the hash function, because the hash function cares about how big the table is. So we need to re-put everything back in there; otherwise, we're going to wind up with things where we don't expect them. I'm not going to implement that right now. This is going to be an exercise for you to do as part of homework assignment number seven. Not number six; number six is focused on other things. But for next week's assignment, you may want to do this. It'll be part of one of your grades for a more advanced grade.
So one other thing I want to say before we leave off here is that linear probing not only has to occur here when you're putting things in. It has to occur down here as well, when you're trying to get things out. It could be the case that we need to probe in order to be sure that the value is not in the hash table. We could be in this situation down here where we say, hey, give me 42. Well, it's going to hash here.
And that value is obviously not 42. So it should look here; well, that value's not 42. And that value's not 42 either, but this value is. So it's necessary for you to probe before you can conclude that 42 is not in there. You will know it is not in there if you encounter a None or if your index wraps back around to where you started. The probing is pretty much the same as it is up here in put. The logic is almost the same, except you're going to do slightly different things based on what values you see there. This I'll also leave as an exercise for you for your later assignment. I hope this was helpful for you. And again, let me know if you have any questions, and I will talk to you soon.
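For reference, here is one possible shape for the two pieces left as exercises, resizing and probing on lookup. It is a sketch under assumptions, not the assignment's expected solution: the names _probe, _resize, contains, size, and max_load are invented here, a bounded for loop stands in for the while loop used in the video, a plain list copy stands in for the deep copy (enough for integers), and the sketch assumes values are never removed from the table.

```python
class HashTable:
    """Linear-probing table with a load-factor resize (illustrative sketch)."""

    def __init__(self, size=20, max_load=0.66):
        self.size = size
        self.max_load = max_load
        self.table = [None] * size
        self.count = 0  # number of stored items; len(self.table) is always size

    def _probe(self, value):
        """Return the slot holding value, or the first empty slot on its probe path.

        Returns None if the probe wraps all the way around without finding either.
        """
        index = value % self.size
        for i in range(self.size):
            new_index = (index + i) % self.size
            if self.table[new_index] is None or self.table[new_index] == value:
                return new_index
        return None

    def _resize(self):
        old_table = list(self.table)      # 1. copy the old list
        self.size *= 2                    # 2. new empty list, twice as big
        self.table = [None] * self.size
        self.count = 0
        for value in old_table:           # 3. re-put every old value so it is
            if value is not None:         #    rehashed against the new size
                self.put(value)

    def put(self, value):
        # Would one more item push the load factor past the threshold?
        if (self.count + 1) / self.size > self.max_load:
            self._resize()
        slot = self._probe(value)
        if slot is None:
            raise RuntimeError("Hash table is full; no room to insert value")
        if self.table[slot] is None:
            self.count += 1               # only count values that are new
        self.table[slot] = value

    def contains(self, value):
        # Hitting None (or wrapping around) means the value is not in the table.
        slot = self._probe(value)
        return slot is not None and self.table[slot] == value


table = HashTable(size=20)
for v in (2, 22, 4, 42, 19, 39):
    table.put(v)
print(table.contains(42))  # True: found after probing slots 2, 3, 4, 5
print(table.contains(62))  # False: the probe reaches the empty slot 6

small = HashTable(size=4)
for v in (1, 5, 9):
    small.put(v)
print(small.size)  # 8: the third insert would have exceeded 0.66
```

Sharing one probe helper between put and contains reflects the point that the two loops are almost the same; only what happens at the slot they stop on differs.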