In this video we are going to explain how the concept of an associative cache is implemented. Let's first review very quickly the other alternative, direct mapping, in which the address sent by the microprocessor is divided into three fields. The first one tells me the offset within a block. The second one tells me the cache block in which the data should be stored. And the third one is the tag, assuming the right end is the least significant address bit and the left end is the most significant address bit. So in direct mapping, every address that comes out of the microprocessor is divided into these three fields: the first depends on the size of the block, the second depends on the size of the cache, and the third is the remaining set of bits.

The cache is actually a memory that can be depicted with three fields, and each entry in this memory has a tag, which tells us which memory block is actually being stored there; the block data; and a valid bit, telling us whether this data is actually correct and corresponds to true data from main memory. So the cache memory is divided into these three columns.

Now, the special property of direct mapping, not of associative mapping but of direct mapping, is that one memory block can only go to one cache block location. This is perhaps the most restrictive condition of this memory. If I pick any block in main memory, then by looking at the address of that block, it is only entitled to go to one single location of this cache, and always the same location. So with this type of mapping, or this type of placement policy, there is one situation that can occur, which is the following. Suppose we have a direct-mapped cache; let's call it DMC.
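The three-field split described above can be sketched in a few lines of code. This is a minimal model, assuming hypothetical sizes (32-byte blocks, 8 cache blocks) that are not from the video; only the field layout (offset, block index, tag) reflects what is described.

```python
# Direct-mapped address decomposition (hypothetical sizes:
# 32-byte blocks -> 5 offset bits, 8 cache blocks -> 3 index bits).
BLOCK_SIZE = 32
NUM_BLOCKS = 8
OFFSET_BITS = BLOCK_SIZE.bit_length() - 1   # 5
INDEX_BITS = NUM_BLOCKS.bit_length() - 1    # 3

def split_address(addr: int):
    """Split an address into (tag, cache block index, offset within block)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

print(split_address(0x1234))  # -> (18, 1, 20)
```

Note that the index field is what forces each memory block into exactly one cache location: two addresses with the same index bits always compete for the same slot.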
I could have a few blocks here; let's suppose I have eight blocks, and certain blocks of this cache memory are empty. These three here are empty because the microprocessor is sending addresses that correspond to blocks stored in the other five locations. So addresses are repeatedly sent by the microprocessor that map onto these five blocks; no address corresponding to any block suitable to be stored in these three positions is emitted, and therefore three of them are wasted and are not being used. And the effect is that the hit rate of that cache is lower than it could be. If you think carefully, the restriction that one memory block can only be allocated in one specific cache block location means that even though my cache could be accommodating additional blocks, it cannot do so because of this restriction; so this restriction translates into a hit rate that could be improved.

Now, what happens if this restriction is lifted? In other words, we are no longer going to require that every memory block goes to only one possible cache block location. We are going to relax this condition and design a new cache, which we call an associative cache, which has a similar structure to the one we have described. In other words, we still have three columns: the first column stores the tag, the second column the block data, and the third the valid bit. This is exactly the same as before, but now the restriction is lifted, and therefore any memory block can go to any cache block location. What this means is that, rather than having the division into the offset, the cache block number, and the tag, the memory address now needs to be divided into two fields: the offset, still, and, given the fact that any memory block can go to any position, all the remaining bits, which make up the tag that goes into the tag column.
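The two-field split of the associative cache can be sketched the same way. Again the 32-byte block size is an assumption for illustration; the point is that the index field disappears and everything above the offset becomes the tag.

```python
# Fully associative address decomposition (hypothetical 32-byte blocks).
# There is no cache-block-number field anymore: all bits above the
# offset form the tag.
BLOCK_SIZE = 32
OFFSET_BITS = BLOCK_SIZE.bit_length() - 1   # 5

def split_address(addr: int):
    """Split an address into (tag, offset within block)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    tag = addr >> OFFSET_BITS
    return tag, offset

print(split_address(0x1234))  # -> (145, 20)
```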
So this is the address, bits 0 to d minus 1, emitted by the microprocessor. Now, this type of memory does not impose the restriction we have seen before, and therefore any memory block can go to any block location in the cache.

Now the problem comes. Lifting this restriction is great; it seems fairly obvious and makes sense. However, the problem appears when trying to search for an address. In other words, if I have my blocks organized like this and I am given an address by the microprocessor, how do I know whether I have the block data for that address? If we go back to direct mapping: given an address, I know exactly how to look for the data in precisely one single position of my cache, because that position is given by the block-number field. The reason the search is simplified is that this number directly points to one single entry, and the only thing I have to do is compare the two tags and see if it matches.

When we go to the fully associative cache, the fact that the block can be in any position means that I take the memory address, like we said before, with only the offset and the tag; I cannot separate any bits for a block number, because that restriction is lifted. But now this tag has to be, and this is the most important point, compared with all the tags in the cache. This is the main difference between associative and direct mapping: given that we lifted that restriction, a block can now be anywhere in the cache, so when we search for a specific address, its tag needs to be compared with all stored tags in parallel. Now, what is the advantage? The advantage is that we will probably have a better hit rate.
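The lookup just described can be modeled in software. This is a sketch, not a hardware description: in real hardware all tag comparators fire in parallel, whereas this model simply scans every entry. The line structure (valid, tag, data) matches the three columns described above; the 32-byte block size is an assumption.

```python
# Minimal model of a fully associative lookup.
from dataclasses import dataclass, field

OFFSET_BITS = 5  # 32-byte blocks (assumed)

@dataclass
class CacheLine:
    valid: bool = False
    tag: int = 0
    data: bytes = field(default=bytes(32))

def lookup(cache, addr: int):
    """Return the byte at addr on a hit, or None on a miss."""
    tag = addr >> OFFSET_BITS
    offset = addr & ((1 << OFFSET_BITS) - 1)
    # In hardware these comparisons happen simultaneously, one
    # comparator per line; here we model them with a loop.
    for line in cache:
        if line.valid and line.tag == tag:
            return line.data[offset]   # hit
    return None                        # miss

cache = [CacheLine() for _ in range(8)]
cache[3] = CacheLine(valid=True, tag=0x1234 >> OFFSET_BITS,
                     data=bytes(range(32)))
print(lookup(cache, 0x1234))  # hit: byte at offset 20 -> 20
print(lookup(cache, 0x5678))  # miss -> None
```

Notice that the matching line can sit at any position (here, entry 3); nothing in the address tells us where to look, which is exactly why every tag must be examined.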
Let's go back to the example we described before, the situation in which the processor keeps trying to access blocks that collide in five out of the eight positions. When we take the same behavior to the associative cache, these three slots, instead of being kept empty, will be populated with previous data that is still needed by the microprocessor. Therefore, we can guarantee that by lifting the restriction of memory blocks going to a specific, fixed position in the cache, we will keep this memory completely full of relevant data. So this is the advantage: a better hit rate.

This advantage comes at a cost: the memory is significantly more complex to design. Why? Because compared with the direct-mapped cache, where for a read or a write operation I look just at the cache block number and go straight to one single line to do the operation there, this memory needs to compare all the tags at once, and maybe one of them returns a match, saying "yes, it's me, I'm in this position, and we have a hit here." So this is the tradeoff: even though we get a better hit rate, the design of this hardware is much more complex, because the read operation now requires, as the word says, a lookup. The search, or read operation, always requires a lookup, and that lookup is performed over the entire tag column.
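The hit-rate advantage can be made concrete with a toy simulation. This is a hypothetical experiment, not from the video: both caches have 8 blocks, and the access pattern alternates between two memory blocks that collide in the direct-mapped cache (same index bits). The associative model here uses LRU replacement, which is one common choice, not the only one.

```python
# Toy comparison: direct-mapped vs. fully associative, 8 blocks each.
NUM_BLOCKS = 8

def direct_mapped_hits(blocks):
    cache = [None] * NUM_BLOCKS
    hits = 0
    for b in blocks:
        idx = b % NUM_BLOCKS          # block-number field picks the slot
        if cache[idx] == b:
            hits += 1
        else:
            cache[idx] = b            # evict whatever was in that slot
    return hits

def fully_associative_hits(blocks):
    cache = []                        # any block may occupy any slot
    hits = 0
    for b in blocks:
        if b in cache:
            hits += 1
            cache.remove(b)           # refresh LRU position
        elif len(cache) == NUM_BLOCKS:
            cache.pop(0)              # evict least recently used
        cache.append(b)
    return hits

# Blocks 0 and 8 map to the same direct-mapped slot, so alternating
# between them evicts on every access; the associative cache simply
# keeps both resident after the first pass.
pattern = [0, 8] * 10
print(direct_mapped_hits(pattern))      # 0 hits out of 20
print(fully_associative_hits(pattern))  # 18 hits out of 20
```

The same 20 accesses give zero hits under direct mapping but 18 under associativity, illustrating the tradeoff of the section: better hit rate, paid for by comparing every tag on each access.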