In this video we're going to explain the concept of cache memory and, more precisely, one of the heuristics used to place content from main memory into the cache.

Let's start with the conventional situation: we have a microprocessor, and let's assume that its memory addresses are encoded with d bits. What this means is that on the other side we have the main memory, which then has addresses from 0 to 2^d − 1. Now, the first technique used to bridge the speed at which this microprocessor runs and the speed at which this memory chip is capable of executing read and write operations is to divide the main memory into blocks. Each block contains a set of consecutive bytes stored in memory, and the whole memory is divided into blocks of equal size, from the first one at the top all the way down to the last one. These blocks contain several addresses, but just as each address has a number, we can also number the blocks, starting at 0, 1, 2, and so on. If we want to know how many blocks we have, we take the total number of cells, which is 2^d, and divide it by the size of the block; call that size mb_size, for memory block size. Since we started numbering the blocks from 0, the last block will have precisely the number 2^d / mb_size − 1. As we can see, once we know d and the memory block size we have chosen, we can calculate the number of blocks we have. Now, why do we make this distinction?
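As a quick numeric sketch of that calculation (the values d = 16 and a 64-byte block are made-up example parameters, not values from the video):

```python
# Illustrative parameters (assumptions for this example):
# a 16-bit address space and 64-byte memory blocks.
d = 16                 # address width in bits
mb_size = 64           # memory block size in bytes

total_cells = 2 ** d              # number of addressable cells: 2^d
num_blocks = total_cells // mb_size

print(num_blocks)      # 2^16 / 64 = 1024 blocks
print(num_blocks - 1)  # number of the last block: 1023
```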
Well, suppose now that, given a memory address from my microprocessor, which is a d-bit value between 0 and 2^d − 1, I want to know in which one of these blocks that address is located. What I have to do is a single operation: divide the address by the block size; the quotient is precisely the block number. Now, you'll agree with me that division can be quite a sophisticated operation, but it is highly simplified if the block size is a power of 2. In that case the division simply amounts to removing as many bits as the exponent of that power of 2. Let's suppose the block size is 2^b bytes, with b < d. If we go back to our scheme on the right: if the size of each block is 2^b bytes, the number of blocks becomes 2^d / 2^b = 2^(d−b), and the last block has number 2^(d−b) − 1. What this result means is that I can choose any size I want for the block, but if that size just so happens to be a power of 2, then, since memory sizes are also assumed to be powers of 2, one divides the other exactly, and the number of blocks can be calculated very easily.

Going back to the initial question, in which block is a certain memory address stored? This is now very easy: we just remove the b least significant bits, which we call the offset, and the remaining d − b bits give me the block number. So far we haven't done anything special between the microprocessor and the memory; it's just the observation that if we divide a memory whose size is a power of 2 into blocks whose size is a power of 2, we have a very simple way to compute which block corresponds to a certain memory address.

OK, so here comes the cache memory. The cache is a memory that is going to sit here in the middle, and it has
very interesting properties. The first one is that the content of this cache memory is also divided into blocks of exactly the same size we have been discussing so far. The other property is that the number of blocks in this cache is much smaller than the total number of blocks in main memory. The idea is that the cache holds only a subset of the blocks from main memory, with the advantage that this memory is much faster for the microprocessor to perform read and write operations on.

Now, you might be asking yourself: why don't we make this cache the same size as main memory, so that we gain that speed everywhere? The reason is that current technology allows us to manufacture either very large memories that are slow, or very small memories with this structure that are very fast. This is basically a restriction imposed by design constraints. Therefore the cache has a certain number of blocks, and, as I said before, each cache block has the same size as a memory block.

The question that remains is how we manage to store some of these memory blocks in the cache. We are going to study the first and simplest scheme: block 0 is stored in the first cache position, block 1 in the next position, and so on. One thing we have mentioned is the number of cache blocks; it basically allows me to number the cache positions from 0, 1, all the way to the number of cache blocks minus 1, and this number is obviously much smaller than 2^(d−b), which is the number of memory blocks. So the mapping, as I said, is very easy: first block in the first position of the cache, second block in the second one, third block in the third one, and so on and so forth until we reach the
situation in which a block is stored in the last position of the cache. The question now is what happens with the following block, and to keep this heuristic as simple as possible, what we do is map that one back to cache block number 0. As I said before, the size of this cache is much smaller than the size of main memory; therefore two or more memory blocks will eventually be mapped to the same block in the cache.

The question that remains to be answered is: if I have an arbitrary block somewhere in my memory, where does it go in this cache? If you look carefully at the type of mapping we are doing, blocks 0, 1, 2, up to the number of cache blocks minus 1 are stored in consecutive cache positions, and the block whose number equals the number of cache blocks goes back to be stored in position 0. Therefore the way to know where to store a block is to take the block number modulo the number of cache blocks.

Now, if you look carefully at this operation, taking the modulo of a binary number is a fairly complicated operation for arbitrary numbers, but it is trivial if the number of cache blocks is also a power of 2; let's call it 2^c. In that case, taking the modulo of the block number is very easy. If we go back to our example: given an address, I know that I have to drop the b offset bits, as we said before, and over the remaining bits I have to apply the modulo-2^c operation. On a binary number this simply means taking the c least significant bits of the block number, and this directly gives me the answer to the question, providing the block number in the cache that should be used to store the block derived from
this address. So, as we can see, we have d − b − c remaining bits. What we have done so far is: first, divide the memory into blocks, where b is the number of bits I remove to get the block number; then, over this block number, perform the modulo operation, because of the mapping I chose (first block to first position, all the way down to the last cache position, with the following block going back to zero); and this modulo operation is very easy to compute if the number of cache blocks is a power of 2, as we chose here.

Now, a couple of questions remain to be answered. We figured out that a block whose c index bits are all zero should go in cache block 0, but, as we have seen, there is more than one block in main memory that can go there. So how do I identify which memory block is currently stored in position 0 of the cache? The answer is in the remaining field, which is what we call the tag, and the tag identifies the block inside the cache unequivocally. What is going to happen is that the tag is also stored in the cache memory: each cache entry keeps a tag, which, by the way, is only d − b − c bits, and with this tag we are able to identify, out of all the possible main-memory blocks that map to the same cache position, which one is currently present there.

So the way this approach works is: whenever the microprocessor requests a certain memory address, this decomposition is performed. The b offset bits are ignored, the c bits tell me the block number in the cache where this address is supposed to be, and the tag of the address is compared with the tag stored in that cache entry. If the two tags are the same, then I know that the cached block corresponds to the one I am asking for. If the tags are different, it means that even though I have a block here, it doesn't correspond to the block
containing the memory address requested by the processor. In that case, which is what is called a cache miss, I have to go back to memory, bring the right block, put it here, and update the tag.

So this is the structure of the cache, with the addition of yet another field of only one bit, called the valid bit. Its purpose is basically to indicate whether the content stored here can be trusted or not: whether it is correct, or it is some old block that is not relevant to the computation. Therefore the condition for the microprocessor to trust a block stored in the cache is that the tag corresponds exactly with the tag of the address being requested, and the valid bit is equal to 1.

This scheme is what is known as a direct-mapped cache: "direct" because we look at the c bits, and "mapped" because we use those c bits to decide where to store the block. Remember, very important: the main memory is divided into blocks, these blocks have a size that is a power of 2, and they divide the total amount of memory exactly. This is very important because, with this condition, we simplify the division operation needed to figure out which block contains a given memory address. Once we have established that the cache memory holds a subset of blocks, the number of cache blocks is typically also a power of 2. Why? Because in order to figure out which cache slot a memory block should occupy, we need to perform the modulo operation on the block number, and again the modulo operation is performed trivially if this number is a power of 2; this is why cache memories typically have a number of blocks that is a power of 2. Finally, given the fact that more than one memory block can be stored in the same cache block, we need to extend each entry with the tag, just to make sure that whenever we come here to look for data, the tag we are requesting matches the tag stored in that location. In that case it would be a cache hit; otherwise, if the tag is different or the
valid bit is equal to 0, then we should go back to main memory, bring the block into the cache, and then notify the processor that we have performed the correct access. So this is the operation that takes place in a read or write operation of the microprocessor with respect to the cache, and the cache then has its own circuitry to perform read and write operations between the cache and the main memory.
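The whole scheme described above can be sketched in a few lines of Python. This is a minimal illustrative model, not real hardware behavior: the parameter values (d = 16, b = 4, c = 3), the function names, and the dictionary-based cache entries are all assumptions chosen for the example, and the data payload of each block is omitted.

```python
# Minimal model of a direct-mapped cache lookup (illustrative sketch).
# Assumed example parameters: 16-bit addresses (d = 16),
# 16-byte blocks (b = 4), 8 cache blocks (c = 3).
d, b, c = 16, 4, 3

NUM_CACHE_BLOCKS = 2 ** c

def split_address(addr):
    """Decompose an address into (tag, cache index, offset)."""
    offset = addr & (2 ** b - 1)          # b least significant bits
    block_number = addr >> b              # remaining d - b bits: block number
    index = block_number & (2 ** c - 1)   # block number modulo 2^c
    tag = block_number >> c               # remaining d - b - c bits
    return tag, index, offset

# Each cache entry holds a valid bit and a tag (block data omitted here).
cache = [{"valid": False, "tag": None} for _ in range(NUM_CACHE_BLOCKS)]

def access(addr):
    """Return True on a cache hit; on a miss, install the block's tag."""
    tag, index, _ = split_address(addr)
    entry = cache[index]
    if entry["valid"] and entry["tag"] == tag:
        return True                       # hit: valid entry with matching tag
    # Miss: in hardware we would bring the block from main memory;
    # here we just update the tag and mark the entry valid.
    entry["valid"] = True
    entry["tag"] = tag
    return False

print(access(0x1234))   # first access: miss
print(access(0x1238))   # same block (same tag and index): hit
```

Note how two addresses that differ only in their offset bits (0x1234 and 0x1238) land in the same cache entry with the same tag, which is exactly why the second access is a hit.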