We have focused so far on different techniques for improving the performance of the processor. But in a computer system it is not only the processor; there are other components present, and the performance depends not only on the processor but also on those components, and memory is one such very important component of a computer system. So starting today, we shall focus on hierarchical memory organization, which is a technique for improving the performance of the memory system. I shall discuss the key characteristics of memory systems, which will give you the necessary background to understand hierarchical memory organization. Then I shall start our discussion on cache memory, which nowadays is present in almost all processors, and discuss various basic issues in cache memory design such as cache size, mapping functions, replacement algorithms, write policy, block size, number of caches, performance analysis and so on. It may not be possible to cover all of this in one lecture, but we should cover it in two. As I mentioned in the beginning, memory systems are critical to performance. Why is it so? The reason is that, as you have seen, the program and data are present in the memory, and the CPU has to fetch them from the memory for the purpose of computation. So instructions are to be fetched from memory, and data are to be fetched from memory. If the memory is very slow and cannot provide the instructions and data at the same rate at which the processor is computing, then the execution of instructions cannot proceed, and that is the reason why memory systems are critical to the performance of the overall computer system. As a consequence, computer designers have devoted a great deal of attention to developing sophisticated mechanisms to improve the performance of memory systems.
Many sophisticated techniques have been developed, which I shall discuss over a couple of lectures, to improve the performance of memory systems, and the primary approach used is hierarchical memory organization. This is the key technique that has been used to improve the performance of the memory system, and that is the reason why we shall discuss it in more detail. Now you may ask, what is the basic idea behind hierarchical memory organization? It is based on the observation that programs exhibit temporal locality and spatial locality. What do we mean by temporal locality? By temporal locality we mean that a piece of program or data which is being used now will be used again in the near future. Temporal locality is concerned with time: the same code that was used recently will be reused soon. Then comes spatial locality; what does it mean? As we know, programs and data are stored in memory. Spatial locality says that if you are executing code from a particular memory location, the adjacent locations will be accessed in the near future. That means memory will be accessed from nearby memory locations, whether it is instructions or data, and for instructions in particular it is very common. These two properties can be exploited to build a hierarchical memory organization, and we shall see how it can be done. Now, before we go into the details of hierarchical memory organization, let us start with the simplest computer system based on the Von Neumann architecture, which was the basis for the stored program computer. As you have seen, we have got a central processing unit comprising the arithmetic and logic unit, registers, and the timing and control unit, and there is an interface through which it is connected to the memory and I/O devices.
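To make the two locality properties concrete, here is a small sketch of my own (not from the lecture): one simple loop exhibits both kinds of locality that a hierarchical memory can exploit.

```python
# Illustrative sketch: a single loop shows both temporal and spatial locality.

def sum_array(a):
    total = 0
    for i in range(len(a)):
        # temporal locality: `total` and `i` are reused on every iteration
        # spatial locality: a[0], a[1], a[2], ... sit in adjacent locations
        total += a[i]
    return total

print(sum_array([1, 2, 3, 4]))  # -> 10
```

A cache that keeps recently used words (temporal) and fetches a few adjacent words at a time (spatial) serves almost every access in a loop like this.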
This external bus through which it is connected is known as the system bus; as you know, address lines, data lines and many control lines are used for the purpose of accessing memory and I/O devices. So in general we assume that your program and data reside outside the CPU, in memory, and are to be accessed through a system bus, and obviously that will be slower. Of course, this is the Von Neumann computer architecture, and subsequently, as we have incorporated different kinds of memory, the architecture has changed; we shall see how it has changed, particularly to take care of hierarchical memory organization. So let us have a look at the key characteristics of computer memory systems. First, start with the location. Location means where the memory is located. It can be, say, on chip: nowadays the CPU is implemented on a single chip, the CPU on a chip is commonly referred to as a microprocessor, and microprocessors are used as the central processing unit in present day computer systems. Since the CPU is realized on a chip, some memory is present on chip; in particular, apart from the ALU, the registers, which are a kind of memory, are present on chip. So the first possible location where memory can be present is on chip. Now, whenever you build a system, obviously you cannot build it using just the CPU; you will require main memory and other types of memories. Where will those be present? One possibility is on board. On board means you make a printed circuit board on which you put your central processing unit and the main memory and other types of memories. So the second type of memory, particularly main memory, can be on board. The third type of memory can be external.
External means it is not on the chip and not on the board, but outside. Outside means you have to connect it, usually through what is known as an I/O bus; for example, hard disks and other such memories are external and are usually connected through the I/O bus. So we can say there are broadly three possible locations, on chip, on board and external, and obviously the time required to access memory will depend on where it is located. The next important characteristic is capacity. The capacity of a memory is usually specified in terms of bytes, but there are several possible units. One unit is the number of words present in a particular memory system; for an 8 bit processor the word size can be 8 bits, for a 16 bit processor it can be 16 bits, and for a 32 bit processor it can be 32 bits. So based on the word size, the capacity can be specified in terms of the number of words, say 32 kilowords. However, just from "32 K" you cannot tell how many bits or bytes are present unless the word size is also specified. That is why nowadays the common approach is to use the number of bytes as the unit of capacity, and you will see that subsequently we shall always be using the number of bytes as the capacity of a memory. So if the word size is, say, 4 bytes, then 32 K words is essentially 32 × 4 = 128 kilobytes. You have to specify the capacity in terms of bytes, and in that case it has to be represented in this way. This is how capacity is represented. Then comes the access method; there are various types of access methods which can be used for different types of memory.
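The word-to-byte conversion just described can be sketched as a quick calculation (the figures are the illustrative ones from above):

```python
# 32 K words with a 4-byte (32-bit) word size
word_count = 32 * 1024
word_size_bytes = 4

capacity_bytes = word_count * word_size_bytes
print(capacity_bytes // 1024, "KB")  # -> 128 KB
```

The same "32 K" label would mean only 32 KB for an 8 bit word, which is exactly why the word size must accompany a word count.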
The most common access method, number one, is random access. What do we really mean by random access? By random access we mean that wherever the byte or word is present, it can be accessed in the same time; the same time will be required for access irrespective of its location. For example, you have got a memory which is linearly organized, starting with location 0000H and going up to FFFFH in hexadecimal; that gives you 64 K bytes of memory. Now, irrespective of where a word is located, whether it is the 0th location or the last location, it can be accessed in the same time, and that is why it is called random access. The location is fixed for a particular word; it is specified with the help of the address, that is, the address specifies where the data is present, and any address can be accessed in the same time. Another possibility is sequential access. In the case of sequential access, memory is organized in such a way that you have to access it serially, as happens in your tape or your CD ROM: you have to access one item after the other, starting from a particular point, and only after you reach the desired point will you be able to access the data. In such cases you will see that the access time cannot be specified as a fixed time. In random access the time is fixed, but in sequential access the location of the data also decides how much time will be required to access it. So it will be a variable time depending on the location; that happens in the case of tape, and also in the case of disk to some extent. Then the third type of access is known as associative. In the case of random access we have seen that for a given address the location is fixed; where the data is present is fixed.
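The contrast between the two access methods can be sketched with a toy timing model; the constants below are made up for illustration, not values from the lecture.

```python
# Toy model: random access takes the same time for every address, while
# sequential access has a fixed part plus a position-dependent part.

def random_access_time(address):
    return 100e-9  # 100 ns regardless of the address (illustrative)

def sequential_access_time(position, fixed=5e-3, per_item=1e-6):
    return fixed + position * per_item  # depends on where the data sits

# same time for the first and last location of a 64 KB random-access memory
print(random_access_time(0x0000) == random_access_time(0xFFFF))  # -> True
# but far from equal for a sequential device
print(sequential_access_time(0) < sequential_access_time(10_000))  # -> True
```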
Now, in the case of associative memory, the location can be decided based on content; a part of the content can be used to access the data, and that is why it is also called content addressable. That means the location is not fixed for a particular data item; it can be present anywhere, and part of the content will decide where it is located. Later on, when I discuss cache memory, you will see that this type of associative memory is used, and we shall see how a part of the content can be used to access it. Then comes performance. Performance plays a very important role, and obviously you have to specify what you really mean by performance. The CPU is trying to access either data or instructions from the memory. So what will the CPU do? The CPU will generate an address for that instruction or data, and it has to wait for some time to get the data; that means the memory system will take some time to provide the data to the processor, and that is why the access time is very important: the time from the beginning of providing the address to the moment the correct data is available on the data bus. The access time can be specified in nanoseconds, microseconds or milliseconds, depending on the type of memory being used. For example, for main memory systems it can be tens to hundreds of nanoseconds, and later on, when we discuss cache memory, you will see that its access time can be a fraction of a nanosecond, because it depends on the technology that is used to implement that memory. Performance can also be specified in another way, as the transfer rate, the rate at which data transfer can take place.
If the access time is specified in seconds per access, then 1 divided by the access time gives the transfer rate, the rate at which transfer can take place. Now, as I mentioned, for random access memory the transfer rate will be fixed irrespective of the location. On the other hand, whenever access is sequential, the access time will have two parts, a fixed part and a variable part, where the variable part depends on the exact location and on the number of bytes being transferred. So in the case of sequential access the transfer rate is variable; it cannot be fixed. Then comes the physical type. Physical type means how the memory is realized, that is, the technology being used. As you know, nowadays we use semiconductor memory for our main memory and cache memory; semiconductor technology is used to implement the processor, and the same technology is used to implement the main memory and cache memory, and as a consequence there is good electrical compatibility. Depending on the realization, semiconductor RAM can be of two types, as you know: one is static RAM and the other is dynamic RAM, and depending on that, the access time or cycle time will be different, and accordingly we shall use them for different purposes. So semiconductor memory is very common, and it is used because of its high speed of operation. Another possibility is the magnetic technique; for example, in hard disks and magnetic tapes the data is stored with the help of magnetic techniques. A magnetic medium is used to store the data, and in such cases you will find that the access time will be longer. Another technique is the optical technique, which is used in your CD ROM. So these are the different physical types that can be used for memory in your computer system. Then come the physical characteristics.
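The relation between access time and transfer rate for a random access memory can be sketched numerically (the figures are illustrative, taken as typical main-memory values):

```python
# A random-access memory delivering one 4-byte word per access.
access_time_s = 100e-9   # 100 ns per access (typical main-memory figure)
bytes_per_access = 4

transfer_rate = bytes_per_access / access_time_s  # bytes per second
print(round(transfer_rate / 1e6), "MB/s")  # -> 40 MB/s
```

Halving the access time doubles this rate, which is one reason a fast cache in front of main memory pays off.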
Physical characteristics: we have seen that we can use different types of memory, and their characteristics can be different; they are usually classified into two types, the first being volatile. What do we mean by volatile? It means that as long as power is present, the information is retained, but as soon as you withdraw the power, the information is lost, as happens in your semiconductor memories, particularly static and dynamic RAM. So volatile is one of the physical characteristics, but you will see that in your computer memory system you will always require two types of memories: you will require volatile as well as non-volatile. By non-volatile I mean that the information will not be lost even if you withdraw power; when the power supply is withdrawn, the information will still not be lost, which means it can be retained for a long duration and can be subsequently used and reused. So non-volatile is another very important property; in particular, magnetic memory devices and optical memory devices belong to this category, and among semiconductor memory devices you can also realize non-volatile memory, which is known as read only memory, ROM. So physical characteristics can be broadly divided into two types, volatile and non-volatile. Then comes the organization. Let me specify the organization in terms of the memory that is being used, particularly semiconductor memory. We have seen that memory is usually organized in terms of bytes or words; in particular, the CPU will access it in terms of words, but a word may be several bytes.
Now, on the other hand, a memory chip can be organized internally in a different way; there are several alternatives. It can be bit organized. What do we mean by bit organized? You provide some address, say an m bit address, and the chip delivers its data in terms of bits, one bit at a time, usually on a bidirectional data line. So inside the chip you have got 2 to the power m memory locations, each containing one bit, and that is why it is called bit organized. Now, whenever you have to realize an n bit memory system, where n is the number of bits in the word, you have to access n bits in parallel. How will you do that? You will be using several such memories; n such memory devices will be required, so one will provide the first bit, another will provide the next bit, and so on, and in this way you will get n bits. This arrangement is called a memory bank. So if the chips are bit organized, to realize n bits you will require n chips working simultaneously. Now, it is not always necessary that memory devices will be bit organized; they can be bit organized or nibble organized. What is the difference between bit organized and nibble organized? In the case of nibble organized, you can access 4 bits simultaneously; that means for m address lines, internally the chip is organized as 2 to the power m into 4. As a consequence, if you have to realize n bits of memory, you will require n by 4 such memory chips in parallel. Here also you will require a memory bank, but the number of devices required will be smaller compared to the bit organized case. Similarly, if the chip is byte organized you will get 8 bits at a time, and if the number of address lines is again m, internally the chip is organized as 2 to the power m into 8.
So you will also require a memory bank whenever it is byte organized; for example, if the word size is 32 bits, you will require 4 such memory chips to get 32 bits: each chip will give 8 bits, and in this way you will get 32 bits from 4 memory chips. This is how memory banks are realized, and in all modern computer systems you will find that this is how the memory is organized; in particular, dynamic RAMs are usually bit organized, while static RAMs may be nibble organized or byte organized. So whenever you design a memory system, you have to take these characteristics into consideration. Now let us have a look at the key characteristics of memory systems: number one, storage capacity in megabytes versus access time in seconds for different types of memories. A computer system will usually require different types of memories. Now, what are their important relative characteristics? Here we find that on the x axis we have the storage capacity of the different types of memories being used, and on the y axis we have the access time. We find that when the access time is very small, say 10 to the power minus 7 seconds, the capacity is very small. So when the access time is small, or rather when the speed is high, the capacity is small. On the other hand, we can see that as the access time gets larger, the capacity is more: hard disks have larger capacity, and magnetic tapes and the like have still larger capacity, but their access time is very large.
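The chip counts above follow directly from the chip's output width; a small sketch:

```python
# Number of chips working in parallel in a memory bank to supply an n-bit word,
# given the output width of each chip (1 = bit, 4 = nibble, 8 = byte organized).
def chips_needed(word_bits, chip_width_bits):
    return word_bits // chip_width_bits

print(chips_needed(32, 1))  # bit organized    -> 32 chips
print(chips_needed(32, 4))  # nibble organized -> 8 chips
print(chips_needed(32, 8))  # byte organized   -> 4 chips
```

All the chips in the bank share the same m address lines; only their data lines are concatenated to form the full word.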
So you find that the larger the capacity, the slower the device; this is a very important observation which will be used in implementing hierarchical memory organization: the larger the capacity of a particular type of memory, the slower it is. This will play a very crucial role in implementing hierarchical memory organization. Now the second observation concerns cost. We find that on the x axis we have the years and on the y axis we have the procurement cost. We find that RAM and ROM are costlier, although their cost is gradually decreasing over the years because of the advancement of technology; process technology is improving, so memory is getting cheaper and cheaper. But if you look at their relative positions, we find that RAM and ROM, that is, your semiconductor memory devices, are costlier than magnetic disks, and magnetic disks are costlier than optical or magneto optical disks. So the costs are decreasing over the years, but the relative positions have remained the same; the observation is: higher capacity, lower cost per byte. So the first observation was that the larger the capacity, the slower the memory, and the second is that the larger the capacity, the cheaper it is, that is, the lower the cost per byte. These two observations play a very important role in implementing hierarchical memory organization. Another very important observation is how the performance has changed over the years for the processor and the memory system. Here by memory we mean dynamic RAM; dynamic RAM has been used as the representative memory, and you can see that the performance of the processor is improving rapidly compared to the memory system.
It is not that the performance of the memory system is not improving, but the improvement is very slow; the rate at which the performance of memory devices improves is very slow compared to the rate at which the performance of the processor improves. The processor performance has been improving at a much higher rate, roughly 50 percent per year, while for dynamic RAM it has been only around 7 percent per year. So the rate of improvement is much lower for memory. What does it really mean? It means that, say, in the year 1980 the performance of the processor and the performance of the dynamic RAM were comparable, but this has changed over the years, and you can see the gap gradually becoming wider and wider. So the question naturally arises: how to bridge the gap? The processor has to access instructions and data from memory, and if the processor is becoming faster and faster while memory is not becoming faster at the same rate, the gap keeps widening. We have to bridge this widening gap by using a suitable technique, and hierarchical memory organization is that technique. So now we have the necessary background for hierarchical memory organization. Let us see its basic objectives. The first objective is that the memory should be fast: whenever you want to have a memory system, your first objective is that you want it fast, that is, compatible with the processor. Whenever it is compatible with the processor, the processor will not incur any wait cycles or delay; it will get data from the memory at the rate at which it needs it, if the speed is compatible. So you want the memory to be as fast as the CPU; that is your first objective. The second objective is that it should be large. So fast is the first requirement, and, as we have seen, you have got different types of memory to choose from.
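How a constant difference in yearly improvement rates widens the gap can be sketched numerically; the rates here are illustrative, roughly the figures popularized by Hennessy and Patterson, not exact values from the lecture.

```python
# Compounding two different yearly improvement rates from an equal start.
proc_rate, mem_rate = 1.50, 1.07  # ~50%/year processor vs ~7%/year DRAM

gap = 1.0  # start from equal performance, as in 1980
for year in range(10):
    gap *= proc_rate / mem_rate
print(round(gap))  # after 10 years the processor is roughly 29x ahead
```

Because the ratio compounds every year, even a modest difference in rates produces an exponentially widening gap, which is exactly what the plot shows.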
So the first objective is the speed of the fastest available memory; the second objective is that the memory has to be large. Obviously this requirement arises from the programmer's demand: as you know, the size of code is increasing over the years and the problems are becoming more and more complex, so you will require more memory to store your programs, and obviously you will require a large memory. What is the basic objective? It should be as large as the largest memory available. Whenever we say fast, we mean that the speed should be as close as possible to that of the fastest memory; as you know, semiconductor memory devices are the fastest, so the speed should be close to that of semiconductor memory devices. On the other hand, whenever we say large, we know that your secondary or backup memory, magnetic tape, has the largest capacity, so the capacity should be close to that of magnetic tape. But can we get these two together, that is, both fast and large? We want both, but unfortunately, as we have seen, the fastest memory is the costliest and the smallest in size, while the largest memory is slower and cheaper. And the last requirement is that we have to get all of this at an optimum cost. Suppose you decide to use only semiconductor memory, which is very fast, in a very large size; then the cost will be very high, so that will not serve your purpose. On the other hand, if you decide to use only the largest memory, then it will be slow and will not serve the purpose either. That means we have to devise a mechanism such that the speed will be close to that of the fastest memory, the size will be close to that of the largest memory, and the cost will be optimum, that is, not very high. That is the basic objective of hierarchical memory organization, and with this basic objective, memory has been organized in a hierarchical manner, and this is how it is done.
Say we shall be using different types of memories: in level 0 we shall be having registers, in level 1 we shall be using cache memory, in level 2 we shall be using main memory, and in level 3, secondary memory. So I have shown here 4 different levels of memory, and since they are organized in a hierarchical manner, let us see how their parameters vary; different parameters will be different for different types of memories. One parameter is access time. Here we have organized the memory hierarchically: this is the 0th level, this is the 1st level, this is the 2nd level, this is the 3rd level, and you may have a 4th level. Now, if we consider a particular level as the ith level, then above it is the (i minus 1)th level and below it is the (i plus 1)th level. So how does the access time vary when you organize memory in this hierarchical manner? Let t i be the access time of the ith level; we have seen that the top level is the fastest and the bottom level is the slowest. So t i plus 1 will be more than t i; that means the access time increases as we go downwards, and access time is more for the larger memories. The second parameter is the cost per byte, say c i for the ith level and c i plus 1 for the next level. Which one will be more? We know that as we go down, the cost per byte becomes lower; that means the cost per byte of an upper, faster level is more than that of a level below it, so c i is more than c i plus 1. What about the memory size, s i and s i plus 1? As we go downwards the size increases, so s i plus 1 will be larger than s i.
So the memory size will be larger for the (i plus 1)th level compared to the ith level. Then comes the transfer bandwidth, b i and b i plus 1. What do we really mean by bandwidth? Bandwidth means the rate of transfer, the rate at which transfer takes place between two adjacent levels. The rate of transfer will obviously be higher between the upper levels, so b i will be more than b i plus 1. Then comes the unit of transfer, x i and x i plus 1. What do we really mean by unit of transfer? You see, we can transfer one bit, one byte, or one word; whenever we transfer, what is the minimum size that we transfer? You will see that between the registers and the cache memory the unit is usually words, between the cache memory and the main memory it will be blocks, and between the main memory and the secondary memory it will be some other unit, maybe pages, and their sizes gradually increase. That is why x i plus 1 will be more than x i; the unit of transfer becomes larger as we go down the hierarchy. So these parameters apply whenever memory is organized in a hierarchical manner. In summary, as we go from a lower level to a higher level, that is, from the registers towards secondary memory, the access time increases, the cost per byte decreases, the capacity increases, and the frequency of access decreases, and these characteristics will be maintained whenever you realize a hierarchical memory organization. In addition to that, three important properties must be maintained. What are the three properties? Number one is inclusion: say here you have got the registers, let us assume this is your cache memory, and here is your main memory.
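The parameter relations just described can be checked on some illustrative figures; the numbers below are my own rough guesses for a present day system, not values given in the lecture.

```python
# (name, access_time_ns, cost_per_byte, size_bytes) per level, fastest first.
levels = [
    ("registers",   0.25, 100.0, 500),
    ("cache",       1.0,   10.0, 64 * 1024),
    ("main memory", 100.0,  0.1, 512 * 1024**2),
    ("disk",        5e6,   1e-4, 500 * 1024**3),
]

# verify t_i < t_{i+1}, c_i > c_{i+1}, s_i < s_{i+1} for every adjacent pair
for (_, t1, c1, s1), (_, t2, c2, s2) in zip(levels, levels[1:]):
    assert t1 < t2 and c1 > c2 and s1 < s2
print("hierarchy relations hold for all adjacent levels")
```

Any table of real devices you plug in should satisfy the same three inequalities; that monotonic trade-off is what makes the hierarchy worthwhile.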
Now the inclusion property says that if a particular data item is present at a lower level, it must also be present at the higher levels; there cannot be any data which is present at a lower level but not present at a higher level. This is the inclusion property, and it guarantees that any data you find at an inner level of the memory hierarchy has to be present at the outer levels as well. It can be specified as M 1 is a subset of M 2, which is a subset of M 3, and so on; if you have got n levels, this relation will be maintained across all of them. In other words, a subset of the data present in the main memory will be present in the cache memory, and a subset of the data present in the cache memory will be present in the registers. So this is called the inclusion property, which must be maintained whenever you organize memory in a hierarchical manner. The second property is coherence. Say here you have got one copy of a particular byte, another copy is present here, and another copy here: initially it was present in the main memory, from where it was transferred to the cache memory, and from the cache memory it was transferred to the registers. Now the coherence property, also called consistency, says that the copies must be consistent, that is, identical. What does it really mean to say that all the copies of the same data present at different levels have to be identical? It means that if the CPU modifies the copy in a register, it is essential that the modification also reaches the cache, and it is also essential that it reaches the main memory. You know that the registers are very close to the processor, or rather are present inside the CPU, so whenever it executes a program, modifications take place in the registers. As that happens, it becomes necessary to apply the same modification at the higher levels, so that all copies remain identical.
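The inclusion property M1 ⊆ M2 ⊆ M3 can be sketched with Python sets standing in for the levels; the block addresses here are made up for illustration.

```python
# Each set holds the (hypothetical) block addresses resident at that level.
registers   = {10}
cache       = {10, 11, 12, 13}
main_memory = {10, 11, 12, 13, 20, 21, 30, 31}

# inclusion: everything at an inner level also exists at the outer levels
assert registers <= cache <= main_memory  # M1 ⊆ M2 ⊆ M3
print("inclusion property satisfied")
```

If the cache held a block that main memory did not, the chained subset test would fail, which is exactly the situation inclusion rules out.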
This is the second property to be maintained. The third property is locality of reference. I have already told you about locality of reference: one is temporal locality, another is spatial locality, and the third is known as sequential locality. Temporal locality, as I told you, means that if you use some data or program instructions at this moment, you will be using them again in the near future; they will be reused. That is quite common, because if you are running a particular application right now, the same application will be running in the near future; just as, if you are reading a book at a particular point in a semester, you will be reading the same book again in the near future. Second is spatial locality: some books related to the book you are reading, say other books related to a computer organization book, you will be reading in the near future. This is somewhat similar to adjacent memory locations which you are not using at this moment but which are likely to be used in the near future; as I have already told you, this is spatial locality. The third locality is somewhat similar to spatial locality but a little different, and it arises particularly in the context of the execution of instructions. We know that a program is nothing but a sequence of instructions, and normally the instructions are stored in contiguous memory locations. So when you are executing a program, it is very likely that you will be fetching instructions from sequential memory locations. Spatial locality is applicable to data, whereas sequential locality is applicable to the instructions present in the memory. Whenever there are branches, subroutine calls or interrupts, the access will not be sequential, you will fetch from non-sequential memory locations, but most of the time you will be fetching from sequential memory locations.
So this locality of reference is followed in memory accesses by the processor, and it will be exploited in implementing hierarchical memory organization. Based on this, a typical example is given here. You can see you have got the registers, which are part of the CPU, the cache memory, the main memory, and the hard disk. Typical sizes are given, and you can see the size gradually increasing: 500 bytes may be the size of the registers, the cache memory size is 64 kilobytes, the main memory size is 512 megabytes, and the hard disk capacity is 500 gigabytes. So you can see that the capacity grows by roughly 3 orders of magnitude at each step: the cache memory is about 3 orders of magnitude larger than the registers, the main memory is again about 3 orders of magnitude larger than the cache, and similarly the hard disk is about 3 orders of magnitude larger than the main memory. That means the size becomes bigger as you go down. Similarly, you can look at the typical speeds of present day memory devices: the registers can be accessed with an access time of 0.25 nanoseconds, the cache memory with an access time of 1 nanosecond, the main memory with an access time of 100 nanoseconds, and the hard disk with an access time of 5 milliseconds. These are typical values, though as I have said they vary somewhat. So this is how hierarchical memory is organized. With this introduction, in my next lectures I shall first discuss the cache memory organization, which is the first level of the hierarchical memory organization and has been used in all contemporary computer systems. Thank you.
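Using the typical access times quoted above, the speed ratio between adjacent levels can be computed directly:

```python
# Typical access times from the example above, in nanoseconds.
access_ns = {"register": 0.25, "cache": 1.0, "main": 100.0, "disk": 5e6}

# the ratio between adjacent levels shows how much each level shields the next
print(access_ns["cache"] / access_ns["register"])  # -> 4.0
print(access_ns["main"] / access_ns["cache"])      # -> 100.0
print(access_ns["disk"] / access_ns["main"])       # -> 50000.0
```

The enormous jump at the main-memory-to-disk boundary is why the higher levels must satisfy the vast majority of accesses for the hierarchy to appear fast overall.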