So, we talked about a couple of static prediction techniques. One technique was very simple: you just decide to say the same thing for every branch, either always not taken or always taken. For loop branches we said that always taken is going to be good: if you have a loop of n iterations, we mispredict only the last one, so fairly high accuracy. A slight improvement upon that is forward not taken, backward taken. Here the "backward taken" decision is tailored to loop branches, which are mostly-taken backward branches, whereas "forward not taken" comes from the fact that in if-else type constructs, control mostly falls into the if block. Now, these are all static techniques: you can fix the prediction at compile time. By looking at the source code you can say, for a backward branch I am going to say taken all the time, and for a forward branch I am going to say always not taken. There is nothing to do at run time, so the prediction is static. However, that is not enough. The problem arises as your pipeline gets deeper. So far we have talked about a short pipeline, but today's processors have much deeper pipelines. And the reason for having a deeper pipeline is frequency. Why? For the simple reason that as you cut a pipeline stage into smaller sub-stages, your cycle time, which is determined by the longest stage, naturally goes down, so you can clock the processor at a higher frequency. So, here is an example just to convey what really happens if you have a deeper pipe than what we have discussed, although even this is not a very deep pipe. After instruction fetch, the MIPS R4000 takes two pipe stages to compute the branch target and one more to evaluate the condition. So essentially what I am saying is that you have the instruction fetch stage, and then three more stages until you know for sure the target and the condition.
So, the target is known here and the condition is evaluated here, and then of course you have more pipe stages to access memory, etcetera, which are not important for this example. Let us assume that we have a program with 4 percent unconditional jumps, 6 percent not-taken conditional branches and 10 percent taken conditional branches. We will ignore everything else; we assume everything else has a CPI of 1. The question is to evaluate the CPI increase for three schemes. The first is unconditional flush: whenever you come across one of these two types of instructions, an unconditional jump or a conditional branch, you insert nops in the pipeline until you know the target for unconditional jumps, or the target and the condition for conditional branches. The second option is predict always taken: whatever the branch is, you predict taken. And the third option is predict always not taken. So, you have to evaluate these three options in terms of CPI. Let us do that. First you have to figure out how many bubbles you need to insert in each case. For unconditional jumps, let us focus on the first scheme, unconditional flush: whenever you come across an unconditional jump or a conditional branch instruction you insert nops. For an unconditional jump, how many nops should I insert? Two, right, because the target is not known until here: I do not know what to fetch here and here, but I do know here. So those are the two bubbles. Unconditional jump instructions under the unconditional flush model have a two-cycle branch penalty, so CPI increases by two cycles for each of them. So, delta CPI for unconditional flush: we have 2 times 0.04. Plus, what about the conditional branches, taken or not taken? How many bubbles in the unconditional flush model?
Three, right. I have to wait until the condition is evaluated before I know what to do; only then can I fetch the correct instruction. Whether the conditional branch turns out taken or not taken, under unconditional flush I have a branch penalty of three cycles in both cases. So, 3 times 0.16, that is, 0.06 plus 0.1. What do I get? 0.56. That is the CPI increase for unconditional flush. Now let us take up the next one, predict always taken: delta CPI for always taken. What about unconditional jumps, how many cycles do I lose under this prediction model? It must be between 0 and 2; which one is it? Can it be 0? I do not have an oracle which can tell me the target; I have to wait two cycles to get the target, right. So, 2. What about not-taken conditional branches, how many cycles do I lose? Three, right: I lose everything, because I make a wrong prediction. So 3 times 0.06. What about taken conditional branches? Two, right: I still have to wait two cycles until I get the target, but I save the third cycle, of course, because the prediction is correct. So 2 times 0.1. How much does it come to? 0.46. So you can see that we saved something: a delta CPI of 0.1, which is a big difference. What about the other one, always not taken? What is your intuition, which one is going to be better, looking at this instruction mix? You have 10 percent taken branches, right, so always saying taken should be better; the program is biased that side. But let us see what it comes to. What about unconditional jumps? Two. Not-taken branches? Two. Taken branches? Three. How much does it come to? 0.5. So, it depends on your program which scheme is going to be better. And you can guess what happens if my pipe gets even deeper.
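The arithmetic above is easy to check with a short script; the per-class penalties (in bubble cycles) below are the ones worked out in the lecture for this particular pipeline:

```python
# Delta-CPI for the three schemes, using the branch penalties (in bubble
# cycles) derived in the lecture: target known 2 stages after fetch,
# condition known 3 stages after fetch.
mix = {"jump": 0.04, "cond_not_taken": 0.06, "cond_taken": 0.10}

penalty = {
    "flush":             {"jump": 2, "cond_not_taken": 3, "cond_taken": 3},
    "predict_taken":     {"jump": 2, "cond_not_taken": 3, "cond_taken": 2},
    "predict_not_taken": {"jump": 2, "cond_not_taken": 2, "cond_taken": 3},
}

for scheme, p in penalty.items():
    delta_cpi = sum(p[k] * mix[k] for k in mix)
    print(f"{scheme:18s} delta CPI = {delta_cpi:.2f}")
# flush 0.56, predict_taken 0.46, predict_not_taken 0.50
```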
If I chop these pipe stages further into halves to gain frequency, my branch penalty is going to increase even more. So that justifies this particular statement: deeper pipelines increase branch penalty, and so you must have better branch predictors for deeper pipes. If you are really targeting a very fast clock frequency for your processor, you had better have a smart team that can design good branch predictors. And this is just one requirement of a deep pipe; we will see many more requirements as we go along. Just for getting good frequency, if you make the pipe deeper, you are raising demands in other departments, so keep that in mind. Of course you will reduce your clock cycle time, but you will gradually increase your CPI, and remember that performance depends on the product, CPI times cycle time, given a constant number of instructions. Any question on this example? So, I will cover one more compiler technique before we go on to dynamic branch prediction, and this again comes from MIPS. The MIPS engineers soon figured out that filling a branch delay slot is problematic: often you will find that you have no instruction to fill it with, because you cannot prove that the filled slot is correct. One option, of course, is that if the compiler could predict which way a branch is going, then it could fill the slot from the predicted path. But if you cannot prove correctness, you have to be conservative, meaning that you actually fill the delay slot with a nop. So, some ISAs provide nullifying branch, or branch-likely, instructions. What is that? The compiler encodes the predicted direction in the instruction and fills the delay slot accordingly. The compiler, when compiling the program, makes a prediction, which may or may not be correct, and based on that prediction it fills the delay slot.
It will actually pull an instruction from the predicted path and put it in the delay slot, which may actually be wrong. Now, if at run time the branch turns out to behave otherwise, the delay slot is squashed: when the program finally executes, you find that the compiler made a wrong prediction, which means the delay slot should not actually be executed, so at that time you cancel the delay slot; you turn it into a nop at run time. Why is this any better than not having such an instruction? Sometimes the prediction will be correct, right. It gives the compiler more freedom, more leeway to be aggressive: it can make a prediction and have more options to fill the delay slots, some of which will actually be correct. So, MIPS offers an instruction called cancel-if-not-taken branch: if the compiler thinks the branch will be taken, it can fill the delay slot from the target, and of course at run time, if the branch turns out not taken, the delay slot is cancelled. Why didn't they have a cancel-if-taken branch instruction? They have only one flavor, cancel-if-not-taken. There is a hint here within the parentheses; can somebody decrypt that? Suppose most branches are taken — then why shouldn't I have a cancel-if-taken instruction? Because such an instruction helps only when a branch falls through, and for a loop branch that happens only once, on the last iteration, when you fall out of the loop. So this decision was taken to favor loops, which form a large body of code; programs usually run loops a lot. They took statistics and found that cancel-if-taken is not that useful, which is why they didn't support it: MIPS has only one branch-likely instruction, cancel-if-not-taken. So, is this branch-likely scheme any better?
It is definitely a big improvement over not having such an instruction and forcing the compiler to fill the slot only when it can prove correctness. So, now we want to look at dynamic prediction techniques. Just to formalize the notion: this comes from the general phenomenon called control dependence. Roughly every fifth instruction is a branch; that is a known statistic from program analysis. And you need to be on the right control-flow path; that is very important for executing a program, because this is the source of input to the pipeline. This determines whether the pipeline is wasting time or doing useful work: if your input is bad, the pipeline is wasting time. So you want high-quality input, meaning you want to be on the right control-flow path as often as possible. Static techniques are not enough, because you need highly accurate predictors, especially when you go for very deep pipelines. And there is a need to speculate past branches, meaning you predict a branch, start fetching from the predicted path, and while you are fetching on the predicted path you may encounter one more branch, which you should be able to predict also, and go along the predicted path again. So you may have multiple predictions outstanding; until you resolve the first ones you really don't know where you are going, but you should be able to do this with high accuracy. The Alpha 21264 — that company is no longer there, actually — allows 20 outstanding branches, meaning you can have 20 unresolved predictions. So that is a limit of 20: you predict a branch, start fetching along the predicted path, encounter another branch, predict that, and depending on this prediction you start fetching again, encounter another branch, predict that, and so on up to 20. When you reach 20, no more: you have to stall at that point; you cannot fetch past any more branches.
So, on encountering the twenty-first branch, the front end of the pipeline will stall, waiting for the first branch, or at least one of these branches, to resolve. The MIPS R10000 allows only 4, and this number is actually very important; we will soon see why. So, you need to speculate past predicted branches in deeper pipelines, which was not a big issue in the pipeline we have talked about: there you will never encounter the situation that you are on a predicted path and encounter another branch. Why is that? If you remember how our pipeline was, we make the branch prediction in this particular stage. I will fetch something here, but the branch resolves at this stage, so next cycle I know where to go. I will fetch exactly one instruction from the predicted path, and of course there is a likelihood that this instruction is itself a branch, but it will not be predicted, because it never gets to the decode stage if the first branch is mispredicted. So you will never have two unresolved predictions at once. But if you insert a few more stages here, you can easily see that that is exactly what will happen. So, prediction accuracy is of course important, and you want it to be high. Here is a very quick analysis just to show why this particular number is very important. Say the probability of a correct prediction is p, and assume the predictions are independent; then the probability of staying on the correct path after n predictions is p to the power n. Now plug in n equal to 20: to make any sense of 20 outstanding predictions, I want p to the power 20 to be at least 0.5.
So, can you imagine what the value of p must be if I want p to the power 20 to be greater than 0.5? It turns out to be about 0.97. So I am demanding a branch predictor with 97 percent prediction accuracy to make any sense of this number 20. If instead you solve for p to the power 4, p comes to about 0.85, which is probably achievable. So this number is very important: you cannot just increase it arbitrarily; that will hardly improve your chance of being on the right path, and very soon you will be on the wrong path for sure, unless you have a highly accurate predictor. Keep this in mind. Essentially, we are going to design predictors whose goal is to make p as high as possible. Is this analysis clear? So, let us first try to define the problem more formally. The problem of direction prediction is essentially the design of an estimator that, given an n-bit history, tells us the next most likely outcome for a particular static branch instruction: I give you a branch instruction and its n-bit history, and I ask you to design an estimator which tells me the most likely outcome of the next execution of this branch. Is this particular formulation clear to you? So, what all branch predictors do is estimate the probability of seeing a 0 or a 1 given the recent pattern history h of some limited length n: you have seen the history h of length n, and you ask whether a 0 or a 1 is more likely to come after it. Suppose the number of times 0 appears after h is c0, and define c1 similarly. The prediction is 1 if c1 is greater than or equal to c0, and 0 otherwise. Does it make sense? If I have seen more 1s after h than 0s, I say that the most likely next outcome is 1.
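The numbers in this p-to-the-power-n analysis fall out of a one-line computation:

```python
# Required per-branch prediction accuracy p so that p**n >= target,
# i.e. the probability of still being on the correct path after n
# outstanding (assumed independent) predictions.
def required_accuracy(n, target=0.5):
    return target ** (1.0 / n)

print(round(required_accuracy(20), 2))  # ~0.97 for 20 outstanding branches
print(round(required_accuracy(4), 2))   # ~0.84 for 4 (MIPS R10000)
```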
So, instead of actually keeping two counters, we maintain the difference c1 minus c0. Is that clear? Let the difference counter be c_h for a certain history h. Since we have to build this in hardware, we must decide the size of these counters: let c_h be k bits long. This is independent of the history length, remember; it has nothing to do with how long h is, which may be unbounded. So what does it mean? It means c_h can count from 0 to 2 to the power k minus 1; it has a finite range. On seeing a 0 after h you decrement c_h, on seeing a 1 after h you increment c_h, and it saturates at the boundaries, which means it is a saturating counter: it does not decrement below 0 or increment above 2 to the power k minus 1. So, by examining c_h at any point in time, what can I say? I can say which outcome had the higher likelihood in the last 2 to the power k occurrences of history h, because I can only count in this range. You have to shift the origin to the midpoint, 2 to the power of (k minus 1), and look at the distance from the midpoint: if you are below the midpoint, you know you have seen more 0s than 1s; if you are above it, more 1s than 0s. However, you can say that only about the last 2 to the power k occurrences: suppose you keep seeing 1s; you go up, saturate at the top, and do not move any more, so you cannot really say what happened beyond that point. So, if c_h is below the midpoint the prediction is 0, otherwise 1. Here is an example. I have a counter; suppose k equals 10. What is the midpoint? 512. And I have a history corresponding to this particular counter; the history can be something like 1 0 1 0 0 0 1, etcetera. Suppose I fix 8 bits of history.
This is my h. What this counter maintains is: whenever this history h occurs, if a 0 comes next, the counter is decremented; if a 1 comes next, it is incremented. Now, suppose at some point in time I find the counter value is 20. What can I say? 512 minus 20 is 492: I have seen 492 more 0s than 1s after h. That is the deficit. But I can say that for sure only over this span, the last 1023 occurrences of h; I cannot see beyond that, because of the limited length of the counter. If you get all 1s, you come up here and saturate; you can still predict, and whenever you then start seeing 0s you will start moving back in the other direction. So the rule is what it says: if c_h is below the midpoint the prediction is 0, otherwise 1. Any question on this? Are we done? Do we have a branch predictor? Not yet. We still have to determine one small thing: how many different histories do we want? Here we have focused on only one counter corresponding to one history. Now, each branch can have millions of different histories, and there may be thousands of branches. So, let us start with the simplest option: no history at all. What does that mean? You just have a global counter that counts occurrences of 0s and 1s: there is no history; whenever I see a 0 I decrement the counter, whenever I see a 1 I increment it. It is not very useful, actually, because all it tells you is, of however many branches I have seen, the difference between how many were taken and how many were not. Of course, based on that I could make a prediction: this program seems to be biased towards taken branches, so I say taken when the value of the counter is above the midpoint, and not taken otherwise.
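A k-bit saturating difference counter of this kind takes only a few lines to model; here is a minimal sketch, with the predict threshold at the midpoint 2^(k-1) as described above:

```python
class SaturatingCounter:
    """k-bit saturating counter: decrement on 0, increment on 1,
    clamped to [0, 2**k - 1]; predict 1 iff at or above the midpoint."""
    def __init__(self, k, init=None):
        self.max = (1 << k) - 1
        self.mid = 1 << (k - 1)
        self.value = self.mid if init is None else init

    def update(self, outcome):          # outcome is 0 or 1
        if outcome:
            self.value = min(self.value + 1, self.max)
        else:
            self.value = max(self.value - 1, 0)

    def predict(self):
        return 1 if self.value >= self.mid else 0

# Feed it an outcome stream and watch the predictions adapt.
c = SaturatingCounter(k=2, init=0)
for outcome in [1, 1, 0, 1, 1]:
    print(c.predict(), end=" ")
    c.update(outcome)
```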
So, based on the last 1024 branches I have seen, I make a prediction based on the overall population of outcomes. This is not used in any processor, because it is not very accurate, as you can guess. So, what is the next option? You can improve this a little further: still no history, but instead of one counter for the whole program, you give each branch its own counter, so that a branch that is strongly biased one way is predicted from its own behavior. This is called a bimodal predictor, and it is actually used in many processors. It is not exactly one counter per branch; you actually use a hashed table. The bimodal branch predictor is essentially a table of saturating counters: you take the branch PC, shift out the last two bits as always, and take it modulo the number of counters you can afford, say 2 to the power n. So I have 2 to the power n counters, each k bits: that is my bimodal branch predictor. This predictor works extremely well for branches which are strongly biased taken or strongly biased not taken, but it cannot detect any pattern in a branch's behavior; that is impossible, because it has no history; it is only counting the numbers of zeros and ones. So, from no history let us take one small step: suppose I give you one global history register. That means you still do not have one history per branch, but you have one history register where, whenever you see a branch, you shift in its outcome. If the history register is n bits, you know the outcomes of the last n branches you have seen. So it captures cross-correlation between branches, if there is any. Since the history is a sliding pattern, h keeps changing; it is not a fixed history that we are matching.
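A bimodal predictor in this style, a table of 2^n saturating counters indexed by a hash of the branch PC, might be sketched as follows (the shift by 2 drops the low bits of word-aligned instruction addresses, as mentioned above; the table and counter sizes here are illustrative):

```python
class BimodalPredictor:
    """Table of 2**n k-bit saturating counters indexed by (PC >> 2) mod 2**n."""
    def __init__(self, n, k=2):
        self.mask = (1 << n) - 1
        self.max = (1 << k) - 1
        self.mid = 1 << (k - 1)
        self.table = [self.mid] * (1 << n)   # counters start at the midpoint

    def index(self, pc):
        return (pc >> 2) & self.mask

    def predict(self, pc):
        return 1 if self.table[self.index(pc)] >= self.mid else 0

    def update(self, pc, taken):
        i = self.index(pc)
        if taken:
            self.table[i] = min(self.table[i] + 1, self.max)
        else:
            self.table[i] = max(self.table[i] - 1, 0)

bp = BimodalPredictor(n=10)
# A heavily-taken branch at some PC quickly trains toward "taken".
for _ in range(5):
    bp.update(0x400100, taken=1)
print(bp.predict(0x400100))  # 1
```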
So, h gradually keeps changing; you capture a sliding window of the pattern. Now the question is: fine, I have this history; should I have one counter for each history pattern? Typically you use a hashed mapping. What does it look like? You have a global history register of some length, small n say, which is used to index into a bank of 2 to the power n counters. Can someone tell me what I am doing here? Whenever I see a new branch, I shift this register by one bit position, shifting in the new outcome of the branch, so it captures the outcomes of the last n branches as a binary string. The prediction mechanism is still the same: whenever I get a branch, I use the current content of the global history register to look up the corresponding counter, and if the counter is at or above the midpoint, 2 to the power of (k minus 1), I say the branch is taken; if it is below, I say not taken. So, what is this predictor actually learning? I am not asking you to come up with program constructs where this is going to do particularly well; I am just asking what it is doing — given this particular definition, does it make any sense, is it learning anything? What will the number in a particular counter correspond to at any point in time? Each counter is attached to something very unique; what is that? How do I index into a counter, what do I use? I use the history, right, to index into a particular counter. So the history attached to a particular counter remains the same throughout: do you see that this counter gets indexed only when I see that particular history?
Assume there is no aliasing, meaning I have enough counters so that each history gets its own counter in this table. Now, can somebody tell me what this particular counter is actually measuring? Exactly: this counter is telling me, for the history corresponding to it, how many times a 0 followed that history and how many times a 1 followed it, or rather the difference between the two. So each counter gets attached to a particular history; it does not have anything to do with any particular branch. When a branch shows up with a particular content of the global history register, I am asking: tell me, in the past, when you saw this history, was it a 0 or a 1 that followed? And that is what this counter tells me, the most likely outcome. That is a huge improvement over no history; this predictor has very high accuracy. We will soon come up with a taxonomy of these predictors; I am just giving you glimpses, and I will name these predictors shortly. And then, of course, you can extend it: instead of a global history, I could have one local history per branch. So I make this a table instead of a single register: this is not a global history register anymore, it becomes a history table, and can anybody guess how I will index it if I want a local history per branch? If I have 2 to the power p entries here, I will essentially take the PC shifted right by 2 and AND it with 2 to the power p minus 1; that is my index. Even though I may not have one history per branch if my table size is limited, and there may be some collisions of course, within a region of a program, as long as I have no more than 2 to the power p branches, I am happy, because I do not really care, when I move on to a new region of the program, what happened in the old region.
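The global-history scheme just described, a single n-bit global history register indexing a bank of 2^n saturating counters (what will shortly be named GAg), can be sketched like this; the alternating-branch experiment at the bottom shows the kind of pattern it can learn that a history-less counter cannot:

```python
class GlobalHistoryPredictor:
    """GAg-style predictor: an n-bit global history register indexes
    2**n k-bit saturating counters; each counter is tied to one history
    pattern, not to any particular branch."""
    def __init__(self, n, k=2):
        self.mask = (1 << n) - 1
        self.max = (1 << k) - 1
        self.mid = 1 << (k - 1)
        self.history = 0
        self.table = [self.mid] * (1 << n)

    def predict(self):
        return 1 if self.table[self.history] >= self.mid else 0

    def update(self, taken):
        i = self.history
        if taken:
            self.table[i] = min(self.table[i] + 1, self.max)
        else:
            self.table[i] = max(self.table[i] - 1, 0)
        # Slide the window: shift in the actual outcome.
        self.history = ((self.history << 1) | taken) & self.mask

# An alternating outcome stream (1, 0, 1, 0, ...) defeats a single
# bimodal counter, but here it is learned after a short warm-up.
p = GlobalHistoryPredictor(n=4)
hits = 0
for i in range(100):
    outcome = i % 2
    hits += (p.predict() == outcome)
    p.update(outcome)
print(hits)  # high accuracy: mispredicts only during warm-up
```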
So, this third option hashes per-branch history patterns to counters, but it loses global correlation: you will no longer be able to see a relationship between this branch and that branch, which you could see when you had a global history. So, these are roughly the three types of branch predictors that you see, and I will come up with a naming convention for them. Any question? "Sir, isn't this prediction slow? We are reading the table and also updating it." No, there is no update going on at prediction time. We update the counter and the history when the branch finally executes; that is later. At the time of prediction you only look this up and make a prediction; when you finally execute the branch, you know the correct outcome, and at that time you can update. You cannot update with the predicted outcome; whatever you learn has to be the correct outcome. "So the update happens in the future, when we get the outcome?" Yes. "Can we mix both?" Yes, of course you can mix as many as you want; we will talk about it. Any other question? So, we have already talked about this one, the bimodal predictor. Some names here: the table of counters is called the branch history table. Note that multiple branches can map to the same entry, and in some cases that may be very bad: two branches mapping to the same entry may behave in two opposite ways; one will increment, one will decrement, and you will learn nothing. How wide is each counter? That determines how far back you can see, the span of your window. And do you really implement a counter? Actually, you do not really implement a counter here; you just have a finite state machine implementing a saturating counter. You know, when the state is 01 and you increment, what will the next state be? It will be 10.
Actually you do not implement a counter; you implement a finite state machine that moves between states on each outcome. Now, performance of loops on the bimodal predictor: how will it be? Good, bad, always good? We are talking about this predictor, if you have forgotten; we discussed it five minutes ago: a branch history table of k-bit saturating counters, indexed with some function of the branch PC; the function is basically a modulo hash. So we are asking: what can you say about a loop branch? We will be correct except on the last iteration. Does it have a relationship with k? Are you assuming something about k? I tell you that k is very important here. Why? What happens if k is very large? This should be an easy question. You see a loop: it keeps incrementing the counter, and only the last iteration decrements it. What happens if the value of k is large, a very wide counter? If you cannot think in abstract terms, I will give you values. Suppose k is 10 and I have a loop of 100 iterations. What will happen? Where does the counter start from? Can I say anything about that? That is the issue with these counters. If I start in the middle and I have 100 iterations, I should be fine: I will make one misprediction, on the last branch. But the thing is that these counters are actually initialized to 0 when the machine boots. So if I have a counter of 10 bits starting at 0, and a loop of 100 iterations, what will happen? Except on the last iteration, I will make a misprediction all the time: the counter will never cross the threshold of 512, so I will keep predicting not taken. That is why, usually, these counters are small; you will often find 2, maybe 3 bits.
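This effect is easy to reproduce: here is a small experiment with a loop branch that is taken 99 times and then falls through once, run on a single k-bit saturating counter initialized to 0, as at machine boot:

```python
def loop_mispredictions(k, iterations=100, init=0):
    """Mispredictions of a loop branch (taken iterations-1 times, then
    not taken once) on one k-bit saturating counter starting at init."""
    maxv, mid = (1 << k) - 1, 1 << (k - 1)
    value, misses = init, 0
    outcomes = [1] * (iterations - 1) + [0]   # last iteration falls through
    for taken in outcomes:
        predicted = 1 if value >= mid else 0
        misses += (predicted != taken)
        value = min(value + 1, maxv) if taken else max(value - 1, 0)
    return misses

print(loop_mispredictions(k=2))    # 3: two warm-up misses plus the fall-through
print(loop_mispredictions(k=10))   # 99: counter never crosses the 512 threshold
```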
Never more than that, because I really do not want a very large history; I want a small window. I want to know what happened in that small window, that is it. The problem is that having too much history may pollute your prediction: what happened in the distant past may no longer reflect the current behavior, and you may not want to see that behavior. So, usually these counters are small, to capture a small window of history. Now, what about alternating branches, branches with the pattern 1 0 1 0 and so on? What happens to them? Prediction accuracy is 50 percent. Are we assuming something about the initial value? If I initialize at the midpoint, 50 percent, right. If I initialize somewhere else, or if the counter is large and I initialize it all the way to one side, still at best 50 percent. So it is not very useful; 50 percent correct is more or less random: I could toss an unbiased coin and achieve that. So for alternating branches we need something better; bimodal prediction is not good for this. What about correlating branches, branches that have a relationship? I have an example. Suppose we have this program: if f equals x, then you assign y to f; if g equals x, then you assign y to g; and then you test f not equal to g. Do you see that the outcome of this last branch depends on the outcomes of the previous two branches? This is called a globally correlated branch: a branch which depends on the history of a few other branches. Even if I tell you the exact history of this branch itself, it will be very difficult for you to predict what is going to happen next; but if I tell you the history of these two branches, you can probably work out the correlation very easily; the pattern will emerge: if these two branches behave like this, I get this. So, this requires two levels of tables.
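The correlated-branch example, written out with the variable names used in the lecture (f, g, x, y; the concrete values in the usage line are just an illustration):

```python
# Globally correlated branches: the third branch's outcome is fully
# determined by the first two. If both earlier branches are taken,
# then f == g == y, so the third branch (f != g) is necessarily not taken.
def run(f, g, x, y):
    b1 = (f == x)
    if b1:
        f = y
    b2 = (g == x)
    if b2:
        g = y
    b3 = (f != g)          # correlated with b1 and b2
    return b1, b2, b3

print(run(f=1, g=1, x=1, y=5))   # (True, True, False)
```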
So, these are the two levels of tables. The first-level table, which holds the histories, is the branch history table, and the second-level table of counters is called the pattern history table. And the first-level table may have even just one entry; that is also fine: for example, if we have just a global history register, that is still the first-level table, just with one entry. So, then we can come up with a taxonomy of branch predictors, depending on how exactly you index the first-level table and the second-level table. The first-level table can be global, per-set, or per-branch. What does that mean? If it is global, you just have one register, the global history register, where you maintain the history of all branches: whenever you encounter a branch, you shift the register one position to the left and shift in the new outcome, and all the branches are mixed up together. Then you can have per-set, where a set of branches gets one entry of the table: that happens when you take the PC and do a modulo operation on it, so a set of branches share one history entry. And then, of course, you can have per-branch: if you know beforehand how many branches you are going to see in your program, you can size the table so that every branch gets exactly one entry. Is that clear, these three options? For the second-level table you can do the same thing. You can have a global second-level table: there is no specific counter table attached to any particular branch; any history can update any counter there, but every counter gets attached to one particular history. For example, suppose this counter gets indexed when the history is 1 0 1 1 1 1.
Sometimes this particular history may appear in one branch's stream, sometimes in another's. Whenever this history shows up, you index into this counter and make an update. So, it is very important to understand that these counters are not attached to any specific branch; a particular counter is attached to only one history. And that is why the name comes from global: it is a global pattern history table. Then you can have a per-set second-level table, where you say that a set of histories maps to one table, so you will have multiple second-level tables, one per set. And the third option is a per-branch second-level table, where each entry of the first-level table gets its own pattern history table, so all the histories of that branch accumulate in that branch's own table. And then you have the update method. One is the static method, usually not used, where essentially you statically encode the table at compile time; the table is fixed, so there is no update as such. The other one is called adaptive update, which is far more popular: as you go along, you learn and update the tables. Combining these, people name branch predictors. The first letter corresponds to the type of the first-level table. The second letter corresponds to the update method; you will invariably find A everywhere and never see an S here, since these are all adaptive predictors. And the last letter corresponds to the type of the second-level table. So, for example, what will a PAp predictor look like? A big first-level table where each branch gets one entry; this entry points to one pattern history table, that one points to another pattern history table, and so on. And how will you update an entry? You take the branch's history, whatever history you have, and that is used to index into one of the counters in its own pattern history table.
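The PAp arrangement above can be sketched like this (a toy dictionary-based version, my own illustration): each branch PC gets its own history register, and that history indexes the branch's own private table of counters:

```python
from collections import defaultdict

HIST_BITS = 2
histories = defaultdict(int)                        # per-branch history registers
phts = defaultdict(lambda: [2] * (1 << HIST_BITS))  # per-branch counter tables

def predict(pc):
    return phts[pc][histories[pc]] >= 2

def update(pc, taken):
    h = histories[pc]
    c = phts[pc][h]
    phts[pc][h] = min(c + 1, 3) if taken else max(c - 1, 0)
    histories[pc] = ((h << 1) | int(taken)) & ((1 << HIST_BITS) - 1)

# Two branches with opposite alternating patterns: in a shared table they
# would fight over the same counters; here each one learns privately.
correct = 0
for taken in [True, False] * 100:
    for pc, t in ((0x400, taken), (0x800, not taken)):
        if predict(pc) == t:
            correct += 1
        update(pc, t)
print(correct, "of 400")  # only a handful of warm-up mispredictions
```

Nothing is shared between branches, so there is no interference at all; the price is storage that grows with branches times history patterns, which is exactly why PAp is called gigantic below.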
And that counter gets updated; similarly, the histories recorded here are used to update the counters there. Is it good or bad compared to the others? It is a gigantic predictor; you can easily see that, because you have many entries in the first-level table and each entry corresponds to an entire table of counters. But notice what cannot happen here. In a shared table, it is as if two entries keep updating the same counter, one incrementing and one decrementing, and they nullify each other's effect. That will not happen in this design. In a shared table, if after a particular history sometimes a zero appears and sometimes a one appears, that is, depending on which branch the history is coming from it has two different behaviors, both branches will still update the same counter, and that can have a destructive interference effect. So, what he has pointed out is a bad thing that we should try to avoid, and that is exactly what PAp avoids: two entries updating the same counter destructively. The opposite is also possible. You learn something in one course, and you learn the same thing in another course; is that good or a bad thing? Maybe you learn better, or you learn faster. So, if two histories reinforce the same learning, the predictor is going to learn very fast; you reach the learned pattern quickly. So, this is my PAp predictor. Similarly, you can come up with SAg, which is probably one of the most popular ones. In the SAg predictor, each entry of the first-level table corresponds to a set of branches, and the second level is one global table. Will SAg capture correlating branches? Not by design; it has some correlation only by accident.
That is, only if your mapping function happens to map those correlated branches to the same entry; but that is an accident, and it is practically impossible to design such a hash function beforehand. For that purpose, you use GAg: a global register in the first level and a global table in the second. That will let you learn correlating branches. So, what does each one buy you? PAp allows you to learn each branch separately. SAg gives you more or less the same accuracy at a much lower cost. GAg allows you to learn global correlation. There is one special case called the gshare predictor, which is more or less the same as GAg except that it has a slightly different index function. You have the global history register as usual, and you have your pattern history table as usual. If the history register is n bits, how many entries do I have here? Sorry? 2 raised to the power n. But before you index, you XOR the history with the PC. I show it as XOR because that is what is used most popularly; you can come up with other smart functions. Why do I want to do this? Because gshare is amazingly good, so it must be a good thing to do. Does it guarantee mapping the same branch to the same entry? No, there is no such guarantee; you are XORing with the global history register, so you destroy any such pattern. So, what is it trying to do? Why do I bring in the PC? Think about what I lose in a pure global predictor: I do not have per-branch history tables, so I really do not have any learning about the local behavior of an individual branch. I am just trying to introduce some flavor of that by XORing with the PC. The hope is that a particular branch, when it sees a particular history, will get its own entry; essentially I am trying to map the (PC, global history) pair onto these counters, and the hope is that each different pair will get a different counter.
Of course, I could do other things, like concatenate the two, but then my table would become gigantic. So, I have to come up with some hashing function.
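The gshare index function just described can be sketched in one line (my own illustration; the shift by 2 assumes word-aligned instructions, and real designs vary in which PC bits they use):

```python
def gshare_index(pc, ghr, n_bits):
    # XOR the (word-aligned) branch address with the global history,
    # then keep the low n_bits to index into the 2^n-entry counter table.
    return ((pc >> 2) ^ ghr) & ((1 << n_bits) - 1)

# Two different branches seeing the SAME global history
# land on different counters, which is the whole point:
print(gshare_index(0x400010, 0b1010, 4))  # 14
print(gshare_index(0x400020, 0b1010, 4))  # 2
```

Compare this with plain GAg, where both branches above would share the counter at index 0b1010 and could interfere destructively; mixing in the PC gives each (branch, history) pair a good chance of its own counter without concatenating and blowing up the table.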