If you recall, towards the end of the previous class we started looking at next-use information, and we were trying to minimize the number of temporaries. That part was only there to bring the example out; we do not actually minimize the number of temporaries as a separate step. What we do is use next-use information at the time of code generation, to make sure that when I use registers and temporaries, I use fewer of them. So when I talk about code generation, you will see that I am actually going to use the next-use information.
So let us look at what the code generator does. I am assuming that we have three-address code and that we already have a model of the machine in mind. This three-address code now has to be converted into machine instructions, and we look at arithmetic, logical and Boolean expressions; that is the intent. Each statement is, in general, of the form X := Y op Z, or of the form if X relop Y goto L. For statements of this form we want to remember what is held in the registers, because registers are one of the resources, but we will also remember more things.
So we create two kinds of descriptors. One is what we call the register descriptor. Basically I create a table, and if I have registers R0 to Rn-1, I want to remember what each of these registers contains. I may have information saying that this register contains the value of k, this register contains c, and so on. I may also have a situation where a register contains two variables, say a and d. Is it possible for a register to hold the values of two variables simultaneously? It is. I do not look at the values themselves; I do not know how many variables happen to have the same value. The reason this may happen is a copy: suppose I have the statement d := a. If a has already been loaded into a register, then the assignment needs no change to the register; the value is already there, and I only record in the descriptor that the register now holds both a and d. So the register descriptor keeps track of what is currently in each of the registers. Initially, at the start of a basic block, I assume that every register is empty, and I start filling registers from there.
Now I have another descriptor, called the address descriptor. Here I keep information for each variable; these are either user variables, which are in the symbol table, or temporaries. For each variable I want to know the places where its value is available. A variable may be available in more than one place, for example in both a register and memory: if I have a variable in memory and I load it into a register, it immediately becomes available in two places. So the address descriptor keeps track of the locations where the current value of a name can be found at run time. Note that this is bookkeeping done by the compiler; we are not executing the program. The location might be a register, a stack location, a memory address, or a set of these. So a variable may be available simultaneously in more than one location, and a register may hold the value of more than one variable at the same time. These are the two descriptors I create when I start code generation. Now, where will the variables be, and what will the register descriptor contain, when I start code generation for a basic block?
All the variables will be in some memory location. None of them is in a register; I start loading them as I go through the instructions, and every register is empty to begin with. So these are the two descriptors I have. When I come to Booleans I will need more information, but for arithmetic and logical expressions this is sufficient as far as code generation is concerned.
Let us now look at how we do code generation. I am looking at the most general statement, X := Y op Z. I can have unary operators, X := op Y, or just a copy, X := Y, but those are simpler cases. This is the most complex case I have. First understand conceptually what I am trying to do. We want to generate fast, efficient code, and therefore as far as possible we would like to use registers. To begin with everything is in memory: Y and Z are in memory, I want to apply the operation to them and store the value in X. As far as possible I would like to use a register to hold the value of X. That is the first thing. Second, suppose Y or Z is already in one of the registers and that variable has no future use; then I can reuse the same register, and this comes from the next-use information.
So the first thing I do is get a location for doing this operation. As far as possible we will try to use a register; if a register is not available, or not required, I use a memory location, but my first preference is always a register. So I call a function which I call getreg, and we look at this function shortly. It returns a location L that says: this is where you can compute and store the value of X. Do not confuse the name getreg with the location L. getreg need not always return a register; it could return any location, but preferably, and usually, L is a register. That is not always possible.
Suppose L is a register. If this register did not contain anything, I load Y into the register, apply the operation to the register with Z, and I am done. But it is possible that Y is already in that register, and I need to check for that. getreg may have found that, since Y has no future use, the register that holds Y can be returned for keeping the value of X. Remember the statement where p1 is assigned p1 plus p: beyond that point the old value of p1 has no use, so I can use the same register. So I need to check whether this register already contains Y.
So we consult the address descriptor of Y and determine Y'. What is Y'? Y may be available in more than one location simultaneously, and I want the one that is fastest to access, which is normally a register. So we determine Y', preferring a register, and if the value of Y is not already in L, I generate one instruction: MOV Y', L. What may happen here is that the address descriptor says Y is in register R0, but R0 has a future use, so getreg will not return R0; it returns, say, R2. I cannot overwrite R0; I need to protect it. So I copy the value from R0 to R2. The move instruction is just a copy: from the location where Y is stored, which is Y', load the value into L, the location that was returned to me.
After that I generate one more instruction: OP Z', L, where OP is the operator in the statement. What is Z'? Again it is a location from which I can pick up the value of Z at run time, and again it is expected to be the fastest such location. Z may also be available in multiple locations according to its address descriptor, and if it is already in a register I just apply the operation on that register. Suppose, to begin with, Y is in R0 and Z is in R1, both R0 and R1 have a future use, and getreg returns R3. Then I first generate MOV R0, R3 and then ADD R1, R3. These are the two instructions generated. Sometimes the first instruction is not generated: if Y is already in the register returned for computing X, I generate only the second instruction.
What do I do after this? I started generating code for the three-address statement X := Y op Z and I have generated these instructions. Am I done? There is no more code to generate, but I need to do bookkeeping, because I have to update my descriptors. L, which may be a register, now contains the value of X, and I have to change the address descriptor of X to say that X is now available in this particular location.
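The two descriptors and the two-instruction scheme for X := Y op Z can be sketched in a few lines of Python. This is only an illustration under assumptions that are not from the lecture: the class name `Descriptors`, the helper `gen_binary`, the string "mem" as a stand-in for "the variable's own memory location", and the MOV/OP mnemonics with the destination written last are all hypothetical. The location L is passed in by the caller and is assumed to be safe to overwrite, which is exactly the decision getreg is supposed to make.

```python
class Descriptors:
    """Register and address descriptors for one basic block (illustrative)."""

    def __init__(self, num_regs, user_vars):
        # Register descriptor: register name -> set of variables it currently holds.
        self.reg = {f"R{i}": set() for i in range(num_regs)}
        # Address descriptor: variable -> set of locations holding its current value.
        # At block entry every user variable lives only in memory ("mem").
        self.addr = {v: {"mem"} for v in user_vars}

    def best_location(self, var):
        """Pick the cheapest place to read var from: a register if there is one."""
        regs = sorted(loc for loc in self.addr.get(var, set()) if loc != "mem")
        return regs[0] if regs else var  # a memory operand is written as the name


def gen_binary(d, x, op, y, z, L):
    """Emit code for x := y op z into location L and update both descriptors."""
    code = []
    y_loc = d.best_location(y)
    if y_loc != L:                        # skip the copy if y is already in L
        code.append(f"MOV {y_loc}, {L}")
    code.append(f"{op} {d.best_location(z)}, {L}")

    # Bookkeeping: no register other than L holds x any more.
    for held in d.reg.values():
        held.discard(x)
    if L in d.reg:
        # L was overwritten, so it no longer holds any other variable.
        for locs in d.addr.values():
            locs.discard(L)
        d.reg[L] = {x}
        d.addr[x] = {L}
    else:
        d.addr[x] = {"mem"}               # L was x's own memory location
    return code


d = Descriptors(2, ["a", "b", "c"])
print(gen_binary(d, "t1", "SUB", "a", "b", "R0"))   # ['MOV a, R0', 'SUB b, R0']
print(gen_binary(d, "t2", "ADD", "t1", "c", "R0"))  # ['ADD c, R0']
```

In the second call the move is skipped because the address descriptor already places t1 in R0, which is the "only the second instruction" case described above.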
So, again preferring a register for Z', I generate the instruction, and after this I update the address descriptor of X to indicate that X is in L. If L is a register, I update its descriptor to say that it now contains X, and I remove X from all other register descriptors. So I do bookkeeping after code generation. Next, if Y and Z have no next use and are not live on exit from the block (and this is where the next-use information comes in), we change the register descriptors to indicate that those registers no longer contain Y and Z. Then getreg can take advantage of the fact that these registers immediately become free.
So I need to generate two instructions and then do some bookkeeping, and this is one of the places where I use next-use information. Simple and straightforward. As far as simple arithmetic instructions are concerned, I can easily take a three-address instruction and map it onto a machine. You can see that I assume a machine which has move and so on. If the machine has addressing modes, then I have to use the addressing modes appropriate for Y' and Z'. When they are registers, that is straightforward. With more complex addressing modes, I use whatever addressing mode fetches the values at Y' and Z'; I am not going to go into that.
Now let us move on to the function getreg which I used here. How does getreg work? It is invoked to give me a location where I can do this computation. Y is the first operand here. First we ask whether Y is in some register that does not hold any other value. A register may hold Y along with another variable; Y may have no future use while the other variable does, so I need to remember that. So if Y is in a register that holds no other value, and Y is not live and has no next use after this statement, then I can use the same register. Say Y is in R0, R0 does not hold any other variable, and Y has no next use. That means R0 can be freed immediately after this instruction, so I use the same register for the computation, and getreg simply returns the register of Y.
It may happen that this register holds an additional value, or that Y is not in a register at all. Then this condition fails, and I get a free register and use that. In the earlier example I just took a fresh register. To begin with no variable is in a register, so the first time I am going to use a new register.
Now suppose there is no empty register; this can fail too. I could use a memory location, but I prefer a register, and all registers are occupied. Can I free a register temporarily? I can say that this register does not have an immediate use, so I store its value in some temporary memory location and start using the register; when that variable is required again, I load it back into a register and continue from that point. This technique is known as register spilling. So, if X has a next use in the block, or the operator is such that it requires a register, then I get a register R by storing its contents into a memory location, and I use R. At some point the value has to be loaded back.
What is the advantage of this? Suppose X has a lot of uses beyond this statement, and the variable currently held in R is not going to be used in the next several instructions. I do only one extra store and one extra load, but all the accesses to X become fast, because X is in a register. Also, the operation may be one that can only be performed on a register and not on a memory location; then I need a register. Register spilling is the standard technique, but it always has an overhead, so you would like to minimize it as far as possible. We will see examples of how to reduce register spilling, but in this case we have to spill.
Now suppose this is not required: this is the only time I am computing X and X has no use beyond this point, so there is no point in spilling, that is, moving something to memory and loading it back, and the operator does not require a register either. Then I directly use the memory location of X. This memory location is used only if all the earlier conditions fail. So: as far as possible use a register; if there is no free register, try to free one temporarily and use it; but if a register is not required, there is no need to free one, because freeing a register costs two operations, one to move the contents to memory and another to move them back into a register. You would be adding those two instructions. This is what getreg does. You can immediately see a lot of optimizations possible here; for example, where I check whether Y is in a register that holds no other value, I can do the same check for Z.
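The four cases of getreg described above can be written down as a small Python function. This is a minimal sketch, not the lecture's own code: the parameter names, the set `live_after` standing in for next-use information, the flag `needs_register`, and the naive choice of spill victim (simply the first register) are all assumptions made for illustration. A real allocator would pick the victim using next-use distances.

```python
def getreg(x, y, reg, addr, live_after, needs_register=False):
    """Choose a location for x in x := y op z.

    reg:        register descriptor, register name -> set of variables
    addr:       address descriptor, variable -> set of locations
                ("mem" or a register name)
    live_after: variables that still have a next use after this statement
    Returns (location, spill_code).
    """
    # 1. y sits alone in a register and has no next use: reuse that register.
    for r, held in reg.items():
        if held == {y} and y not in live_after:
            return r, []

    # 2. Otherwise take any empty register.
    for r, held in reg.items():
        if not held:
            return r, []

    # 3. No empty register, but x is used again or the operator needs a
    #    register: spill one. The victim choice here is deliberately naive.
    if x in live_after or needs_register:
        victim = next(iter(reg))
        spill_code = []
        for v in sorted(reg[victim]):
            if "mem" not in addr[v]:
                spill_code.append(f"MOV {victim}, {v}")  # save value to memory
                addr[v].add("mem")
            addr[v].discard(victim)
        reg[victim] = set()
        return victim, spill_code

    # 4. x is not needed again and no register is required:
    #    compute directly into x's memory location.
    return x, []


reg = {"R0": {"p"}, "R1": {"q"}}
addr = {"p": {"R0"}, "q": {"R1"}}
print(getreg("t", "a", reg, addr, live_after={"p", "q", "t"}))
# ('R0', ['MOV R0, p'])
```

The usage line shows the spilling case: both registers are occupied by live variables and t has a next use, so the contents of R0 are stored back to p before R0 is handed out.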
Here is something you can try by compiling a C program; you will find that a lot of optimization goes on here. It is non-trivial for a compiler to find out that x += y and x = x + y are equivalent; that kind of symbolic computation is not easy for it. So many times you will find that compilers generate better code for the first form than for the second, because in the first form they can assume that the first operand on the right-hand side is the same as the variable on the left-hand side, so the location of that variable is used directly to hold the result. This is the reason C introduced these operators: the compiler does not have to do much symbolic computation or optimization at that level, and if the user helps by using these operators, all that work is saved. It can get worse than this: suppose I have a long right-hand side in which x occurs somewhere; deciding whether it can, in general, be transformed into the compound form is not trivial. It is computationally possible; you can do a lot of symbolic analysis and reorganization, but that takes time unnecessarily. So if the user knows that this form is faster, they use it, and you will find a lot of it when you read C code. There was a purpose in introducing these operators: they let the compiler generate more efficient code.
This is how we generate code for arithmetic operators. It takes care of all arithmetic operators, and even of Boolean operators when I am just doing an assignment. We will see how conditionals are handled, but as far as assignment of Boolean expressions is concerned, this suffices. You can see that I am just writing op here; I am not specifying what kind of operation it is, so it works for any binary operator, unary operator, or simple copy.
Let us go through an example. Suppose I am looking at a block whose first instruction says p1 := a - b. The notation is that names like p1, p2 and so on are temporaries generated by the compiler, and names like a, b, c, d are user variables, and we will assume that all user variables are live on exit from the basic block. For this statement, since nothing is in a register, getreg returns a register that can be used for the computation, say R0. So I first generate MOV a, R0 and then SUB b, R0. How do my descriptors change at the end of these two instructions? The register descriptor says R0 now contains p1, and the address descriptor says p1 is in R0.
Then I have the instruction p2 := a - c. What happened earlier is that a was loaded into R0 and then overwritten by p1, so a is no longer in a register and I have to load it again. So we generate MOV a, R1 and then SUB c, R1. The reason I am doing this extra work will become apparent when we look at the rest of the code. At the end of this, R0 contains p1 and R1 contains p2; p1 is in R0 and p2 is in R1.
The next statement says p3 := p1 + p2. Both p1 and p2 are in registers, and beyond this point p1 has no use, so its register can be reused. So instead of doing a move, we straight away do the addition, ADD R1, R0, and the descriptors change: R0 now contains p3 and p3 is in R0, while R1 still contains p2. Then I have d := p3 + p2, and p2 is in R1. Beyond this point neither p3 nor p2 has a use, so I just do the addition, ADD R1, R0, and then store the result into the memory location of d with MOV R0, d. My descriptors now say that R0 contains d, and d is both in R0 and in memory, because of this move.
This is how code generation happens for assignments. Conditionals I have to deal with slightly differently, and machines provide special hardware for conditions. If you recall the assembly language programming you have done, there is something called the condition code register, or condition code bits, on the machine. Whenever you execute an instruction, some bits are set depending on the value of the result. If the value is too large an overflow bit is set, if it is too small an underflow bit is set, if the value is zero a zero bit may be set, if it is negative a negative bit may be set, if it is positive a positive bit may be set, and so on. That happens automatically in the hardware. The advantage is that, when it comes to conditionals, I can take care of many of the jumps by looking at these condition code bits; I do not have to do the comparisons explicitly.
Take a branch: if x < y goto z. Look at the first two instructions: MOV x, R0 and then SUB y, R0. I do not have to know the value of R0, and I do not have to compare R0 with zero. As a side effect of the subtraction, a condition code bit is set which says that R0 is negative, and then I just say: jump to z if the negative bit is set. The jump tests one of six conditions: negative, zero, positive, non-negative, non-zero, or non-positive. Many machines have up to 16 condition code bits, and a single operation may affect more than one of them. So you can see that I am not comparing R0 with anything; the condition code set by the subtraction is being used.
So we use condition codes, which indicate whether the last quantity computed or loaded was negative, zero or positive. Even a move sets them: when I say MOV x, R0, some condition code bit is set depending on the value of x. If the value of x is positive, MOV x, R0 automatically sets the positive bit. I am just using this extra information. Machines also provide a compare instruction, for which I do not have to do a move or an arithmetic or logical operation; I directly do a comparison, and the comparison also sets the condition code. When I say CMP x, y, the condition code is set to positive if x is greater than y, and similarly for the other cases. So to generate code for if x < y goto z, I may just say CMP x, y followed by CJ< z, a conditional jump on less-than to z.
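The two ways of translating if x < y goto z, and the way the jump depends only on the sign of a result, can be sketched as follows. This is an illustrative model, not a description of any particular machine: the mnemonics MOV, SUB, CMP and CJ followed by the relational operator follow the lecture's informal notation, and the three-valued condition code "N"/"Z"/"P" is a simplification of the multi-bit condition code register mentioned above.

```python
# Which condition-code outcomes make each of the six conditional jumps fire.
# "N", "Z", "P" stand for a negative, zero, or positive result.
JUMP_FIRES_ON = {
    "<": {"N"}, "==": {"Z"}, ">": {"P"},
    ">=": {"Z", "P"}, "!=": {"N", "P"}, "<=": {"N", "Z"},
}


def condition_code(result):
    """Model of what the hardware records after computing or loading a value."""
    return "N" if result < 0 else "Z" if result == 0 else "P"


def gen_cond_jump(x, relop, y, label, use_compare=True):
    """Emit code for: if x relop y goto label."""
    if use_compare:
        # Compare sets the condition code without needing a register.
        return [f"CMP {x}, {y}", f"CJ{relop} {label}"]
    # Without a compare instruction, subtract and let SUB set the code.
    return [f"MOV {x}, R0", f"SUB {y}, R0", f"CJ{relop} {label}"]


def jump_taken(relop, x_val, y_val):
    """Would the conditional jump fire for these run-time values?"""
    return condition_code(x_val - y_val) in JUMP_FIRES_ON[relop]


print(gen_cond_jump("x", "<", "y", "z"))
print(gen_cond_jump("x", "<", "y", "z", use_compare=False))
print(jump_taken("<", 3, 5), jump_taken("<", 5, 3))
```

Note that `jump_taken` never compares x with y directly; it only looks at the sign of x - y, which is the point made above about not comparing R0 with anything.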
So there are two kinds of jumps. One is the unconditional jump, where I do not test anything; I straight away change the program counter to the new value. A jump is nothing but changing the value of the PC. A conditional jump changes the program counter to z only if the condition is met, here that the result was "less". I can use both. But I can also remember something known as the condition code descriptor. I have the two descriptors already; I can keep one more that says which operation last set or affected a condition code bit.
Something very interesting can happen here. Suppose I have x := y + z and then if x < 0 goto L. Because of the computation of x, I do not even need a comparison. For x := y + z I generate three instructions: MOV y, R0, then ADD z, R0, then MOV R0, x. When this happens I know that the condition code has been set according to the value of x. It is possible that the two statements are not adjacent; there may be a few other instructions in between, but as long as they do not affect the value of x or that condition code bit, I can still use it, provided I remember that the bit was last set by x. In that case I can just say CJ< L, jump on negative to the label. So I can even save one comparison instruction, if I remember which operation last changed that particular bit. But for that I need additional bookkeeping, in the form of a condition code descriptor: for each condition code bit, I record which operation set it. If I can remember that, I can save certain further instructions.
But again you have to check whether the machine provides this information and these facilities. Some of the earlier machines did not have enough condition code bits; the earliest I remember had a one-byte condition code register, which means eight separate conditions, and some more recent machines have 16 or 32 conditions, and so on. All kinds of funny things happen, but if you can keep track of them, you can take advantage of them. So you can see how I handle conditionals and how I handle three-address code, and as far as the machine is concerned it is really just a matter of the right condition code bit being set.
You have a question about the last case. Suppose I have x := y + z and then one more instruction, say CMP a, b. The compare is going to affect certain bits, and as soon as it does, my condition code descriptor records which bits were affected. If it is the same bit that was set by x, the descriptor changes and I have to generate the comparison for x. But if the bit set by x was not affected, I can still use it. That is why I say I need additional bookkeeping; the hardware does not do this, the compiler has to do it. As for tracking more than one bit so that more variables can be remembered: that depends on the hardware. If I have 16 bits, I keep a condition code descriptor with an entry for each bit, bit 1, bit 2 and so on, and I remember which instruction affected it. But how bit 1, bit 2 and bit 3 get set is not in my control.
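The extra bookkeeping for the condition code descriptor can be sketched as below. This is a simplified illustration under stated assumptions: it tracks a single descriptor (the variable whose value last set the condition code) rather than one entry per bit, it assumes every instruction that sets the condition code overwrites it entirely, and the class and method names are hypothetical.

```python
class CondCodeTracker:
    """Remembers which variable's value the condition code currently reflects."""

    def __init__(self):
        self.cc_var = None  # None means: unknown, or not tied to a variable

    def emit_assign(self, x, op, y, z):
        """x := y op z; the final store leaves the condition code set by x."""
        self.cc_var = x
        return [f"MOV {y}, R0", f"{op} {z}, R0", f"MOV R0, {x}"]

    def emit_other(self, instr, sets_cc):
        """Any other instruction; if it sets the condition code, forget x."""
        if sets_cc:
            self.cc_var = None
        return [instr]

    def emit_jump_if_negative(self, x, label):
        """if x < 0 goto label, skipping the compare when it is redundant."""
        code = []
        if self.cc_var != x:
            code.append(f"CMP {x}, #0")   # condition code no longer reflects x
            self.cc_var = x
        code.append(f"CJ< {label}")
        return code


t = CondCodeTracker()
t.emit_assign("x", "ADD", "y", "z")
print(t.emit_jump_if_negative("x", "L1"))     # compare can be skipped

t = CondCodeTracker()
t.emit_assign("x", "ADD", "y", "z")
t.emit_other("CMP a, b", sets_cc=True)        # intervening compare clobbers it
print(t.emit_jump_if_negative("x", "L1"))     # compare must be generated
```

The second run corresponds to the question discussed above: once CMP a, b has affected the condition code, the descriptor no longer names x, so the comparison has to be emitted again.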
That is under the control of the hardware: the hardware has already decided that when this instruction executes and these conditions are met, these bits are going to be set or cleared. That is beyond us. So the descriptor I can keep is only over the bits and the instructions that affect those bits. Does that answer the question? So let us move on and see how to generate code for DAGs.
So far I have been looking at three-address code, but remember that at one point we also looked at DAGs. How is a DAG, or an expression tree, different from three-address code? Only the names of the temporaries are missing. In a tree or DAG I do not have temporaries, I only have internal nodes, and when I explicitly write three-address code I have these temporary names. So when I generate code from a tree or DAG, the only additional thing I need to do is remember which name I am using for each node, because these names have to be generated at some point.
This is a useful data structure. We have already talked about the properties of DAGs: leaves are labelled by identifiers, which are either variables or constants; interior nodes are labelled by an operator symbol; and nodes are optionally also given a sequence of identifiers as labels. Since I will also have to jump somewhere, I need labels for that too. This is the only information I have as far as a tree or DAG is concerned; any other names I have to generate myself.
Let us look at this piece of code and the DAG that represents it. I compute t1 := 4 * i, then t2 := a[t1], which is basically a[i]. Then I recompute t3 := 4 * i and say t4 := b[t3], so one access fetches a[i] and the other fetches b[i]. Then I multiply the two, t5 := t2 * t4, and add the result to a variable prod: t6 := prod + t5, and prod := t6. Then I increment i, t7 := i + 1 and i := t7, and I check whether i is less than or equal to 20, and if it is I jump to label 1. So this is just taking two arrays and finding their dot product.
You can see that 4 * i is being recomputed. Since I am talking about DAGs, I will not recompute it; I use the same node for this computation. t1 is 4 * i and t3 is also 4 * i, so if I just compute t1, I can access a with t1 and b with t1. As for the two instructions t6 := prod + t5 and prod := t6, since t6 is assigned only once I could have said right away prod := prod + t5, and similarly i := i + 1. It just so happened that the code was written this way.
So in the DAG we have i0, which is the initial value of i, and the constant 4; the two are multiplied and the node is labelled t1, so t1 now holds 4 * i. Then I have a, and t2 is an array-access node that takes the base address of a and the index t1, that is a[t1]; this corresponds to the second instruction. Since t3 is the same computation as t1, instead of creating a new node I just add the label t3 to the existing node, which says that t3 also captures the computation below this node. Then I have t4 := b[t3], and t3 is the same node, so t4 is an array-access node on b and that same node, which could be called either t1 or t3; it does not matter. The next one is
T5 is assigned T2 multiplied by T4 so T2 multiplied by T4 is being assigned T5 these are the two edges which are not visible and then we say T6 is assigned prod plus T5 so I am looking at T5 and I am looking at prod I am adding the 2 and that goes into T6 and then I say prod is nothing but T6 so therefore this node is also given a label called prod okay and then I say T7 is assigned I plus 1 so T7 will now take this value so this is taking I plus 1 which is going into T7 and then I say I is assigned T7 so I now gets the same label and now I have this conditional it says if I is less than equal to 20 then I go to 1 so what I do here is I have this conditional which is testing for less than equal and is jumping on label 1 and it is testing this value but this comparison okay so this is how the DAG looks okay and then if I look at the code generation okay this is the straight code for the T not really the DAG okay so what happens here now here we are saying that S1 S1 is now some symbolic resource which is being assigned for star I I look at the base address of A and then I do this array and I do this computation but if I do it for the DAG what are the instructions I will say immediately you can see that this instruction will be eliminated and instead of S4 here I will use S1 but what happens to this part okay this part I will say that prod is generated so S8 will be eliminated because this is just a copy and I will say prod is assigned prod plus S7 and similarly this part will say that this is assigned I plus 1 and this is conditional so I will be able to eliminate one of the instructions here okay so this is how the whole code looks okay that this instruction gets eliminated and this instruction these two are merged in one these two are merged in one okay so this is the advantage of that okay and you can see that basically how do I do code generation and we shortly see how to do code generation for T's okay if I know it for how to do it for T and you can see also 
because of the way we treat DAGs: we are just keeping extra labels on the nodes. So we have seen how to do code generation for three-address code, but we have not yet seen how to do code generation for trees. Since we are going to do that next, I am just showing you that you can handle the two things together. So we will see how to do code generation for trees.

All right, so let's move on. Now let's come to something I mentioned earlier. If you recall, in the beginning I said one of the optimizations was that I wanted to change the order of execution. Now when I say that A is assigned B + C, and suppose B and C are expressions, which expression should be evaluated first? Suppose both are function calls. So suppose I write code like this, where I say T is assigned F1 + F2 and both are function calls. This seems to say, do an evaluation from left to right, but many times what is not specified is which function should be evaluated first. And suppose these are complex expressions. The rules of the language may tell me that these are evaluated from left to right, but most of the time, unless there are explicit side effects, or I am doing partial evaluation and the side effects could be different, I don't care whether my evaluation goes left to right or right to left. Most compilers will try to do it in an order that optimizes resources.

Now how does changing the order of evaluation optimize my resources? Look at a simple situation. Suppose I have limited resources, say n registers. Now I start doing this computation, which could be an arbitrarily complex expression. Suppose efficient computation of this requires n registers and I have only n registers. What will happen is that I am going to use all n registers for this computation, but finally I will leave this value either in a register or, if I
require more registers, then I will do a spill and store this particular value in a memory location. Now suppose I leave it in a register; then I am left with one fewer register. Now suppose this other side also required n registers. Then that register can come only from spilling. So it may help if I change my order of execution. Look at this code, which is actually interesting. What I have is: t1 is assigned a + b, then t2 is assigned c + d, and t3 is assigned e - t2. You can see that this instruction depends on the previous one, but t1 is used only in the fourth instruction. Now if I look at the tree corresponding to this (again, the edges are not visible, sorry about this), what is happening is that on the left-hand side a and b are added and the result is stored in t1; on the right-hand side c and d are added and that is stored in t2, then e - t2 is computed and stored in t3; and then x is computed as t1 - t3. This is the kind of situation I have.

Now suppose I have this three-address code and only two registers available, that is an assumption I am making, and I am doing left-to-right evaluation. Then I will generate this kind of code. I move a to a register and add b to it, which gives me a + b, so a + b is now available in register r0 and one register is blocked by the left-hand side. Then I move c to r1 and add d to r1, so I am now computing c + d. After this I say move r0 to t1. Why am I doing this? Because I now want to compute e - t2 and I have no free register, so I temporarily free this register r0. This basically is
a register spill: I am moving r0 out to memory and then moving e into the register which has been freed by this spill. Then I do this subtraction, and after the subtraction one of the registers has become free, so I move t1 back into r1 and then subtract r0 from r1, and after this r1 is stored in x.

Now suppose I change the order of execution here. You can see that the right-hand side is the larger expression, which requires an extra register, and the left-hand side is the smaller expression, which requires one less register. Suppose I did the right-hand computation first. Then the two instructions I was generating, moving r0 to t1 and moving t1 into r1, would have been saved, because they are just the register spill and the reload. So what kind of code will I generate? These two instructions are saved: I compute c + d, then e - (c + d), which is stored in r1; I am left with r0, and I can use r0 for the computation t1 is assigned a + b, so a + b is computed in r0, and then I do the final subtraction. So this is what a register spill is and this is what a register reload is, and sometimes changing the order of execution is going to give you faster code.

Normally there is one technique which is used: when I am doing code generation for trees, I use dynamic programming. I keep computing the cost, where cost could be in terms of the number of registers; that is, I keep computing the number of resources required for the left-hand side and the right-hand side, and I determine which is the optimal way of doing it. With dynamic programming your computational complexity immediately goes up, so you can either generate code in a single traversal and incur the cost of the spills, but many times when you want optimized code you use dynamic programming for doing that and
find out that computing this part into a register first and then doing this computation is faster. And you have not violated any rule, because there are no side effects; the computation is going to remain the same, and therefore you can get much faster code as compared to this one. Now this is another thing that has to register in your mind and should come out clearly when we talk about faster code. You may start computing how much I am saving and say I am only saving two instructions, no big deal. But what you have to remember is this: if I count the number of instructions, I have four and two is six, and two is eight, plus two is ten instructions. When I say I am saving two, I am saving two out of ten, which is a twenty percent saving, and if this is part of a loop which executes a large number of times, then this twenty percent saving can be very large. In fact, when people are doing optimization, nobody is asking whether my program can run a hundred percent faster, or double or triple its speed. People are asking: can my program run ten percent faster, twenty percent faster, can I cut down ten percent? That is a big gain for people. Imagine a situation where somebody's program is running for twenty hours; if you say I can finish this in sixteen hours, that is a big gain for that person. And all large computational programs actually run in terms of hours and days. They are not like when you type a.out, press return, and the result is there. You type a.out, you go home, and you come back some time later, maybe after three days, and check your result. Normally this is what happens, and that is where all these optimizations are done. So even a small saving there, a five percent or ten percent saving, matters a lot, and those are the numbers people are looking at. Nobody is looking at whether I can double the speed of my program; that is something which, unless we have very special hardware, is utopian on a single machine. No amount of
optimization on a single processor is going to give that kind of speedup, unless you have some very specific program. So rearranging the order is important, but that is not the only thing we want to do. We want to do more optimization, and this further optimization is what we know as peephole optimization. I'll just introduce this and then we can take a break.

Basically, peephole optimization is something like this. Let me go back to an example we had yesterday and look at this code. Here are six instructions. Let me have a small window, which I call a peephole, and move it over this code. Suppose my peephole is of size two. Then I move it over this code, look at two consecutive instructions at a time, and ask: can I convert this into a faster instruction sequence? When you move it over this part of the code, you will immediately notice that you are moving a register into a memory location and then moving the same location back into a register, a store followed by a load. So people have created templates for this: if I am doing this kind of store and then loading the same value back, the second instruction can always be removed. Similarly, if my window is of size three, then I can experiment with this code and say, oh, I know this template, this is just adding a constant value to a register, and I can replace it by a single instruction.

So peephole optimization in general is a technique where I examine my code after code generation has happened; I do not try any of these optimizations at code generation time. Once I have this code, I define my peephole of size two, three, four. People have experimented, and these are empirical results, so do not ask for any proof because there is none: it was found that normally if I have a peephole of
size 4 to 5, then a large number of optimizations can be done just by matching these templates. So what you try to do is create a lot of templates and then move the window over the code, and whenever you find a match for a template, you replace it by a faster instruction sequence. It is possible that I take these two instructions and replace them by one, or take these three instructions and replace them by two, and so on. Do it a few times and you improve the quality of your code. So what we will do in the next classes is look at a few templates for peephole optimization, and then we will also look at how to do code generation for trees, like we did for three-address code. So this is where we take a break today, and in the next class we will discuss these techniques.
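The sliding-window idea described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the code from the lecture: instructions are encoded as hypothetical `(opcode, source, destination)` tuples, the code is assumed to be straight-line with no jump targets inside a window, the `INC` instruction is a made-up single-instruction replacement, and the size-three template assumes the register is not needed afterwards.

```python
# Instructions are (opcode, source, destination) tuples; ("MOV", "R0", "a")
# means "move R0 into a". This encoding is an assumption made for the sketch.

def redundant_load(window):
    """MOV x, y ; MOV y, x  ->  MOV x, y  (the second move restores nothing new)."""
    first, second = window
    if (first[0] == "MOV" and second[0] == "MOV"
            and first[1] == second[2] and first[2] == second[1]):
        return [first]
    return None

def increment_in_place(window):
    """MOV x, R ; ADD #1, R ; MOV R, x  ->  INC x  (hypothetical INC instruction;
    assumes R is not used after this sequence)."""
    load, add, store = window
    if (load[0] == "MOV"
            and add == ("ADD", "#1", load[2])
            and store == ("MOV", load[2], load[1])):
        return [("INC", load[1], None)]
    return None

# Each template is paired with the peephole (window) size it needs.
TEMPLATES = [(3, increment_in_place), (2, redundant_load)]

def peephole_pass(code):
    """Slide the window over the code once, replacing any matched template."""
    out, i, changed = [], 0, False
    while i < len(code):
        for size, template in TEMPLATES:
            window = tuple(code[i:i + size])
            if len(window) == size:
                replacement = template(window)
                if replacement is not None:
                    out.extend(replacement)
                    i += size
                    changed = True
                    break
        else:
            # No template matched at this position: keep the instruction as is.
            out.append(code[i])
            i += 1
    return out, changed

def peephole(code):
    """Repeat passes until no template fires any more."""
    changed = True
    while changed:
        code, changed = peephole_pass(code)
    return code

spill_code = [("MOV", "a", "R0"), ("ADD", "b", "R0"),
              ("MOV", "R0", "t1"), ("MOV", "t1", "R0"), ("SUB", "c", "R0")]
optimized = peephole(spill_code)
```

Here `optimized` keeps the store to `t1` but drops the reload that follows it. Repeating the pass until nothing changes mirrors the remark that you apply the templates a few times, since one replacement can expose another match.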