 So, in the last lecture, we discussed the requirement for code optimization, types of optimizations and we also discussed the procedure for building a control flow graph. Today, we will continue with our discussion on value numbering and its use in local optimizations. To do a bit of recap, so here is the example that I showed you in the last lecture. This is a basic block containing 10 intermediate code statements and this is the directed acyclic graph corresponding to this basic block. The most important feature of this directed acyclic graph is that it can enable several optimizations. For example, here is 4 star a which is actually a constant because a is a constant. So, this constant propagation and constant folding can be performed using the directed acyclic graph representation. It cannot be so easily performed using the quadruple representation. Here is an expression i star j and here is another expression e star j, but both are equivalent because e receives the value of i during this execution. So, t 3 and t 1 are identical in value at all times and we can use t 1 instead of t 3 and delete quadruple number 8. So, this is called common sub expression elimination and this can also be performed using the directed acyclic graph representation. But, as I explained in the last lecture it is you know useless to try and build the directed acyclic graph using link data structures because every time we want to locate some node we will have to start from the top of the directed acyclic graph and search the entire graph as such. So, this is a waste of time the technique that is normally used is called value numbering. So, what we really do is we assign numbers to expressions in such a way that two expressions receive the same number if the compiler can prove that they are equal for all possible program inputs. We use hash tables then you know hashing techniques rather in this value numbering optimization and we assume that there are quadruples with binary or unary operators. So, there are three tables hash table value number table and name table. So, and the value numbering technique can be used to eliminate common sub expressions do constant folding and constant propagation in basic blocks. We can also take advantage of the commutativity of the operators addition of 0 multiplication of by 1 etcetera. So, here are the three you know data structures that we are going to use in our technique the first one is the hash table entry. So, the table will have records of this type the first field in the record would be the expression itself and the second field would be the value number. So, what we do is you know compute a hash value for the expression using a suitable hashing function. The hashing function must have you know must take into concentration not only the two operands of the expression, but also the operator itself the type of the operator and once we get a hash value we search the hash table and at that particular hash value we insert the expression and also a unique number called the value number itself. So, value number is nothing, but a unique number assigned to each expression. So, we are going to assign value numbers using a counter the second table entry is the Valnum table entry. So, this is the Valnum table and the record structure is shown here. So, this is indexed by the hash value of the name. So, there is name and then the value number. So, the difference between these two is that this stores expressions and this stores names. So, we cannot use the same hashing function for both expressions and names and that is the reason why we are using two different tables and two different hashing functions. The third table is the name table its entry is shown here. So, the entry has one field known as the name list another field known as the constant value and the third field known as the constant flag. So, this table is used to store the constant values of certain names. So, if there is more than one name which actually stores the same value then it is formed into a list. So, as I have written here the first name on the name list is the defining occurrence and it replaces all other names with the same value number with itself or its constant value. So, if there are five names here we really need to use only one of them in all places and the other four can be deleted from the program. If there is any computation associated with that name that can also be deleted along with it. So, if the name does not have a constant value associated with it then the constant flag will be false and the constant value a field does not have any relevance. If it indeed has a constant value then this will be said to true and the constant value can be read from this field. So, let us understand the algorithm using a you know an example program. So, let us assume that this is the high level language program that is given to us. So, a equal to 10, b equal to 4 star a, c equal to i star j plus b, d equal to 15 star a star c, e equal to i and then again c equal to e star j plus i star a. So, when we generate intermediate code you know for this sequence for this sequence it obviously it will be split into many instructions. For example, we may have a equal to 10 then b equal to 4 star a as it is. So, now the i star j goes into a temporary and then we have c equal to t 1 plus b then again 15 star a goes into another temporary and d becomes t 2 star c then e equal to i remains as it is t 3 is e star j t 4 is i star a and c becomes t 3 plus t 4. So, we have these 10 quadruples in our basic block. So, let me explain how the value numbering technique is applied on this program. So, we start with the first quadruple. So, remember that there are 3 tables the first one is the hash table for expressions, the second one is the name table for names, the third one is the rather the second one is the Valnum table for names, third one is the name table for constants. So, this is a equal to 10. So, the quadruple after value numbering remains the same what we do is enter this a into the Valnum table and also the name table because this is a constant and the effect of that is seen in the second quadruple itself. As soon as we have 4 star a you know and of course, I forgot to mention one thing. So, a is a new name that we have encountered. So, as soon as we enter it into the hash table we are going to generate a new value number for it and assign a that particular value number. So, when we have b equal to 4 star a the first thing we do is search for a in the Valnum table it is indeed found there and then you know we are using its value number we can search the name table and find the value to be a constant that is 10. So, immediately we know that 4 star a can be computed at compile time itself you know it does not vary during execution. So, 4 star a can be computed as 40 and the quadruple can be rewritten as b equal to 40. Now, b is a new name. So, we generate a new value number for it enter it into the Valnum table and assign it a new value number. So, this is the second thing that happens then we have the third one i star j similarly i is you know i and j are new names and they need to be entered into the appropriate entries into the table i star j is a new expression. So, we need a different value number for the expression also. So, the important thing is the name t 1 and the expression i star j will be assigned the same value number. So, whenever we find t 1 we know it is i star j and whenever we find i star j we know that it is t 1 itself and rather we can use t 1 for that i star j. So, now this quadruple remains the same the next one is c equal to t 1 plus b. So, t 1 is already in the tables b is also present in the tables we find that b is a constant it is actually from the name table. So, we can rewrite the quadruple as c equal to t 1 plus 40 and this new quadruple is now hashed into hashed and then entered into the hash table. The same thing happens with t 2 equal to 15 star a. So, a is found as a constant. So, it can for 15 star a becomes 150 and the quadruple is rewritten t 2 star c becomes 150 star c e equal to i remains the same the important thing here is e and i now get the same value number i had the same value some value number before. So, now e also gets the same value number now when we search for e star j in the quadruple t 3 equal to e star j the value number of e is the same as the value number of i and value number of j already exist. So, in fact we end up hashing e star j into the same slot as i star j because the value number of e is the same as the value number of i. That means this particular you know once the value number is the same the expression remains the same. So, essentially we can rewrite this t 3 a quadruple as t 3 equal to i star j. In fact, we can even delete this quadruple because all occurrences of t 3 can be replaced by t 1 you know because t 1 is also i star j then t 4 becomes i star j which is i star 10. And finally, we have another occurrence of c defining occurrence t 3 plus t 4 the old c is not relevant anymore. So, this c is given a new value number and entered to the tables and instead of t 3 we are going to use t 1 and t 4 remains as it is. So, 5 this particular t 2 equal to 150 is not necessary anymore because it is only use of t 2 was only in d equal to t 2 star c. And once we have expanded t 2 to 150 this quadruple is unnecessary this of course, is a common sub expression i star j has already been computed. So, we do not have to compute i star j again and we can use t 1 in place of t 3 we have already done that. So, quadruple number 8 can also be deleted. So, this is how we catch common sub expressions and this code is red code because we never use the value of t 2 again we can eliminate such red code and eliminate common sub expressions as well. And in the meanwhile we have all this has been possible because we propagated the value of a to these quadruples it was propagated to this place and this place and of course, this place. And here for 4 star a and 15 star a we also did constant folding that is evaluation of the expression at compile time. So, what I have written here is a record of what I just now explain. So, a equal to 10. So, a is entered into the valum table with a v n of say 1 and into the name table with a constant value of 10 then b equal to 4 star a. So, a is already found in the valum table its constant value is 10 in the name table. So, as I already explained we have performed constant propagation. So, we have evaluated 4 star a. So, that means we have performed constant folding. Now, b is entered into the valum table now a new value number say 2 is given to it and into the name table with a constant value of 40. T 1 equal to i star j. So, as I already said i and j are entered into the tables i star j is entered into the hash table and T 1 is entered into the valum number table with the same v n as i star j. Similar actions continue till e equal to i now e gets the same value number as i. So, in e star j e you know we have e and i with the same value number. So, e star j is nothing, but i star j i star j has already been entered into the table. So, we have got a common sufficient when we search for e star j. So, from now on T 3 it can be replaced by T 1 and this can be deleted. So, this of course T 3 plus T 4 already exist and have a new value numbers. So, T 3 plus T 4 is entered into the hash table the assignment is to c. So, c gets a different value number and the old c should be killed. So, quarter pulse are enumbered after the deletions. So, let me show you the tables after we have completed all the operations. So, i star j was entered into the hash table you know let us begin here. So, a got a value 1, b got a value 2 then we had i star j. So, a e and j got values of 3 and 4 then i star j was entered into the table with the value of 5. So, T 1 was assigned i star j. So, it gets a value of 5 then we had c. So, it got a value of 6 which is the same as T 1 plus 40 then I will explain 11 very shortly. So, T 2 gets a value 7 and this 150 star c gets 8 and d also gets the same value because it was an assignment. Then observe that e has the same value number as i. So, e has the same value as 3 which is i's value number. T 3 has the same value number as T 1. So, that is 5 and finally, T 4 has the value number 10. So, i star 10 has the value number 9 and i T 1 plus T 4 has the value number 11. So, this is how the value number table is used for the various optimizations. Now, I mentioned that it is possible to exploit the commutativity of operators. So, in other words if we have an expression i plus j in the hash table or let us say j plus i in the hash table and we try searching for the other. So, we search for an expression i plus j it fails then try searching for j plus i may be j plus i was already present in the table and i plus j was not present, but plus is commutative. So, whether we use i plus j or j plus i the value remains the same and therefore, we can conclude that i plus j and j plus i are equivalent in all cases. So, searching for one of the two forms and then using the value number of that particular expression which was found in the hash table will not change in the value or rather the result of produced by the program. So, they can be deemed as equivalent and given the same value number. So, we do not even have to enter it into the hash table. So, because we always search both i plus j and j plus i in this hash table if one of them is found we assume that there is an equivalence. If there is a quadruple i plus 0 it can be replaced with the quadruple x equal to i and similarly, the quadruple j star 1 can be replaced with j. So, the assignment becomes y equal to j. So, in both these types of replacements the value number of x becomes the value number of i and value number of i becomes the value number of j. So, quadruple x host left hand side variables are used later can be marked as useful all other unmarked quadruples can be deleted at the end. So, I also mentioned that c gets a new value number 11 because it was assigned again the old value of 6 is not used anymore. So, such value numbers you know the names are killed and then reused. So, the value number 6 never be used again. So, it should be the value 11. So, that is how you know there could be possibilities of i and j getting changed you know. So, and then we recompute the value of i star j. So, in such cases the second occurrence of i star j cannot be the same as the first occurrence. So, as soon as there is a there is some you know i star i and j are changed. So, we need to make sure that we do not use the old values of i star j. So, that is another thing that we need to keep in mind when we are using the value numbering technique. Of course, sometimes it happens automatically as soon as the variable i is reassigned value the old value number of i is thrown away. So, we will never hash into the same location as the old value of hash value of i star j. So, that is in some sense automatically taken care of if we do this killing for the old value of c. The next important thing is how to handle array references. Scalar variables do not pose this problem at all, but consider the assignment of arrays. So, we have the first quadruple x equal to a i and then we have another quadruple following it a j equal to y. So, it is not necessary that these two are in the same you know immediately after one another there could be other quadruples between one and two which have nothing to do with either x or a. So, in that sense this is the next quadruple which is relevant to this first statement a j equal to y suppose i and j are the same. So, then this quadruple really is a i equal to y that means the old value of a i has changed. So, when we take another quadruple z equal to a i and suppose i and j were the same we really cannot use the old value of a i which is available in x. So, in other words we cannot replace the above sequence by x equal to a i j j equal to y and z equal to x. So, we are trying to do common sub expression elimination here by not using a i by rather using reusing the value of a i which is present in x, but this is illegal because most of the time we do not know whether i can be equal to j or not equal to j. If we definitely know that i and j cannot be the same then we can use this sequence, but if we have no idea whether i and j are equal or not equal then it is not possible to rewrite the sequence in this fashion. So, the effect is when a j equal to y is processed during value numbering all the references to r a a so far are searched in the tables and are marked as killed. So, there could be many such references in above this as soon as we get a j equal to y this is an assignment to a. So, there will be many references to a i a f k a f l etcetera above this a j equal to y all these references will have to be killed they cannot be reused anymore because we do not know whether j equals any of those index values which have occurred earlier. So, in this sense the x equal to a i quadruple will also be killed once it is killed the automatically z equal to a i will not replace it as will not be replaced by z equal to x. So, as I already mentioned the fresh table entries have to be made for z equal to a i and unless we know i is not equal to j we cannot really do this comma section elimination. A similar problem arises when we handle pointers and also procedure calls. So, first let us look at pointer references. So, there is x equal to star p and then after a couple of statements which do not affect x or p we have star q equal to y. So, suppose p and q could be pointing to the same object. So, in that case we really in effect have star p equal to y which implies that the object p is pointing to has changed. Therefore, even though x stores the object pointed to by p we cannot say z equal to star p becomes z equal to y rather z equal to x. So, in this sequence is illegal that is because in most cases we have no idea whether p and q are pointing to the same object or not pointing to the same object. If pointer analysis has been carried out then the result of that pointer analysis may affirmatively say p and q do not point to the same object. So, in such a case this sequence can indeed be replaced by this sequence otherwise we cannot. So, if we do not know whether p and q point to the same object as soon as we process star q equal to y all the table entries created so far will have to be killed. So, p and q can point to any object in the basic block because we have no idea what p and q point 2. So, we must assume that it can point to any object in the basic block. So, every one of the entries in the table up to this point will have to be marked as killed that means the table is automatically emptied and when we use when we process z equal to star p the killed quadruples are not used for commons of expression elimination. Fresh table entries have to be made for z equal to star p and of course, as I already said if we know that a priori which objects p and q point 2 then table entries corresponding to only those objects need to be killed. We do not have to kill all the objects all the statements and all the objects in the table. Procedure calls are handled in a similar way they also have similar side effects for example, if we did not analyze the program the analysis is called data flow analysis. We need to assume that a procedure call can modify any object in the basic block how can this happen. Suppose we have call by reference parameters and we change them call by reference parameters when they are changed the originals also get changed. So, the variables which are accessed in the basic block will also or anywhere else you know with the same name as the call by reference parameter will get changed. Similarly, if there are global variables within procedures and the global variables are changed within the procedures they will also affect the other variables of the basic block because global variables may be used anywhere in the program they will also be used in the basic block. So, all these will affect the variables in the basic block. Why are we looking at only the basic block the reason is the value numbering technique restricts itself to the basic block. So, its effect will not percolate to other basic blocks. So, we do not have to worry about we are not detecting common sub expressions across basic blocks at this time. So, we do not really have to worry about the effect of all this beyond the basic block, but if the procedure call can be separated into a different basic block then this problem automatically gets eliminated because we do not consider effects across basic blocks. If that is not so, if the procedure call is in the midst of a basic block then all the table entries created so far for that basic block will have to be marked as killed and no common sub expressions can be detected and used. Therefore, usually we make a procedure call into a separate basic block along with its parameter evaluation and so on. So, now having done this local optimization on basic blocks. So, is it possible to actually process a sequence of basic blocks and say look let me do you know common sub expression elimination value numbering etcetera on these extended blocks. It is indeed possible, but before that we will have to define what we mean by an extended basic block. A sequence of basic blocks B 1, B 2 etcetera B k such that B i is the unique predecessor of B i plus 1. So, if we have B 1 and B 2, B 1 must be the unique predecessor of B 2, B 2 must be the unique predecessor of B 3 etcetera etcetera. And B 1 is either the start block or has no unique predecessor. So, then you know it must be the start block or it should have no unique predecessor. So, we will see what that really means. Extended basic blocks with shared blocks can be represented as a tree. I will show you an example of doing this as well. And shared blocks in extended blocks requires coped versions of the hash tables, name tables and valum tables. The new entries must be purged and change entries must be replaced by old entries. So, we use a pre-order traversal of the extended basic block in order to perform the value numbering. So, now let me give you an example of extended basic blocks and then continue. Here is a flow graph. So, there are many basic blocks start then B 1, B 2, B 3, B 4, B 5, B 6, B 7, B 8. So, let us form extended basic blocks from this particular flow graph. So, we start at the first basic block and then we go to the next basic block which it leads to which is B 1. And now start and B 1 are indeed in the same extended basic block. Now, let us try going to B 2. Unfortunately, B 2 does not have a unique predecessor in B 1. It also has another predecessor which is B 7. So, therefore, we cannot include B 2 into the same extended basic block as start and B 1. So, that is why we have separated start and B 1 into a different you know extended basic block. So, now we again begin at B 2. We go to B 3. B 3 has a unique predecessor B 2. Then we go to B 5. B 5 has a unique predecessor B 3. We go to B 7. B 7 unfortunately has other predecessors as well. So, our search for extended basic block must blocks which can be included in the extended basic block stops here. So, we have B 2, B 3 and B 5 as another extended basic block. So, it is listed here. So, if we take the other path we have B 2, then B 3, then B 6. Again it has a unique predecessor B 3, but we cannot go to B 7 for the same reason that it has many predecessors. So, B 2, B 3 and B 6 will be another extended basic block. Then we have B 2, B 4 and we cannot include B 7 because of the same reason that it has many predecessors. So, B 2 and B 4 will form a separate extended basic block. Finally, B 7 and stop will form the last extended basic block. So, we have five such extended basic blocks out of which start comma B 1, B 7 comma stop do not share blocks with any others. So, these two are separated as two different extended basic blocks. Now, the other three share blocks for example, B 2 is shared and then B 3 is shared. So, we can actually represent these extended basic blocks using a tree structure. So, B 2, B 3, B 5 is one extended block, B 2, B 3, B 6 is another extended block. So, they share not only B 3, they also share B 2 and B 2 is shared between these two extended basic blocks and also this B 2, B 4 extended basic block. So, this forest of trees represents the set of extended basic blocks that can be formed out of this flow graph. How do we apply value numbering on such trees? That is our question. Applying value numbering on this tree which consists of start and B 1 is straight forward. We really can merge all the information in rather quadruples in B 1 with start and apply hashing. The same applies to B 7 and stop as well, but the middle one is a little more complicated. We must perform a pre-ordered traversal starting from the root of the tree. So, we will start at B 2. Now, the old value numbering technique can be applied to the contents of B 2. Absolutely no problem there because it is just one single basic block. After the processing of B 2 is complete, we go to B 3. Now, because B 3 has a unique predecessor B 2, there is nothing wrong in detecting common sub expressions using an information and tables available in from B 2. So, B 2 and B 3 in some sense can be combined together to give you more benefit. So, we apply, we just extend the hash tables and other tables of B 2 by including the entries of B 3 as well, but a lot of caution here we may want to undo this effect little later as I will explain. So, we must keep these entries separated from the entries of B 2. So, B 2 and B 3 together now you know detect common sub expressions. So, there may be something here which is reused here. So, we do not have to keep that quadruple anymore. So, once the processing of B 3 is completed, we go to B 5. The same argument holds here as well. The quadruples of B 2 and B 3, the tables of B 2 and B 3 can be reused to give advantage in B 5 as well. So, expressions which are available in B 2 and B 3 can be reused here. So, common sub expression detection becomes even more effective. So, we extend the tables of B 2 and B 3 with the entries of B 5, but I already mentioned that we need to keep them separate even though virtually logically they are connected to the tables of B 2 and B 3, we will have to mark them as new entries. The reason is once the processing of B 5 is completed, we go back to B 3 and the pre-order traversal now takes us to B 6. So, that means the effect of B 5 will have to now be undone. The reason is the effect of B 2 and B 3 can be seen in B 6, the sub expressions of B 2 and B 3 can be still reused in B 6, but the tables of B 5 are not useful in processing B 6. Obviously, the control in the control flow graph will either go from B 2 to B 3 to B 5 or go from B 2 to B 3 to B 6. It will never go from B 5 to B 6 and therefore, the tables of B 5 will have to be thrown away, the tables of B 2 and B 3 alone will have to be kept and if anything has been marked as killed in B 2 and B 3 that effect will have to be undone and the tables which were present just before the entry to B 5 took place will have to be restored. Now, we can process B 2, B 3 and the entries of B 6 can take advantage of these. So, we do that and then we the pre-order traversal goes back unwinds goes to B 2 and then it will visit B 4. Now, it is time to undo the effects of not only B 6, but the effect of B 3 as well. So, the entries of B 3 and B 6 will have to be thrown away, the changes which were done to B 2 because of the processing of B 3 and B 6 will have to be undone. The old table which was present just before the entry to B 3 took place will have to be restored and then we enter B 4 process B 4 completely taking advantage of the entries of B 2. So, this is what I meant here when we said the shared blocks in extended basic blocks required scoped versions of tables. So, these are very similar to the scoped versions of symbol tables which we used when we processed blocks nested blocks and nested procedures. So, now the function visit ebb tree for you know does the value numbering on these extended basic block trees. So, it is quite simple. So, whenever we enter the enter a particular node and function visit ebb tree is called on that node, the new names will be entered with a new scope into the tables. When searching the tables we always search beginning with the current scope and move to the enclosing scope. In other words when we are processing B 5 we must search the tables for B 5 first and then the tables for B 3 and then the tables for B 2. So, this is very similar to the processing involved in symbol tables for lexically scoped languages. Now, so we call value number e dot b. So, process the block e dot b using the basic block version of the algorithm. So, let us assume that this does that. Now, these two are nothing but the pre-order traversal lines. If e dot left not equal to null then visit ebb tree on e dot left and then the right one if e dot right not equal to null visit ebb tree e dot right. So, after these two left and right sub trees have been visited we remove the entries for the new scope from all the tables and undo the changes in the tables of enclosing scopes. So, the main calling routine simply says for each tree t do visit ebb tree t. So, we just do this for all the trees in that forest and that completes the value numbering. So, this brings us to the end of the lecture on local optimizations. So, welcome to the lecture on machine code generation. So far in the previous lectures we have studied you know lexical analysis syntax analysis intermediate code generation semantic analysis and then we also saw some of the simple optimizations which can be local optimizations which can be done on the intermediate code. So, now it is time to understand how machine code can be generated from the intermediate code and after this long session on machine code generation we will move on to the machine independent optimizations and so on and so forth. So, in this lecture we will first understand the main issues in machine code generation. We will look at a few samples of generated code consider very simple code generators two of them to be very precise and then we consider the very important topic called optimal code generation which you know in fact generates the best possible code. We study three types of optimal code generators. The first one is the classical code generation algorithm due to Seti and Ullman dating back to 1970. The second one is the dynamic programming based algorithm again this is this dates back to 1976 or so, but it is still in use today. The third is the tree pattern matching based algorithm for machine code generation which uses dynamic programming and also tree patterns and after this we will understand how to generate code from directed acyclic graphs. The reason is we definitely you know do local optimizations using directed acyclic graphs. So, what we get after the optimization of the basic block is still a DAG. So, we must know how to generate machine code from DAG and then we will look at a special class of optimizations called people optimizations which are carried out on the machine code. So, here what exactly is machine code generation it is a transformation it is a transformation from intermediate code to machine code. So, when we say machine code machine code could be in binary form or it could be in assembly form this really does not matter to us because both assembly code and binary code are equally useful. It is just that if we generate assembly code we need to use an assembler to transform the assembly code to binary whereas, if we want to generate binary code directly then we may end up doing all the work of the assembler itself. So, in our lectures we will assume that we are generating assembly code and not binary. Of course, as usual we will assume quadruples and the control flow graph to be available to us. So, that we can process them to generate code. So, that is the transformation we talk about intermediate code to machine code. So, in this transformation which instructions should we generate the problem is for some of the quadruples it is you know definitely correct to have more than one sequence of machine code in this transformation. For example, take a simple increment a equal to a plus 1 it increments the value of location a. So, for this quadruple we may actually generate a single machine instruction called increment a if such an instruction exists in the machine. In some machines increment operation cannot be carried out on memory it can be carried out only on registers. So, in some other machines there may be no instruction for increment at all. So, in such machines we may want to generate load a comma r 1 this is the risk architecture style then either add 1 to r 1 or increment r 1 then store r 1 to a. The reason we had to do this was the machine performed all the arithmetic only on registers the only operations from and to memory were the load and store. So, we had to bring the operand into the rather the a there is nothing, but the operand of the right hand side we had to bring it into a register alter its value and put it back into the location. So, one of these you know obviously this decision will be based on the machine. So, this is 1 to many mapping which mapping is really good for us will have to be chosen by the code generator. One sequence actually may be faster than the other sequence in certain cases in this case of course, increment a if we had both options we would want to generate increment a because this is just one instruction and much faster than executing three instructions. So, the implication is there is a cost attached to the execution. So, if we are able to take care of compute the cost of the instructions which are generated then or the rather may be generated we may be able to make a decision regarding the sequence that is to be generated. The next one the next issue is in which order should the machine instructions be emitted. So, some orders may use fewer registers and or may be faster. So, in such a case the again the cost implication comes into picture and the number of registers used etcetera comes into picture the compiler would like to evaluate these possibilities and do its best. The next issue would be which registers are to be used in the machine code that is generated. So, optimal assignment of registers to variables is very difficult to achieve. In fact, the problem is very famous it is called as the register allocation problem. So, this will say certain variables it will the allocation algorithm will decide on which variables should be placed in registers and which variables need not be placed in registers. So, optimal assignment of such registers is such you know registers to variables is a very hard problem NP complete. So, there are heuristics which can be used for that purpose. We will study these heuristics at a later point in time. Should we optimize for memory time or power? So, I already mentioned in the lecture on optimization that we can optimize the space the time or energy it power consumed by the program. Is the code generator easily retargetable to other machines? The problem is code generation is very tricky and very difficult. So, if we are able to actually write a specification of the machine for which we want to generate code and the code generator can be produced automatically from the specifications of the machine that is the best scenario. So, in such a case we simply have to you know have a code generator generator for which you know processes the different machine specifications and then easily we can generate code generators for any type of machine. So, in the absence of that we will end up writing machine code generators for each one of the machines by hand. So, let us look at some of the samples of generated code to begin with I have listed actually the more complicated types of statements here straight forward quadruples such as a equal to b plus c are very easy to handle. So, let us look at the more complicated ones such as involving arrays and pointers and conditional statements and so on. So, this is a quadruple b equal to a i. So, first of course, this means we will have to take the ith element of the base address a from the base address a. So, remember this a i it does not mean it is a single dimensional array. We have already understood how to translate you know multi dimensional array references to single dimensional memory reference. So, this a really can be the base address or the starting address of a multi dimensional array it need not be the starting address of a single dimensional array, but for our purpose it really does not matter at this point it is a sequence of bytes and we simply want to take the you know appropriate byte from starting from the address a and rather the appropriate 4 or 8 bytes from the starting address a and then put it into b. Let us see how the machine code for this looks like. So, we load i into the register r 1 and then we multiply the value of i by 4 the reason is we are assuming that every entry in the array a corresponds to 4 bytes. So, we say r 1 equal to r 1 star 4 then we use indexed mode of addressing. So, starting from the base address a this is the index a of r 1 comma r 2. So, take the contents of a plus r 1 and put it into r 2 finally, store r 2 into b. So, we have to generate 4 instructions just for this one intermediate code. The same is true for x j equal to y as well we have load y comma r 1. So, no problem load j comma r 2. So, that is the left hand side part then multiply r 2 by 4. So, that takes us to the appropriate place within the array where we can load y and then store r 1 comma x of r 2. So, x of r 2 gets r 1. So, this is the indexed assignment. So, this is the sequence of instructions generated for just one intermediate code here. Similarly, x equal to star p translates to load p comma r 1 p is a pointer an address and then with you know this is nothing, but indirect addressing 0 r 1 implies address of contents of contents of r 1 goes to r 2 and we store r 2 to x. Star glue equal to y is similar. So, load y r 1 and load q r 2 and store r 1 comma 0 r 2. So, again this is the indirect mode of addressing. If we have x less than y go to l then you know we load x into r 1 we load y into r 2 compare r 1 and r 2 and branch on less than 0 to l. So, this is the sequence which appropriately you know performs gives you the same effect as if x less than y go to l. So, now let us look at slightly more complicated scenario involving activation records static allocation dynamic allocation sub routine jump etcetera etcetera. The three address code has say code for this is the code for function f 1. So, this is these are the two functions that we have. So, within the code for function f 1 we have some action code segment a number of instructions in intermediate code and then there is a procedure call. So, in this example we are concentrating on the procedure call and activation records etcetera the rest of the intermediate codes are not very important to us at this point. So, there is a call f 2. So, we are calling this function f 2 action code segment 2 after the return we execute this piece and then halt code for function 2 has an action code segment and then a simple return. So, we are looking at a static allocation and we assume that there are no jump sub routine instructions in the machine instruction set. So, we will have to actually do a jump. So, because we have to do a jump we also need to store the return address in the activation record itself. So, the activation record is a static activation record in other words it will never change you know the same static the activation record is used for all invocations of the function f 1 and f 2 no recursion is possible with static allocation. So, this does not create a problem for us. So, here is the first one in the offset 0 we starting from the top we store the return address then we have a data array a we have the variable x and variable y. So, totally 48 bytes are needed for this f 1 similarly for f 2 we require 76 bytes return address there is a parameter as well then another an array and a variable m. So, here is the code that may be generated by a compiler code for function f 1. So, say we start from address 200 then there is the code for segment 1. Now, it is time to call the function. So, we have to you know move hash 264 comma 648 264 is nothing, but the address of the return address where we need to return then the parameter is moved into the location 652 which is the place for the parameter then we have a jump which is the call to this you know this 400 is the function right and then we have in this function when we want to return we take the return address from the stack and then jump at 648 648 stores the return address. So, we come back to this action code segment for 2 execute it and then we halt. So, and here is the activation record format for f 1 600 to 647 and then the action activation record for f 2 648 to 723. So, we will stop the lecture here and continue in the next class. Thank you.