Welcome to the set of lectures on intermediate code generation. In this sequence of lectures, we are going to learn about the different types of intermediate code and why intermediate code is required. We will also see how attributed translation grammars can be used to generate intermediate code for various constructs. To begin with, and to put the intermediate code generation phase in the right perspective, let us consider the compiler overview diagram that we have seen many times so far. Once the character stream goes through the lexical analysis, syntax analysis and semantic analysis stages, we get the annotated syntax tree, over which intermediate code generation can be performed. The output of this phase is sent to the machine-independent code optimizer. So, this is the perspective of intermediate code generation. We will look at the generation of intermediate code using SATGs, that is, synthesized attributed translation grammars, and we will also look at some aspects of code generation using LATGs, that is, L-attributed translation grammars. Let us first see the other use of intermediate code, in interpreters. What is the difference between compilers and interpreters? Compilers generate machine code, whereas interpreters generate intermediate code and then go on to interpret that intermediate code themselves. When we say intermediate code is interpreted, the implication is that the entire runtime environment required by the program is also provided by the interpreter. In cases such as Java, the intermediate code is produced by the compiler and there is a separate interpretation phase, whereas in other languages such as Perl, Python, the Unix shell, BASIC or Lisp, the compilation to intermediate code and the interpretation happen in the same program.
Obviously, interpreters are much easier to write and can provide better error messages than a compiler, because the optimization and machine code generation phases are absent in an interpreter. The symbol table is still available to an interpreter, and therefore error messages are easier to produce and can be more informative. The catch is that interpreters are very slow: interpreted code runs at least 5 times slower than the machine code generated by compilers. To offset this deficiency, the Java runtime system provides what is known as just-in-time (JIT) compilation. With a JIT compiler, the intermediate code is compiled into machine code and then run. This is very useful if the code is going to be run again and again; in such cases, JIT-compiled code is probably very close in execution speed to compiled code. Interpreters also require much more memory than the machine code generated by compilers because, all said and done, interpreters also hold the symbol table and other data structures, and they need to simulate the entire machine environment in which the code is supposed to run. As I already said, Perl, Python, the Unix shell, Java, BASIC and Lisp are examples of interpreted languages, whereas the compilers for C, C++, Pascal and many other languages produce machine code. Now, the big question that needs to be answered properly: why do we require intermediate code at all? The other option is this: we have source languages and target machines, so we simply write a compiler for source language A and target machine X. Why can this not be done? Let us look at the implications of this process.
Let us take an example. There are four source languages and three target machines, and we want to implement all four source languages on all three machines. To begin with, we obviously require four front ends, which do lexical analysis, parsing and semantic analysis. In this setting, any intermediate code is immediately converted to machine code within the compiler, or it may not be produced at all. If it exists, it is at a very high level, such as an abstract syntax tree, and not at a lower level such as the quadruples that we are going to use in our study of intermediate languages. So, for all practical purposes, we can say that the source language is directly compiled into machine code. We have four front ends, which do the first part of compilation; then we require 4 × 3 = 12 optimizers, which optimize the code, and we also require 4 × 3 = 12 machine code generators. Really speaking, optimization and code generation are interleaved here, because we produce machine code and optimize the machine code itself; there is no intermediate code. Some of the optimizations are done on basic blocks in the machine code, whereas others are done on loops and so on. This is a fairly heavy investment: for each language-machine pair we require an optimizer and also a code generator. Let us see what happens when we have an intermediate language. We certainly still require the four front ends, which go up to semantic analysis and produce intermediate code.
So, instead of producing machine code, the four front ends produce intermediate code. Just one intermediate code optimizer is then enough, because all four source languages compile into the same intermediate language. The intermediate code is then compiled into machine code, so we require three machine code generators, one per machine. The extra work in the first case is considerable: we require a large number of optimizers and machine code generators, whereas here we are able to reuse the optimizer and the machine code generators. Of course, you may argue that the front ends in the two cases are not the same, because in the first case the front end does no intermediate code generation, whereas in the second it does. But producing intermediate code is very simple, as we are going to see, and it is definitely not as difficult as writing so many optimizers and code generators. So, that is one of the problems with the direct approach: too much code to write and too much code to debug. The other problem is that we are not able to reuse the code that we have written. The code optimizer is one of the largest and most difficult components of a compiler to write. In the direct approach we have a machine code optimizer, not an intermediate code optimizer that is independent of the machine language, so we cannot reuse the optimizer written for one language and machine in another compiler; each one has to be rewritten. If we produce intermediate code, the machine-independent code optimizer is a single piece of code that can be reused with all the compilers. So, this is a very efficient solution to the problem of producing many compilers for many source languages and machines. What are the various types of intermediate code available in the literature?
To answer that, we must first understand the level at which intermediate code is positioned. First of all, intermediate code must be easy to produce, and it must be easy to translate into machine code; it sits somewhere in between the source language and the machine code. You can think of it as a sort of universal assembly language. Because it is supposed to be independent of any machine, it should not contain machine-specific parameters such as registers, addresses and so on. The type of intermediate code deployed depends on the application. For example, we have quadruples, triples, indirect triples and abstract syntax trees. These are the classical forms of intermediate code, and they are used for machine-independent optimization and machine code generation; that is their traditional use. More recently, and by recently I still mean about 10 to 15 years ago, the static single assignment (SSA) form was invented. This form is very effective for certain types of optimizations. For example, conditional constant propagation and global value numbering are far more effective on SSA form than on the traditional intermediate codes such as quadruples and triples. Finally, the program dependence graph, or PDG, has been in use for many decades in the automatic parallelization of code, and it is also useful in the instruction scheduling and software pipelining phases of the machine-dependent optimizer. So, these are the various forms of intermediate code, starting with the classical forms and then SSA and the PDG. We are going to study these forms of intermediate code in the coming lectures. Let us now look at a conceptual intermediate code called three-address code.
Let me emphasize that three-address code is really a generic form of intermediate code, and it can be implemented as quadruples, triples, indirect triples, trees or DAGs; I will give you some examples very soon. In three-address code, the instructions are extremely simple. Here are three examples: a = b + c, x = -5, and if a > b goto L1. We will see many more as we go on. In assignment statements of this kind, either a = b + c or x = -5, the LHS is the target and the RHS has at most two sources and one operator. In a = b + c, + is the operator and b and c are the sources. Why did we say at most two sources? In simple unary instructions such as x = -5 we have just one source, so there is a maximum of two and a minimum of one. The branch statement fits the same pattern: L1 is the target, a and b are the sources and > is the operator. RHS sources can be either variables or constants. So we can say a = b + 1, and we can say a > 2, but we definitely cannot say 2 = b + c; the left-hand side must always be an address. Let us take a simple expression, a + b * c - d / (b * c). It is interpreted according to the usual precedence of the operators: multiplication and division take precedence over addition and subtraction, + and - are at the same level, and / and * are also at the same level. The first intermediate instruction is t1 = b * c, because we cannot do a + b first; we have to do b * c first. Then the second instruction is t2 = a + t1.
So, we have evaluated b * c and then a + t1. The third instruction is again t3 = b * c, because we cannot do the division before we evaluate its divisor; and since division has higher precedence than minus, the division must be done before the subtraction, and the multiplication even earlier. Then we have t4 = d / t3 and finally t5 = t2 - t4. A few points have to be emphasized here. The form of the intermediate code is that of three-address code: we have one binary operator and two operands in each of these instructions. More important, the left-hand sides are all temporary variables, which are generated during the intermediate code generation phase. It is very important to remember that intermediate code employs a large number of temporaries, and these are generated as and when we require them. There is usually no reuse of temporaries after their work is over; we just generate new temporaries and go on using them. The optimization and machine code generation phases take care of eliminating the redundant temporaries. Here are several implementations of this three-address code: we have the three-address code itself, then quadruples, triples, the syntax tree and the DAG. Traditionally, three-address code has been used as the textual form, and the other four are used as data structures inside the compiler. The quadruple gets its name from the four fields in each instruction: op, arg1, arg2 and result. It is in fact possible to show even jumps in the same format: as I said, the result is the jump target, arg1 and arg2 are the arguments of the comparison, and op is the relational operator. So, the quadruple table is just a listing of the three-address code; there is nothing very special here.
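The generation of temporaries and quadruples described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested-tuple expression format and the function name gen_quads are hypothetical, chosen for this sketch, and are not part of the lecture.

```python
from itertools import count


def gen_quads(expr):
    """Translate an expression tree into quadruples (op, arg1, arg2, result).

    expr is either a variable name (str) or a tuple (op, left, right).
    This tuple format is an assumption made for the sketch.
    """
    quads = []
    temps = count(1)  # fresh temporaries t1, t2, ...; never reused

    def walk(node):
        if isinstance(node, str):  # a variable needs no code
            return node
        op, left, right = node
        arg1 = walk(left)          # operands are translated left to right
        arg2 = walk(right)
        result = f"t{next(temps)}"
        quads.append((op, arg1, arg2, result))
        return result

    walk(expr)
    return quads


# a + b * c - d / (b * c), with the usual precedence made explicit in the tree
expr = ("-", ("+", "a", ("*", "b", "c")), ("/", "d", ("*", "b", "c")))
for op, arg1, arg2, result in gen_quads(expr):
    print(f"{result} = {arg1} {op} {arg2}")
```

Each printed line corresponds to one quadruple; b * c is translated twice, because this simple-minded scheme makes no attempt to detect repeated subexpressions.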
This is self-explanatory. Triples are slightly different: we do not show the temporaries explicitly. Let us go through them. The first instruction, number 0, is * b c, and no temporary is shown. When we want t2 = a + t1, a appears as it is, and instead of t1 we provide the index of the instruction that computes that operand; in this case that is instruction 0, the * b c. So instruction 1 is + a (0). Next, instruction 2 is again * b c. Then we have / d (2), which is t4 = d / t3: t3 is computed by instruction 2, so we provide its index, 2. Finally, for the minus we need t2 - t4; t2 is instruction 1 and t4 is instruction 3, so instruction 4 is - (1) (3). Really speaking, this is nothing but a straightforward encoding of the syntax tree in array form. If you look at the tree, this is easy to see: we have b * c here, then a + b * c, then b * c again, then d / (b * c), and finally the minus at the root. What is the directed acyclic graph (DAG) representation of this three-address code? It is very similar to the tree, with the difference that whenever an expression is already available, we do not recompute it but simply make the operand pointer point to it. In this case b * c has already been computed, and its subtree is right here, so we just point the right operand of the / to that subtree. That is why this is a directed acyclic graph and not a tree. The important difference between the DAG and all the other forms of intermediate code that we have here is that the DAG catches what are known as common subexpressions.
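The triple encoding and the DAG-style sharing described above can be sketched as follows. This is a minimal sketch, not the lecture's implementation: the nested-tuple expression format, the function name gen_triples and the share flag are hypothetical. Sharing is done by looking up an identical (op, arg1, arg2) entry, which is only valid here because the expression contains no assignments to b or c.

```python
def gen_triples(expr, share=False):
    """Return a list of triples (op, arg1, arg2).

    An operand is a variable name (str) or the integer index of the triple
    that computes it. With share=True an identical triple is emitted only
    once, which gives the DAG view of the same expression.
    """
    triples = []
    seen = {}  # (op, arg1, arg2) -> index of the triple that computes it

    def walk(node):
        if isinstance(node, str):
            return node
        op, left, right = node
        key = (op, walk(left), walk(right))
        if share and key in seen:  # common subexpression: reuse the node
            return seen[key]
        triples.append(key)
        seen[key] = len(triples) - 1
        return seen[key]

    walk(expr)
    return triples


expr = ("-", ("+", "a", ("*", "b", "c")), ("/", "d", ("*", "b", "c")))
for index, triple in enumerate(gen_triples(expr)):
    print(index, triple)
print(gen_triples(expr, share=True))
```

Without sharing there are five triples, one per operator in the tree; with sharing there are four, because the second b * c refers back to triple 0.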
So, no expression is recomputed unnecessarily; it is reused whenever necessary and, of course, whenever possible. Why did I say whenever possible? Suppose either b or c has been assigned a value between the two occurrences of b * c. In that case, this b * c and the prior occurrence of b * c are obviously different, and there is no question of reuse; we recompute b * c. So, this is how three-address code is implemented in practice. In our discussion we will use three-address code in the textual form, and we will say that the machine implementation can use any one of these representations. What are the various kinds of three-address instructions? I gave you only a few examples; now let us look at the exhaustive list. There are several types of assignment instructions: a = b biop c, a = uop b, and a = b. Here biop is a binary operator; it can be an arithmetic, logical or relational operator. uop is a unary operator; it is either an arithmetic operator, a shift operator, a conversion operator or a logical operator. Minus is also included; I missed it. Minus, shift and conversion are all arithmetic types of operators, and the logical operator is the complement operator. What exactly is special about conversion? Minus and shift we already understand. Conversion is useful in converting integers into floating-point numbers, floating-point numbers into integers, characters into integers and so on. We saw in semantic analysis that we look at the coercibility of various types. If coercion is defined by the programming language, then we need to convert the operands into suitable types before we emit the corresponding intermediate instruction. We are going to look at this in the intermediate code generation phase as well. Then we have several types of jump instructions. There is the unconditional goto L.
Here L is the label of the target instruction. Then there is if t goto L: if t is true, then jump to L. And there is if a relop b goto L: if a relop b is true, then jump to L; otherwise continue. Here t is a Boolean variable, which takes the value 0 or 1, and a and b are either variables or constants. Then we have several instructions to take care of function declaration and function call. For the function declaration we require func begin with the name of the function, and func end, an instruction to end the function. To pass a parameter and place it on the stack we require the param p instruction; this is for a value parameter. There is refparam p, which is required for a reference parameter. The different parameter passing schemes will be covered a little later, but for now I should tell you that for a value parameter, the expression passed as the parameter in the high-level language is evaluated and its value is placed as the parameter, whereas for a reference parameter, the expression is evaluated and the address of that value is placed as the parameter. Then there is call f, n, which is an instruction to call a function f with n parameters. There is a return instruction without any value, and there is return a, which returns a value from the function. Then we have indexed copy instructions. Consider a = b[i]. Here b[i] looks like an array access, and indeed b is an array and i is the index into that array. Even though this appears as a single-dimensional array access, we are going to convert multi-dimensional array accesses into such single-dimensional accesses. That is why this is intermediate code: we are breaking down higher-level statements into lower-level statements. a is set to contents(contents(b) + contents(i)). Usually, for a simple array, b is the base address of the array and i is the offset into that array.
So, you take the base address and add the contents of i; that gives the place where the value we want is stored. We access the value at that place and put it into a. These are the semantics of a = b[i]. Similarly, a[i] = b means that the i-th location of array a is set to b. Again, as I said, this may be the result of translating a multi-dimensional array access into a single-dimensional one. You must also observe that we do not have any instruction of the form a[i] = b[i]. This is because each instruction has a single operator, and indexing is itself the operator. In a = b[i], just as * is the operator in a = b * c, b and i are the source operands, indexing is the operator and a is the target of the assignment. Similarly, in a[i] = b, i is an operand and b is another operand, because they are not modified, and indexed assignment is the operator. The latter is usually written as the operator []= and the former as =[]. Then we have pointer assignments. a = &b sets a to the address of b, that is, a points to b. In *a = b, we take the contents of a, treat it as an address, go to that address, and that is where we put b. So contents(contents(a)) is set to contents(b); the effective address is obtained by looking at the contents of a. It is not single-level addressing here; there is an indirect addressing mechanism. a = *b is similar: a is set to contents(contents(b)). The contents of b is an address, so we have to take the contents again, just as in *a = b the contents of a is the address where b is placed.
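The semantics of the indexed and pointer instructions above can be made concrete with a toy memory model. This is a sketch under assumptions: the class Machine, its method names and the address layout are hypothetical, an array name is taken to stand for its base address, and offsets are counted in memory cells rather than bytes.

```python
class Machine:
    """A toy memory model: every name has an address in one flat memory."""

    def __init__(self, layout, size=32):
        self.addr = dict(layout)   # name -> address (hypothetical layout)
        self.mem = [0] * size      # one cell per address

    def get(self, name):           # contents(name)
        return self.mem[self.addr[name]]

    def set(self, name, value):
        self.mem[self.addr[name]] = value

    def index_load(self, a, b, i):     # a = b[i]
        self.set(a, self.mem[self.addr[b] + self.get(i)])

    def index_store(self, a, i, b):    # a[i] = b
        self.mem[self.addr[a] + self.get(i)] = self.get(b)

    def address_of(self, a, b):        # a = &b
        self.set(a, self.addr[b])

    def store_indirect(self, a, b):    # *a = b
        self.mem[self.get(a)] = self.get(b)

    def load_indirect(self, a, b):     # a = *b
        self.set(a, self.mem[self.get(b)])


m = Machine({"x": 0, "y": 1, "p": 2, "i": 3, "arr": 8})
m.mem[8:12] = [10, 20, 30, 40]     # arr occupies addresses 8..11
m.set("i", 2)
m.index_load("x", "arr", "i")      # x = arr[i]
x_after_load = m.get("x")
m.set("y", 7)
m.index_store("arr", "i", "y")     # arr[i] = y
m.address_of("p", "y")             # p = &y
m.set("x", 5)
m.store_indirect("p", "x")         # *p = x, so y changes
m.load_indirect("i", "p")          # i = *p
```

Note that both indirect operations read the contents of the pointer first and only then touch memory at that address, which is the two-level addressing the lecture describes.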
Now we are going to look at a series of programs and the intermediate code that a typical compiler produces for them. The C program declares int a[10], b[10], dot_prod, i; all are integers, and a and b are arrays of size 10. dot_prod is initialized to 0. There is a loop in which i starts from 0 and goes up to, but not including, 10, with an increment of 1 every time. The body is dot_prod = dot_prod + a[i] * b[i]. So we compute the dot product, and the translation is quite straightforward. The declaration has no translation; no code is produced for declarations. We start with dot_prod = 0; this is already in a very simple form, so there is nothing more to do. Then we have i = 0, which begins the translation of the loop. At label L1 we check whether i >= 10; if so, goto L2, which is the exit of the loop; otherwise we execute the body of the loop. First we take the address of a, t1 = addr(a); in fact, the address of a could be derived from the stack pointer value pointing to the place where a is stored. Then we have a second instruction, t2 = i * 4, and then t3 = t1[t2]. Essentially, we are doing a[i] with these three instructions, so you can easily see that a single access a[i] translates to three instructions in the intermediate code. Then we translate b[i], which is t4 = addr(b), t5 = i * 4 and t6 = t4[t5]. Now we do the multiplication: t7 = t3 * t6. Then we add that to the dot product: t8 = dot_prod + t7. Then we must assign it back: dot_prod = t8. Now we do the second part, the increment for the loop: t9 = i + 1 and i = t9.
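To see that such a listing really computes the dot product, it can be written down as data and executed by a tiny interpreter. The sketch below is an illustration only: the tuple encoding of instructions, the opcode names and the run function are assumptions made here, the base addresses 1000 and 2000 are arbitrary, and the listing includes the closing goto L1 and the exit label L2 that complete the loop.

```python
def run(code, mem, addr):
    """Interpret a small three-address listing and return the variable values.

    mem  : byte address -> value, standing in for array storage
    addr : array name   -> base address
    """
    env = {}
    labels = {ins[1]: pc for pc, ins in enumerate(code) if ins[0] == "label"}

    def val(x):  # an operand is a variable name or an integer constant
        return env[x] if isinstance(x, str) else x

    pc = 0
    while pc < len(code):
        op, *args = code[pc]
        pc += 1
        if op == "copy":                 # dst = src
            env[args[0]] = val(args[1])
        elif op == "+":                  # dst = x + y
            env[args[0]] = val(args[1]) + val(args[2])
        elif op == "*":                  # dst = x * y
            env[args[0]] = val(args[1]) * val(args[2])
        elif op == "addr":               # dst = addr(array)
            env[args[0]] = addr[args[1]]
        elif op == "index":              # dst = base[offset]
            env[args[0]] = mem[val(args[1]) + val(args[2])]
        elif op == "if>=":               # if x >= y goto label
            if val(args[0]) >= val(args[1]):
                pc = labels[args[2]]
        elif op == "goto":
            pc = labels[args[0]]
        # a "label" pseudo-instruction needs no action
    return env


DOT_PROD = [
    ("copy", "dot_prod", 0),
    ("copy", "i", 0),
    ("label", "L1"),
    ("if>=", "i", 10, "L2"),
    ("addr", "t1", "a"), ("*", "t2", "i", 4), ("index", "t3", "t1", "t2"),
    ("addr", "t4", "b"), ("*", "t5", "i", 4), ("index", "t6", "t4", "t5"),
    ("*", "t7", "t3", "t6"),
    ("+", "t8", "dot_prod", "t7"),
    ("copy", "dot_prod", "t8"),
    ("+", "t9", "i", 1),
    ("copy", "i", "t9"),
    ("goto", "L1"),
    ("label", "L2"),
]

a_vals, b_vals = list(range(1, 11)), [2] * 10
addr = {"a": 1000, "b": 2000}            # assumed base addresses
mem = {}
for k in range(10):                      # 4-byte integers, as in t2 = i * 4
    mem[addr["a"] + 4 * k] = a_vals[k]
    mem[addr["b"] + 4 * k] = b_vals[k]

result = run(DOT_PROD, mem, addr)["dot_prod"]
```

The interpreter never reuses or removes temporaries, mirroring the point that cleaning up such code is left to the optimizer.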
You should also observe that intermediate code generation produces really dumb intermediate code. It is easy to see that these two instructions are nothing but i = i + 1, but we do not generate that. Similarly, the earlier pair is nothing but dot_prod = dot_prod + t7, but we do not generate that either. That is why the intermediate code generator is a very simple-minded program, and an optimizer is necessary anyway to improve the code. Finally, there is a goto L1, which repeats the loop. So, this is the intermediate code produced; it is just like assembly code for this program. Let us look at a second example: the same dot product program, but using pointers to run through the arrays instead of indexing with a[i] * b[i] as we did here. We have a[10], b[10], dot_prod and i as integers, and then pointers to integers, int *a1 and int *b1. We start with dot_prod = 0, a1 = a and b1 = b, so the pointer a1 points to a and the pointer b1 points to b. The loop header is the same, but the body is different: we write dot_prod += *a1++ * *b1++. What is the meaning of this assignment? We do *a1 first, which gets the contents of the array element, in a way similar to a[i]. Then we must go to the next location in the array. In the earlier program we used a[i], and the i++ took care of progressing to the next element. Since we are not using i to index into the array, we must alter the pointer itself. So, after *a1 we do a1++, which takes us to the next element in the array. Similarly, *b1 gets the contents of that location, and b1++ takes us to the next location. Multiplying these two gives the product, which we then add to dot_prod.
So, that is really the same dot product that we saw earlier. Here the loop variable i is not used in the computation; it is used only for the termination of the loop. Let us see what the corresponding code could be. This is easy: dot_prod = 0, then the pointer assignment a1 = &a, the address of a; similarly b1 = &b, the address of b; then the initialization i = 0. Now the loop. This part is the same: if i >= 10 goto L2. In the body of the loop, first we do t3 = *a1, for which we have an intermediate code instruction; then t4 = a1 + 1, which is the a1++ part; then a1 = t4. These two together do the auto-increment on a1. Then we have t5 = *b1, t6 = b1 + 1 and b1 = t6; that is the *b1++. We do the multiplication t3 * t5 and add it to the dot product in two instructions as before, followed by the loop control. So, this is an example with pointers into the arrays instead of indexing. The third program shows a function for the dot product; these are all variants of the same computation. The function is int dot_prod; it takes two arrays as parameters, int x[] and int y[]. We have d and i as integer variables inside the function. d is initialized to 0, the loop runs exactly the way it used to, and we have d += x[i] * y[i], exactly as in the main program before. Then return d returns the value of the dot product. In the intermediate code, the beginning of the function requires func begin dot_prod, then d = 0 and i = 0 as before. The loop control is also as before, so there is nothing to expand. But here, after the loop terminates, we need to return the value of the dot product and go back to the calling program.
So, return d combines the tasks of returning the value and returning control to the caller. func end, of course, ends the function. The code in the body of the function is not very different from before, so I am not going to explain it all over again: we have t1 = addr(x), t2 = i * 4, t3 = t1[t2], then t4 = addr(y), t5 = i * 4, t6 = t4[t5], then t7 = t3 * t6, t8 = d + t7, d = t8 and so on. Now we should also see how the function is called. What is special about this? It shows how the function is called and how the value it returns is received. In the main program we have int p, int a[10], b[10], and then p = dot_prod(a, b); I have skipped the part where we read values into a and b. We start with func begin main, since main is also a function in C. In C, an array is always passed by reference, so for the first parameter we have refparam a, and the second parameter b is again an array, so it is refparam b; the base addresses of the arrays are passed in these places. Then we also need a place for the result, so we have refparam result. You must keep in mind that this location, result, is in the main program; it is not a part of the function. Therefore, the code generator must be able to produce the appropriate code for the return instruction; we will see that later. Then there is a call to dot_prod, and the number of parameters is 3, including the result: a, b and result. After the call we have p = result, since result will have been assigned a value by the function, and then we have func end. So, these are a couple of examples to show the various constructs in the intermediate code. One final example shows how recursion is handled. We have the famous factorial function here: int fact(n): if n is 0, return 1; otherwise return n * fact(n - 1). It is quite straightforward; there is nothing very special here. We have func begin fact, then if n == 0 goto L1, and at L1 we have return 1.
That is the base case. Otherwise, we compute t1 = n - 1 and push it as a parameter using param t1; then comes the result location, refparam result; then we call fact with two parameters, the first being t1 and the second the result. Then t3 holds the value n * result, and we return t3. As I said, return combines two functions: one is sending a value back to the caller and the second is returning control to the caller. So there is no question of the control flow going on to return 1 after the return t3 instruction; there is nothing to worry about here. That is about the examples of intermediate code, or rather of what intermediate code is produced for various programs. Now let us delve into the details of producing such intermediate code for the various constructs in the language. Let us look at code templates for the if-then-else statement. We already know the forms of the statement very well: if E S1 else S2, and the other form, if E S. The assumption is that we do not have what is known as short-circuit evaluation for E. We will see a little later that if E is a Boolean expression, we can have jumps out of the expression E if we produce what is known as control-flow code for the Boolean expression; that is known as short-circuit evaluation for E. For now, let us assume that there is no short-circuit evaluation; in other words, the expression E is evaluated completely, with no jumps, and then the decision of whether it is true or false is made. The code that must be produced is quite intuitive. First of all we must produce the code for E; let us assume that its result is in the temporary T. Then we must check whether T is true or false. If T is false, the else part has to be executed, so we go to L1; that is the reason for the jump. If T contains a true value, then we execute S1.
So, next comes the code for S1. After S1 we should not fall through and execute S2; we should jump past S2. That is why there is a goto L2, where L2 is the exit. The code for S2 is placed at L1, and L2 follows it. There are also cases where there are jumps from within S1; we will see examples of this very soon. Similarly, there can be jumps from within S2 as well. All exits from within S1 and S2 also jump to L2. This is something we must be careful about, and I will show you examples of how it can happen. The form if E S is only a subset of what we have discussed so far: code for E, then the branch statement that checks whether T is true or false; if it is false, we go out to L1; otherwise we execute the code for S and then go out. All exits from S also jump to L1. What about the while construct? Again, the assumption is that there is no short-circuit evaluation for E; we will consider short-circuit evaluation a little later. At L1 we produce the code for E, with the result in T. Then, as usual, we must check whether E is true or false. If T <= 0, that is, false, we jump out to L2. If E is true, we continue and execute S, so the code for S must be produced next. After the code for S, we must go back to the code for E, evaluate it and continue with the loop, so there is a goto L1. The other special case here is that if there are any jumps out of S, all of them must jump to L1. That must be taken care of. Let us look at an elaborate example to show how jumps from within statements can arise. In this example, let the Ai be assignments and the Ei be expressions. The code is: if E1, then in the then part we have a complete if-then-else, if E2 A1 else A2, and in the else part we have A3; after this entire outer statement we have the next statement, A4. The intuitive understanding is that we check whether E1 is true; if E1 is true, we execute the inner if-then-else; if it is false, we execute A3 and then go on to A4.
So, if it is true we come inside and again check whether E2 is true. If it is true, then we must execute A1 and then jump to A4 directly. If it is false, we must execute A2 and then jump directly to A4; we should never execute A3 after either of these. This is how jumps from within a statement can arise: there is a jump from here and also a jump from here, and both should take you to A4 and not to A3. Let us see the code for this. There is the code for E1, and then the temporary for E1 is tested: if T1 <= 0 goto L1, where L1 is the code for A3, the else part. In red we show the code for the outer statement and in violet the code for the inner one. Then comes the code for E2; if the E2 expression is false, goto L2, which is the else part of the second if-then-else. We execute that and then jump to L3, which is the code for A4. Observe that this is a jump out of the inner statement. In the then part of the inner if-then-else we have the code for A1, and again we jump to L3; as I was saying, it is this jump. Then we have the code for A2 with its jump to L3, and finally the code for A3, which falls through to A4. That is how jumps out of inner statements can arise.
Let us look at an example of the while statement as well. Here is the while part; the body of this while loop is an if-then-else, and after the loop we have another assignment statement, A3. While the expression E1 is true we go on executing the body, and when the expression becomes false we jump to A3. Again, after A1 there is a jump out of the if-then-else, and after A2 there is another jump out of the if-then-else; both of them take us to the beginning of the code for E1.
So, there is the code for E1, and then if T1 <= 0 goto L2; that is the exit, which goes directly to the code for A3. Otherwise we execute the code for E2; if it is false we go to the else part, that is L3, and from there we go back to the beginning of the code. Otherwise we execute the code for A1 and go back to the beginning of the while loop. This is how jumps can arise from within while loops as well.
Now we move on and start looking at the SATG, the attributed translation grammar with synthesized attributes, to produce intermediate code for various types of constructs. There are many attributes; let us look at some of them, and the others will become clear as we go along. Most important, we have S.next and N.next, where S and N are two nonterminals that we are going to use in the grammar. These are lists of quadruples containing jumps; the target of each jump is still undefined while the quadruple is on the list. Then there is IFEXP.falselist. These are all synthesized attributes, so we are not going to put any arrows on them. IFEXP.falselist records the quadruple that holds the jump to be taken if the expression is false. When we generate the jump statement for the expression (if the expression is false, then jump), the target is still undefined, and it will be defined a little later. E.result in general is a pointer to the symbol table entry of the temporary that holds the result of evaluating the expression.
So, as I already told you, all temporaries generated during the intermediate code generation stage are inserted into the symbol table. In the quadruple, triple, and tree representations, that is, in the implementation, pointers to symbol table entries for variables and temporaries are used in place of names. In our textual examples of three-address code we will continue to use names, but wherever we have used a name, its real meaning is a pointer to the entry for that particular name in the symbol table. The SATG will make these attributes very clear. Then we have a global variable called nextquad, which contains a number indicating the next quadruple to be generated. Whenever we increment nextquad, it means we have generated one instruction and we want to go to the next hole where an instruction can be placed. Then there is backpatch, which takes two parameters: the first is a list and the other is a quadruple number. The list contains a number of branch instructions whose targets are unfilled as of now; those targets are filled with the quadruple number by backpatch. This is a compiler function and not an intermediate code instruction. merge is also a compiler function; it takes many lists and merges them into one. The core of the SATG is always gen, which outputs a quadruple. Where is the quadruple placed? It is placed at position nextquad, and the nextquad counter is incremented as well. Again, just to stress this point: the temporaries are variables which are inserted into the symbol table; we use names in the textual representation, but in the actual implementation pointers into the symbol table are used. We have used a large number of temporaries.
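The helpers just described (nextquad, gen, backpatch, and merge) can be sketched as follows. This is a minimal illustration under stated assumptions: the class name QuadTable, the four-field layout [op, arg1, arg2, target], and the operator spellings are choices made for this sketch, and variable names stand in for the symbol table pointers a real implementation would store.

```python
class QuadTable:
    """Holds the generated quadruples and the nextquad counter (hypothetical name)."""

    def __init__(self):
        self.quads = []      # each quadruple is a mutable list [op, arg1, arg2, target]
        self.nextquad = 0    # index of the next hole where a quadruple will be placed

    def gen(self, op, arg1=None, arg2=None, target=None):
        """Place a quadruple at position nextquad and advance the counter."""
        self.quads.append([op, arg1, arg2, target])
        self.nextquad += 1

    def backpatch(self, jump_list, quad_no):
        """Fill the unfilled target of every jump on jump_list with quad_no."""
        for index in jump_list:
            self.quads[index][3] = quad_no


def merge(*lists):
    """Combine several lists of quadruple indices into one list."""
    return [index for lst in lists for index in lst]


qt = QuadTable()
qt.gen("if<=0", "t1")          # quad 0: conditional jump, target not yet known
qt.gen("+", "a", "b", "t2")    # quad 1: an ordinary instruction
qt.gen("goto")                 # quad 2: unconditional jump, target not yet known
pending = merge([0], [2])      # both jumps are waiting for the same target
qt.backpatch(pending, qt.nextquad)   # both now point at quad 3, the next hole
```

Storing quadruples as mutable lists is what lets backpatch fill in a target after the instruction has already been emitted.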
So, we require a temporary generator. newtemp is a temporary generator, and its parameter is the type of the temporary that is needed, whether it is int or real or char, and so on. It generates a temporary name of the type temptype, inserts the generated name into the symbol table, and returns a pointer to that entry in the symbol table. That is how T3, T4, etc. are dealt with.
As we did in the case of semantic analysis, we need to break the production for if-then-else into two parts. This is the production S -> if E S1 else S2, and we have broken the if E part out into another production, IFEXP -> if E. In the case of semantic analysis the purpose was to provide an appropriate error message very early, whereas in the case of intermediate code generation this becomes necessary because we must produce the test for E immediately after parsing E, before we produce the code for S1 and S2. If the code for S1 and S2 is produced early, without the jump, then it is incorrect code. So what is happening here? We have parsed E and come up to this stage; the reduction for the whole statement has not yet happened in the parser. Here IFEXP.falselist, an attribute of the left-hand side, stores the next instruction in a list: makelist(nextquad). nextquad is the quadruple number of this particular instruction at this time. The instruction generated is if E.result <= 0 goto __. This blank is unfilled at this time; if possible we will fill it here, otherwise we are going to fill it in some other production. So far, then, we have tested the expression and produced code here. In the second production, S -> IFEXP S1 N else M S2, after IFEXP S1 we have a marker nonterminal N, then else, then another marker nonterminal M, and finally S2.
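The temporary generator and the action for IFEXP -> if E can be sketched together. This is only a sketch under assumptions: a Python dict stands in for the symbol table, a dict entry plays the role of the pointer into it, nextquad is modelled as a function over the quadruple list rather than a global counter, and the function name reduce_ifexp and the T1, T2, ... naming scheme are hypothetical.

```python
symtab = {}        # stand-in symbol table: name -> entry
quads = []         # generated quadruples as [op, arg1, arg2, target]
_temp_count = 0    # how many temporaries have been created so far


def newtemp(temptype):
    """Create a fresh temporary of the given type, insert it, return its entry."""
    global _temp_count
    _temp_count += 1
    name = f"T{_temp_count}"
    symtab[name] = {"name": name, "type": temptype}
    return symtab[name]            # the entry acts as the pointer


def nextquad():
    """Number of the next quadruple to be generated."""
    return len(quads)


def gen(op, arg1=None, arg2=None, target=None):
    """Append one quadruple at position nextquad()."""
    quads.append([op, arg1, arg2, target])


def makelist(quad_no):
    """A new list containing a single quadruple number."""
    return [quad_no]


def reduce_ifexp(e_result):
    """Action for IFEXP -> if E; returns IFEXP.falselist."""
    falselist = makelist(nextquad())   # remember where the test will be placed
    gen("if<=0", e_result)             # if E.result <= 0 goto __ (target unfilled)
    return falselist


t = newtemp("int")                 # pretend E was evaluated into this temporary
ifexp_falselist = reduce_ifexp(t)
```

The order inside reduce_ifexp matters: the list is built from nextquad before gen runs, so it records the number of the test instruction itself, which is the one that will later be patched.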
So, here we need to patch the instruction on IFEXP.falselist, the jump taken when E.result <= 0, that is, when E is false. The address with which it is patched is the beginning of the code for S2, which is stored in M.quad. M -> epsilon stores the current quadruple number as M.quad, and that is the address of the code corresponding to S2. So we have backpatch(IFEXP.falselist, M.quad). Then, for S.next: we still do not know where the jumps out of S1 will go, so all of S1.next is placed inside the merge. We do not know where exactly the jump for N will go either; N -> epsilon produces a goto statement after the then part, with its target unfilled. And we do not know where the jumps out of S2 will go. So all of these are merged into S.next. Here is the if-then statement, S -> IFEXP S1: we just merge S1.next and IFEXP.falselist and put the result on S.next. We will stop here and continue with the rest of the translation in the next lecture. Thank you.