In the previous class we were looking at the various IRs: what kind of properties they have and what the desired properties are. Finally, we were discussing a set of three-address instructions, where we had assignments, unconditional and conditional jumps, operations on arrays, and then procedures and pointers. We said that this is more or less a large enough set to capture a class of languages, and as we go along and start discussing code generation you will find that you are able to handle most parts of the languages with it. So this is really a very familiar representation.
There is one more piece of information, just to conclude this discussion on IRs. We looked at several IRs, but there are actually a lot of important intermediate representations, and the topic has become so important that there are now special conferences, held by the special interest group on programming languages (SIGPLAN), which talk about various intermediate representations. This is a rather small list of intermediate representations which are used in various parts of compilers; the actual list is much larger. If you are interested, you can search the SIGPLAN proceedings for intermediate representations and you will find a lot of discussion. This is where we close our discussion as far as IRs are concerned.
There is another very important data structure which the compiler has to use internally, apart from intermediate representations, and we also have to worry about it when it comes to code generation. What is that? The stack? Okay. So why do you say the stack is an important data structure that we should be interested in? Because of function allocation and [inaudible]. Okay. So what is important in that part? I mean, we do need a stack.
That is okay, but what is there to discuss in a stack? You have push and pop and you have a stack pointer; other than that, what can you do with a stack? The symbol table. If you recall the initial compiler structure, there was one data structure which we said is going to interface with all the phases of compilation, from the lexical analyzer to code generation. We need to understand how to organize the symbol table, because it has certain properties which come up in those interfaces.
What the compiler does is use the symbol table to keep track of information about all the variables. Now there is an additional issue: at some point you also have to worry about scope information. I may have nested procedures and functions, I may have non-local variables, and names may occur in a lot of different scopes. We need to make sure that the symbol table captures this. For example, the same variable name may occur in two different scopes, and when I refer to a variable I should get access to the right variable.
Where might that happen? Take a flat language like C. I may have a global x here. Then I may have a function f1 which does not declare an x of its own, and another function which does; let me not give it a type, let me just call it a local x. The question is what happens when I refer to x in this scope and when I refer to x in that scope: in each case I must be referring to the right variable. If you have nested procedures and functions, the issue is going to be a lot more complex than this. So we need to figure out how scope information is kept, because that ultimately tells me how the binding of variables is done.
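The binding question can be made concrete with a small sketch. It is written in Python purely for illustration; the function names `f1` and `f2`, the `bind` helper, and the two-level layout (one global table plus one table per function, as in a flat language like C) are assumptions made for this example and are not taken from the lecture.

```python
# Illustrative sketch: resolving a use of x in a flat (C-like) language.
# All names here (f1, f2, bind) are hypothetical.

global_scope = {"x": "global x"}

# One table per function. f1 declares no x of its own; f2 declares a local x.
function_scopes = {
    "f1": {},
    "f2": {"x": "local x of f2"},
}

def bind(name, function):
    """Return the declaration that a use of `name` inside `function` refers to."""
    local = function_scopes[function]
    if name in local:            # a local declaration hides the global one
        return local[name]
    if name in global_scope:     # otherwise fall back to the global scope
        return global_scope[name]
    raise KeyError(f"{name} is not declared")

print(bind("x", "f1"))  # use of x inside f1
print(bind("x", "f2"))  # use of x inside f2
```

With only two levels the search is trivial; with nested procedures the same idea needs a chain of enclosing scopes rather than a single fallback.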
So when I have a use of x here and a use of x there, in this case this is the binding and in that case that is the binding, and the kind of symbol table I have should be able to handle that. We also need to understand that the symbol table is changed very frequently, and all the phases of the compiler may potentially change it. What change means is that I keep on adding new information. The lexical analyzer puts in lexeme information, later I put in type information, and at code generation, when memory is allocated, I record the relative offset of that particular variable. So whenever a new name is discovered, or new information about an existing name is found, I go and modify the information in the symbol table. That means we also have to worry about how fast I can access my symbol table.
The symbol table must therefore have mechanisms to take care of two functions. One, I should be able to add new entries. Two, I should be able to find existing information, because if the variable is already there and some new information has to be added about it, I must first locate where this variable is. I should be able to do both efficiently, because that is an important issue.
What are the common data structures? You have probably already gone through them in your data structures course. A linear list is one way to implement this. How does a linear list hold this information? I have a list of structures, every structure has the information about one symbol, and the list as a whole covers all the symbols.
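A minimal sketch of the linear-list organization, in Python for illustration only. The record fields (`lexeme`, `type`, `offset`) and the helper names `lookup` and `insert` are assumptions chosen to mirror the two operations described above, not code from the course.

```python
# Illustrative sketch of a symbol table kept as a linear list.
# Field names and helper names are assumptions made for this example.

symbol_list = []   # each element is one record (a dict) describing one name

def lookup(name):
    """Linear search: in the worst case every record is visited."""
    for entry in symbol_list:
        if entry["lexeme"] == name:
            return entry
    return None

def insert(name):
    """Add a record for a newly discovered name, or return the existing one."""
    entry = lookup(name)
    if entry is None:
        entry = {"lexeme": name}
        symbol_list.append(entry)
    return entry

# Different phases add information to the same record at different times.
insert("x")                      # lexical analysis: lexeme only
lookup("x")["type"] = "int"      # later phase: type information
lookup("x")["offset"] = 0        # code generation: relative offset

print(lookup("x"))
```

Note that `insert` has to call `lookup` first, so both operations pay for a traversal of the list.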
But you also know that this is very inefficient, because a lookup may need a full traversal of the list, and you know the complexity involved in that kind of traversal. A more efficient way, of course, is to use hash tables. With hashing, if you do not have a good hash function it can become very bad, because you may end up with just one bucket. So you have to design a good hash function, and you have slightly more space overhead and programming work. But this is definitely going to give you much better performance than a linear list.
A hash table, you will find, is actually a combination of hashing and linear lists, because you may not be able to map all the keys onto unique entries. When several names map onto the same key, what do you do? You keep a list of all the entries which map onto that key, but you want to keep that list as small as possible. If the search is still of order n but n is small enough, then from a practical point of view we do not worry too much about it. If each list has only about five or six elements, it does not matter how much time I take for the traversal.
Also, the compiler should be able to grow the symbol table dynamically. Why would the symbol table grow dynamically? Why can't I have a symbol table of known size to begin with, when I start reading my program for compilation? One way is a static structure: I declare the symbol table as an array of the maximum size, large enough that whatever program you give me I can handle it. But that is a very inefficient use of space. What we are likely to do is keep a small symbol table and, as and when required, do a malloc and keep on growing it.
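The combination of hashing, per-bucket lists, and dynamic growth can be sketched as follows. This is an illustrative Python model, not the lecture's implementation: the class name `ChainedSymbolTable`, the toy hash function, the initial size of 4 buckets, and the growth rule (double the buckets once the average chain length reaches 2) are all assumptions.

```python
# Illustrative sketch: hashing with chains, growing on demand.
# Class name, hash function and growth policy are assumptions.

class ChainedSymbolTable:
    def __init__(self, buckets=4):
        self.buckets = [[] for _ in range(buckets)]  # one chain per hash key
        self.count = 0

    def _hash(self, name, size):
        # Toy hash function: multiply-and-add over the characters.
        h = 0
        for ch in name:
            h = (h * 31 + ord(ch)) % size
        return h

    def lookup(self, name):
        chain = self.buckets[self._hash(name, len(self.buckets))]
        for entry in chain:              # short linear search within one chain
            if entry["lexeme"] == name:
                return entry
        return None

    def insert(self, name):
        entry = self.lookup(name)
        if entry is not None:
            return entry
        if self.count >= 2 * len(self.buckets):
            self._grow()                 # keep chains short as the table fills
        entry = {"lexeme": name}
        self.buckets[self._hash(name, len(self.buckets))].append(entry)
        self.count += 1
        return entry

    def _grow(self):
        # Double the number of buckets and redistribute the existing records.
        old = [e for chain in self.buckets for e in chain]
        size = 2 * len(self.buckets)
        self.buckets = [[] for _ in range(size)]
        for e in old:
            self.buckets[self._hash(e["lexeme"], size)].append(e)

table = ChainedSymbolTable()
for i in range(20):
    table.insert(f"v{i}")["type"] = "int"
print(len(table.buckets), table.count)
```

The table starts small and only pays for more space when the program being compiled actually declares more names.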
Also, to take care of scope information, I may create a symbol table for a single scope, and once I find a new scope I create a new symbol table for that; how many scopes there will be is not known a priori. So I start with a reasonably sized symbol table, and as and when I require information for more and more symbols and more and more scopes, I just grow it. If the size is fixed, it has to be large enough for the largest program, which is never an efficient way of doing it, because if you plan for peak capacity, most of the time you are going to waste it.
Now, looking at the information which goes into the symbol table: it has the declaration of the name. That is the basic thing, and when we start doing code generation for declarations we will see how I calculate this. The format need not be uniform, because the information depends on the use of the name. For example, suppose I have int x, and I also have int y, but y is actually a function. So one is just a variable and the other is the declaration of a function. Now if my symbol table keeps a structure with information about x and information about y, why would I need the same kind of structure for both x and y? It is very clear that the information which comes in for one is very different from the information for the other. The format does not need to be uniform, because the information depends on how the name is going to be used. So when a name is used in a different way, my format may be different, and each entry is a record which consists of consecutive words.
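The point about non-uniform formats can be sketched with two record types. This is a hedged Python illustration: the class names, the field names, and in particular the parameter list given to `y` are assumptions, since the lecture only says that `x` is a variable and `y` is a function.

```python
# Illustrative sketch: entries whose format depends on how the name is used.
# Class names, field names and y's parameter list are hypothetical.

from dataclasses import dataclass, field
from typing import List

@dataclass
class VariableEntry:
    name: str
    type_name: str
    offset: int = 0                      # relative address, filled in later

@dataclass
class FunctionEntry:
    name: str
    return_type: str
    param_types: List[str] = field(default_factory=list)

table = {
    "x": VariableEntry("x", "int"),             # int x
    "y": FunctionEntry("y", "int", ["int"]),    # int y(int), parameters assumed
}

for name, entry in table.items():
    print(name, type(entry).__name__, entry)
```

A variable record needs an offset but no parameter list, and a function record needs the reverse, which is why forcing one layout on both would waste space.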
So what I will assume from this point onwards is that the information about each variable is kept in consecutive words. It does not matter how I access this record, whether it is part of a hash table, part of an array of structures, or part of a linked list; that is immaterial. As far as this record is concerned, it is in contiguous locations in memory and it has all the information about x. This makes sure that I have some kind of format and some kind of allocation: I do not want to keep the type information in one place and the offset in some other place, and so on. At least as far as this record is concerned, I want contiguous memory locations.
And if I want the entries to be uniform, then I may want to keep some of the information outside the symbol table. You have seen this example: in lexical analysis we were keeping some information outside the symbol table. We did not want to reserve 32 bytes for each lexeme, but I also did not want a variable-sized structure saying this lexeme is 5 bytes, this lexeme is 7 bytes, and so on. So I just kept a pointer of 4 bytes in the record, and the actual lexeme was kept somewhere else. That made my record uniform, and some of the information which was logically part of this record, or part of this symbol, went outside the record.
Information is entered in the symbol table at various times. Keywords are entered initially, identifiers are entered by the lexical analyzer, information about the type is entered by the semantic analyzer, and so on.
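Keeping the lexeme outside the record can be sketched like this. It is an illustrative Python model under stated assumptions: the `LexemePool` class and the choice to store a `(start, length)` pair in each record are stand-ins for the 4-byte pointer mentioned above, not the lecture's own layout.

```python
# Illustrative sketch: fixed-size records that point into a shared lexeme area.
# LexemePool and the (start, length) fields are assumptions for this example.

class LexemePool:
    def __init__(self):
        self.buf = bytearray()           # all lexemes stored back to back

    def add(self, lexeme):
        start = len(self.buf)
        self.buf += lexeme.encode("ascii")
        return start, len(lexeme)

    def get(self, start, length):
        return self.buf[start:start + length].decode("ascii")

pool = LexemePool()
records = []
for lexeme in ["i", "a_rather_long_identifier"]:
    start, length = pool.add(lexeme)
    # Every record has the same shape, whatever the length of the name.
    records.append({"lexeme_start": start, "lexeme_len": length, "type": None})

for rec in records:
    print(rec, pool.get(rec["lexeme_start"], rec["lexeme_len"]))
```

A one-character name and a long name now cost the same inside the table, and only the pool grows with the length of the names.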
So I add this information at various times, and a symbol table entry may be set up only when the role of the name becomes clear. Sometimes the lexical analyzer may not know the role of a name; we will shortly see an example of that. In that case I say: let us delay this decision, and only when I have sufficient information am I going to set the entry up.
So let us look at a declaration like this. This is a possible declaration: I have an int x and I have a struct x, and the structure has fields y and z, in the same program and in the same scope. Remember that when the lexical analyzer encounters x, it was going to give me a token identifier, and at the same time a pointer into the symbol table. Now when this is my lexeme and that is also my lexeme, what is the pointer? At the point where I am just doing lexical analysis, I will not know the right pointer for x; I have to delay that decision.
At what point is the role of x known to the compiler? At semantic analysis? You do not have to wait that long; this is part of the syntax. I may not know the type, but at least I know that this one is a variable and this one is a structure, and that is known at parsing time itself, because the grammar rules are different. It is not so important exactly when you know it; the important thing to remember is to delay the decision. I do not yet know the pointer into the symbol table, so I keep the names, and only when I am able to resolve the role do I replace the name by its pointer. So in this case two symbol table entries are created, and each is created only when the role of the name becomes clear; until then I have to keep the names and not the pointers.
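The delayed decision can be sketched as a second pass over the token stream. This Python model is illustrative only: the token encoding, the role names, the `resolve` helper, and the `int` types given to the fields `y` and `z` are assumptions, and a real parser would decide the role from its grammar rules rather than from the previous token.

```python
# Illustrative sketch: the lexer keeps names; entries are created once the
# syntactic role is known. Token shapes and role names are hypothetical.

# Tokens for:  int x ; struct x { int y ; int z ; } ;
tokens = ["int", ("id", "x"), ";",
          "struct", ("id", "x"), "{",
          "int", ("id", "y"), ";",
          "int", ("id", "z"), ";",
          "}", ";"]

def resolve(tokens):
    table = {}                 # (role, name) -> symbol table entry
    resolved = []
    in_struct_body = False
    prev = None
    for tok in tokens:
        if tok == "{":
            in_struct_body = True
        elif tok == "}":
            in_struct_body = False
        if isinstance(tok, tuple):           # an identifier carrying only a name
            name = tok[1]
            if prev == "struct":
                role = "struct tag"
            elif in_struct_body:
                role = "field"
            else:
                role = "variable"
            entry = table.setdefault((role, name), {"name": name, "role": role})
            resolved.append(("id", entry))   # the name is replaced by its entry
        else:
            resolved.append(tok)
        prev = tok
    return table, resolved

table, resolved = resolve(tokens)
for key in table:
    print(key)
```

The two occurrences of `x` end up pointing at two different entries, which the lexical analyzer alone could not have decided.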
This is the take-away point: as long as the role is not known, I just keep the names, and whenever I know the role, at that point I create the entry and return the pointer. Is this point clear to everyone?
Attributes of a name are entered in response to declarations. What are the attributes? Type information, for example, is an attribute. Labels are also names, but labels have a very distinctive syntax: most of the time you will find that a label is followed by a colon. This comes from the kind of grammar you have, so labels can be identified from it. When I have a token identifier, sometimes just by looking at the syntax I am able to say that this identifier is a label, without an explicit declaration. The syntax of procedures, again, tells me what my formal parameters are: if I have this kind of declaration, the syntax lets me find the list of all the formals. The characters in a name we have already discussed under lexical analysis: I have lexemes, but I want to keep them separately so that the structure of the symbol table stays uniform; I do not have to plan for the full lexeme length in each record, I can just plan for the 4 bytes. These are some of the points we have already discussed.
Now let us look at how I do storage allocation, or what kind of information I need for storage. The reason I am giving this background is so that when we start writing code for the symbol table, we do not get into issues which we have not discussed. I first want to discuss all the issues which go into the design of an IR and the design of a symbol table, so that when I start writing code which generates information for the intermediate
representation and the symbol table, all those issues have already been addressed. So we will see the structure of the code as well as the data structures.
Information about storage allocation is also kept in the symbol table. Now, what kind of code do I generate: absolute code or relocatable code? For any reasonably sized program, relocatable code. So do I know the actual address of a variable at the time of code generation? No, we do not know it. What relocatable code means is that there is some base address, and with respect to this base address the variable is at a certain offset. So I record only the offset and not the absolute address. When will I know the absolute address? At load time? Remember something about paging: a page can go at any place. So the actual address is known only when your program is executed. What does loading do? It gives me the full map, but it does not give me the absolute addresses. That means there should be some way for me to initialize my base pointer, and my code should be able to take care of all this. When I generate code for my procedures and functions, I should be able to say that certain information is filled in only when the program starts execution, because it is not known at compile time.
If the target is assembly code, then the assembler takes care of the storage, because what kind of information am I generating then? I just generate the variables; I do not really generate even the offset information. But in many cases the compiler generates machine code. So the compiler needs to generate data definitions which can be used with assembly code, but if the compiler is generating machine code, then the compiler also has to do the allocation itself. And for
names whose storage is allocated at run time, no storage allocation is done at compile time. What are the variables for which storage is allocated at run time and not at compile time? Heap objects. Pointers? A pointer itself is allocated space at compile time: when I say something is of pointer type, I allocate 4 bytes to it, but what these bytes point to is known only at run time. When I do a malloc while the program is executing, or when I create a new object, at that point I allocate certain space on the heap, and then this particular pointer is updated and starts pointing to that location.
Now, when storage allocation is done at run time, who is going to fill in this information, that I have done a malloc and therefore this pointer has a new address? Suppose some variable is of pointer type: instead of int x, I say int *x, so x is a pointer to int. How many bytes will I allocate for it? 4 bytes. Somewhere I say malloc and I get some space for storing that int. Who will modify the value of x? Is the question clear?
Suppose this is my memory map, and at compile time I have said that this is a pointer to an object; so this is p, which has been given 4 bytes. What is the value stored here at compile time? Do I know this value? No. When my program starts executing, at some point I say p = new some object, and at that point, while my program is running, space is allocated for this object, and this particular address is the starting point of the object. Now who will modify this value, and when will it be modified?
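To make the question concrete, here is a toy model in Python, offered as a sketch under stated assumptions rather than as how any real compiler or runtime works. The instruction names `alloc` and `store`, the temporary `t1`, and the heap start address 1000 are all invented for illustration.

```python
# Illustrative sketch: what is known at compile time versus at run time.
# Instruction names, the temporary t1 and the heap address are hypothetical.

# Compile time: the pointer p gets 4 bytes at a known offset; its value is unknown.
symtab = {"p": {"type": "pointer", "size": 4, "offset": 0}}

# Compile time: code emitted for  p = new object  (4 bytes assumed).
generated_code = [
    ("alloc", "t1", 4),      # t1 = address of 4 fresh bytes on the heap
    ("store", "p", "t1"),    # p = t1, an ordinary assignment to a variable
]

# Run time: a tiny interpreter standing in for the executing program.
def run(code, symtab, heap_start=1000):
    frame = {}               # offset -> value, filled in only while running
    temps = {}
    next_free = heap_start
    for op, a, b in code:
        if op == "alloc":
            temps[a] = next_free         # the allocator hands back an address
            next_free += b
        elif op == "store":
            frame[symtab[a]["offset"]] = temps[b]
    return frame

frame = run(generated_code, symtab)
print(frame)
```

In this model the address only comes into existence inside `run`, but the instruction that writes it into the slot for `p` was already present in `generated_code`.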
Nothing happens to it at compile time. And why should the OS go and start modifying your process? The OS will give the address, but who will store it? The address is known only at run time, but p is just a variable, so we can modify its value just like any other variable. So this has to be arranged by the compiler: the compiler has to generate sufficient code to do all these modifications.
So basically I have to generate sufficient code in my compiler so that all these run-time modifications can take place. That immediately tells you something: the compiler has to generate sufficient code to manage the run-time system, including the activation stack when my procedures are being called. My code should be such that whenever something happens at run time, it modifies all these locations and the program is able to execute. The point here is that even when something happens at run time, the compiler is still responsible for generating the code which does those modifications. The values are changing at run time, but you have to make sure that your code is such that it takes care of this. We will see this when we actually write code for managing the run-time stack.
So this point we have seen: for a name whose storage is allocated at run time, no storage allocation is done at compile time. When I will not know how much space to allocate, that is run-time data, and therefore no storage allocation is required when it comes to compilation. The only thing the compiler does is plan out what are known as activation records. Does anyone have an idea of what an activation record is? We will talk about it later, but from your assembly language programming and your experience in programming, do you know what an activation record is?
Basically, what we want is that corresponding to each of the scopes, and to each of the functions and procedures, I store certain information. What is the relevant information? It may be the arguments which are being passed; it may need space for local variables; it may need space to save the status of the machine; it may need space for the return value as well. So corresponding to each of the procedures we plan an activation record. The compiler has to say what this activation record looks like and how much space I must give to it, and then the code I generate has to fill in the information in the activation record. We will see this when we do code generation for procedures and functions.
Let us look at data structures once again; we will just touch on this point. List data structures are very easy to implement: I can use a single array to store names and information, and search for a name is linear. If you recall lookup and update, a lookup in that case is a linear search. Entry and lookup are independent operations, and the cost of these operations with list data structures is very high; a lot of time is going to go into that, so we do not want to do it. With hash tables, you know from your data structures course that the advantages are obvious, but one has to be careful in designing the hash function.
Now, how do I handle scope information? Let us take this case: how do I arrange the symbol table so that this variable looks different from that variable? What are the possible solutions? In my symbol table, how do I say this x is in this scope and that x is in that scope? Each scope is a different symbol table: that is one approach. Is any other approach possible? So that is saying that each scope will have a
different symbol table, and we have a tree of them, with one table per node. Entering a scope and returning from it is a different issue; that is a run-time issue. I am creating only symbol tables at this point of time. So one way to do this is to have a separate symbol table for the global scope and a separate symbol table for this function, and to find the information in the appropriate symbol table. That has a problem: when I say use x, which symbol table do I look at? I must know which symbol table I am referring to.
That is one way of managing symbol tables. Another common way, which is used for small programs and may not be efficient for large programs, is to put scope information along with the variable. Each scope is going to have a distinct name. Suppose I say that this x is the x of f2 and I store it internally as f2.x, any x which is defined here is f1.x, and any x which is defined here is g.x. Then the scope information travels along with the variable, and I am done, because I can tell the entries for each scope apart just by looking at the prefix of that particular variable.
So entries are declarations of names, and when a lookup is done, the entry for the appropriate declaration must be returned. When I say this x I want this entry, and when I say that x I want that entry, and the scope rules determine which entries are the appropriate ones. We may maintain a separate table for each of the scopes; that is one way of doing it. The symbol table for a procedure or scope is the compile-time equivalent of an activation record at run time. As I said, we will look at
activation records later. Information about non-locals can be found by scanning the symbol tables of the enclosing blocks. What does this mean? If I am looking at a use of this particular x and I want to find out where it is declared, do I scan all symbol tables or only some of them? Will I go and look at this other symbol table to resolve the use of x? What I do is ask which procedures and functions are nesting this particular function, because we are going to use lexical scoping rules here, and the declaration must be in one of those. I may have various levels of nesting: if I do not find the variable here, and I find that this function is nested inside main, then it must be in the symbol table of main; if it is not in the symbol table of main, then it must be in the global symbol table. So I only look at the functions or scopes which are nesting this one, and not into any other symbol table; that is because of the lexical scoping rules. The symbol table can also be attached to the abstract syntax tree itself: I can take a symbol table and attach it to the node in the tree which carries the scope information.
So let us look at a symbol table structure. Let me skip this part and come back to it; let me first show you a symbol table. Here is one way of looking at a hashed local symbol table. When I say local symbol table, I am looking at the symbol table for only one of the scopes. These are my hash keys, but you know that I am not going to have a unique key for each of the entries. So what may happen is that key n takes me to entry 1, but then I have a linked list of all the entries which hash to n: this one is n1, this one is n2, and so on. This kind of structure will be used, but if I go
for non-local symbol tables, then I have multiple symbol tables which are nested. I can then add one more structure, which gives you not only the tree structure but also answers the question of how a table knows its parent. If this is the outer scope and these are the inner scopes, each with its own symbol table, then to resolve a use of a non-local in f I must find the enclosing procedure, which is this one; so I should be able to go back. I will come back and discuss the properties of this, but I first want to show you a symbol table so that you at least have a mental picture of what is going on.
So let us look at this program, which has a lot of non-locals. It has variables a, b and c. The reason I am using a Pascal program is that C has a flat structure; it does not have a nested structure, so I am not able to create this example in a language like C. So what we have is a program with variables a, b and c, and then it has procedures: this is procedure f, this is procedure g, this is procedure i, and this is procedure j, and within g, h and i are nested. So at the top level I have three procedures. The ones in yellow are nested directly inside the program, and the ones in green are nested inside g. Then there are local variables: you can see that the global variables are a, b and c; I also have variables a, b, c in f; within g I have a and b; further in, b is declared again; and so on.
Now if I put this information in symbol tables, I have the definitions of a, b, c here, in the global symbol table; within that I have f, g and j; this table has definitions of a, b, c, this one has definitions of a and b, and this one has a definition of b. Now you can see
that if I am in this particular scope and I want to access c, which is a non-local variable, I cannot find it in the local symbol table, and therefore I must go back and look in the symbol table of the enclosing, or parent, procedure. If I do not find it there, I must be able to go back further and look at that scope. Therefore I need pointers to go back.
The way this is implemented is something like this. Corresponding to sort I have its local variables here, and then entries for the other procedures, such as readarray, exchange and quicksort, and corresponding to each of these I have its local information. This is the pointer which takes me to the parent node, and you can see that for the procedure at the highest level this pointer is nil; obviously it is not needed, because that procedure has no parent. There is one more piece of information here, the header; I will not discuss it for the time being. We will come to it when we go through this program and see how the table is built.
At least you should have a mental model of a symbol table before we go too deep into its properties, so that it makes sense what kind of symbol tables I am talking about. This is a common structure which is used in compilers. Any questions on this? It is really not too difficult to see; the difficult part may come when we have to create the symbol table. Right now we are only looking at properties, which is more or less descriptive and can get slightly boring at times, because we are only talking about properties and not seeing how to generate this. But let us finish this discussion quickly so that we can get into our discussion on how to generate code.
When we have the most closely nested scope rule, it can be implemented by suitable data structures. So tree-like data structures will take care of
all the nested scopes. Alternatively, each procedure can be given a unique name, and blocks must also be numbered. So when I do not have a tree structure, when I do not have a nested structure, I can append the name or number of the block or procedure to the name, and then I can have a single table. The most closely nested rule can be implemented in terms of a few operations: I want to look up a name, insert a name, and remove certain information.
Symbols have their own attributes. Typical attributes are the name, which is the lexeme, the type, scope information, size, addressing mode and so on, which I want to associate with the symbol by code generation time. Symbol table entries may link these attributes together so that they can be set and retrieved easily. The key functionality to remember is that I should be able to access my symbol tables very efficiently. As an example of typical fields in the symbol table: for name I have a character string, for class I have an enumeration, for size an integer, and so on. Type, again, can be something which is enumerated. How can I enumerate type information? Remember the discussion in type checking: I know what my type constructors are and what my basic types are, and therefore I can use a fixed number of bits to say which of the possible types this is.
A major consideration in designing symbol tables is that insertion and retrieval should be as fast as possible. One-dimensional tables we have seen are slow. There are balanced binary trees and hash tables, and hashing with a chain of entries is generally a good approach, so this is what I will normally assume from now on. Let us take two points from here: I want insertion and retrieval to be fast, and I want hashing with a chain of entries. These are the two things I am going to use for the design of a symbol table. This we have seen, and here is the Pascal program for which I showed you the symbol table
which is in the tree structure, and then I wanted pointers back, and therefore I created this kind of structure.
Now with this background we are ready for code generation. How do I do code generation? First let us look at the parts of the program for which I need to do code generation, and then start looking at each part. My programs are going to have declarations to begin with. Let us assume that I have only one scope. If I have only one scope, then what kind of code can I write in a single scope? I can have assignments, I can have conditionals. So basically in the code part you will see that I will have expression and assignment kind of code, then I will have jumps, and with jumps I will have conditionals and iteration. And of course when I talk about expressions here, all kinds of data structures may come in, so I may have simple variables, I may have arrays, I may have structures and so on. That is more or less what we can have in one scope.
Then you may go into code for procedures or functions. How is code for procedures and functions different from this? The body of each procedure and function is going to be similar, so code for the body is going to be something similar to this. But then there are linkages between the functions and procedures: one function may call another, it will be passing arguments, and it may be returning values. So when it comes to the specific interfaces of procedures, then we look at code generation for procedures.
So let us go in this order. If I know how to handle declarations, which means how to put information in the symbol table, how to actually generate three-address code, and how to make linkages between procedures and functions, then I know that I will be able to generate code for the whole program. We use the same method as before, which was syntax directed translation, and in syntax directed
translation we will now actually start writing the grammar and seeing how I keep pushing information as I go through the process of parsing with my grammar.
Declarations are the first thing I look at. For each name we want to create a symbol table entry which has information like its type and its relative address. So let us write a grammar rule, or a set of grammar rules, and start looking straight away at the attribute information. At this point of time I am not even putting in code information; you can see that it can be easily inserted later. What we are saying is that the program is nothing but a set of declarations, and each declaration can be a concatenation of two declarations. You can see this is a recursive rule, so I can write any number of declarations here. Each declaration is of the form where I have a variable and a type name. You can also see that this declaration could have been a list of identifiers followed by a type, but that does not bother me anymore, because even if I have a list here we know how to handle it; we have already seen that in syntax directed translation and type checking.
What we want to do here, as we are generating relative addresses, is to say that the first variable will be at an offset of 0. There is some base address which will get initialized later, but I will say that the first variable I encounter will be at an offset of 0. Then if I know the size of the variable, that means if I know how many bytes it is going to take, that is going to determine the offset of the next variable. At this point of time I will not worry about whether I want to do alignment on word boundaries. I will just go with the size, and then we can look at the alignment problem later.
So what we say is that offset is 0. Then I have a declaration like this, which says the identifier is of a certain type, and what we want to do is enter this information in the symbol table, which
says that for this particular name the type information is what is given here, and the offset is whatever is in my global variable offset. But as soon as I have made this entry I want to modify my offset, because I want to be ready for the next declaration. So what do I do? I change offset to offset plus the width of the type.
You will immediately start seeing what happens when I have multiple variables. Say I have variables x, y and z, and let us say x is of type int, y is of type float and z is of type, let us say, character. To begin with, offset will be 0, and I will make this entry in the symbol table and say that x is of type int at offset 0. As soon as that happens I will say that int takes, let us say, 4 bytes. So this is where I look up: if I have integer type, what is the width; if I have real type, what is the width; and so on.
And who determines what these numbers are? Who specifies these numbers? Do you see a language reference manual which says that an integer will always take 4 bytes? This is partly an implementation issue, and we have to be clear on that, because depending on the machine you are implementing on, your sizes will change. If you were implementing your compiler on a 16-bit machine, you would say int was 2 bytes, short was 1 byte and long was 4 bytes. If you go to a 128-bit machine, you may say that integer is 8 bytes and floating point is 16 bytes. So this is an issue of implementation. From now on you have to be aware of what your machine is. These numbers 4 and 8 are coming not just from the language specification but from the details of the machine on which this particular language is going to be implemented. You have to be aware of what is being provided on the machine, because, as you can see, I am getting into code generation. I cannot live with just the source language specification; I have to talk about machines now. The numbers are going to be determined by the implementation platform, and
therefore I will say that if int takes 4 bytes, then immediately I will change my offset, add 4 to it, and offset will become 4. When I have float y, the information which goes into the symbol table is y, float, at an offset of 4. Then I will say that float takes 8 bytes, and therefore my offset becomes 12, and the entry corresponding to z is going to be z, character, at an offset of 12.
Now suppose character takes 1 byte and I have another declaration which says int p. What is the information I will enter in the symbol table? I will say p, int, at an offset of what? As far as this way of writing the code is concerned, this will be 13. As I told you, at this point of time I am not too worried about whether I have to align all my variables at word boundaries and so on. But we will see at a later point of time that this matters for efficiency of access. If an integer takes 4 bytes, and you imagine you have a machine with 4-byte words, then what is happening to your integer is that it is actually going into 2 consecutive words. It will start storing from here, after this byte which corresponds to z, and spill into the next word. This is highly inefficient. So what we may have to do is what we know as padding: we may have to say, leave these 3 bytes empty and start allocating from this point onwards. When padding happens I will have to modify my code.
There are various methods of handling this. Instead of saying that as I encounter a variable I will enter it in the symbol table, I may take all the variables, which are of different types, sort them on their sizes, take the largest size first and the smallest size at the end, and then I will save some space on padding. So there are various allocation strategies which are used, but as far as this code is concerned it does not take care of any issue about
alignment on word boundaries. It is just the kind of code saying that whenever you encounter this declaration, put this information in the symbol table. So you can see that initially I was able to put in only the name information; then I was able to put in type information through type checking; and at the time of code generation I am also able to put in offset information. As we go through more and more phases in the compiler, the symbol table holds more and more information.
Now suppose I have a declaration like this, which says array of num of T1. What information will go in the symbol table? Type of T? But this is only a type expression, not a declaration; an entry is made only for a variable. So go back and see what I did here. I said a declaration is an identifier of a certain type, and then I used T.type and T.width. These are the two values I need before I can make this entry and process further. That means in this subsequent part I only need to compute two things: what is T.type and what is T.width. So what is T.type here? It is an array of num elements of T1.type. And what is T.width here? I will just take the width of T1 and multiply that by the number of entries in this array. So this is what we do here: T.type is an array of num of T1.type, and T.width is the number of entries times T1.width. I am just computing these two attributes so that when I actually have a declaration I can use them.
Similarly, when I say that T is a pointer to T1, what will T.type be? It will be pointer to T1.type. And what will T.width be? Depending on whether on that particular machine my pointer is 4 bytes or 8 bytes, I will determine that, and T.width will therefore be a fixed number. It has nothing to do with T1.type. So this is how I process all my information. This part you have already seen in type checking; that is how I was finding out the types of the variables. So I just need to compute this information, which is T
type and T width, and when I actually process this particular declaration, then I make this entry in the symbol table. That is it.
But this is still saying that all my variables are in the same scope. I am not putting in any scope information. To put in scope information I need to do something more. So what is that something more? So far I have kept track of all the local information assuming that I am working in a single scope. When a nested procedure is seen, that processing is temporarily suspended and I go into the deeper nest.
For example, suppose I am processing this particular scope and I have started computing offsets for it. Let us put some names here: I say int x, int y. I started computing from here: offset was 0, this took 4 bytes, this took 4 bytes. Then I come to the nested procedure, and there I have, let us say, int p. What will be the offset of p? From the code I have just shown, will it be 0 or will it be 8? It is a different scope, and if it is a different scope, then how can the offset carry on from the same counter? So when I start looking at a new scope, the first thing I have to do is suspend, and I have to say, whatever was the offset here, let us keep track of that.
Now I can have two kinds of syntax in the language. Some languages will say that you must have all the declarations of local variables before you can have declarations of procedures. Other languages will permit a declaration anywhere in the program, so I can have declarations, then procedures and functions, then some more declarations of local variables, and so on. Let us go for this more general structure, where you may have some local variables, then a procedure or function, then again a local variable, and so
on. That means if I have a situation where I say int z here, what should be the offset of z? The offset of z should be 8. But then I must remember that in between I have a function for which I had to reset the offset to 0, so I must remember this information. Is this point clear?
So what we therefore do is that whenever a nested procedure is seen, processing of the declarations of the enclosing procedure is temporarily suspended. That means when I see this scope, I suspend this one and start a new offset. When I suspend it, at some point of time I will have to revive it. The structure for doing this is a stack. I can put this information on the stack, start processing the nested scope, and once I finish it I will pop, and the earlier offset will get exposed, and so I will be able to continue. We will continue this discussion in the next class and start looking at code for nested procedures.
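The bookkeeping described in this lecture can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the slides: the names `SymbolTable`, `type_and_width` and `process_declarations`, the tuple encoding of declarations, and the widths (int 4, float 8, char 1, pointer 4) are all hypothetical choices made for the example. It ignores alignment and padding, just as the lecture does at this stage.

```python
# Assumed widths for a hypothetical target machine. As the lecture stresses,
# these numbers come from the implementation platform, not the language alone.
WIDTH = {"int": 4, "float": 8, "char": 1}
POINTER_WIDTH = 4  # assumption; it would be 8 on many 64-bit machines


class SymbolTable:
    """One table per scope, with a pointer back to the enclosing scope."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # None for the outermost procedure
        self.entries = {}     # a dict stands in for hashing with chained entries

    def lookup(self, name):
        """Most closely nested scope rule: search here, then follow parents."""
        table = self
        while table is not None:
            if name in table.entries:
                return table.name, table.entries[name]
            table = table.parent
        return None


def type_and_width(t):
    """Compute (T.type, T.width) for a basic, array or pointer type."""
    if isinstance(t, str):                 # basic type such as "int"
        return t, WIDTH[t]
    if t[0] == "array":                    # ("array", num, element_type)
        _, num, elem = t
        elem_type, elem_width = type_and_width(elem)
        return ("array", num, elem_type), num * elem_width
    if t[0] == "pointer":                  # ("pointer", target_type)
        target_type, _ = type_and_width(t[1])
        return ("pointer", target_type), POINTER_WIDTH
    raise ValueError(f"unknown type: {t!r}")


def process_declarations(decls, table, offsets):
    """Enter declarations into `table`; `offsets` is the stack of running offsets."""
    offsets.append(0)                      # a new scope starts at offset 0
    for decl in decls:
        if decl[0] == "var":               # ("var", name, type)
            _, name, t = decl
            typ, width = type_and_width(t)
            table.entries[name] = (typ, offsets[-1])
            offsets[-1] += width           # be ready for the next declaration
        else:                              # ("proc", name, nested_declarations)
            _, name, nested = decl
            child = SymbolTable(name, parent=table)
            table.entries[name] = ("proc", child)
            # The enclosing offset stays suspended underneath on the stack.
            process_declarations(nested, child, offsets)
    return offsets.pop()                   # total width; exposes the outer offset


# Single scope: int x; float y; char z; int p (no alignment or padding).
flat = SymbolTable("flat")
process_declarations(
    [("var", "x", "int"), ("var", "y", "float"),
     ("var", "z", "char"), ("var", "p", "int")],
    flat, [])

# Nested scope: int x; int y; procedure f with int p; then int z.
main = SymbolTable("main")
process_declarations(
    [("var", "x", "int"), ("var", "y", "int"),
     ("proc", "f", [("var", "p", "int")]),
     ("var", "z", "int")],
    main, [])
f = main.entries["f"][1]
```

With these assumed widths, `flat.entries` places `p` at offset 13, as in the lecture's example, and in the nested case `p` starts again at 0 inside `f` while `z` resumes at 8 in the enclosing scope. Calling `f.lookup("x")` follows the parent pointer and finds `x` in `main`.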