So, let's start from where we left off in the last class, when we started looking at the important issues in the design of the runtime system. We looked at the layout of the activation record with all the data that we need there. Now, when it comes to design, we also want to look at what the important issues are. So what I am going to do now is just flag these issues; all of them have some impact on the design of the runtime system. The first question we have to ask while designing a runtime system is: do I have recursive procedures or not? Whether I have procedures which are recursive or not is going to have an impact on the design of the runtime system. We will address each of these issues, but at this point of time I just want to flag them. Another important question is: when I return from a procedure, what happens to the local variables? Typically, in C or C-like languages, what happens to the locals when you return from a procedure or a function? The two things you are saying are X and not X; both cannot be true simultaneously. Either they stay as they are or they are discarded. So there is a contradiction, and you have to take a stand. The answer I am hearing is that we do not see those variables. We do not see those variables, but suppose that procedure is called once again immediately; then what happens? So that means they are discarded. I did not overwrite the memory locations, but they are discarded. Exploiting the code is a different issue. The programming language is not telling you to exploit the code. Think about what it means when I say a variable has been initialized. If I declare a variable, what happens? When I allocate memory to a variable and say X is going to get this particular location, do I overwrite everything with zeros? When I say this has been allocated, there will be some garbage value in it which I can read.
These are implementation issues, but we are talking about the programming language, not the implementation. So the question is, from the point of view of the programming language, what happens to locals when we return from the procedure? The programming language says that they are no longer available to us. Implementation-wise, I may not have been careful and there may be loopholes of this kind, but the programming language is very clear that locals are no longer accessible, and you should avoid this kind of exploitation. Then another issue that comes up is: what about names that are not local? How do I access them? How do I pass parameters? There are many methods of passing arguments, and we must know which method is in use so that I can design my runtime system. The next question is: can a function or procedure itself be a parameter? So far we have been passing arguments which are expressions. Can I pass a function or a procedure as a parameter? Can I return a procedure as a result? Can I pass some parameters to a procedure now and then later on invoke it with a few more arguments? What about the allocation of storage at runtime? Can I dynamically allocate storage? Some languages don't permit it. Some languages do. But if they do, then you also have to worry about how to manage it. And then we also have to do the deallocation. So again the question is: is storage deallocated automatically, do I explicitly need to deallocate it, or is there going to be garbage collection to get it back? These are the questions we have to ask before we can get into the design of a runtime system. All of these questions are going to impact the design. And slowly you will see that when I am talking about the runtime system, or about code generation for procedures, all these questions will come up. Now first let's look at the local data that is allocated on the stack, or in this case in the activation record.
So local data means the data within a procedure for which I am not going to do allocation at compile time. But I need to have my code prepared, or my activation record prepared, for it. And what I am going to assume is that a byte is the smallest addressable unit of data I have. Everything is going to be in terms of bytes. So I can allocate one byte or two bytes; I am not going to allocate three bytes; I will allocate four bytes, then eight bytes, and so on. The reason is that when I talk about words, older machines typically had two-byte words, but all newer machines have words of either four bytes or eight bytes. And we want to make sure that all my data allocation happens on word boundaries. There is an efficiency reason associated with that. When I access data which is aligned on a word boundary, I can do it much faster. If it is not aligned on a word boundary, then what is going to happen? I need more than one instruction to fetch the same data. Take a situation where I have two consecutive words, each of four bytes, and my data straddles these two words. I cannot fetch it in a single instruction. I need two instructions, and therefore this is going to be a lot less efficient, so we are not going to do that. Multibyte objects are stored in consecutive bytes and are given the address of the first byte. So if I have multibyte data I will start storing it from here, depending upon whether addressing goes from left to right or right to left. I will keep these four consecutive locations and access the object through that address, and the amount of storage needed is going to be determined by the type.
So if I have a character, I will allocate one byte to it (a short int would typically get two), and if I have a double precision floating point number, I will allocate eight bytes to it. So the type determines the size, and how I associate a number of bytes with a type is again an implementation issue. The programming language will not say that an integer is always going to take four bytes. Depending upon which machine the language is implemented on, it is determined whether an integer is going to be two bytes or four bytes. If you look at some older machines, short int and int took the same amount of space, and on other machines it was the other way around. So this also becomes an implementation issue. Memory allocation is also going to be done as declarations are processed. This is an important point. As I am parsing my program I see declarations. Suppose I have int a, b; then I have float c; then char d; and then again int e, f. Suppose this is how I have written my declarations. In which order am I going to allocate space? Normally, if I am not worried about optimization of space, which some compilers do perform, I will first allocate space for a, then for b, and then for c and d, in this order. What this order ensures is that I can allocate space as I am parsing my program. As I see a declaration, I allocate the space. But what may happen is that if I align everything on word boundaries, I may end up wasting space. I have to insert what we know as padding bytes. For example, let's assume that each integer takes four bytes, c takes eight bytes, a character takes one byte, and again each integer takes four bytes.
And I have a four-byte machine, where a word is four bytes. So how will the allocation go? It will go something like this: this word gets associated with a, this one with b, and the next two words get associated with c. Now when d comes, what will happen? This space will go to d, which is a single byte. Now when e comes, do I start allocating from here? The reason I will not allocate from here is that e would then sit in these three bytes plus one byte of the next word, and accessing it would be highly inefficient. So what will I do? I will put e here and f here. But then you will see that these three bytes are being wasted. On most large general purpose machines this is not too much of an overhead, I don't have any problem, and I can live with it. But imagine a situation where you have an EPROM with 4K of space into which your boot loader has to fit; then you cannot afford to waste bytes. In that case you have to use different strategies. One possible strategy is that you take all these declarations and sort them in the order of types. And once you have sorted them, you start with the largest type and keep on allocating. So what will happen in that case? c will get the first space, then a, b, e and f will get space, then d, and the three padding bytes will come at the back. Now suppose you have several such declarations. Normally, each time, one byte gets allocated and three bytes go in padding; but if I had sorted the declarations, I could have used those bytes, and only two bytes would have been wasted in padding instead of six. So depending upon your final requirement you may do this, but compilers do not always try to save space.
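
The allocation walk-through above can be sketched as a small offset calculator. This is only an illustration, not how any particular compiler does it: the names `align_up` and `layout` are hypothetical, and the rule "align each object to the smaller of its size and the word size" is an assumption chosen to match the four-byte-word example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round offset up to the next multiple of align. */
static size_t align_up(size_t offset, size_t align) {
    return (offset + align - 1) / align * align;
}

/* Assign offsets to n objects in the order given.
 * sizes[i]   : size of object i in bytes
 * word       : word size of the (assumed) target machine in bytes
 * offsets[i] : receives the byte offset chosen for object i
 * Returns the total space used, rounded up to a whole word.
 * Assumption: an object is aligned to min(its size, word). */
size_t layout(const size_t *sizes, size_t n, size_t word, size_t *offsets) {
    size_t next = 0;                      /* next free byte */
    for (size_t i = 0; i < n; i++) {
        assert(sizes[i] > 0);
        size_t align = sizes[i] < word ? sizes[i] : word;
        next = align_up(next, align);     /* skipped bytes are padding */
        offsets[i] = next;
        next += sizes[i];
    }
    return align_up(next, word);
}
```

With sizes 4, 1, 4, 1 in declaration order on a four-byte-word machine, this sketch uses 16 bytes for 10 bytes of data (six bytes of padding); with the same objects sorted largest first (4, 4, 1, 1) it uses 12 bytes (two bytes of padding), which is the saving described above.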
I mean, if your target machine is such that you need to save space, you have a very tight space requirement. Imagine that in the lab you are asked to allocate a single bit. Is there a group which is doing a compiler or a front end for Pascal? Is that group around? Do you know the packed array data type? You say you are not implementing it. You are not implementing it, but Pascal has a data type called packed array of Boolean. The idea of a packed array of Boolean is that I will be able to allocate one bit for every element. That means if I have an array of size 32, I can fit it in one word, and each bit is then allocated to one element. Now this is highly inefficient as far as access is concerned. How will you find out the value of a bit in such an allocation? What kind of code will you write? You still need to access the whole word. So for accessing any element you need to access the whole word, and then depending upon the bit position you do masking using an AND, and then you get the bit value. That is an extra operation. What most Pascal compilers do is say: don't worry about it, we are just going to allocate one word for every element. The programmer may say that I have packed data and I am trying to save memory, but the compiler writer finds that memory is not a constraint on this particular machine, asks why the code should be slowed down, and therefore allocates one word for every element of this array. That makes access more efficient. So these kinds of decisions compiler writers do take. Many times in C you say, I want this variable to be a register variable. What does that mean? Suppose you declare that every data item has to be in a register. You can give such a declaration; what will happen? What will the compiler do? It just ignores it.
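
Going back to the packed array of Boolean for a moment, the "fetch the whole word, then mask with an AND" access can be sketched as follows. This is a minimal sketch assuming 32-bit words and bit 0 as the least significant bit; the names `packed_get` and `packed_set` are hypothetical and are not taken from any Pascal compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { BITS_PER_WORD = 32 };   /* assumed word size in bits */

/* Read element i of a Boolean array stored one bit per element. */
int packed_get(const uint32_t *words, unsigned i) {
    assert(words != NULL);
    uint32_t word = words[i / BITS_PER_WORD];          /* fetch the whole word */
    return (int)((word >> (i % BITS_PER_WORD)) & 1u);  /* shift, then mask with AND */
}

/* Write element i: read the word, change one bit, write the word back. */
void packed_set(uint32_t *words, unsigned i, int value) {
    assert(words != NULL);
    uint32_t mask = (uint32_t)1 << (i % BITS_PER_WORD);
    if (value)
        words[i / BITS_PER_WORD] |= mask;
    else
        words[i / BITS_PER_WORD] &= ~mask;
}
```

Every read costs a divide (or shift), a load, a shift and an AND, where an unpacked array needs only the load; that extra work is the trade-off against the memory saved.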
There is a physical limitation on how many registers your machine can have, and if the compiler tries to allocate everything in registers, then all the time it will be spilling: moving something from a register to memory and loading it back. So it ignores what the programmer wants and allocates in registers only what is essential. So as compiler writers, many times we need to take decisions which go beyond what the programming language has specified, as long as the functionality is preserved, because many times programming languages try to decide efficiency issues which are really beyond them. For example, when somebody says packed array, the programmer and the language designer are saying that this is the efficient way of implementing it, but it is the compiler writer who knows the machine. So we have to worry about that. We can pack data so that no padding is left, but additional instructions may be required to access it, and many times you don't want that unless it is really essential, as on embedded systems, where space is a critical requirement.

Now what are the storage allocation strategies? Static allocation is one. What is static allocation? Static allocation says that the activation record is statically bound to a procedure and given space. Another allocation strategy we have seen is what we know as stack allocation: the activation record is put on the stack. That is the second strategy. The third strategy obviously is that you put it on the heap. As and when you require some space you put it on the heap, but there is no guarantee that you can access it either at a fixed location or through a stack pointer; it could be anywhere, and you need to maintain additional pointers for that. This again is a requirement coming from the language side.
So let's look at all three strategies and see in which case each one is beneficial. First, static allocation. Does anyone know which language uses static allocation of activation records? To recall, I showed you that we have code, then we have data here, then we have the stack growing, and the stack is what was keeping all my activation records, and then I have the heap growing here. Now suppose I have a main program; let me go back to the quicksort example. The main program was sort, and within that I had readarray, and what else did I have? I had partition, quicksort, and maybe exchange, right? These are the functions; it does not matter what the function bodies are. What static allocation says is: don't worry about the stack part. Let's say that the data of main is already in the global data. Then I have an activation record for readarray, one corresponding to partition, one corresponding to quicksort, and one corresponding to exchange. These activation records are already created, and I have assigned all the data to them. Statically I have allocated space to each of the activation records. What is the advantage of this and what is the disadvantage? Does this model have an advantage at all? One clear advantage is that at compile time I know the address of each of the activation records, whereas with a stack I will have to determine it at run time. So if I am trying to access a variable, I don't need any pointers; from the symbol table I know right away where this variable is going to be allocated. Another interesting thing is what happens when my control goes from the main program to readarray. When readarray finishes, what do I do with this space?
So the first call was in the main program: I called readarray. What did I do in stack allocation? I deallocated it. What do I do in static allocation? Nothing; it remains there. That means if I call readarray once again, all the local variables will be available to me with their old values. So I can use a local variable to keep track of, for example, how many times this function has been invoked. I can increment this counter, which is not possible in stack allocation. So names are bound to storage as the program is compiled. I don't have to wait for any run time support, which I really need in stack allocation, and when the program runs these bindings do not change. That means every time I call readarray, this is going to be the binding. This is where readarray's record always is, and for any invocation the procedure name is bound to the same storage; every time I call either readarray or quicksort, I bind them to this location. But this puts a very serious constraint. What is that constraint? I cannot have recursion. So values of local names are retained across activations, and the type of a name obviously determines the amount of storage that has to be set aside. The constraint is that the size of a data object must be known at compile time. That means no dynamic allocation. Recursive procedures are not allowed and data structures cannot be created dynamically; everything must happen at compile time. All the addresses can be filled in at compile time, I do not need access links, and the compiler decides the location of each of the activation records. Obviously it is not a good model, but this was the model which was used earlier because it was simple. This is something people understood. In fact, you will see that all the theory of compilers was developed after the first compiler was built.
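
Both properties of static allocation mentioned above, locals that keep their old values and the impossibility of recursion, can be imitated in C with `static` storage. This is only a sketch under that assumption: C itself uses stack allocation for ordinary locals, and the function names here are hypothetical.

```c
#include <assert.h>

/* A retained local: under static allocation every local behaves like this.
 * Returns how many times the function has been invoked so far. */
int count_calls(void) {
    static int calls = 0;   /* one fixed location, bound at compile time */
    calls++;
    return calls;
}

/* The single, statically allocated slot standing in for fact_static's
 * activation record. Every activation shares it. */
static int fact_n;

/* Factorial written as if locals were statically allocated.
 * The recursive call overwrites fact_n, so the result is wrong for n > 1. */
int fact_static(int n) {
    fact_n = n;
    if (fact_n <= 1)
        return 1;
    int r = fact_static(fact_n - 1);  /* inner activation clobbers fact_n */
    return fact_n * r;                /* fact_n is now 1, not this call's n */
}

/* The same function with ordinary stack-allocated locals. */
int fact_stack(int n) {
    assert(n >= 0);
    if (n <= 1)
        return 1;
    return n * fact_stack(n - 1);     /* each activation has its own n */
}
```

In this sketch `fact_stack(4)` gives 24 while `fact_static(4)` gives 1, because by the time each outer activation multiplies, the shared slot has been overwritten by the innermost call.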
There was no theory of lexical analysis, there was no theory of code generation or register allocation; almost everything happened after the first compiler came. So with the first compiler, people were able to get something functionally working with whatever they thought was fit at that point of time. Similarly, if you look at row major and column major: they thought that column major was going to be more efficient, or column major is simply the one they picked, and later on you find that almost every language used row major.

Now, stack allocation; what happens there? In stack allocation these activation records are not going to be allocated space at compile time, but I will know the layout of each of them at compile time. The actual space allocation will happen only when my program starts running. So what happens here? Typically, when I have the main program, the sort program, its activation goes on the stack. After readarray finishes and I make a call to quicksort(1, 9), the activation of readarray is popped from the stack and quicksort(1, 9) comes onto the stack. This is unlike static allocation, where readarray's record always remains. And if I am somewhere here in my call tree, and this node is the one at the top, then the only nodes which will be on my stack are sort, quicksort(1, 9) and quicksort(1, 3); readarray and partition(1, 9) have already been popped. So then what about the call sequence? In stack allocation somebody will have to put in the data. This data is not going to be known to me at compile time. It has to be initialized at runtime, so I need to generate additional code for managing my activation record. So I have a call sequence, which allocates an activation record and enters information into its fields.
I will actually show you later, after we have looked at all the issues, how the code for initializing an activation record looks. The return sequence is where I restore the state of the machine. So look at this. Suppose I have two activations, and the stack is growing in this direction; this is the activation record of the calling procedure and this is the activation record of the procedure which has been called. So this is the caller, and these are the data items it has: parameter values and return value, because this procedure must also have been invoked by someone. So this is like a snapshot of two consecutive activations on my stack. You can see that they have exactly the same structure; only the size of the data may differ. I have control links, I have access links, and then I have space for data, parameters, return value and so on. Now the first question that comes up is: if this is the caller and this is the callee, who fills in this data? I am now going into the detail of saying that if some data has to be initialized, is it the job of the caller, so that the code should be part of the calling procedure, or should that code be part of the callee? Start looking from here. This is the caller's activation record and this is the callee's activation record. Now, in the caller's own activation record, who filled in the parameters and the space for the return value? There must have been a caller of this caller which initialized them; obviously, if A calls B, it is A that knows what parameters are being passed.
So this part must have been filled in by the caller of this caller; similarly the control link, which says where the control has to go back, must have been filled in there. But if you come to this part, it must have been filled in by the procedure itself, which says what the links are, what machine status I am saving, and how much temporary space I want. So typically what happens is that this part is filled in by the caller, and this part of the callee's activation record is filled in by the callee, in the same manner as this part was filled in by the procedure which owns this activation record. So when I go with this structure, it does not mean that the whole activation record is filled in by one particular procedure. Therefore, when I generate code for the call sequence and the return sequence, this code must be part of the caller and this code must be part of the called procedure. Functionally, you can see that this is where I have information from the caller and this is where I have information from the callee. The caller will not know what local variables are required in the procedure which is being called. Does that make sense? So this is what we have to remember: when we start doing code generation, we will start filling in all this data. So there is a division of responsibility between caller and callee; let's look at what happens when we actually generate code. What is my call sequence? The caller evaluates all the actual parameters. It has to supply all the parameters, so the caller's part makes sure that all these parameter values are filled in there. Then the caller also stores the return address and other values, such as the control link, into the callee's activation record. It is telling the callee: this is where you send the control back.
Then the callee saves all the register values and other status information before it starts, and once it has done that, it initializes its own local data and begins execution. So there are certain responsibilities in the call sequence which have to be taken by the procedure which invokes, and certain responsibilities which have to be taken by the procedure which gets invoked. And what is the return sequence then? The callee places the return value next to the activation record of the caller. Now something very interesting is happening here. Suppose I have a situation where A calls B; this is A's record and this is B's. After B finishes, the assumption is that B's activation will no longer be accessible to me. But this instruction is saying that the callee places the return value next to the activation record of the caller. The callee is placing the return value here and not putting it in A. Why do I do that, and why is it put in the location immediately next to A's record and not somewhere else? Now, I can put it somewhere else. I can put the value in a register, for example; nothing stops me from doing that. If I can do efficient code generation I can put it anywhere. The advantage of this scheme is the following. Normally, when I say that I am deallocating some space, what that means is that I can use it for something else; it does not mean that I am going to reinitialize all the bits. So when I put my return value here: earlier my stack pointer was here; when I deallocate, my stack pointer becomes this; and then I say that whatever is at stack pointer plus 1, logically 1, is the return value.
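
The division of work between caller and callee, and the convention that the return value sits at stack pointer plus 1 after deallocation, can be sketched with a simulated stack. This is a minimal sketch, not a real calling convention: the record layout (return-value slot, one parameter, control link, one local), the upward-growing array, and the names `call_square` and `callee_square` are all assumptions, and the return address is left to the underlying C call.

```c
#include <assert.h>

enum { STACK_WORDS = 64 };
static int stack[STACK_WORDS];   /* simulated stack of words, growing upward */
static int sp = -1;              /* index of the last word in use */

/* Assumed layout of the callee's record, relative to the caller's sp:
 *   sp+1 : return value slot (next to the caller's record)
 *   sp+2 : actual parameter          (filled in by the caller)
 *   sp+3 : control link = caller's sp (filled in by the caller)
 *   sp+4 : local variable            (allocated by the callee)   */

/* Callee's half of the call and return sequences. */
static void callee_square(void) {
    int fp = sp;                               /* fp indexes the control link */
    sp = sp + 1;                               /* allocate the local */
    stack[sp] = stack[fp - 1] * stack[fp - 1]; /* local = parameter squared */
    stack[fp - 2] = stack[sp];                 /* place the return value */
    sp = stack[fp];                            /* deallocate via control link */
}

/* Caller's half: evaluate the parameter, fill in its part of the record,
 * transfer control, then copy the result from sp + 1. */
int call_square(int arg) {
    int old_sp = sp;
    assert(old_sp + 4 < STACK_WORDS);
    stack[old_sp + 1] = 0;        /* space for the return value */
    stack[old_sp + 2] = arg;      /* actual parameter */
    stack[old_sp + 3] = old_sp;   /* control link */
    sp = old_sp + 3;
    callee_square();
    assert(sp == old_sp);         /* callee's record has been deallocated */
    return stack[sp + 1];         /* fixed offset: stack pointer plus 1 */
}
```

Note that the callee's record is only marked as free, not wiped, which is why the value is still there to be read at the fixed offset once the stack pointer has been restored.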
So before the code of A resumes execution, it takes whatever is at stack pointer plus 1, copies it into its own local variable, and continues from there; that is the return sequence. Now why do I keep the value in this location and not anywhere else? The reason is very simple. Anywhere else is going to be arbitrary; for every activation record that offset is going to be different. Here the offset is always stack pointer plus 1, a fixed value irrespective of which procedure is being invoked. So once the callee places the value here, it restores the register values and branches to the return address, which is stored in the control fields, and then the caller copies the return value into its own activation record. So the caller will have some code sequence where this value is copied, and then execution continues.

Now, when I was talking about the activation record there is one issue I did not address, so let's go back to the activation record once again. I have space for local data. Suppose you have local data which is very large, say an array of size 1000, and I try to allocate space for it there; what will happen to my activation record? It will become really big. I don't want that, and what is the reason for not having a big activation record? The answer I am hearing is that it will go onto another page; the page will change. The page will change, and the simple thing is that normally I will have one pointer which is the stack pointer and another which can be the frame pointer, so I can have pointers at two ends, and I want all the data I am accessing to be within a small range of either the stack pointer or the frame pointer.
So the page may change, and even if the data fits in a page, the offset is large and therefore access becomes slow. So normally, when I have large local data, I don't keep it in the activation record. What I can do is this: I have the activation record of P, and if I have arrays A, B and C which are large arrays, I push these arrays outside the activation record. But now I have to do additional bookkeeping for my stack pointer. Normally, when I update the stack pointer, I will say stack pointer is stack pointer plus size of activation record; in this case it will be stack pointer plus size of local data plus size of activation record. So additional bookkeeping has to be done; I have to remember that I have created some local data here.

Another problem with stack allocation is the dangling reference: sometimes programmers are careless and try to access a location which has already been deallocated. A common situation is that you have a local variable and you return the address of this local variable; as soon as control returns, technically what happens is that this space has been deallocated. Now suppose you preserve this pointer, and after some time you make another procedure call and the same space is allocated to the next function. What will happen in that case? When I try to access this data, I will get the wrong data. Many people, unless they know how stack allocation happens, may not realize what is wrong with this, but once you understand stack allocation you immediately know what is wrong: the program is trying to gain access to something which is no longer allocated to it.
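
The dangling reference can be shown safely on a simulated stack rather than with a real pointer, since in real C using the address of a local after its function has returned is undefined behaviour. This is a hypothetical sketch: "addresses" are indices into an array, and the names `leak_local`, `other_call` and `read_word` are made up for the illustration.

```c
#include <assert.h>

enum { STACK_WORDS = 16 };
static int stack[STACK_WORDS];   /* simulated stack */
static int sp = 0;               /* index of the next free word */

/* Allocates one local, stores 42 in it, deallocates it on return,
 * and returns the local's "address". This is the bug under discussion. */
int leak_local(void) {
    assert(sp < STACK_WORDS);
    int local = sp++;            /* allocate */
    stack[local] = 42;
    sp--;                        /* deallocate: the slot may now be reused */
    return local;                /* dangling reference */
}

/* Some later call, which is given the very same slot for its own local. */
void other_call(void) {
    assert(sp < STACK_WORDS);
    int local = sp++;
    stack[local] = 99;
    sp--;
}

/* Read through a saved "address". */
int read_word(int addr) {
    assert(addr >= 0 && addr < STACK_WORDS);
    return stack[addr];
}
```

In this sketch a read through the saved address gives 42 immediately after `leak_local` returns, because deallocation does not wipe the slot, and gives 99 once `other_call` has run; the program silently sees another procedure's data.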
This also becomes a loophole which viruses, stack overflow attacks and so on try to exploit. But as a programmer you have checks and balances available: you can push your program through checkers, and they will say, you are trying to return a pointer to a local variable, which is not permissible, and they will not even permit you to compile the program. So there are checkers available for this, but this is something that, as a programmer, you should not do in the first place. And you can explain it only through stack allocation; if you show such a program to a C programmer who does not know the runtime model, you will find that they do not even understand what is wrong with it. So I think I will have to stop here today. Today we need to take the course feedback, and one of you will have to volunteer, come forward and conduct it. Who will volunteer for doing the course feedback?