Welcome to part 7 of the lecture on machine independent optimizations. Today we will continue our discussion of static single assignment (SSA) form and its application to optimizations. We will look at the constant propagation algorithm, and the variety we are going to look at is the conditional constant propagation algorithm.
To do a bit of recap: a program is in static single assignment form if each use of a variable is reached by exactly one definition. As I already mentioned, the flow of control remains the same as in the non-SSA form. There is a variable merge operator, phi, which is used for the selection of values at join nodes. And of course, conditional constant propagation is faster and more effective on SSA form.
The conditional constant propagation algorithm uses SSA form with extra edges corresponding to the definition-use chains. These are called SSA edges. We use both the flow graph edges and the SSA edges and maintain two different worklists, one for the flow graph edges and the other for the SSA edges. These are called the flow pile and the SSA pile respectively. Flow graph edges are used to keep track of reachable code, and SSA edges help in the propagation of values. Flow graph edges are added to the flow pile whenever a branch node is symbolically executed or whenever an assignment node has a single successor. The SSA edges coming out of a node are added to the SSA pile whenever the value of the variable assigned at that node changes. The reason we do this is to make sure that a node affected by the change in value is processed as soon as possible, without having to go through all the edges in the flow graph before we arrive at that node.
Usually, towards the end of the algorithm there will be very few changes in values and very few nodes will be affected. In this phase of the algorithm it is beneficial if the affected nodes are informed directly. The SSA edges ensure that all the uses of a definition are processed whenever the definition changes its value, and this algorithm requires much less storage than its non-SSA counterpart.
Conditional expressions at branch nodes are evaluated. If the value is true, the true edge is added to the flow pile. If the value is false, the false edge is added to the flow pile. If the value is not known, both edges are added to the flow pile. At a join node, however, the meet operation considers only those predecessors which are marked as executable. This allows us to remove code which can never be reached.
I gave you this example last time, but we did not work through it, so let us do that now. We have the initialization of the three variables a1, b1 and c1, and this is already in SSA form. We have a phi operator for b2 to choose between the value coming along the back edge and the value coming from the top. Similarly, c2 has a phi operator to choose between the values coming from the two sides.
Let us see how this works. To begin with, we add the entry edge to the flow pile. When we extract this edge from the flow pile, the first node, B1, is processed. We symbolically assign the lattice values: constant 1 for a1, constant 1 for b1 and constant 0 for c1. Since there is only one outgoing edge from B1, it is added to the flow pile, and when that edge is processed the next node is taken up for processing as well. Here we have the phi functions. Now the back edge is not yet marked as executable; only the edges we have processed so far are marked executable. So the value coming along the back edge is irrelevant to us at this time.
So the phi function will consider only the value coming from the top, that is b1. b2 is evaluated as phi of b1, and with just one parameter the phi gives you the value of that parameter, which is 1. Similarly, c2 gives us 0. Now c2 < 100 can be evaluated to true because c2 has the value 0, and once that is done the true branch is the only one which is relevant, and it is added to the flow pile.
When we process this edge we process the next node, and we see that b2 < 20 is true because b2 has the value 1. Again the true edge is added to the flow pile, and when we take up processing B5 the two assignments are evaluated. That gives us b3 = a1 = 1, because a1 has not changed, and c3 = c2 + 1. c2 is just 0, and therefore c2 + 1 is 1. Then we add the outgoing edge to the flow pile, and when we process this edge the node B7 is processed. At B7 we again have exactly one executable incoming edge; the other one is not yet marked as executable. So b4 becomes phi of b3, corresponding to this incoming edge, and that is the value of b3, which is 1. c4 becomes phi of c3, which is the value 1 again.
Now the back edge is marked as executable, and that brings us to the node B2 for a second visit. During the second visit there is no change in the value of b2, because the value coming along the back edge is 1 and the value coming from the top is also 1. The value of c2 does change: c4 from the back edge is 1 whereas c1 from the top is 0. The meet of these two values, 1 and 0, is not a constant. So c2 is marked as not a constant, and since its value has changed, the two SSA edges leaving it are added to the SSA pile. The value of c2 < 100 is now unknown because c2 is not a constant.
So both the true and false edges will be marked as executable. We had already processed the true edge, so there is no need to add it to the pile again, but the false edge will be added to the flow pile. Since the true edge has already been marked, the node it leads to will not be processed once more unless a value changes and an SSA edge brings us there, which indeed happens in this case. So the flow pile consists of only the false edge and nothing else, and obviously there is nothing to do in a stop node, so there is no change as far as any of the values are concerned.
When we consider the SSA edges and process them, we look at the node B5. When B5 is evaluated we see that b3 is a1, which is 1, and c3 is c2 + 1, which is not a constant. Previously b3 was 1 and c3 was also 1; now, because c2 has changed its value to not a constant, c3 becomes not a constant. This is a change in value for c3, and that is made known to the algorithm by adding the SSA edge from B5 to B7 to the SSA pile. Of course, we must process the node B6 also, because it has been affected by the change at B2, but it so happens that its incoming edge is not yet marked as executable. So the processing of this node will not be taken up, because it is still unreachable via executable edges. Nothing happens in B6.
Then we take up the SSA edge to B7 and process the node B7. b4 is computed as phi of b3, which is 1, because the other incoming edge is still not marked as executable. c4 now gets the value not a constant because of c3. Previously c4 had the value 1 and b4 had the value 1; now b4 retains the value 1, but c4 becomes not a constant. So at this point there is a change in the value of c4, and that affects the node B2.
So the SSA edge from B7 to B2 has to be taken up. In the third visit to B2 there is no change in either b2 or c2, and of course the expression c2 < 100 also remains the same. Since there is no change, and there are no more edges to be processed in either the SSA pile or the flow pile, the algorithm stops.
Once the algorithm stops we can do some optimizations such as dead code elimination. After the first round of simplification we have b2 equal to 1, and c2 of course is not a constant, so it remains as it is: c2 = phi(c4, c1). The expression c2 < 100 could not be evaluated, so it remains as it is, and both the true and false edges have been added. b3 remains 1 and the expression for c3 remains c2 + 1, because it became not a constant. Again b4 remains 1, and c4 can be simplified to phi(c3), because there is only one executable incoming edge and that carries c3 itself.
After this round there is more simplification possible. For example, b2 is 1, b3 is 1 and b4 is 1, but none of these is really used anywhere, so we can remove such code. All the constants here are unused and can be thrown away, and where a constant value is used we can replace the variable by the constant itself. If we do all these simple modifications to the program using dead code elimination, trivial phi function elimination, copy propagation and so on, we get the final form of the code, which is a very compact piece of code. But remember, this is still in SSA form; it has a phi function. So this is how the constant propagation takes place. From the first flow graph we have come a long way: we have eliminated a large number of blocks and statements, simplified other statements, and the flow graph has shrunk to a small piece.
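The lattice operations used in this walkthrough can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the full algorithm: the names `TOP`, `NAC`, `meet`, `eval_phi`, `eval_less_than` and `branch_edges` are introduced here for illustration, and `TOP` stands for "no value seen yet", which the walkthrough only implies. The sketch shows the two rules that drive the example: a phi takes the meet over executable incoming edges only, and a branch adds one edge when its condition is a known constant and both edges when it is not a constant.

```python
TOP = "top"             # assumed name: no value has reached this variable yet
NAC = "not-a-constant"  # the value differs along different paths

def meet(x, y):
    """Lattice meet: equal constants stay constant, different ones become NAC."""
    if x == TOP:
        return y
    if y == TOP:
        return x
    if x == NAC or y == NAC:
        return NAC
    return x if x == y else NAC

def eval_phi(args, executable):
    """Meet of the phi arguments, ignoring edges not yet marked executable."""
    result = TOP
    for value, is_executable in zip(args, executable):
        if is_executable:
            result = meet(result, value)
    return result

def eval_less_than(x, k):
    """Symbolically evaluate x < k for a lattice value x and a constant k."""
    return x if x in (TOP, NAC) else x < k

def branch_edges(cond):
    """Edges to add to the flow pile for a branch on a lattice value."""
    if cond == NAC:
        return ["true", "false"]
    if cond == TOP:
        return []  # assumption: nothing is added while the condition is unseen
    return ["true"] if cond else ["false"]

# First visit: c2 = phi(c4, c1); only the edge from the top is executable.
print(eval_phi([TOP, 0], [False, True]))
# Second visit: the back edge is executable too, and carries c4 = 1.
print(eval_phi([1, 0], [True, True]))
```

On the second visit the meet of 1 and 0 is `NAC`, which is why both edges of the `c2 < 100` branch become executable in the walkthrough.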
So that is about static single assignment form and conditional constant propagation. Now we move on to a different topic: instruction scheduling and software pipelining.
In these lectures we are going to understand instruction scheduling. Specifically, we will consider simple basic block scheduling, which is based on list scheduling algorithms. We will also see trace, superblock and hyperblock scheduling, which are useful for processors with multiple function units, vector processors and so on. The second type of optimization on machine code is called software pipelining. Instruction scheduling and software pipelining are machine dependent optimizations: they depend on the machine architecture and the machine instructions, so they cannot be performed very effectively before we perform code generation.
What exactly is instruction scheduling? It is just the reordering of instructions so as to keep the pipelines of the function units full, with no stalls. That is the goal of instruction scheduling. I will give you an example to show what this means. The problem of this reordering is, like all good or difficult problems in computer science, NP-complete. Once it is NP-complete we can only apply heuristics to overcome the exponential explosion, and the heuristics will not produce optimal results, but they will produce decent results. If scheduling is applied on basic blocks alone it is called local instruction scheduling, and if it is applied on several basic blocks at a time, such as superblocks, it is called global scheduling. This requires elongation of basic blocks, similar to the extended basic blocks that we studied long back.
So let us take this example. We have several instructions here.
There are two load instructions: T1 gets a and T2 gets b. Then we have T3 as T1 + T2, T4 is a load again, T5 is a minus instruction, T6 is a multiplication, and then d gets the value via a store. So this is the instruction sequence i1 to i7 that we are trying to execute.
If you look at this sequence and see how the results are used, we get a graph of this type, which is a dependence graph. For example, T3 is T1 + T2, so this computation necessarily has to wait until the loads of both a and b are complete. That is indicated by the two arcs from the two loads to the add. Similarly, the value of this addition is used later in i5, so we have an edge from i3 to i5, and the load i4 also feeds a value to i5, so we have another edge from i4 to i5. We have an edge from i5 to i6 indicating that T5 is used in i6, and finally T6 is used in the store instruction, which is indicated by the edge from i6 to i7. The value produced by the addition is also used by the multiplication, so T3 is used in i6 as well, and there is an edge from i3 to i6. This is the dependence diagram that is relevant for this basic block.
As I mentioned, the evaluation of T1 + T2 cannot take place until both T1 and T2 are ready, that is, until both these loads are completed. Assume that a load requires two cycles and any other operation requires only one cycle. After the load of T1 is initiated we can go and initiate the load of T2. When we then try to evaluate T1 + T2, T1 has completed because it has finished two cycles, but T2 has not yet finished. So we cannot evaluate the add at this point.
This is said to be a stall, and we need to introduce a no-op instruction to account for the idle cycle. Then there is another load, i4, and the result of that load, T4, is used immediately in the next instruction. Because a load requires two cycles, we cannot evaluate T3 - T4 in that cycle; immediately after the load of T4 we have to wait for one cycle and then evaluate it. So at i3 and i5 we have two stalls, and we need to introduce one no-op after i2 and another no-op after i4 to make sure that the code executes properly.
The purpose of instruction scheduling is to try and eliminate such stalls. Let us see how we can eliminate these stalls by reordering the instructions. We have the same instructions, but the sequence of the load instructions has been changed. We have the load into R1, then the load into R2, and then, instead of the instruction computing R3, the load into R4. Previously the order was R1, R2, R3; now it is R1, R2, R4. That is the difference. After the loads into R1 and R2 are initiated, we initiate the next load rather than going directly to the addition. This gives enough time for the second load to complete, so at the addition both R1 and R2 have their values available. Of course, if the addition had used R4 instead of R2 we would not have had that value ready, but it does not use it. So R1 + R2 can be executed directly. By the time we reach R3 - R4 the load into R4 has also completed, so R3 - R4 can be executed in the next cycle. R3 * R5 and the store of R6 can be executed in the following cycles. So this code requires only 7 cycles and has no stalls at all, whereas the previous code has 7 instructions plus 2 no-ops, which is 9 cycles, and it had 2 stalls.
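The cycle counts above can be checked with a small simulation. This is a sketch under the assumptions stated in the lecture (one instruction issued per cycle, in order; a load takes two cycles, everything else one). The function name `count_cycles`, the tuple encoding of instructions and the memory name `c` for the third load are assumptions made for illustration.

```python
LOAD_LATENCY = 2   # cycles before a loaded value can be used
OTHER_LATENCY = 1  # cycles for any other operation

def count_cycles(instrs):
    """Return (total cycles, stall cycles) for an in-order, single-issue pipeline.

    Each instruction is (opcode, destination or None, list of source names).
    """
    ready = {}   # register name -> first cycle in which its value is available
    cycle = 0    # next free issue slot
    stalls = 0
    for op, dest, srcs in instrs:
        # An instruction issues once its slot is free and all sources are ready.
        start = max([cycle] + [ready.get(s, 0) for s in srcs])
        stalls += start - cycle          # idle slots that would need no-ops
        latency = LOAD_LATENCY if op == "load" else OTHER_LATENCY
        if dest is not None:
            ready[dest] = start + latency
        cycle = start + 1
    return cycle, stalls

i1 = ("load", "t1", ["a"])
i2 = ("load", "t2", ["b"])
i3 = ("add", "t3", ["t1", "t2"])
i4 = ("load", "t4", ["c"])          # memory name assumed
i5 = ("sub", "t5", ["t3", "t4"])
i6 = ("mul", "t6", ["t3", "t5"])
i7 = ("store", None, ["t6"])

original = [i1, i2, i3, i4, i5, i6, i7]
reordered = [i1, i2, i4, i3, i5, i6, i7]

print(count_cycles(original))   # (9, 2)
print(count_cycles(reordered))  # (7, 0)
```

Moving the third load ahead of the addition fills the slot that would otherwise hold a no-op, which is where both saved cycles come from.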
So this is the purpose of instruction scheduling: we try to eliminate as many stalls as possible and to introduce as few no-ops as possible into the instruction sequence. If we did have a stall after I2 and another one just before I5, the pipeline would be stuck at those points; it cannot proceed further until the load is ready. This takes more cycles, and therefore the program requires more time to execute.
Let us go through some definitions before we take up the algorithm for instruction scheduling. Say there are three instructions: in I1, R1 is the load of R2; in I2, R3 is R1 + 4; and in I3, R1 is R4 + R5. What is significant here is that R1 is written in I1 and then used in I2; R1 is used in I2 and then written again in I3; and R1 is written in both I1 and I3. These are three different types of dependences. The first one, R1 being written and then used, is called a flow dependence, and it is indicated as I1 delta I2. The second one is between I2 and I3: we are using R1 and then writing into R1. This is called an anti dependence, and it is indicated as I2 delta-bar I3. The third one, between I1 and I3 for the same R1, is called an output dependence. A dependence is always indicated between instructions: I1 and I2, I2 and I3, I1 and I3, and so on.
Such dependences are important for two reasons. First, they restrict the parallelization that can be performed later. Second, in instruction scheduling we cannot schedule an instruction ahead of another instruction on which it depends. And here is another important point: anti and output dependences can be eliminated by register renaming. A flow dependence is also called a true dependence and it cannot be eliminated by any transformation, but anti and output dependences can be. For example, suppose we use R1 in I2 and R1 prime in I3.
Say R1 and R1 prime are two different registers. In that case there is no anti dependence between I2 and I3. Similarly, if I1 writes R1 and I3 writes R1 prime, these are two different registers and thereby the output dependence is also eliminated. In some machines such register renaming can be done by the hardware at run time; otherwise compilers can perform this register renaming and eliminate such dependences. In fact, if you recall, in Chaitin's register allocation algorithm we clearly said that every live range corresponds to exactly one variable. So the second write goes to a different variable and the output dependence automatically gets eliminated. Similarly, since we are doing renaming in that algorithm, the anti dependence also gets eliminated.
Let us look at a full example of a dependence directed acyclic graph. Here is the basic block corresponding to this directed acyclic graph. It has nine instructions. We have shown the flow dependences as full lines, for example I1 to I2, I1 to I4 and so on. We have shown the anti dependences as dashed edges and the output dependences as dash-dot edges.
Why should we indicate so many dependences in this particular diagram? It so happens that there are two load instructions and two store instructions, and the compiler has not been able to determine that a, b and c are all distinct memory locations. So the compiler assumes that a, b and c may be the same memory location. It therefore adds edges between the load instructions and the store instructions to indicate anti dependences: between I1 and I8, and of course between I2 and I8 and between I2 and I9. The output dependence between the two stores is added for the same reason.
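The three register dependence types defined earlier, and the effect of renaming, can be sketched as follows. This is a deliberately naive illustration: it compares every pair of instructions and ignores intervening redefinitions, which is enough for the three-instruction example but not for general code. The function name `dependences` and the tuple encoding are assumptions made for illustration.

```python
def dependences(instrs):
    """List (kind, earlier, later) for each pair of instructions.

    Each instruction is (name, destination register, set of source registers).
    Naive pairwise check: intervening writes are not taken into account.
    """
    found = []
    for i, (n1, d1, s1) in enumerate(instrs):
        for n2, d2, s2 in instrs[i + 1:]:
            if d1 in s2:
                found.append(("flow", n1, n2))    # write, then read
            if d2 in s1:
                found.append(("anti", n1, n2))    # read, then write
            if d1 == d2:
                found.append(("output", n1, n2))  # write, then write
    return found

block = [
    ("I1", "R1", {"R2"}),        # R1 = load(R2)
    ("I2", "R3", {"R1"}),        # R3 = R1 + 4
    ("I3", "R1", {"R4", "R5"}),  # R1 = R4 + R5
]
print(dependences(block))

# Renaming the destination of I3 to a fresh register R1' leaves only
# the flow (true) dependence.
renamed = block[:2] + [("I3", "R1'", {"R4", "R5"})]
print(dependences(renamed))
```

Only the anti and output dependences disappear after renaming; the flow dependence from I1 to I2 remains, as the lecture states.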
The two store instructions use b and c, and the compiler has no idea whether b and c correspond to the same memory location or to different memory locations. So it adds an output dependence edge from one store to the other. This is the complete dependence DAG, and whenever a scheduler goes through this dependence graph and tries to schedule instructions, it cannot violate any of these constraints: the flow dependence constraints, the anti dependence constraints or the output dependence constraints.
Let us see how to schedule such basic blocks. The reason we want to consider basic blocks is that they are independent entities: we will not consider the effect of the instructions before the basic block and after the basic block. We will just assume that the instructions in the basic block are connected according to these dependence constraints and then schedule them. If we want to do better, we have to look at other basic blocks also; that we will do in the advanced instruction scheduling part of the lecture.
Each basic block consists of instructions which are called micro operation sequences (MOS). Each of these micro operation sequences is indivisible. In other words, the micro operations within the sequence cannot be individually scheduled; the entire sequence has to be scheduled as one unit. Each MOS has micro operations, which are the several steps in the instruction, each requiring resources. Each step of the MOS requires one cycle for execution. How does all this relate to real instructions in a machine? The micro operations of the MOS correspond to the various pipeline stages of the machine. Each pipeline stage requires resources, and each pipeline stage executes in one cycle.
So this is a fairly realistic modeling of the operation sequence. There are two types of constraints that we need to consider: one is called the precedence constraint and the other is called the resource constraint. The precedence constraints relate to the data dependences that I already mentioned, that is, flow, anti and output dependences, and also to the execution delays because of these dependences. A load may take two cycles, a multiply three cycles, and so on. The resource constraints relate to the limited availability of shared resources, for example the function units: adders, subtractors, multipliers, load/store units and so on. Depending on the availability of these, the schedules vary.
What exactly is the formulation of the problem? The basic block is modeled as a digraph G = (V, E). The nodes V are the micro operation sequences (MOS), and the edges E are the precedence constraints that we already mentioned. We also use other notation: R is the number of resources in our formulation. Every node has a label, which is a resource usage function rho_v(i) for each step i of the MOS associated with the node v. I said there are many micro operations within each sequence, and each of these micro operations requires some resources. So rho_v(i), for each i, tells us the resources used by that particular micro operation.
If there are 5 micro operations in an MOS, then i ranges from 1 to 5 and rho_v(i) gives the resources required for each of these micro operations. We also need the length l(v) of the node, which is the number of steps in the micro operation sequence. There is a label on each edge as well: this is the execution delay of the instruction, denoted d(e). For example, the load instruction requires 2 cycles and the multiply instruction requires 3 cycles. These are the delays associated with the MOS.
The problem is to find the shortest schedule sigma. This schedule is a mapping from V to N, where N is time. We are going to assign each node to a time slot such that for all the edges (u, v) in the graph the precedence constraint sigma(v) - sigma(u) >= d(e) is satisfied, and the resource constraint is also satisfied. I am going to explain both of these very soon. Once the schedule is found, the length of the schedule is the maximum of sigma(v) + l(v): take all the nodes, find this sum for each and take the maximum.
Consider the precedence constraint. Say we have a node u and a node v, and between these two nodes there is a delay d, which is the label on the edge. The node u has been assigned a number sigma(u), the time step at which it is scheduled, and node v has been assigned a number sigma(v), the time step at which it can execute. The constraint simply says that the delay d must be less than or equal to sigma(v) - sigma(u). This is quite understandable: v can start only once u completes, and that happens only after sigma(u) + d, because once u is initiated it requires d time steps to compute.
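Collected in one place, the formulation stated so far can be written in symbols. This is only a restatement of what was said above, with one exception: the resource constraint on the last line is an assumed reconstruction in the usual form, since the transcript does not read that formula out. It takes rho_v(j) to be zero whenever j falls outside the steps of v, and R to be the number of available resource units.

```latex
\begin{align*}
  &\text{find } \sigma : V \to \mathbb{N}
    \text{ minimizing } \max_{v \in V} \bigl( \sigma(v) + l(v) \bigr) \\
  &\text{subject to } \sigma(v) - \sigma(u) \ge d(e)
    \quad \text{for every edge } e = (u, v) \in E, \\
  &\text{and } \sum_{v \in V} \rho_v\bigl( i - \sigma(v) \bigr) \le R
    \quad \text{for every time step } i .
\end{align*}
```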
So sigma(u) + d is the earliest that v can start, and that is what the constraint says: sigma(v) >= d + sigma(u). Of course, v may have to start later than this minimum value of sigma(u) + d because of resource constraints, but that is fine.
Now let us understand the resource constraints. What we have shown here is a table. One axis of the table is the MOS substep. Since each micro operation executes in one cycle, the units on this axis are time units. On the other axis are the nodes, along with the time steps that have been assigned to them.
Let us assume for our example that sigma(v1) is 0. In other words, the instruction at node v1 starts its execution at time step 0. It has 4 micro operations in it. At time step 0 the first micro operation executes and it requires 1 resource. Let us assume that there is only one type of resource, so it requires 1 unit of that resource. The second micro operation starts at time step 1 and again requires 1 unit of resource. The third micro operation starts at time step 2 and requires 2 units of resource. At time step 3 we have the fourth micro operation, which requires 2 units of resource.
Suppose, just for the sake of example, that the instruction at node v2 has been scheduled at time step 1. It has 5 micro operations in it, and these require 2, 3, 1, 1 and 2 resources respectively. Similarly, v3 has been scheduled at 2 and its micro operations require 3, 1 and 2 resources respectively. Finally, v4 is scheduled at 3 and its micro operations require 1, 2, 3 and 2 resources respectively.
Now look at the total number of resources available in the machine. Let us say it is 5: there is only one type of resource that is necessary for execution and it is available in 5 units. When we start v1 at time step 0, there are no other instructions executing. We have a requirement of 1 resource and there are 5 of them, so this can execute. At the next step the previous micro operation has completed and released its resource, and again we require only 1 resource out of 5, so this can also execute.
When v1 is executing its second micro operation, we have already begun the second instruction, v2, which is executing its first micro operation. Entries along a diagonal of the table correspond to the same time step: the second micro operation of v1 and the first micro operation of v2 both fall in time step 1. The first micro operation of v2 requires 2 resources, so at time step 1 we have a requirement of 2 + 1 = 3 resources, which is still fine because we have 5 of them.
Similarly, at time step 2 the node v3 is executing its first micro operation, v2 is executing its second and v1 is executing its third. The resource requirements of all three have to be added up because they fall in the same time slot: 2 + 3 = 5 and 5 + 3 = 8. So we require 8 units of resource and what we have available is only 5. Strictly speaking, such a schedule is not possible.
Let us continue just for the sake of the example. Along the diagonal for time step 4 we have 1 + 2 + 2, so we require 5 units of resource, and that is available. In fact, even at time step 3 we have 1 + 1 = 2, plus 1 is 3, plus 2 is 5.
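The diagonal sums worked out above can be reproduced mechanically. The following is a small sketch using the numbers from the example, with the fourth node called v4; the function name `resource_usage` and the dictionary layout are assumptions made for illustration.

```python
def resource_usage(schedule, usage):
    """Total resource units needed at each time step.

    schedule maps each node to its start time; usage maps each node to the
    units needed by its successive micro operations (one per cycle).
    """
    total = {}
    for node, start in schedule.items():
        for step, units in enumerate(usage[node]):
            total[start + step] = total.get(start + step, 0) + units
    return [total.get(t, 0) for t in range(max(total) + 1)]

usage = {
    "v1": [1, 1, 2, 2],
    "v2": [2, 3, 1, 1, 2],
    "v3": [3, 1, 2],
    "v4": [1, 2, 3, 2],
}
schedule = {"v1": 0, "v2": 1, "v3": 2, "v4": 3}
CAPACITY = 5  # units of the single resource type

per_step = resource_usage(schedule, usage)
print(per_step)  # [1, 3, 8, 5, 5, 5, 2]
print([t for t, units in enumerate(per_step) if units > CAPACITY])  # [2]
```

Only time step 2 exceeds the 5 available units, which matches the hand calculation.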
So time step 3 is also fine, but time step 2 required more than what is available, so this schedule does not work. We may have to introduce no-op instructions at some points in order to make sure that the resource constraints are taken care of and the schedule is proper. That is what the resource constraint says. The summation in it adds up the resource requirements of the various micro operation sequences, each in its own state of execution at the given time, and the minus operation inside it is what identifies that state. So it gives you the entire resource requirement for the whole machine, with many instructions in different states of execution.
The list scheduling algorithm is stated here. The purpose is to find the shortest schedule sigma from V to N such that the precedence and resource constraints are satisfied. If there are any holes, they are filled with no-ops. The function list schedule takes a dependence graph. It is really a topological sorting algorithm. There is a queue called ready which is used by this algorithm. Initially the ready queue consists of all the root nodes of V, which do not require anything in order to execute. These are the top level nodes in the DAG. As I said, this is a topological sort, so we start from those nodes which do not have any precedence constraints, that is, there are no incoming edges for these root nodes. Those are the only ones which can be executed to begin with. It does not mean that all the nodes in the ready queue will be assigned the same time slot; we have to check the resource constraints as well.
The schedule is empty to begin with, and we continue till the ready queue is empty. We get the highest priority node in the ready queue.
How to assign priorities is the next thing that we need to understand; we will do that in a few minutes. For the node v which we have picked from the ready queue, we want to find a time slot in which to schedule it. To do that there are two things to do. The first is to find the lower bound for the time slots of v which satisfy the precedence constraints. This is done by a function, satisfy precedence constraints, which is on the next slide. It takes three parameters: the node v; schedule, which is the set of nodes scheduled so far; and sigma, which is the partial schedule itself, that is, the mapping from nodes to time slots. This function gives you the lower bound, LB, on the time at which the node v can be scheduled. It does not mean that this is the time slot at which we are going to schedule v; we still have to check the resource availability.
So sigma(v), the slot at which v will be scheduled, is obtained using the satisfy resource constraints function, which takes v, the set of scheduled nodes, the partial schedule sigma and also LB as parameters. Depending on the resource constraints which we have already discussed, the function satisfy resource constraints assigns a particular slot for v. It could be LB, LB + 1, LB + 2 and so on, but something will definitely be found.
The schedule now gets v as an extra member, because we have scheduled v at sigma(v). The ready queue loses v, but we also add all the nodes u which are now eligible to be scheduled. First, u must not already be scheduled, that is, u is not in schedule. Second, for all the edges (w, u), w must be scheduled. I will show you a picture here.
So, this v is the currently scheduled node, which has just been assigned a slot, and these two w's are nodes that were already scheduled. Now, the ready queue will lose v of course, but then there are other nodes to be taken care of. u 1, u 2 and u 3 are three nodes which are successors of v and whose other predecessors, w 1 and w 2, have already been scheduled. Whereas, the successor x 2 still has another predecessor, x 1, which is not yet scheduled, so x 2 is not yet ready to be scheduled. But u 1, u 2 and u 3 have all their predecessors already scheduled. This one has only v as its predecessor, and these have a w and of course v. The two w's were scheduled earlier, and v has just now been scheduled. So, u 1, u 2 and u 3 will all be added to the ready queue. That is what this says. This is done for the entire ready queue, one node at a time, and then we return sigma. Now let us look at the constraint satisfaction functions, starting with satisfy precedence constraints. This simply considers sigma of u plus d of u comma v for all the scheduled predecessors u of v and picks the maximum. Let us see what it means. This is the precedence constraint satisfaction. In this picture we are at v, and its predecessors u 1, u 2 and u 3 have already been scheduled. Their scheduled time slots are 10, 25 and 18, and the delays for u 1, u 2 and u 3 are 2, 4 and 3. So, the earliest that we can schedule v is the maximum of 10 plus 2, 25 plus 4 and 18 plus 3. Obviously it is 25 plus 4, which is 29. That is because, even though the other two complete earlier, u 2 does not complete earlier than cycle 29, and that is the earliest at which v can be scheduled. That is what satisfy precedence constraints really tells us. What does satisfy resource constraints tell us? Let us look at the picture again.
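The lower bound computation in this example can be written down as a small Python sketch. The function and parameter names here are assumptions made for illustration, and the delays are assumed to be stored per edge.

```python
def satisfy_precedence_constraints(v, sched, sigma, delay):
    """Lower bound on the slot of v: the maximum of sigma[u] + delay[(u, v)]
    over all already scheduled predecessors u of v (0 if there are none)."""
    return max((sigma[u] + delay[(u, v)] for u in sched if (u, v) in delay),
               default=0)


# The numbers from the example: slots 10, 25, 18 and delays 2, 4, 3.
sigma = {"u1": 10, "u2": 25, "u3": 18}
delay = {("u1", "v"): 2, ("u2", "v"): 4, ("u3", "v"): 3}
lb = satisfy_precedence_constraints("v", set(sigma), sigma, delay)
print(lb)  # 29, because max(12, 29, 21) is 29
```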
So, this is the same picture that we had used before. We had a problem here: these two dummy slots were not present, and this row was actually at this point. So, we had 3 plus 3, which is 6, plus 2, which is 8, as the resource requirement. Now, to take care of the resource constraints, we have introduced these two dummy slots, which are no-op instructions. So, now the resource constraint of 5 is satisfied. Any diagonal now has only 5 or less. This one is 5. This is 2 plus 1, that is 3, and that is it. This has 3 plus 1, that is 4. This has 2 plus 1, 3, plus 1, 4. This has 2 plus 2, 4, this has 3 and this has 2. So, this schedule with 2 blanks here is a proper schedule, and this is precisely what the resource constraint satisfaction function checks. It checks at this point whether the total resource requirements are satisfied, and if not, it increments the counter by 1 and checks again. It did that for these two slots and found that only 4 is a feasible slot. That is precisely what it is doing: it tries every slot from lb onwards, checks this inequality, and returns the first slot at which the resource constraints are satisfied. We are certain that we will find some slot, because even though we do not reorder, every instruction has a finite time requirement. So, every instruction must finish after a few cycles, and after that we will definitely get a slot to assign to this particular v. We will stop here and continue with this part in the next lecture. Thank you.