Welcome to part 4 of the lecture on machine-independent optimizations. Today we will continue our discussion on data flow analysis and its theoretical foundations, and then move on to control flow analysis.

In the last part we began our discussion on the theoretical foundations of data flow analysis, and I mentioned that there are some basic questions we can answer using the framework: under what conditions is the iterative DFA algorithm correct, how precise is the solution, will the algorithm converge at all, and what exactly is the meaning of a solution. These are the questions we seek to answer using the framework.

A data flow framework consists of a direction of flow, either forward or backward; a domain of values which, together with a meet operator, forms a semi-lattice; and a family of transfer functions. There is one transfer function for each statement, and for a whole basic block we compose the functions of its statements to obtain the transfer function of the block. We also have identity transfer functions for the entry and exit nodes. We discussed the framework, the semi-lattice and the iterative algorithm in the last part.

Now, what are the properties of the iterative DFA algorithm? The iterative algorithm sweeps over the basic blocks of the program, applying the IN and OUT equations, until the solution converges to a fixed point.

The first property is: if the iterative algorithm converges, the result is a solution to the data flow equations. We have not yet given conditions for convergence, but this property is straightforward to see: if the algorithm converges, there is no further change in the values it produces, and therefore those values must satisfy the data flow equations.

Second, if the framework is monotone, then the solution found is what is known as the maximum fixed point (MFP) of the data flow equations. This is an important condition, that the framework be monotone. What exactly is the MFP? The maximum fixed point is the solution with the property that in any other solution the values of IN and OUT are less than or equal to, that is, less precise than, the corresponding values of the MFP. So the MFP is the best solution the iterative algorithm can produce; any other fixed point would only be less precise. This is again guaranteed by the framework, provided it is monotone.

Third, if the framework is monotone and its semi-lattice is of finite height (most of our semi-lattices, in fact all of them, are of finite height), then the algorithm is guaranteed to converge; termination is guaranteed. Intuitively, the data flow values only decrease with each iteration, so the maximum number of iterations possible is the height of the lattice times the number of nodes in the flow graph. Why is this true? The height of the lattice is the length of the longest descending chain in the lattice, one step per level.
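To make the iteration concrete, here is a minimal sketch of a generic round-robin forward iterative solver. The representation (predecessor lists, set-valued facts, user-supplied meet and transfer functions) is my own assumption for illustration, not something fixed by the lecture.

```python
# A minimal sketch of a round-robin forward iterative DFA solver.
# Assumptions: the flow graph is a dict of predecessor lists, facts are
# frozensets, and meet is set intersection (available expressions) or
# union (reaching definitions).

def iterative_forward_dfa(nodes, preds, entry, init, boundary, meet, transfer):
    """nodes: iterable of block ids; preds[n]: predecessors of n;
    transfer(n, in_set) -> out_set; meet(a, b) -> combined fact."""
    out = {n: init for n in nodes}
    out[entry] = boundary                      # OUT[entry] is fixed
    changed = True
    while changed:                             # iterate until a fixed point
        changed = False
        for n in nodes:
            if n == entry:
                continue
            in_n = None
            for p in preds[n]:                 # IN[n] = meet over predecessors
                in_n = out[p] if in_n is None else meet(in_n, out[p])
            in_n = in_n if in_n is not None else init
            new_out = transfer(n, in_n)        # OUT[n] = f_n(IN[n])
            if new_out != out[n]:
                out[n] = new_out
                changed = True
    return out
```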
So, we start with the top and can go down to the bottom; we cannot go below that, and, as we already discussed, the values decrease with each iteration. At some point they do not decrease any further; the lowest value that can be obtained is the bottom of the lattice. Going from the top down to the bottom accounts for the height of the lattice, and in each iteration we look at all the nodes in the flow graph, so the height times the number of nodes is the maximum number of iterations possible for the algorithm.

Let us now discuss other possible solutions to a data flow problem and understand the ideal solution, the maximum fixed point (MFP) solution, and the MOP solution, as we call it; MOP stands for meet over all paths.

What would be the ideal data flow solution to a particular problem? Suppose we can find all the possible execution paths from the start node to the beginning of a basic block B. This may or may not be possible in all cases, but let us assume that we can. Assuming forward flow, along each such path we compose the transfer functions of the nodes on the path and apply the composed function to the data flow value at the start; at the end of the path, that is, at the beginning of B, we obtain the data flow value contributed by that path. We get one such value for each of the paths converging at B. No execution of the program can produce a smaller value at that program point than what this computation yields.

So the ideal value at B is the meet, over all execution paths P from the start node to B, of f_P(v_init), where v_init is the initial value of the data flow entity and f_P is the composed transfer function over the path P. We apply the composed transfer function to the initial value, obtaining one value per path, and then take the meet of all these values; that is the ideal data flow value.

Why is this ideal? We are considering exactly the execution paths, and hence exactly the data flow values that can actually accrue at execution time, so we really cannot get anything better. Answers greater than the ideal, in the sense of the partial order relation, are incorrect: we get a greater value only if we leave out some valid execution paths. Any value less than or equal to the ideal, again in the sense of the partial order, is safe; but a value strictly smaller than the ideal is conservative, meaning infeasible paths have been included, which makes the result less precise. The closer the value is to the ideal, the more precise it is; the fewer infeasible paths we include, the better the value becomes.

So this is the meaning of the ideal solution: consider all the execution paths to a particular point, compose the transfer functions along each path, apply them to the initial value, and take the meet of the resulting values over all those paths.
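In symbols, this is simply a restatement of the verbal definition just given:

```latex
\mathrm{IDEAL}[B] \;=\; \bigwedge_{\substack{P \,:\, \text{execution path}\\ \text{from entry to } B}} f_P(v_{\mathrm{init}}),
\qquad
f_P = f_{n_k} \circ \cdots \circ f_{n_1} \ \text{ for } P = n_1, n_2, \ldots, n_k .
```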
So, the next type of data flow solution is the meet over all paths (MOP) solution. Finding all execution paths is an undecidable problem; this is a very well-known difficulty. So we do the next best thing: we approximate the set of execution paths by the set of all paths in the flow graph itself. If we do that, the result is called the meet over all paths solution. Again we consider paths from the start node to B, but these are not necessarily execution paths; some of them may not be executable at all. Again we apply the composition of the transfer functions, f_P, to the initial value and take the meet over all such paths.

Now MOP will be less than or equal to the ideal value, because we are considering a superset of the execution paths; some of these paths may be infeasible, and we have included those as well. That means we are approximating the ideal solution by the MOP solution, and MOP is bound to be less precise than the ideal.

There is one more difficulty. Suppose there is a while loop in the program; then enumerating all paths in the flow graph may still be impossible, because we do not know the number of times the loop will execute, and we may not be able to bound the number of paths from one point to another. The next best thing is to apply the iterative algorithm. The iterative algorithm does not try to find paths in the flow graph at all; it visits all the basic blocks, not necessarily in execution order (possibly even in a random order), and applies the meet operator at each join point of the flow graph rather than along entire paths. The solution obtained is the maximum fixed point (MFP) solution, which we already know. This can be a little more approximate than the MOP, but there is a theorem, which we are not going to prove, that says: if the framework is distributive, then the MOP and MFP solutions are identical. If the framework is monotone but not distributive, then the MOP solution is superior to, that is, more precise than, the MFP solution.

Fortunately, the reaching definitions problem, the available expressions problem and the live variable analysis problem are all distributive in nature, so for these problems the MOP and MFP solutions are identical. With just monotonicity we have the inequality: MFP is safe but the least precise of the three, MOP is also safe but less precise than the ideal, and the ideal is the best, which we cannot actually achieve. The solution provided by the iterative algorithm, the MFP solution, is always safe.

So we have answered a couple of questions so far: the quality of the solution provided by the iterative algorithm, the convergence of the algorithm, and how the MFP relates to the iterative algorithm. Let us continue our discussion on the theory of data flow analysis by considering a special framework, the one for constant propagation.

Before discussing the constant propagation algorithm, I must give you a little background using the lattice diagram shown in the picture. Here is the constant propagation lattice. For each variable in the program we are going to have one lattice of this type, recording whether that variable is a constant.
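The relationships just described can be summarized as follows, where the order is read as "at most as precise as":

```latex
\mathrm{MFP} \;\sqsubseteq\; \mathrm{MOP} \;\sqsubseteq\; \mathrm{IDEAL},
\qquad \mathrm{MFP} = \mathrm{MOP} \ \text{when the framework is distributive.}
```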
So, these are all the values the variable can take if it is a constant; if it is not a constant it is given the value bottom (NAC, not a constant), and if it is undefined it is given the value top (UNDEF). So there are only three kinds of abstract values possible: the top, an actual constant value, or the not-a-constant bottom. This is the lattice for each variable, and what we want to do is take a product of the lattices of all the variables. If there are ten variables declared in the program, we want to determine which of these ten variables are indeed constants, so we have to take the product of the lattices of all ten variables. That would be too large to show, so let me show you a very simple product lattice. One component lattice has a 0, a top and a bottom; the other has a 1, a top and a bottom; it does not even have 0 and 1 in both of them. The product lattice is defined on S1 x S2; its size is the size of S1 times the size of S2, and an element of the product lattice is a pair (a, b), as shown here. The partial order is: (a, b) <= (c, d) if and only if both components satisfy the order, that is, a <= c and b <= d. The same holds in the picture: for example, the constant 0 is less than top, bottom is less than 0, 1 is less than top and bottom is less than 1; these are the component partial order relations. Using this definition we have exhaustively listed all the possible values of the product lattice, from (top, top) down to (bottom, bottom), and drawn the lattice with all the partial orders that hold. If there are a large number of such lattices, we cannot really show their product in pictorial form, but that product is the lattice we want to consider when we do constant propagation.

So let us understand the constant propagation framework. The reason we want to study it is that the constant propagation framework happens to be monotone but not distributive. In other words, when we apply the iterative algorithm for constant propagation we get an approximate value; it cannot catch all the constants in the program, and some will be left out. We will see examples of such a situation very soon.

The lattice of data flow values in the constant propagation framework is the product of the semi-lattices of the variables, one lattice per variable, as I already mentioned. If we take two variables we have pairs (a1, b1); if we have ten variables the element has ten components. And (a1, b1) <= (a2, b2) if and only if a1 <= a2 according to the partial order of the first lattice and b1 <= b2 according to the partial order of the second, assuming a1, a2 are in A and b1, b2 are in B; each of the component semi-lattices may have a different partial order in the most general case.

Each data flow value is a map m, and m(v) is the abstract value of the variable v in the lattice: the top (UNDEF), a constant value, or bottom (NAC); these are the three kinds of abstract values possible.
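Here is a minimal sketch of this per-variable lattice and its product, assuming UNDEF and NAC are represented by sentinel strings and constants by integers; the representation is an illustration, not prescribed by the lecture.

```python
# A minimal sketch of the constant propagation lattice and its product.
# UNDEF is top, NAC is bottom, and any int is a constant value.

UNDEF, NAC = "UNDEF", "NAC"

def leq(a, b):
    """Partial order on one variable's lattice: NAC <= c <= UNDEF."""
    return a == NAC or b == UNDEF or a == b

def meet(a, b):
    """Meet (greatest lower bound) of two abstract values."""
    if a == UNDEF: return b
    if b == UNDEF: return a
    if a == b:     return a          # same constant
    return NAC                       # different constants, or NAC involved

def map_leq(m1, m2):
    """Product-lattice order: m1 <= m2 iff m1(v) <= m2(v) for every variable."""
    return all(leq(m1[v], m2[v]) for v in m1)

def map_meet(m1, m2):
    """Componentwise meet on the product lattice."""
    return {v: meet(m1[v], m2[v]) for v in m1}
```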
So, there is one such value m(v) for each variable; the map m is defined for every variable and may assign a different abstract value to each. Each element of the product lattice is a similar but larger map m: that map is an element of the product lattice, whereas each m(v) lives in the lattice of the individual variable.

Now, in the product lattice, when can we say a map m is less than or equal to a map m'? We can say so provided that for all variables v we have m(v) <= m'(v). This must hold for every variable; m and m' are two different maps, and for each variable the <= relation of that variable's lattice must hold between m(v) and m'(v). If this is true for all variables, then the map m is less than or equal to the map m'.

Why did we define the maps? That is necessary because our transfer functions operate on these maps. To define the transfer functions for the constant propagation framework, let us assume that there is one statement per basic block; the transfer functions for basic blocks containing many statements can be obtained by composition, something we already know. Recall that m(v) is the abstract value of the variable v in a map m. The set F of the framework contains transfer functions which accept maps and produce maps; we will define a sample transfer function of F very soon, one which takes a map m as input and produces some other map m' as output. F also contains the identity function, which is required by the framework, and for the start block the boundary value m0 has m0(v) = UNDEF for all variables. This is reasonable because all variables are undefined before a program begins; we assign values to the variables and thereby they get their values.

So now let us define a sample transfer function from the set of transfer functions. Let f_s be the transfer function of the statement s; for every statement s we have such a transfer function, and capital F is the set of all of them. As I said, f_s takes a map and produces a map: m' = f_s(m), and f_s is defined as follows.

The first case is that s is not an assignment statement at all; it could be a branch statement, an ordinary goto or a conditional goto. Then f_s is the identity function: if it is not an assignment there is no side effect, therefore it is the identity.

If s is an assignment to a variable x, then m'(v) = m(v) for all v not equal to x; the assignment is made to x, so for the other variables the map does not change. For v equal to x it changes, and again there are three cases.

If the RHS of s is a constant, that is, we have a statement of the form x = c, then m'(x) is obviously the constant c. In other words, whatever the map of x was before, after the assignment x = c the map of x becomes c; from now on, whenever you apply the map to x you get the constant value c.

If the RHS is of the form y + z, that is, the assignment statement is x = y + z, then we apply m to y and we apply m to z.
If m(y) and m(z) both happen to be constants, in other words y and z were assigned constant values earlier, then we add the two constants and that sum becomes m'(x). That is quite logical and intuitive: if both operands are constants, the new constant value becomes the map of x. If either m(y) or m(z) is determined to be NAC, not a constant, then the new map of x is also NAC. And if neither of them is NAC but y and z are not both constants, that is, at least one of them is UNDEF, then the new map of x is UNDEF. These are the three possibilities and we have covered all of them: both operands are constants, so we add the constants; one or both are NAC, so x becomes NAC; otherwise x becomes UNDEF.

If the RHS is any other expression, not a constant and not of the form y + z, then m'(x) is NAC; we do not want to take any chances, so we simply make it not a constant.

The constant propagation framework is said to be monotone. In other words, the transfer function we just defined always produces a lower or same-level value in the CP lattice whenever there is a change, that is, a decrease, in its inputs. Let us check it out. Say the input for y changes while everything else stays fixed; the three abstract levels are UNDEF, a constant, and NAC. If the input for y changes from UNDEF to a constant c1, then check each combination: (UNDEF, UNDEF) gave UNDEF before and (c1, UNDEF) still gives UNDEF, so the output remains at the same level; (UNDEF, c2) gave UNDEF before, and now (c1, c2) gives c1 + c2, so we have come down from UNDEF to c1 + c2, a lower value; and whenever one operand is NAC the result is NAC, the lowest level, so (UNDEF, NAC) and (c1, NAC) both give NAC. The same is true when y changes from c1 to NAC: (c1, UNDEF) gave UNDEF and (NAC, UNDEF) gives NAC, so we have come down from UNDEF to NAC; (c1, c2) gave c1 + c2, and with NAC in place of c1 we get NAC, which is lower; and (NAC, NAC) of course gives NAC. So in every case the statement is true: the transfer function m' = f_s(m) always produces a lower or same-level value in the constant propagation lattice whenever there is a change in inputs. Because of this, we have shown informally that the transfer functions of the CP framework are indeed monotone.

Now let me give you an example of the non-distributivity of the constant propagation framework. Here is a very simple flow diagram; it does not even have loops. On one side we have basic block B1 with x = 2 and y = 3, and in B3 we have z = x + y. If we take the path through B1, x + y indeed adds to 5; on the other path, in B2, we have x = 3 and y = 2, just the interchanged version, and x + y still adds up to 5. But unfortunately, when we apply iterative data flow analysis to this kind of control flow graph for the constant propagation framework, at the entry of B3 we take the meet of the two x values which reach there and of the two y values which reach there.
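A minimal sketch of the transfer function just described, reusing the UNDEF/NAC sentinels from the earlier sketch; the tuple representation of statements is my own assumption for illustration.

```python
# A minimal sketch of the CP transfer function f_s.  Statements are tuples
# (an assumption): ("const", x, c), ("add", x, y, z), ("other", x),
# or ("nonassign",).

UNDEF, NAC = "UNDEF", "NAC"

def transfer(stmt, m):
    kind = stmt[0]
    if kind == "nonassign":               # branches, gotos: identity function
        return dict(m)
    mp = dict(m)                          # m'(v) = m(v) for all v != x
    if kind == "const":                   # x = c
        _, x, c = stmt
        mp[x] = c
    elif kind == "add":                   # x = y + z
        _, x, y, z = stmt
        vy, vz = m[y], m[z]
        if vy == NAC or vz == NAC:
            mp[x] = NAC                   # either operand not a constant
        elif vy == UNDEF or vz == UNDEF:
            mp[x] = UNDEF                 # neither is NAC, but not both constants
        else:
            mp[x] = vy + vz               # both constants: fold the addition
    else:                                 # any other RHS: play safe
        _, x = stmt
        mp[x] = NAC
    return mp
```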
So, since x carried the value 2 along one path and the value 3 along the other, the x value at the entry of B3 can only be said to be not a constant, because the two constants are different; similarly y is 3 along one path and 2 along the other, so the meet of 3 and 2 also produces NAC. With x and y both being NAC at this point, z will obviously be NAC as well. So the iterative method determines z to be non-constant, even though z is always the constant 5; this cannot be discovered by the iterative method.

If we had taken the MOP solution, we would have considered the two paths separately. Along the first path we take the values of x and y and compute z as 5; along the second path we again take the values of x and y and compute another z value, also 5. Then we take the meet of these two values at this point: the two z values, both of which are 5, obviously yield a meet of 5. So the MOP solution finds z = 5, but the iterative solution assigns z the value NAC. This is the problem with the constant propagation framework: it is not distributive, and therefore the iterative algorithm for solving the data flow equations does not catch all the constants present in the program.

Here is a formal statement of the same property. If f1, f2 and f3 are the transfer functions of B1, B2 and B3 respectively, then one possibility is to take f1(m0), meet it with f2(m0), and then apply f3 to the result; the other is to take f3(f1(m0)), take f3(f2(m0)), and then take the meet of the two. These two produce different results; they would have produced the same result if the framework were distributive, but it is not.

We can check it out here. Initially m0 is UNDEF for all three variables x, y and z. When we apply f1, x gets the value 2, y gets the value 3, and z has not been assigned anything, so it still has the value UNDEF. Applying f2 (f1 corresponds to B1, f2 to B2, f3 to B3), x gets the value 3, y gets 2, and z is UNDEF. Now take the meet of the two values f1(m0) and f2(m0): the x components are different constants, so the meet is NAC; the same is true for y, so that is also NAC; and the z components are both UNDEF, so z is still UNDEF. Now apply f3 to the meet of f1(m0) and f2(m0): it gives nothing better, x and y remain NAC, and since the operands are NAC, f3 gives NAC for z as well.

Whereas if we had taken f1(m0) and then applied f3 to it, we would have got the z value as 5, with x and y remaining 2 and 3; if we had applied f3 to f2(m0), we would again have got the z value as 5, with x and y remaining 3 and 2. Now take the meet of these two: the x components produce NAC, the y components also produce NAC, and the z values, being 5 and 5, produce 5. So the value produced by this method is different from the value produced by the other method, and therefore the framework is not distributive; this is a counterexample showing that distributivity fails.

That completes our discussion on the theory of data flow analysis. Let us move on and consider control flow analysis, which is also required before we study the optimization algorithms.
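Using the sketches above (UNDEF, NAC, map_meet and transfer as defined earlier), the counterexample can be reproduced directly, with the block contents taken from the lecture's figure:

```python
# Reproducing the non-distributivity counterexample with the earlier sketches.
# B1: x = 2; y = 3    B2: x = 3; y = 2    B3: z = x + y
m0 = {"x": UNDEF, "y": UNDEF, "z": UNDEF}

def f1(m): return transfer(("const", "y", 3), transfer(("const", "x", 2), m))
def f2(m): return transfer(("const", "y", 2), transfer(("const", "x", 3), m))
def f3(m): return transfer(("add", "z", "x", "y"), m)

mfp_like = f3(map_meet(f1(m0), f2(m0)))          # meet first, then apply f3
mop_like = map_meet(f3(f1(m0)), f3(f2(m0)))      # f3 along each path, then meet
print(mfp_like["z"])   # NAC  -- what the iterative (MFP-style) computation gets
print(mop_like["z"])   # 5    -- what the MOP-style computation gets
```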
So, in this lecture we are going to understand why and what control flow analysis is; we are going to study dominators and their use in identifying natural loops, and we will also look at the depth of a control flow graph, which tells us how many iterations are needed for the iterative data flow analysis algorithm to terminate.

Why do we require control flow analysis? Control flow analysis helps us understand the structure of control flow graphs. The first use is to determine the loop structure of a control flow graph. So far we have only talked about loops in the graphical sense: whenever I show you a control flow graph we see a loop pictorially, an edge going back to one of the nodes higher up in the flow graph. But given a data structure for a control flow graph, how do we determine a loop? The reason we need the loop structure is that optimization requires us to determine the nodes in a loop. For example, even the usage-counts-based register allocation algorithm which we discussed requires the loop structure to be known; we must know which basic blocks are present in the loop. Then the dominator relationship has to be defined; control flow analysis allows us to determine dominators, and these dominators are very useful for loop-invariant code motion. The conditions on code motion will all be stated using dominators. We also require control flow analysis, dominators and something more, to compute what are known as dominance frontiers. These are useful in the construction of the static single assignment (SSA) form, a very useful intermediate form which permits many more optimizations; we are going to consider examples of SSA later. And in parallelization we require the concept of control dependence to decide whether something can or cannot be parallelized.

So let us begin with the discussion on dominators. Let me show you a picture and then we will come back to the text. The picture shows many paths in a control flow graph: there is the initial node, a node d (forget p1 and p2 for the present), and a node n. What we want to define is the dominator relationship. We say that the node d dominates node n if and only if all paths from the initial node to the node n go through d. For example, many paths emanate from the initial node, but they all converge at d, and from d to n there are again many paths; we are not bothered about how they actually diverge or converge. The important point is that whenever we start from the initial node and want to reach n, we must definitely pass through d; that is the only requirement as far as the dominator relation is concerned. So d dom n provided all paths from the initial node to n go through d; this is the definition of the dominator relationship, which is denoted dom.

Now, the initial node is the root, and each node dominates only its descendants in the dominator tree. I have not yet shown you a dominator tree; we will do that after these definitions are over. For now, assume that there is a dominator tree in which a node dominates all its descendants, and obviously the initial node is the root of such a tree. A node x strictly dominates another node y if x dominates y and x is not equal to y.
So that is the strict dominance relationship: x is not allowed to be the same node as y. Then we have the immediate dominator relationship: x is the immediate dominator of y, denoted idom(y), if x is the closest strict dominator of y. I will show you examples of this very soon. A dominator tree shows all the immediate dominator relationships; it does not show the transitive relationships, only the immediate ones.

Let me show you an example: here is the dominator tree for this particular control flow graph. Consider the initial node B0; it obviously dominates all the nodes in the flow graph, because every path starting from the initial node to any other node must pass through the start node. That is why B0 is the root of the dominator tree. Now consider B6, which is a leaf node here. The immediate dominator of B6 is B3, which is what the tree shows; the immediate dominator of B3 is B2, that of B2 is B1, and that of B1 is B0. In other words, the dominators of B6 are B3, B2, B1 and B0.

Let us verify that on the flow graph. B0 is trivial, and B1 is also trivial because there is only one path from the start to B1, so all paths from the start to B6 definitely pass through B1. Then B2: again there is only one path from B1 to B2, so any path from B0 to B6 must also pass through B2. Now, from B2 how do we reach B6? If we take the other branch we cannot reach B6, so we must go through B3; hence B3 is also a dominator of B6. Next, suppose we start from B0, go through B1, B2, B3, then to B5, B7, back to B2, and again to B3 and B6; going via B5 and B7 is not compulsory. Since not every path from the start node to B6 contains B5 and B7, B5 and B7 do not dominate B6; only B3, B2, B1 and B0 dominate B6. Similarly, if you consider B7, not all paths from B0 to B7 go through B5 or through B6, but they definitely go through B3, B2, B1 and B0, so again only those are the dominators of B7.

Now let us understand how to compute dominators. The principle of the dominator algorithm is very simple: if p1, ..., pk are all the predecessors of the node n, and d is not equal to n, then d dom n if and only if d dom pi for each i. Let me explain what the statement means using the picture. We have d here and n here. From the initial node the paths go through d, but the algorithm cannot enumerate all the paths and then determine the dominators; it has to work incrementally in some way. So it considers n and all the predecessors of n: if there are k predecessors p1 to pk, then d dominates the node n if and only if d dominates every predecessor of n. If some predecessor were not dominated by d, we could take a path from the initial node to that predecessor without going through d and then go on to n; in that case d would not dominate n. This shows that there can be no predecessor of n which is not dominated by d; it must dominate all of them. That is the basic principle of the dominator algorithm.
So, after the algorithm we have provided here terminates, the set of dominators of a node n is the OUT set of n. Here capital N is the set of all nodes in the graph, and for every node n in N, D(n) = OUT[n].

This algorithm looks very much like, and in fact is, a data flow analysis algorithm. Let us identify its components. We have OUT and we have IN; we have the initialization OUT[n0] = {n0}, which stays fixed, and the initialization OUT[n] = N for all n in N - {n0}; here N plays the role of the universal set, so OUT is initialized to the universal set. Then there is a loop, "while changes to any OUT[n] or IN[n] occur"; this is the loop of the iterative algorithm. Inside it we have OUT[n] = {n} ∪ IN[n], so OUT is a function of IN; this is therefore a forward flow problem, because we compute OUT in terms of IN, and IN[n] is the intersection of the OUT sets of the predecessors of n, so the confluence operator is intersection. This is very similar to our available expressions problem, which is also forward flow with intersection, and that is exactly why, the confluence operator being intersection, we initialize OUT[n] to the universal set N.

The OUT of a node n is the node itself union the IN set; there is nothing to be killed, KILL is really empty here. The dominator algorithm does not require a KILL set, only a GEN set, which is the node itself; this takes care of the fact that every node dominates itself. IN is the set of nodes which dominate the entry point of the node n; whatever dominates at that point is also transmitted to the OUT part, and we add n to it, which takes care of the reflexive nature. And IN[n] is the intersection of OUT[p] over the predecessors p of n. Going back to the picture: the nodes which dominate all the predecessors are obtained by intersecting their OUT sets, and that is why the IN part is the intersection of the OUT sets. This is how the data flow analysis algorithm computes dominators.

Let me show you an example. The initialization sets the OUT of all the nodes B1 to B8 to {B0, ..., B8}, and OUT[B0] is set to {B0} permanently; {B0, ..., B8} is our universal set. Let us assume the order in which we visit the nodes is B0, B1, B2, ..., B8. Now OUT[B0] is {B0}, nothing more to do. IN[B1] is OUT[B0], which is {B0}, and OUT[B1] is IN[B1] union {B1}, so we get {B0, B1}. Similarly for IN[B2]: there are two incoming arcs here, one from B1 and the other from B7, so we must intersect the OUT sets of B1 and B7. OUT[B7] is still the universal set {B0, ..., B8} at this point, so the intersection gives us {B0, B1} as the IN set; for the OUT set we add B2 to it and get {B0, B1, B2}. So this is very simple to compute.
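Here is a minimal sketch of this dominator computation, assuming the flow graph is given as predecessor lists; it is exactly the forward-with-intersection iteration just described.

```python
# A minimal sketch of the iterative dominator computation:
# OUT[n0] = {n0}; OUT[n] = {n} | IN[n]; IN[n] = intersection of OUT[p]
# over the predecessors p of n.  preds maps each node to its predecessors.

def dominators(nodes, preds, entry):
    universal = set(nodes)
    out = {n: set(universal) for n in nodes}   # initialize to the universal set
    out[entry] = {entry}                       # the entry node dominates itself only
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            in_n = set(universal)
            for p in preds[n]:
                in_n &= out[p]                 # IN[n]: meet is intersection
            new_out = {n} | in_n               # GEN is {n}, KILL is empty
            if new_out != out[n]:
                out[n] = new_out
                changed = True
    return out                                 # out[n] = set of dominators of n
```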
So, let us take B7, for example. For IN[B7], again two edges come in, so we must take the intersection of the two OUT sets, OUT[B5] ∩ OUT[B6]; that gives us {B0, B1, B2, B3}, and for OUT[B7] we add B7 to this, giving {B0, B1, B2, B3, B7}. One more iteration is required for these values to stabilize, and then the values do not change any more. This is the dominator tree that we get from the computation in this case.

Here is another example, adapted from Aho and Ullman's book. This flow graph has many loops and so on; we have still not defined loops formally, but that does not matter for now. Node 1 is the initial node and it dominates all other nodes in the flow graph; that is trivial. Consider node 9: its dominators are 8, 7, 4, 3 and 1. It is very obvious that we have to go through 8 in order to reach 9, and to get to 8 we need to go through 7; that is also easy. But to get to 7 we do not have to go through either 5 or 6; we must definitely go through 4, so 4 is also a dominator. To reach 4 we must go through 3, and to reach 3 we do not have to go through 2, but we definitely have to start from the initial node. That is why 1, 3, 4, 7 and 8 are the dominators of node 9. Notice that there are two loop structures here; suppose one of them is changed while the rest remains as it is. There is absolutely no change in the dominator relationship: the loop did not contribute anything to the dominator relationship.

Having understood what a dominator is, we are now ready to define the loop structure, called a natural loop. To define a natural loop we must first define what is known as a back edge. Edges whose heads dominate their tails are called back edges: if there is an edge from a to b, then b is called the head and a is called the tail, and the head must dominate the tail for the edge to be a back edge.

Let us identify the back edges here, looking at the dominator tree. There is an edge from 4 to 3, and 3 dominates 4; therefore 4 to 3 is a back edge. But 3 to 4 is definitely not a back edge, even though a dominance relation exists: for 3 to 4 it is the tail 3 that dominates the head 4, which is not what we want, whereas for 4 to 3 it is the head 3 that dominates the tail 4. Similarly, consider the edge from 10 to 7: node 7 dominates node 10, the tail, so 10 to 7 is another back edge. If you look at the edge from 7 to 3, 3 dominates 7, so 7 to 3 is yet another back edge. If you look at the edge from 10 to 3, 3 dominates 10, so 10 to 3 is also a back edge, and obviously 11 to 1 is a back edge because 1 dominates all nodes in the graph, node 11 included.
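A minimal sketch of back-edge detection using the dominators() sketch above; the representation of edges as (tail, head) pairs is my own assumption.

```python
# A minimal sketch of back-edge detection: an edge (tail, head) is a back
# edge when the head dominates the tail.  Uses dominators() from above.

def back_edges(nodes, preds, edges, entry):
    dom = dominators(nodes, preds, entry)
    return [(t, h) for (t, h) in edges if h in dom[t]]
```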
Given a back edge, let us now define the natural loop of that edge. Given a back edge n to d, the natural loop of the edge is the head d plus the set of nodes that can reach n without going through d. In some sense the loop hangs from the node d: all the nodes which hang from d and are related to it in this way form the natural loop. The node d is called the header of the loop, and the significance of the header is as follows. First, it is the single entry point to the loop, and it dominates all the nodes in the loop. Second, there is at least one path back to the header, so that the loop can be iterated. Informally, if you consider the back edge from 7 to 3, the loop consists of 3 and the other nodes through which we can reach 7 without going through 3; there are several other nodes that need to be looked at. We are going to consider the algorithm for computing the natural loop in the next part of the lecture. So this is the end of today's part. Thank you.