Hi and welcome back to program analysis. This is video number three of the lecture on path profiling. In this third part, we want to look at a generalization of the algorithm from the previous part, going from directed acyclic graphs to arbitrary control flow graphs, and then we'll also look briefly at some applications of the algorithm.
The algorithm that we've seen in the second video of this lecture was only defined for directed acyclic graphs. Real control flow graphs can of course have cycles, because programs have loops or sometimes call a function recursively. So what we will do now is generalize the algorithm to these arbitrary control flow graphs.
The basic idea is the following. Consider every back edge in the graph, where a back edge is basically an edge that makes an otherwise acyclic graph cyclic. You can think of it as the edge that goes from the end of a loop back to the loop header. For each of these back edges, we add two dummy edges to the graph. One goes from the entry to the target of the back edge, for example from the entry of a function to the header of a loop. The other goes from the source of the back edge, so basically from the end of a loop, directly to the exit. Both of these edges usually do not exist in the control flow graph, but we add them as dummy edges. Once the graph is augmented with these dummy edges, we remove the back edges, which gives us a DAG, a directed acyclic graph, again. That is a good graph, because on it we can just apply the algorithm that you've already seen, and it will work as described. Then we take the instrumentation that we get out of the existing algorithm, with one small addition.
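To make the graph transformation concrete, here is a minimal Python sketch. It assumes the control flow graph is given as a list of (source, target) pairs and that back edges can be found with a depth-first search from the entry. The function names `find_back_edges` and `to_dag` and the small example graph are made up for illustration; the graph is not the one from the slides.

```python
from collections import defaultdict


def find_back_edges(edges, entry):
    """Return the edges whose target is still on the DFS stack (back edges)."""
    succ = defaultdict(list)
    for src, dst in edges:
        succ[src].append(dst)
    state = {}  # node -> "active" (on the DFS stack) or "done"
    back = []

    def visit(node):
        state[node] = "active"
        for nxt in succ[node]:
            if state.get(nxt) == "active":
                back.append((node, nxt))
            elif nxt not in state:
                visit(nxt)
        state[node] = "done"

    visit(entry)
    return back


def to_dag(edges, entry, exit_node):
    """Add two dummy edges per back edge, then drop the back edges."""
    back = find_back_edges(edges, entry)
    dag = [e for e in edges if e not in back]
    # Dummy edges are kept in their own list because they may run in
    # parallel to an existing edge between the same two nodes.
    dummies = []
    for src, dst in back:
        for dummy in ((entry, dst), (src, exit_node)):
            if dummy not in dummies:
                dummies.append(dummy)
    return dag, dummies, back


# Hypothetical loop: b is the loop header, e -> b is the back edge.
edges = [("a", "b"), ("b", "c"), ("b", "d"), ("c", "e"),
         ("d", "e"), ("e", "b"), ("e", "f")]
dag, dummies, back = to_dag(edges, "a", "f")
print(back)     # [('e', 'b')]
print(dummies)  # [('a', 'b'), ('e', 'f')]
```

The result is the acyclic graph `dag` plus the list `dummies`, which together form the graph that the existing numbering algorithm would be run on.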
This addition is to add instrumentation to every back edge in the graph. So, for example, every time we go back to the header of a loop, we take the current value of r, use it as the encoding of the path that was just executed, and increment its count in the count array. Then we reset the counter r in order to basically start over from scratch.
As an example, let's now consider a different control flow graph than the one we've seen all the time, because now we want one that actually is cyclic and has one of these back edges. This graph is what you can see here. In particular, you can see the back edge from e to b, which typically comes from a loop and goes from the last statement in the loop back to the loop header, where we decide whether to go through the loop once more or not.
What the generalized algorithm does with this kind of graph is to handle the back edge by adding two dummy edges. One goes from the source of the back edge to the exit node, so from e to f. The other goes from the entry node of the graph to the target node of the back edge, so from a to b. On this changed graph, we then apply the existing algorithm while ignoring the back edge. Just imagine that it's not there, and compute increments based on that graph. I will not go through the algorithm again on this changed graph, because we've seen it more than enough in the second video, so I'll just show you the solution. The algorithm goes over this graph and computes increments for some of the edges, and those edges are exactly the ones that you can see here.
Once we have this, we can of course again compute an encoding for every path. Given this changed graph, we can basically think about four kinds of paths through the graph.
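Before going through the kinds of paths one by one, here is a small Python sketch of how the back-edge instrumentation could behave at run time. It rests on several assumptions that are not from the slides: the graph is a hypothetical loop, the edge values come from the simplest numbering scheme (each outgoing edge gets the number of paths covered by its earlier siblings) rather than the increments shown on the slide, and the back-edge step folds in the values of the two dummy edges, which is how I understand the formulation in the original path profiling paper by Ball and Larus. The names `edge_values` and `profile` are made up for this sketch.

```python
from collections import Counter, defaultdict

# Hypothetical loop graph (not the one on the slide): entry "a", exit "f".
ENTRY, EXIT = "a", "f"
REAL = [("a", "b"), ("b", "c"), ("b", "d"), ("c", "e"), ("d", "e"), ("e", "f")]
BACK = [("e", "b")]
DUMMY = [("a", "b"), ("e", "f")]  # entry -> target, source -> exit


def edge_values(real, dummy, exit_node):
    """Assign each DAG edge a value so that path sums are unique."""
    succ = defaultdict(list)
    for src, dst in real:
        succ[src].append((src, dst, "real"))
    for src, dst in dummy:
        succ[src].append((src, dst, "dummy"))
    num_paths, val = {exit_node: 1}, {}

    def count(node):
        if node not in num_paths:
            total = 0
            for edge in succ[node]:
                val[edge] = total          # paths covered by earlier siblings
                total += count(edge[1])
            num_paths[node] = total
        return num_paths[node]

    for node in list(succ):
        count(node)
    return val


def profile(trace, val, back, entry, exit_node):
    """Replay a node trace and count the encodings of the executed paths."""
    counts = Counter()
    r = 0
    for src, dst in zip(trace, trace[1:]):
        if (src, dst) in back:
            # Back edge: record the path that just ended, then start over.
            counts[r + val[(src, exit_node, "dummy")]] += 1
            r = val[(entry, dst, "dummy")]
        else:
            r += val[(src, dst, "real")]
    counts[r] += 1  # the path that ends at the exit node
    return counts


val = edge_values(REAL, DUMMY, EXIT)
# Three loop iterations: a b c e | b d e | b c e f
print(profile(list("abcebdebcef"), val, BACK, ENTRY, EXIT))
```

In this sketch the trace is split into three partial paths with encodings 1 (a, b, c, e), 7 (b, d, e) and 4 (b, c, e, f), each counted once. These correspond to three of the kinds of paths discussed next.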
One kind of path goes from the entry to the exit of the graph. In our example, this means going from a to f. Then there are paths that go from the entry to the source of a back edge, which in our case means paths from a to e. We can also reason about paths that go from the target of a back edge to the source of a back edge, in this case from b to e. And finally, we have paths that go from the target of a back edge to the exit, so for example from b to f. If we have these four kinds of paths and can compute how often each of them is taken, we can reconstruct the full path information by putting these partial paths together and adding up their frequencies, so that in the end we have the full path profiling information also for graphs that contain cycles.
To illustrate these four kinds of paths using our example, let's have a look at this table, where we see all nine paths through this graph across the four kinds. Notice that not all paths go from a to f, as was the case previously. We also look at paths that, for example, only go from b to f, or from a to e, or from b to e, because we look at all four kinds. Each of them still has a unique encoding, which we get by simply adding up the edge values along the way when we take this path. For example, the first path from a to f does not hit any increment operation, so it will have encoding zero. As another example, take the second path here: a, b, c, e, f. It first makes us go from a to b, where we take the dummy edge. Whenever a dummy edge runs in parallel to one of the pre-existing edges, we take the label of the dummy edge, or you can also think of it as having put this label onto both of the two edges that go from a to b. Either way is fine and gives the same result. So we go from a to b.
So our counter goes down to minus four. Then from b to c we add six, so we are at two. From c to e nothing changes, and from e to f we subtract one, so the encoding for this path at the end is one. The next path in the list, a, b, c, e, is the second kind of path that we see on the slide, because it only goes from a to e, so from the entry to the source of a back edge. It has encoding two, because we start with minus four and then add six, so we end up with two. In a similar way, you can compute the encodings for each of these paths. I won't go through all of them, because you've now seen how to do it, but I invite you to try it out yourself and convince yourself that each of these nine paths actually gets a unique encoding and that these encodings are as listed in this table.
Finally, let me say a little more about the applications of path profiling. We already mentioned one of them earlier in this lecture, and it is perhaps the most important one: performance optimization. A compiler can only spend so much time and effort on optimizing different parts of a program. To find out which parts are most worthwhile to spend optimization effort on, path profiling is a very popular technique, where the assumption is that any path that is executed frequently should also get a lot of attention from the optimizer in order to speed up its execution.
Another direction is to use path profiling in statistical debugging. The idea of statistical debugging is that you look at different executions of a program, and you know which of these executions trigger a bug and which do not.
For example, you can think of having a large test suite where many tests pass and some tests fail, and you believe that one bug is responsible for most of the failing tests. What statistical debugging does is compare the passing executions to the failing executions and then try to highlight those parts of the code that seem to occur mostly or only in the failing executions. If you do this at the level of paths, you get a pretty accurate picture of what happens during the execution, and the idea is that paths correlated with a failure are much more likely to actually contain the bug. The Holmes paper, which is linked as one of the references for this lecture, is a nice example of that.
Finally, yet another area of application is energy analysis. Sometimes you really care about an application not using too much energy. For example, think of mobile applications, which of course should not drain your battery too much. There are approaches that warn developers about paths and statements associated with high energy consumption. To find out which paths and statements these are, they essentially perform path profiling as we've just seen it in this lecture and, at the same time, measure the energy used during an execution. Then they correlate these two measurements and highlight the paths and statements that are responsible for a lot of the energy that is used.
All right, and with this we are already at the end of this lecture on path profiling. In this last part, you've seen how to generalize the main algorithm to control flow graphs that have cycles, and I very briefly talked about some of the applications. If you're interested in more applications, just have a look at some of the papers that are linked with this lecture, or follow the citations of the original path profiling paper. There are many, many interesting applications to read about. Thank you very much for listening and see you next time.