Hi, and welcome to this next lecture on data structures and algorithms. In this session, we will continue our discussion on finding minimum weight spanning trees. We will discuss a greedy algorithm that is based on greedy edge selection, called Kruskal's algorithm. Recall that we already discussed a greedy approach using Prim's algorithm. Prim's algorithm is greedy, but it is greedy on vertices. The key based on which you pick the next vertex is the minimum weight of an edge connecting that vertex to a vertex which is already a part of the tree grown so far. That is how you grow greedily there. Here, you pick an edge that does not cause a cycle in the minimum spanning tree being grown. So, it is an edge selection rather than a vertex selection strategy. However, both of these strategies are based on edge weights.

So, the specific strategy is to find an edge of the least possible weight that connects any two subtrees in the forest. Instead of building a single tree at any point of time, you have a bunch of trees: tree 1, tree 2, tree 3 and so on. So, you have a set of trees and, as far as possible, you try to connect these trees in a way that the edge connecting them has the least possible weight. You add edges in increasing order of cost, one edge at each step.

The initialization is based on a forest which consists of singleton sets. Each tree will be a single node. So, at initialization, your set of trees T is {v1}, {v2}, ..., {vn}. This is your forest. You have no edge added to begin with, and then you introduce edges. We pick the edge with the least possible weight that connects two different trees. Of course, in this process, you merge these two trees. So, if that edge connects v1 and v2, you get a new tree {v1, v2}. The others remain what they were.
In this way, you keep growing till you have a single tree or you are not able to find any edge that connects two trees. How can we implement this? Well, we will need some union structure in which you can maintain unions of trees, and you also need to keep track of the individual sets. We are going to maintain these individual sets as sequences. So, what we are going to do is have s sets as an array of sequences. So, s sets 1 will be the sequence, sorted in increasing id, of the vertices in tree 1. In general, s sets i will hold the sequence of vertices in tree i. As pointed out, we hope that eventually either only one of these sequences is non-empty, or we are not able to find an edge connecting two different sequences, that is, two different trees represented as sequences of vertices. We maintain these vertices in sorted order primarily because we want to be able to merge two tree sequences in linear time. We will come to that step later on, where we merge s sets u and s sets v.

Continuing, we are also going to maintain a mapping. This mapping is from every vertex in the graph to the tree that it is a part of at any point of time. So, s set id for vertex i is the index j of the sequence s sets j that i is part of. The purpose is to be able to find the tree containing i, so that you only add edges that connect two different trees, or two different sequences. This will become handy when we look at the addition of an edge that connects two different trees.

So, let us continue. Next, we initialize both these data structures. We iterate over all the vertices. Again, we assume that the vertices are numbered 0, 1, up to n minus 1; this just makes our indexing easier.
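The linear-time merge of two sorted vertex sequences mentioned above can be sketched as follows. This is a minimal sketch in Python, not code from the lecture: the function name merge_sorted is hypothetical, and it assumes both inputs are already sorted in increasing vertex id.

```python
def merge_sorted(a, b):
    """Merge two sorted lists of vertex ids into one sorted list.

    Runs in O(len(a) + len(b)) time because each element is looked at once.
    """
    merged = []
    i = j = 0
    # Repeatedly take the smaller front element of the two sequences.
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # One of the sequences is exhausted; append what is left of the other.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged
```

For example, merging the sequences [0, 3, 7] and [1, 2, 8] gives [0, 1, 2, 3, 7, 8] in a single pass, which is why keeping each tree's vertices sorted pays off.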
So, we insert v into s sets v, and v is the only element of that tree. These are the singleton trees. The set id corresponding to v is also set to v itself. Next, we sort the edges in G dot edges into non-decreasing order of weight w. This is going to help us in our greedy selection strategy.

Then we iterate over all the edges in that order. For an edge between u and v, we check whether this edge indeed connects two different trees. If the set id of u is not equal to the set id of v, that means they are part of two different trees. Now, there is this T which we have not described so far. T is the data structure that keeps track of the tree so far, that is, the set of edges selected so far. It starts as an empty set, but if we find that the new edge does connect two different trees, we add this edge to the tree that is being grown. In effect, we have merged the two trees: the tree containing v has been merged into the tree containing u. That is reflected in the update to T, and we also explicitly update our forest data structure in the next step. So, the sequence for u's tree is set to the union of itself and the sequence for v's tree, merged so as to maintain the sorted order, and then we empty the sequence for v's tree. Finally, we update the set ids: every vertex that was in v's sequence, including v itself, now gets the set id of u.

Let us see Kruskal's algorithm in action. I will also suggest that you think about the correctness of this algorithm using the following loop invariant. The loop invariant is that T, at any point of time, contains only edges that belong to the minimum spanning tree, and they are the smallest weighted such edges.
Whatever remains to be scanned from the sorted list of edges are those with larger weights. Either they connect two different subtrees in the forest that is being grown or, if they belong to the same tree, they will just end up creating a cycle. So, you can think about this loop invariant, which is as follows: T contains all edges of the minimum spanning tree with weights no larger than those of the edges yet to be scanned, and the edges yet to be scanned in turn either connect two different trees or form a cycle within a tree, in which case the set ids will match.

Kruskal's algorithm in action. So, imagine that we have assigned numbers to these vertices: a is numbered 1, b is 2, and so on until i, which is 9. The next step for us is to consider 9 different trees. So, your forest will consist of {1}, {2} and so on until {9}. We are also going to sort the edges in increasing order of weights. The first edge we come across is between h and g and has weight 1. So, you add it, and that leads to the merging of h and g. Now you have fewer trees, and we keep h comma g together in one set. Of course, we maintain the sorted order: a, b, c, d, e, f, then g comma h, which again keeps the increasing order, and i. The next edge to be chosen should be either the one between g and f or the one between i and c. Let us pick i and c. Now, the process continues. We can look at g and f, and then the next edge to consider is between a and b. None of these has actually resulted in an edge within the same tree, so we keep adding them. We also add c f. What about the next one? It is between c and d, so we add that as well. How about h and i? Why do we not add h and i? We know that h and i actually belong to the same tree. So, the edge h i connects i and h, which belong to the same tree. Recall that we had set id.
So, set id of i by this point of time might point to h's set, say. In fact, it could be any of the others. It suffices to say that set id of i will equal set id of h. So, we will not add this edge h i, but we will add a h next. It does not lead to any cycle, since a and h belong to two different trees. Then d e, and so on. So, have we got a tree? Well, we have actually covered all the nodes. Is this a minimum spanning tree? Yes: if initialization, maintenance and termination are proved for the loop invariant, we are guaranteed that this is a minimum spanning tree. And it is not very difficult to prove those properties for the loop invariant.

Analysis of Kruskal's algorithm. So, we have a very specific implementation where we have maintained the union using a list of sequences. For this, the analysis is very straightforward. The insertion of the V nodes takes c1 times V time. The sorting of the edges in increasing order of weights is an order E log E operation. Then we iterate over these edges in increasing order of weights. For each edge, we check whether the set ids for the two endpoints of the edge are different. If they are different, we add the edge to the tree. This is an order E operation overall, since in the worst case we do it for all edges. The merge is again a linear operation, because in the worst case the two sequences have to be scanned fully, and it happens at most V minus 1 times. And finally, you have some other order E operations over all the edges. So, one can show that the overall complexity is dominated by this order E log E sorting step, at least when the graph is dense enough that the merging cost does not exceed it. And in the worst case E is order V square, so log E is at most 2 log V. Therefore, order E log E can also be written as order E log V. Thank you.
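As a recap, the implementation discussed in this lecture can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the lecturer's own code: the function name kruskal and the (weight, u, v) edge format are hypothetical, vertices are assumed to be numbered 0 to n minus 1, and the standard library's heapq.merge is used for the linear-time merge of two sorted sequences.

```python
from heapq import merge  # merges already-sorted iterables in linear time


def kruskal(n, edges):
    """Return the list of edges selected by Kruskal's algorithm.

    n     : number of vertices, numbered 0 .. n-1 (assumed)
    edges : list of (weight, u, v) tuples (assumed input format)
    """
    s_sets = [[v] for v in range(n)]  # s_sets[i]: sorted vertices of tree i
    set_id = list(range(n))           # set_id[v]: index of the tree holding v
    tree = []                         # T: edges selected so far

    # Scan the edges in non-decreasing order of weight.
    for w, u, v in sorted(edges):
        su, sv = set_id[u], set_id[v]
        if su == sv:
            continue                  # same tree: the edge would close a cycle
        tree.append((w, u, v))
        # Every vertex of v's tree now belongs to u's tree.
        for x in s_sets[sv]:
            set_id[x] = su
        # Merge the two sorted sequences, keeping the sorted order.
        s_sets[su] = list(merge(s_sets[su], s_sets[sv]))
        s_sets[sv] = []               # v's old sequence is now empty
    return tree


# A small made-up graph on 4 vertices, for illustration only.
example_edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
print(kruskal(4, example_edges))  # [(1, 0, 1), (2, 1, 2), (4, 2, 3)]
```

In the made-up example, the edge of weight 3 between vertices 0 and 2 is skipped because both endpoints already carry the same set id, which is exactly the cycle check described in the lecture.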