Hi, and welcome to this next session on data structures and algorithms. In this session, we will discuss another problem on graphs, that of finding spanning trees. A spanning tree is basically a subgraph which spans all the vertices of the original graph, but uses only a subset of the edges, in such a way that there is no resulting cycle. We will discuss Prim's algorithm for finding a spanning tree of a graph.
However, it is possible that a graph has weighted edges. In such a situation you would be interested in finding a spanning tree that has the smallest sum of weights over its edges, and that basically amounts to finding a minimum spanning tree. The minimum spanning tree T is basically a subgraph of the graph G, a tree, with the minimum value of the summation, over all the edges E of the tree, of the weight of the edge, such that for every vertex V in G there exists an edge E in T that covers V. We will be considering weighted undirected graphs.
Now the first algorithm we will discuss is Prim's algorithm. It is a greedy algorithm, very much like Dijkstra's algorithm. Recall that Dijkstra's algorithm also discovers a tree. However, Dijkstra's algorithm is for a single source, and the goal there is to find the shortest path from that source to every node. Of course, Dijkstra's algorithm assumes that all edge weights are non-negative.
So here is the basic idea behind Prim's algorithm. It is greedy in the sense of finding the next vertex to add, which is the one that has the smallest edge weight to any of the existing vertices in the subtree that has been grown so far. So the key idea is as follows. One, grow the minimum spanning tree T. Two, as you grow it, keep track of the smallest edge weight from each vertex V, which is in the set of vertices of G but not in the set of vertices of T, to a vertex W that is part of the existing tree T. Three, the greedy step: add to the tree T the vertex V from V(G) minus V(T) with the smallest value of such an edge weight.
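The three steps of the key idea can be sketched directly, before any heap is brought in. This is only a minimal illustration, not the algorithm as it appears on the lecture slides: the function name `grow_spanning_tree`, the dictionary-of-dictionaries graph representation, and the `best` table are all assumptions made for this example, and the graph is assumed to be undirected, connected and free of self-loops.

```python
import math

def grow_spanning_tree(graph, root):
    """Grow a minimum spanning tree greedily from `root`.

    `graph` maps each vertex to a dict {neighbour: edge_weight}
    (a hypothetical representation chosen for this sketch).
    Returns the tree edges as (tree_vertex, new_vertex, weight) triples.
    """
    # best[v] = (smallest edge weight from v into the tree, tree endpoint of that edge)
    # for every vertex v that is not yet in the tree.
    best = {v: (math.inf, None) for v in graph if v != root}
    for v, w in graph[root].items():
        if v in best:
            best[v] = (w, root)

    tree_edges = []
    while best:
        # Step three, the greedy step: pick the outside vertex with the lightest edge into the tree.
        v = min(best, key=lambda x: best[x][0])
        weight, parent = best.pop(v)
        tree_edges.append((parent, v, weight))  # step one: the tree grows by one vertex
        # Step two: v is now in the tree, so its neighbours may have a lighter edge into it.
        for u, w in graph[v].items():
            if u in best and w < best[u][0]:
                best[u] = (w, v)
    return tree_edges

triangle = {"a": {"b": 1, "c": 3}, "b": {"a": 1, "c": 2}, "c": {"a": 3, "b": 2}}
print(grow_spanning_tree(triangle, "a"))  # [('a', 'b', 1), ('b', 'c', 2)]
```

Finding the minimum by scanning `best` costs time proportional to the number of vertices on every iteration. It is that scan that a min heap is meant to speed up.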
So let us try and map the key ideas onto the algorithm described here. We are going to keep track of the smallest edge weight from each vertex in V(G) minus V(T) to a vertex in V(T), and this will form what we call the array of keys. So here is the key array, and we initially set the key value for every vertex to infinity. However, we need some root. We can choose the root arbitrarily because we are dealing with undirected graphs. So we arbitrarily choose a root; all the other key values are large, so the root is the first vertex to come out.
Now you are going to iterate over the set of vertices in P. Make a note here that every vertex in G is added to P, and P is maintained as a min heap. At initialization you specify that the key the min heap should use is this key array; it is based on this array that the next smallest element is retrieved.
So now you iterate over this min heap. You get the smallest element u from the min heap, and that corresponds to step number three, add the vertex to T with the smallest value of such an edge weight; that is exactly what we do here. Once we have identified the next node u to add, and initially you will actually add the root, you are also going to update the key values for the vertices that are adjacent to u. So this step, which iterates over all the adjacent vertices, corresponds to the step of updating the key array, and this is done for the vertices adjacent to u.
You also do a small bit of bookkeeping here through the predecessor. The predecessor array is what gives you access to the edges that are included in the minimum spanning tree: for all u, v such that the predecessor of u is v or the predecessor of v is u, the edge (u, v) will belong to the edge set of T.
What ensures correctness of this algorithm? Well, the loop invariant for this algorithm is as follows. You can guarantee that in each iteration of the while loop, the vertices that are already placed into the minimum spanning tree are those which are in V but not in the priority queue P.
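The key array, the min heap P and the predecessor array described so far can be put together in a short sketch. It differs from the description above in one respect that should be stated plainly: Python's `heapq` module has no operation for updating a key in place, so instead of placing every vertex in the heap at initialization and updating it, this sketch pushes a fresh entry whenever a key improves and skips stale entries when they are popped. The names `prim` and `tree_edges` and the dictionary-based graph are assumptions for this example, and vertex names are assumed to be comparable (for instance strings) so that ties on the key can be broken.

```python
import heapq
import math

def prim(graph, root):
    """Run Prim's algorithm from `root`; return the (key, pred) tables.

    `graph` maps each vertex to a dict {neighbour: edge_weight}
    (a hypothetical representation chosen for this sketch).
    """
    key = {v: math.inf for v in graph}   # smallest known edge weight into the tree
    pred = {v: None for v in graph}      # predecessor: tree endpoint of that edge
    key[root] = 0
    heap = [(0, root)]                   # min heap ordered by key, then vertex name
    in_tree = set()

    while heap:
        _, u = heapq.heappop(heap)
        if u in in_tree:
            continue                     # stale entry left over from an earlier, larger key
        in_tree.add(u)                   # the greedy step: u joins the tree
        for v, w in graph[u].items():    # update the keys of vertices adjacent to u
            if v not in in_tree and w < key[v]:
                key[v] = w
                pred[v] = u
                heapq.heappush(heap, (w, v))  # stands in for the in-place heap update
    return key, pred

def tree_edges(pred):
    """Read the spanning tree edges off the predecessor table."""
    return [(p, v) for v, p in pred.items() if p is not None]

triangle = {"a": {"b": 1, "c": 3}, "b": {"a": 1, "c": 2}, "c": {"a": 3, "b": 2}}
key, pred = prim(triangle, "a")
print(tree_edges(pred))  # [('a', 'b'), ('b', 'c')]
```

With lazy entries the heap can hold up to one entry per edge rather than one per vertex. Each heap operation still costs a logarithmic factor, since the log of the number of edges is within a constant of log V.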
So the loop invariant has a bunch of conditions. One of them is that the vertices in V(G) minus those in the priority queue P are already part of the minimum spanning tree. Of course, we keep track of the minimum spanning tree through the predecessor array. At termination, what does this mean? It means that P is empty: you have spanned all the vertices of the graph, so you have completed the minimum spanning tree.
There are also a couple of other conditions, which ensure that the edges you have added are ones that should actually belong to the minimum spanning tree. For all vertices V which are in the priority queue P, if the predecessor of V is not nil, then the key corresponding to V is finite, and key of V is the weight of the edge between V and pi of V, where pi of V, the predecessor of V, is already part of the minimum spanning tree constructed so far. So one can again prove that this condition holds. This second condition was the motivation behind Prim's algorithm in the first place.
Let us see Prim's algorithm in action. We are going to arbitrarily choose a root; let us say we choose A to be the root. The vertices that are adjacent to A will have their keys updated. So B will have its key updated: its edge to A has weight 4, and that is the smallest edge weight to any node in the existing spanning tree. So this is how you grow the tree starting at the root A: now you add B as well.
What you do next is update the keys for all the other nodes, and here are the new keys. H has a lighter edge to A than to B, so that is the edge its key is taken from. C has an edge to B but no direct edge to A. For all the other vertices there is no direct edge incident on the nodes added so far to the minimum spanning tree. So in the next step all you can do is choose between C and H. Arbitrarily break the tie and choose C. What are the new keys to be updated? We can update the key for I, and we can update the keys for D and for F.
Now, amongst all the new keys and the old keys, we find that I should be picked next, and that is what is done. Again you update the keys for the other nodes connected to I, and we find that in the next iteration it makes sense to include F. So we go ahead and again update the keys. G has a new key, and it looks like G should be picked next. Again you update the keys. Well, certainly H can be picked next. There are some nodes still unexplored, and we find that amongst the unexplored nodes we can pick D next because it has the smallest key, 7. Amongst the rest, H is already part of the vertex set, so you will not add it. The only thing to consider is E. On what basis do you add E? Well, its key is the weight of the lightest edge into the tree, and this comes right off the key array. So we have found a minimum spanning tree.
What was the complexity? The initialization requires access to all the vertices and inserting each of them into the min heap. Now, we know that heapification can happen in time of the order of the number of entries being added; recall the efficient version of heapification. So this is basically C1 times C2 times V.
The next step is to find the smallest element in the min heap, and you need to keep doing this for as long as the heap is not empty. So this will happen order V times, and every time you are going to do an access into the min heap that costs C3 times log V. So this gives you C3 times C6 times V times log V.
The inner loop, with its if test, is interesting. The inner loop is run for each node, and for each node you are accessing the entire adjacency list. So this is the summation over V of the degree of V, which is twice the number of edges and therefore of the order of the number of edges.
However, there is a very interesting implicit call being made to the heap, which is not a trivial operation, and that is updating the key. The moment you update the key, the heap has to be updated as well. We have left it implicit, but you might want to explicitly make a call saying update heap for P.
So this update will take order log V time, and remember that all this happens for time proportional to the sum of the lengths of all the adjacency lists. So that gives you C4 times C5 times log V, multiplied by the summation over all vertices of the degree of that vertex. So the total is nothing but order of E times log V plus order of V times log V, and for a connected graph, which has at least V minus 1 edges, that is order of E times log V. Thank you.
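As a check on the walkthrough and on the operation counts, here is a small sketch that runs a heap-based Prim's algorithm and counts the work it does. The figure from the lecture is not part of this transcript, so the edge weights in `EDGES` are an assumption: they were chosen only to be consistent with the values that are mentioned (4 for the edge between A and B, a key of 7 for D). The names `build_graph` and `prim_trace` are likewise hypothetical, and the sketch pushes a fresh heap entry on each key update instead of updating the heap in place, because Python's `heapq` has no such operation.

```python
import heapq
import math

# Assumed edge list (the lecture's figure is not reproduced in the transcript).
EDGES = [
    ("a", "b", 4), ("a", "h", 8), ("b", "c", 8), ("b", "h", 11),
    ("c", "d", 7), ("c", "f", 4), ("c", "i", 2), ("d", "e", 9),
    ("d", "f", 14), ("e", "f", 10), ("f", "g", 2), ("g", "h", 1),
    ("g", "i", 6), ("h", "i", 7),
]

def build_graph(edges):
    """Build an undirected adjacency map {vertex: {neighbour: weight}}."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph

def prim_trace(graph, root):
    """Return (order vertices were added, total tree weight,
    adjacency-list entries scanned, key updates performed)."""
    key = {v: math.inf for v in graph}
    key[root] = 0
    heap = [(0, root)]
    done, order = set(), []
    total = scans = updates = 0
    while heap:
        k, u = heapq.heappop(heap)
        if u in done:
            continue                      # stale entry, its vertex was already added
        done.add(u)
        order.append(u)
        total += k                        # k is the weight of the edge that brought u in
        for v, w in graph[u].items():
            scans += 1                    # one adjacency-list entry examined
            if v not in done and w < key[v]:
                key[v] = w
                updates += 1              # each update costs one O(log V) heap operation
                heapq.heappush(heap, (w, v))
    return order, total, scans, updates

order, total, scans, updates = prim_trace(build_graph(EDGES), "a")
print(order)                   # ['a', 'b', 'c', 'i', 'f', 'g', 'h', 'd', 'e']
print(total)                   # 37
print(scans, 2 * len(EDGES))   # 28 28
```

On these assumed weights the vertices come out in the same order as in the walkthrough. The tie between C and H at key 8 happens to be broken in favour of C because heap entries with equal keys are compared by vertex name. The number of adjacency-list entries scanned equals the sum of the degrees, that is, twice the number of edges, and the number of key updates can never exceed it.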