Hello and welcome to this next lecture on data structures and algorithms. In today's session we will discuss the simplest single-source shortest path algorithm, one that happens to be greedy yet optimal: Dijkstra's algorithm. We will discuss shortest paths in weighted graphs, and the underlying property that Dijkstra's algorithm makes use of, called the optimal substructure property. Finally, we will discuss the procedure of edge relaxation. So what is the shortest path in a weighted graph? Given two vertices a and g in the graph below, the problem is to find a path of minimum total weight between them. We assume non-negative weights. Along one particular path the weights are 7, 7 and 9, so 23 is the total weight of that path; there is also another path, through d, which affords a slightly different weight of 22. So which of these paths is the shortest? The length of a path is the sum of the weights of its constituent edges. This has several applications, such as routing of vehicles, where the weight could denote the length of a particular road, or the time required to get from A to B based on congestion conditions, and so on. One could also look at flight routing, internet packets in an IP network, and so on. Two specific properties that we would like to exploit are as follows. Optimal substructure: given a shortest path between a and g, we would like its sub-paths to also correspond to shortest paths; that is, a sub-path of a shortest path is also a shortest path. In fact, starting with a source node, we can show that a tree of shortest paths exists from any start vertex s to every other vertex.
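The definition above, that the length of a path is the sum of the weights of its constituent edges, can be illustrated with a small sketch. The edge weights and vertex names below are hypothetical, chosen only so that the two candidate paths total 23 and 22 as in the lecture's figure, which is not shown in the transcript:

```python
# Hypothetical weighted edges (the lecture's figure is not available);
# weights chosen so one path totals 23 and a rival path through d totals 22.
weights = {('a', 'b'): 7, ('b', 'c'): 7, ('c', 'g'): 9,   # first path:  23
           ('a', 'd'): 6, ('d', 'e'): 8, ('e', 'g'): 8}   # second path: 22

def path_weight(path):
    """Length of a path = sum of the weights of its constituent edges."""
    return sum(weights[(u, v)] for u, v in zip(path, path[1:]))

print(path_weight(['a', 'b', 'c', 'g']))  # 23
print(path_weight(['a', 'd', 'e', 'g']))  # 22 -- the shorter of the two
```

The shortest-path problem is then to minimize this sum over all paths between the two vertices, which is what Dijkstra's algorithm does without enumerating paths explicitly.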
So for example, the shortest path from a to g here happens to have weight 11; every other path has a higher weight. The shortest path to f, however, is through d, so a-d-f. We do not need to connect f to g as part of the shortest path tree, because that edge is not on a shortest path. So the intuition we are developing is that the subgraph containing all the shortest paths will actually be a tree; it cannot have cycles, primarily because we are talking about unique shortest paths and we are only looking for one shortest path to each vertex. The optimal substructure property, that a sub-path of a shortest path is also a shortest path, is exploited in a greedy algorithm: Dijkstra's algorithm. Dijkstra's algorithm combines the optimal substructure property with a tree of shortest paths which it keeps track of. For every node or vertex v, it stores a label d(v), which is the distance of v from the source s, and computes d(v) successively, starting from the neighbors of s. We are of course assuming that the graph is connected; we can make a small modification to this algorithm to deal with a graph consisting of several connected components, and we will leave that as a homework problem. We are also assuming that edges are undirected; we will leave it as a homework problem to determine how directed edges, or a mixture of directed and undirected edges, could be handled. And finally, very critical for our algorithm, the edge weights must be non-negative. So here is Dijkstra's algorithm: given an input graph G and a source node s, it keeps track of the shortest path length SP from s to every possible node v. SP is basically evolving in size: every time we finalize the shortest path to a node, we add it to SP. SP is critical to our loop invariant.
We will also keep track of the distance to every possible node in the graph that has been computed to be the minimum so far. This interim state P is maintained as a min-heap: the tentative distance found to be shortest so far, populated for every node. Initially we set the distance to the source node to 0 and the distance to every other node to infinity; likewise SP starts at infinity for all nodes except the source. P is a priority queue, and within the loop we pick the minimum element u from P, which means u is the closest node in P to s. Initially, what you will get from P is the source node itself. You remove that node, look at all its incident edges, and for every incident edge e you find the other endpoint w. So pictorially, u is the node found to be closest to s so far; we look at all its incident edges, and here is one particular edge e with endpoint w. We take the shortest path SP[u] from s to u and add to it the weight of edge e. This step is called relaxation: we are relaxing e, absorbing it into the path from s to w. Now, if this relaxed weight r turns out to be less than the shortest path to w computed so far, we conclude that the relaxed weight r is indeed the new best distance, so we set SP[w] = r and update the same within P. Note that the distance for all nodes was initially set to infinity except for the source node, so the first time you reach a node, its recorded distance will be infinity and you will certainly update it. The question is whether you might have to update the distance for a node again after it leaves P, and that is precisely what the loop invariant rules out. The loop invariant for Dijkstra's algorithm is as follows.
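The loop just described can be sketched in Python, using the standard library `heapq` as the min-heap P. One implementation detail not spelled out in the lecture: instead of an explicit decrease-key operation, this sketch pushes a fresh heap entry on every improvement and skips stale entries when they surface, which is a common and equivalent choice:

```python
import heapq

def dijkstra(graph, s):
    """Single-source shortest paths (Dijkstra). `graph` maps each vertex
    to a dict {neighbor: non-negative edge weight}; returns SP, the
    shortest-path distance from s to every vertex."""
    sp = {v: float('inf') for v in graph}  # tentative distances
    sp[s] = 0
    pq = [(0, s)]          # min-heap P of (distance, vertex) pairs
    done = set()           # vertices removed from P: SP is final for them
    while pq:
        d, u = heapq.heappop(pq)   # u = closest remaining node to s
        if u in done:              # stale entry superseded by a better one
            continue
        done.add(u)
        for w, weight in graph[u].items():
            r = d + weight         # relaxation of edge e = (u, w)
            if r < sp[w]:          # relaxed weight beats the best so far
                sp[w] = r
                heapq.heappush(pq, (r, w))  # "update within P"
    return sp
```

For example, on a small triangle graph `{'a': {'b': 1, 'c': 4}, 'b': {'a': 1, 'c': 2}, 'c': {'a': 4, 'b': 2}}`, calling `dijkstra(graph, 'a')` yields distances 0, 1 and 3, since the path a-b-c beats the direct edge a-c.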
So for every node that gets removed from P, we can be sure that we have actually found the shortest path to that node. The loop invariant is: for each node u that ceases to remain a part of P, the shortest path to u is indeed SP[u]. In fact, that is why we keep iterating until P becomes empty, and we can argue this as follows. The last time SP was updated for that particular node u, we computed a new shortest path to u based on a neighboring node having been found to have the shortest distance from the source s, and indeed the new, relaxed path to u was found to be shorter than the path previously registered for u. Since edge weights are non-negative, this means we cannot later find a shorter path to u than what has already been registered. One can formally prove that the loop invariant is satisfied at every iteration using contradiction; it is a very straightforward proof. So let us see Dijkstra's algorithm in action. We start with source node d and look at all its neighbors a, b, e and f. Recall that the distance to d was registered as 0, whereas the distance to every other node, a, b and so on, is infinity, and this is exactly what the priority queue, the min-heap, is going to carry in the beginning. So P.min() will turn out to be d itself. Next you scan the neighbors of d; 5, 9, 15 and 6 are the weights on the edges to the neighbors. You update the distances: SP[a] becomes 0 + 5 = 5, which is less than the existing infinity; similarly SP[b] is 9, SP[f] is 6 and SP[e] is 15, and you update the min-heap accordingly. So what happens next? The next P.min() gets you a, with its corresponding value 5, after which we again go and update all its neighbors. Let us see what that means. So yes, we got a, and now we update the distances of its neighbors.
You find that the new path through a gives b a weight of 8, which is less than the earlier value of 9; a has only one neighbor not yet explored, because its other neighbor is d itself. Now look at the min-heap. What is the next element to pick? Well, the next element is f, with a value of 6. You pick f, scan its neighbors, and check whether the relaxed path is better than the shortest path so far. For e, 6 + 8 = 14 is less than 15; for g, 6 + 11 = 17 is less than infinity; so e and g get 14 and 17 respectively. The next element to be picked is b, with weight 8. Again you scan its neighbors: 8 + 4 = 12 is less than 14, so you update the shortest path to e, and likewise c is updated to 16. The next closest node from the priority queue is going to be e, with weight 12; scan its neighbors. Is there a need to update the distances to the neighbors? For c, the existing 16 is already the shorter value, but 12 + 4 = 16 is less than 17, so yes, you can update g, and so on. Again, recall that the loop invariant was that every node removed from the priority queue indeed has its final shortest path, and that is what we have been doing: moving forward in a greedy manner, courtesy of the ever-emptying P. So the loop invariant and the greedy nature of the algorithm are very closely tied through the ever-emptying P. Now let us look at the running time. Initializing the shortest path for every vertex is just a scan over all the nodes, so it is c1 times V, linear in the number of vertices. Then, each extraction of the minimum element of P incurs O(log V); this is the worst case, when P is full, and of course P subsequently gets depleted. This uses the standard min-heap implementation.
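The trace above can be checked end-to-end. The edge weights below are an assumption, reconstructed from the numbers quoted in the lecture (the figure itself is not in the transcript), but running a heapq-based Dijkstra over them reproduces exactly the final distances the trace arrives at:

```python
import heapq

# Undirected edge weights reconstructed from the lecture's trace
# (assumption: the original figure is not shown in the transcript).
edges = {('d', 'a'): 5, ('d', 'b'): 9, ('d', 'e'): 15, ('d', 'f'): 6,
         ('a', 'b'): 3, ('f', 'e'): 8, ('f', 'g'): 11,
         ('b', 'e'): 4, ('b', 'c'): 8, ('e', 'g'): 4}

# Build the adjacency map in both directions.
graph = {}
for (u, w), wt in edges.items():
    graph.setdefault(u, {})[w] = wt
    graph.setdefault(w, {})[u] = wt

sp = {v: float('inf') for v in graph}
sp['d'] = 0                      # source node d
pq, done = [(0, 'd')], set()
while pq:
    d, u = heapq.heappop(pq)
    if u in done:                # stale heap entry; skip
        continue
    done.add(u)
    for w, wt in graph[u].items():
        if d + wt < sp[w]:       # relaxation improves the path to w
            sp[w] = d + wt
            heapq.heappush(pq, (d + wt, w))

print({v: sp[v] for v in sorted(sp)})
# {'a': 5, 'b': 8, 'c': 16, 'd': 0, 'e': 12, 'f': 6, 'g': 16}
```

Note how the intermediate values from the trace appear along the way: e first gets 15, then 14 via f, then its final 12 via b; g first gets 17 via f, then its final 16 via e.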
Once you remove the minimum element u, you scan all its neighbors; accumulated across all the nodes, this costs c4 times the sum of the degrees, since it happens once for every node. Then we also have an update call, where we update the distances corresponding to the neighbors, and each such heap update costs O(log V). So the multiplication is between this log V and the degree, summed across all the nodes: the relaxation work is the sum over all nodes v of c3 times c4 times deg(v) times log V. Please note that we have accounted for the worst case here, making the log factor independent of the current size of P; we could make it a little more precise by letting it depend on the current size of P, but that is needless, because the worst case already gives the bound we want. So we have the summation over all nodes of deg(v) times log V, plus the extraction cost incurred across all the outer iterations, which is V times log V. Since the degrees summed over all vertices equal 2E, overall we get O((E + V) log V). Thank you.
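The counting argument above can be written out compactly, using the handshake lemma that degrees sum to twice the number of edges:

```latex
T(V,E) \;=\; \underbrace{c_1 V}_{\text{initialization}}
\;+\; \underbrace{c_2\, V \log V}_{\text{extract-min, } V \text{ times}}
\;+\; \underbrace{c_3 \sum_{v \in V} \deg(v) \log V}_{\text{relax/update}}
\;=\; O\!\left((V + E)\log V\right),
\qquad \text{since } \sum_{v \in V} \deg(v) = 2E .
```

This is the bound for the binary min-heap implementation described in the lecture; other priority-queue choices change the log factor but not the structure of the count.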