Hello everyone, my name is Rishabh Daal and I am a student at IIT. The topic of today's talk is optimization using flow networks. The talk is in three parts. In the first part, we will have a pretty basic introduction to the NetworkX library, which is a library we use for dealing with graphs. In the second part, we will discuss what flow networks are, what the maximum flow is, and what the minimum cut is. These are mathematical concepts, but we will see how we can use them for dealing with some optimization problems. Flow networks are generally used for getting an optimal solution to something, and these days they are also used for approximate solutions, like in emergency evacuation planning. In the third part, we will discuss a problem called the project selection problem. We will see how to model the problem as a flow network, prove that the minimum cut of this network gives us the solution, and then I will solve the problem in NetworkX within five to six lines of code.
So we'll start. In NetworkX, we have four types of graphs. First is the simple graph, an undirected graph which does not have any parallel edges. Another is the directed graph without parallel edges, and then there is the undirected graph with parallel edges and, similarly, the directed graph with parallel edges. First of all, there are pretty basic functions like add_node. A node can be an integer, and we can add attributes to a node too, like color or weight, which are used by various algorithms. One good thing is that a node does not need to be an integer. It can be any hashable object: a string, a mathematical function, or even a graph itself. Here I have added a graph itself as a vertex of the graph. And a pretty basic function like g.nodes gives us the nodes, so we can iterate over them; all the nodes are here.
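The node-handling calls described above can be sketched roughly as follows. This is a minimal illustration, not the code from the slides; the attribute values and the choice of `math.sin` and a small path graph as nodes are assumptions made for the example.

```python
import math
import networkx as nx

# The four graph classes mentioned in the talk.
simple = nx.Graph()                 # undirected, no parallel edges
directed = nx.DiGraph()             # directed, no parallel edges
multi = nx.MultiGraph()             # undirected, parallel edges allowed
multi_directed = nx.MultiDiGraph()  # directed, parallel edges allowed

G = nx.Graph()
G.add_node(1, color="red", weight=3)  # integer node with attributes
G.add_node("hello")                   # a string as a node
G.add_node(math.sin)                  # a function is hashable, so it can be a node
H = nx.path_graph(3)                  # hypothetical small graph used as a node
G.add_node(H)

# Iterate over the nodes and print each node with its attribute dict.
for node in G.nodes:
    print(repr(node), G.nodes[node])
```

Any hashable object works as a node because NetworkX stores nodes as dictionary keys.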
For adding edges, there is add_edge from a to b. Here a and b were not originally added as vertices of the graph, but add_edge will automatically add them, and an edge can carry a weight, a color and so on. You can access an edge just like you access a multi-dimensional array. So g[2][5] and g[5][2] are the same thing, because we are dealing with an undirected graph. We can add edges from a list as well, and similarly we can add nodes from a list. Then there are some pretty basic functions like number of edges, number of nodes, number of self-loops. NetworkX has a lot of these functions, which come in very handy when you are writing a small script that deals with graphs.
And this is the thing I like the most: you can draw the graph pretty easily. It comes in very handy when you are dealing with very large graphs, like when you are trying to find some relationship within a population. And it can be customized very easily; you can change the edge length, the edge color and everything like that.
Till now, the nodes were not all integers: one node was the mathematical function sine, one node was the graph itself, one node was a string, and some were integers. You can convert them to integers if you want and keep the old labels too. Here, the old node that was the function sine is now number five. The graph remains the same, just the labeling is changed. It's pretty easy to iterate over all the edges, print their information and things like that.
These are just two or three of the algorithms that I'm discussing; almost all the graph algorithms are available in NetworkX. A connected component is a set of vertices that are connected to each other, meaning we can reach each one from the others just by going over the edges. Here 0, 1, 2 and 7 are in different connected components. If I connect 0 and 7, they will be in the same connected component.
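A rough sketch of the edge functions, the relabeling and the connected components just described. The node names and weights are made up for the example, and `old_label` is simply the attribute name chosen here for keeping the original labels.

```python
import networkx as nx

G = nx.Graph()
G.add_edge("a", "b", weight=4, color="blue")  # "a" and "b" are created automatically
G.add_edges_from([(2, 5, {"weight": 3}), (5, 7), (7, 7)])  # (7, 7) is a self-loop
G.add_nodes_from([10, 12])                    # two isolated nodes

# Array-style access; both orders give the same data on an undirected graph.
same_edge = G[2][5] == G[5][2]

n_nodes = G.number_of_nodes()
n_edges = G.number_of_edges()
n_loops = nx.number_of_selfloops(G)

# Relabel to integers 0..n-1 and keep the old label as a node attribute.
relabeled = nx.convert_node_labels_to_integers(G, label_attribute="old_label")

components_before = nx.number_connected_components(G)
G.add_edge("b", 2)  # joining two components merges them
components_after = nx.number_connected_components(G)

for u, v, data in G.edges(data=True):
    print(u, v, data)
```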
You can relabel the nodes to create a new graph, and you can take the union or intersection of graphs. The shortest path function gives the shortest path, using Dijkstra's algorithm when the edges are weighted. The shortest path here between 10 and 12 gives an error, because 10 and 12 do not have any path between them. The shortest path between 0 and 8 is 0, 7, 8. And then there is the dominating set, if you know about dominating sets; I'm just showing you the simple algorithms. So this was just a simple introduction to how to create a graph, add a node and add an edge in NetworkX.
Now we will see something about flows and cuts in a graph. First of all, let's look at the history of network flows. Around 1955, some US Air Force researchers published a classified report in which they studied 40 regions of the Soviet Union. The Soviet Union was connected to its satellite Eastern European countries by railway networks. So they studied 40 of these regions and 105 edges, where an edge is just the railway link between two regions. They gave a weight to each edge, and the weight was how much material you can pass from one region to another. Their intention was to bomb the railway network: they wanted to disconnect the Soviet Union from its European satellite countries. There was a cost associated with bombing each railway link, so they looked for a set of edges such that bombing them was the cheapest way to disconnect the Soviet Union from those countries, and they succeeded in finding it. So let's see what they used.
So what is a flow network? A flow network is a directed graph, and we generally do not have parallel edges. We have two distinguished nodes in it: one is the source and the other is the sink. The source is where the flow starts and the sink is where the flow ends. And each edge in the flow network has a capacity attached to it.
The capacity is the maximum amount of material that you can send through that edge. So this is an example of a flow network. S is the source, T is the sink, and the capacities of all the edges are given. For example, the capacity of the edge from S to 4 is 15, which means you can pass 15 units of material through the edge from S to 4. Sorry, it is supposed to be this way; I have changed it.
A flow from source S to sink T is a function which satisfies two constraints. First, the flow on any edge is greater than or equal to zero, obviously, and it should be less than or equal to the capacity of that edge. Second, for every vertex other than the source and the sink we have conservation of flow: the flow that is coming into the vertex is equal to the flow that is going out of it. And the total flow is how much we are getting at the sink.
This is an example of a flow. Here we are pushing four units of material from source S to 2, then from 2 to 3, 3 to 6 and 6 to T. This is a valid flow because we are not breaking either of the constraints: the flow is at most the capacity on every edge, and every vertex has conservation of flow.
And this is the maximum flow that we can get from this flow network. We are pushing a flow of value 28 from the source and getting it at the sink, all the edges satisfy the constraint that the flow in them is at most their capacity, and for each vertex the conservation of flow holds. We cannot pass more than 28 units of flow through this network. Just remember the number if you can.
So this is the maximum flow, and now we jump to the cuts of a graph. In a cut of a graph we partition the vertices into two subsets, A and B, where the source belongs to A and the sink belongs to B.
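The two flow constraints and the maximum flow can be checked with a short script. The slide's network is not in the transcript, so the capacities below are an assumption, chosen to agree with the numbers quoted in the talk (edges of capacity 10, 5 and 15 leaving S, and a maximum flow of 28).

```python
import networkx as nx

# Assumed capacities; only some of them are quoted in the talk.
capacities = {
    ("s", 2): 10, ("s", 3): 5, ("s", 4): 15,
    (2, 3): 4, (2, 5): 9, (2, 6): 15,
    (3, 4): 4, (3, 6): 8, (4, 7): 30,
    (5, 6): 15, (5, "t"): 10, (6, 7): 15,
    (6, "t"): 10, (7, 3): 6, (7, "t"): 10,
}

G = nx.DiGraph()
for (u, v), c in capacities.items():
    G.add_edge(u, v, capacity=c)

flow_value, flow = nx.maximum_flow(G, "s", "t")

# Constraint 1: 0 <= flow <= capacity on every edge.
capacity_ok = all(0 <= flow[u][v] <= c for u, v, c in G.edges(data="capacity"))

# Constraint 2: inflow equals outflow at every node except source and sink.
conservation_ok = all(
    sum(flow[u][n] for u in G.predecessors(n)) == sum(flow[n].values())
    for n in G
    if n not in ("s", "t")
)

print(flow_value, capacity_ok, conservation_ok)
```

`nx.maximum_flow` reads the `capacity` edge attribute by default and returns both the flow value and a nested dictionary with the flow on each edge.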
The idea here is to cut some edges such that the maximum flow from source S to sink T becomes zero. That is just what the Air Force researchers wanted: they did not want any material to flow from the Soviet Union to the Eastern European countries, so they wanted to cut some railway links in between. The capacity of the cut is calculated as the sum of the capacities of the edges that go from set A to set B. We do not count the edges that go from set B to set A, because they cannot carry any flow from the source side to the sink side.
A pretty simple cut of the graph is to disconnect the source from the graph. Here we are cutting all three edges that leave the source, which have capacities 10, 5 and 15, so the capacity of this cut is 30. But this is not the optimal case; this is the trivial case, since we can always separate the source from the network. We want to find the minimum cut, and this is the minimum cut: if we destroy the edges from S to 2, from 3 to 6 and from 7 to T, the sink, then the capacity of the cut is 10 plus 8 plus 10, which is 28. And the max flow was also 28.
There is another story here: some people from the Soviet Union also studied the same network at a different time. They wanted the good thing; they were calculating how much material they could actually pass from the Soviet Union to the European countries, while the Air Force researchers wanted to cut the network. They actually find the same number if you give them the same weights, because in every flow network the value of the max flow is always equal to the capacity of the minimum cut. It was Ford and Fulkerson who gave this theorem, that in any network the value of the max flow is equal to the capacity of the minimum cut. Here the flow and the cut of the graph are shown on the same flow network.
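The trivial cut, the minimum cut and the max-flow min-cut equality can be reproduced as below. As before, the capacities are an assumption that matches the numbers in the talk, not the slide's actual data.

```python
import networkx as nx

# Assumed network: (tail, head, capacity) triples.
edges = [
    ("s", 2, 10), ("s", 3, 5), ("s", 4, 15),
    (2, 3, 4), (2, 5, 9), (2, 6, 15),
    (3, 4, 4), (3, 6, 8), (4, 7, 30),
    (5, 6, 15), (5, "t", 10), (6, 7, 15),
    (6, "t", 10), (7, 3, 6), (7, "t", 10),
]
G = nx.DiGraph()
G.add_weighted_edges_from(edges, weight="capacity")

# Trivial cut: remove every edge leaving the source.
trivial_cut = sum(c for _, _, c in G.out_edges("s", data="capacity"))

# Minimum cut: A contains the source, B contains the sink.
cut_value, (A, B) = nx.minimum_cut(G, "s", "t")

# Only edges going from A to B count towards the capacity of the cut.
cut_edges = [(u, v) for u, v in G.edges if u in A and v in B]
cut_capacity = sum(G[u][v]["capacity"] for u, v in cut_edges)

max_flow = nx.maximum_flow_value(G, "s", "t")
print(trivial_cut, cut_value, cut_edges, max_flow)
```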
So you can see that the sum of the capacities of the cut edges and the value of the flow are the same, equal to 28.
So what are the applications of network flow? Network flows are used in network connectivity and bipartite matching. Airlines use them to find the minimum number of planes they need to serve all their flights. They are used in image segmentation to separate the background and foreground of an image. They are used in project selection, which we will discuss. They are used in baseball elimination, to eliminate the teams which do not have any chance of going further.
Now we go to the project selection problem. There will be a little bit of mathematics involved. You have a company which needs to carry out some projects. Let's say you have a set of projects P, and with every project i is associated a value p_i, the revenue of the project. If p_i is greater than zero then the project is a profitable project, like you are gaining some new clients, and if p_i is smaller than zero then it is a losing project, like you need to hire some new people. Just like in a real-life scenario, the projects are interdependent: you need to do some projects before doing others, so some projects are prerequisites of other projects. We are given a graph G where an edge from A to B shows that you need to complete project B before doing project A. Now we need to find a subset of projects which gives us the maximum profit: a subset which is complete in itself, meaning we are doing all the prerequisites of the projects that we take, and whose profit is maximum. So we are looking for the maximum profit and the projects we need to do.
So how will we reduce the problem to network flow? First of all, in the problem we just discussed there is only a set of projects we want to do; there is no obvious source and sink. So first we will make a source and a sink for the problem. And now we need to add more edges, more conditions between the projects.
For every project that has a positive revenue, we add an edge from the source to the project with capacity equal to the revenue of the project. For every project that has a negative revenue, we add an edge from the project to the sink with capacity equal to the negative of the revenue; since the revenue is negative, the capacity of the edge remains positive. And for the edges that were given to us in the graph, the edges that represent prerequisite relations between the projects, we set the capacity to a really high number. When we are finding the cut set of the network we do not want to cut those edges, because if we cut a prerequisite edge we will not get a feasible set of projects: it means we are missing some prerequisite of a project that we are taking. So we make the capacity of these edges a really high number so that they are not included in the cut set. If we make it larger than the sum of the positive revenues, it will be fine.
So now we have represented the problem as a flow network with three types of edges. The first type goes from the source to the projects which have positive revenue. The second type is the prerequisite edges between projects. And the third type goes from the projects with negative revenue to the sink.
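The construction with its three edge types might look like this in code. The function name `build_project_network`, the node names `"s"` and `"t"`, and the sample revenues are all hypothetical; the prerequisite capacity is set to the sum of positive revenues plus one so that it is strictly larger than any cut worth considering.

```python
import networkx as nx

def build_project_network(revenue, prerequisites, source="s", sink="t"):
    """Build the flow network for project selection.

    revenue: dict mapping project -> revenue (positive or negative).
    prerequisites: list of (a, b) pairs meaning b must be done before a.
    """
    total_positive = sum(p for p in revenue.values() if p > 0)
    big = total_positive + 1  # "really high" capacity for prerequisite edges

    G = nx.DiGraph()
    G.add_nodes_from([source, sink])
    for project, p in revenue.items():
        G.add_node(project)
        if p > 0:
            G.add_edge(source, project, capacity=p)   # type 1: source -> profitable
        elif p < 0:
            G.add_edge(project, sink, capacity=-p)    # type 3: losing -> sink
    for a, b in prerequisites:
        G.add_edge(a, b, capacity=big)                # type 2: prerequisite edges
    return G, total_positive

# Made-up instance: A needs B, C needs D.
revenue = {"A": 10, "B": -4, "C": 6, "D": -8}
prerequisites = [("A", "B"), ("C", "D")]
G, total_positive = build_project_network(revenue, prerequisites)

for u, v, c in G.edges(data="capacity"):
    print(u, "->", v, "capacity", c)
```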
So now let's see what the cut value of this graph will be; this is the bit of mathematics. We have three types of edges in our graph, and we will definitely not be cutting the prerequisite edges, because they have a really high capacity. So we will be cutting two types of edges: those going from the source to a project, and those going from a project to the sink. Take A to be the set of projects on the source side of the cut. When we cut an edge going from the source to a project, that project has positive revenue and is not in A, so it contributes p_i to the cut; this is the second term in the first line. And if we include a project with negative revenue because of the prerequisite requirement, then we cut it from the sink side, so it contributes minus p_i; this is the first term of the cut. We can expand the second term: the sum of p_i over positive-revenue projects i not in A equals the sum of p_i over all positive-revenue projects in P, minus the sum of p_i over the positive-revenue projects in A. Combining that with the first term, the cut value reduces to the sum of p_i over all i in P with p_i greater than zero, minus the profit of A, because if I take the sum of all the p_i in our subset A, positive and negative together, that is the profit we get from A. Now, since the first term is a constant for a given set of projects, the cut value is a constant minus the profit of A. So if we minimize the cut, we maximize our profit. So now we have proved that the minimum cut of the graph gives us the set of projects with the maximum revenue.
Now we will go for an example. Here there are some projects, some with positive revenue and some with negative revenue, and there are interdependencies between them. We need to select a feasible set of projects which has the highest revenue. I guess it's not possible to do it by hand, or in your head just by looking at it, so we will
just represent the whole graph in NetworkX and use the minimum cut function of NetworkX to find a solution. For the network flow we will be creating a directed graph. It's line 44; I cannot point at it. So in line 44 we create a directed graph, and for the image I've shown here I'm adding the edges manually. If you are using NetworkX from some script, you will not be adding the edges manually. So I added all the prerequisite edges pairwise. We needed to set the capacity of these edges to a really high number, and NetworkX does that by default: I have not specified the capacity of these edges, so it will consider their capacity to be infinite. In line 46 I have calculated the sum of all the p_i that are greater than zero in our set P; that is the constant C that we found in the formula we derived. Then if we find the minimum cut, nx.minimum_cut from source to sink, we see that the value of the cut is 21 and we get two sets, with 4, 10, 11, 12, 14, -13 and -3 on the side of the source. The set with the source is the set we should choose; it is the feasible set which has the maximum profit. And if we subtract the minimum cut from the constant C we get our revenue: 43 is the total revenue you can get while choosing projects from this graph. If you choose all the projects on the left side of minus one, we are good to go.
Okay, that's the end of the slides. Any questions? Yes, that's the end of the talk, any questions? So we have time for a few questions, if anyone has any. Well, thanks again. We can continue in the coffee break, and do remember to vote. Thank you.
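The approach from the last part of the talk can be sketched end to end on a small instance. This is not the slide code: the project names, revenues and prerequisites are invented, and a brute-force search over all subsets is added only to confirm that the constant C minus the minimum cut equals the best feasible profit. As in the talk, the prerequisite edges carry no `capacity` attribute, which NetworkX treats as infinite capacity.

```python
from itertools import combinations
import networkx as nx

# Made-up instance: (a, b) means project b must be done before project a.
revenue = {"A": 10, "B": -4, "C": 6, "D": -8, "E": 5}
requires = [("A", "B"), ("C", "D"), ("E", "B")]

G = nx.DiGraph()
for a, b in requires:
    G.add_edge(a, b)  # no capacity attribute: treated as infinite
for project, p in revenue.items():
    if p > 0:
        G.add_edge("s", project, capacity=p)
    elif p < 0:
        G.add_edge(project, "t", capacity=-p)

# The constant C from the derivation: sum of the positive revenues.
C = sum(p for p in revenue.values() if p > 0)

cut_value, (source_side, sink_side) = nx.minimum_cut(G, "s", "t")
chosen = source_side - {"s"}            # projects to carry out
profit = sum(revenue[p] for p in chosen)

def feasible(subset):
    """A subset is feasible if it contains every prerequisite it needs."""
    return all(b in subset for a, b in requires if a in subset)

# Brute force over all subsets, only practical for tiny instances.
best = max(
    sum(revenue[p] for p in subset)
    for r in range(len(revenue) + 1)
    for subset in map(set, combinations(revenue, r))
    if feasible(subset)
)

print(sorted(chosen), profit, C - cut_value, best)
```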