I'm going to be talking about the traveling salesman problem, and actually a special version of the traveling salesman problem, which I will introduce in the first few slides. This is a very nice piece of work that came out in April this year by Tobias Mömke and Ola Svensson, and Amit actually presented pretty much the same talk at our Mysore Park meeting. So to those of you who were there, apologies, I'm repeating it, but the set of slides is so nice that I did not want to even modify it. Let's see now. Okay, so what's the traveling salesman problem? You're given a graph, and let's say there are some lengths on these edges, and one way of defining it is to just find a tour which visits all the vertices of this graph. And this should have the smallest length. So maybe this is such a tour. Does it visit all the vertices? No, right? There's this one left in the middle. So we'll come around and visit it. So I'm getting rid of one requirement. Some of you might have read it as a tour which visits every vertex exactly once. That's just not particularly necessary. You just want a tour which visits all vertices and has the smallest length. You can also view it as a Hamiltonian cycle in the metric closure of the graph, and then you can impose the requirement that every vertex is visited exactly once. I add those edges which are not there to make the graph complete, and the lengths of those edges are just the lengths of the shortest paths between their endpoints. Then this can be your tour. And as you can see, if I replace those red edges by the shortest paths, then I get the same tour that I showed you earlier. Okay, so that brings us to the particular version of the traveling salesman problem we are going to be considering today. That's called the graphic traveling salesman problem. You can also think of it as the unweighted traveling salesman problem, which means that you're given an unweighted graph, okay?
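As a minimal sketch of the metric closure view described above: the Python below computes all shortest-path lengths and then brute-forces the cheapest Hamiltonian cycle in the completed graph. The function names and the small star graph are illustrative assumptions, not something from the talk, and the brute-force search is only meant for tiny graphs.

```python
from itertools import permutations

def metric_closure(n, edges):
    """All-pairs shortest path lengths (Floyd-Warshall) for an undirected weighted graph."""
    INF = float("inf")
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

def shortest_tour_length(d):
    """Brute-force the cheapest Hamiltonian cycle in the closure (tiny graphs only)."""
    n = len(d)
    best = float("inf")
    for perm in permutations(range(1, n)):
        order = (0,) + perm + (0,)
        best = min(best, sum(d[a][b] for a, b in zip(order, order[1:])))
    return best

# Hypothetical example: a star with centre 0 and three unit-length leaves.
star = [(0, 1, 1), (0, 2, 1), (0, 3, 1)]
closure = metric_closure(4, star)
tour = shortest_tour_length(closure)
```

In this example every leaf-to-leaf edge of the closure has length 2, and replacing those edges by the underlying shortest paths gives a walk in the star that uses each original edge twice.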
And once again, the problem is the same: find a tour which visits all the vertices. Since all the edge weights are unity now, you want to find a tour which has the smallest number of edges, right? So that kind of makes it simpler. And if you look at the edges that this tour visits, you can think of it as an Eulerian subgraph of this graph. Actually, you could have multiple edges, so we'll call it an Eulerian multigraph. So what you are interested in is really a subgraph of this graph, which could have edges with multiplicities, and which is connected and Eulerian, right? Everyone follows this, right? So I'm trying to put it in a graph-theoretic framework so that we can then use some graph-theoretic tools and techniques to try and solve the problem. So you're given a connected graph, and you want to find a subgraph. You're allowed multiple edges, so you could make multiple copies of the edges. But then this should be Eulerian, right? Why Eulerian? Because you want to be able to visit all vertices and come back to the starting point, okay? And you want to use a small number of edges, okay? Good. Some history: the problem is APX-hard, shown by Papadimitriou and Vempala, so it's hard to approximate. Is it hard for the graphical version as well? Yeah, because if you have a Hamiltonian graph, right? So the optimum tour should be n and you would need at least n plus, okay, not completely clear. No, but it is, I think, okay? It is hard, yeah, it's okay. For the traveling salesman problem, for the metric version, right? So we are assuming that the graph is complete and the edge lengths satisfy the triangle inequality. The best result that is known is a 1.5 approximation by Christofides. Many of you might have even read this in your undergraduate courses.
So this is from 1976, and today Manindra was saying that for 24 years there was no work on the problem. I have a problem here where there was no work for 35 years. I don't know whether we need to be proud of this or it's that we are not working hard enough. So actually this bound stays, right? Nothing that I'm talking about improves the metric TSP bound; it still stays at 1.5. But for the special case of the graphical TSP, this Mömke-Svensson result, and another result which I'll refer to in a minute, have managed to reduce this bound below 1.5 for the very first time. So some improvement after 35 years, but for a special case of the metric TSP, really. Related to this is the Held-Karp relaxation; we'll also be referring to this. This can be thought of as a linear programming formulation for the problem, and it gives a lower bound on the optimum value. It's conjectured that this lower bound is not so far away from the optimum value, right? So the conjecture is that the gap between the optimum value and this lower bound is at most 4 by 3. Again, this is an old conjecture, and all we know about this gap is that it is at most 1.5. Now the graphical TSP that we're talking of, it's not that this was first considered in the paper by Mömke and Svensson. Gamarnik et al. looked at this problem and, in fact, to be able to prove something, they looked at an even more special case, right? So you're talking of an unweighted graph, but let's say it's a cubic graph, right? So cubic and 3-edge-connected. So what's cubic? Every vertex has degree 3, okay? For that, they could argue a 1.487 approximation. This was in 2005, and this was improved by two sets of papers. So Boyd et al. improved it to 4 by 3, and we also had a result improving it to 4 by 3. For sub-cubic graphs, this was at 7 by 5, okay?
And as I said, there were two sets of results, both of them appearing in FOCS this year. The first was by Gharan, Saberi and Singh, which made a very, very tiny improvement from the 1.5 bound, right? Again for graphical TSP, right? So a very tiny improvement, and this is a long 40-page paper. What I'm going to show you is the Mömke-Svensson paper. They improved it by a larger amount, to 1.46, and this has subsequently been improved by another paper to 13 by 9. This is by Marcin Mucha. This paper of Mömke and Svensson also gives a very simple 4 by 3 approximation for the cubic case. So I'll be able to show you complete details of that. So very simple. Sub-cubic, sorry, okay? So sub-cubic is vertices of degree 2 or 3. Yes, sub-cubic. For cubic, it could be better than 4 by 3, yeah. But the tightness example for the Held-Karp bound, which I'll show you, is a sub-cubic graph, yeah, exactly. Yeah, for cubic, it could potentially be better than 4 by 3. So just to give you a little bit of an idea as to what people were trying and why this paper is interesting: I'll come to that, right? This paper is showing a bunch of new ideas, which is what is exciting. And there is hope that many of those ideas can perhaps be extended or taken further. So this is the broad outline of the talk. For those of you who are rusty on the Christofides algorithm, I'll quickly cover the Christofides algorithm in two slides. And then I'll show you a 4 by 3 approximation for cubic graphs, okay? Then to take it beyond cubic graphs, we'll need to introduce some ideas, and these are the ideas of removable edges and removable pairs and how one can find a large number of such pairs. And then we'll use that to get a 4 by 3 approximation for sub-cubic graphs. So till this point, I'll be able to show you full details. I'll also show you what the Held-Karp relaxation is.
And the extension to general graphs I'll leave at the level of the theorem, giving you the ideas, because it's unlikely that we'll have the time to go into the details there. So, a quick review of the Christofides algorithm. Please feel free to stop me at any point. I have plenty of time, right? So how does the Christofides algorithm go? Remember, Christofides actually works even for the metric TSP, right? So it's not just the graphical TSP. Metric means you think of a complete graph and you are given edge lengths which satisfy the triangle inequality, right? You find a minimum spanning tree there. And since the optimum solution, which is a cycle, minus an edge is also a spanning tree, the minimum spanning tree costs no more than the optimum. So this has cost at most OPT, the optimum TSP tour. And remember, we want to find something Eulerian, right? So is this Eulerian? No, because this has plenty of odd-degree vertices. So what we will try to do is to take these odd-degree vertices and make them even degree. So those are the odd-degree vertices, the red vertices. And how do we make them even degree? We'll put a matching on top of them, right? So standard stuff. So let's say this is the matching. Now this is an Eulerian graph. And so we'll just walk along this Eulerian graph and that's our tour, right? That's the tour we are looking for. So we need to say how much this matching costs. And the cost of the matching is at most half of OPT, where OPT is the optimum tour. And there is a very simple argument for that. Right? So the argument is that you look at this optimum tour and you look at the odd-degree vertices on that optimum tour. So there were these four odd-degree vertices, and these four vertices are marked on the tour. There are two different matchings, right? If you look at the red set of edges, they form a matching, and the blue set of edges, they form another matching, right?
Of course, you might be worried that there might be some intermediate vertices. But remember we have all edges added, right? We are looking at the metric completion of the graph. So there would be some direct edges also, right? And so we have really two matchings whose total length is at most the length of the tour. So one of the matchings has to have length which is at most half the tour. And so the minimum weight matching has to have length which is at most half the tour. So that half, together with the one from the cost of the minimum spanning tree, implies that the matching plus the MST has cost at most 1.5 times the optimum. Yes, will there always be an even number of odd-degree vertices? Yes, right? Because the sum of the degrees is an even number. So there will always be an even number of odd-degree vertices. So one important thing here, and which is why I think we were stuck for a very long time, is that we took a tree and we added edges to it, right? We didn't remove anything from what was there earlier. We just added stuff to it and we got this 1.5. So one of the big ideas in this Mömke-Svensson paper is a way of also being able to remove stuff from what you did earlier, right? Remember your max flow algorithm. You take a path, right? And if I were to tell you that once you've sent a flow along the path you can't change it anymore, you can't compute a maximum flow. You're stuck, right? So the whole idea is this idea of augmentation, being able to modify things that you did earlier, right? That's very critical to be able to argue that you can compute max flow. Similarly in matchings, many of these algorithms need this idea of augmentation. You might have made mistakes in earlier steps, but you need to have a mechanism to be able to correct that. This paper sort of provides that framework.
Of course, maybe it's not completely there, because we are not yet at the bound we would like to be, but that's what I find exciting about this paper. Okay, so now we are going to get to this four by three approximation for cubic graphs. So the first thing is that we can always assume that our graph is two-connected, right? That there is no such black vertex in the middle. Why? If there is a cut vertex, so that's a cut vertex, whose removal breaks the graph into components, what will a tour have to look like? It will have to be a tour in each of the components, right? So I'll just remove that vertex, or rather make multiple copies of that vertex, and solve the problem independently in the various components. So we can assume our graph is two-connected. So now we have a cubic two-connected graph: a graph where there's no cut vertex and every vertex has degree exactly three. That's going to be important for our argument, right? So that is an example of a cubic two-connected graph. And what do I want? Is it an Eulerian graph? What's an Eulerian graph? Every vertex has to have even degree. Every vertex here has odd degree, right? So what should I do to make it an Eulerian graph? Add a matching. If I just take a perfect matching and I add it to this, it becomes an Eulerian graph. So I could do that, except, sorry, too many edges. Okay, so the first thing is, it does always have a perfect matching. It's called Petersen's theorem, yeah, but why, yeah, okay. So we will see an argument for that. I'll give you a linear programming argument for it, right, but for now just believe that every cubic two-connected graph, and the two-connected is important, so is the cubic, has a perfect matching. Okay, so if I add this perfect matching to this graph, it becomes Eulerian. Now what's the problem?
There are just too many edges in this graph, right? How many black edges are there? If I have a cubic graph on n vertices, how many edges will it have? Three n by two, yes? Why? Because every vertex has degree three. Three times n is the total degree, right? And divide by two because we're counting every edge twice. Three n by two is the number of black edges. The size of a perfect matching here will be n by two. So we have two n edges in this graph. What did we want to show? Four by three n, right? So we are very far off still. Can you remove some edges? In here, can you tell me some edges which we can remove potentially, right? So I can remove both of those two edges, yeah? Can you see that? So if I remove edges which are parallel edges, then the graph remains Eulerian still, because it will reduce the degree on both ends by two. But in doing that, I have to ensure that it remains connected, right? So in fact, I could remove that pair. And yeah, I think we could also remove other pairs as well. So that's the key idea. And we'll see how to work with that to get this four by three n now. Okay, so I need a little bit of matching theory, in fact just this one slide perhaps. So suppose I give you a graph and I want to look at all the perfect matchings in the graph, right? Take a vector which corresponds to the edges in the graph, so one entry for each edge. And there will be a one at all those places which correspond to the edges in a matching. So this is a vector whose dimension is the number of edges, and I want to look at all perfect matchings and I want to look at their convex hull. That's called the matching polytope, okay? And Edmonds had a very nice description of it. So this is one way of describing a polytope, right? By specifying what the vertices of the polytope are.
Another way of describing a polytope is by specifying what the facets of the polytope are. And Edmonds' theorem for the matching polytope says that these inequalities equivalently define this polytope, okay? So what are the inequalities? Associate a variable with every edge; let's say there is a variable x_e with edge e. If I look at a vertex and I look at the edges incident to the vertex, and it's a perfect matching, then exactly one of those variables has to be one. So I can write a constraint which says that the sum of these variables should be one. But this is not sufficient, so another inequality is required, which is called the odd set constraint. And that basically says the following. Take any odd set of vertices, a subset of vertices which are odd in number, okay? Then what can you say about this? So let me draw that. If I have an odd set of vertices, then a perfect matching cannot just match vertices inside. At least one of the vertices has to get matched outside. Which means that if I look at all these edges which are going across this set, right, the sum of the x_e values of these edges should be at least one. Not exactly one, right? There's no reason why a matching should only match one of these vertices outside. It can match all the vertices outside for all we know. But at least one of them needs to get matched outside. And so that's that constraint there: x(delta(S)) is at least one, where delta(S) refers to the edges going across the set S, and this is for all odd sets. We can't say this for an even set, right? Because for an even set, a matching can have no edges going across the set. So this is an equivalent characterization.
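Written out, the description just given reads as follows. The shorthand x(F) for the sum of x_e over the edges e in F is an assumed notation, and the nonnegativity constraints are part of the standard statement of Edmonds' theorem even though they were not spoken aloud.

```latex
\begin{align*}
x(\delta(v)) &= 1 && \text{for every vertex } v \in V,\\
x(\delta(S)) &\ge 1 && \text{for every } S \subseteq V \text{ with } |S| \text{ odd},\\
x_e &\ge 0 && \text{for every edge } e \in E,
\end{align*}
\qquad \text{where } x(F) = \sum_{e \in F} x_e .
```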
So if you look at these facets, they define a convex region, and the extreme points of this convex region will be exactly the perfect matchings of your graph, right? It's not very difficult to prove either, but we'll not be doing it here, okay? So there are two ways of defining any polytope. One is, as I said, by specifying what the vertices of the polytope are, and the other is by specifying what the facets that make up the polytope are. And this specifies the facets, and the vertices are specified by the incidence vectors of the perfect matchings. So this is the theorem by Edmonds, that for that polytope, any vertex corresponds to a perfect matching, okay? And actually now we can use this to prove what I had said earlier. What did we say? Every two-connected cubic graph has a perfect matching, right? So can anyone say why? Okay, so the proof is here, right? Pretty much. So take a cubic two-connected graph, and give every edge a value of one third. Now you must be wondering, we're talking of matchings, we're talking of zero-one values, where does this one third come from, right? Okay, bear with me. So if I have this point where every edge is assigned a one third, and it satisfies all those constraints, what does that mean? That point lies somewhere inside. Yeah, because it satisfies all the constraints, right? So it's within the polytope. But any of these points can also be represented as a convex combination of the vertices, right? So what does that mean? Since it can be represented as a convex combination of the vertices, the polytope has vertices, so there are perfect matchings in this graph. Okay, so that's what we'll use. So, okay, why is this valid?
If I give all edges a value of one third, why does it satisfy all the constraints? So what were our constraints? That the edges incident to a vertex sum up to one. Would that be the case? Yes, because there are exactly three edges incident to every vertex. If I look at an odd set, the sum of these values should be at least one. Is this true? Why is this true? So I could have an odd set which has only one edge going across it. Can I have such an odd set? Why not? It's two-connected, right? Either of the two endpoints of that edge would be a cut vertex. Okay, I might have an odd set which has only two edges going across it. Then I have one third plus one third, and this is not summing up to one, which is what I want. So can I have an odd set which has only two edges going across it? The number of edges going across should have odd parity. Why? Because all of these vertices here have odd degree, and these are an odd number of vertices. So the sum of their degrees is odd. The edges inside will contribute an even amount. So the number of edges going out should be odd. So it'll be either one, or three or more. And if there are at least three, we are done, right? Then the sum is at least one, good. So this is a feasible solution to this set of constraints. If S is an odd set, it should have at least two edges going across it because the graph is two-connected, and the number of those edges should be odd. And so it satisfies all the constraints. And that means that this point, which is one third, one third, one third, sits inside this polytope. And if it sits inside this polytope, it can be expressed as a convex combination of matchings; let M1 through MK be the matchings which it is a convex combination of. Remember these vertices are all perfect matchings. So it's a convex combination of perfect matchings. What does it mean that it's a convex combination of perfect matchings?
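To make the feasibility argument and the convex combination concrete, here is a small brute-force sketch in Python. The example graph (the triangular prism, which is cubic and two-connected), the function names, and the particular choice of weights are all illustrative assumptions; the weights were picked by hand for this one graph and are not produced by any general procedure.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical example graph: the triangular prism (cubic and two-connected).
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (0, 3), (1, 4), (2, 5)]
VERTICES = range(6)
RUNGS = {(0, 3), (1, 4), (2, 5)}  # the three edges joining the two triangles

def boundary(subset):
    """Edges with exactly one endpoint in subset."""
    s = set(subset)
    return [e for e in EDGES if (e[0] in s) != (e[1] in s)]

def satisfies_edmonds(x):
    """Check the degree equalities and the odd-set inequalities by brute force."""
    for v in VERTICES:
        if sum(x[e] for e in boundary([v])) != 1:
            return False
    for size in range(3, len(VERTICES), 2):
        for subset in combinations(VERTICES, size):
            if sum(x[e] for e in boundary(subset)) < 1:
                return False
    return True

def perfect_matchings(vertices):
    """Yield every perfect matching on the given vertices as a list of edges."""
    vertices = list(vertices)
    if not vertices:
        yield []
        return
    v = vertices[0]
    for e in EDGES:
        if v in e:
            u = e[0] if e[1] == v else e[1]
            if u in vertices:
                rest = [w for w in vertices if w not in (u, v)]
                for m in perfect_matchings(rest):
                    yield [e] + m

# The all-one-third point satisfies every constraint.
third = {e: Fraction(1, 3) for e in EDGES}
feasible = satisfies_edmonds(third)

# Hand-picked multipliers for this graph: weight 1/3 on each perfect matching
# that uses exactly one rung, weight 0 on the matching made of all three rungs.
matchings = list(perfect_matchings(VERTICES))
weights = [Fraction(1, 3) if len(RUNGS & set(m)) == 1 else Fraction(0) for m in matchings]
marginal = {e: sum(w for w, m in zip(weights, matchings) if e in m) for e in EDGES}
```

With these weights every edge receives total weight exactly one third, which is the property the rounding argument relies on.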
That means that I can associate multipliers, lambda one, lambda two and so on up to lambda K, with these matchings, such that if I look at an edge and I look at all the matchings it is part of, then the sum of those lambda i values is going to be exactly one third. That's what saying that this is a convex combination of those matchings means: that this all-one-thirds vector is lambda one M one plus lambda two M two and so on up to lambda K M K. Any questions? This is all I'll require actually from the polytope idea. The idea being just that there is now a set of matchings. So if you did not follow this Edmonds polytope business, then just remember this: that there is some set of matchings, a small number of them, and I have these multipliers with these matchings, such that if I look at an edge, then it appears in one third of the matchings, one third counted in a weighted sense. So now what's our algorithm going to be? We are just going to take a random matching. We are going to pick a matching with probability proportional to its lambda i value, and add it to the graph. What was that? So remember, we have not done anything yet. So you must be wondering what has changed. E is still three n by two edges, by the way. We have not done anything there. We had a cubic graph, so it has three n by two edges. M is still n by two edges. So E union M is still two n edges. But now with this idea, I'll be able to remove some edges out of there. The total number of edges is still two n. So now what can we do to remove edges? We'll take a DFS tree. Let's say that's our DFS tree. The edges in red are the back edges. So those are not the matching. So this is a DFS tree on the original graph. So for a minute, forget the matching that we've added. That's not there. We just take the original graph, which has three n by two edges in it, the cubic two-connected graph, and do a DFS. And what can I say about the matching? It's a perfect matching.
So every vertex is part of the matching; there's an edge in the matching which is incident to each vertex. And now there are three possibilities. So remember the red edge is the back edge. I'm looking at the vertex v, and one of the three edges incident to this vertex is in the matching. What are the three edges? One is the edge to the parent, one is the edge to the child, and one is this back edge. One of these three edges is in the matching. So the green edge is the matching edge. Of course, the picture need not look like this always. What is the other possibility? It has a parent edge and two child edges, right? But that is the other one, and you'll see that they are similar. So the green edge, which is the edge of the matching, could either be the parent edge or the child edge or the back edge, right? Those are the three cases we've looked at there. And now, note that in cases two and three, I can actually remove those two edges. Why can I do that? So remember I said, if I have two parallel edges, I can remove them, provided the graph remains connected after that, right? If I remove these parallel edges, the property of the graph being Eulerian will not get violated, because I'm reducing degrees by two, but it might disconnect. But because of the back edge being present there, right, it will not disconnect, okay? So in that case, I can remove them. And also in this case, I can remove both of these edges, okay? And so now let's ask ourselves, how many edges can we remove, right? What have we done? Okay, so let's understand. We started with a graph which was cubic and two-connected, and we added a matching. What was the matching we picked? We picked the matching according to this distribution, right? We added this matching and then, okay, what are we doing? We're looking at a back edge in the graph, okay? That's important.
So I'm actually not looking at these kinds of vertices where there are no back edges coming in. I'm really looking at vertices where there is a back edge coming in. How many back edges are there? There were three n by two edges in the graph, n minus one of them are tree edges and the remaining are back edges, so roughly n by two back edges. So I'm really looking at the place where the back edge is going to. At that vertex, I'm saying in two-thirds of the cases, I can remove two edges. Why do I say two-thirds of the cases? That's where this distribution comes from, right? In each of these matchings, we will have one of these three cases: the green edge is one of these three edges. And each one of them appears with probability one-third. And in two of the three cases, I can drop not just one edge, but two edges. So that's the number of edges that we can save, right? Where is the n by two coming from? The n by two is coming from the back edges, right? Because we're looking at those vertices. Look at the back edges and look at the vertex where each edge goes to. There are n by two back edges, so that is the number of places, right? With probability two-thirds, I can actually save two edges, because I'm removing edges in pairs. That gives me two n by three. That's the expected savings. And so the number of edges that remain is four n by three. Okay, good. So no, instead of looking at a vertex v, so what is vertex v? Look at a back edge and look at where it is going to. It goes to a unique vertex, right? We are in the regime of cubic graphs. Except for the root, I cannot have two back edges coming to the same place, yeah? Because if I have two back edges coming to one vertex, then that vertex will have degree more than three.
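The counting in the argument above can be written out as a short calculation. This is a sketch under the talk's simplifications: the number of back edges is taken as n/2 (exactly, it is 3n/2 - (n - 1) = n/2 + 1) and the root is ignored, so the figures hold up to an additive constant.

```latex
\begin{align*}
|E| + |M| &= \tfrac{3n}{2} + \tfrac{n}{2} = 2n,\\
\mathbb{E}[\text{edges removed}] &\approx
  \underbrace{\tfrac{n}{2}}_{\text{back edges}} \cdot
  \underbrace{\tfrac{2}{3}}_{\text{good cases}} \cdot
  \underbrace{2}_{\text{edges per removal}} = \tfrac{2n}{3},\\
\mathbb{E}[\text{edges remaining}] &\approx 2n - \tfrac{2n}{3} = \tfrac{4n}{3}.
\end{align*}
```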
So forget the root for now, which can have two back edges coming in. Except for that, there is only one back edge coming to a place. Now look at that vertex, right? If there is a back edge coming there, the picture is really that way, right? There is one parent edge, there is one child edge, and there's one back edge, right? And then there are exactly those three cases you have to worry about, right? And how many such vertices are there which I can be looking at? I cannot be looking at all the n vertices of the graph. It's the number of back edges that I have. So that's why the first n by two shows up. There are these n by two places where I am saving, right? Okay, so this is actually a very simple proof compared to what had been done. Although it seems like we've done a very, very special thing, right? We said cubic graphs only, and for that we've argued four n by three. So let me give you a little bit of an idea as to what the other papers were. The four n by three bound is not new, but this is a very much simpler proof than what was known earlier. Take the earlier paper by Gamarnik. What people have been trying, since we are really in the graph-theoretic setting, is to find what are called cycle covers in the graph, right? So what is a cycle cover? You cover all the vertices of the graph with cycles. Okay, now why is this useful? If I can find large cycles to cover all the vertices, then I'm sort of done, right? So suppose each of these cycles were of length at least six. Then how many cycles are there? At most n over six, right? How many edges are there in these cycles? n. And is this an Eulerian graph? Eulerian, yes, but not connected, so I need to connect things up. So I take edges between these cycles, I take any tree connecting them, and I double up each of these edges. Sorry, this is not a tree, okay? So how many edges am I adding?
n by six cycles, so at most two times n by six edges I'm adding, and so that's four n by three, right? Except, can you find a cycle cover which contains only cycles of length six or larger? That's hard to do. So the earlier line of work was trying to say, okay, well, that's hard to do, but if you have small cycles, you will also have large cycles, right? So it is trying to do some kind of an amortization across that. That was the Gamarnik result, and Boyd's result was along a similar line. In our result we were actually able to find cycles of length six or more, okay? And so that's how we got this bound. But this takes a completely different approach, and it's much shorter, I would say, and can be extended, that's the best idea. Okay, so now if I have to take this further, even if I have to go from cubic to sub-cubic, what is it that I need to do? So there's this idea of removable pairs, which I'm going to talk about. So the idea is that your graph is two-connected, and you pair the edges up. R is going to be some subset of edges, okay? And P is going to be pairs of edges from this subset, right? So P is a subset of R cross R. So for instance, the two red edges form a pair, the green edges form another pair, and the blue edges form another pair. The pairs are disjoint, that's important, but there could be edges in R which are not part of any pair. So I might have this purple edge, which is not part of any pair, okay? So I need to say a little bit more about what R is and what P is. So first, the pairs are disjoint, which means that every edge of R is in at most one pair. And the edges of a pair are incident to the same vertex, which has degree at least three. That will be another property of this pairing.
And the third critical property is that if I remove some edges of R, okay, I'm not removing all the edges of R, such that from every pair I remove at most one edge, then the graph stays connected. The graph does not disconnect, okay? So you can see roughly where this is coming from, right? We were trying to remove pairs of edges such that the graph remains connected, right? So it's taking that idea and extending it forward. Now in a more generic setting, it's saying, okay, R could be any subset of edges, or rather not any subset of edges. You'll have to carefully choose R, and there are some pairs that we define on the edges of R. These are disjoint pairs, and they satisfy these three properties, right? So the one important thing that I should point out here is that you could have a set of edges R and no pairs defined. But then this would be a valid set of removable edges if and only if, on removing all the edges of R, the graph still remains connected. See the way we have defined it, okay? So let's look at this picture that we have. Is it a valid pairing? Is it the case that I can remove any subset of edges of R, such that the subset contains at most one edge of each pair, and the graph remains connected? Would that be true? So for instance, okay, I need this pointer now, right? If I were to remove this red edge, this green edge, and this blue edge, that's perfectly okay, right? These are back edges and I could have removed them and the graph would remain connected. But if I had removed this red edge here, and this green edge and this blue edge, would the graph still be connected? It is, in fact. So this drawing is a valid drawing in that sense. And in fact, if there is this edge, this could also be added to R. I have not paired it up with anything, but this could still be there, right?
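The three properties can be checked by brute force on small examples. The Python sketch below is only an illustration: the function names and the example graph (a 4-cycle with one chord) are assumptions rather than the picture on the slide, edges are written as tuples with a fixed orientation, and the check is exponential in the number of pairs.

```python
from itertools import product

def is_connected(n, edges):
    """Depth-first search connectivity test on vertices 0..n-1."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n

def is_removable_pairing(n, edges, R, P):
    """Brute-force check of the three properties of a removable pairing (R, P)."""
    degree = {v: 0 for v in range(n)}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    paired = [e for pair in P for e in pair]
    # Property 1: pairs are disjoint and use only edges of R.
    if len(paired) != len(set(paired)) or not set(paired) <= set(R):
        return False
    # Property 2: the two edges of a pair share a vertex of degree at least 3.
    for e, f in P:
        if not any(degree[v] >= 3 for v in set(e) & set(f)):
            return False
    # Property 3: removing edges of R, at most one per pair, keeps the graph connected.
    # Removing fewer edges never hurts connectivity, so it is enough to test the
    # maximal choices: all unpaired edges of R plus exactly one edge from each pair.
    unpaired = [e for e in R if e not in paired]
    for choice in product(*P):
        removed = set(unpaired) | set(choice)
        if not is_connected(n, [e for e in edges if e not in removed]):
            return False
    return True

# Hypothetical example: the 4-cycle 0-1-2-3-0 with the chord (0, 2).
edges = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
pairs = [((0, 1), (0, 3))]  # both edges meet at vertex 0, which has degree 3
good = is_removable_pairing(4, edges, R=[(0, 1), (0, 3)], P=pairs)
# Adding the unpaired edge (1, 2) to R breaks property 3: removing (0, 1) and
# (1, 2) together isolates vertex 1.
bad = is_removable_pairing(4, edges, R=[(0, 1), (0, 3), (1, 2)], P=pairs)
```

The second call illustrates the point made about unpaired edges: an edge may sit in R without a partner only if removing it together with the other chosen edges still leaves the graph connected.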
If I remove this edge together with all the other edges that are there, the graph would still stay connected. And one of the lemmas in the paper is that you can find a TSP tour which contains at most four by three times the total number of edges, minus two by three times the number of edges in the set R. I'll give you an idea of what is happening here. Note that we are trying to go beyond the cubic requirement. In the first thing that I showed you, we had assumed that the graph is cubic. Every vertex is degree three, so there's only one back edge coming to a vertex, and things worked out nicely. But now you want to get beyond that, and for that we need to do this. So even if you don't have the cubic restriction, the lemma is still true, okay? But we love the cubic case, right? So what we'll do is try to make the graph into a cubic graph. And how is that done? Well, if you have a vertex of degree two, you can replace it by this gadget, and as you can see, all the vertices here are degree three. Why did we not replace it by an edge? See, getting rid of a vertex is not good, because you might get a tour which doesn't go through that edge, and now you are in trouble. And in the replacement that we are going to do, we'll try to maintain R and P. So I'm thinking of a graph where I have somehow magically figured out what R and P are, the set of removable edges and the pairs. And I'll now do this transformation, convert it into a cubic graph, so that the property of R and P is still maintained, okay? I'll show what I mean by that in a second. So could these edges here have been part of any pair? Remember the R and P. This was my original graph. Could these edges have been in R or P?
Could they have been in any pair? Remember, we said the edges of a pair have to be incident at a vertex of degree at least three. So they could not have been, and that's why there's nothing to show here. But now consider this setting. There is this higher degree vertex, and these two blue edges are from one pair and these two red edges are from another pair. And no, there's no brown, okay? I have to make it a cubic graph, and one can always do this: you replace the vertex by this structure, where the two blue edges are now incident at the same vertex. Because remember, we want the edges of a pair to be incident at the same vertex of degree at least three. So that's what we are doing here. So I claim this can always be done: you can always start with your original graph, which is not necessarily cubic, and convert it into a cubic graph such that, if I had an initial set of removable edges and removable pairs, they continue to be removable edges and removable pairs in the modified graph. Yes? No, this transformation that we are doing is not with respect to any DFS tree or any such thing. You're starting with your original graph, and somehow you've figured out some R and P there, the set of removable edges and the removable pairs. And I claim you can always transform it into a graph which is cubic, where R and P still have the same property, okay? And now we apply the idea that I mentioned earlier. Since it's a cubic graph, I can have a perfect matching in that graph, I can have this probability distribution over perfect matchings, and I can pick a perfect matching from that distribution. Now look at any pair. Once again, since it's a cubic graph, if I look at this vertex, right?
Now in every matching, one of these three edges needs to be picked, yes? And since one of these three edges needs to be picked, let's do our depth-first search. When we do our depth-first search, one of these edges will be a back edge, right? So exactly what I said before applies here, and you can argue that in two of those three cases you can drop a pair of edges. So the idea is that once you have a cubic graph, you can do that. For instance, if that was the edge that got picked, then you could drop it. Okay, I guess I've gone a bit quickly on this, but this is what gives you this bound. To give you a little bit of motivation: E was the initial set of edges. So where is the four by three coming from? It's coming from the fact that you took a matching and you added it, and the matching has one third of the edges. That's where the four by three comes from, and the minus two by three R comes from the set of edges that you will be able to drop, okay? No, I've not done a good job of this, but I guess we'll have to move on. Now the question is, how does one find all of these removable edges and removable pairs that I'm talking about? The idea is to start with the depth-first search tree. If you have a back edge going to a vertex, then you could, for instance, call this a removable pair. Remember, what is the idea behind a removable pair? That I can pick any subset of edges in which at most one of these edges is there, and then I can drop that subset without violating connectivity. So if this is a back edge, I can always drop it, or I can instead drop this tree edge, and still keep the graph connected. If you have two such edges, then this cannot form a removable pair.
Yeah, you could also call this a removable pair, but it'll be better to actually pair this one up with that one. So form the pairs in this manner: if there are k back edges from a subtree to a vertex v, then one of these can be paired up with the tree edge to form one pair, and all the other back edges just go into the set R. Remember, R is a set of edges you can remove. The back edges can be removed without destroying connectivity. It's the tree edges which we want to save on, right? But when I pair a back edge up with a tree edge, I also have the possibility of removing the tree edge, okay? So now what is it that you want to do? Remember our setting of the two-edge-connected cubic graph. All of our back edges were going to distinct vertices, and things were nice there. I had these removable pairs. Now, in an arbitrary graph, which is not necessarily cubic, I want to pick these back edges in such a manner that they go to distinct vertices of the tree. It's a similar idea, because if there are many back edges coming to the same vertex, I am not saving too much. If there are many back edges coming to this vertex, then this tree edge can be paired with only one of them. But if these back edges were all going to distinct vertices, then I would have the possibility of having many more removable pairs, and I could save on a lot of edges. So what you want to do is pick your back edges carefully so that they are distributed in the tree. That's roughly the idea of the algorithm: the back edges should be well distributed in the tree. And how is that achieved? That's achieved by building a circulation. So let me skip this slide and give you the idea of what the circulation is.
So you want to build a flow. A circulation is a flow where there are really no starting and ending vertices, but every edge has a lower bound and an upper bound on the amount of flow that is to go through that edge. The first entry here is the lower bound and the other is the upper bound. So you want to pick a back edge from here, right? So you put a lower bound of one, which means that I want one unit of flow going down on this edge. And how is that one unit of flow going to be achieved? By actually picking a back edge which goes across. You will have to ship flow up to be able to get one unit of flow down. And that back edge will then get included in the set of back edges you want to pick. So all the back edges which carry non-zero flow will get included. And to ensure that you don't pick many back edges which go to one vertex, you introduce a cost function which is bad when you have many edges coming to one vertex. So you define the cost of a flow as follows: look at the total number of back edges that come here, say that's the total amount of flow coming into this vertex, minus one. That's the sort of cost function you want. Why is this minus one there? You're happy to have one back edge coming in, but you don't want many back edges coming in. So this is a mechanism to distribute the back edges around the tree. This can be done. It is a normal circulation problem, and in fact an optimum solution to it is integral. So you can actually find a flow which minimizes such a cost function and is integral. And why is integrality important? The edges which carry one unit of flow are the edges that we will include in our solution. Those are the ones which define the back edges that get included.
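The cost function described above can be written down in a few lines. This is only a sketch of the cost, not of the circulation solver: the function name `circulation_cost` and the dictionary representation of flow on back edges are assumptions for illustration, and clamping each vertex's contribution at zero (so that a vertex with no incoming back edge costs nothing) is also an assumption that the talk does not spell out.

```python
from collections import Counter


def circulation_cost(back_edge_flow):
    """Cost of a flow, counted on back edges only.

    back_edge_flow maps a back edge (tail, head) to the flow it carries,
    where head is the tree vertex that the back edge comes into.
    Each vertex contributes (incoming back-edge flow - 1), clamped at zero.
    """
    inflow = Counter()
    for (_tail, head), flow in back_edge_flow.items():
        inflow[head] += flow
    return sum(max(f - 1, 0) for f in inflow.values())


# Three back edges into three distinct vertices: cost 0.
spread = {("d", "a"): 1, ("e", "b"): 1, ("f", "c"): 1}
# The same three back edges all entering vertex "a": cost 2.
bunched = {("d", "a"): 1, ("e", "a"): 1, ("f", "a"): 1}
print(circulation_cost(spread), circulation_cost(bunched))
```

The two toy inputs carry the same total flow, so the difference in cost comes purely from how the back edges are distributed over the tree vertices.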
So at the end of this, what you have is: you started with your DFS tree, you built this circulation problem, you solved it, and you got some back edges. And these back edges are nicely distributed around the tree. What do I mean by nicely distributed? Well, if it is necessary, of course there will be multiple back edges coming into a vertex, but hopefully the back edges are distributed around the tree, with only one back edge coming into each vertex. If you could get that, only one back edge coming into each vertex, then the cost of the circulation would be zero, right? Look at the way we've defined it. It's the flow coming into a vertex minus one. There's only one back edge coming in, it carries only one unit of flow, and so the value is zero. So the cost of the circulation is zero. And in that case you would actually get four by three N. We would be in exactly the same setting we had earlier, where there was one back edge and you were pairing these things up and saying that with probability two thirds you will be able to drop two edges from each pair. So you'd get exactly that. If you have a circulation of cost zero, then you get the four by three bound. And in fact one can argue, it's actually quite straightforward but I guess I'm out of time, that the number of edges you'll get is something like this: four by three N plus two by three times the cost of the circulation, okay? So cost of circulation zero means four by three N exactly. And so now the entire game is to argue how small a cost you can get for this circulation. Remember, the cost of the circulation is bad if many back edges have to end up at the same vertex. You want to be distributing these back edges uniformly. And that's the interesting algorithmic idea that is coming out of this paper, right?
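As a small worked example of the two bounds quoted in the talk, the following sketch evaluates them with exact fractions. The function names and the sample numbers are hypothetical and chosen only so the arithmetic comes out in whole numbers. The formulas are taken as stated in the talk ("something like this"), so any lower-order terms in the paper's exact statement are not reproduced here.

```python
from fractions import Fraction


def lemma_bound(num_edges, num_removable):
    """Bound from the removable-pairing lemma: 4/3 |E| - 2/3 |R|."""
    return Fraction(4, 3) * num_edges - Fraction(2, 3) * num_removable


def circulation_bound(n, cost):
    """Bound via the circulation: 4/3 n + 2/3 * (cost of the circulation)."""
    return Fraction(4, 3) * n + Fraction(2, 3) * cost


# Hypothetical numbers: 18 edges with 12 of them in R.
assert lemma_bound(18, 12) == 16
# A 12-vertex graph with a zero-cost circulation gives exactly 4/3 * 12.
assert circulation_bound(12, 0) == 16
# Each extra unit of circulation cost adds 2/3 of an edge to the bound.
assert circulation_bound(12, 3) == 18
print(circulation_bound(12, 0) / 12)  # ratio to n when the cost is zero
```

Since any tour has to use at least n edges, the last line is the four by three ratio in the zero-cost case.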
I remember, many years ago when we were trying to do two-edge-connected subgraphs and things like that, it seemed like a very nice thing to have edges coming in at different places, but we did not really formalize this. And this is a very nice way of formalizing it, of ensuring that edges are coming to different places. And this translates into the statement that if the cost of the circulation is high, so you're not able to achieve this, then the size of your solution is also going to be larger. Okay, so let me skip this quickly. This actually works for subcubic graphs as well now. With this extension, what happens is that in a subcubic graph there will be at most one back edge coming to a vertex, right? There would be vertices with no back edges coming in, having just degree two, but there would be at most one back edge. So the cost of the circulation is trivially zero. And the cost of the circulation being zero means that you still get the four by three N bound that we talked of. Let me skip this, and go straight to the key theorem. So what do they do, algorithmically, for a general graph? How do they bound the cost of the circulation? We said: build a DFS tree and find the smallest cost circulation. But how do you build this DFS tree, and how do you ensure that the cost of the circulation is small? Well, here is something which did not particularly appeal to me. I would never have thought of doing it this way, but somehow it works, right? The way they build the DFS tree is that they solve the linear program, and then at every point they take the edge which has the highest value on it. So you're building a DFS tree, you come to this vertex, and now you look at the edge which has the highest value. And forget this middle node for a second.
So the edges incident to this vertex have values 0.9, 0.2, 0.3, and 0.5. What you'll do is take this edge, which becomes your tree edge, and then you'll go down in the tree, and these other ones might become back edges. So you build your DFS tree with the choice of edge at every point dictated by the x_e value of the edge. x_e is the value of the edge in the linear program, yeah. The Held-Karp linear program, yeah. It's a bit of a puzzle why this is the right thing to do. Of course, the proof is in the pudding, right? They can prove that the cost of the circulation is bounded when they do it in this manner. So what is it that they are able to prove? Let me just go straight to it. What they are able to prove at the end of the day is that the cost of the circulation is at most this quantity, so this times the sum of the x_e. Now, the sum of the x_e is the value of the linear program, right? So remember what we said: the number of edges we'll have is four by three N plus two by three times this, and if you work everything out, it comes to 1.46 times the sum of the x_e, and that's about it. So, as I said, the first part is really, really nice, I would say, where they're building all of this, the idea of removable pairs and removable edges, and the proof for the cubic and subcubic cases is really elegant. That's the best proof we have, and perhaps that's also the best bound for that setting. To bound the cost of the circulation for an arbitrary graph, they have this way of doing it, which gives them this 1.46 bound, but it doesn't seem so canonical, at least to me. So maybe there is room for working on that, and seeing if there are ways of improving it in that manner.
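The LP-guided depth-first search described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `lp_guided_dfs`, the adjacency and value dictionaries, and the small example graph are all hypothetical, and the edge values are made up rather than computed from the Held-Karp linear program. The sketch handles simple graphs only and uses recursion, so it is meant for small inputs.

```python
def lp_guided_dfs(adj, x, root):
    """Depth-first search that always explores the highest-valued edge first.

    adj maps each vertex to a list of neighbours (simple undirected graph).
    x maps frozenset({u, v}) to the value of edge uv (assumed LP values).
    Returns (tree_edges, back_edges); a back edge (u, w) goes from u up to
    its ancestor w.
    """
    order = {root: 0}       # discovery index of each visited vertex
    parent = {root: None}
    tree_edges, back_edges = [], []

    def visit(u):
        by_value = sorted(adj[u], key=lambda w: x[frozenset((u, w))],
                          reverse=True)
        for w in by_value:
            if w not in order:
                order[w] = len(order)
                parent[w] = u
                tree_edges.append((u, w))
                visit(w)
            elif w != parent[u] and order[w] < order[u]:
                back_edges.append((u, w))

    visit(root)
    return tree_edges, back_edges


# Hypothetical graph: vertex "v" has incident values 0.9, 0.2, 0.3, 0.5.
values = {("v", "a"): 0.9, ("v", "b"): 0.2, ("v", "c"): 0.3,
          ("v", "d"): 0.5, ("a", "b"): 0.8, ("b", "c"): 0.7,
          ("c", "d"): 0.6}
x = {frozenset(e): val for e, val in values.items()}
adj = {}
for u, w in values:
    adj.setdefault(u, []).append(w)
    adj.setdefault(w, []).append(u)

tree, back = lp_guided_dfs(adj, x, "v")
print(tree)   # the 0.9 edge out of "v" becomes the first tree edge
print(back)
```

Sorting the neighbours once per vertex is enough here, because after each recursive call returns, the next unvisited neighbour in the sorted list is still the highest-valued remaining choice.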
So, as I said, there has been a small improvement on this already. There's a paper by Marcin Mucha, which follows pretty much the same technology with some minor changes and gets a bound of 13 by nine. There is also some recent work, just a month old, not on the traveling salesman tour problem but on the version where you have specified end vertices, so you just want to find a path from the starting vertex to the end vertex. For that, in the metric case, the best known bound earlier was five by three, and that has been improved to (one plus root five) by two. So that's another recent work. It doesn't go through these ideas. It actually takes the linear program, gets a convex decomposition into spanning trees, picks one of those trees, and adds edges. So there is still this adding edges business, right? There's really no augmentation idea there, so they're not dropping anything that they picked in the first step. I think many of these ideas need to come together for us to get the final four by three bound for this problem. Thank you for your time, and bye.