So I'm going to talk about how to run a near-term quantum computer. My current affiliation is Google; Google goes first because they pay me, and I'm on leave from MIT. First of all, I would like to talk about quantum computing as something where we have inputs, which are strings, and outputs, which are strings. This is my picture of a quantum computer versus a classical computer: a conventional computer takes bit strings in and outputs bit strings. The quantum computer takes spins, which are up or down in the computational basis, and outputs strings, again up or down in the computational basis, and the output is the answer to the question specified by the input. Of course, the difference between the quantum and the classical computer is how you go from the input to the output. The reason I'm putting that up is that I'm not talking here about simulation or quantum chemistry or finding energies. I'm really talking about the kind of traditional problems that computer scientists care about, which are defined on strings. That was the point of that slide. All right. So I would like to talk about combinatorial optimization. Some of this talk is really a review of the quantum approximate optimization algorithm, the QAOA, which I spoke about here two years ago, but I will get to some newer applications and newer ideas. The problem I'm interested in is combinatorial optimization. There I have n bits, each taking a binary value, so there are 2 to the n possible input strings z; z is a bit string. And I have clauses, which are constraints on subsets of the bits: each clause takes a true or false value on a setting of its subset of the bits.
I assign the value 1 if the clause is satisfied by the string and 0 if it is not. Many of you can think about 3SAT, and many, many other problems, in this language. So I have a cost function, which is the sum of the individual clause cost functions, and my goal is to maximize it, which means I really want to find a string that satisfies as many clauses as possible. Now, that is the mother of all computer science problems, and in fact it is NP-hard to find the actual maximum in general. But I'm going to be interested in something a little less ambitious: finding a string which gives a high value of the ratio of the cost function on the string z to the maximum value. This is called the approximation ratio. So when I talk about approximation, I mean it in the very specific sense of looking for a string whose cost approximates the maximum of this function; that's the only sense in which I'm using the word, and I say that because this morning there was a different use of the word "approximate." Of course, if you could achieve a ratio of one, you'd be very happy, but sometimes the best is the enemy of the good, so we really want a good solution, not necessarily the best. We're going to approach this with a quantum algorithm, and I need its building blocks. Remember that I already defined a cost function. I assume we know enough about quantum computing here that I can say this: I can view this cost function as an operator C, diagonal in the computational basis, so when it hits a basis state, it multiplies it by the value of the cost function on that string.
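The objects just defined, clauses on subsets of bits, the total cost C(z), and the approximation ratio C(z)/C_max, can be sketched in a few lines of Python; the three clauses below are a made-up toy instance, not anything from the talk:

```python
import itertools

# Toy clause-based cost function on n = 3 bits: each clause looks at a
# subset of the bits and returns 1 (satisfied) or 0 (unsatisfied).
clauses = [
    lambda z: 1 if z[0] != z[1] else 0,   # bits 0 and 1 disagree
    lambda z: 1 if z[1] != z[2] else 0,   # bits 1 and 2 disagree
    lambda z: 1 if z[0] == z[2] else 0,   # bits 0 and 2 agree
]

def cost(z):
    """C(z) = sum over clauses of the individual clause values."""
    return sum(c(z) for c in clauses)

# Brute-force maximum over all 2^n strings (fine for tiny n).
n = 3
c_max = max(cost(z) for z in itertools.product([0, 1], repeat=n))

def approximation_ratio(z):
    return cost(z) / c_max

print(c_max)                            # 3: e.g. z = (0, 1, 0) satisfies all three
print(approximation_ratio((0, 0, 0)))   # 1/3: only the "agree" clause holds
```

The ratio is always relative to the true maximum, which is exactly the quantity that is NP-hard to compute in general; here it is found by brute force only because n is tiny.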
And I would like to consider a unitary transformation U(C, gamma) = e to the minus i gamma C, where gamma is a parameter. Since C is a diagonal real operator, this is a unitary, and it depends on the parameter gamma. Now, since I said the cost function was a sum, and all those terms commute because everything is diagonal in the computational basis, I can write this unitary as a product. And if the clauses involve only a few qubits each, then each factor is a few-qubit operator, so this operator is a product of few-qubit operators. I'd like to consider another operator, B, which is the sum of the sigma-x's. Then U(B, beta) = e to the minus i beta B, which depends on the operator B and the parameter beta; this really rotates all the spins around the x direction by the angle beta. Since B is made of commuting operators, this too can be written as a product, and each factor is a simple one-qubit operation. So I'm trying to give you the ingredients of an algorithm which is implementable on a near-term quantum computer. My U's are products of few-qubit operators. And then I have a starting state, which is the uniform superposition over all input strings; this is just the eigenstate of B with the highest eigenvalue, that is, all the spins lined up in the x direction. So these are all easy-to-construct ingredients. Now I want to imagine that I specify an integer p, and I don't want you to think of p as big; think of p as small. In fact, I'm going to talk about p equals 1 in a minute. So we have an integer p, and then I have p angles gamma, which I make into a vector, and p angles beta.
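The two claims here, that U(C, gamma) is just a phase on each basis state and that e^{-i beta B} factorizes into one-qubit x-rotations, can be checked numerically. This is a toy two-qubit sketch with a single "disagree" clause, not any particular library's API:

```python
import numpy as np

# Two qubits; basis order |00>, |01>, |10>, |11>.
cost_diag = np.array([0.0, 1.0, 1.0, 0.0])   # C(z) = 1 iff the two bits disagree

def U_C(gamma):
    # C is diagonal, so e^{-i gamma C} is just a phase on each basis state.
    return np.diag(np.exp(-1j * gamma * cost_diag))

X = np.array([[0, 1], [1, 0]], dtype=complex)
I2 = np.eye(2)
B = np.kron(X, I2) + np.kron(I2, X)          # B = sum of sigma_x's

def U_B(beta):
    # e^{-i beta X} on one qubit is cos(beta) I - i sin(beta) X, since X^2 = I;
    # because the terms of B commute, e^{-i beta B} is the tensor product of these.
    u1 = np.cos(beta) * I2 - 1j * np.sin(beta) * X
    return np.kron(u1, u1)

# Compare the factored form against the exact exponential of the Hermitian B.
beta = 0.37
w, v = np.linalg.eigh(B)
U_exact = v @ np.diag(np.exp(-1j * beta * w)) @ v.conj().T
assert np.allclose(U_B(beta), U_exact)
assert np.allclose(U_C(0.5) @ U_C(0.5).conj().T, np.eye(4))   # unitarity
```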
Then I want to construct a quantum state which depends on these 2p parameters: first act with the operator that depends on C and gamma 1, then the operator that depends on B and beta 1, and so on. So I have a sequence of unitary operators acting on an easy-to-construct initial state, producing a parameter-dependent state which I call gamma, beta. The required circuit depth here is at most mp plus p, where m is the number of clauses, and again, think of p as small. Now the idea, which is similar to the variational quantum eigensolvers that Alan talked about this morning, is to choose the parameters gamma and beta to make the expected value of C big. If you can do that, then when you make a measurement of the state, you're going to get a string with a high value of the cost function. So that's our goal. Let's talk about this a little more specifically. Consider the expected value of the cost function in the state gamma, beta; I'll call that function F sub p of gamma and beta. I want to choose my parameters to make that big. So imagine I've achieved that: I've found the maximum over the parameters gamma and beta of this expected value; call it M sub p. First of all, notice that M p is larger than or equal to M p minus 1. That follows immediately, because you can think of M p minus 1 as the maximization at level p with gamma p and beta p set to zero: if gamma p and beta p are zero, I really am working at level p minus 1. So M p minus 1 is a constrained version of the optimization at the pth level, and the inequality is immediate. That means that if I can find the maximum at every p, it only gets better; it never gets worse as I go to higher p.
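Putting the pieces together, here is a minimal statevector sketch of the p = 1 state and of F_1(gamma, beta) for a single two-qubit "disagree" clause. The angles gamma = pi/2, beta = pi/8 happen to reach the maximum for this two-qubit toy, a fact about the toy, not about general instances:

```python
import numpy as np

# p = 1 QAOA state |gamma, beta> for one two-qubit "disagree" clause.
# Basis order |00>, |01>, |10>, |11>; a toy sketch, not a library API.
cost_diag = np.array([0.0, 1.0, 1.0, 0.0])

def qaoa_state(gamma, beta):
    s = np.full(4, 0.5, dtype=complex)             # uniform superposition |s>
    psi = np.exp(-1j * gamma * cost_diag) * s      # apply U(C, gamma), diagonal
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    u1 = np.cos(beta) * np.eye(2) - 1j * np.sin(beta) * X
    return np.kron(u1, u1) @ psi                   # apply U(B, beta)

def F1(gamma, beta):
    """F_1(gamma, beta) = <gamma, beta| C |gamma, beta>."""
    psi = qaoa_state(gamma, beta)
    return float(np.sum(np.abs(psi) ** 2 * cost_diag).real)

print(round(F1(0.0, 0.0), 6))               # 0.5  (the random-guessing value)
print(round(F1(np.pi / 2, np.pi / 8), 6))   # 1.0  (optimal for this toy clause)
```

At gamma = beta = 0 the state is still |s>, so the expectation is the random-guessing value of one half; at the optimal toy angles all the amplitude lands on the two "disagree" strings.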
In fact, we can show that in the limit as p goes to infinity, you can find parameters which actually give you perfection. I'm not really interested in that, because I'm not interested in high p, but it's interesting to note that there is a limit in which this works perfectly, though that limit may require p to be very large, maybe triply exponential in n or something. Now, one way to approach this, similar to what Alan referred to this morning, is to run the quantum algorithm with an angle search. You fix p, start with a set of angles gamma and beta, and use the quantum computer to build the state gamma, beta. Then you measure, get a string z, and classically compute C of z. One thing I like about this is that the measurement is just in the computational basis, and you do a classical computation to get C of z. If you do this a few times, you can actually get an estimate of that function F p of gamma and beta, which, remember, is the expected value of the cost function in this state. Then you repeat with new angles and try to go uphill. How are we going to choose those new angles? That's the classical outer loop: maybe gradient ascent, maybe you're clever in some other way. Without specifying how we choose the new angles, there's a paradigm here, and the paradigm is that somehow or other you go uphill. Of course, if I could tell you in general how to always go uphill, I would be the most famous computer scientist of the millennium, practically, because there's no general algorithm for going uphill. You're going to have to figure it out case by case, or be inspired.
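The measure-and-go-uphill loop just described can be sketched as an accept-if-better random search over the two angles, with F estimated by sampling strings and averaging C(z). The single two-qubit "disagree" clause again serves as a stand-in cost function, and the search strategy is just one naive choice:

```python
import numpy as np

rng = np.random.default_rng(0)
cost_diag = np.array([0.0, 1.0, 1.0, 0.0])   # single two-qubit "disagree" clause

def sample_costs(gamma, beta, shots):
    # Build |gamma, beta> for the toy clause, "measure" it shots times,
    # and return the classically computed C(z) for each outcome.
    s = np.full(4, 0.5, dtype=complex)
    psi = np.exp(-1j * gamma * cost_diag) * s
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    u1 = np.cos(beta) * np.eye(2) - 1j * np.sin(beta) * X
    psi = np.kron(u1, u1) @ psi
    z = rng.choice(4, size=shots, p=np.abs(psi) ** 2)   # measurement outcomes
    return cost_diag[z]

# Accept-if-better random search: estimate F from samples, perturb the
# angles, and keep the move only when the sampled estimate goes up.
angles = np.array([0.0, 0.0])
best = sample_costs(*angles, shots=2000).mean()
for _ in range(100):
    trial = angles + rng.normal(scale=0.3, size=2)
    est = sample_costs(*trial, shots=2000).mean()
    if est > best:
        angles, best = trial, est
```

Gradient ascent, grid search, or anything cleverer could replace the proposal step; the point is only that each iteration needs nothing but computational-basis measurements and classical evaluation of C.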
It's like telling you how to win on the stock market. OK? So now let me give you an example: max cut. Max cut is this problem: I give you a graph, a bunch of vertices connected by edges, and I'm actually going to restrict to 3-regular graphs, meaning every vertex is connected to three and only three other vertices. The graph is the input, and what you do is assign a plus or minus one to every vertex, trying to maximize the number of edges where the associated two vertices disagree. It's the same as finding the ground state of an antiferromagnet. Notice that in approximate optimization, since you have a denominator which is the max, you really have to go uphill: if I were minimizing, I wouldn't know what to put in the denominator, and I don't want a zero. So we always talk about going uphill in approximate optimization, even though a lot of us are used to going downhill when we talk about cooling or quantum adiabatic algorithms. But here I always want to go uphill. OK? So anyway, the problem is to find an assignment of the bits that maximizes the number of edges on which the bits disagree. Let's think of a classical algorithm for that. Well, the classical algorithm is: guess a random string. If you guess a random string, you will satisfy half the edges, because for any edge there's a 50% chance the two bits at its ends disagree and a 50% chance they agree. So that gives a provable 50% approximation ratio. OK? Now, in fact, for the QAOA with p equals 1, we were able to find the optimal parameters.
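The random-guessing baseline can be checked exactly on the smallest 3-regular graph, the complete graph K4: averaging the cut value over all 16 assignments gives exactly half the edges.

```python
import itertools

# K4 is 3-regular: four vertices, every vertex touches exactly three edges.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

def cut(z):
    """Number of edges whose two endpoints disagree."""
    return sum(1 for i, j in edges if z[i] != z[j])

strings = list(itertools.product([0, 1], repeat=4))
avg = sum(cut(z) for z in strings) / len(strings)   # expected cut of a random guess
best = max(cut(z) for z in strings)                  # the true max cut

print(avg, best)   # 3.0 4 : random guessing gets exactly half of the 6 edges
```

Here the max cut is 4 (any 2-2 split of the vertices), so random guessing already achieves a 3/4 ratio on this particular graph; the 50% figure is the guarantee over all graphs, not the typical outcome on one of them.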
We calculated the one optimal gamma and one optimal beta; this is with my collaborators Goldstone and Gutmann. And we were able to show that for this problem we achieve a 0.6924 approximation ratio on all instances. What was good about this is that it's a worst-case performance guarantee: it's not a heuristic; it's an actual worst-case performance guarantee on all instances at all sizes. OK? But I should be careful here: this is not better than the best classical. There is a better classical algorithm with a provable worst-case performance guarantee, called Goemans-Williamson, which was a breakthrough in computer science around 25 years ago, when it was realized that an application of semidefinite programming would allow you to get a better approximation ratio. So this is not better than the best classical; I'm not claiming that. On the other hand, if you make p bigger, it only gets better, and I know how it gets better for p equals 2. The question, of course, is whether at really high p you can do really well. If you beat Goemans-Williamson, it would be very significant from a computer science point of view; Goemans-Williamson is 0.878, on general graphs, not 3-regular. Now, I don't have enough time to talk about how you find the best angles, so I'm going to skip that. Since I jumped over those slides, let me just repeat: the 0.6924 improves on random guessing, but it's not as good as the best classical. But I want to show you an example of something which is kind of interesting. Suppose that my graph was a ring, the "ring of disagrees," I call it. We know how to solve this classically.
Classically, I go 1, minus 1, 1, minus 1, and so on, and I maximize the number of disagreements; it's a very easy problem to solve. But the QAOA gives an approximation ratio of 2p plus 1 over 2p plus 2, independent of n. That's an interesting formula, because it shows that as you increase p, you approach 1. But look at it for a second at p equals 1: we get three-quarters. And here's the point: if you look at what you get when you run the algorithm, every string that you output satisfies three-quarters of the bonds. It could have been different: it could have been that with probability three-quarters you satisfied all the bonds, which would give the same expected three-quarters. So in that sense, this is really an approximate optimization algorithm. It's not a failed best-optimizer which sometimes gives a good answer; it's always giving an answer that satisfies three-quarters of the bonds. And I think that's interesting, because there's very little variance in these numbers; there's a concentration bound on the variance of what you measure. When you measure the strings, they all come out with essentially the same value of the cost function. So again, I'm emphasizing this: it's not like running the adiabatic algorithm, having it not work, and then looking at the string and hoping it did a good job. It's different. OK. So that was interesting. But now I want to talk about another problem, Max-E3LIN2, with a little story to motivate our interest in it.
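The p = 1 value of (2p+1)/(2p+2) = 3/4 for the ring can be checked by brute force. This sketch simulates the statevector on a ring of six qubits, assuming n = 6 is already large enough for the n-independent answer to hold, and grid-searches the two angles:

```python
import numpy as np
from functools import reduce

n = 6                                     # ring of 6 qubits, edges (i, i+1 mod n)
m = n                                     # one clause per edge
z = np.arange(2 ** n)
bits = (z[:, None] >> np.arange(n)) & 1   # bit i of each basis state
# C(z) = number of ring edges whose endpoints disagree.
cost = sum(bits[:, i] ^ bits[:, (i + 1) % n] for i in range(n)).astype(float)

X = np.array([[0, 1], [1, 0]], dtype=complex)

def F1(gamma, beta):
    psi = np.full(2 ** n, 2 ** (-n / 2), dtype=complex)   # uniform superposition
    psi = np.exp(-1j * gamma * cost) * psi                # U(C, gamma)
    u1 = np.cos(beta) * np.eye(2) - 1j * np.sin(beta) * X
    psi = reduce(np.kron, [u1] * n) @ psi                 # U(B, beta)
    return float(np.sum(np.abs(psi) ** 2 * cost).real)

# Grid-search the two angles and report the best approximation ratio found.
best = max(F1(g, b) / m
           for g in np.linspace(0, np.pi, 61)
           for b in np.linspace(0, np.pi / 2, 31))
print(round(best, 3))   # approaches 3/4, the (2p+1)/(2p+2) value at p = 1
```

The grid search stands in for whatever angle-finding procedure you prefer; the maximum of the landscape itself is what the formula predicts.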
Here we have n variables and m equations, each equation containing exactly three variables; that's what the E3 means, and "lin 2" means linear equations mod 2. Let me read a typical equation: it says z i plus z j plus z k, mod 2, equals 1, or equals 0. So I specify my instance by giving you a list of triples and telling you, for each triple, whether it should sum mod 2 to 0 or to 1. OK, but that's not yet the problem; that's the setup. Now, you could ask: can I satisfy all of these, can I find an exact solution? That's easy to decide, because these are linear equations and you can do Gaussian elimination. So I'm not actually going to be interested in the case when there is a solution. If there are many more equations than variables, typically the system can't be satisfied, and what I'm going to try to do is find the string that satisfies as many equations as possible. That's why it's called Max-E3LIN2. Is there any question about that? OK, good. So the task is to find the string maximizing the number of satisfied equations. Now, it's NP-hard to find the optimal solution. So since it's NP-hard to find the optimal solution, what we're going to do is try to find a good solution; it's going to be too hard to find the best one.
So let's see how that goes. First, let's look at approximate optimization for this. A baseline algorithm is: guess a random string. Each equation has three variables, and whether they're supposed to sum to zero or one, those three variables can take eight values, of which four satisfy the equation and four don't. So if you just guess a random string, you're going to satisfy half the equations in expectation. And that guarantee is easy to lock in: if a string satisfies a little less than half, its negation satisfies a little more than half. So random guessing is guaranteed to get you half the equations. OK? That's kind of a crappy algorithm: guess. It really doesn't seem that compelling. But suppose you had an algorithm that achieved an approximation ratio of a half plus epsilon for some fixed epsilon, even epsilon of 10 to the minus 20, on all instances of this problem. If you had that, then P would equal NP; you would have solved the traveling salesman problem. So that's amazing: computer scientists were able to show that for this problem you can't improve on random guessing. It's a beautiful result in computer science, done in 2001. But now I'm going to help you a little bit. The way I'm going to help you is to promise that every variable is in no more than D equations. So I fix D, a thousand, whatever it is, and now I've given you a little information, and with a little information maybe you can do a little better. Indeed, in 2000 Håstad came up with an algorithm that got a half plus a constant over D.
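The counting argument, four of the eight settings satisfy each equation, and the negation trick can both be checked on a hypothetical tiny instance; the equations below are made up for illustration:

```python
import itertools

# Hypothetical tiny Max-E3LIN2 instance: (i, j, k, b) means
# z_i + z_j + z_k = b (mod 2). Each equation has exactly three variables.
equations = [(0, 1, 2, 1), (1, 2, 3, 0), (3, 4, 5, 1), (0, 4, 5, 0)]
n, m = 6, len(equations)

def satisfied(z):
    return sum(1 for i, j, k, b in equations if (z[i] + z[j] + z[k]) % 2 == b)

strings = list(itertools.product([0, 1], repeat=n))

# Random guessing: averaged over all strings, exactly half the equations hold.
avg = sum(satisfied(z) for z in strings) / len(strings)
print(avg)   # 2.0 = m/2

# The negation trick: flipping all n bits adds 3 = 1 (mod 2) to every
# three-variable sum, so it flips the status of every single equation.
z = (1, 0, 0, 1, 1, 0)
neg = tuple(1 - b for b in z)
print(satisfied(z) + satisfied(neg))   # 4 = m
```

So any string and its negation together always account for all m equations, which is why you can never do worse than m/2.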
This does not contradict the hardness result, because once you fix epsilon, I can always make D bigger and come in under it. So once D is fixed, there was a classical algorithm that got a half plus a constant over D. OK? Is that all right? Good. Now, it turns out that in 2014 we got a half plus a constant over D to the three-quarters with the QAOA. And that meant that at that moment in time, at least, we had a quantum algorithm with a better worst-case performance guarantee than the best classical algorithm for a computationally interesting problem. Of course we were very pleased with ourselves, and so was Scott Aaronson, who blogged about it. That led a team of computer scientists in 2015 to come up with a classical algorithm that got a half plus a constant over D to the one-half. And I'll show you the list of computer scientists, because it really wasn't fair: it took all of them, an international cabal, all gunning for us. But anyway, they did it. That group of eleven came up with an algorithm that beat us. So then we no longer had an algorithm beating the best classical algorithm for a hard, nice problem. Too bad. But then, with my collaborator Jeffrey Goldstone, without changing the algorithm, we just improved our analysis, and we got a half plus 1 over 101 D to the one-half log D. So we're only a log D away now: we got the power, but we still have the damn log, and they don't have it. We can also show that on random instances we get a half plus 1 over 2 root 3e times 1 over root D, that is, a constant over D to the one-half. So we've made progress on this. But this is all at p equals 1, and we don't know what happens at p equals 2; it can only get better.
And maybe that will beat the best classical algorithm. But you've got to be a little careful here, because if there exists an algorithm with a half plus a constant over D to the one-half for a sufficiently large constant, then P equals NP. So you're really in the weeds here in terms of trying to make an improvement, but I do think it's worthwhile to try. OK? Now, another thing: the QAOA exhibits quantum supremacy. We had these nice talks this morning about quantum supremacy. This I did with Aram Harrow. Consider C as a sum of two-bit clauses like I talked about before, say max cut, but on a general graph, so I allow any kind of connection. I act on my initial state with e to the minus i gamma C, and then with e to the minus i pi over 4 sigma-x on each qubit. So this is a p equals 1 version of the QAOA, acting on an arbitrary graph. Now, consider the quantum probability Q of z of producing the string z: the squared inner product of the string z with the state psi you get by this method. And suppose you have a classical algorithm, a random number generator, that outputs strings with probability P of z, where P of z is close to Q of z point by point, for every z. If you could do that, then the polynomial hierarchy collapses. The proof of this we basically stole from the paper by Bremner et al., changing a few things. What I like about this is that it means the quantum approximate optimization algorithm may be useful and also may be capable of supremacy, which I think is another nice feature. But we have to be very careful about what we mean by that, because there's all this issue about what the norm is, what you mean by "supreme," and how you verify it.
There's another thing I wanted to say: fault tolerance. This is some work I did with Adam Bookatz and Peter Shor. Take a very simple error model, where you imagine single-qubit errors acting on the individual qubits, uncorrected. That can really degrade the fidelity of your output state. But you can ask a different question: does it degrade the cost function? What does it do to the expected value of the cost function? Since I don't have a huge amount of time, let me just state the result without showing how it works: if you look at the p equals 1 algorithm, for max cut say, and you analyze it, you discover that the expected value of the approximation ratio is reduced by an n-independent factor, even though the fidelity goes down exponentially in n. And I think that's good to know. Sometimes I think there's too much emphasis put on fidelity, when what we really want is performance. It's very easy: if I give you a million spins and one is off, I can have fidelity zero with respect to the state I want. Or if I have a million spins and each one is a little tilted, I can have an exponentially small overlap with the desired state. That doesn't mean the state I have is bad, or that you couldn't recover the information by making multiple measurements. And I think this is a case where, even if the fidelity is not that great, maybe we should run the QAOA and see whether the strings we get have a good value of the cost function, because those are different questions. Now, this may not scale too well: it scales quite poorly with p, with the depth. But there's no n in it. OK. So anyway, I want to talk about a more general approach.
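The tilted-spins point, that fidelity can be exponentially small while each individual measurement is still almost always right, is easy to make quantitative for product states; the numbers below are illustrative:

```python
import numpy as np

# n spins, each tilted by a small angle theta away from the target |0>.
# The state is a product state, so both quantities factorize per qubit.
n, theta = 1000, 0.3
per_qubit = np.cos(theta / 2) ** 2   # prob that a single measured qubit reads 0
fidelity = per_qubit ** n            # overlap with the desired |00...0> state

print(per_qubit)   # ~0.978: each measured bit is almost always correct
print(fidelity)    # exponentially small in n, despite the line above
```

So a figure of merit built from measured strings (like the expected cost) can stay healthy while the state overlap collapses, which is exactly the distinction being drawn here.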
I told you that we analyzed the QAOA at p equals 1, with one angle for the cost function and one angle for the spin rotations. Of course, you don't have to have one angle each: every single spin can have its own angle, and every clause can have its own angle. There are zillions of things you could think of. So one thing you could do is have a different angle beta i for each qubit, i from 1 to n, and a different angle gamma alpha for each clause, alpha from 1 to m, for each level up to p. Another thing you could do is just say: I have unitaries parameterized by angles. By the way, I didn't say this before, but all those gammas and betas were angles because my clause functions were integer-valued, so everything really was an angle; in general they might just be parameters. Anyway, imagine you have a set of parameterized unitaries. Then you might say: what I'm going to do is build a state that depends on the unitaries I have at hand, and use that to drive the optimization. So let's try that. For example, suppose you had a quantum computer which was a seven-by-seven grid, somewhat arbitrary, and you had the ability to act with single-qubit unitaries and with two-qubit unitaries between neighbors; you might use the unitaries from the grid to make your state. In other words, if you have a fixed architecture, you might decide to make your state depend only on the parameters that the experimentalist has given you the ability to vary. In that case, I still want to optimize the cost function, but using only gates that come from the device, with the depth limited by device coherence, because that tells you the number of gates you can have. This is work I did with the group of Hartmut Neven at Google.
And we actually looked at the QAOA on a grid in this way. We looked at the QAOA with just two parameters, but where all the unitaries come from the grid, while the cost function is for a 3-regular graph. So we didn't embed the 3-regular graph into the grid; we just evaluated the output strings against the cost function. And we did still achieve a provable approximation ratio of 0.53. Now, of course, you could say 0.53 is not as good as 0.6924. Yes, but the grid is not lined up with the objective function, and 0.53 is better than random guessing. So as a proof of principle, it shows you can imagine doing optimization using the unitaries given by the hardware while still targeting a different objective function. It's just something to try, OK? In some sense this handles the embedding problem, because I haven't insisted that my unitaries depend on the cost function; "solves" is probably the wrong word, and I can't change it here on the screen, but I should say it evades the embedding problem, or skirts around it. Anyway, the strategy here would be: after producing the state, measure to get a string z, classically compute C of z, pick new values of theta, and try to go uphill. It's a general strategy for combinatorial optimization using a set of unitaries dictated by the hardware. I haven't specified how you go uphill, but I've done numerical experiments on this, working with Hartmut and my other collaborators, of course.
So we looked at 16 qubits, a four-by-four grid, and went to depth four, where we had 160 parameters. By searching over those parameters, I can get approximation ratios very close to one for random instances of max cut. I don't even embed the graph; I just throw a random graph down. Now, you might say that if you go to a much higher qubit number, you'll have too many parameters, and it's very hard to search in high-dimensional spaces. That's true. But on the other hand, look at machine learning: they sometimes have millions of parameters, and if I told you my problem was optimizing over millions of parameters, you might say that's intractable. In fact, you find good solutions, and that's because there are many good solutions, even though it's intractable to find the best one. So I would advocate just trying it. Will this work? I don't know; run it and see. And we can almost begin to think of this, although I'm not going to talk about that now, as a quantum neural net or a quantum brain, where you vary the weights with the idea of achieving some kind of optimization. But maybe those are just words. OK, that was it. Thank you.

Q: Do we have some questions? You mentioned that these algorithms have this parameter p, and a lot of the algorithms you looked at were at p equals 1, and in principle you could do better with larger values of p. Can you say anything about what kind of improvements you get with larger p?

A: Well, it depends. I know that for 3-regular graphs at p equals 2, for max cut,
I got something like 0.75, provable, so I could go from 0.6924 to roughly 0.75. And I've looked at it numerically, with other people too: at 20 qubits, if you go to p of 7, 8, 9, you get very high approximation ratios, like 0.99, at high p. Now, this is my feeling, though: I think we probably have to have p be log n, and the reason, which I didn't show, has to do with the way the algorithm works. You have clauses, and a clause is a term that involves, say, two qubits; let's talk about max cut. Every time you act with those unitaries, you connect those qubits to others. In other words, one term in the sum involves, say, two qubits, and when you act with the unitary, each of those qubits gets connected to its neighbors. The number of qubits you reach grows with p, and when p is about log n, you hit everybody. So I think the best hope is that you let p go to log n, and then I think you have a good chance. Of course, numerically, when you play around at 20 bits, it's impossible to know whether you're seeing log n behavior: if you get good results at p of 3, 4, 8, is that log n? Who knows; I can't tell. But that would be my hope: at log n, you have a good shot. Does that make sense?

Q: Any more questions? I didn't completely follow everything you were saying about the variance in the measured value of the cost when you measure.

A: OK, I can tell you what I'm saying precisely. The question is: when you are optimizing, do I or don't I have to make many runs and measurements before I change the parameters?
Okay. If I have these problems — let's take an example, like MaxCut on three-regular graphs — then the variance of the cost function grows like the number of clauses, which means the standard deviation goes like the square root. So as m, the number of clauses, gets big, the distribution is very peaked. In our paper we tell you how many samples you need to get a very good estimate, and it's low, because the variance grows only like m: the expected value is of order m, the variance is of order m, so the standard deviation is of order root m, which is small by comparison.

So if you wanted to use this variational approach of varying each pulse, say for machine learning — usually in machine learning there's a gradient descent procedure, and you can compute the gradient pretty straightforwardly — could you get some sort of gradient descent here with some form of sampling?

Well, that's a great question. If you wanted to do gradient descent, you would have two choices. Either somehow you would try to be clever and use your quantum computer to compute the direction, or, if you were really stuck, maybe the way you compute the gradient is to vary the parameters a little bit, recompute the expected value at the new parameter values, and do a finite-difference approximation to the gradient. And that's something that plagues — well, not plagues, excuse me — that's something that many of the chemistry algorithms also have to deal with: you have to do repeated measurements to get good values of expected values, and you have to do that again to get gradients, if that's your process, because you have to vary the parameters a little bit. So that's one approach. But what about Hellmann-Feynman?
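The m-versus-root-m scaling in this answer is easy to check classically. The sketch below is a hypothetical illustration (the function name is mine, and it samples uniformly random strings rather than the actual QAOA output state, which is claimed to have the same scaling of mean and standard deviation): it samples the MaxCut cost on a ring and confirms that the spread is root m while the mean is of order m.

```python
import numpy as np

def cost_samples(edges, n, shots, seed=0):
    """Sample the MaxCut cost C(z) of uniformly random strings z.
    The mean is m/2 and, because distinct edge indicators are pairwise
    independent under a uniform string, the variance is exactly m/4:
    the spread sqrt(m)/2 is tiny next to the mean once m is large."""
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=(shots, n))
    c = np.zeros(shots)
    for i, j in edges:
        c += z[:, i] ^ z[:, j]   # edge contributes 1 when it is cut
    return c
```

On a 400-edge ring, a few thousand shots pin the mean down to a fraction of a percent, which is why relatively few measurements per parameter setting suffice before updating the angles.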
You can analytically calculate that if you vary a parameter a little, the expectation value of a certain operator tells you the gradient.

Yeah, that's okay. But then you'd still have to measure the thing you got from Hellmann-Feynman. There are probably other tricks, but — you see, when you do simulation, you don't have these problems. If you had a quantum computer, though, you'd have to have a way of turning the gradient into some kind of expected value; that's what you're saying with Hellmann-Feynman, that the change is the expected value of something. And that would be okay, but then you have to measure the cost function, or whatever it is, and that could be the sum of many terms, and you have to ask: how do I measure that? Often in the chemistry applications, I think, when the Hamiltonian is a sum of terms, they find the expected value by measuring each of the terms individually. Is that right? Yeah. So I'd have the same issues. In the chemistry problem, maybe you can get the gradients analytically; here it would depend on the problem. It certainly would help.

Any more questions for Eddie? Yeah, a very quick question. Apologies in advance, because I haven't studied the QAOA carefully. That's okay. Can you describe its relationship to doing the usual Trotter approximation of the adiabatic Hamiltonian with varying Trotter segments?

Yeah, the question is: what's the relationship between the QAOA and a Trotterized version of the adiabatic algorithm? Well, the adiabatic algorithm uses a Hamiltonian which is a superposition — excuse me, a combination, not a superposition — of two terms, which in this context would be the cost function and the sum of the sigma x's.
So if you then Trotterize that, you get a very high-p realization of the QAOA, and actually that's how we prove that as p goes to infinity you can get perfection: it looks like the adiabatic algorithm. Right. But at p equals one, it really doesn't. And there's another fascinating subject. There was a beautiful paper on bang-bang control, where people said: suppose you had a general parameterization of the time dependence of the two operators, something like the adiabatic algorithm, and you asked, what's the optimal schedule for optimizing your objective function? Well, the bang-bang principle applied to that problem tells you that the best thing to do is to turn each operator on 100% and then turn it off, which looks like the QAOA. So if you apply bang-bang to a generalized form of the adiabatic algorithm, you get the QAOA. But that result really allows for many, many jumps, and I don't want that; I want to look at the QAOA at p equals one, two, up to log n. And also, if you're not in the ground state — here, it's not a ground-state computation. In that sense, I think it's different. Any more questions? If not, let's thank the speaker. Okay.
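The correspondence mentioned in this answer — Trotterizing the adiabatic Hamiltonian gives a high-p QAOA — can be written down directly. Here is a sketch under the usual assumptions (first-order Trotter splitting, linear schedule s = k/p; the function name is mine): it produces the QAOA angles that a Trotterized linear-schedule adiabatic evolution would use.

```python
import numpy as np

def trotterized_schedule(p, total_time):
    """Angles of a first-order Trotterization of the adiabatic
    Hamiltonian H(s) = (1 - s) B + s C with a linear schedule s = k/p.
    Each Trotter slice exp(-i dt H(s_k)) splits approximately into
    exp(-i gamma_k C) exp(-i beta_k B), i.e. one QAOA layer."""
    dt = total_time / p
    s = np.arange(1, p + 1) / p
    gammas = s * dt          # weight of the cost term grows over the run
    betas = (1 - s) * dt     # weight of the mixer term shrinks to zero
    return gammas, betas
```

So the Trotterized adiabatic evolution is literally a p-layer QAOA circuit with this particular, rather than optimized, choice of angles — which is why the adiabatic limit corresponds to high p, while at p equals one the QAOA looks nothing like it.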