I'd like to tell you about some recent work on resource requirements for quantum simulation. This is joint work with my co-authors Dmitri Maslov, Yunseong Nam, Neil Ross, and Yuan Su, and these results are described in a paper that we recently posted on the arXiv. So I was here for an earlier meeting in this very room at IBM, I guess four years ago, on this topic of what we should do with a small quantum computer. And I gave this talk about simulating Hamiltonian dynamics and what were then some of the best known quantum simulation algorithms for this kind of abstract problem of Hamiltonian simulation, which, as Robin just told us, has seen some exciting developments since then. And we have some better algorithms now. But one of the other things that I did in this talk was to speculate a bit about how we might use these algorithms to simulate real physical systems on early devices, what would be interesting problems to simulate, and what the cost of doing that would be. But it was very abstract and high level and really did not get into the non-asymptotic details of how you would actually do this and what the specific resource requirements would be for small, concrete instances. So I realized, and many people realized at that meeting, that this would be an interesting thing to do. And so I decided to go away and think about this question. And here, four years later, we have some answers. Certainly, I don't claim to have the complete picture or the entire story, but I'd like to tell you about some of the work that we've done to understand this a little bit better. So as we all know, there have been many exciting advances recently toward developing devices that we can use to implement quantum computations, here at IBM and at Maryland and around the world. And I think a lot of us are asking, what can we do with those systems? Certainly an important early goal is to show that we can get some kind of quantum advantage, so that we can do something with those devices that could not be done classically. But what I would really like to ask in this talk is, can we find a practical application of these devices? So what can we do that would actually solve some problem of scientific or perhaps industrial interest, beyond just a demonstration of quantum superiority? OK, so in trying to do this, there are some major challenges. Obviously, to get to this point, we're going to have to have significant improvements in our experimental systems. This is a very hard problem that a lot of people are working hard on. But I don't think it's fair for us as theorists to just say, go off to the lab and come back when you have a million-qubit quantum computer with gates that can be done to a part in 10 to the 7, or something like that. As theorists, we should think about coming up with the best possible algorithms, implementing them in the best possible way, and finding the most interesting systems to simulate, let's say, if we're approaching quantum simulation problems. So think about really what's the best way to make use of the available hardware, because at least in the near term, the qubits and also the gate operation times will not be as good as we would like. And so we would like to conserve resources as much as possible. So that's our goal in this work: to come up with some concrete resource estimates for tasks we could carry out with quantum computers that we could not carry out with classical computers.
I would like this to be a practical task, and I would like to optimize the implementation as much as possible so that it's something that we can do on the smallest possible quantum computer. So Robin gave a really nice introduction to the abstract problem of quantum simulation in his talk, so there are some details here that maybe I won't have to dwell on. We all know that quantum simulation is the problem that helped to spark the idea of quantum computing, through ideas of Feynman and others. And this is the problem that will underlie the applications that I'm going to talk about in this talk. The basic problem is just the problem of simulating dynamics. So we have some Hamiltonian we would like to simulate, some initial state, some evolution time, and we would like to produce the final state when you evolve under that Hamiltonian for that time, up to some error epsilon. And as Robin described, this problem is BQP-complete, so we have good reason to think that in general, we will not be able to solve this problem efficiently with a classical computer. Now, there are a variety of algorithms that have been developed for solving this problem, and Robin has introduced them well, but let me just quickly remind you what some of these main algorithms are. Maybe I've classified them a little bit differently on these slides, but basically I'm talking about the same kinds of algorithms that Robin introduced in his talk. So the simplest algorithms, the first algorithms that were proposed, going back to this paper by Seth Lloyd in the mid-90s, are these algorithms based on product formulas, or Trotter product formulas, or Lie product formulas; there are various names for these things. The idea is just that we've got some Hamiltonian which is a sum of terms, and actually for the example systems that I'm going to consider in this talk, the Hamiltonian will be a sum of two-body terms. It'll be a sum of 2-local terms, so it'll be a quite simple expression of this form. But more generally, if you have any kind of decomposition of a Hamiltonian into a sum of terms, you can simulate it with these product formulas. If the terms don't commute, of course, the exponential of the sum is not just the product of the exponentials, but if you slice the evolution into many short pieces and you alternate between the various terms, then this gives you a simulation. And if you truncate this slicing at some finite level, then you get, of course, some error in the approximation of the exponential, but you can work out some bounds that tell you how finely you must split the evolution in order to achieve some desired error. And there are these higher order product formulas that we can consider. The things shown here on the left are the lowest order product formulas, but Suzuki in the 90s systematically constructed higher order product formulas that you can use to get better and better approximations. And correspondingly, at least asymptotically, you need fewer and fewer elementary gates to approximate the evolution up to some desired error. So the complexity of these approaches has been worked out asymptotically. Another class of algorithms that Robin discussed are these algorithms based on quantum walks. So you can define a quantum walk that's analogous to a given Hamiltonian.
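Before moving on: to make the product-formula idea concrete, here is a minimal numerical sketch (Python with NumPy/SciPy, not the Quipper implementation discussed later) comparing the exact evolution with the first-order Lie-Trotter approximation for a toy two-term Hamiltonian. The particular operators, time, and slicing values are arbitrary illustrative choices, not the model from the talk.

```python
import numpy as np
from scipy.linalg import expm

# Toy two-term Hamiltonian H = A + B on two qubits (illustrative choice).
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
A = np.kron(X, X)          # two-qubit coupling term
B = np.kron(Z, np.eye(2))  # non-commuting single-qubit term
H = A + B

t = 1.0
exact = expm(-1j * H * t)

# First-order product formula: slice the evolution into r pieces and
# alternate exp(-iA t/r) exp(-iB t/r); the error shrinks as r grows.
for r in [1, 4, 16, 64]:
    step = expm(-1j * A * t / r) @ expm(-1j * B * t / r)
    approx = np.linalg.matrix_power(step, r)
    err = np.linalg.norm(approx - exact, 2)
    print(f"r = {r:3d}   spectral-norm error = {err:.2e}")
```

For the first-order formula the error falls off roughly like 1/r; the higher-order Suzuki formulas improve that scaling at the cost of more exponentials per time slice.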
And from such a quantum walk, using phase estimation or using some of these other approaches that he mentioned, you can simulate Hamiltonian dynamics. I'm not going to describe the details, first of all because Robin has talked a little bit about these algorithms, and also because we're actually not going to consider implementations of these quantum walk algorithms in our work, basically because we expect there to be some fairly significant overhead. So Robin mentioned the fact that in these algorithms, you have to perform computations of trigonometric functions in quantum superposition. This is probably going to introduce quite a bit of overhead. And so we don't expect that these algorithms will be competitive with the other approaches that I'm going to talk about. OK, so another approach. This is, I guess, what Robin called the linear combination of unitaries approach, in the divide and conquer formulation. I think of this as a method for directly simulating the Taylor series. So if you think about decomposing the evolution according to the Hamiltonian in terms of its Taylor series, and then you truncate at some order, then if the Hamiltonian has some decomposition into a sum of terms, you can think of the evolution operator as some linear combination of unitary operators, if you decompose things in an appropriate way. And using this technique, this LCU lemma for implementing linear combinations of unitaries on quantum computers, you can implement this operator and thereby simulate the dynamics. And the query complexity of this approach is known. The main relevant feature of this approach when it was first developed was that it has query complexity that scales only like the log of the inverse error. So if you want to do a simulation to very high precision, then this is a good method to use. But also, although it may not be clear from the description that's here on the slide, it's not too complicated an algorithm. I mean, it's certainly more complicated than the product formula approach, but you could imagine that the overhead is modest. Certainly asymptotically, this algorithm is going to run faster. So you could hope, and this was one of the things that I described in my talk here four years ago, that maybe this algorithm would actually be competitive even at fairly small sizes, because there's not too much overhead; the algorithm is not too complicated. So this is basically the question that was asked at the end of Robin's talk: what can we say about whether algorithms like this one are advantageous for simulating real systems? At what point will we actually want to use these other algorithms? And this is basically one of the major points that I want to address in this talk. So there are also these algorithms based on quantum signal processing that have recently been developed, which helped to understand this question about trade-offs in the complexity as a function of the evolution time and the error in the simulation. And this quantum signal processing approach gave an algorithm that meets the lower bound that we previously knew. I don't want to describe exactly how these algorithms work. They somehow effectively encode spectral information about the Hamiltonian into some two-dimensional subspace where you can manipulate it in a nice way.
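Coming back to the truncated-Taylor-series picture for a moment, here is a small classical sketch (Python/NumPy, with an arbitrary made-up two-qubit Hamiltonian, not the model from the talk) of the decomposition that the LCU approach relies on: the Hamiltonian is written as a linear combination of unitaries, the evolution is expanded as a Taylor series, and the truncation error falls off factorially with the truncation order, which is where the log of the inverse error scaling comes from.

```python
import numpy as np
from scipy.linalg import expm
from math import factorial

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I = np.eye(2, dtype=complex)

# H written as a linear combination of unitaries (Pauli strings), the form
# the Taylor-series (LCU) algorithm assumes: H = sum_l alpha_l U_l.
terms = [(0.8, np.kron(X, X)), (0.5, np.kron(Z, I)), (0.3, np.kron(I, Z))]
H = sum(a * U for a, U in terms)

t = 0.5
exact = expm(-1j * H * t)

# Truncate exp(-iHt) = sum_k (-iHt)^k / k! at order K.  Each power H^k expands
# into products of the unitaries U_l, so the truncated series is itself a
# linear combination of unitaries that the LCU lemma can implement coherently.
for K in range(1, 7):
    approx = sum((-1j * t) ** k / factorial(k) * np.linalg.matrix_power(H, k)
                 for k in range(K + 1))
    err = np.linalg.norm(approx - exact, 2)
    print(f"truncation order K = {K}   error = {err:.2e}")
```

On a quantum computer the truncated sum is implemented coherently using ancilla registers and amplitude amplification; the sketch above only checks the truncation error classically.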
But for the purposes of this talk, the main thing to know about quantum signal processing is just that it's another Hamiltonian simulation algorithm that we could consider. And we can throw it into the mix when we think about trying to figure out how we should actually perform simulations on early devices. And maybe these algorithms are actually worth using, or maybe not. And that's something that we should try to answer. So here's a table, maybe not quite as comprehensive as the one that Robin had, but listing many of the known Hamiltonian simulation algorithms, including these product formula algorithms, the algorithm based on quantum walk (this is with the phase estimation correction), these linear combination of unitaries algorithms, and the quantum signal processing algorithm. And if you look at the bottom of the table, we have an algorithm with optimal query complexity. So here I've expressed the complexities as a function of d, the sparsity of a sparse Hamiltonian, t, the evolution time, and epsilon, the simulation error. And the algorithm at the bottom is optimal as a function of these parameters. But there are a few things that maybe keep this from being the final answer. One is that we're interested in what happens for small systems. So if we want to perform a simulation of a 50 or 100 qubit system, have the asymptotics kicked in yet? Another issue is that these expressions actually omit what is perhaps the most significant parameter, or certainly a very significant parameter, in the complexity of systems that we might want to simulate, namely the number of qubits, the system size. So actually, the query complexity of sparse Hamiltonian simulation is independent of the number of qubits, because there's this black box that's given to us that somehow encodes information about the Hamiltonian. And if you're given that black box and you only count queries to it, there is no dependence on the number of qubits of the system that you're simulating. But of course, in a real implementation of an explicit Hamiltonian, there will be some dependence on the system size. And so we certainly need to take that into account. And actually, even when we look at the asymptotic performance, it's not so clear that that algorithm will necessarily be the best if we consider some concrete system, make some concrete choice of parameters, and then look at, let's say, the system size dependence. OK. So what I would like to do now is look at some really concrete system and think about how we might simulate it. I've gone very quickly through this background, because we've seen some of it already in the previous talk, and also because I really want to focus on these questions of implementation details. So now let's think about making these ideas really concrete. Let's try to fix some system that we might want to simulate. And we're going to try to construct a system that is the simplest possible system that would be hard to simulate with a classical computer, but that we can give an efficient quantum simulation of, and really a quantum simulation with the fewest resources. OK. So I think probably the most attention in the literature on applying quantum simulation algorithms to explicit physical systems has been given to the problem of simulating quantum chemistry, I think for good reason, because there are lots of important problems in quantum chemistry that we would like to solve, and quantum computers would be really useful for solving those problems.
There are also a lot of details to get right and lots of different choices to make and a lot of things to explore. So I think that that's a really important question. But actually, for the purpose of this talk, I want to focus on another kind of system. I want to think about simple spin systems. And the reason for that is that simulations of these systems are more straightforward. There's less overhead in constructing simulations of these systems. And I think that there are actually interesting problems we could approach, where we could answer questions from condensed matter physics, already by looking at simple spin systems, where we'll have less overhead and correspondingly we'll be able to perform simulations using fewer gates. OK, so for the purpose of this talk, we're going to make some really explicit choices. We're going to just choose all the parameters of a system, and we're going to try to understand the concrete cost of simulating this system using various Hamiltonian simulation algorithms for various system sizes. So the model that I'm going to consider is the Heisenberg model in a one-dimensional system with periodic boundary conditions. So there's this Heisenberg coupling, this XX plus YY plus ZZ coupling, between nearest-neighbor spins on a ring. And then we're also going to have a magnetic field term with some disorder. So we're going to have these local magnetic fields h_j, which are chosen uniformly at random between minus h and plus h, and h is some parameter that characterizes the strength of the disorder in the system. OK, so as we also heard this morning, this is the kind of system, well, maybe you didn't talk specifically about this Hamiltonian, but in general for various model systems of this kind, this is a system where we can think about the phenomena of what are called many-body localization and thermalization. So there's this question of how these systems behave: if we start from some arbitrary state, will the dynamics look like the system effectively thermalizes, or will it fail to do that because of this phenomenon of many-body localization? And as we heard already, there are lots of interesting questions about this, some of which are very well understood. So I guess the many-body localized phase is actually quite well understood. But the transition between the thermalized and many-body localized phases of models such as this one is actually not so well understood. And it's difficult to do numerics to study such things. And it seems like numerics get hung up at fairly small system sizes. So this seems like a nice candidate, as was already emphasized, for using a quantum computer to answer some question that we would really like to have the answer to. And this is something that we could do really by exploring the dynamics. So there have been some explicit proposals, some of which people have even tried to realize in analog quantum simulation experiments, that involve preparing some reasonably simple initial state, evolving according to the Hamiltonian, and then performing some simple final measurement. And because of this, since the bulk of the difficulty in exploring the system in this way really comes from the evolution, with the initial state preparation and the final measurement being quite simple, we're really going to focus on the cost of simulating the dynamics.
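For concreteness, here is a small sketch (Python/NumPy, dense matrices, so only sensible for a handful of spins) of the Hamiltonian just described: nearest-neighbor Heisenberg couplings on a ring plus random Z fields drawn uniformly from [-h, h]. The helper names and the fixed random seed are illustrative choices, not part of the paper's code.

```python
import numpy as np
from functools import reduce

# H = sum_j (X_j X_{j+1} + Y_j Y_{j+1} + Z_j Z_{j+1}) + sum_j h_j Z_j,
# with periodic boundary conditions and h_j uniform in [-h, h].
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I = np.eye(2, dtype=complex)

def op_on(site_ops, n):
    """Tensor product with the given single-site operators, identity elsewhere."""
    return reduce(np.kron, [site_ops.get(j, I) for j in range(n)])

def heisenberg(n, h=1.0, seed=0):
    rng = np.random.default_rng(seed)
    H = np.zeros((2 ** n, 2 ** n), dtype=complex)
    for j in range(n):                      # site n-1 couples back to site 0
        k = (j + 1) % n
        for P in (X, Y, Z):
            H += op_on({j: P, k: P}, n)
    fields = rng.uniform(-h, h, size=n)
    for j in range(n):
        H += fields[j] * op_on({j: Z}, n)
    return H

H = heisenberg(6)                            # 6 spins: a 64 x 64 matrix
print(H.shape, np.allclose(H, H.conj().T))   # sanity check: Hermitian
```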
So we want to know what the cost of simulating those dynamics is. I'm going to, as I said, make some very concrete choices. I'm going to take this parameter h to be 1. I'm going to take an evolution time which is the size of the system, basically because to start to see interesting things happen, we at least have to have enough time for an excitation to propagate across the entire system, and since it's a one-dimensional system, that's a time proportional to the system size. I'm going to take an error of 10 to the minus 3, pretty small but not too small. And I'm going to look at systems of, let's say, between 20 and 100 spins. So these are cases where, if we wanted to actually simulate the dynamics explicitly on a classical computer, that would be a very difficult thing to do, certainly at the upper end of this range. There are perhaps some heuristics that we could try to apply in special cases, but generically I think this would be quite hard. And so I would like to understand what resources you would need for a quantum computer to perform such simulations. All right, so here again is this table of algorithms that I showed before. Now in the right column I've listed the gate complexity as a function of n, when you think about how you would implement those algorithms and what the n dependence would be for the various steps that you would have to perform. And the algorithms that we're going to consider explicit implementations of are basically three of these: the product formula algorithms, which are the most straightforward approaches, and we're going to include the higher order product formulas in that analysis; the method that I called the Taylor series method; and the quantum signal processing method. And specifically, so Robin distinguished between the quantum signal processing method with the kind of divide and conquer approach and the quantum walk approach; we're going to look at the divide and conquer approach. As Robin mentioned, the divide and conquer approach has somewhat worse complexity for sparse Hamiltonians. But we're looking at a local Hamiltonian, right? Our Hamiltonian is just a local, actually two-body, Hamiltonian. And so that distinction is not relevant for us. And actually I think this approach will be just as efficient as the other one asymptotically, and will be definitely simpler to implement, because we won't have to compute these trig functions in superposition. OK, so these are the three approaches that we're going to look at. And what we have done is to really implement these algorithms in a very concrete way. We did this using a quantum programming language called Quipper, which was developed in part by Neil, one of my co-authors, along with Peter Selinger and others. And so this is a way of writing software code that describes a quantum algorithm in a fully explicit way that can then be compiled into an explicit sequence of gates. So there's just one snippet of code from this implementation shown over there on the right. You're not supposed to be able to read it; you're just supposed to believe that we actually wrote some code. But as I'll mention later on, if you really do want to read it, all of the code is available on GitHub. So you can go and download it and see really explicitly how we implemented these algorithms. And maybe you can improve the implementation.
And if you can do that, I would like to hear about it. OK, so then when we look at the cost of these algorithms, we're going to think about synthesizing these algorithms, which is something that the compiler can do, over two gate sets. First, we'll think about the set of Clifford gates and z-rotations, so single-qubit rotations about the z-axis by arbitrary angles. This is a set of gates that might be relevant to a kind of pre-fault-tolerant implementation. And when we think about that sort of implementation, usually we'll look at the CNOT count, the total number of two-qubit gates, as some kind of measure of the difficulty of performing the computation, because those are probably the hardest gates at the physical level. And another thing that these compilers can do is produce circuits over Clifford plus T, using these optimal synthesis algorithms that have been developed in recent years. So using that, we can produce estimates of the resources that are relevant to a fault-tolerant implementation. We're not going to think explicitly about how we would do fault tolerance, what error correcting codes we would use, et cetera. But we'll come up with counts of the T gates that give us some indication of how difficult it might be to perform these simulations in a fault-tolerant setting. OK, so when you write code, there are usually bugs. This is something that I learned as part of this project. But we did our best to verify the correctness of our implementation by doing simulations of small instances of the full algorithm where possible, and looking at subroutines where that was the best that we could do. And we did this, and we found some bugs, and we fixed them. And we're reasonably confident that our implementation is correct. And as I mentioned, this implementation is available, so you can go and download the code; it's available at this link. So if you want to see really all the gory details, you can look there. Another thing that we did as part of our final gate counts, and I'm not going to discuss this too much, but we have another paper where we describe some automated circuit optimization methods. So at a high level, we really tried to implement these algorithms as efficiently as possible, to come up with really good circuit constructions, and to evaluate the parameters of the algorithms in the best possible way. And I'll describe some of that in the coming slides. But another thing that we did at the end of the day was to apply this automated circuit optimizer that we wrote to reduce the gate counts further, by looking for some simplifications that we might not notice when we try to implement the routines by hand. And the big picture summary of this is that for the product formula algorithms, we get a reasonably substantial reduction of about 30%, and for the other algorithms, we see a less significant improvement. So this is reflected in the final gate counts that I'm going to show you later. OK, so now I would like to describe some of the implementation details that come into play when we actually think about how to implement these algorithms. So for the product formula algorithms, the algorithm is very straightforward. The main detail that we have to think about when we implement the algorithm is how do we bound the error?
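As a quick aside on what synthesis over the Clifford plus Rz gate set looks like for a model like this one: each two-qubit term in a Trotter step, say exp(-i theta Z⊗Z), compiles into two CNOTs and one z-rotation, and the XX and YY couplings reduce to the same circuit after single-qubit Clifford basis changes. Here is a small numerical check of that standard identity (Python/NumPy; the angle is arbitrary); this is the kind of building block whose totals the CNOT counts are tracking, not a piece of the actual Quipper implementation.

```python
import numpy as np
from scipy.linalg import expm

# Standard identity: exp(-i*theta*Z⊗Z) = CNOT · (I ⊗ Rz(2*theta)) · CNOT,
# which is why each two-qubit Trotter term costs two CNOTs plus one z-rotation.
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I = np.eye(2, dtype=complex)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)   # control on the first qubit

def rz(angle):
    """Rotation about the z-axis: Rz(angle) = exp(-i*angle*Z/2)."""
    return expm(-1j * angle * Z / 2)

theta = 0.37                                  # arbitrary illustrative angle
direct = expm(-1j * theta * np.kron(Z, Z))
circuit = CNOT @ np.kron(I, rz(2 * theta)) @ CNOT
print(np.allclose(direct, circuit))           # True

# The XX and YY couplings reduce to this same circuit after conjugating each
# qubit by a single-qubit Clifford (Hadamard for X, a similar Clifford for Y),
# so a full Heisenberg term costs a handful of CNOTs and z-rotations per step.
```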
So, coming back to that question: remember, the way the product formula algorithms work is they split up the evolution into a bunch of pieces, and you have to perform the splitting finely enough that the error is small. And to see how finely you should split to get the error to be at most some desired target is a problem that you have to solve. There are various ways of doing that, and if you get a better understanding of that problem, then you end up with a better algorithm. So we consider four different methods for doing this. We consider two bounds that we call the analytic bound and the minimized bound, which are basically somewhat tightened versions of previous analyses. Then we develop another bound, which we call the commutator bound, that exploits the possibility that some terms in the Hamiltonian may commute. So we look at what we know about the commutation relations among the various terms in the Hamiltonian. And this is actually quite involved and makes up a pretty big fraction of the paper, where we develop these methods and also develop methods to actually evaluate the bound for our model system, which is not such an easy thing to do. And at the end of the day, when we do this, we get an improved bound, which actually improves the asymptotic performance as a function of n a little bit, and also, by the way, improves the constant. So you'll see that there's a fairly substantial savings in practical terms. The fourth bound that we consider, unlike the three bounds that I mentioned already, does not give us a rigorous guarantee about the correctness of the algorithm. Rather, what we do is look at the empirical performance of the algorithm by simulating small instances, and we extrapolate what it looks like the performance actually is. And it has been seen in some previous work on understanding resource requirements for simulating quantum chemistry that these bounds can be very loose. So if you look at things like these analytic and minimized bounds, they can be very far from the truth. And actually, even this new commutator bound that we introduce, although it's much better, is still very far from the truth; it can be very loose. So what we do is, for the small systems that we're able to simulate, we look at how the error actually appears to scale. And you see that it's a lot better still than what we have even from the commutator bound. So what's plotted here, for these system sizes and for the parameters of the simulation that we consider, is r, the number of time slices you need to make. So you split the overall evolution into r pieces, and this is the r that you need to choose to get an error of 10 to the minus 3, which is our target. And you see that the scaling is much better. And actually, even the asymptotic dependence on n is reduced. And I'll talk a little bit more about that later on. So if you do this, you don't have a guarantee of correctness, but at the same time, you get a lot of savings. So maybe if you believe that that's a good fit, then maybe you would be comfortable with doing that. Or perhaps you'd put in a little bit of a margin, maybe not take an r quite that small, but it seems like you certainly can take one a lot smaller than what you can get from these proven bounds. OK, so what about the other algorithms? So in this Taylor series algorithm, there are various details that we have to consider to actually implement the algorithm.
The kind of most significant one has to do with the implementation of these gates that we call select v gates. So these are just some gates that appear in the description of the algorithm. They're gates that conditioned on some register storing the value j apply some known unitary v sub j. And we developed an improved implementation of these gates, which corresponds to some kind of traversal of a binary tree. The details are maybe out of scope of this talk. But if you're interested in how you would implement these gates, it's described in our paper. Of course, also for this algorithm, we need to give some concrete error analysis. We do that. You could, of course, think about finding some empirical error bounds for these kinds of algorithms to see if that improves things. And we have a suspicion that they probably will not improve things very much for this algorithm. I could sort of explain more in the break if you're interested in knowing why that might be the case. But also for these algorithms, as you'll see when I plot later the number of qubits that you need to implement these algorithms, it's probably not so feasible to do classical simulations to empirically estimate the error. So that would be a challenge. OK, so finally, for the quantum signal processing algorithm, how do we implement that? Well, it's actually built from many of the same basic subroutines as the Taylor series algorithm. So we can reuse many of those subroutines in our implementation. But actually, the main challenge in implementing this algorithm is that it goes through this kind of quantum signal processing approach, which involves, in some sense, putting information about the Hamiltonian into some qubit subspace, and then applying some rotation on that qubit in order to implement the desired evolution. So in implementing the algorithm, you need to compute some rotation angles for this qubit. And actually, the problem of computing those rotation angles is something you can, in principle, do in polynomial time on a classical computer. But it's very, very expensive. So it involves computing the roots of some polynomial with very high degree to very high precision. And in practice, we're not able to solve this problem for any instances of reasonable size. Like already for 10 qubit simulations, we would not be able to do this. So that's kind of a problem. So how do we work around that? Well, there are two things that we do. So one is, of course, if we just want to compute the gate count for these algorithms, we can just throw in some arbitrary angles that are not the real angles. But they're going to give us the right gate count if we want to know what the gate count is. So we can determine what the complexity of the algorithm is, although in this case, we haven't really explicitly implemented it. So that's maybe a problem if you want to actually go and run the algorithm in the lab. But another thing that we can do to really get an algorithm that's fully specified is to consider what we call a segmented version of the algorithm, where we break up the evolution into a bunch of segments that are short enough that we actually can implement the algorithm that simulates the dynamics for that short time. And then we concatenate those together. And this gives us some asymptotic overhead, but actually, as you'll see, it doesn't give us too much overhead in kind of concrete terms when we look at small sizes. And again, there's this issue of error bounds. 
In this case, we have some kind of empirical estimation of the error in some part of the algorithm that gives us some savings if you're willing to accept some kind of not totally rigorous guarantees on correctness of the algorithm. And again, we sort of suspect that if we want to do some further error bound on the overall performance of the algorithm, that this actually would not give us much savings. And in this case, the algorithm can be run on a smaller number of qubits, so we actually have some preliminary numerics to support this. So let me get onto some data and show you what we know about the performance of these various approaches. So first of all, let me kind of compare the product formula algorithms. And so when we look at the product formula algorithm, we have various orders that we can consider because we consider these higher order product formulas. And we also have various bounds that we can consider. And so what we see here is that if we look at this minimized bound, the fourth order product formula is really the best thing to use over almost the entire range of interests that we look at. We get a significant improvement from the commutator bound by maybe another order of magnitude in terms of the gate counts over this range. And on the other hand, if we look at this empirical bound, we really get some further maybe two orders of magnitude improvement. And at this point now, even higher order product formulas are the right thing to use. So it's maybe a little bit hard to see in this plot, but if you kind of zoom in, you'll see that the fourth order formula is really the best thing to use up to around, I don't know, maybe 20, 25 spins. And then actually the sixth order formula takes over. And the eighth order formula is kind of getting close to being the right thing to use by, I don't know, 500 qubits or so. So this was kind of a surprise for me. Actually, one of the things that I said in my talk that I gave here four years ago was that I really expected that the lowest order product formula algorithms would be the right thing to use for systems of size like 50 or 100, like maybe the first or second order formulas. But that turns out not to be right. I mean, at least for the particular system that we looked at, you really already should be using like the fourth and the sixth order formulas. And these formulas have not been used in the kind of simple experiments that have been done, so small quantum simulation experiments that have been done. But I think that this is something that experimentalists who want to do product formula simulations really should be looking at, because you can get quite a bit of savings from going beyond the second order formula. We also understand something about the asymptotic performance of these product formula approximations. And this maybe relates a little bit to one of the open questions that Robin asked about. So he said, for one-dimensional systems, if we would evolve for a constant time, we might expect that we should be able to do that with a linear number of gates. And it seems like the best we know how to do is n squared. So here I'm scaling things up, because I'm considering evolution for linear time. So the sort of thing we know how to do is cubic. And I guess what we might expect would be quadratic. And so if we look at the sort of provable bounds, we see that they sort of are always above cubic. But actually, if you look at this empirical performance of the product formulas, it actually looks like it's getting close to quadratic. 
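To give a sense of what this kind of empirical extrapolation looks like in miniature, here is a toy sketch (Python/NumPy/SciPy) that, for very small disordered Heisenberg rings, finds the smallest Trotter number r meeting a target error and fits a power law in n. It uses only the first-order formula and tiny system sizes, so the fitted exponent is purely illustrative; it is not the analysis from the paper.

```python
import numpy as np
from functools import reduce
from scipy.linalg import expm

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I = np.eye(2, dtype=complex)

def op_on(site_ops, n):
    return reduce(np.kron, [site_ops.get(j, I) for j in range(n)])

def heisenberg_terms(n, h=1.0, seed=0):
    """Individual terms (couplings and fields) of the disordered Heisenberg ring."""
    rng = np.random.default_rng(seed)
    fields = rng.uniform(-h, h, size=n)
    terms = []
    for j in range(n):
        k = (j + 1) % n
        terms.append(sum(op_on({j: P, k: P}, n) for P in (X, Y, Z)))
        terms.append(fields[j] * op_on({j: Z}, n))
    return terms

def trotter_error(terms, t, r):
    """Spectral-norm error of the first-order formula with r time slices."""
    exact = expm(-1j * sum(terms) * t)
    step = reduce(np.matmul, [expm(-1j * A * t / r) for A in terms])
    return np.linalg.norm(np.linalg.matrix_power(step, r) - exact, 2)

def min_r(terms, t, eps):
    """Smallest power-of-two Trotter number meeting the target error."""
    r = 1
    while trotter_error(terms, t, r) > eps:
        r *= 2
    return r

eps = 1e-3
ns = [3, 4, 5, 6]
rs = [min_r(heisenberg_terms(n), t=n, eps=eps) for n in ns]
alpha, log_c = np.polyfit(np.log(ns), np.log(rs), 1)
print("data:", list(zip(ns, rs)))
print(f"empirical fit: r ~ {np.exp(log_c):.2f} * n^{alpha:.2f}")
```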
So maybe the true performance of the product formula approach really can give you something quadratic. But we would need to prove better bounds in order to actually get that. So maybe in some sense, we already know the algorithm; we just don't know how to set the parameters. OK. So now here's the main slide. If you've been sleeping, this is the time to wake up and see what the overall final conclusion is. So these are estimates of the overall resources that we would need for all of these algorithms. What's plotted on the right is the total number of qubits needed by these algorithms. The product formula algorithms don't need any ancilla qubits, so the number of qubits that they use is really just the system size. The quantum signal processing algorithm needs more, but not that many more. And this Taylor series algorithm needs quite a bit more. And what's plotted on the left, now looking at physical-level operations, is just the CNOT gates in a Clifford plus Rz implementation. And what you see is that the Taylor series algorithm, once you get above maybe 50 qubits or so, actually starts to become better than the best product formula algorithm. So this is among algorithms with rigorous performance guarantees. And the quantum signal processing algorithm is actually quite a bit better. So that's kind of nice, because there's this algorithm which is really doing something pretty non-trivial. It has better asymptotic performance, but maybe you'd think that doesn't set in until really large sizes. But no, that's not right. Actually, at really small sizes, it already looks like the best thing to do. Even at systems of size like 15 or so, it's already quite a bit better than the best product formula algorithms we were able to evaluate. But the other side of the story is that this is only with respect to provable guarantees. So now if we think about the empirical performance of product formulas, or if we consider, for example, this quantum signal processing algorithm, so this is the segmented version where we actually have an explicit realization, but if we consider the full version, well, we don't have an explicit realization. So let me now throw that into the picture. So here's quantum signal processing, really the full quantum signal processing algorithm. And here's the product formula algorithm with empirical error bounds. And this is actually fourth order up until about here, and then sixth order from about here on; sixth order starts to take over. And so you see that now the empirically bounded product formula algorithms really are orders of magnitude better than the other algorithms. So there's a big gap between what we get from algorithms with rigorous performance guarantees and what we get when we just look at empirical estimates of the performance of these algorithms, and this is something that I think we probably need to try to understand better. One conclusion that I can make that I think is fairly definitive is that we should always use this quantum signal processing algorithm rather than the Taylor series algorithm, because we save a lot of qubits and we always have better performance. OK, so these are estimates of CNOT gates, but what about fault tolerance? So as you see, the numbers of gates here are in the millions and more. So probably, if we actually want to realize these algorithms, maybe we will really need to do it on fault-tolerant devices.
And so if we look at the T gate counts, this is the picture. And it's roughly similar, at least in terms of the relative ordering of the algorithms. And in particular, if we don't demand rigorous performance guarantees, these empirically bounded product formula algorithms really seem to be the best thing to do. Although in this case, the quantum signal processing algorithm is closer. And presumably, it will, well, OK. So actually, I guess it will not overtake this empirical bound, because the empirical bound is actually asymptotically scaling better than this quantum signal processing algorithm. But that's only because of these empirical bounds that we don't have rigorous guarantees for. OK, so here's a slide that makes a big picture comparison between this task and some other simulation tasks. So some other things we might think about doing on quantum computers include factoring and simulating quantum chemistry. For the factoring problem, the best implementation of the factoring algorithm that we're aware of is this actually fairly old paper by Kutin, which says that if you want to factor a number with like 1,000 bits, which is beyond what we're capable of doing classically now, you need something like 3,000 qubits and billions of gates. So that's plotted as this purple point over here, in this plot where the horizontal axis is qubits and the vertical axis is the number of T gates. On the other hand, if you want to simulate quantum chemistry, you can solve classically hard instances with many fewer qubits, maybe only on the order of 100 qubits. But the gate count estimates tend to be quite high. And I think we'll hear more about this from Nathan this afternoon. And perhaps there are better estimates now, but at least in this paper, which looks at the simulation of a molecule that's relevant to nitrogen fixation, you need maybe something like 10 to the 14 gates to do a simulation of that molecule. So here are the corresponding points for 50-spin simulations, using either this segmented quantum signal processing algorithm if you want rigorous guarantees, or this sixth-order product formula algorithm if you're OK with empirical guarantees. And you see that there are really significant improvements both in terms of the number of qubits and the total number of gates that you need. OK, so I should wrap up. So what have we learned? What we have done in this work, I would say, is try to give a blueprint for the first quantum simulation that we can carry out on a digital quantum computer. I can't say that this is really necessarily the simplest that it can be, but at least we've established some benchmarks; we've found a reasonably simple quantum simulation that I think would be useful and that is hard to do classically. So obviously, we should try to improve these benchmarks, but here at least we have some benchmarks that tell us something that we can do. I think I've tried to make the case that simulating spin systems is a lot easier than problems like factoring or quantum chemistry, but it still looks like, with the implementation that I've described, it's out of reach of quantum computers before we have fault tolerance, because we need millions of gates. So at the same time, maybe we can improve the implementation further, maybe we can consider other algorithms, or other tasks, other Hamiltonians we might like to simulate, that could reduce these resource requirements.
OK, I think one of the takeaways is that these higher order product formulas are useful at surprisingly small sizes, and the existing analysis of these product formula algorithms is very loose. And these more sophisticated algorithms, such as quantum signal processing, are really competitive at surprisingly small sizes, and we should take them seriously when we start to think about doing digital quantum simulations. OK, so I think there are a lot of interesting questions about the possibility of doing super classical quantum simulation without fault tolerance, whether that's something we could with further improvements actually perhaps do. Of course, there are many practical details that I didn't describe like taking the architecture of a specific device into account or really doing the details of fault tolerance. And there's this question that I raised of finding better provable bounds. But hopefully we've made a start at this problem of understanding what we would actually simulate on early quantum computers. Thanks for your attention. It's fine. Hey, so I feel like there was maybe a decision that was made in designing this experiment, which I want to ask about, which I feel in some way may have biased things against the signal processing or Taylor series based approaches, which is that it seems that you've chosen a system on a line with n terms. And because of that, you've chosen to simulate to time proportional to n, which sort of makes sense because you think that's how much time you need to build up, a lot of entanglement and so forth in the system. But suppose instead you were to consider a system that had n squared terms. It was, say, fully connected. Then you might suspect that you could reach the same sort of complexity states with much less time, presumably. Now, this would certainly make the product formula approaches much worse because they just need to explicitly simulate every term. However, if there is any structure in the way that, say, the coefficients of the terms are chosen, this might not increase the complexity of the select v oracle at all because it doesn't have a cost that scales explicitly in the number of terms just in the complexity of this column and value oracle. And so I think this is sort of consistent with what we've seen, at least in asymptotic applications of these things, say, to chemistry, which is when you're simulating systems with more terms, these signal processing or Taylor series techniques seem preferred. So anyways, I was just sort of wondering your thoughts on this decision to have n terms and linear time, as opposed to, say, n squared or more terms and less time, which I feel might have given a better shot to the signal processing. Yeah, I completely hear what you're saying. I mean, I think another kind of choice we could make. Maybe we could even still have a linear number of terms, but justify evolving for shorter times if we would consider systems of higher dimensionality. So there are other choices we could make that would affect that. I completely agree with you. I think this is a part of the landscape that we or others should explore. I mean, I think ultimately we had to make some concrete choices and do something. And we kind of didn't feel that we could explore everything. But we wanted to kind of have something that was really a particular model system that people had looked at in the condensed matter literature. That was part of the motivation. 
But I completely agree with you that it would be great to explore those other kinds of systems. And I would love to see the results if you would make those other choices. Hey, Andrew, I wanted to ask a question about your commutator bound techniques. So in particular, the method that you used looked a little different in the paper from the methods that Matt, Ryan, and I ended up developing to bound the error for the second-order formulas. I was wondering if you could comment on the differences between the two methods and also, more generally, if you could comment a little bit about the techniques that you used to generalize these bounds to higher order. So I'm not really familiar with the details of those techniques. So maybe that's something that we can talk about, like over lunch or in the break. But I guess what we have done with these commutator bounds is basically try to look at the lowest order contribution to the error, and try to understand what the scale of that term is if we know that many terms commute. And there really are a lot of details there. I mean, maybe it's something better for us to discuss in the break. Let's take one last question. So following up on that, do you have any idea where the soft spot is, or where you think it might be, in the analysis of the product formulas? Like, where's the slack in the analysis? Oh, that's a great question. So I think some of the slack is in the application of the triangle inequality. So you bound the error in a small segment, and then you add that up many times. But that's actually, I think, not the bulk of it. Some of it comes from the fact that, when we look at the effect of the commutators, our analysis really only improves the lowest order contribution to the error. But actually, when we improve that, if you look at the asymptotic performance, the higher order terms as a function of t actually become dominant as a function of n. So actually, there are further terms that we know we could go in and get some better bounds on. But it's already so much work just to improve the lowest order in t term that we haven't done that. But I mean, there definitely are some places where I think there's an opportunity for getting improved bounds. Okay, I guess we're gonna proceed to the photo part. Let's thank our speaker. Oh. Thanks.