Welcome to the fourth day of the conference. We will have the talk by Vedran Dunjko, and after that, a coffee break to discuss everything we learned. Take it away, please.

Thank you very much for the introduction. Can anybody not hear me well? If I start speaking too quickly, please shout out. Or if I start mumbling, please shout out. I tend to do these things, and I can be understood when I try. First, I would like to thank Marcus and Christof and the other organizers, Tiago, and apologies to those I forget. I'm really honored to have a chance to speak to you today. This is not exactly the kind of conference I usually attend, but it's exactly what I want to be doing when I grow up, so I'm very happy to be here.

Has anybody played PowerPoint karaoke? Does anybody know what that is? Very few. You get inebriated, and then somebody puts up a set of slides which are not yours, and you're forced to give a presentation on them. Why am I saying this? Because I tried to remember what I promised to talk about here, and then I went on the web page and found the title of my talk. And I thought: really? I mean, it's an appealing conversation to be had, how to have quantum earning advantages, but I don't really know that much about it. But I guess I promised. Then later I realized that, of course, I made a mistake, or somebody made a mistake: it's learning advantages, which is maybe much less provocative, but very interesting. Still, I realized I'd be very remiss if I didn't tell you how to make money with quantum computers, so let's talk about that, at least for a little while.

Jokes aside, I do want to talk about quantum machine learning. This is something I'm really, really excited about. But, interestingly, I can't resist saying something about earning with quantum computers right in the introduction, and then we'll forget about money altogether. Why? Because I'm excited about quantum machine learning, but if you go online and scan what people are writing about it, and if you look at Google Ngrams (this is tracking the term "quantum machine learning" and its popularity, or rather its occurrences in the literature over time), I realize that there are people who are even more excited about quantum machine learning than I am. And I've been in it pretty much from the beginning. So I get confused: what do they know that I don't, that this is really so cool? And by the way, this is going to be the extent of my advice: it is the next big thing. So if you want to get rich, I guess you need to buy stock in all these companies now, while the going is good. That is the end of my talk on quantum earning power, and I will not get back to it. Now, just to preempt another attempt at making money off me, who is extremely rich: don't sue me. This was a joke. I'm not really advocating buying stock for anybody. And I will not joke too much anymore.

What I will say, however, is that it's a very good question whether quantum machine learning really is such a big deal. And the answer is: we don't know. And this is what I want to discuss. There are things that we know quantum computers are certainly good for, and if you put your careful, skeptical hat on, you will realize that there are many things, but not that many. Ultimately, two things seem to be uncontroversial: quantum computers are good for factoring numbers, which may be relevant in very particular applications, and they seem to be very good at simulating very complicated natural systems.
And I suspect that everybody here has at least once heard one of the more popular quotes by Feynman: that if you want to simulate nature, you had better make the simulation quantum mechanical. So there's a reason why I have these two gentlemen here; we'll come back to both of them later. The organizers pointed out that everything gets recorded, so I should be careful with the licenses of the images I use. I had a different picture of Shor, and I found this one on Wikipedia, which is free, and I noticed something funny: the picture from Wikipedia is actually from this room. He was here receiving the Dirac Medal, and this is exactly the podium he was standing at, which I think is quite cool. I definitely don't deserve to be here, but anyway, a small, interesting coincidence.

So, as I said, we know that quantum computers can factor numbers; that is an unambiguous advantage. They can also simulate quantum systems. The truth is a little bit more complicated; we know a little more. In fact, there's an entire field called complexity theory, which analyzes the hardness of problems in the sense of what kinds of resources are needed to solve them. You may have seen diagrams of this type, where we have this class here, BQP, which is the set of problems that quantum computers can solve in polynomial time, and its believed relationship to other complexity classes that have been studied over the years and have nothing to do with quantum. There are many things here in various intersections. For the remainder of this talk, I will assume that the diagram is correct. We don't have a proof that BQP is not contained in BPP, but I'm going to assume it isn't; I'm even going to assume it lies outside the entire polynomial hierarchy, which some people believe, although we don't really know. But let's assume this is true.

I still want to raise the question: what does this say about quantum machine learning? Because machine learning is a little bit of a different animal than computational problems. In the community there had been a common understanding, something we didn't say very explicitly but we all thought it, that if quantum machine learning is going to have advantages anywhere, it's going to be when the data comes from a genuinely quantum system. Makes sense: you use quantum to learn quantum. You're not going to classify cats versus dogs better with a quantum neural network; why would you? But maybe find exotic particles in those massive data sets from CERN; maybe that's the thing quantum computers should be doing. So this is something I think most of us believed. And I like to wonder whether, if Feynman had been more interested in sci-fi and knew more about AI, he would have conjectured that learning about properties of nature requires quantum learners, rather than just talking about simulation. I have no clue whether he would agree or not, and it's not even particularly relevant. What is relevant is that, up until very recently, the signs were pointing to no: there seemed to be very little evidence that quantum machine learning is going to have advantages for learning from data coming from quantum mechanical systems. And I want to highlight that I am not talking about machine learning where the data itself is a quantum state. The experiment is done, the measurements are there, you have your data sets on your hard drive.
You're going to analyze it, and you're going to say something about it afterwards. That's the setting I want to consider. The other settings allow for different types of advantages, but they're also more restrictive, because you somehow have to get your hands on quantum data unperturbed. Here, everything is measured out. And this is what I want to figure out.

So why am I saying that the signs were pointing to no? Because data-driven methods are just so spectacularly successful. Whatever the slightly conservative computer scientists and AI people said was not going to happen soon, many of those things actually happened. One prime example is this very conference, which is about the power of machine learning in quantum simulation; we've heard many talks with good ideas about what machine learning can do here. Recently I was at a conference on designing molecules, and one of the bigger names in classical machine learning, Max Welling, was talking about how he simulated the Schrödinger equation. So yes, it's a little bit worrisome. And it's interesting to see how this meshes with the results from complexity theory, which I just said I believe. I'm claiming that even though I believe that diagram, it is not in contradiction with the possibility that we don't need quantum computers for machine learning from quantum data. Why? Because complexity-theoretic separations might not imply learning separations. And the question is: when do they? If we know something from complexity theory, does it ever, and when does it, tell us something about the possibility of learning?

I think by now you're getting the feeling that this is going to be a rather formal talk, and it will be. I will be introducing a lot of mathematical machinery. I don't expect you to be in love with all of it, but maybe you'll at least hear certain terms for the first time, maybe you'll find it interesting, and you can always look me up and I'll be happy to talk about this to whatever extent you wish.

So let's talk about learnability, as opposed to computational complexity, and how we can have quantum advantages there. I want to talk about things formally, I want to prove stuff, so I have to tell you what I mean by learnability. Let's start with an intuition. Suppose I give you the following prediction problem: I describe a molecule to you, and I ask, can you please tell me the ground state energy of this molecule? This is a hard problem for molecules that are complicated enough. But now I ask it differently: suppose I first gave you a data set of molecules for which we have solved the problem; there could be hundreds of thousands of molecules for which we know the ground energies. Does this help you solve the problem for a new molecule we have not studied before? Or do we actually need a quantum computer for a problem of this type? So I need to formalize this learning problem in such a way that I can actually talk about proofs, and what I will be using is the terminology of PAC learning. Just out of curiosity, has anybody encountered PAC learning before? OK, so it's still rather obscure. Let me tell you what it does. It's a formalism to capture supervised learning, and you know what supervised learning is. We talk about learning concepts. Concepts are functions which, for the moment, we can pretend go from n-dimensional bit strings to a single bit. And what is a concept? It's a labeling function.
You can think of a concept as asking: is this the letter A? The input is a bitmap, a black-and-white image of pixels, and some of those bitmaps correspond to an image of the letter A and some don't. The concept for the letter A would output one if the input is the letter A, and zero if it's not. And a concept class, this wider set here, would be, for instance, the collection of concepts, one for each letter, capturing what it means to be the letter A, the letter B, the letter C, and so on. Now, what is supervised learning? I give to a learning algorithm A a data set of bitmaps, together with the labels of those bitmaps, for a concept that I don't know; it's a letter, but I don't know which one. My learning algorithm churns away and then spits out a function which, if I put in a new bitmap, a new image, will correctly tell me whether this is an image of that letter or not. I don't know what the letter is, but it's going to label correctly. Yes, Marcus, please. For now we'll keep it binary; we're going to generalize it later. In the end the label could be real-valued, in which case we talk about regression problems, or categorical, and more complicated things will happen. But to understand the basic definition, let's stick with binary; it's complicated enough, it turns out.

So I think you follow this. What's important to know is that it does matter where these data points come from, in order to define what it means for the classifier at the end to be doing a good job. Data actually comes from some fixed distribution: people have handwriting, so there's some distribution of more likely and less likely bitmaps that are going to be the letter A or not. It's not flat, it's not uniform; nature is not uniform. The only thing I will demand is that whatever distribution generated the data, that's the one relative to which I want my labeling to be correct. It's like studying for an exam from textbook A: you expect to be quizzed on textbook A, not on some other textbook covering a completely different part of whatever it is you were studying.

So this is the definition I want. The learning algorithm A learns the concept class C if, for every concept, given a data set labeled by that concept, with high probability it outputs a labeling function such that the probability that it disagrees with the actual labeling function, averaged over the data distribution, the one that generated the data, is below epsilon. And this has to hold for a data set whose size is polynomial in the size of the concept (the size of the image, if you wish) and in these two error parameters. It's called PAC learning because it's "probably approximately correct": this part is the probably, this part is the approximately correct. Note that this only talks about the size of the data set; some things are not learnable even if you give me a data set of any size. But to really get classical-quantum separations, we need something a little more: we're going to talk about efficient learners, which additionally demands that the algorithm runs in polynomial time. So I'm also limiting how much time you're allowed to spend analyzing your data. That's the definition, OK? So when I say a learning problem, what I really mean is a concept class.
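For reference, the standard PAC criterion being paraphrased here can be written out as follows (textbook notation; this formula is a reconstruction added for clarity, not something taken from the slides):

```latex
% A learner A (PAC-)learns a concept class C, with concepts c : {0,1}^n -> {0,1},
% if for every c in C, every distribution D, and every epsilon, delta > 0, when given
% m = poly(n, 1/epsilon, 1/delta) labeled examples (x_i, c(x_i)) with x_i ~ D,
% it outputs a hypothesis h such that
\Pr_{x_1,\dots,x_m \sim \mathcal{D}}
  \Big[ \Pr_{x \sim \mathcal{D}}\big[\, h(x) \neq c(x) \,\big] \le \varepsilon \Big]
  \;\ge\; 1 - \delta .
% An *efficient* PAC learner must, in addition, run in time poly(n, 1/epsilon, 1/delta).
```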
Does there exist a concept class where I cannot achieve this with a classical computer, but I can with a quantum one? Any questions here? If you remember anything out of this: there's a thing called PAC learning. I'm personally not in love with it, it has many shortcomings, but it's an extremely good starting point for understanding what learning means in a formal way. Good.

So, learning separations: can I find a concept class that I can efficiently learn using a quantum computer, but cannot learn with a classical computer? One thing you might want to ask is: wait a minute, you told us you believe there are functions you can evaluate on a quantum computer but not on a classical one. Say a function of this type; this is a good example. I prepare some quantum state, which depends on the input in some convoluted, complicated way; I apply some unitary, which depends on some parameters theta that you don't know; and then there's some measurement, and I define my concept that way. In this case it will be regression, because it's an expectation value; I could put a sign function around it if you wish, it's not going to change anything. And I'm telling you I believe classical computers cannot evaluate this function. Then why is it not immediately obvious that a classical computer cannot learn it? It cannot evaluate the function, so how could it possibly learn it? Well, here the differences emerge. The differences between computational complexity and learning problems are threefold. The first one is by far the most interesting for me, the second one is very relevant, and the third one is not that relevant for us at the moment. Number one is what I call the data gap: in machine learning, you have examples of what the function you're asked to evaluate does. Somebody showed you examples of it, and this turns out to significantly empower what classical computers can do. The second one is that I also need my quantum learner to be able to learn it; it's not enough to prove that a classical learner cannot learn something, I also need to prove that a quantum one can, if I want a separation. And that can be tricky, because very simple things are sometimes not learnable at all, by any kind of learner. For instance, functions computed by logarithmically deep Boolean circuits: you cannot learn that problem, and it's not a computational issue, it's an information-theoretic issue. The last thing is that in complexity theory we typically ask about worst cases, meaning you want your algorithm to be correct always, whereas in PAC learning (I will keep calling this the PAC criterion) I just need to be correct with high probability with respect to some distribution. So it's more of an average-case type of complexity that we worry about. And with these distinctions, this diagram might look very, very different, for all we know. Yes, please. Yes, the classical computer can do much more; it's much more powerful. I'll show you in a second. Very nice.

Let me give what I find to be very illustrative examples. Here's the first one. Note that all of these statements are asymptotic, I'm talking about scaling, so when I say a function, what I really mean is a function family, one for each input size. Hamiltonians come in families; it's not a single Hamiltonian, not a single matrix. So imagine a family of circuits. They have some gates, which you could think of as randomly selected, except for one.
One of them is, let's say, a Pauli-Z rotation by an angle theta. There's an observable we measure, and this quantum circuit encodes the function here. And I ask you: can you come up with an algorithm which evaluates this function for every theta? Can you do it? The answer is, we don't actually know how to do that. These gates are fixed, but that doesn't really help me, because they could encode some universal, programmable thing inside. We don't know how to evaluate this. It is not BQP-complete, as you'll see in a second, but we still don't know how to do it. However, you can prove that regardless of what those other gates are, this function has to take a very simple form: it is simply a trigonometric function with a certain number of free parameters, three in this case, alpha, beta, and gamma (something of the form alpha cos(theta) + beta sin(theta) + gamma). The only thing I don't know about it is what alpha, beta, and gamma are. But now, if I give you a data set for this, say three evaluations of the circuit at three random points, then with very high probability you can fit the data, find out what alpha, beta, and gamma are, and voilà: you know exactly what the function is, and from that point on you can evaluate it. You just have to evaluate a trigonometric function. This generalizes, by the way: if you start adding parameters, it still works, and in very many cases you end up with a general trigonometric polynomial which still has a description of polynomial size, in which case you give me a polynomial amount of data and I can still learn it. It's a very highly quantum function, but I can still learn it. This has to do with the concept of function obfuscation, which simply means that I'm expressing a very simple function in a very complicated way. It's not the function that is hard; it's my description of the function.

So the data changes the computational game. In complexity-theoretic jargon, we want to talk about a new class, invented relatively recently, which I'm going to call BPP/samp. These are the things that randomized classical computers can do when they have access to samples of what the function does under some distribution. The "samp" here also implies that I'm only interested in correctness up to the PAC criterion: the output doesn't have to be always correct, just correct with high probability with respect to some distribution. And we don't know how big this circle is relative to the other circles. Even if I believe that BQP is not in BPP, for all I know all of BQP could be inside BPP/samp, and then we wouldn't need quantum computers for anything except generating the data. So that is what we did not know. And it gets quite painful, because for seemingly quintessentially quantum tasks, like estimating ground state properties of a physical system based on examples, you can do it classically under very, very mild conditions. I believe you may know this paper; it's quite famous. Under very mild conditions you can do it; you can even regress classical shadows of the system and do whatever you want with the underlying system. So it seems like estimating ground state properties is classically learnable. And note, estimating ground state properties is not, in general, something even quantum computers can do: if the gap is closing and you don't have a good initial state, it's the quintessential QMA-hard problem, too hard for quantum computers in polynomial time.
But then you give me data, and suddenly I can do it on a classical computer. That's how powerful data can be. So it's very surprising: do we even need quantum computers, asks Drake. But we do know we need quantum computers, and that's something people tend to forget. It was actually proven around the year 2000 by Servedio and Gortler, and then revived again in 2020 by Liu, Arunachalam, and Temme. Let me show you the example, because it's easy to explain and quite neat.

Here's a problem. I take the integers from 1 to p, where p is some large prime, and I'm going to label them blue and red. How? I put them on a circle and draw a straight line through the middle. One of the half-planes is going to be red, the other blue, and the half-plane is fully specified by the first integer it passes through. Is this clear? I promise you this is an easy-to-learn problem: I don't know where the line goes through, but if you give me a couple of examples and I take the best guess of where it should be, that's going to be a good answer, it generalizes, and you can prove that. Now I'm going to do a sneaky thing. I take these integers and apply modular exponentiation to them: I take a generator g of the multiplicative group modulo p and raise it to the power x, whatever x was. And now everything looks like a mess; I've effectively encrypted the data. This function, by the way, is also related to how we generate pseudorandom numbers. If I now try to see what the labeling looks like, it looks like a total mess: it's not separable by a hyperplane anymore, the points seem to be labeled randomly. So can I still learn this scrambled problem? Well, if I could invert this function, it would be easy: I just take your data point, invert it back, and apply the simple learning algorithm. Done. A quantum computer can do that. The inverse of this function is the so-called discrete logarithm problem, which is the second part of Shor's paper, next to factoring; the paper proves two things, a quantum algorithm for factoring and one for the discrete log, and it's essentially the same quantum algorithm solving both, since they turn out to be hidden subgroup problems for Abelian groups. So a quantum computer can definitely solve this problem. And the classical computer? You might be tempted to say: well, if it has to invert the discrete log, and we know the discrete log is not classically easy, then it cannot. But that's not sound reasoning, because there might be another way to learn this function, a way of doing it without inverting anything. And that's the catch: here you can actually prove that if a classical computer could learn it, then it could also compute the discrete log. Learning is not easier than computing for this particular problem. The proof heavily uses properties of the discrete logarithm, and the most important property for me right now is the following. For the discrete logarithm, so where the data is x together with the discrete logarithm of x (maybe just its most significant bit), I cannot compute f(x) given x, but I can generate valid pairs efficiently. That's the trick. Let me show you how it works. The discrete log is bijective, so I first pick a y in the codomain of the discrete log, and then I compute the corresponding x by modular exponentiation. Sorry, this slide is a little bit iffy; in other words, I just relabel things: I choose y, then I compute g to the y, which is easy, and that hands me x = g^y together with its discrete logarithm, which is simply the y I started from.
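As a rough sketch of this pair-generation trick (toy parameters and names of my choosing, not the construction from the paper verbatim), the point is that correctly labeled examples can be produced without ever inverting anything:

```python
# Toy sketch of "random-generatable" data for the discrete-log concept class.
# Assumptions made here for illustration: p is a small prime, g is treated as a
# generator of the multiplicative group mod p (not verified), and s is the secret
# cut defining which half of the circle is labelled 1.
import random

p = 1000003      # toy prime; the real construction uses n-bit primes
g = 5            # assumed generator
s = 123456       # secret offset: label is 1 iff the discrete log lies in [s, s + (p-1)/2)

def label_from_log(y: int) -> int:
    """Label determined directly from the discrete logarithm y of x."""
    return 1 if (y - s) % (p - 1) < (p - 1) // 2 else 0

def generate_pair() -> tuple[int, int]:
    """Sample the *logarithm* y uniformly, then set x = g^y mod p.
    Since exponentiation is a bijection, x is also uniform, and we get a
    correctly labelled example without ever computing a discrete log."""
    y = random.randrange(p - 1)
    x = pow(g, y, p)
    return x, label_from_log(y)

print([generate_pair() for _ in range(3)])
```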
Moreover, since the function is bijective, if I chose y uniformly at random, then x is also distributed uniformly at random: apply a bijection to a uniform distribution and you get a uniform distribution. So now I have a valid pair. I don't get to choose what my x is, it's random for me, but I know it's a valid pair. Now imagine there was a classical learning algorithm. Then I can just put a preamble in front of it: generate data, learn from that data, and then compute the function. This entire box now needs no external data; the data is generated internally by the algorithm. That's the whole trick. There's a bunch of other properties you use to show that even solving it for a fraction of the inputs is hard, and that even a single bit is hard, and you end up with a resolution of all three issues. The data gap disappears, because I can generate the data at random myself. The quantum learner exists, because Shor's algorithm solves the discrete logarithm in this case. And the heuristic, average-case version is also hard, because the DLP has random self-reducibility and other attractive mathematical properties, which is also what makes it such a good cryptographic function. That's the key: we have proven that data cannot help, because I can make the data myself, and if I can make it myself, it cannot help me compute. And that was a big thing: a learning separation, assuming the discrete logarithm is not in P. There was then a bunch of works that did the same thing for generative modeling, for reinforcement learning, for whatever you wish. It's a good technique; you apply the same trick over and over again: data does not help, because we can generate it.

And I'm not happy. I'm a sad bunny, and this is not what I wanted. There are two reasons why this is not what I wanted, and they're here. I will briefly tell you why this is unsatisfactory from the perspective of what a learning separation is, but I will focus more on the other problem: that quantum data is not generatable like this. There's no reason to believe that I can randomly generate pairs of x and some quantum function of x. In fact, I don't want to over-promise, but I think next week we're going to have a proof that this is hard, that you cannot do it. Which sadly means that none of these proof techniques apply; they're a non-starter for proving a learning separation this way.

But let me first tell you about the learning-separation business. The catch is that there are many, many types of learning. Learning is a much more fidgety, subtle thing than I led you to believe. The subtlety I want to highlight is that in the proof above, I assumed that learning implies that I can label a new point efficiently. I could have asked a slightly different question: what if you simply tell me which concept labeled the data? I'm not asking you to label a new point, just to tell me which concept generated it. It's a slightly different question, and for that second one we have no such proof, because I do not require you to be able to evaluate the concept. You might say, well, who cares; the people who thought about PAC learning defined it this way for good reasons. I would say that's not a good answer, because this kind of learning, identifying the labeling function rather than evaluating it, has important applications. That's exactly what learning order parameters is, because nobody expects you to be able to find ground states and compute expectation values.
If you do Hamiltonian learning, learning the parameters of a Hamiltonian, I'm not expecting the model to be able to prepare whatever data you used to learn it, which is often properties of the ground state, because ground states are QMA-hard. But I still learn the generator of the physics, right? That's a very useful kind of learning, and there seemed to be nothing we could say about it. Plus, there's a conceptually unsatisfying consequence of this "evaluate" definition: you can construct learning separations where the concept class has only one function in it. You know exactly which function you're learning, there's only one, yet you cannot learn it according to the definition, because you cannot label a new point, while a quantum learner can learn it without any access to data, because there's only one function and you know exactly which one it is. You might be unhappy with this; is this even learning? I don't know. I agree, it's not very satisfying. We analyzed this pretty much ad nauseam in a paper. I will not bore you with it, but I'll just point out that there are examples where we can prove that identifying the concept is the hard thing: the concepts are all easy to evaluate, but you cannot figure out which one it is, except if you use a quantum computer. So you can have learning separations of all sorts and types.

But I want to focus on this issue here for the remaining 15 minutes or so: data is not generatable, so I'm very, very sad, because I cannot use those proofs. Still, I want to make a claim. I'm not giving up: I'm still certain that, at least in some cases, no classical algorithm can evaluate the function even given access to data. Let me remind you what the catch was. In the proofs so far, we were using functions which are both quantum-evaluatable, so in BQP, and also random-generatable; that is, I could generate random pairs of data points, and I needed this property to prove that having data doesn't help. So this was the space we were considering before. For those who like history, a term for such functions was introduced in the context of cryptography, where they were called random-verifiable, but I think for our purposes random-generatable is more informative. So this is what we had, and our problem was that we don't know where BPP/samp sits. If it swallows the whole of BQP, we're in trouble.

But then came a way to handle this, and for that I have to very briefly introduce another class, called P/poly. This is the class of so-called non-uniform circuits. It's also known as the class of problems a classical computer can solve in polynomial time given an advice string which depends only on the size of the instance. Let me explain. Suppose you're asked to solve factoring (this is not quite the right example for P/poly, but bear with me): I give you an n-bit number, and additionally a string, the advice string, giving you the best possible advice for factoring n-bit numbers. It doesn't tell you how to factor the particular n-bit number you're given; it's supposed to help you factor all n-bit numbers. Otherwise it would be trivial, so the advice is only allowed to depend on the size of the problem. This is kind of a funky, non-uniform class; it can even solve certain uncomputable problems, because the advice doesn't have to be computable. But the catch is, we know two things about it. One: Huang et al. proved that BPP/samp is contained in this class.
Two: we know things about P/poly itself. We know by Adleman's theorem, for instance, that it contains BPP; even though there's no randomness in the definition, by the way, it's just P with advice, not BPP, it still contains BPP. And we know that it's unlikely to contain NP: there's something called the Karp-Lipton theorem, which tells you that if P/poly contained NP, then the polynomial hierarchy would collapse to the third level, and people say that's not going to happen. And I said I believe in the diagrams, right? So P/poly does not contain NP. So I know things about this class. We then later generalized and proved that certain inclusions also hold for the heuristic versions of these classes, which are the ones actually relevant for learning.

This brings us to our first lemma. If BQP-complete problems, the hardest problems that quantum computers can solve, do not have polynomial-sized classical circuits, in other words, if they are not in P/poly, then BQP-complete concept classes cannot be learned according to the standard definitions, regardless of data. And it's not about whether the data can be generated; it's that classical polynomial-sized Boolean circuits, which represent neural networks, for instance, are simply not expressive enough to capture what quantum functions can express. The right hypothesis is simply not in your set: you cannot output a polynomial-size description of a classically evaluatable function, because it doesn't exist. Hence data cannot help you. This is quite important. There's a bunch of nitty-gritty groundwork you have to do to extend this to the heuristic versions and so on; I won't bore you with that, because I want to skip to something else. The corollary of all of this is that all BQP-hard problems, including QMA-hard problems, define at least one learning task that a classical computer cannot learn. You can ask whether a quantum computer can learn it, and I will show you that we know how to do that as well. But before going there, I want to say something about this Huang et al. paper. One of the consequences that we worked out in detail is this: if you look at those seemingly mild conditions (they mean something like a constant gap and smoothness), we've shown that if you relax any of them, you get a contradiction with complexity theory. So those mild conditions are tight: you will not be able to learn it classically without causing an uproar in complexity theory.

Can a quantum computer learn it? Yes, we can construct concept classes where it can. For instance, take any concept class of polynomial size: the quantum learner can brute-force check every single concept. You might not like it, but it ticks the box that such a thing exists. We can do a little bit better than that, though. In another paper we looked into more physically relevant cases, and we proved learning separations for learning quantum observables. So this is the model I want to study; let me describe it. This is how the data is going to be generated. There's a fixed input state, and then a complicated physical process happens which I don't control. I don't get to choose what x is, but I do get to find out what it was, so it's heralded. I don't have great examples from the lab; maybe you can help me with that. But I can imagine that things like this happen, for instance, in high-energy physics,
where you don't know what the initial state was. You get measurements at the end, but some side measurements also tell you something about what the initial state was. So you know the initial state after the fact, you could not have controlled it, and you find out something about the end. So there's this process I don't fully control, I find out which x happened after the fact, there's some quantum state, and then I have a measurement. This measurement you can think of as an expectation value of a k-local Hamiltonian; that's a good starting point. So this M is some parameterization of a k-local Hamiltonian. And the problem I'm asked to solve, the concept class, is exactly this: each concept is labeled by a measurement. That's the thing I'm going to be learning. I don't know which measurement happened, but I'm supposed to determine it from data that tells me which complicated quantum process happened, which state was implicitly generated, and what the measured quantity was at the end. That's my data. And you want to show that this is not classically learnable, but it is quantum learnable. And you can make it so: if the process is hard enough, it can be BQP-complete. These are things like time evolutions of certain Hamiltonians: you can imagine that this U of x simply means that x encodes the parameters of the Hamiltonian that actually got time-evolved, and you later find out which one it was. It could also be a ground state problem: x could be the coupling coefficients of a Hamiltonian that you don't know, but find out later somehow. So if it's a hard enough Hamiltonian, that's all I need: then you can prove that this function, already for very simple measurements, cannot be classically learnable, because it's not in HeurP/poly; there exist no classical polynomial-size circuits that can evaluate it. That's the whole, well, complicated and tedious theory we worked on before.

What's more interesting here is how a quantum computer learns these coefficients of the Hamiltonian. Well, everything in the setup is known, so for every data point it gets, which comes with a label, it can also compute the label that would have resulted for any other k-local Pauli string. Because U is known: if I know what x is, I can, on a quantum computer, prepare the quantum state psi of x, and I can measure the expectation values of all the k-local Pauli strings; there are only polynomially many of them. If I do that, I end up with a system of linear equations, whose solution tells me which linear combination was the actual measurement. So I get a linear system. It's a little more complicated than that, because I'm assuming the measurements are noisy (they have at least shot noise), the system can be overdetermined, and it can even be unsolvable as stated. But you can still prove that if you use a particular version of the LASSO regression algorithm, it converges to a provably good solution. Provably good in the sense that I will not necessarily tell you what theta is, but the theta I find is going to perform well: it will label things the same way the correct theta would have. And that is exactly the "evaluate" definition of learning; I don't need to find which theta it was, I just need something that does the job.
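A minimal numerical sketch of this last step, with made-up shapes and names (the actual algorithm and its guarantees are in the paper; here the Pauli expectation values are just stand-in random numbers):

```python
# Sketch: recovering the coefficients theta of M = sum_P theta_P * P from data,
# assuming a quantum computer has already estimated, for each input x_i, the
# expectation values of all polynomially many k-local Pauli strings in psi(x_i).
# Rows of A stand in for those Pauli expectation vectors; y_i is the observed
# label tr[M psi(x_i)] with some shot noise. A sparsity-promoting (LASSO)
# regression then finds a theta_hat that predicts as well as the true theta.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_data, n_paulis = 200, 50
A = rng.uniform(-1.0, 1.0, size=(n_data, n_paulis))   # stand-in Pauli expectations
theta_true = np.zeros(n_paulis)
theta_true[:5] = rng.normal(size=5)                    # a few-term (sparse) observable
y = A @ theta_true + 0.01 * rng.normal(size=n_data)    # labels with shot noise

theta_hat = Lasso(alpha=1e-3).fit(A, y).coef_
print("max prediction error:", np.max(np.abs(A @ (theta_hat - theta_true))))
```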
So this is a learning problem. What am I learning? Which measurement happened to a complicated quantum state, the measurement being just a k-local observable. We can actually generalize this quite a bit, and you can think of the generalization as a meta-theorem if you wish. I'm certain many of you have seen the recent results, again by Huang, Kueng, and Preskill, where they learn unitaries. There's an unknown unitary; you get to choose your initial states, and they sometimes end up being very simple, like single-qubit stabilizer states (plus/minus, zero/one, plus-i/minus-i), which you shove into the unitary. You measure essentially shadows at the end, like in shadow tomography, and there's a learning algorithm which reconstructs what U has to be. It is universal, it always works, and it's quite shocking to realize that the things they can learn in the quantum case are, in their classical analogs, completely unlearnable: I can learn arbitrary shallow unitaries, but if I consider classical Boolean circuits instead, those are not learnable when the data is classical, when the input is classical, bit strings to bit strings. So there's an interesting mismatch that I don't fully understand. But there are many of these algorithms which learn unitaries: for a particular set of unitaries, there's a setting where you prepare some initial states, you measure something at the end, and you figure out what the unitary was. Give me any such algorithm, and I will construct for you a learning problem where the data is classical, the label is classical, and there's a provable learning separation. So it's a meta-statement: for every coherent algorithm for learning unitaries, there exists a learning problem with classical data and classical outputs. And the construction is not super complicated. Essentially, it has a part which does something computationally hard; then there's a part which simulates the preparation of the probe states, arbitrary ones, possibly ridiculously slowly; and there's a part which effectively learns the unitary from classical outcomes. Because, ultimately, yes, their algorithm is one where you put in quantum states and read out classical things, but I know which quantum states (I have their classical descriptions), and the outcomes of the quantum measurements are classical outcomes, so I have classical information. Ultimately that's the only thing we ever work with: we don't really deal with quantum states, we deal with their descriptions. So this I can embed in here and get a learning separation.

Now, just to summarize where we were and where we are now (this was something like a two-year research program): we were in a situation where we had zero evidence of a learning separation for any quantum mechanical problem. The only separations we had were for factoring-type things, critically using mathematical properties that we know do not hold for quantum functions. This we overcame by realizing that stronger complexity-theoretic assumptions can be used, like this idea of using P/poly as the thing we believe quantum computation is not contained in.
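Stated a little more formally, the key step can be summarized roughly as follows (my paraphrase of the lemma mentioned earlier; the precise statement, involving the heuristic classes, is in the papers):

```latex
% Roughly: if BQP-complete functions do not admit polynomial-size classical circuits,
% then no efficient classical learner can satisfy the "evaluate" definition for a
% concept class containing them, no matter how much data it is given, because it
% cannot even output a classically evaluatable hypothesis of the required form.
\mathrm{BQP} \not\subseteq \mathrm{P/poly}
  \;\Longrightarrow\;
  \text{BQP-complete concept classes are not efficiently classically learnable, even given data.}
```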
Then we looked at the types of separations we get, and we asked: OK, but when can a quantum computer actually learn something interesting which satisfies these conditions? We had the abstract statement, and now we have these examples, which end up being examples about learning observables in very general settings, settings where learning observables is classically intractable but you can do it quantumly. I think I have a table here (this is from a paper that is not yet out) where we grouped everything we know so far. For time-evolution problems, there is a learning separation; for ground state problems, yes as well. If I flip around which is the label of the concept and which is the input, it turns out that version is classically easy; you can learn it, so you don't need a quantum computer. If observables are parameterized by unitaries, then there are learning separations for learning the observable, but no learning separations for Hamiltonian learning, because we have a classical algorithm for Hamiltonian learning. And for the problem of identifying the unitary which did it, we don't know; it is open whether identification is hard or not. So that's pretty much all we know.

Now I have maybe one or two more minutes for perhaps the most important slide, in case you don't already know these things: I never told you which Hamiltonians, right? Which are the Hamiltonians you get learning separations for? It turns out there are many of them. To give you some examples: for the Bose-Hubbard model, it was proven that time evolution is BQP-complete; that's a result of Childs and collaborators from around 2007. For the XY Hamiltonian on graphs, ground state problems are BQP-hard, and they can be QMA-complete. The Fermi-Hubbard model is the same story. The electronic structure problem (this is the quantum chemistry business), for a fixed basis, also turns out to be QMA-hard. In high-energy physics, many things to do with topological quantum field theories, thanks to Witten and Kitaev, who gave us the topological quantum computing model and thought about how hard it is to compute invariants of knots like the Jones polynomial. Then the 1+1-dimensional massive phi-fourth theory (that's Jordan, Preskill, and collaborators) turns out to be BQP-complete; this is computing scattering amplitudes in that model. Certain other theories too, I'm not sure, but I think they're hard as well. And certain supersymmetric quantum mechanical theories turn out to be QMA_1-hard. So there as well you can find learning problems that you cannot learn with a classical computer but can with a quantum one. Not all of them; not everything you might want to learn is necessarily hard. But what I'm claiming is: you give me your favorite Hamiltonian family, and I'll devise for you a family of learning problems where you need a quantum computer. The key catch was that we changed the complexity-theoretic assumption we believe in into something slightly stronger, but still reasonable. I will skip the consequences for NISQ; you can ask me about that if you care. So I think, at the end of the day, I'm now more than ever convinced that quantum machine learning should be working with quantum data, and that's the main thing we should be focusing on. I want to thank the collaborators with whom these results were derived.
And we have openings on exactly this topic, at postdoc and PhD level. So thank you.

Thank you for this amazing talk, very interesting. I have a question about your measurements, which you were also mentioning as classical data. Would this also include more generalized measurements, say in a larger qudit space, like POVMs, or could they somehow change the game?

I don't see how they would change the game. None of these separations rely on the quantum computer having access to measurements that a classical computer could not simulate; they rely on classical computers not being able to evaluate the time evolution of quantum systems. I would have to think about it a little more, but gut feeling: I don't think it matters for us.

OK, so it's not about that. Maybe it's also a question of what actual quantum data would be for you. Would it be the state itself?

Yes. There's a lot of work where the input is labeled quantum data: there's a psi or a rho, whatever you wish, comma, some label, and now you can ask whether a quantum algorithm can learn some property of whatever generated the quantum state, or rather the correlation between the quantum state and the label; you could be learning observables again, for instance. But the catch is that this is ill-defined for a classical algorithm: what does it mean to feed a quantum state into a classical algorithm? I need to invent an interface, and at that point I've limited the classical computer. If you do that, you can do much more. I don't care, I challenge you, invent whatever interface you like; you need a fully quantum thing. We did study this a little bit. But here the story is: no, the measurement was fixed, done, it's over, everything is classical, it's fair game. And this is why all the separations we get are contingent on complexity theory; they cannot be unconditional, which is something you can get for quantum data inputs.

Any more questions? Yes, thank you for the very nice talk. I have just one question regarding the Hamiltonian learning, because when you say that regarding quantum chemistry, that's something I have a hard time imagining: what do you mean by learning that Hamiltonian?

I didn't quite say that. This was not about a Hamiltonian learning problem per se; it was referring to the fact that here, the experiment involves something to do with a hard Hamiltonian. I'm not learning the Hamiltonian; it could be specifying the experiment itself. And I didn't tell you what it means to be a hard Hamiltonian, right? There are various flavors of this, but for certain Hamiltonian families you can show that computing certain things they do is hard, like computing the time evolution of a complicated enough Hamiltonian. What do I mean by computing the time evolution? I tell you the Hamiltonian, I tell you the input state is all zeros, and I ask you to compute for me the expectation value of a Pauli observable on the first qubit. That is hard, generically, if your Hamiltonian is the Bose-Hubbard Hamiltonian. Similarly, for these other fermionic Hamiltonians, computing properties of ground states is going to be hard for a classical computer: I tell you the Hamiltonian and I ask, what is the ground state energy, for instance? The ground state energy in particular is going to be QMA-hard.
So this is what I mean by a hard Hamiltonian: something about it is hard to compute. It's not about learning it; learning it is always easy, we always know how to do that.

Yeah, thanks for the inspiring talk. About this quantum data: you're looking at classical algorithms, and do you have an intuition for what would change if I now have quantum data? Say I prepare Bose-Hubbard in cold atoms, and either I take my measurements and put them into a classical algorithm, or I do controlled rotations on my state and manipulate it as a quantum algorithm. Do you think that would potentially be more powerful, or can you say how the two compare?

We have a paper called Limitations of Measure-First Protocols, which gives a proof that there are things you cannot do without fully quantum machine learning. There cannot exist a universal measurement protocol, independent of the concept but specific to the concept class, such that you do this measurement on your data and then classically post-process it, even without computational assumptions. Even if this classical processing were allowed exponential time, you still lose too much information through the act of measuring. So we have a paper out proving that separation; it goes back to certain communication-complexity lower bounds: you cannot communicate a quantum state with a polynomial amount of classical information. We can talk about it, it's a very interesting question.

In that case, let's thank the speaker again.