Okay, recording in progress. Go ahead. Hi, my name is Jon Nelson, and today I'll be presenting on how to use D-Wave's quantum annealing hardware to perform high-quality Gibbs sampling. This is joint work with Marc Vuffray, Andrey Lokhov, Tameem Albash, and Carleton Coffrin.

Okay, so a brief introduction. Our hope is to use D-Wave to find ground states of classical Ising models. As a reminder, an Ising model is a sum of terms over spin variables that can take the value plus one or minus one: there are coupling terms between pairs of spins, and there are bias terms that push each individual spin toward up or down. To find ground states, we start in the ground state of a Hamiltonian that is easy to prepare, and then we interpolate between that Hamiltonian and the Ising Hamiltonian whose ground state we want to find. If we anneal slowly enough, we hope to stay in the instantaneous ground state and eventually end up in the ground state of this latter Hamiltonian. In reality, we don't actually find the ground state every time: because we only have so much time to perform the anneal, there is some thermal excitation, so we produce some higher-energy states, and there are also errors. So in general we end up with a distribution over low-energy states of the Ising model. One question is what type of distribution this is; we'd like it to be Gibbs-like, or a Gibbs distribution. That would be really nice, because there are lots of practical applications for Gibbs sampling in general: it has applications in simulating physics and sampling from the thermal states of systems at a fixed temperature, and it is also a subroutine in many machine learning algorithms. As a reminder, in the Gibbs distribution the probability mass of each state depends on its energy, with lower-energy states weighted higher.
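As a concrete illustration of the Ising model just described, here is a minimal energy function. The variable names and the sign convention are my own choices for this sketch, not taken from the talk:

```python
import numpy as np

def ising_energy(spins, J, h):
    """Energy of a classical Ising configuration:
    E(s) = sum_{(i,j)} J_ij * s_i * s_j + sum_i h_i * s_i,
    with each spin s_i in {+1, -1}."""
    e = sum(Jij * spins[i] * spins[j] for (i, j), Jij in J.items())
    return e + float(np.dot(h, spins))

# Two spins with one coupling and no bias fields (a toy instance):
J = {(0, 1): -1.0}   # with this sign convention, J < 0 favors aligned spins
h = np.zeros(2)
print(ising_energy(np.array([1, 1]), J, h))    # aligned pair: -1.0
print(ising_energy(np.array([1, -1]), J, h))   # anti-aligned pair: 1.0
```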
The mass decays exponentially as the energy increases. There is also this beta term, which is the inverse of the temperature: if you have zero temperature, beta is infinite, and you basically sample uniformly from the ground states. But there are some challenges that have been observed in previous work. One is that D-Wave does not actually sample uniformly from states with the same energy; for instance, it doesn't sample uniformly from the ground states, at least under certain conditions. This is often explained by a freeze-out argument, which postulates that at some point in the anneal the dynamics get stuck and the anneal doesn't actually finish. So there is some residual transverse field, that is, these Pauli-X terms that remain after the anneal, and this introduces biases toward some ground states over others. Clearly that is not very Gibbs-like, but there is another source of error, which appears if you scale down the interaction strength. By interaction strength I mean the magnitude of the couplings: we always take the magnitudes of the J and h values to be constant throughout the entire Hamiltonian, and what changes is the sign, which can be positive or negative. Since we keep those magnitudes constant, I refer to their common value as the interaction strength. When it is small, it has been shown that the machine does not suffer from this issue of non-uniform sampling from states with the same energy, or from the residual transverse field, but it does suffer from noise. This noise can make the distribution significantly different from the original distribution we want to sample from, and that was observed by my co-authors a couple of years ago.
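The Gibbs distribution and the role of beta can be sketched numerically; in particular, the zero-temperature limit concentrating uniformly on the degenerate ground states is visible directly. This is a generic illustration, not code from the talk:

```python
import numpy as np

def gibbs_distribution(energies, beta):
    """p(s) proportional to exp(-beta * E(s)); beta is the inverse temperature."""
    w = np.exp(-beta * (energies - energies.min()))  # shift for numerical stability
    return w / w.sum()

# Four states, two of them degenerate ground states at energy -1:
energies = np.array([-1.0, 1.0, 1.0, -1.0])
print(gibbs_distribution(energies, beta=1.0))   # lower-energy states weighted higher
print(gibbs_distribution(energies, beta=50.0))  # ~uniform over the two ground states
```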
These two sources of error inspired our work, in which we argue that both sources of distortion can be avoided by carefully tuning the interaction strength. That comes from the intuition that the residual transverse field dominates when the interaction strength is large, and the flux noise dominates when the interaction strength is low, so there is a sweet spot in between. The number is specific to the D-Wave machine, but it is around 0.2 to 0.4, where the full range is between zero and one. In this regime we observed that both sources of distortion are significantly mitigated. We then do extensive experiments to evaluate how well we can perform Gibbs sampling for randomly generated Ising model instances, and we also show how you can tune the temperature of the Gibbs distribution you want to sample from by tuning parameters on D-Wave.

First, I want to present some evidence for this intuition about the two different sources of error and how each dominates in either the small or the large interaction strength regime. To do this, I'll present a toy model, and the main results will come after. For this toy model, we have a three-spin system where spins one and two are coupled and spins two and three are coupled. We collected many samples from this model using the quantum annealing process on D-Wave, obtaining spin configurations. Then, using the spin statistics, we reconstruct each coupling term: the coupling between spins one and two, between two and three, and also between one and three, which is not present in the original Hamiltonian.
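The first step of that toy-model analysis, estimating spin statistics from samples, can be sketched as follows. Note that the talk uses a full Hamiltonian learning algorithm to recover coupling values; the sketch below only computes raw two-point statistics, and the Gibbs-sampled stand-in for the annealer output is my own construction:

```python
import numpy as np

def pairwise_correlations(samples):
    """Empirical two-point statistics <s_i s_j> from an array of +/-1
    samples with shape (num_samples, num_spins)."""
    return samples.T @ samples / samples.shape[0]

# Stand-in for annealer output: exact Gibbs samples of the 3-spin chain
# H = J*(s1*s2 + s2*s3), drawn by enumerating all 8 configurations.
rng = np.random.default_rng(0)
configs = np.array([[s1, s2, s3]
                    for s1 in (-1, 1) for s2 in (-1, 1) for s3 in (-1, 1)])
J, beta = -0.5, 1.0
E = J * (configs[:, 0] * configs[:, 1] + configs[:, 1] * configs[:, 2])
p = np.exp(-beta * (E - E.min()))
p /= p.sum()
samples = configs[rng.choice(len(configs), size=100_000, p=p)]
C = pairwise_correlations(samples)
# Note: C[0, 2] is nonzero even for an ideal Gibbs sampler, because spins
# 1 and 3 are correlated indirectly through spin 2; the Hamiltonian
# learning step in the talk is what distinguishes such indirect
# correlations from a genuine spurious 1-3 coupling in the model.
```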
We do this with a Hamiltonian learning algorithm. Finally, we reproduce this experiment, but instead of sampling from D-Wave, we sample from a model of the Hamiltonian that includes a transverse field term and a noise term, and we compare how well the model agrees with the actual data. The things to note are that the transverse field term depends on the interaction strength: it scales with it, so it gets much stronger as the interaction strength gets larger, which captures the idea I mentioned earlier. The flux noise, on the other hand, does not scale with interaction strength; in fact, we take its parameter to be quite small, so it becomes insignificant when J gets large. We then sample from this noise-averaged thermal Gibbs state and do the same Hamiltonian reconstruction. Here is the data, and we see very good agreement. The main thing to note is that the dominant source of distortion is this coupling between spins one and three, which is not actually present in the original Hamiltonian; that is what makes our output statistics fail to match the desired Gibbs distribution. You can see this spurious coupling become much larger as the interaction strength grows. It is hard to see, but it also becomes negative when you go below 0.25, while it is around zero in the 0.25 regime, so we designate that as our sweet spot. Also, just by experimenting with this model, we can see that the main feature of the noise is this negative spurious coupling value at low J, while the distortion at high J can be attributed to the transverse field term that scales with J. Okay.
So now, on to the main experiments. We generate a bunch of 16-spin Ising models with various ground state degeneracies. For each of these, we redo the experiment at various interaction strengths. Remember, we keep the magnitudes of J and h constant within each model, and we sweep the interaction strength from slightly above zero up to one. We also redo the experiment for different annealing times, and for each configuration we take about a million samples from D-Wave. Then we compare the output distribution to the Gibbs distribution. We do this by classically computing the exact distribution by brute force, which works because we only have 16 spins; we tried to take as big a model as we could that can still be simulated classically relatively easily by brute force. Then we compute the total variation distance between the exact Gibbs distribution and the D-Wave distribution; this metric is half the sum, over output spin configurations, of the absolute differences between the two probabilities. Here is an example for one of these randomly generated Ising models, showing all the different annealing times. The punchline is this: alpha denotes the interaction strength, and for low alpha we see a big divergence of the D-Wave statistics from the Gibbs distribution; for large alpha we see something similar; but as long as alpha sits in the sweet spot, which agrees with our original three-spin toy model experiment, the distribution matches the Gibbs distribution very well. We designate 5% total variation as the accuracy threshold. It is also interesting that varying the annealing time doesn't really change where the sweet spot regime is.
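The comparison pipeline described here, a brute-force exact Gibbs distribution plus the total variation distance, can be sketched as follows. This is my own illustrative code on a tiny instance; at 16 spins the same enumeration covers 2^16 = 65,536 configurations:

```python
import itertools
import numpy as np

def exact_gibbs(J, h, beta, n):
    """Brute-force Gibbs distribution over all 2**n spin configurations."""
    configs = np.array(list(itertools.product([-1, 1], repeat=n)))
    E = np.array([sum(Jij * c[i] * c[j] for (i, j), Jij in J.items())
                  + float(np.dot(h, c)) for c in configs])
    w = np.exp(-beta * (E - E.min()))
    return configs, w / w.sum()

def total_variation(p, q):
    """TV distance: half the L1 difference between two distributions."""
    return 0.5 * np.abs(np.asarray(p) - np.asarray(q)).sum()

# Compare an exact distribution against a hypothetical empirical histogram:
configs, p = exact_gibbs({(0, 1): -1.0, (1, 2): -1.0}, np.zeros(3), beta=1.0, n=3)
q = np.full(len(p), 1.0 / len(p))   # e.g. a (bad) uniform sampler
print(total_variation(p, q))        # large enough to fail a 5% threshold
```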
Here are some more results, for various different randomly generated instances. By random, I just mean we include every coupling term available in the D-Wave architecture and randomly choose each sign to be positive or negative. The plots on the left don't include a field bias, and the plots on the right include a single-spin field bias. We see that we consistently get very good agreement with the Gibbs distribution in the regime between 0.2 and 0.4.

It is also important to be able to tune the temperature of the Gibbs distribution we are sampling from. We can do this in two ways. One, we can vary the interaction strength between 0.2 and 0.4; it is a fairly narrow regime we are trying to hit, but there is still some room to vary. Two, we can vary the annealing time, which also affects the temperature you sample from. By varying these together, you can cover a pretty consistent range of temperatures.

So, in conclusion, we analyzed major sources of error when trying to perform Gibbs sampling on quantum annealing hardware, and we identified a specific scaling regime that you can use to really mitigate these errors. We showed this for Gibbs sampling, but in general these errors, the noise and the residual transverse field, are not desirable, so this serves as a general piece of advice if you are going to use quantum annealing hardware: don't push your interaction strength all the way to one or max it out; you really want to be in this middle-ground area. We then showed that by specifically tuning the energy scale of the Hamiltonians, we can improve the Gibbs sampling performance.
And we also showed that it is possible to tune the specific temperature you are sampling from. All right, thank you, and I'll take any questions.

Questions? Yes: how did you fix the temperature?

As in, how do we compute the temperature, or how do we adjust it?

How do you know the machine matches the exact Gibbs distribution at that temperature?

Okay, yeah, I can explain that a little more in depth; I did brush over that part. First, we expect that scaling the energy scale of the input problems effectively changes the temperature of the system. The same goes for the annealing time: longer annealing times translate to finding the ground state more often, which is essentially similar to having a lower temperature in the system. But there is also this question, which I glossed over, of how we actually know what temperature we are sampling from. That is calculated as follows: when I say we classically compute the exact Gibbs distribution by brute force, we actually calculate this classical Gibbs distribution for many different temperatures, and then we see which Gibbs distribution matches D-Wave best. Through this process, we are implicitly inferring the effective temperature of D-Wave, so we have to do that step as well to actually see what temperature we are sampling from. Yeah, thank you.

I think there is a question in the chat; maybe you could read it and answer it.

So it says: does this tuning affect both the Gibbs distribution of energies and the uniform distribution of samples at fixed energy? I guess the answer is both.
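The effective-temperature inference described in that answer, computing the exact Gibbs distribution over a grid of temperatures and picking the best match to the hardware statistics, might be sketched like this. The grid and the use of total variation as the matching criterion are my assumptions for illustration:

```python
import numpy as np

def infer_effective_beta(empirical_p, energies, betas):
    """Return the inverse temperature whose exact Gibbs distribution is
    closest (in total variation) to the observed distribution."""
    best_beta, best_tv = None, np.inf
    for beta in betas:
        w = np.exp(-beta * (energies - energies.min()))
        gibbs = w / w.sum()
        tv = 0.5 * np.abs(gibbs - empirical_p).sum()
        if tv < best_tv:
            best_beta, best_tv = beta, tv
    return best_beta, best_tv

# Sanity check: data generated at beta = 2 should be matched near beta = 2.
energies = np.array([-2.0, -1.0, 0.0, 1.0])
w = np.exp(-2.0 * (energies - energies.min()))
target = w / w.sum()
beta_hat, tv = infer_effective_beta(target, energies, np.linspace(0.5, 4.0, 36))
print(beta_hat)  # close to 2.0
```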
A large part of sampling from the Gibbs distribution correctly is being able to sample uniformly from the low-energy states, because all the probability mass is on the low-energy states and it decays exponentially after that. So you really have to get those right. Especially if you mess up the ground states and don't sample uniformly from them, that will be the dominant source of discrepancy between your distribution and the Gibbs distribution. So I would say that one of the main purposes of tuning this energy scale is to make it so that these low-energy states are sampled uniformly. So I guess my answer is yes to that. Any other questions? Questions online?

Okay, I have a question. Do you think your technique is scalable? I see that the temperature I can sample from depends on the annealing time, so I guess if I increase the system size, I probably need to increase the annealing time as well, so maybe there will be a system size where things break down.

Yeah, that is definitely something we would like to test experimentally, how it scales; it is definitely a future direction of this work. In general, we showed from these very small systems, like the three-spin system, the presence of these distortions when you increase the interaction strength, and we see that they still persist when you go from three spins to 16 spins. So I think that in general it is not a good idea to really blast up the interaction strength; there does seem to be something going on where that exacerbates the effect of this freeze-out, this residual transverse field that stays there.
So I think it would scale in the sense that having too low an interaction strength always makes you susceptible to noise, and having it too high makes the residual transverse field issue worse, so I think that regime will always stay consistent. How good a Gibbs sampler it is for bigger systems remains to be seen. Most likely the performance would degrade at larger system sizes, but the general takeaway message, trying to operate within this sweet spot regime, would I think stay consistent.

Thanks. Well, if there are no other questions, let's thank the speaker again. Thank you.