All right, thanks for being here at the last session of the day. It's a good turnout for the last session of the day. Hope everybody had a good QCon. So we're going to talk about middleware for quantum: how we enable advanced quantum computing workloads on Kubernetes. I'm Paul, I work at IBM in serverless, so things like Knative, and also on quantum, the middleware we're going to talk about today. And my colleague. Yeah, I'm David, a senior software developer at IBM as well, a developer expert in web technologies and cloud, and yeah, that's me. All right, so we're going to cover a couple of things today. First, a brief introduction to quantum computing: what it is, why it matters, why it's important. Then we'll talk about our middleware, then we'll run an example application and leave a little bit of time for Q&A at the end. Okay, so introduction to quantum. What is quantum computing? One of the places I always like to start in talking about quantum is not so much what it is but why it's important. If you remember back to algorithms classes, you looked at easy problems and hard problems, and we divided algorithms into things we could do efficiently on classical computers, the easy problems, things that ran in polynomial time, and the hard problems, things that ran in exponential time, which we couldn't solve efficiently on classical computers. What quantum computing does is create a new class of problems, something we call quantum-easy problems: problems that can be efficiently solved on quantum computers. Now, it's not all of the hard problems, but it's some of them. So two things to take away from that. One is that classical computers aren't going to be replaced by quantum computers. To give an example, a classical computer can't factor numbers very fast, but it can multiply them really fast. You can multiply two numbers on a classical computer in milliseconds. 
A quantum computer takes a minute or two to multiply two big numbers. It's very slow at multiplying numbers, but factoring them, that it can do quickly. So again, key takeaway one: it's classical and quantum, not replacing classical with quantum. The second is that, you notice, it's only some problems. To be able to use a quantum computer, we need an efficient quantum algorithm, an algorithm that can take advantage of how quantum computers work. So what does that mean? Where does the speedup come from? It comes from the fact that quantum computers use something called a qubit to represent information. On a classical computer we have bits: zero or one. A qubit can also be zero or one, but there's another thing: it can be zero and one. It can be a linear combination of zero and one, something we call superposition; it can represent all the different combinations of those two choices. What that means is that you can represent more logical states with qubits than with regular bits. To give a sense of the scale: with classical bits, the growth this way is linear, two, four, six, eight. If you have 10 bits, you can represent 20 states like that; with 100, you can represent 200. With qubits, you get 2 to the n. It's exponential growth: two, four, eight; 10 qubits is 1,024, and 100 qubits is about 1 times 10 to the 30th. That's a one followed by 30 zeros. I think, correct me if I'm wrong, that's more than the number of stars in the observable universe. So, big numbers with just 100 qubits. These qubits hold our superposition, one and zero, all the different combinations, and we chain them together. But once we put zero and one in superposition, we have to have a way to go from that intermingling of states down to a single solution, and that's where our quantum algorithm comes in. 
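As a toy illustration (ours, not from the talk) of that linear-versus-exponential contrast, here is the scaling the speakers describe, with their numbers:

```python
# Toy sketch of the scaling described above: classical bits grow linearly in
# the states you can enumerate this way, while an n-qubit register spans
# 2**n basis states in superposition.

def classical_states(n_bits: int) -> int:
    # The linear growth the speakers describe: two states per bit, added up.
    return 2 * n_bits

def quantum_states(n_qubits: int) -> int:
    # Superposition: an n-qubit register spans 2**n basis states at once.
    return 2 ** n_qubits

print(classical_states(10), quantum_states(10))  # 20 vs 1024
print(f"{quantum_states(100):.2e}")              # about 1.27e+30
```

The point is the shape of the curves, not the exact bookkeeping: doubling the qubit count squares the number of basis states, while doubling the bit count merely doubles it.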
We run operations on these qubits, and we can reduce it down to a single solution. That's the magic of quantum computing. To give you one example of this, here's Shor's algorithm for factoring large numbers. The red curve here represents what it would take on a classical computer, and it's exponential growth. The larger your number gets, the longer it takes, to the point where the largest number we've factored so far is around 250 digits, and that took a lot of compute. RSA keys today can be 4,096 bits, so classical factoring is nowhere near that. But on a quantum computer that's large enough, it's not a difficult thing to do. The magic of Shor's algorithm is that Peter Shor figured out a way to take all these possible factors and use quantum to reduce them down to the single answer. So that's why quantum computing is important, and again, we need that quantum algorithm to take advantage of it. So how do we use this? This is where we want to talk about quantum middleware a little bit. It's the blending, or the marriage, of quantum and classical together, and there are a couple of ways we can use it. The one we're going to talk about today, quantum serverless, uses classical compute to do the pre- and post-processing around our quantum operations. On the other hand, we can also use classical compute to help refine the quantum circuits we want to run. We call this the Circuit Knitting Toolbox, CKT; it's a way to use classical compute to cut our quantum circuits down to smaller sizes that fit on the hardware we have today. So with that, David is going to talk a little bit about the middleware. So, quantum middleware is the tooling we are developing at IBM Quantum right now to execute our complex quantum programs not only on our own machines but also in the cloud. 
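As an aside, the number-theoretic core of Shor's algorithm from the discussion above can be sketched classically. The only step a quantum computer accelerates is finding the period, which this toy version (ours, not from the talk) does by brute force:

```python
# Toy, classical sketch of the core of Shor's algorithm. The quantum speedup
# comes entirely from finding the period r quickly; here we find it by brute
# force, which is exactly the exponentially slow part.
from math import gcd

def find_period(a: int, n: int) -> int:
    # Smallest r > 0 with a**r % n == 1 (this is what the QPU finds fast).
    r, value = 1, a % n
    while value != 1:
        value = (value * a) % n
        r += 1
    return r

def shor_factor(n: int, a: int) -> tuple[int, int]:
    r = find_period(a, n)
    # For even r, gcd(a**(r//2) - 1, n) and gcd(a**(r//2) + 1, n)
    # yield nontrivial factors of n.
    x = pow(a, r // 2, n)
    return gcd(x - 1, n), gcd(x + 1, n)

print(shor_factor(15, 7))  # (3, 5): 7 has period 4 mod 15
```

On real key sizes, the `while` loop is the exponential bottleneck; Shor's quantum period-finding replaces exactly that loop.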
So one of the things we think about when we want to run something in the cloud is resources, right? The typical classical resources we manage when we run something in the cloud are CPU, memory, or GPU. For quantum computing it's practically the same, CPU, memory, and GPU, with the addition of the QPU. The QPU is interesting to treat as a resource because no QPU is the same as another QPU, even if the architecture is the same. The fabrication process, as happens with CPUs, for example, generates qubits with different characteristics. Depending on the quality of the process at that moment, you could have qubits with a higher or lower error rate, or even qubits where executing a gate performs differently depending on which qubit you select to execute that gate, something that is important. This diagram is the architecture of the ibm_brisbane quantum computer, if I remember correctly, with 127 qubits on the Eagle architecture, and at that IBM Quantum URL you can check the different quantum computers you have available, the architectures those quantum computers have, and the characteristics they report. So, continuing with this, Paul, maybe you can explain the workflow for the quantum framework. So, how do we build these quantum programs, or these Qiskit patterns? I just want to point out that everything up here is open source. We start with the Quantum Serverless Python library; that's part of the Qiskit ecosystem, an extension to Qiskit, but it's `pip install quantum_serverless` and you're good to go on the client side. 
So, you've got your software development setup; the next thing is you write a pattern. We're going to make a Python script that has the steps we want to follow: we write our algorithm, set our parameters, do our pre-processing, send it off to run on quantum hardware, pull the results back, and process them. That's our Qiskit pattern. Once we're ready, we send it off. Everything from the gateway onward is running in Kubernetes. We built a couple of components to help manage these workloads (we've got the link to the GitHub repo later in the slide deck). We have a gateway that handles authentication and divvying up user resources; it's backed by a database. We submit a job to the gateway, which passes it to a scheduler. The scheduler makes sure we can allocate the proper amount of resources the job needs, and once the scheduler is happy with that, it ships the job off to Kubernetes, basically spawning a new pod in Kubernetes to actually run the job using Ray. And in this case, for example, once the scheduler receives a Qiskit pattern, a job that needs to be run in the cluster, what it does, simplifying the process (it's not 100% accurate what I'm going to say now, because we do some optimizations), is this: every time we receive an execution from a user, we create a Ray cluster for that user, a Ray cluster with some configuration, as we are going to see now. The most important parts of the Ray cluster are the head node and the worker nodes; that is where we execute the different things. For the user, what we do is create a Ray cluster with a configuration similar to this one. This is exposed in the repository. 
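The configuration the speakers describe next (four CPUs and eight gigabytes per node, at most two replicas) would look roughly like this as a KubeRay `RayCluster` fragment. This is our illustrative sketch: the field names follow the KubeRay CRD, and the exact manifest in the quantum-serverless repository may differ.

```yaml
# Illustrative RayCluster fragment; values mirror the talk, layout follows
# the KubeRay CRD, and the repo's actual template may differ.
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: user-ray-cluster
spec:
  headGroupSpec:
    template:
      spec:
        containers:
          - name: ray-head
            resources:
              limits:
                cpu: "4"
                memory: 8Gi
  workerGroupSpecs:
    - groupName: workers
      replicas: 1
      maxReplicas: 2
      template:
        spec:
          containers:
            - name: ray-worker
              resources:
                limits:
                  cpu: "4"
                  memory: 8Gi
```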
So, for example, if you go into the repository and install this in your Kubernetes, you'll be able to modify the behavior of your Ray cluster, because it depends on your use case. In this example we are configuring the Ray cluster with a head node and worker nodes with four CPUs each, eight gigabytes of memory each, and a maximum of two replicas. This means a user can have a head node and a worker node running at the same time with four CPUs and eight gigabytes of memory each, which adds up to a total of eight CPUs and 16 gigabytes for your entire Ray cluster, for the workload you want to run. Maybe you're more interested in fewer CPUs or less memory per node and more replicas, if you want to parallelize more. Or, as we saw with some quantum computing clients, there are scenarios where you want bigger nodes with fewer replicas, which increases the performance of the execution. So depending on the use case, you will probably need to configure how you want the cluster to behave. But, more or less to summarize, this is the process our cluster follows when a user runs a program, a Qiskit pattern, in the cloud. So this is a typical workload when you upload something to the cluster. The entry point is the Qiskit pattern, and once the Qiskit pattern is executed, it generates different tasks. Each task could be, for example, circuit preparation, a task where we prepare the circuits to be executed on our quantum computers; and once we receive the values back, the results from that execution, we post-process those values. And this is interesting because, as Paul commented a moment ago, the quantum computer generates a lot of output. The output of a quantum execution can be on the order of gigabytes of results that we need to process. 
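Each task in that pipeline (circuit preparation, execution, post-processing) can declare its own classical or quantum resource needs. Ray, which the middleware uses under the hood, expresses this with `@ray.remote(num_cpus=..., num_gpus=..., resources={...})`; here is a tiny stand-in (our sketch, not the middleware's code) that just records such requests, to show the shape:

```python
# Sketch of giving each step of a pattern its own resource profile, in the
# style of Ray's @ray.remote(...) options. A stand-in decorator records the
# request instead of talking to a real Ray cluster.

def task(**resources):
    def wrap(fn):
        fn.resources = resources  # what the scheduler would reserve
        return fn
    return wrap

@task(num_cpus=4)                      # circuit preparation: CPU-heavy
def prepare_circuits(params): ...

@task(resources={"QPU": 1})            # execution: needs quantum hardware
def run_on_qpu(circuits): ...

@task(num_gpus=1, memory=16 * 2**30)   # post-processing: GPU plus memory
def post_process(results): ...

for step in (prepare_circuits, run_on_qpu, post_process):
    print(step.__name__, step.resources)
```

The design point is that the reservation lives with the task, not with the whole job, so preparation, execution, and post-processing each get only what they need while they run.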
So it's a very large chunk of data that we need to post-process in that task. And not only that; the interesting part of these kinds of jobs is that you are going to need to rerun them several times to converge on the value you're after. So you need to be able to apply a configuration for CPU, memory, or GPUs to each of the tasks, because every task needs a different amount of resources. For circuit preparation, for example, maybe what you need is more CPU to process everything. For the execution, what you need is a QPU to run on. And in the post-processing, maybe you're more interested in memory and GPU to process all the information, and not so much CPU. So this is the kind of scenario where we think quantum middleware is useful, and we know it is, because we are using it right now at IBM Quantum: it gives us a way to improve the performance of our executions, not only by parallelizing everything but also by dedicating to every task the resources it needs at the moment it is executed. So here is a practical example that we've run several times. This is a variational quantum eigensolver (VQE). I'll talk about this in a minute, but this is what you might use to find something like the energy of a particular state. Let's say you are building a new battery for electric vehicles. You want to make sure it's as efficient as possible; that's something that's important to all of us. We heard about sustainability earlier in the week. That's important. We want our batteries to be efficient. The way you would run this type of workload is: you set your parameters for the quantum circuit you want to run, you do your initial pre-processing, you set it up, and that's going to take a bunch of compute. 
You might need to run that on a very big cluster to build your pre-processing set; then you ship it off to quantum. And the quantum circuit may take several hours to run, may take several days, might take a week; depending on what you're running, it could take a while. So you don't want all this classical compute just sitting there; you want to shrink it down. It's serverless, right? Spin it up when you need it, shrink it down when you don't. Send it to quantum; quantum outputs results. Then you want to spin back up when those results come out and do your post-processing. But you don't just stop there. It's not "we ran it once, okay, that's the answer." The results we get feed back in, and we re-run the computation, because now we can tweak our initial parameters to optimize the system. It's that constant spinning up and spinning down of resources, in problems like this and in a lot of quantum problems, that shows the need for something like quantum serverless. So that's the theory behind it. Now let's look at a practical example. And here's where we come to the fun part. So, okay, let me see, because... whoop. Okay, so now what Paul is going to show us is the status of the cluster where we are going to run this Qiskit pattern in particular. In the first cell that Paul runs, what we do is configure the provider, to connect the Qiskit library we were talking about a moment ago with the gateway you saw, through the API we have to connect our cluster with the user interface. So if, yeah, Paul runs that, exactly, we obtain the provider. Now we are connected with the gateway, and the next step is to configure the Qiskit pattern and upload it to the object storage, database, whatever you have where you want to store the program. And, well, I don't know if we want to take a look at the Qiskit pattern. 
Just real quick before we look: can everybody read this? Somewhat okay in the back? Okay, awesome. No. Sorry. I've never used the trackpad on this Lenovo, and apparently if you click in the middle, it closes things. So bear with me one second while we reopen that. Okay, so the Qiskit pattern in this case is pretty simple. The only thing we are doing is receive a set of circuits that we are going to run in distributed tasks to calculate the quasi-distribution, which is something similar to a probability distribution. We are going to run those circuits in parallel, obtain the distribution from those circuits, and return the result. That's what this Qiskit pattern does. And, okay, we upload the... I don't know if there's something... One thing I'll note here: because we're live and we've got about 15 minutes left in this presentation, we're not going to actually run this on real quantum hardware, because we wouldn't get results back in time to show you anything cool. So we're using a simulator. The other piece of that is, I don't know if you can see the command prompts, but the Kubernetes cluster this is running on is also local. But you can run this on any cloud you want, and you can use any quantum provider you want. We use IBM Quantum, but if there's another provider you want to use, it's open source; you can use it how you want. Yeah, essentially, as long as you have a Kubernetes cluster, you're fine. So, we uploaded... Yes. Yes, there would be, there's a queue. So the job may not have had time to run, might not have gotten a chance to run, before we came back. Okay, so the third cell that we are running is one of the tasks that we mentioned, right? We are preparing the circuits to send to the Qiskit pattern. So here we have the three circuits that we prepared. If you want to take a look at what those look like... There's one of the circuits, so... Circuit two. 
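For reference, here is our toy sketch (not part of the demo notebook) of the post-processing step the pattern performs: in the simplest case a quasi-distribution is just measurement counts normalized into probabilities. Real Qiskit quasi-distributions can additionally carry small negative quasi-probabilities after error mitigation, which this sketch ignores.

```python
# Toy sketch of the demo pattern's post-processing: turn raw measurement
# counts into a distribution over bitstrings. A true quasi-distribution may
# hold small negative values after mitigation; here we only normalize.

def quasi_distribution(counts: dict[str, int]) -> dict[str, float]:
    shots = sum(counts.values())
    return {bitstring: c / shots for bitstring, c in counts.items()}

# e.g. a Bell-state circuit measured 1024 times:
print(quasi_distribution({"00": 520, "11": 504}))
# {'00': 0.5078125, '11': 0.4921875}
```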
And this is just using the regular Qiskit software development kit to build these circuits and then visualize them. And we just run the Qiskit pattern with the configuration that we want; we send the circuits to the Qiskit pattern. And the job is sent. As you saw, right at the top, we create the Ray cluster. What the Ray cluster is going to do is create the head and create the worker, because that is the maximum number of nodes we set up for the cluster. The job is created, and we need to wait until it finishes. How do we know it finished? Because, for example, in the pods right now we can see that it's running, and once it finishes execution, we are going to shut down the nodes. This is the risky part of the last day of the conference. We succeeded. Yeah, exactly. So, as you see, the nodes are terminating. We are removing all the user's data from the Ray cluster because the user doesn't need resources anymore, so we remove the resources from the user. And here we have the results, okay? So, yeah, I think the demo is done. I don't know if we have something more to explain. Oh, yeah. This is yours. Yep. So, just to reiterate: that demo showed how to run a quantum workload and how to orchestrate it with classical compute. And again, the key takeaway from all of this is that it's not quantum replacing classical; it's classical and quantum together. Think about quantum as just another type of resource you can use to solve the right kind of problem. Some problems run great on an HPC system. Some problems run great in Kubernetes. Some problems need GPUs. Some problems are going to need QPUs, quantum processing units. That's the way to think about quantum, and about how we can use Kubernetes and open-source technologies to help orchestrate those workloads. 
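Looping back to the VQE workload described before the demo, the spin-up, run, post-process, feed-back loop has this shape. This is our toy sketch, with a stand-in energy function instead of a real circuit execution; the loop structure, not the stand-in, is the point.

```python
# Toy sketch of the variational loop: classical compute sets parameters, a
# (here simulated) quantum evaluation returns an energy, and the result
# feeds back into the next iteration's parameters.

def quantum_energy(theta: float) -> float:
    # Stand-in for running the parameterized circuit on a QPU.
    return (theta - 1.0) ** 2

def vqe_loop(theta: float, steps: int = 50, lr: float = 0.1) -> float:
    for _ in range(steps):
        # Two "quantum" evaluations give a finite-difference gradient...
        grad = (quantum_energy(theta + 1e-3) - quantum_energy(theta - 1e-3)) / 2e-3
        # ...and classical post-processing feeds it back into the parameters.
        theta -= lr * grad
    return theta

print(round(vqe_loop(5.0), 3))  # converges to ~1.0, the minimum
```

In the real workload each `quantum_energy` call is a hardware job that can queue for hours, which is why the classical side spins down to nothing between iterations.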
Something that, for example, at IBM Quantum we are trying to achieve right now is utility for quantum computers, which is nothing more than the quantum computers being useful for users, right? So, with that, we've got a little bit of time for questions. I'll just throw this up here. If you would like to learn more, the leftmost link goes to some of our Qiskit learning, or IBM Quantum Learning. There are textbooks there where you can learn more about quantum computing. There's also a course that Dr. Max and I and others helped put together where you can learn about quantum-safe cryptography, how we can make our technology safe from the threat of quantum computers. In the middle is the Qiskit SDK; that's software for writing quantum programs. It's Python, so it's nice and easy. And then the rightmost is the Quantum Serverless repo; that's the open-source tooling we showed today that enables you to do this orchestration. Yeah, we are not salesmen, so we are not here to sell you anything. Everything you saw is working; we are using it, and it's in the repository, open source. Yeah, I have a question. Let's go back to the opening part where you were talking about qubits. My understanding is that a challenge is that a physical qubit is very prone to noise and errors, and you need to combine them to make a logical qubit that's error corrected. And there's a lot of potential for some AI. So I was curious if you have any thoughts on how many physical qubits we will need for things to become more feasible for AI applications, and about when you think we'll hit that? I mean, for AI specifically, I don't know, to be honest. It depends on the AI algorithm you want to run. Right now, for example, the quantum computers we have open to users have 127 qubits, physical qubits, real qubits, and the demonstrations we can do with that are limited. 
We need to wait to have more qubits in general, but it's true that with that architecture, for example, we published a paper this year where we demonstrated the utility of the quantum computer with the execution of an algorithm, showing that it could achieve better performance with the execution of that method. But, yeah, for AI in general, it will take time. At the same time, for example, Max and Paul gave a really useful talk about quantum-safe cryptography, and they were saying that, obviously, it's an area where we need to work, we need to investigate, because the problem is there, but the number of qubits it's going to take to break the cryptography we have right now, that's not tomorrow; it will take time. And in terms of numbers of qubits, Peter Shor gave a talk about a month ago; it was a retrospective of sorts on Shor's algorithm for factoring. In that talk there was a chart that listed eight or nine key quantum algorithms and the numbers of qubits required, numbers of logical qubits required. I mention it because I don't remember offhand what the numbers were, but there is some information out there on the numbers of logical qubits needed for some of these types of algorithms. Something interesting that we didn't explain in this talk, for example, is that even if an algorithm exists in classical computing, it doesn't mean you can apply that algorithm in quantum computing. Designing an algorithm for quantum computing is very complex due to the nature of quantum computing. So even if you have, for example, an AI algorithm that works in classical computing, it doesn't mean it will work for quantum computing; it will probably need some kind of rework to be able to run on a quantum computer. 
The question I have is regarding stability. A lot of times, and I'm not too familiar with the QPU service you were using there, but a lot of times when running quantum workloads you can get into unexpected decoherence events, things where your circuit just kind of breaks apart, because that's just how it happens sometimes. I was curious how your framework handles the scheduling of those sorts of failures, whether you have a concept of retrying, and whether you can detect that. Thanks. Yeah, one of the things that we've built in, so we use Ray as the orchestrator under the hood, which does some retrying in there, but we've also implemented some retries ourselves, so if results fail, we will retry. We're still learning, but we are working on ways to handle retries so that when things fail, because it's running on Kubernetes, you know, sometimes networks flap, you have to build in resiliency, so it's something we're working on, but it's a work in progress. In general, for example, well, I didn't specify it, but obviously the quantum middleware tool we are developing is not tied specifically to quantum; you could perfectly well run classical methods or classical programs in this tool. The thing is, at the end of the day, errors happen, obviously, and one of the things we looked at when we were working on resource management in general is, for example, the case where your cluster has all its resources dedicated to a task, and those kinds of things; initially, the task waits until more resources are available. So obviously we configure a timeout; depending on your use case the timeout could be greater or smaller, but yeah, that's the kind of stability that we look out for. All right, thank you. 
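Schematically, the retry behavior described in that answer looks like the following. This is our sketch of the general idea, not the middleware's actual implementation (which lives in the quantum-serverless repository):

```python
# Tiny sketch of the retry idea discussed above: re-run a flaky task a
# bounded number of times before giving up, as the middleware does when a
# job's results fail to come back.
import time

def with_retries(fn, attempts: int = 3, backoff: float = 0.0):
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as err:                  # e.g. a dropped connection
            last_error = err
            time.sleep(backoff * (2 ** attempt))  # exponential backoff
    raise last_error

calls = {"n": 0}
def flaky():
    # Fails twice (a stand-in for transient network or hardware errors),
    # then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "results"

print(with_retries(flaky))  # succeeds on the third attempt
```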
About a new circuit, and then how long does running it actually take as well? Like, how many jobs per minute, day, year do you get through per quantum computer? You've got 127 qubits, but how many jobs, how long does it take for a job to get through, and then the next job to go? What do you mean, configure a QPU? What do you mean exactly? So, like, in the middleware, if I'm a user and I upload my circuit and I'm ready for it to be run on hardware, once it's queued up and it's the next one to go, how long does it take to configure the hardware to run my circuit, get an answer, and get back to me, the next person to go? Yeah, I mean, for quantum computers in general, something that is in play, in a part we are working on right now, what you can currently select is the quantum computer where you want the program to be executed; for example, if you want a specific computer to run the code because you know the qubits in that quantum computer handle your circuit's case better than others. And the time it takes is variable, honestly. It depends on the status of the queue at that time. For example, it might happen that on the day you run your job, there is a university or a company running, I don't know, 100 jobs... but yeah, it's not immediate, it's not immediate. So, there are no more questions. Thanks, everyone, for coming. Hope you had a great KubeCon and safe travels home. Thank you, everybody.