I'm a PhD student at Qilimanjaro Quantum Tech in Barcelona, and I'll be talking about a quantum annealing algorithm for the qubit allocation problem. This is joint work with Artur García Sáez and Marta Estarellas. First, I don't know how many of you are familiar with the qubit allocation problem; I'm guessing many of you. It's also known as the qubit routing problem, and it comes up in the compilation process of gate-based quantum computing. We have a user that defines a quantum circuit that they want to run on some gate-based hardware. They define this circuit on algorithmic qubits, which are logical qubits, and the circuit consists of an instruction set: an ordered list of links to be performed at each layer of the circuit. Here we are forgetting about the single-qubit gates, since we are dealing with connectivity issues, as you will see in a moment, and we are only considering the two-qubit gates of the circuit sent by the user. So we have this instruction set gamma, this ordered list of links to be performed at each layer. On the other hand, we have our hardware connectivity, which is not all-to-all but limited, as you can see in this simple example omega here. And the question comes: I do not have all-to-all connectivity in my device, but most likely the gate-based circuit I have just received will require it. How do I get around this? This is a problem that the devices we have today already face, obviously. The way one goes about it is to introduce extra gates that let us implement this connectivity at the cost of a deeper circuit, with all that entails. We have two types of transformations in order to do this.
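As a minimal sketch of the setting, assuming a hypothetical representation (names and data layout are mine, not from the talk) where the instruction set is a list of layers of logical-qubit pairs and the hardware connectivity is a set of physical couplings, one can check whether a given configuration directly satisfies a layer:

```python
# Hypothetical sketch: check whether a circuit layer's two-qubit gates
# can run directly under a given hardware coupling map.

def layer_satisfied(layer, coupling, mapping):
    """layer: list of (logical_a, logical_b) gates in this layer.
    coupling: set of frozenset physical-qubit pairs the device supports.
    mapping: dict logical qubit -> physical qubit (the configuration)."""
    return all(
        frozenset((mapping[a], mapping[b])) in coupling
        for a, b in layer
    )

# Toy example: a 4-qubit line device 0-1-2-3.
coupling = {frozenset(p) for p in [(0, 1), (1, 2), (2, 3)]}
mapping = {0: 0, 1: 1, 2: 2, 3: 3}          # identity allocation
print(layer_satisfied([(0, 1), (2, 3)], coupling, mapping))  # True
print(layer_satisfied([(0, 3)], coupling, mapping))          # False: too far apart
```

When a layer fails this check, the compiler has to insert the SWAP or bridge transformations discussed next.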
We have the possibility of introducing SWAP gates, which let the algorithmic qubits migrate across their physical allocations in the hardware device; that way we can bring closer together the two qubits that we need to interact. And we also have the possibility of implementing a bridge gate, which allows us to connect two qubits separated by a single intermediate qubit; in this case the algorithmic qubits stay put in their respective physical sites. As I already said, introducing further gates deepens our circuit. You know the drill: more layers, longer computation, increased error. We don't like this, so we would like to find the optimal solution to this problem. What you see on the upper right is just a good solution to the toy problem we were using to explain this. The way we usually go about this now is by heuristics: qubit allocation is an NP-complete problem, so this is basically the way to go. These heuristics usually consist of two parts. You need to find an initial allocation, and from it you build a set of transformations that you propagate forward, in the hopes of it having a low cost at the end, such that the connectivity required at each layer of the circuit is satisfied. Let me stress here that the actual solution to a qubit allocation instance is this initial algorithmic-to-physical qubit mapping plus the set of transformations derived from it, such that the connectivity is satisfied at all times. What we will actually concern ourselves with is finding the initial allocation; this initial allocation is usually found by heuristic methods as well.
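To make the difference between the two transformation types concrete, here is a toy sketch (the function names and data layout are my own, not the talk's) of how a SWAP updates the configuration, whereas a bridge would leave the mapping unchanged:

```python
def apply_swap(mapping, p, q):
    """Apply a hardware SWAP on physical sites p, q: the logical qubits
    allocated there exchange places. A bridge gate, by contrast, would
    leave the mapping untouched."""
    inv = {phys: log for log, phys in mapping.items()}  # physical -> logical
    new = dict(mapping)
    if p in inv:
        new[inv[p]] = q
    if q in inv:
        new[inv[q]] = p
    return new

mapping = {0: 0, 1: 1, 2: 2, 3: 3}
print(apply_swap(mapping, 1, 2))  # {0: 0, 1: 2, 2: 1, 3: 3}
```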
What we are proposing is an algorithm to find this initial allocation such that it belongs to the truly optimal path, which is what I have here. Let me define a bit of terminology. We are going to find the final (initial, I'll explain that in a second) configuration corresponding to the optimal path. Every time I say configuration, I actually mean an algorithmic-to-physical qubit mapping, which is really long to pronounce, so I'll just use configuration. And when I say path, I mean the ordered set of configurations corresponding to the optimal sequence of transformations, which is the actual solution to the qubit allocation problem. By providing this starting point for the heuristic algorithms that build up the set of transformations, we are providing the possibility of actually attaining the optimal solution, which is otherwise not guaranteed at all, and also enhancing the probability of finding better solutions in general. This problem is highly dependent on the initial mapping that you choose. The general idea of our algorithm is inspired by Aharonov's construction in the proof of universality of adiabatic quantum computation, which most of you may have read at some point, but I will quickly review it to refresh our memories and for those who haven't. This proof is based on the history state built by Kitaev previously, which contains the states of the gate-based quantum computation at all times of the computation. How is this done? The qubit register is separated into two parts: a clock register, and an algorithmic register that actually contains the quantum state that would be present in the gate-based system.
The clock register uses a unary encoding to encode the circuit layers that we are going to be considering, and the history state is an equal superposition of all the states that the quantum circuit goes through throughout the program. This history state is the ground state of the final Hamiltonian that you see over here. The H clock term implements the constraints for the clock register to stay unary, and H input simply implements the initial state of the gate-based computation. What is particularly interesting is this H_l over here, which provides the propagation of the computation that is afterwards present in the ground state. As you can see, it consists of a sum of terms, each implementing the unitary describing the operation of a layer, tensored with the advancement of the clock register. The entanglement this Hamiltonian creates is what allows the information to be propagated from the initial states of the computation to the final states, and that's how you get this superposition over here. What we are going to do is something similar: intuitively, we will compute simultaneously the analogous history states corresponding to every path, path in the sense defined previously. This "analogous" is quite broad, because in our case we are weighing each transformation by its corresponding cost, and this is actually the way we are going to implement the cost function that we would like to have. If it helps, you can think of it as the scheme we just saw, the perfect simulation of gate-based computation, but with noisy channels: your states would propagate all over, and you would simply be following the effect of the unitaries on all the states being propagated by the noisy channels.
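For reference, the propagation term in the Feynman-Kitaev construction being described has the standard form (notation mine: $U_l$ is the unitary of layer $l$ and $|l\rangle$ the unary clock states):

```latex
H_{\mathrm{prop}} = \sum_{l=1}^{L} H_l, \qquad
H_l = \tfrac{1}{2}\bigl(|l\rangle\langle l| + |l-1\rangle\langle l-1|\bigr)\otimes \mathbb{1}
    - \tfrac{1}{2}\bigl(|l\rangle\langle l-1|\otimes U_l + |l-1\rangle\langle l|\otimes U_l^{\dagger}\bigr),

|\eta\rangle = \frac{1}{\sqrt{L+1}}\sum_{l=0}^{L} |l\rangle \otimes U_l \cdots U_1 |\psi_0\rangle .
```

The second line is the history state that, together with the clock and input terms, spans the ground space.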
In this case our ground state would look something like this, which is more complicated than what we had before, but what can you do? To each clock state we will have associated a superposition of all the possible configurations we may have, that is, all the algorithmic-to-physical qubit mappings, remember? And we will be particularly interested in what we have at the final stage of the computation. In order to do this, we will directly bias this final state in the clock register, such that we concentrate the probability of our wave function there in the ground state. So this bias weight on the final clock state will be greater, and hopefully much greater, than the rest. Then, once we look at the configuration states associated to that time, the one with the maximum probability will correspond to the configuration of the optimal path at this final time of the algorithm. I just realized I didn't explain the "initial" in parentheses as I promised. It's just that you can use this algorithm to find either the final or the initial configuration by simply reversing the instruction set with which you define your algorithm. So I'll say "final" all the time in the interest of time ordering, but just know that it can be both. Okay, so this is a schematic picture of the algorithm I just described. It is the Hamiltonian transition elements, the analogue of the H_l we saw previously, which I'll explain in more detail in a second, that implement this cost function. These transition elements will be inversely proportional to the distance covered by the transformations we need to implement, and directly proportional to the gate fidelity, which is also an important aspect we would like to take into account.
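The cost model just stated (weights inversely proportional to distance, directly proportional to gate fidelity) can be sketched as a one-line scoring function; the exact functional form used in the work is not given in the talk, so this is only an illustrative assumption:

```python
# Illustrative cost model for the transition amplitudes: weights grow
# with gate fidelity and shrink with the distance a transformation covers.

def transition_weight(distance, fidelity):
    """distance: number of couplings the transformation spans (>= 1).
    fidelity: two-qubit gate fidelity along the path, in (0, 1]."""
    return fidelity / distance

# A short, high-fidelity move gets a larger amplitude (cheaper)
# than a long swap chain over noisier couplings.
print(transition_weight(1, 0.99) > transition_weight(3, 0.95))  # True
```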
Here, as you can see, for each time step, which also corresponds to each layer, each node is a configuration, that is, an algorithmic-to-physical qubit allocation mapping. The lines connecting configurations at different time steps have different thicknesses: these are the transition amplitudes, where thicker lines correspond to cheaper transformations and narrower lines correspond to more expensive transformations. The optimal path is only defined at the end of the instruction set. This is because if you are going through an instruction set with, say, ten layers, when you are at the fifth layer, the solution that is optimal up to the fifth layer is not necessarily the one that will be optimal when you look at the tenth layer, if you follow that path. Now let's take a look at the register encoding we are proposing. Not much special to see here: the clock register is still unary, as in Aharonov's approach, and the algorithmic register is also fairly standard in its encoding. We will have N_H sub-blocks within this algorithmic register, N_H being the number of hardware qubits in our physical device. Within each sub-block we have N_V + 1 slots to allocate our qubits, where N_V is the number of algorithmic qubits present in the program that we want to encode in our gate-based device. The plus one is for the case where N_H is larger than N_V, so that we allow for the possibility of physical sites remaining empty. The enforcement of the restriction to the physically meaningful states within this encoding is fairly standard as well: the term you have down here penalizes everything outside the unary encoding, and what we have up here enforces the constraints that make the encoding make sense.
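The constraints on this encoding can be spelled out classically as a validity check on bitstrings; this is my own sketch of what the penalty terms enforce, not code from the work:

```python
# Sketch of the constraints on the algorithmic register: each hardware
# sub-block is unary (exactly one slot set; slot v means "algorithmic
# qubit v sits here", the extra last slot means "site empty"), and each
# algorithmic qubit is placed in exactly one sub-block.

def valid_configuration(blocks, n_alg):
    """blocks: list over hardware sites; each entry is a 0/1 list of
    length n_alg + 1 (last slot means the site is empty)."""
    if any(sum(b) != 1 for b in blocks):          # stay unary per block
        return False
    placed = [b.index(1) for b in blocks]
    used = [v for v in placed if v < n_alg]
    return sorted(used) == list(range(n_alg))     # each qubit placed once

# 4 hardware sites, 3 algorithmic qubits: qubits 1, 0, 2 placed, site 3 empty.
blocks = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(valid_configuration(blocks, 3))  # True
```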
Additionally, we require, it's not just that we would like, all the states that we encounter to respect the connectivity of our hardware device; that was the whole point. So we are going to implement this as part of our constraints, in order to make our algorithm more efficient: we introduce this additional constraint in both the initial and the final Hamiltonian. With this, our initial Hamiltonian looks like this. It is fairly analogous to the initial Hamiltonian of the universality proof: we have this H clock initial that initializes the clock at t equals zero, we have the constraints I just described, and then we have a mixing Hamiltonian that creates an equal superposition of the states of the computation. So in the end, the ground state is a superposition of the states we want to have, the ones with physical meaning. Note, by the way, that the ground state of this Hamiltonian is not trivial at all to prepare, so you would actually have to optimize an annealing process in order to reach the proper preparation of this ground state. The nice thing is that you only need to do it once, so that's not a worrisome overhead, I think. I mean, not the most worrisome overhead. Okay, so now we finally get to the final Hamiltonian. We have this H clock final, which biases us towards the final state of the instruction set with this simple term over here. Again we have the constraints, and then we have H transformations, which is the one inducing the implementation of the cost function. Within this H transformations we have three blocks, or three types of terms. We have H_B, which implements the bridge transformations that I described earlier: it implements a bridge transformation and advances the clock.
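Schematically, the two Hamiltonians just described have the following structure (my shorthand, not the talk's exact notation):

```latex
H_{\mathrm{initial}} = H_{\mathrm{clock}}^{(t=0)} + H_{\mathrm{constraints}} + H_{\mathrm{mix}}, \qquad
H_{\mathrm{final}}   = H_{\mathrm{clock}}^{(t=L)} + H_{\mathrm{constraints}} + H_{\mathrm{trans}},
```

with $H_{\mathrm{trans}} = H_B + H_{NS} + H_S$ collecting the bridge, no-swap, and swap terms described next.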
Then H_NS, which advances the clock when no transformation is required, that is, when the configuration already satisfies the connectivity required at that layer. And we have H_S over here, which is the most complicated: it implements swaps and advances the clock. I don't have a lot of time in this talk, so I cannot go into this in detail. I'll just put the two simplest terms here so that you can see the structure and the logic with which they are built, but please do not get distracted by the many formulas and tiny definitions. Just note that this H_NS is already three-local and this H_B is already four-local, so you can imagine that the swapping term is going to become highly non-local fairly rapidly. The smallest swap needs to be implemented with five-local terms with this logic, and it explodes really fast as we introduce more and more swaps to connect sites that are further apart. Okay, I'm coming to the end. Let me just say that building H transformations also requires classical preprocessing to characterize the cost of the different transformations in our hardware graph, because we want to first identify the shortest paths between the different physical sites and also take the gate quality into account. So we need to make a thorough analysis of what we have in our hardware. On the other hand, this is something that makes sense for the problem we are tackling, so we wouldn't expect not to have to do it. All right, let me finish by saying that we have provided an algorithm that gives a final, or an initial, algorithmic-to-physical qubit mapping for the qubit allocation problem corresponding to the exact solution.
This is done by explicitly exploiting the entanglement between the configuration and clock register qubits. There is also the possibility of taking this a step further and building an iterative algorithm that lets us find more configurations at the initial and final stages, and we are also on the lookout for problems with a similar structure that may be formulated in this manner as well. Thank you.

Are there questions?

Thanks for the talk. Could you clarify exactly how many qubits this would need? I would guess it is a function of the number of qubits in the circuit and also of the circuit depth.

Yes, this would be N_H times (N_V plus one), plus L. Sorry, I haven't defined L: it's the depth of the circuit. It's important, sorry. So it's linear in the depth of the circuit.

I have a question from the chat: have you considered implementing this on a quantum annealing system? I would think there might be a lot of problem blow-up, and an interesting problem may not fit on current-size chips, but maybe you could use a hybrid approach that can treat much larger QUBOs.

The thing is that it is very highly non-local, so I really haven't thought about implementing it right now; I wouldn't know how to make it fit anywhere, to be honest. It's the swapping terms, so this gets messy really fast. It's not just that the interactions are highly non-local when you are implementing long-distance swaps; it's that you have to consider all the possibilities that come up within this long-range interaction, so you would have to reduce the order of many highly non-local terms, which would further, and rapidly, increase the number of extra qubits you would need.
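As an aside, the qubit count quoted in the first answer above works out as a simple formula (names are mine): N_H sub-blocks of N_V + 1 slots for the algorithmic register, plus a unary clock register linear in the depth L.

```python
def register_size(n_hw, n_alg, depth):
    """Total qubits in the proposed encoding: n_hw sub-blocks of
    n_alg + 1 unary slots (the +1 allows empty physical sites),
    plus a unary clock register linear in the circuit depth."""
    return n_hw * (n_alg + 1) + depth

print(register_size(4, 3, 10))  # 4 * 4 + 10 = 26
```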
Maybe you could use a gate-model approach to reduce the size of the Hamiltonian for the quantum annealing approach. That was a joke. But maybe a hybrid method would be interesting, to try this out on a classical-quantum system that is big enough to implement it.

I haven't thought about it. I have tried to simulate this myself on my computer, and as you can guess it scales terribly, so I only got to size four and depth thirteen. You do get the solution, but you cannot go much further because of this scaling.

Right, thank you. Any other question, here or from the chat? If not, thanks Anna. Thank you Anna again.