So our next speaker is Sai Srikhar Kasi from Princeton, with a cost and power feasibility analysis of quantum annealing for next-G cellular wireless networks.

Thanks for the introduction. I'm Srikhar Kasi from Princeton, and today I'm going to present our recent work, a cost and power feasibility analysis of quantum annealing for next-G cellular wireless networks. This is collaborative work with University College London and Interdigital.

In a wireless communication scenario, there are base stations serving a number of users, and the number of internet users has been increasing significantly. According to Cisco industry reports, by 2023 about 66% of the population will be using the internet. To meet the resulting demand, new technologies are coming in, such as multiple-input multiple-output (MIMO) communication, robust channel coding schemes, and the use of millimeter-wave communication. The result of these trends is increasing power consumption at the base stations, and this has two major problems: one on the economic side, due to increased operational expenditures, and another on the environmental side, due to increased carbon emissions.

So how do we control this power consumption? There are several traditional techniques. One is sleep mode, which turns the base station on and off during low-traffic times. Another approach is to optimize the radio transmission process itself, that is, to use approximate algorithms instead of optimal algorithms, which saves power. But these techniques trade off performance against the power they can save. The most effective way is to improve the hardware components themselves. Traditionally, CMOS hardware is used at base stations, and its performance-per-watt efficiency has been improving because of Moore's law scaling. But Moore's law scaling will eventually end, and it is expected to terminate around 2030. This raises the question of whether CMOS hardware can achieve next-G cellular spectral and energy efficiency targets. While it may well hold true that it can, there is recently much interest in the community in exploring alternatives to CMOS, and one such approach we explore is quantum computing.

The first question that arises is: why quantum computing for wireless networks? Wireless networks have very strict timing deadlines. If we look at how a base station allocates bandwidth, time and frequency are divided into time-frequency slots, and each subframe is about one millisecond in 5G today. These slots are allocated to different users, and over time the load at the base station varies. By strict timing deadlines I mean that as we move from 5G to 6G or 7G, this subframe decoding time keeps decreasing while the load keeps increasing. To decode signals, the optimal algorithms are computationally very heavy, which results in high power consumption. With quantum computing, recent work has shown that wireless problems can indeed be converted into Ising models, and that work has also shown results from real quantum annealing machines. Quantum machines also have low power consumption, and with engineering advances we might get substantial speedups over conventional computers. So this is our overall envisioned scenario.
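For reference, the Ising form the talk refers to is the standard energy E(s) = sum_i h_i s_i + sum_{i<j} J_ij s_i s_j over spins s_i in {-1, +1}. A minimal sketch of evaluating and brute-forcing such an objective, with placeholder coefficients not taken from the talk:

```python
# Minimal sketch: evaluating an Ising objective of the kind wireless problems
# (e.g., detection, LDPC decoding) are mapped onto before annealing.
# The h/J coefficients below are illustrative placeholders, not from the talk.
import itertools

h = {0: 0.5, 1: -1.0, 2: 0.25}                 # linear (bias) terms
J = {(0, 1): -0.75, (1, 2): 0.5, (0, 2): 0.1}  # quadratic (coupler) terms

def ising_energy(spins, h, J):
    """E(s) = sum_i h_i s_i + sum_{i<j} J_ij s_i s_j, with s_i in {-1, +1}."""
    return (sum(h[i] * s for i, s in spins.items())
            + sum(Jij * spins[i] * spins[j] for (i, j), Jij in J.items()))

# Brute-force ground state (only feasible for tiny instances; an annealer
# searches this energy landscape physically instead).
best = min(
    (dict(zip(h, assignment))
     for assignment in itertools.product((-1, +1), repeat=len(h))),
    key=lambda s: ising_energy(s, h, J),
)
print(best, ising_energy(best, h, J))
```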
We envision quantum processing units at centralized baseband units, where processing from a number of base stations is aggregated. Quantum computing takes care of the heavyweight baseband processing tasks, and classical computation takes care of the lightweight tasks. In this kind of system, what we are essentially doing is investing capital expenditure in quantum hardware of high cost, but reducing operational expenditure because of the low power consumption of quantum hardware, and because there is no need for additional cooling. So for the total cost of ownership, which is capex plus opex, there is a trade-off: we are increasing capex and decreasing opex. In this work we want to find the systems for which the total cost of ownership decreases over time.

Let me summarize some of the key questions and answers in this kind of scenario, from the perspective of quantum annealing devices. The first question is: how many qubits do we need for 5G processing? For a small base station we will need about 40,000 qubits; by a small base station I mean a 10 MHz bandwidth, 32-antenna base station. For a macro base station we need about 3 million qubits; that is roughly a 200 MHz bandwidth, 128-antenna base station. The connectivity between these qubits is highly sparse, meaning that across the various problems in wireless networks, most of their Ising models have highly sparse connectivity. And because we are solving problems related to different users, we don't need all 3 million qubits to be connected on one chip; the system can consist of multiple independent chips.

If we have this many qubits in the machine, how much power or cost can quantum annealing save over CMOS? Again, for a small base station we will not get any benefit, because the computation is not heavy enough for CMOS power to blow up. In the macro base station scenario, we can save about 41 kilowatts, which is 45 percent lower than CMOS. And in what year will these systems become feasible? Based on recent trends in the industry, we can get 40,000 qubits by 2026 in the best case, and for a 3-million-qubit machine we might need to wait about 15 years in the best case. I'll explain how we compute these values in a bit.

Our evaluation methodology considers two key figures of merit in wireless systems. The first is spectral efficiency, which is the number of bits processed per second per hertz of frequency spectrum; spectral efficiency is affected by the latency and also by the number of qubits we have. The other figure of merit is energy efficiency, which depends on the power consumption and also on the number of qubits. The interplay between these three quantities, the latency, the qubit count, and the power consumption of QA hardware, determines whether QA or CMOS is more beneficial. So our goal in this work is to project target values for these properties under which QA gains a benefit over CMOS, and we evaluate CMOS versus QA at equal spectral efficiency targets. Let's analyze each of these three components one by one.
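Returning to the capex/opex trade-off above, here is a minimal break-even sketch. Only the 25 kW QA power draw quoted later in the talk is taken from it; the electricity price, capex figures, and CMOS power draw are hypothetical placeholders chosen purely to illustrate the mechanics.

```python
# Minimal sketch of the capex-vs-opex trade-off (total cost of ownership).
# All dollar figures and the electricity price are hypothetical placeholders;
# only the 25 kW QA power draw echoes a figure quoted in the talk.

HOURS_PER_YEAR = 8760
USD_PER_KWH = 0.12                      # hypothetical electricity price

def yearly_energy_cost(power_kw):
    return power_kw * HOURS_PER_YEAR * USD_PER_KWH

def tco(capex_usd, power_kw, years):
    """Total cost of ownership = one-time capex + accumulated energy opex."""
    return capex_usd + yearly_energy_cost(power_kw) * years

capex_cmos, power_cmos = 0.2e6, 66.0    # hypothetical baseline system
capex_qa,   power_qa   = 0.6e6, 25.0    # hypothetical capex; 25 kW QA draw from the talk

# Break-even point: extra capex paid back by the yearly opex savings.
breakeven_years = (capex_qa - capex_cmos) / (
    yearly_energy_cost(power_cmos) - yearly_energy_cost(power_qa))
print(f"break-even after ~{breakeven_years:.1f} years")
print(f"TCO at 10 years: CMOS ${tco(capex_cmos, power_cmos, 10)/1e6:.2f}M, "
      f"QA ${tco(capex_qa, power_qa, 10)/1e6:.2f}M")
```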
We have an input problem, which is an Ising problem corresponding to a baseband unit task. When we send an input problem to a quantum processing unit, the QPU first programs the problem and sets up for annealing; next we anneal the problem, and once annealing is finished we read out the solution. There is also a readout delay, which prepares the qubits for the next sample's anneal. This anneal-readout cycle corresponds to one sample, and we take multiple samples for the problem; the overall duration is called the sampling time. The solutions from multiple samples are post-processed in batches, and this post-processing runs in parallel with the annealer computation, so it does not factor into the overall processing time; the last batch of post-processing runs in parallel with the next problem's programming. Once the post-processing is finished, we collect the solutions.

What happens during programming? Programming consists of coefficient setting, programming thermalization, and qubit reset. Here, the figure on the left corresponds to the physical hardware, and the figure on the right to its logical graph, a Chimera graph. The long circular loops of wire are the qubits, the L-shaped structures are the couplers, and the small cylindrical structures are the flux DACs. When we send a problem to the QPU, the electronics program these flux DACs with the coefficient values, and the flux DACs then program the qubits and couplers. This process currently takes about 4 to 40 microseconds, and if we want to maintain the same time on larger devices we need more control bandwidth; for a 10-million-qubit device we would need a control bandwidth of about tens of gigahertz. Once we program the problem, some amount of heat is dissipated in the QPU, so we need to thermalize the system. Based on the number of flux DACs being programmed and their critical current, we can compute how much energy is dissipated on chip. In the worst case, when all the qubits and all the couplers are programmed and all values change across the full 5-bit precision range from -16 to +16, it turns out that in a 10-million-qubit device with 15 couplers per qubit, about 36 picojoules of heat is dissipated. In a QA refrigeration unit, at the 15-millikelvin stage, we have about 30 microwatts of cooling power, so in an ideal scenario the time needed to remove this much energy is the energy divided by the power, which turns out to be about 1.2 microseconds. Next we initialize the qubits; by initialization I mean starting the annealing algorithm in an equal superposition state, that is, in the intended ground state of the initial Hamiltonian. When we initialize the qubits, they transition from a higher energy state to the intended ground state, and this process emits photons and corresponds to heat release; this is called Purcell loss, and using Purcell filters reduces it. It takes about 0.8 microseconds to reset the qubits with 90% confidence. So the overall programming time is the sum of these times, and we expect it to be about 42 microseconds for large-scale devices.
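The thermalization figure above is just the dissipated energy divided by the available cooling power. A minimal sketch of the programming-time budget, using only the numbers quoted above, is:

```python
# Minimal sketch of the programming-time budget described above, using the
# figures quoted in the talk (worst-case coefficient setting, thermalization
# as dissipated energy / cooling power, and qubit reset).

dissipated_heat_j = 36e-12          # ~36 pJ worst-case on-chip dissipation (10M-qubit device)
cooling_power_w   = 30e-6           # ~30 uW available at the 15 mK stage

thermalization_s = dissipated_heat_j / cooling_power_w       # ideal-case cooling time
coefficient_setting_s = 40e-6       # upper end of the 4-40 us range
qubit_reset_s = 0.8e-6              # reset to the initial state with ~90% confidence

programming_time_s = coefficient_setting_s + thermalization_s + qubit_reset_s
print(f"thermalization ~{thermalization_s * 1e6:.1f} us")      # -> ~1.2 us
print(f"total programming ~{programming_time_s * 1e6:.0f} us") # -> ~42 us
```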
Once we program, we need to anneal. Currently the minimum annealing time is about 1 microsecond, which is dictated by the control bandwidth; there are also works on higher control bandwidth using flex-print cables, which could reduce the annealing time to about 0.5 nanoseconds. In the readout architecture, the readout information is the persistent current direction in the qubits at the end of the annealing process, and readout takes place along the flux bias lines: there are electrical circuits called quantum flux parametrons which sense the current direction in the qubits and propagate it from the qubits to detectors located at the perimeter of the QPU chip. Currently the readout is serial, meaning one qubit per line is read at a time, and it takes about 25 to 150 microseconds per sample. There are also frequency-multiplexed readout schemes which can read out several qubits in parallel, and we expect that with frequency multiplexing this time could drop to about one microsecond per sample. In the readout delay we reset the qubits again, and we consider it to be one microsecond per sample. So the overall time for N_s samples is taken to be 42 + 3*N_s microseconds, where 42 corresponds to the programming time and the 3 comes from one microsecond each for the anneal, readout, and readout delay in each cycle.

Next, let's see how we compute how many qubits we need. A cellular baseband unit in a wireless system performs a number of baseband processing tasks, of which I show here some of the computationally heavy ones, such as detection, forward error correction, filtering, and equalization. I'll explain one particular example of how we estimate qubits. Given a 5G scenario, an example shown here with a 200 MHz band, 128 antennas, and so on, we have a target amount of computation in the worst case, meaning when the base station is fully loaded across all time and frequency resources. In this scenario, for forward error correction we have a target of about 89.6 tera-operations per second. We convert operations per second to problems per second using the number of operations needed per problem; for example, in 5G, LDPC codes are used for forward error correction, and the longest LDPC code in 5G requires about 150 million operations to decode one such problem. Using these two values we can compute how many problems per second we need to solve at the base station. In the Ising model formulation from our previous work, we compute the number of qubits required per problem, and we also use the 42 + 3*N_s microseconds overall problem processing time. Using these values, the total qubit requirement to satisfy this 5G forward error correction demand becomes the number of problems per second times the number of qubits per problem times the runtime per problem; the runtime per problem here is about 102 microseconds, which corresponds to taking 20 samples. If we want to take more and more samples, as you can observe here, we will need more and more qubits, and it turns out that with 20 samples we will need about 1.3 million qubits for FEC. We repeat this computation for all of the baseband processing tasks to get the total qubit requirement for 5G.
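To make the arithmetic concrete, here is a minimal sketch of this FEC qubit estimate. The 89.6 TOPS workload, the 150 million operations per problem, the 42 + 3*N_s microsecond runtime model, and the 20 samples are the figures quoted above; the qubits-per-problem value is a hypothetical placeholder (the talk reports the resulting ~1.3 million qubits, but not this intermediate number).

```python
# Minimal sketch of the FEC qubit-requirement estimate described above.
# The workload, ops/problem, and runtime model are the talk's figures; the
# qubits-per-problem value is a hypothetical placeholder chosen only to
# illustrate how the ~1.3M-qubit result is assembled.

workload_ops_per_s   = 89.6e12       # worst-case FEC demand (89.6 TOPS)
ops_per_problem      = 150e6         # longest 5G LDPC code, ~150M ops per decode
num_samples          = 20
runtime_per_problem_s = (42 + 3 * num_samples) * 1e-6   # 42 + 3*N_s us = 102 us
qubits_per_problem   = 21_000        # hypothetical; not stated in the talk

problems_per_s = workload_ops_per_s / ops_per_problem    # ~600k problems/s
# Little's-law style argument: problems "in flight" at once = rate * runtime,
# and each in-flight problem occupies its own set of qubits.
concurrent_problems = problems_per_s * runtime_per_problem_s
total_qubits = concurrent_problems * qubits_per_problem
print(f"problems/s ~{problems_per_s:,.0f}, concurrent ~{concurrent_problems:.0f}, "
      f"qubits ~{total_qubits / 1e6:.2f}M")
```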
Plotting this total requirement, with bandwidth on the x-axis and qubit requirement on the y-axis: for a 32-antenna system, even at the largest bandwidth, the total qubit requirement is less than a million; for 64-antenna systems the requirement becomes slightly higher than a million; and as we go to larger and larger systems, we need more and more qubits. The takeaway from this slide is that even for a large system, a 256-antenna, 200 MHz bandwidth system, the total qubit requirement is less than 10 million.

Let's now see how we do the actual power comparison analysis. For quantum annealing hardware, the power consumption is currently 25 kilowatts, which is dominated by the refrigeration unit, and it is not expected to scale significantly for larger devices. But the catch is that if we have a certain qubit requirement for 5G and we want to maintain the same power, all of these qubits must fit under the same refrigeration unit. This raises the question of how many qubits we can actually fit in the refrigeration unit. To answer it, we consider the physical size of the qubits: a tile of 8 qubits takes about 3.35 square micrometers, and a dilution refrigerator has an experimental space of about 250 millimeter radius. This now turns into the classic dies-per-wafer problem of how many squares we can fit in a circle, and it turns out we can fit about 1.75 million dies; with 8 qubits per die, that gives about 40 million qubits in a single refrigeration unit. Since the 5G qubit count estimates are significantly lower than 40 million, the QA power consumption we use is 25 kilowatts.

We compute the CMOS hardware power based on the amount of computation and the performance-per-watt efficiency of CMOS, and we evaluate two types of CMOS. The first is current CMOS: 40 nm CMOS is a current device, with about 0.076 tera-operations per second per watt. We also compare against 1.5 nm CMOS, which is expected to be the CMOS technology at the end of Moore's law scaling, with about 0.3 tera-operations per second per watt. We also account for leakage power in CMOS, set to about 30% of the dynamic power. Looking at the results: on the x-axis we have bandwidth, on the y-axis the power of 1.5 nm CMOS, and the horizontal line corresponds to the QA power consumption of 25 kilowatts. For a smaller system with 32 antennas we cannot get any benefit over CMOS, because the CMOS power is already much lower than QA; even for 64-antenna systems we don't get any benefit. But when we go to larger systems we start getting benefits: for 128-antenna systems we see a benefit at around 180 MHz bandwidth, and for even larger systems we get benefits over a wide range of bandwidths.

The left-hand plot shows just the baseband power. If we look at the entire site power, the left bars show the power consumption when CMOS is at the site's baseband unit, and the right bars show it when QA is at the baseband unit. With QA there is a power reduction in the baseband unit. In the remote radio heads the absolute power is the same, but the percentages differ because the total power is different. In the fronthaul there is also no power reduction, because QA is not solving those problems.
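Here is a minimal sketch of the CMOS-versus-QA baseband power comparison just described. The 0.076 and 0.3 TOPS/W efficiencies, the 30% leakage adder, and the 25 kW QA floor are the figures quoted above; the example workload is a hypothetical placeholder, not a number from the talk.

```python
# Minimal sketch of the baseband power comparison described above.
# CMOS efficiencies, the 30% leakage adder, and the 25 kW QA floor are the
# talk's figures; the example workload is a hypothetical placeholder.

QA_POWER_KW = 25.0                     # dominated by the dilution refrigerator
LEAKAGE_FRACTION = 0.30                # leakage ~30% of dynamic power

CMOS_EFFICIENCY_TOPS_PER_W = {
    "40nm (today)": 0.076,
    "1.5nm (end of Moore's law)": 0.3,
}

def cmos_power_kw(workload_tops, tops_per_watt):
    """Dynamic power = workload / efficiency, plus the leakage adder."""
    dynamic_w = workload_tops / tops_per_watt
    return dynamic_w * (1 + LEAKAGE_FRACTION) / 1e3

workload_tops = 2000.0                 # hypothetical total baseband workload, large antenna config
for node, eff in CMOS_EFFICIENCY_TOPS_PER_W.items():
    p = cmos_power_kw(workload_tops, eff)
    verdict = "QA saves power" if p > QA_POWER_KW else "CMOS already cheaper"
    print(f"{node}: {p:.1f} kW vs QA {QA_POWER_KW} kW -> {verdict}")
```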
In the power system we do have a reduction, because we don't need additional cooling. And if we look at the entire site power consumption, we get about a 150 kilowatt power reduction when we have QA at the base station.

We put all of these things together in a feasibility timeline: on the x-axis we have the year, and on the y-axis the qubit count. The data points in the hatched region are the historical qubit counts, and after 2020 the curves are extrapolations of the best-case and worst-case qubit growth, respectively; the best-case qubit growth trend is taken from 2017 to 2020, and the worst case from 2020 to 2023. If the qubit count scales along the best-case trend, by 2026 we might be able to solve a small system, where each system corresponds to the qubit requirement shown on the y-axis at that year on the x-axis. By 2030 we can solve a larger system, and larger systems still as we move forward. But even if we can solve a small system, as we saw before, we don't get a benefit for small systems; if we want a power advantage we need to solve a large enough system, and that is projected to be about 15 years away in the best case, if the qubit count scales along this best-case trend. Thank you, I'll take any questions now.

Thank you. Thank you for your talk. We have one question in the chat; if you'd like, take a look at the chat, read out the question, and reply to it.

The question asks: if you expect you need 3 million qubits, what about a hybrid system? Actually, not in this work, but in a different work we do consider hybrid approaches, basically because if you look at current near-term machines, it will take a lot of time to get to a larger machine, so we also need to consider hybrid approaches, and we do consider them.

This analysis is not based on the Chimera or Zephyr graph. Basically, we have a set of problems in wireless networks, and we assume the future QA hardware connectivity will exactly match that of the problems' Ising models, so it's like custom hardware. And it is not a bad assumption, because the connectivity is very sparse: if you look at LDPC codes, for example, more than 80% of the qubits have fewer than 15 couplers per qubit, and only about 1 to 2% have more than 30 couplers.

Okay, are there any other questions? If not, let's thank our speaker again.