Hello everyone, and welcome to Virtual Eurocrypt. I'm Chris Peikert, and this talk is about using Kuperberg's collimation sieve, or c-sieve, to cryptanalyze a recently proposed post-quantum construction called CSIDH. I wish we could be together in Zagreb, or in one of the beautiful nearby seaside towns like this one, but under the circumstances it's better that we're instead here together on the internet. Because life has been upended in so many ways, this talk will be no exception, and I'll begin with the conclusions. The main conclusion of this work is that the proposed parameters of CSIDH offer relatively little post-quantum security beyond what's given simply by the cost of quantumly evaluating the function itself on a uniform superposition. For example, for the main parameterization that's been studied in the literature, CSIDH-512, key recovery costs only about 2^16 quantum evaluations of the function, using about 2^40 bits (that is, 128 gigabytes) of classical RAM that is quantumly accessible; I'll explain what that means later in the talk. While the main focus of this work is on the number of evaluations needed to mount an attack, we also give some estimates for the total quantum T-gate complexity of a complete attack, under the assumption that evaluating the function on a uniform superposition costs not too much more than evaluating it on the so-called best-case distribution, which is close to, but not quite the same as, the uniform distribution.
Under this assumption, attacking CSIDH-512 can be done with only about 2^60 T-gates and the same amount of RAM, 2^40 bits, so CSIDH-512 falls well short of the claimed NIST level 1 post-quantum security. As a reminder, level 1 corresponds to AES-128 key search, which NIST has estimated to cost about 2^170 T-gates divided by MAXDEPTH, where MAXDEPTH is the maximum feasible or plausible quantum circuit depth that can be implemented; NIST estimates it to be between 2^40 on the low end and 2^96 on the high end. Another parameterization is CSIDH-1024, which was proposed for NIST level 2. We find it to be breakable, under the same assumption, with about 2^72 T-gates and only about 2^44 bits of quantumly accessible RAM, so it also falls short of level 1. In addition, CSIDH-1792 was proposed for NIST level 3. We find that it is breakable with about 2^84 T-gates and only about 2^48 bits of quantumly accessible classical memory, so it also does not reach level 1, except possibly at the very high end of MAXDEPTH. But let's now start at the beginning, with CSIDH itself: an isogeny-based commutative group action proposed at Asiacrypt 2018 by Castryck, Lange, Martindale, Panny, and Renes, following the template of Couveignes and Rostovtsev–Stolbunov. In such a group action we have a group G, some other set Z, and an action ⋆ of G on Z, which returns another element of Z. It's important to distinguish this from other isogeny-based proposals like SIDH, which are non-abelian and in fact don't involve a group action at all. One of the advantages of having a commutative group action is that it allows for a very simple Diffie–Hellman-style key exchange. Here we have a public parameter z in the set Z, and Alice and Bob each have their respective secret keys in the group; each party's public key is the action of their secret key on the public parameter.
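As a quick sanity check on these comparisons, here is a small Python sketch; the exponents are just the attack costs and the NIST level-1 floor quoted above, and the comparison is the plain arithmetic they imply.

```python
# Back-of-envelope comparison against NIST's level-1 floor of
# 2^170 / MAXDEPTH T-gates for AES-128 key search. The exponents are the
# attack costs quoted in this talk.
attack_log_tgates = {"CSIDH-512": 60, "CSIDH-1024": 72, "CSIDH-1792": 84}

for maxdepth_log in (40, 64, 96):          # NIST's plausible MAXDEPTH range
    floor_log = 170 - maxdepth_log         # log2 of the level-1 T-gate floor
    for name, cost in attack_log_tgates.items():
        verdict = "below level 1" if cost < floor_log else "at/above level 1"
        print(f"MAXDEPTH=2^{maxdepth_log}: {name} costs 2^{cost} "
              f"vs floor 2^{floor_log} -> {verdict}")
```

Consistent with the talk, every parameterization lands below the floor except CSIDH-1792 at the very high end, MAXDEPTH = 2^96, where the floor drops to 2^74.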
Then they can each agree on a common shared key by taking their own secret key and acting on the other party's public key, which by commutativity yields the same result, (a + b) ⋆ z, for both parties. The CSIDH proposal is very efficient: for example, the parameters targeting NIST level 1 quantum security allow keys as small as 64 bytes and an 80-millisecond key exchange. Subsequent work has developed signature schemes based on this commutative group action which are quite compact: the public key and signature combined are below 1.5 kilobytes at the same targeted security level, which is much smaller than for any signature scheme submitted to the NIST post-quantum process. So CSIDH has a very attractive efficiency profile, but what about its security, and specifically its post-quantum security? The main cryptanalytic problem, secret-key recovery, is roughly analogous to the discrete logarithm problem: given a public parameter z and somebody's public key a ⋆ z, the goal is to find their secret key a, an element of the group. It turns out, as observed by Childs, Jao, and Soukharev in 2010, that this task reduces to a hidden-shift problem on the group G, a problem that has been studied going back to the early 2000s with the work of Kuperberg. There are three main ingredients in a quantum algorithm for hidden-shift problems. The first is an oracle that, whenever invoked, outputs some kind of labeled quantum state; it works by evaluating the group action on a uniform superposition over the group. The second component is a so-called sieve, which combines these labeled states to generate more and more favorable states. The third component takes an ultimate, very favorable state produced by the sieve and measures it in some way to recover one or more bits of the hidden shift. There has been a sequence of algorithms implementing the second ingredient, the sieve.
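To make the key-exchange pattern concrete, here is a toy Python sketch. The real CSIDH action is on supersingular elliptic curves; as a deliberately insecure stand-in, this uses the additive group Z_N acting on itself by translation (the modulus and parameter values are arbitrary), purely to illustrate the protocol's shape.

```python
import secrets

N = 2**61 - 1             # stand-in group order (arbitrary, not CSIDH's)

def star(a, z):
    """Toy commutative 'group action': translation on Z_N (insecure)."""
    return (z + a) % N

z = 12345                 # public parameter in the set Z
a = secrets.randbelow(N)  # Alice's secret key in the group G
b = secrets.randbelow(N)  # Bob's secret key

pk_a = star(a, z)         # Alice's public key: a * z
pk_b = star(b, z)         # Bob's public key:   b * z

# Each party acts with its own secret on the other's public key;
# commutativity gives both the same shared key, (a + b) * z.
shared_a = star(a, pk_b)
shared_b = star(b, pk_a)
assert shared_a == shared_b
```

The point of the sketch is only the API shape: a secure instantiation needs an action for which recovering a from a ⋆ z is hard.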
Kuperberg's original algorithm, in 2003, used 2^{O(√n)} oracle queries and quantum bits, where n is the logarithm of the group size. Soon after, Regev gave a variant of Kuperberg's algorithm that uses slightly more oracle queries, 2^{O(√(n log n))}, but only a polynomial amount of quantum memory. Then, several years later, Kuperberg gave another algorithm which again uses 2^{O(√n)} oracle queries and bits of memory, but in this case the memory need only be classical RAM that is quantumly accessible. What that means is that while the memory itself consists of ordinary bits, it can be read in a superposition of all the memory addresses at once. He argued that this kind of memory is quite plausibly much cheaper than fully quantum memory. In that work he also gave a different kind of sieving algorithm, which he called the collimation sieve, which subsumes the sieve algorithms from the prior two works and offers a variety of other tradeoffs that were not available before. For example, if the log of the number of oracle queries times the log of the amount of quantumly accessible RAM is larger than about n, then the sieve can succeed. Much of this prior work was used to give security estimates for CSIDH, and in particular the CSIDH-512 parameterization. With respect to the oracle, Bernstein, Lange, Martindale, and Panny, at last year's Eurocrypt, showed that the oracle for CSIDH-512 can be implemented in about 2^43 quantum T-gates, plus a comparable number of much cheaper linear gates. This is for what they called the best-case distribution, which is close to, but not quite, the uniform superposition over the group. However, work by Beullens, Kleinjung, and Vercauteren gives very good reason to expect very similar costs for implementing the oracle on a truly uniform superposition.
Regarding the sieve, the second main component of the hidden-shift algorithm, the original CSIDH paper used Regev's low-space algorithm to estimate that it would take about 2^62 oracle queries to break CSIDH-512. Subsequently, Bonnetain and Schrottenloher, in work that will appear in the next talk of this session, used Kuperberg's original large-space algorithm to estimate that CSIDH-512 could be broken with about 2^32 oracle queries using 2^31 qubits of memory. But with regard to Kuperberg's follow-up algorithm, the collimation sieve, there had been no prior work estimating CSIDH's security against it. That brings us to this work, in which we generalize and improve upon Kuperberg's collimation sieve in a number of directions, and analyze its concrete complexity on the proposed CSIDH parameters. In particular, we show how to generalize the collimation sieve to handle arbitrary group orders, generalizing from two-power or other smooth group orders. This is necessary for CSIDH, because the group orders there frequently have huge prime divisors. We also show how to recover several bits of the secret key from each single run of the sieve, and we give techniques for better handling the classical memory and time complexities, after discovering some unexpected phenomena that occur in the sieve. Finally, one nice aspect of these quantum sieves is that they can be simulated classically on ordinary computers, and we do so up to the actual CSIDH-512 group order, which is about 2^257. Previous work had only done such simulations for group orders up to about 2^100. Here is a table showing our results in comparison to prior ones: for example, with only about 2^32 bits of quantumly accessible classical memory, we can break CSIDH-512 in fewer than 2^19 oracle queries.
With increasing amounts of memory, the number of oracle queries decreases correspondingly, down to about 2^14, with still-reasonable amounts of classical memory. I should mention that, independently and concurrently, Bonnetain and Schrottenloher also gave a complementary and mainly theoretical analysis of the collimation sieve, arriving at very similar conclusions, although they looked at a completely different point on the sieve's parameter spectrum. Let's now go over, at a high level, Kuperberg's collimation sieve, but modified to collimate on the most significant bits rather than the least significant bits as originally described. This modification allows it to work for arbitrary group orders, rather than requiring a tower of subgroups as in the low-bits case. To recall, the collimation sieve solves the hidden-shift problem on a finite cyclic group C_N of known order N. The main objects of interest for this algorithm are certain pure quantum states called phase vectors, and each phase vector has an associated vector of integer multipliers, or phase multipliers as they're known. The sieve is given as input a collection of so-called naughty phase vectors. What makes them naughty is that they have huge, random phase multipliers in the interval from 0 to N. These vectors can have any desired length L, as long as it is feasible, in the sense that the phase vectors must be storable within the available classical memory. So we can think of L as roughly equal to the amount of quantumly accessible classical memory we have. The goal of the sieve is to construct, from these naughty phase vectors, a very nice phase vector of roughly the same length L. What makes this phase vector nice is that it should have very small, but still mostly random, multipliers from a very small interval of size S, where S is also approximately equal to L, or even smaller if possible.
From such a very nice phase vector, we can then use the quantum Fourier transform to extract one or more of the secret bits, as we'll see later. The sieve converts the naughty input vectors into a very nice output vector by making progressively nicer phase vectors, whose multipliers are contained in successively smaller intervals, starting from N and going all the way down to S. It does so by a procedure called collimation. The collimation sieve can be depicted pictorially as a complete tree, in this case a binary tree, although it doesn't have to be binary. We fix interval sizes so that at the leaf level we have the interval from 0 to N, the full group order, and at the root we have the smallest interval size S_0, which is approximately L, our available amount of RAM. At each level of the tree, the interval size is a factor of about L smaller than at the level directly below it. Therefore the depth of the tree is about log base L of N, minus 1. In the sieve, the leaf nodes receive the naughty phase vectors, whose multipliers range over the entire interval from 0 to N, and each internal node collimates its two children, thereby narrowing the interval by a factor of about L. The key insight we exploit is that the more quantumly accessible RAM we have, the larger we can take L to be. This decreases the depth of the tree, so fewer naughty vectors need to be supplied at the leaf level, and therefore fewer oracle queries are needed to prepare them. Now let's take a closer look at these phase vectors and how collimation works on them. For a secret group element s, a phase vector of length L is merely a pure quantum state on L different basis elements |j⟩, where the phase on |j⟩ is exp(2πi · s · b(j) / N) for a known integer multiplier b(j).
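The depth formula above gives a quick back-of-envelope estimate of the tree sizes involved. Here is a short Python sketch for CSIDH-512, whose group order N ≈ 2^257 is stated in the talk; taking phase-vector length L = 2^16 is my illustrative assumption, and the leaf cost of about log2(L) oracle queries reflects that each naughty vector is built by tensoring labeled qubits.

```python
import math

# Rough c-sieve tree sizes for CSIDH-512 (group order N ~ 2^257, per the
# talk). L = 2^16 is an illustrative choice of phase-vector length.
log2_N = 257
log2_L = 16

depth = round(log2_N / log2_L) - 1   # about log_L(N) - 1 levels of collimation
leaves = 2 ** depth                  # a binary collimation tree has 2^depth leaves
# Each naughty leaf vector takes about log2(L) oracle queries to prepare.
log2_queries = depth + math.log2(log2_L)
print(f"depth ~{depth}, leaves ~2^{depth}, ~2^{log2_queries:.0f} oracle queries")
```

Reassuringly, this crude count lands near the ~2^19-query figure the talk reports for CSIDH-512 with modest memory, though the paper's estimates are more careful about variance and vector lengths.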
For example, by querying the hidden-shift oracle, we get a single qubit whose phase on |0⟩ is 0 and whose phase on |1⟩ is determined by a known, uniformly random multiplier b′ between 0 and N. So this qubit is actually a length-2 phase vector, where b(0) = 0 and b(1) = b′. Moreover, if we get k of these labeled qubits, each with its own single label, we can tensor them together and get a length-2^k phase vector whose multipliers are all the subset sums of the individual labels of the input qubits. Again, the phase multipliers are known integers, and we store them in a sorted list in classical memory. So overall, a phase vector of length L requires roughly L classical bits, but only log L qubits, because it is supported on L different basis elements. This fact is the essential source of the exponential improvement in the quantum space requirements of Kuperberg's second sieve over his first. Now we can describe the collimation procedure on phase vectors. The input to the procedure is two phase vectors ψ_i, each of length roughly L, on an interval of size S′, and the output is a single phase vector ψ of roughly the same length as the inputs, but on an interval of size S that is smaller by a factor of roughly L. The procedure works as follows. First we tensor the two input phase vectors to get a new phase vector ψ′. It is now indexed by the product of the two input index sets, and its multipliers are the pairwise sums of the multipliers of the two input vectors. Then we perform a measurement on this combined phase vector, according to the quotient of its multiplier divided by S, ignoring the fractional part. What this means is that all the multipliers that survive the measurement have exactly the same quotient q with respect to S, and possibly different remainders, and these remainders lie in the interval from 0 to S.
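The classical bookkeeping for this tensoring step is easy to sketch: k labels yield a length-2^k sorted list of subset-sum multipliers, even though only k qubits are involved. Toy sizes below; the modulus N is an arbitrary stand-in.

```python
import secrets

# Tensoring k labeled qubits: the resulting phase vector has 2^k
# multipliers, namely all subset sums of the k labels (mod N).
N = 2**61 - 1                        # arbitrary stand-in group order
k = 4
labels = [secrets.randbelow(N) for _ in range(k)]

multipliers = [0]
for b in labels:
    # Doubling step: each existing subset either excludes or includes b.
    multipliers = multipliers + [(m + b) % N for m in multipliers]
multipliers.sort()                   # stored as a sorted classical list

assert len(multipliers) == 2**k      # 2^k classical entries, only k qubits
```

This 2^k-vs-k gap is exactly the quantum-space saving described above.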
This common quotient corresponds to a global phase, and the remainders become the new multipliers, which lie in the interval from 0 to S. Finally, we have to do some re-indexing and bookkeeping: we compute which indices of the composite phase vector survived the measurement, and then re-index them to a natural interval from 0 up to the length of the new phase vector. Let's now see what this procedure accomplishes. The composite phase vector ψ′ has length roughly L², because it is the tensor product of the two length-L input vectors, and its multipliers, the pairwise sums of the inputs' multipliers, are fairly well distributed in the interval from 0 to 2S′. This means that most of the size-S subintervals of this big interval contain about the same number of multipliers, namely L² · S / (2S′), which is roughly L by our relationship between S and S′. It turns out that in practice we need some additional tricks to control the variance, because some subintervals have many fewer multipliers than others; for example, the subintervals at the two endpoints of the interval have many fewer than those in the middle. Importantly, step three is the only step that requires any significant amount of work: it takes a small constant number of lookups into the quantumly accessible RAM, across L different addresses, and a quasi-linear-in-L amount of classical work for the re-sorting and re-indexing. Once we've run the entire collimation sieve, we get a phase vector on a very small interval of size S, whose length is also roughly S, although we can't control it completely. We can use this phase vector to gain information about the secret key as follows. Let's be optimistic and suppose for the moment that the length of the vector is exactly S, and moreover that its phase multipliers form a bijection from {0, …, S−1} to itself.
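Because the sieve's effect on the multipliers is entirely classical, one collimation step can be simulated directly, which is essentially how the simulations mentioned earlier work. The sketch below (my simplification, with toy parameters) tracks only the multiplier lists, ignoring amplitudes, and models the quantum measurement as picking a quotient class with probability proportional to its size.

```python
import random
from collections import defaultdict

def collimate(b1, b2, S):
    """Classically simulate collimating two multiplier lists onto [0, S).

    Pair sums are grouped by their quotient floor(sum / S); the
    measurement selects one quotient class with probability proportional
    to its size, and that class's remainders become the new multipliers.
    """
    classes = defaultdict(list)
    for x in b1:
        for y in b2:
            classes[(x + y) // S].append((x + y) % S)
    quotients = list(classes)
    weights = [len(classes[q]) for q in quotients]
    q = random.choices(quotients, weights=weights)[0]
    return sorted(classes[q])

random.seed(1)
S_prime, S = 2**20, 2**12            # interval shrinks by a factor of L
L = 2**8
b1 = [random.randrange(S_prime) for _ in range(L)]
b2 = [random.randrange(S_prime) for _ in range(L)]
out = collimate(b1, b2, S)
assert all(0 <= m < S for m in out)
```

With these parameters the expected output length is L² · S / (2S′) = L/2 = 128, though, as discussed next, the actual length fluctuates noticeably from run to run.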
Then we can re-index the phase vector, replacing each phase multiplier b(j) with just j itself. We then observe that this state is essentially just the inverse quantum Fourier transform of a point function at the secret, suitably scaled by S/N. So if we take the Fourier transform of this state and measure, this yields the log S most significant bits of the secret with quite high probability. In reality, it's unlikely that the length is exactly S and that the phase multipliers form a bijection. In this case, we can perform a measurement to make the phase multipliers injective, from the index set into some rather large subset X of {0, …, S−1}. We can then perform a similar re-indexing as above to get a state that looks the same, except that it is supported only on a subset of the basis elements. This is essentially a very densely subsampled Fourier transform of the same point function as above, and it turns out that measuring its Fourier transform also yields almost log S bits of the secret. This completes the main ideas behind the collimation sieve. But in our classical simulations of the algorithm, we uncovered a number of practical issues that we didn't anticipate. The first is that when we collimate two phase vectors, the length of the resulting phase vector can be quite variable and unpredictable. If it's longer than expected, it requires too much classical memory to store and too much classical time to operate on later in the algorithm. If it's shorter than expected, that's an even bigger problem, because it has to be collimated with a correspondingly longer phase vector in order to produce useful outputs as they work their way up the collimation tree. Our solution is to tweak the collimation sieve so that each stage of the recursion can adaptively request a desired phase vector length, and, if the returned vector is much shorter than expected, simply discard it and recompute a fresh one from scratch.
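The Fourier-extraction step in the optimistic (bijective, length-exactly-S) case can be checked numerically. This sketch, with toy sizes of my choosing, builds the ideal re-indexed state with phases exp(2πi·s·j/N), applies a size-S discrete Fourier transform by hand, and confirms that the most likely outcome is round(s·S/N) mod S, i.e. the top log S bits of the secret.

```python
import cmath

N = 2**20                 # group order (toy)
S = 2**6                  # final interval size after the sieve
s = 200000                # the hidden shift / secret (toy value)

# Ideal end-of-sieve state after re-indexing b(j) = j:
# amplitude exp(2*pi*i * s * j / N) on |j>, for j = 0..S-1.
amps = [cmath.exp(2j * cmath.pi * s * j / N) for j in range(S)]

# Apply a size-S Fourier transform and find the most likely outcome.
probs = []
for k in range(S):
    A = sum(a * cmath.exp(-2j * cmath.pi * j * k / S)
            for j, a in enumerate(amps))
    probs.append(abs(A) ** 2)
k_star = max(range(S), key=probs.__getitem__)

# The peak sits at round(s*S/N) mod S: the log2(S) top bits of s.
assert k_star == round(s * S / N) % S
```

Here the measurement outcome concentrates on k* = 12 ≈ s·S/N, exactly the scaled point function described above.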
Empirically, we find that discarding roughly the two percent shortest phase vectors saves a factor of more than a thousand in the length of the longest vector we need to complete the sieve successfully. The second issue is more of an optimization. As described so far, when we measure the sieve's output phase vector on an interval of size S, we get about the log S most significant bits of the secret. But as far as we know, that isn't enough to break CSIDH; we need almost all of the bits of the secret. Our solution is to generalize the sieve so that it can output phase vectors supported on scaled intervals, for example, ones where all the multipliers are multiples of S between 0 and S². By tensoring the sieve's outputs for various scaling factors, we can measure them and obtain the entire secret all at once. I'll wrap up now with a few open questions and directions for further research. Probably the most important is to establish the true complexity of the CSIDH oracle, in order to complete the full attack. Recall that the existing estimates for this oracle are for the so-called best conceivable distribution, which is close to, but not quite, the uniform distribution, and the sieving attacks need the uniform distribution. Or do they? That is, could the attacks work by evaluating on an approximately uniform superposition? Even if we do need a truly uniform distribution, it's not so difficult to achieve. The work of Beullens et al. explicitly computed the group on which CSIDH-512 is defined, and their work enables a fast transformation from the truly uniform distribution to exponent vectors with very similar statistics to the best conceivable distribution. So it just remains to estimate the quantum cost and depth of this procedure. Alternatively, the existing estimates for the complexity of the CSIDH oracle come from generically transforming a classical algorithm into a quantum one.
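The scaled-interval trick amounts to a base-S digit decomposition of the multipliers, which this small sketch (toy sizes, multiplier bookkeeping only) illustrates: tensoring a vector whose multipliers are the multiples of S in [0, S²) with one whose multipliers fill [0, S) covers [0, S²) exactly once.

```python
# Scaled intervals as base-S digits (classical multiplier bookkeeping only).
S = 2**4
low  = list(range(S))                 # multipliers in [0, S)
high = [S * m for m in range(S)]      # multiples of S in [0, S^2)

# Tensoring phase vectors adds their multipliers pairwise; here every
# value in [0, S^2) is hit exactly once, so a single Fourier measurement
# on the combined vector can read off 2*log2(S) bits of the secret.
combined = sorted(x + y for x in high for y in low)
assert combined == list(range(S * S))
```

Stacking further scaling factors S², S³, … in the same way is what lets the generalized sieve deliver essentially the whole secret in one measurement.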
If we could get more direct constructions of quantum CSIDH circuits, that would possibly reduce the complexity of the oracle. Another question is whether it's possible to somehow amortize the many oracle computations that are used in the attack to obtain the initial phase vectors. Another important question is whether it's possible to break CSIDH using just partial information about the secret, such as what a single run of the sieve returns; recovering the full secret, as we do, requires more oracle calls than a single run of the sieve would. Now I'll conclude with the conclusions. Again, the main one is that the proposed parameters for CSIDH provide relatively little quantum security beyond the cost of quantumly evaluating the function itself. I encourage you to read the full paper, which is on ePrint, and also to have fun with the simulation code, which is available on GitHub. I hope to see you all in person, healthy and well, as soon as possible. Thanks for your attention.