Hello everyone, and welcome to this talk on lower bounds on lattice sieving and information set decoding. This is joint work with Elena Kirshanova, and this talk is for Crypto 2021.

Okay, so as we all know, lattice-based and code-based cryptography are two of the prime candidates for designing post-quantum cryptography, which is hopefully secure against quantum attacks. For instance, in the NIST standardization process, which is going on right now, several of the finalists are based on lattices and codes. And of course, the security of these lattice-based and code-based cryptographic schemes relies on the hardness of the underlying lattice problems and decoding problems. So it is important that we study these problems well, and that we study the algorithms for these problems well, so that we know the security of these cryptographic schemes and how to choose parameters. And note that, as we are usually interested in long-term security, we would ideally like to have conservative bounds on the attack cost. We would like to account not only for possible improvements in hardware, where people can simply run the same attacks faster in the future because their computers get faster, but also for the possibility that someone comes up with better cryptanalytic techniques, better algorithms for solving these problems. Ideally, we would account for that in our parameter selection now, so that in 10 years the security is not lower than what we aimed for. And the only real way to approach this is to look at the state-of-the-art cryptanalysis, the best algorithms for solving these hard problems, and see whether they can be improved or not.

In the context of lattices, lattice sieving combined with basis reduction is currently the state of the art. And in code-based cryptography, information set decoding algorithms are some of the most important tools. Both of these approaches follow a similar strategy at some point: they sample a large database of long vectors, where "long" is with respect to some metric, which I will get to later. So we sample a lot of these vectors. Then we combine nearby vectors to obtain shorter vectors. We have this whole list of vectors, and we search for pairs of vectors that we can combine, say by adding or subtracting them, something simple at least, so that we get a nicer, shorter vector. And we can only combine vectors which are nearby in space, where again "nearby" is with respect to the same metric as "long" and "short" here. Then we just repeat this process of combining vectors until we find many short vectors. So we start with a large database of long vectors, and then we iteratively find shorter and shorter vectors, until at the end we have a bunch of short vectors. And then, for instance, we can solve the shortest vector problem on a lattice by doing this several times, until at the end a shortest vector appears in our list.

A key subroutine in this approach, in both settings, is the closest pairs problem: you have a list of vectors in your space, the database in this case, and you want to find almost all pairs of vectors that can be combined into shorter vectors. So we want to find almost all pairs of nearby vectors in our list.
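To make that "combine nearby vectors" loop a bit more concrete, here is a minimal sketch of one reduction pass, under my own simplifying assumptions: plain random real vectors instead of actual lattice points, the Euclidean metric, and a naive quadratic pair search (the function name `sieve_step` is purely illustrative, not from the paper).

```python
# A minimal sketch (not the paper's algorithm) of the generic "combine nearby
# vectors" loop: whenever a pair is found whose difference is shorter than the
# longer of the two vectors, replace that longer vector by the difference.
# The double loop is exactly the closest-pairs subroutine discussed next.
import numpy as np

def sieve_step(vectors):
    """One naive O(N^2) reduction pass over the database."""
    out = list(vectors)
    for i in range(len(out)):
        for j in range(i + 1, len(out)):
            diff = out[i] - out[j]
            d = np.linalg.norm(diff)
            ni, nj = np.linalg.norm(out[i]), np.linalg.norm(out[j])
            if 0 < d < max(ni, nj):
                if ni >= nj:
                    out[i] = diff
                else:
                    out[j] = diff
    return out

# Toy usage: start from "long" random vectors and iterate a few reduction passes.
rng = np.random.default_rng(0)
database = [rng.normal(size=20) * 10 for _ in range(200)]
for _ in range(5):
    database = sieve_step(database)
print(min(np.linalg.norm(v) for v in database))
```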
And this subroutine dominates the overall algorithm complexity: if we can improve the closest pairs algorithm, then we can improve the overall complexity of lattice sieving and of information set decoding algorithms. Note that, naively, you can find all nearby pairs by simply going through all pairs of vectors, which takes time quadratic in the size of the database. But there are smarter approaches which also work in high dimensions. For instance, these hash-based approaches achieve sub-quadratic complexity, depending on how nearby the pairs you want to find are, although the complexity usually remains some power of the list size not too far below quadratic. And right now there are no matching lower bounds, neither for the lattice sieving setting nor for the information set decoding setting. So there might still be improvements possible, which is problematic if we want to choose parameters now: if someone comes up with further improvements in 10 years, then there will be a problem. Are improvements still possible, or can we rule them out now?

So the main contribution of this paper is to study lower bounds for the corresponding nearest neighbour problems that you have to solve when solving this closest pairs problem. Our lower bounds are conditional on a hash-based model, which I will get to soon. They don't necessarily apply to all approaches for solving the closest pairs problem, but this model is quite broad: it covers almost all approaches used for sieving to date. So if you are in this model, then the lower bound applies. And for the lattice sieving setting, this lower bound is actually tight: it matches the best currently known techniques for solving this nearest neighbour problem. For instance, this gives a conditional lower bound showing that the result from 2016 with the 0.292 exponent for SVP is actually optimal in this model. For information set decoding we also get lower bounds, but these are unfortunately not asymptotically tight. They are almost tight, but not quite. This could, for instance, mean that the currently best known nearest neighbour technique of May and Ozerov from 2015 is actually suboptimal. Or it could just be that our lower bound is not tight, and that there is a better lower bound which does match this result from 2015. It is not really clear which of the two is the case, but we do want to stress that the gap between the lower bound and the upper bound is quite small. So there might be some small improvements in the future, but not that much: the lower bound is quite close to the upper bound.

So what are the implications of these results? Mainly, this gives a better understanding of the hardness of the underlying problems, the lattice problems and the decoding problems. And if you are working on cryptanalysis, especially on the lattice side, then the takeaway message is probably that you should search for improvements elsewhere, not in the nearest neighbour technique anymore: this lower bound essentially settles that question, in that the lower bound matches the upper bound. So maybe you can find improvements in other parts of lattice sieving, or in other lattice algorithms in general.
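As a rough back-of-the-envelope reference for the complexities mentioned above, these are the standard heuristic leading exponents from the sieving literature, recalled here from memory for orientation only (they are not results of this paper):

```latex
\begin{align*}
N &= (4/3)^{n/2 + o(n)} \approx 2^{0.2075\,n} && \text{heuristic database size in dimension } n \\
N^{2} &\approx 2^{0.415\,n + o(n)}            && \text{naive all-pairs search over the database} \\
N^{1.41} &\approx 2^{0.292\,n + o(n)}         && \text{hash-based nearest neighbour search (the 2016 result)}
\end{align*}
```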
This does motivate further work on the best lattice sieves, and this has already been done recently: taking this 2016 algorithm, studying its practical overhead and how to implement it efficiently, on GPUs for instance, and seeing what performance you can get, because now we know that this result from 2016 is actually optimal, at least within this model. So it makes sense to really zoom in on this particular approach. And if you are working on, say, designing cryptography and you want to choose your parameters, then what our contributions tell you is that you have some kind of conditional security guarantee: if an attacker tries to attack your scheme by, say, solving the underlying lattice problem, and they solve this lattice problem by running lattice sieving with some hash-based nearest neighbour technique, then they are not going to do better than what is already known from 2016. The attacker would really have to improve some other part of the algorithm to break your scheme and decrease your security.

Okay, so I will not cover all the results in the remainder of this talk. I will just briefly try to sketch the lattice sieving contribution, because that is arguably the nicest result we have. So, to state it a bit formally: in the closest pairs problem we are given some bounded metric space M with a distance metric d, and a target distance r, which specifies when vectors count as nearby. And we have a list L which is a subset of M, with elements drawn uniformly at random from M. So we are not considering worst-case settings; we are considering average-case settings where L is really random over M. The problem asks to find almost all pairs x, y in L such that their distance is at most r. And "almost all" here means that, say, you want to find 90% of them, or 95%: you can miss a constant fraction, but you cannot miss too many of them. On the other hand, you also do not need to find all of them; if you find 90% of them, that is already good enough.

One of the main approaches for solving this problem uses locality-sensitive hash functions. These are functions h which satisfy the property that if you have two vectors in your metric space, say x, y in M, and their distance is small, then the probability that the hash values collide, so that the hash of x equals the hash of y, is much larger than if x and y were drawn uniformly at random from M, without this guarantee on their distance. These hash functions are locality-sensitive in the sense that if two vectors are nearby in space, then their hash values will be more similar than if they are random in space. If you have such hash functions, and in particular a big gap between the left-hand side and the right-hand side, then you can design an algorithm around this by building and populating hash tables using these hash functions. For a vector x, you store it in the hash bucket labelled h(x), and for a vector y, you store it in the hash bucket labelled h(y). You do this for all vectors, and then within each hash bucket you try to combine pairs of vectors to make progress. You hope that these are close pairs, and because the hash function has this locality-sensitive property, ideally this gives you a faster algorithm than going through all pairs of vectors.
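Here is a small sketch of that bucketing idea, under my own assumptions: the vectors are numpy unit vectors, "close" means dot product at least `target_dot`, and the hash family is the simple random-hyperplane (sign) hash, which is just one concrete locality-sensitive family for the sphere chosen for readability; it is not the optimal spherical-cap filtering from the 2016 result, and names like `closest_pairs_lsh` are illustrative.

```python
# Hash every vector into buckets with a locality-sensitive hash, then only
# compare vectors that landed in the same bucket, instead of all pairs.
from collections import defaultdict
import numpy as np

def hyperplane_hash(x, hyperplanes):
    """Concatenate the signs of <x, h_i> into a hashable tuple (SimHash-style)."""
    return tuple((hyperplanes @ x > 0).astype(int))

def closest_pairs_lsh(vectors, target_dot, num_hyperplanes=8, repetitions=20, seed=1):
    """Return pairs (i, j) with <x_i, x_j> >= target_dot, found via hashing."""
    rng = np.random.default_rng(seed)
    dim = len(vectors[0])
    found = set()
    for _ in range(repetitions):            # fresh hash tables boost the success probability
        hyperplanes = rng.normal(size=(num_hyperplanes, dim))
        buckets = defaultdict(list)
        for i, x in enumerate(vectors):
            buckets[hyperplane_hash(x, hyperplanes)].append(i)
        for idxs in buckets.values():        # only compare vectors within the same bucket
            for a in range(len(idxs)):
                for b in range(a + 1, len(idxs)):
                    i, j = idxs[a], idxs[b]
                    if vectors[i] @ vectors[j] >= target_dot:
                        found.add((i, j))
    return found
```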
If you want to instantiate this concretely for lattice sieving, the space is the unit sphere: under the usual heuristics, the points behave like points on the unit sphere, and the distance is the Euclidean metric. Since all vectors have the same norm on the unit sphere, you can equivalently consider the angular distance, or ask for a large dot product. It is important to note here that the size of the list you encounter in lattice sieving, the size of this L, is exponential in the dimension of the space. For the information set decoding setting, the metric space would be the Hamming cube with the Hamming distance, the L1 distance, and again the size of the list is exponential in the dimension of the space. Locality-sensitive hashing has been studied outside cryptography for a long time already, and people have studied it for many different metrics. But it is important to note that in the nearest neighbour literature, people commonly consider the case where the size of the list is sub-exponential in the dimension. People have also looked at lower bounds for this hash-based model, for locality-sensitive hashing in general, but those are usually in the regime where the list size is sub-exponential in the dimension, and that means they do not apply to the setting here for lattice sieving or information set decoding, because we have an exponential list size.

Okay, so for these hash functions, ideally we would like to find hash functions where the gap between the left-hand side and the right-hand side is as big as possible. To instantiate this on the Euclidean sphere, you replace M by the unit sphere, and two vectors being close can be interpreted as the dot product between x and y being larger than some parameter gamma. Now note that h(x) equals h(y) exactly when they are both equal to some common hash value z, and you can sum over all these values z to rewrite these probabilities as a summation, on both the left-hand side and the right-hand side. Also notice that h(x) equals z exactly when x is in the preimage of z under the hash function. So you can imagine that this hash function induces some kind of partition of the sphere, where each part of the partition corresponds to a different hash value, and the preimage of z under h is some region of the sphere which maps to that hash value. So h(x) = h(y) = z can be rewritten as: x and y are both in the preimage of z.

Usually, these hash regions have similar shapes for all hash values. You cannot simply assume that if you want to prove a lower bound, but for this talk, let us just assume it for intuition. Then you can rewrite these sums as the number of hash values times one of these terms, say the one for the preimage of zero. And since we want the left-hand side to be much bigger than the right-hand side, we can cancel this common factor on both sides. Now we can write this preimage of zero as some region A on the sphere. And note that the right-hand side is then the probability that x and y are both in this region A, where x and y are independent and uniformly random on the sphere, so we can rewrite the right-hand side as the square of the relative surface area of A, the relative volume of A compared to the whole unit sphere.
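In formulas, the rewriting just described looks roughly as follows, in my own notation: z ranges over the hash values, N_h is their number, sigma is the normalized surface measure on the sphere, and all buckets are assumed to have the same shape A.

```latex
\Pr_{x \cdot y \ge \gamma}\bigl[h(x) = h(y)\bigr]
  = \sum_{z} \Pr_{x \cdot y \ge \gamma}\bigl[x, y \in h^{-1}(z)\bigr]
  \;\approx\; N_h \cdot \Pr_{x \cdot y \ge \gamma}\bigl[x, y \in A\bigr],
\qquad
\Pr_{x, y \text{ uniform}}\bigl[h(x) = h(y)\bigr]
  \;\approx\; N_h \cdot \sigma(A)^{2}.
```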
And we want to find functions for which, for these regions A, the left-hand side here is much bigger than the square of the relative volume of this region A. So concretely, you can rephrase this whole problem, roughly speaking, as follows: given a fixed size of your region on the sphere, say the size of this A has to be a quarter of the sphere, you want to find the region which maximizes the left-hand side, that is, the probability that if two vectors are actually close, then they are both in this same region, with the maximum probability you can get under this size constraint.

To get a lower bound here, what we used is the Baernstein–Taylor inequality from the 1970s. This is a somewhat abstract, formal statement, but what you have is an inequality between a left-hand side, a double integral of f, g and h, and a right-hand side with f-star, g-star and h, where f-star and g-star are the symmetric decreasing rearrangements of f and g, which is a bit of a mouthful. F-star is a function of f, in the sense that, given f, you can construct the unique f-star which has the property of being a symmetric decreasing rearrangement of f, and the same for g.

We use this inequality for a specific choice of f, g and h: in this case, f(x) is the indicator function of x lying in some set A. We have not fixed A yet, but for any A we can use these f and g. And the h here is defined as the indicator of the dot product being bigger than gamma. If you use these f and g, then the f-star and g-star that you get are the indicator functions of x and y lying in C(A), where C(A) is the spherical cap with the same volume as A: the sigma of A is the same as the sigma of C(A). That is just the f-star you get if you take this f. And you can plug this into the inequality that was proven before, so you get the left-hand side and the right-hand side here. Now, if you look closely at what the left-hand side and the right-hand side really say, you can see that these are just probabilities over the unit sphere. The left-hand side is the probability, over x and y uniformly random on the sphere, that x and y are in this region A and that their dot product is larger than gamma. And the same on the right-hand side, except that instead of A you have this spherical cap with the same volume as A.

And now that you have a probability over x and y being in A and x dot y being bigger than gamma, you can divide by the probability that x dot y is bigger than gamma, and you get exactly this inequality, which is what we were looking for on the previous slide. We wanted to maximize the probability that x and y are in A, given that x and y are uniformly random on the sphere, conditioned on x and y having a dot product larger than gamma. And now we have this inequality which tells you that this probability is maximized when you take the spherical cap with the same volume as A. So, if you draw it in a figure: if you want x and y to often be in the same region A, where x and y are close on the sphere, then the best you can do is use these spherical caps on the right. A wiggly shape A is not really going to be effective.
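If you want to convince yourself numerically of this "caps win" statement, here is a toy Monte Carlo check of my own (not from the paper): it compares a spherical cap against an equal-measure two-sided band on the sphere in dimension 10, with close pairs meaning dot product at least gamma = 0.5 and region measure roughly p = 0.1; all of these parameter choices are arbitrary illustrations.

```python
# Among regions of (roughly) the same measure, a spherical cap should contain
# both points of a correlated pair more often than another shape, such as a
# two-sided band {x : |x_0| >= t'}.  We estimate both probabilities by sampling.
import numpy as np

rng = np.random.default_rng(2)
dim, gamma, samples = 10, 0.5, 200_000

def random_sphere(n):
    v = rng.normal(size=(n, dim))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

# Correlated pairs: draw uniform pairs and keep those with inner product >= gamma.
x, y = random_sphere(samples), random_sphere(samples)
keep = np.einsum('ij,ij->i', x, y) >= gamma
x, y = x[keep], y[keep]

# Thresholds chosen empirically so that cap and band both have measure ~p.
p = 0.1
coord = random_sphere(500_000)[:, 0]
t_cap = np.quantile(coord, 1 - p)            # Pr[x_0 >= t_cap]   ~ p
t_band = np.quantile(np.abs(coord), 1 - p)   # Pr[|x_0| >= t_band] ~ p

both_in_cap = np.mean((x[:, 0] >= t_cap) & (y[:, 0] >= t_cap))
both_in_band = np.mean((np.abs(x[:, 0]) >= t_band) & (np.abs(y[:, 0]) >= t_band))
print("cap :", both_in_cap)    # expected to come out larger
print("band:", both_in_band)
```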
The best you can really do is use spherical caps. So, going back to the previous slide, to this probability we have here: this Baernstein–Taylor inequality tells us that this probability is always at most equal to the right-hand side, and we wanted to maximize it, so we really want to make this an equality. And we can make this an equality, because this C(A) on the right actually has the same volume on the sphere as A. So this really tells us that the solution to this problem is to use spherical caps. And note that, if you then want to go back to proving that these hash-based approaches are optimal, you also need to take into account that maybe the shapes of the regions are not all exactly the same, and you need to handle the sum over different regions as well. There are more details in the paper that go through this proof, but this is the main intuition telling you that the best shape you can use is the spherical cap, and it comes from this result of Baernstein and Taylor from the 70s.

So if you use this result and complete the proof, you get this conditional bound: if you do lattice sieving with hash-based nearest neighbour searching, then you cannot do better than using these spherical cap shapes. And this has already been studied in many different settings. For classical sieving, for instance, which is just combining pairs of vectors using some hash-based approach, we now know that this result from 2016, with the 0.292 exponent, is conditionally optimal, conditional on being in this hash-based setting. You can also do sieving and then apply quantum techniques on top of it, for instance Grover search. This was studied in 2016 as well, and it was shown that you get the exponent 0.265, better than the 0.292. And again, this is now conditionally optimal, conditional on using lattice sieving with some form of hash-based searching and using Grover search specifically. Maybe you can do better with other quantum techniques, and indeed, recently there was this result from 2021 which showed that if you do lattice sieving, again with some form of nearest neighbour searching, but instead of Grover you use quantum random walks, then you get an even better exponent, 0.2570. But this improves the quantum part; it does not improve the nearest neighbour part. So this does not actually violate the lower bound that we proved; everything is fine. And also for tuple sieving and for closest vector algorithms, these results are now conditionally optimal, conditional on using some form of hash-based nearest neighbour searching. Again, there may be different ways of doing nearest neighbour searching or closest vector searching, but so far, at least in lattice sieving, people have really studied hash-based approaches a lot: several of the sieves that have appeared in the last decade are all based on hash-based nearest neighbour searching.

Okay, and then just to conclude with a few open problems that might be of interest. As I stated before, all these results are conditional on using some form of hash-based nearest neighbour searching to solve the closest pairs problem.
But maybe you can use some other closest pairs techniques which do not fall within this model, and then you might still get better results. That might be possible, who knows; you are just not going to do better within this hash-based model. Also, this only affects the closest pairs subroutine of these algorithms, but you may be able to improve other parts, as people have been doing with, say, G6K for lattice sieving: you improve the basis reduction, or you improve some subexponential overhead somewhere. That is still worthwhile to do. Also, these results are only about asymptotics, about the leading constant in the exponent. They only say that, asymptotically, if you focus purely on minimizing the constant in the exponent, then you are not going to do better than using these spherical caps, for lattice sieving at least. But maybe you can decrease the subexponential overhead, either using the same method, or even using a different method which has a worse constant in the exponent but a lower overhead; that might already be better in practice, because, concretely, we are interested in the parameters that we use in cryptography, and not just in asymptotics. And then there is the bound for ISD. I did not really go into it in this talk, but that bound is not tight, as I mentioned before. So either there may be better techniques that give better upper bounds, or there may be better lower bounds which actually match the current techniques. That is still an open problem which we have not solved yet. Anyway, thank you for watching.