So hello, I'm Andre, and I will give a talk on Dissection-BKW, which is joint work with Felix Heuer, who is also in the audience today, as well as Robert Kübler and Alexander May, all from Ruhr University Bochum, and Christian Sohler from Technical University Dortmund.

Let's start with the definition of the problem we will focus on today. We have also seen this problem in the talk of Shafi Goldwasser: the learning parity with noise problem, or LPN problem for short. For this problem, we are given an arbitrary amount of samples, where a sample consists of a vector a_i sampled uniformly at random from F_2^k, together with a label. This label is the scalar product of the random vector with a secret vector s, plus an error term. The error term is 0 or 1, and it is 1 with probability tau, which is strictly less than one half. The goal is, given an arbitrary amount of these samples, to recover the secret vector s.

Why is this problem important for us? The commonly used hardness assumption is that this problem is not solvable in polynomial time for a sufficiently high error rate, and so it forms the basis of many cryptographic applications: for example, there are authentication schemes as well as encryption schemes based on the hardness of the LPN problem. As soon as this is the case, we need to understand how hard this problem actually is, in order to determine proper parameters for the mentioned schemes and ensure certain security levels.

The state-of-the-art algorithm to solve the LPN problem is the one by Blum, Kalai, and Wasserman from 2000. This algorithm has the advantage of a slightly subexponential runtime, while it has the drawback of a memory and sample complexity that is as high as the time complexity, so also slightly subexponential. This makes the algorithm quite impractical. There have been some experiments done in 2016 and 2017, but all were restricted to very small dimensions due to the huge memory requirement of this algorithm. If we now want to estimate the concrete hardness of suggested parameters for this problem, we need to rely on experimental data, and as long as these experiments are restricted to such small dimensions, we get estimates that are quite inaccurate. So we want to present a BKW variant here that is applicable for any given amount of memory. We achieve this by giving the first time-memory trade-off for the BKW algorithm, which of course reduces the memory complexity while we have to suffer a slight increase in time complexity.

First I will illustrate the BKW algorithm, or a slight variant thereof. Before I do so, I would like to emphasize its core idea: if we are given two samples, say a_1 and a_2 with corresponding labels, we can simply add these samples component-wise to obtain a new sample, just because of the linearity of the scalar product. It is worth mentioning that the new error term e', which is the sum of both previous error terms, is 1 with a higher probability than before, but we do not care for the moment. We just keep in mind that we can add samples to create new ones.

The algorithm then starts with a list containing the a_i as rows and searches for pairs that have a special form: namely, they shall end on the same pattern of bits, or, as we will call it, they shall be equal on a stripe.
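To make this core idea concrete, here is a minimal Python sketch (my own illustration with hypothetical names and parameters, not code from the paper) that draws LPN samples and adds two of them component-wise:

```python
import random

def lpn_samples(secret, tau, count):
    """Draw `count` LPN samples (a, <a, s> + e mod 2) with error rate tau."""
    k = len(secret)
    out = []
    for _ in range(count):
        a = [random.randrange(2) for _ in range(k)]
        e = 1 if random.random() < tau else 0
        label = (sum(x * y for x, y in zip(a, secret)) + e) % 2
        out.append((a, label))
    return out

def add_samples(sample1, sample2):
    """Component-wise sum of two samples: by linearity of the scalar product,
    the result is again an LPN sample for the same secret, but its error term
    is e1 + e2, so it is noisier than either input."""
    (a1, b1), (a2, b2) = sample1, sample2
    return ([(x + y) % 2 for x, y in zip(a1, a2)], (b1 + b2) % 2)
```

By the piling-up lemma the combined error is 1 with probability 2*tau*(1 - tau), which is still below one half but larger than tau; this is exactly why the number of additions has to be controlled later on.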
The algorithm then proceeds by adding these samples together to form a new sample that ends on zeros. It continues searching for more pairs behaving the same way until it has generated a list that is as large as the initial one. Then it can simply start all over again, doing the same procedure for the next stripe and the next stripe, until just one bit of each vector is left unreduced. With good probability this bit is one, so we obtain a lot of unit vectors.

And what can we do with unit vectors? If we have a unit vector and look at its label, which is the scalar product of the secret with the unit vector, this is obviously just one bit of the secret, here the first one, plus the error term. And since the error probability is strictly less than one half, even if we add samples together it stays strictly less than one half. So if we have enough of these unit vectors, we can do a majority vote for one secret bit. If you analyze this algorithm, you will conclude that it indeed has a slightly subexponential runtime of 2^(k / log k), and also memory and sample complexity of the same order.

So how do we construct time-memory trade-offs from here? For this we need one main observation, which is the following: we do not need to restrict the algorithm to forming pairs of vectors to eliminate a stripe. We could also allow the algorithm to sum up three or even more, let's say c, vectors to eliminate such a stripe. The advantage should be that the number of combinations of c elements increases exponentially in c, and so does the number of combinations that add up to zero on such a stripe, so we are able to start with a smaller input list.

In the following I will give a framework that does not restrict the algorithm to forming pairs of vectors, but allows it to sum up three or even more, let's say c, vectors to eliminate such a stripe. For this we need to define the c-sum problem, which is a variant of a well-known problem in computer science. So it is not new, but we define it slightly differently: given a list of size N containing uniformly distributed elements from F_2^b, where b is now the width of such a stripe, find N combinations of c elements that each add up to zero. So we are looking for a number of these c-sums adding up to zero that is as large as the initial list. Seen this way, the original BKW, or the slight variant we have seen, solves the 2-sum problem by finding an amount of 2-sums adding up to zero that is as large as the initial list.

To ensure a solution to our problem we obviously need N to be sufficiently large, since otherwise there do not exist enough combinations. The expected number of combinations adding up to zero on a stripe is N^c / 2^b, since there are N^c combinations of c elements and the probability that one such sum is zero on a stripe is 2^(-b). We need this to be at least the initial list size to make the problem solvable, and solving this for N gives the lower bound N >= 2^(b / (c-1)). Since we want to start with the smallest possible list, we will use exactly this value for N. Here we nicely see the dependence of N on b and c: if we increase c, we can indeed start with a smaller list, but if we increase the width of a stripe, we have to increase our list size again.
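One common way to realize the pairing step on a stripe is to bucket samples by their stripe value; here is a small sketch of one such stripe-elimination pass for the pair case (again my own illustration, not the paper's exact procedure):

```python
from collections import defaultdict

def eliminate_stripe(samples, start, width):
    """One reduction pass in the pair (2-sum) case: bucket samples by their
    bits on the stripe [start, start + width), then add pairs from the same
    bucket so that every new sample is zero on that stripe."""
    buckets = defaultdict(list)
    for a, label in samples:
        buckets[tuple(a[start:start + width])].append((a, label))
    reduced = []
    for group in buckets.values():
        a0, b0 = group[0]                  # pair everything with one pivot
        for a, b in group[1:]:
            reduced.append(([(x + y) % 2 for x, y in zip(a0, a)],
                            (b0 + b) % 2))
    return reduced
```

Running such a pass stripe by stripe, and finally majority-voting the labels of the surviving unit vectors, is exactly the high-level flow described above.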
So keep this in mind; the main idea now is to solve the c-sum problem repeatedly, on each stripe, instead of just forming pairs of vectors.

We need one more observation before stating the framework. We have to understand where the time-memory trade-off actually comes from, since so far this looks like a pure memory reduction technique for the algorithm; unfortunately it has a drawback. To understand it, we have to go back to the original BKW algorithm, which forms pairs of vectors to eliminate a stripe, and does so in each iteration. So in the end we have constructed unit vectors that are sums of 2^a samples, where a is the number of iterations, since in each iteration we take two vectors from one list to form an element of the next one. And as I told you in the beginning, if we add samples together, we increase the error. So the error of the final unit vectors is much higher than that of the initial samples. What I did not mention is that the BKW algorithm already operates at the limit of adding samples: we cannot afford to add more than these 2^a samples together, otherwise we would lose the slightly subexponential runtime of the algorithm.

If we now start adding, for example, three vectors together, we can start with smaller lists in all iterations, but we would add too many samples. So we have to compensate somehow, and this is done by increasing the width of a stripe, which then leads to fewer iterations overall. For a proper choice of the stripe width we end up with unit vectors that are again sums of roughly 2^a samples, as in the case of building just 2-sums. But, as we have seen before, if we increase the width of a stripe, we have to increase our list size again, and this is exactly what we lose in time complexity.

So now we are ready to state the framework. It should be relatively clear how this works, but we have to make an observation here. We start with a list of size N and then repeatedly solve the c-sum problem until we have generated unit vectors, and then we can do a majority vote for one bit of the secret. The observation is that the memory complexity depends only on the initial list size, which, as we have seen, depends only on the chosen c-sum instance. And the time complexity of this algorithmic framework depends only on the algorithm used to solve the c-sum problem, as long as we do not iterate over too many stripes, which we do not. So we can conclude that the time and memory complexity of our algorithmic framework are equal to the time and memory complexity of the algorithm used to solve the c-sum problem. Knowing this, we can now concentrate on algorithms solving the c-sum problem; we will do so and then plug them into our framework to obtain time-memory trade-offs.

Let's start with a very simple, naive algorithm for the c-sum problem, leading to our first time-memory trade-off. It is a nearly brute-force approach; we call it the naive c-sum algorithm, and it does the following: it computes all (c-1)-sums of list elements, searches for each such sum in the list, and if it is found, stores the corresponding c-sum. This algorithm obviously has time complexity upper bounded by N^(c-1), since it has to compute all (c-1)-sums, and the memory complexity is N, for a properly chosen list size N as we have seen before.
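A minimal sketch of this naive c-sum solver, assuming the stripes are encoded as b-bit integers (the function name and representation are my own):

```python
from itertools import combinations

def naive_c_sum(stripe, c, needed):
    """Naive c-sum solver: enumerate all (c-1)-sums of list entries, look the
    result up in the list, and record the corresponding c-sum as index tuples.
    Time is roughly N^(c-1); memory stays at the list size N."""
    where = {}
    for i, v in enumerate(stripe):
        where.setdefault(v, []).append(i)
    solutions = []
    for idxs in combinations(range(len(stripe)), c - 1):
        partial = 0
        for i in idxs:
            partial ^= stripe[i]            # addition over F_2^b is XOR
        for j in where.get(partial, []):    # need stripe[j] == partial
            if j > idxs[-1]:                # count each c-set only once
                solutions.append(idxs + (j,))
                if len(solutions) >= needed:
                    return solutions
    return solutions
```

Here `needed` would be the initial list size N, since the c-sum problem asks for as many zero c-sums as there are list elements.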
Using this algorithm now inside our algorithmic framework to solve the c-sum problem, we get an algorithm solving the LPN problem, and these are its time and memory complexities, given as exponents. In comparison to the original BKW, the time exponent picks up a log c factor, so it increases only logarithmically in c, while the memory exponent decreases nearly linearly in c; that is quite nice behaviour. If we illustrate this in a graphic where we plot time over memory, the coordinate (1, 1) corresponds to the original BKW time and memory exponent of k / log k; you also obtain this complexity by plugging c = 2 into our equations. If we now plug in higher values for c, say c = 3, we get a point to the upper left, as expected from a time-memory trade-off: the memory exponent decreases while the time exponent increases. For even higher choices of c we obtain the further instantiations of our framework shown in the plot.

In the following we want to concentrate on more sophisticated algorithms for the c-sum problem. For this I am going to explain the idea of Schroeppel-Shamir, or, to be more specific, what we will see is the heuristic version by Howgrave-Graham and Joux. This algorithm is specifically designed to solve the 4-sum problem, so it will give just one instantiation of our framework, but there is a more general class of algorithms forming an abstraction of the Schroeppel-Shamir technique, called dissection, and this will offer more instantiations of our framework. For a brief understanding, though, I will concentrate on the Schroeppel-Shamir technique.

The first observation we need is that if we are searching for 4-sums in one big list, we can simply split the list into four equal parts, treat it as four different lists, and search for 4-sums between these lists, taking one element from each list. What we lose is only a polynomial fraction of the 4-sums, which does not affect the asymptotics, so we can just treat the problem like this.

What the algorithm then does is combine two lists at a time, and it does so by matching the lists on a specific pattern on the last bits, here 00: it searches for pairs, one element from each list, that add up to 00 on the last bits. The same is done for the two right lists, and then the two middle lists are combined in the same fashion, so that the results add up to zero on all coordinates. So we have generated our first 4-sums here. Obviously the final list is not as large as the initial one, so we have to compute more 4-sums somehow. The algorithm does so by starting all over again and matching the base lists on a different pattern on the last bits; the same pattern is used for both pairs of lists. The lists are then again combined to form solutions on all bits. Note that, by construction, if we take a vector from the left middle list and one from the right middle list, they already add up to zero on the last bits, so the constraint on the last level only concerns the bits not matched yet, and in expectation we generate the same amount of solution vectors. In particular, it does not matter which pattern we match on in the middle level, so we can just repeat this for all different patterns to do an exhaustive search for all 4-sums summing to zero. For a properly chosen list size, the initial list and the output list will have equal size after this exhaustive search.
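To illustrate the structure, here is a compact sketch of one such pattern iteration in Python (my own naming and representation, not the paper's pseudocode): stripes are b-bit integers, and each list entry carries the tuple of original indices it was built from.

```python
def join(A, B, target, mask):
    """All combined entries (xa ^ xb, parts_a + parts_b) with
    (xa ^ xb) & mask == target, built via a hash-table lookup."""
    lookup = {}
    for v, parts in B:
        lookup.setdefault((v ^ target) & mask, []).append((v, parts))
    return [(va ^ vb, pa + pb)
            for va, pa in A
            for vb, pb in lookup.get(va & mask, [])]

def four_sums_for_pattern(L1, L2, L3, L4, low_mask, pattern, full_mask):
    """One iteration of the Schroeppel-Shamir / Howgrave-Graham-Joux style
    4-sum join: L1+L2 and L3+L4 are forced to `pattern` on the low bits, so
    in the final join the low bits cancel automatically and only the
    remaining bits still have to become zero."""
    M12 = join(L1, L2, pattern, low_mask)   # left middle list
    M34 = join(L3, L4, pattern, low_mask)   # right middle list
    return join(M12, M34, 0, full_mask)     # zero on all coordinates
```

Looping `pattern` over all values on the low bits then performs the exhaustive search over all 4-sums described above; a base-list entry would be created as `(value, (i,))` for the i-th element of the corresponding list.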
This algorithm has a time complexity of N^2, compared to the naive approach with complexity N^3. And there is the more general class of algorithms I mentioned, called dissection; the algorithms of this class have, in general, a runtime of about N^(c - sqrt(c)), compared to the naive runtime of N^(c-1), so an overall improvement, and it should therefore lead to a better time-memory trade-off. And indeed it does: if we now use the dissection algorithms, the members of the dissection class, to instantiate our framework, so that the subroutine for solving the c-sum problem is dissection, we achieve the stated time and memory complexities, where we benefit from a factor of roughly (1 - 2/sqrt(c)) in the time exponent, which is less than one and therefore improves the trade-off.

Again illustrating this in the graphic of time over memory: if we use the Schroeppel-Shamir algorithm to instantiate our framework, we obtain the instantiation marked at (2/3, 4/3) instead of (2/3, 2), meaning we decrease the time exponent from 2 to 4/3, which is a quite nice improvement. Using the other members of the dissection class, we obtain the instantiations lying all beneath the previous trade-off and therefore forming a better one. The lines illustrate the behaviour of the trade-off, but I want to mention that if we are, memory-wise, somewhere in the region where we have more memory available than we need for building 4-sums but less than we need for building 2-sums, we have no instantiation besides the Schroeppel-Shamir variant of BKW, the green instantiation mark at the end of the red line. We would therefore like an algorithm that is able to leverage the additional memory by decreasing the time exponent again.

To overcome this issue, we give a slight generalization of the dissection technique, which we call tailored dissection, since it is tailored to any given amount of memory; the technique was independently discovered by Itai Dinur at around the same time. It works as follows. It is applicable to any algorithm of the dissection class, but again I concentrate on the Schroeppel-Shamir technique. What can we do if we have more memory available than we need for building 4-sums, but less than we need for building 2-sums? We can simply increase the base lists to the amount of memory we have, which then forces us to match the middle lists on a larger pattern, just to control the size of the lists. Then we generate more solution vectors per iteration, since the lists are larger and, by construction, they already add up to zero on nearly all coordinates. If we repeat this for different patterns, we generate more solution vectors; but if we repeated it for all patterns, we would generate far too many solutions, so we have to stop at some point, and this point is reached exactly when the input and output list sizes are equal.

Again illustrating this in our graphic: if we apply this tailoring technique to the Schroeppel-Shamir variant of our BKW, we obtain the solid line as a trade-off, a linear interpolation between the original BKW complexities and the Schroeppel-Shamir variant of BKW, which is indeed able to leverage any increase of memory between these two points by decreasing the time complexity.
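A rough sketch of this tailoring in the 4-sum case, reusing the hypothetical `four_sums_for_pattern` helper from the sketch above (again my own naming, assuming stripes are b-bit integers):

```python
def tailored_four_sums(L1, L2, L3, L4, low_bits, full_bits, needed):
    """Tailored variant: with enlarged base lists, the middle lists are
    matched on more low bits (`low_bits`), and instead of exhausting all
    patterns we stop as soon as `needed` solutions have been collected,
    i.e. as soon as the output list is as large as the input lists."""
    low_mask = (1 << low_bits) - 1
    full_mask = (1 << full_bits) - 1
    solutions = []
    for pattern in range(1 << low_bits):
        solutions.extend(
            four_sums_for_pattern(L1, L2, L3, L4, low_mask, pattern, full_mask))
        if len(solutions) >= needed:      # input and output sizes now match
            break
    return solutions[:needed]
```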
Applying this technique to the other members of the dissection class and using them inside our framework, we obtain this solid line as a piecewise linear trade-off, which is in fact able to leverage any increase of memory by decreasing the time complexity.

Summarizing our results: we give the first BKW variant that is applicable for any given amount of memory, which we achieved by abstracting the way the BKW algorithm solves the LPN problem into solving the c-sum problem. In this context we give a generalization of the dissection technique, which is in fact a time-memory trade-off for solving the c-sum problem via the dissection algorithms. Additionally, not addressed in this talk but addressed in our paper, we give a first quantum version of the BKW algorithm, which is a trade-off: it of course performs worse in terms of time than the original BKW, but it forms the best trade-off so far. And our results are easily applicable to the LWE setting. Thank you very much.