Hello, my name is Itaï and I would like to tell you about my paper, Cryptanalytic Applications of the Polynomial Method for Solving Multivariate Equation Systems over GF(2). In this work we consider the problem of solving polynomial equation systems. The input to the problem is a polynomial system, denoted by E, that consists of m polynomials over a finite field F, where each polynomial has n variables and is given by its algebraic normal form, or ANF, as a sum of monomials. Now, the algebraic degree of each polynomial is bounded by a small constant d. The goal is to find a solution to E, meaning an assignment to all the n variables that zeroes all the polynomials. Now, for d = 1 the system is linear and can be solved in polynomial time using Gaussian elimination. However, the problem is NP-hard already for quadratic equations, even for the specific field F2. Now, in terms of prior work, the standard technique for solving polynomial systems is to find a convenient representation of the ideal generated by the polynomials, usually in the form of a Gröbner basis. And it should be said that the complexity analysis of such algorithms is typically heuristic. On the other hand, at SODA 2017, Lokshtanov et al. presented the first worst-case algorithms with exponential speedups over brute force for solving polynomial systems. These algorithms were based on the so-called polynomial method in circuit complexity, which is a technique for proving circuit lower bounds that has recently been applied in algorithm design. In this work we will mainly be interested in solving polynomial systems over the field F2, so the problem is the same, but now we're specializing the field to be F2. And as I already mentioned, the problem is NP-hard even for quadratic systems. In fact, assuming the exponential time hypothesis, there is no sub-exponential algorithm for the problem.
However, it is still a fundamental problem in computer science, and it's widely studied in cryptography, especially in the domain of multivariate cryptosystems. This slide summarizes the most relevant prior work. In 2010, Bouillaguet et al. presented an optimized exhaustive search algorithm for the problem with a very detailed complexity analysis, and the algorithm was shown to run very fast in practice. In 2013, Bardet et al. presented an algorithm based on a hybrid between exhaustive search and linear algebra. The complexity of the algorithm was on the order of 2^0.79n. And then later, in 2017, Joux and Vitse presented another variant of the hybrid algorithm. It had no detailed complexity analysis, but it was shown to run quite fast in practice. Now, what about the polynomial method algorithms? The paper that I already mentioned from 2017 by Lokshtanov et al. had a complexity for solving quadratic systems on the order of 2^0.87n, and it also had an extended analysis for larger degree. This algorithm was improved in 2019 by Björklund, Kaski, and Williams to run in time on the order of 2^0.804n, and again they had an extension for larger degree. And in 2021, this algorithm was further improved. Now, in this work we're mainly interested in the concrete complexity of solving F2 equations, and what I mean by concrete is non-asymptotic, meaning the complexity analysis should have no hidden terms. This type of analysis is relevant for choosing parameters for concrete cryptosystems. In terms of prior work, I already mentioned that these two algorithms were shown to run fast in practice. And the algorithm by Bardet et al. was analyzed by the authors, and they estimated that it beats brute force for instances with n at least 200. So these are quite large instances, but they're still relevant for cryptography.
Now, in terms of the polynomial method algorithms, unfortunately the complexity analysis of previous algorithms was entirely asymptotic. If you look deeper into these algorithms, it's not very difficult to see that the asymptotic notation hides quite large constants, so it's not really expected that, as they are, these algorithms will be relevant for cryptography. The main result of this paper is a concretely efficient polynomial-method-based algorithm for solving equations over F2. The complexity of the algorithm for random equation systems, measured in terms of bit operations, is n^2 times 2^0.815n for quadratic systems, and there's also an extension for larger degree. Unfortunately, there is an obstacle to obtaining a fast practical implementation of the algorithm, which is a high memory complexity of roughly 2^0.63n for quadratic equations. This algorithm seems to beat previous works in terms of concrete time complexity for many interesting parameter ranges, but as I already mentioned, its downside is that it currently has no fast practical implementation. Here's the complexity for some specific instances. You can see that for quadratic systems, the algorithm beats exhaustive search in terms of time complexity starting from fairly small values of n, say n = 80 or smaller. And perhaps a bit surprisingly, it also beats exhaustive search in time complexity for larger degree, say degree 4, starting from n = 100 or so. The main application of our algorithm is in cryptanalysis of the Picnic signature scheme, which is an alternate third-round candidate in the post-quantum standardization project currently being run by NIST. What we show is that some instances of Picnic3 did not achieve their claimed security level. You can see this from the table: the larger instances, with claimed security levels of 196 and 255 bits.
So our attack, in terms of time complexity, is below the claimed security level. Of course, the attacks also consume a very large amount of memory, but the security claims were only formulated in terms of time complexity. Next, I will give some background, then I will give an overview of the algorithm, and finally I will conclude. Let me first define some notation. I'm going to use capital letters for symbolic variables and lowercase letters for assignments to these variables. As an example, here's an equation system E with n = 5 variables, m = 3 equations, and algebraic degree d = 2. Okay, so let me now give some background about the polynomial method. Given an equation system E consisting of m polynomials, let's define the polynomial F, which is just a product of m terms, where the i-th term is equal to P_i(x) + 1. Now, let's assume an assignment x is a solution to E, meaning it zeroes out all of these polynomials. Then all the terms of F equal one, and therefore F(x) = 1. It's also not very difficult to see that the other direction holds as well, and this is why I'm going to call F the identifying polynomial of E. So it will be useful to analyze this polynomial F. Unfortunately, in general, the algebraic degree of F can be as high as d times m, because it contains m terms, each of degree d, and this degree is basically too high to manipulate efficiently. Therefore, what we will do is define what is known as a probabilistic polynomial, and we're going to denote this polynomial by F tilde. F tilde contains l terms for some parameter l, which is smaller than m, where the i-th term is equal to R_i(x) + 1, and each R_i(x) is a random linear combination of the m polynomials of E.
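The identifying polynomial is easy to demonstrate on a toy example. The following Python sketch uses a small hand-picked quadratic system (hypothetical, not from the paper) and checks that F(x) = prod_i (P_i(x) + 1) over GF(2) equals 1 exactly on the solutions of E.

```python
from itertools import product

# Hypothetical toy quadratic system over GF(2) with n = 3 variables.
# Each polynomial maps an assignment x (a tuple of bits) to {0, 1}.
polys = [
    lambda x: (x[0] * x[1] + x[2]) % 2,
    lambda x: (x[0] + x[1] * x[2] + 1) % 2,
]

def identifying_poly(polys, x):
    """F(x) = prod_i (P_i(x) + 1) over GF(2): equals 1 iff x solves the system."""
    value = 1
    for p in polys:
        value = (value * ((p(x) + 1) % 2)) % 2
    return value

# F is 1 exactly on the solutions of E, so this recovers the solution set.
solutions = [x for x in product((0, 1), repeat=3) if identifying_poly(polys, x) == 1]
```

For this toy system the only solution is (1, 0, 0), and enumerating F recovers exactly that assignment.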
Now, similarly to the previous definition, F tilde is the identifying polynomial of the equation system E tilde, which consists of the polynomials R_i. This F tilde has two interesting properties. The first property is that it approximates F in the following sense. Let's assume that F(x) = 1, meaning that x is a solution to the equation system E, and therefore it zeroes out all the polynomials P_i. As a result, it also zeroes out all their linear combinations, which means that x is a solution to E tilde as well, and therefore F tilde(x) = 1. Now, if F(x) = 0, this means that at least one of these polynomials takes the value one on x. And because the R_i's are random linear combinations of the polynomials of E, it's not very difficult to prove that with probability at least 1 - 2^-l, at least one of the polynomials of E tilde will equal one as well, and therefore F tilde(x) = 0 with probability at least 1 - 2^-l. Okay, so that's the first property that F tilde satisfies. The second property is that its degree is relatively low compared to the degree of F. This is because it contains l terms, each of degree d, so its degree is at most d times l, and we're going to choose l such that the degree of F tilde is relatively low, so that this polynomial is more efficient to manipulate. So how do we exploit this F tilde in an efficient algorithm? The way it's done is as follows. We're going to define another parameter n1, which is smaller than n, and partition the variables according to this parameter: we have the most significant variables y_1 up to y_{n-n1}, and the z variables z_1 up to z_{n1}. Now, the basic building block that we're going to use is the following.
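Both properties of F tilde can be sketched in code. The Python below uses a made-up toy system and parameters (nothing here is from the paper): true solutions of E always give F tilde(x) = 1, while a fixed non-solution survives with probability about 2^-l over the choice of the random linear combinations.

```python
import random
from itertools import product

# Hypothetical toy system: m = 4 polynomials in n = 4 variables.
polys = [
    lambda x: (x[0]*x[1] + x[2]) % 2,
    lambda x: (x[1]*x[2] + x[3]) % 2,
    lambda x: (x[0] + x[3]) % 2,
    lambda x: (x[0]*x[3] + x[1] + 1) % 2,
]

def make_f_tilde(polys, l, rng):
    """Build F tilde: fix l random GF(2) linear combinations R_1..R_l of the
    polynomials of E, and return the map x -> prod_i (R_i(x) + 1) mod 2."""
    rows = [[rng.randrange(2) for _ in polys] for _ in range(l)]
    def f_tilde(x):
        value = 1
        for row in rows:
            r = sum(c * p(x) for c, p in zip(row, polys)) % 2
            value = value * ((r + 1) % 2)
        return value
    return f_tilde

rng = random.Random(1)
l = 3

# Property 1: a solution of E zeroes every linear combination, so F tilde(x) = 1
# for every choice of the randomness.
for x in product((0, 1), repeat=4):
    if all(p(x) == 0 for p in polys):
        assert make_f_tilde(polys, l, rng)(x) == 1

# A non-solution yields F tilde(x) = 1 only with probability about 2^-l = 1/8.
bad = (0, 0, 0, 0)  # here P_4(bad) = 1, so bad does not solve E
false_positives = sum(make_f_tilde(polys, l, rng)(bad) for _ in range(1000))
```

Over 1000 independent choices of F tilde, the false-positive count concentrates around 1000 * 2^-3 = 125, matching the one-sided error bound.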
I'm going to describe it next, but the main idea is to compute expressions of the form U(y), where U(y) is a sum over all z's of F tilde applied to (y, z). So this is a function of y, and the basic building block is going to compute these expressions for all possible y's, and it's going to do this relatively efficiently, meaning faster than brute force, which has complexity 2^n. You can easily do this in complexity roughly 2^n just by iterating over each y independently: there are 2^(n-n1) such y's, and for each one of them you evaluate F tilde applied to (y, z) for all possible z's and sum the values mod 2. This iterates over the entire space and has complexity 2^n, but we're going to do it more efficiently than that. Now, after defining the basic building block, the main algorithms based on the polynomial method use this building block in order to compute the solutions of the equation system E. I'm going to describe how this is done later, but the main intuition is that this expression U(y) counts the number of solutions mod 2 of the equation system E tilde when the y variables are fixed. This is because F tilde is the identifying polynomial of E tilde, and when you sum over all z's mod 2, you're counting the number of solutions mod 2 of this restricted equation system; you're essentially computing the parity of the number of solutions of the restricted system. So this gives you intuition for why this expression relates to the solutions of the polynomial system. Now, here's how we're going to implement the basic building block. We're going to consider U as a polynomial in the symbolic variables y.
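The naive 2^n version of the building block, the one we then speed up, is just a double loop. Here is a minimal Python sketch, with a hypothetical stand-in function playing the role of F tilde:

```python
from itertools import product

def parity_table_naive(f_tilde, n, n1):
    """Compute U(y) = sum_{z in {0,1}^n1} f_tilde(y, z) mod 2 for every y,
    by brute force over the whole space: about 2^n evaluations in total."""
    table = {}
    for y in product((0, 1), repeat=n - n1):
        acc = 0
        for z in product((0, 1), repeat=n1):
            acc ^= f_tilde(y + z)  # concatenate the y-part and the z-part
        table[y] = acc             # parity of solutions with the y-part fixed
    return table

# Hypothetical stand-in for F tilde: f(x1, x2, x3) = x1 * x3.
# Summing over z = x3 gives U(y1, y2) = y1*0 + y1*1 = y1 (mod 2).
u = parity_table_naive(lambda x: x[0] * x[2], n=3, n1=1)
```

The table produced by this loop is exactly what the fast interpolation-plus-evaluation procedure described next computes, only at a much lower cost.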
Now, because U is a sum of partial evaluations of F tilde, which is of low degree, the degree of U is at most the degree of F tilde, so the degree of U is also relatively low, and we're going to exploit that in the algorithm. Here's a sketch of the algorithm. The first step interpolates the algebraic normal form of U as a function of the symbolic variables y. In the second step, we evaluate the polynomial U on all possible assignments, meaning all possible y's. For both of these steps we're going to use fast polynomial interpolation and evaluation algorithms. So let's do a sketch of the complexity analysis. The complexity formula of the first step involves two factors. The first factor accounts for the number of monomials in the symbolic representation of U. Why this number of monomials? Well, the number of y variables is n - n1, and the number of monomials in U is approximately (n - n1) choose deg(U), where deg(U) is the degree of U. The second factor is 2^n1, and it accounts for the fact that in order to evaluate U at any point y, we have to sum over 2^n1 values of z. So, in order to interpolate the algebraic normal form of U, we have to evaluate this expression for this many values of y, and this explains the complexity of the first step. As for the second step, we're evaluating U on all possible y's, and there are 2^(n-n1) such y's, which explains the complexity of the second step. Now, if you select the parameters carefully, you can balance these two steps so that the complexity of each is smaller than 2^n, which means that the overall complexity of the algorithm is faster than 2^n, as promised.
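A standard tool for fast ANF interpolation and evaluation over GF(2) is the Möbius transform, which converts a truth table into ANF coefficients in about k * 2^k bit operations and, being an involution, also evaluates an ANF on all points. This is a generic textbook sketch, not the paper's optimized routine:

```python
def mobius_transform(vals):
    """Fast Möbius transform over GF(2). If vals is the truth table of a Boolean
    function on k bits (index i read as an assignment, bit j of i = variable
    x_{j+1}), the output is its vector of ANF coefficients. Applying the
    transform again inverts it, which evaluates an ANF on all 2^k points."""
    t = list(vals)
    half = 1
    while half < len(t):
        for block in range(0, len(t), 2 * half):
            for i in range(block, block + half):
                t[i + half] ^= t[i]  # fold the x_j = 0 half into the x_j = 1 half
        half *= 2
    return t

# Truth table of the hypothetical example f(x1, x2) = x1 + x1*x2, index = x1 + 2*x2.
truth = [0, 1, 0, 0]
anf = mobius_transform(truth)  # coefficients of the monomials 1, x1, x2, x1*x2
```

Here the transform returns [0, 1, 0, 1], i.e. f = x1 + x1*x2, and applying it once more reproduces the truth table.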
Now, polynomial method algorithms use such parity computations, this basic building block, in order to output solutions to the equation system E. One of our main contributions in the paper is a simpler and more efficient way of computing the solutions of the polynomial system from such parity computations, and this is what I'm going to describe next. Now I'm going to give an overview of the algorithm. Remember that we have our equation system E, and we've defined the polynomial F and called it the identifying polynomial of E. We've also defined the probabilistic polynomial F tilde, which is the identifying polynomial of the equation system E tilde. Previous algorithms solved E by defining and manipulating many independent probabilistic polynomials that approximate F, and this resulted in a large concrete overhead. A simple observation that we're going to use is that each solution of E also solves E tilde, because if x zeroes out all the polynomials of E, then it obviously zeroes out their linear combinations R_1 up to R_l. The algorithm is going to exploit this by iterating over all solutions of E tilde and checking whether each one solves E; in this way it finds all solutions of E. It remains to describe how to iterate over the solutions of E tilde. We're going to use a variable partition: for n1 smaller than n, we partition the variables x into two sets, the y's, which are the most significant bits, and the z's, which are the least significant bits. I'm going to set the parameters such that for each value of the y's there is, in expectation, a single value of z such that (y, z) is a solution to E tilde. If you do the calculation, then based on some randomness assumptions we need to set l to be roughly equal to n1, but I'm not going to use this in the remainder of the presentation. So assume we've set the parameters so that this property holds.
Now, the algorithm is a basic divide-and-conquer algorithm, and it works as follows. For each value of the y's, the most significant bits, we're going to fill in the (say, single) value of z such that (y, z) is a solution to E tilde, and I'm going to do this with some parity computations. Now we have a solution to E tilde, and we can test it on E; if it satisfies E, then we output the solution. So it remains to describe how to fill in this value of z for each y. Recall that U evaluated on y gives the parity of the number of solutions of the equation system E tilde restricted to the specific value of y. In fact, it turns out that if (y', z') is the only solution to E tilde restricted to y', then you can use n1 tweaked parity computations, of a very similar type to U, in order to fill in the n1 bits of z'. I'm not going to describe exactly how this is done; you can look it up in the paper. So this is a sketch of the divide-and-conquer algorithm, and let's try to sketch the complexity analysis. The complexity of iterating over all y's is 2^(n-n1). So if you want to optimize only this step, you will set n1 to be very close to n. However, this is not a very good idea, because remember that the parity computation costs (n - n1) choose deg(U), times 2^n1, and if you choose n1 to be close to n, then this term will explode. In fact, if you recall, the degree of U also depends indirectly on n1, because it depends on the degree of F tilde, which is roughly d times l, and l is close to n1. So if we set n1 too high, then the degree of U is also going to be relatively high. The conclusion is that we cannot set n1 very high, and we need to choose it carefully in order to balance the complexity of iterating over the y's against the parity computations. Finally, let me conclude the talk.
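To see this balancing act numerically, here is a back-of-the-envelope Python sketch of the two cost terms from the talk. The cost model is deliberately simplified (constants and lower-order terms are dropped, and deg(U) is taken to be d * n1), so the numbers are only indicative, not the paper's concrete analysis.

```python
from math import comb, log2

def log2_costs(n, n1, d):
    """Rough log2 costs of the two steps: parity computations cost about
    (#monomials of U) * 2^n1, and iterating over the y's costs 2^(n - n1).
    Here deg(U) ~ d * l with l ~ n1 (simplified model)."""
    deg_u = min(d * n1, n - n1)
    monomials = sum(comb(n - n1, i) for i in range(deg_u + 1))
    parity = log2(monomials) + n1
    iterate = n - n1
    return parity, iterate

# Sweep n1 for a toy quadratic instance (n = 40, d = 2). Small n1 makes the
# y-iteration dominate; large n1 blows up the parity term via deg(U) and 2^n1.
n, d = 40, 2
best_n1 = min(range(1, n), key=lambda n1: max(log2_costs(n, n1, d)))
best_cost = max(log2_costs(n, best_n1, d))
```

In this toy model the optimum lands at a fairly small n1, with a total cost noticeably below the 2^40 of brute force, which is the qualitative behavior described above.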
So I sketched a concretely efficient polynomial method algorithm for solving F2 equations. In the paper we have additional contributions. For example, we reduce the memory complexity of a naive implementation of the algorithm by an exponential factor; unfortunately, it still remains relatively high. Furthermore, we optimize the algorithm for breaking concrete cryptosystems such as Picnic. Several open problems remain. For example, can the algorithm be further optimized? In particular, can it be optimized for solving overdefined systems, in which the number of equations is much larger than the number of variables? Finally, it would be very interesting to have a fast implementation of a variant of the algorithm, maybe a variant which uses time-space tradeoffs. So that's all. Thank you very much for your attention.