Welcome to this presentation about the cryptanalysis of the Legendre PRF and some of its generalizations. The Legendre PRF is a pseudo-random function based on the Legendre symbol, which is a common function in number theory. The Legendre symbol of an element of a finite field is 0 if the element is 0, 1 if the element is a non-zero square, and minus 1 otherwise. In the early 1900s, a number of so-called equidistribution results about Legendre symbols were shown. In particular, you can show that if you take the string of Legendre symbols of the integers 1, 2 up to p minus 1, then every fixed pattern of 1s and minus 1s of a given length occurs roughly equally often.
At Crypto 1990, Damgård proposed to use the sequence of Legendre symbols of a key k, k plus 1 and so on as a pseudo-random generator. More recently, at CCS 2016, Grassi et al. extended this to a pseudo-random function by defining this function as the Legendre symbol of the input plus a key k with respect to a fixed prime p. This is a very MPC-friendly function because of the multiplicativity property of the Legendre symbol, which allows for easy masking. Because of this, the Legendre PRF might be interesting for a number of applications. For example, it's possible to build Picnic-type signature schemes from this PRF, as in the LegRoast signature scheme.
Also, Ethereum 2.0 wanted to use this PRF in a proof-of-custody mechanism, and this actually provides the indirect motivation for our work. Because of its potential use case in this proof-of-custody mechanism, the Ethereum Foundation decided to start a number of challenges related to the cryptanalysis of the Legendre PRF. But before the start of this challenge, Khovratovich gave a first birthday-bound attack on the Legendre PRF, which runs in time p over m when you're given m queries. In the challenges, the PRF outputs for 2 to the 20 consecutive inputs are given, and the prime sizes range from 64 bits to 148 bits.
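As a minimal sketch of the definitions given earlier in this part of the talk, the Legendre symbol can be computed with Euler's criterion, and the PRF is then the symbol of the key plus the input. The function names `legendre_symbol` and `legendre_prf` are illustrative choices, not names from the talk or from the authors' implementation.

```python
def legendre_symbol(a: int, p: int) -> int:
    """Legendre symbol of a modulo an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    # a^((p-1)/2) is 1 for non-zero squares and p-1 for non-squares.
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def legendre_prf(k: int, x: int, p: int) -> int:
    """Legendre PRF with key k at input x: the Legendre symbol of k + x."""
    return legendre_symbol(k + x, p)
```

The multiplicativity that makes the function MPC-friendly can be checked directly on a small prime, for example by comparing `legendre_symbol(a * b, p)` with `legendre_symbol(a, p) * legendre_symbol(b, p)` for all non-zero a and b.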
The security level that's shown here on the slide is already modified for the attacks that we give. In particular, we give a p over m squared attack, when m is not too large, and this allowed us to solve the 64 and 74-bit challenges. In fact, shortly after our results appeared on ePrint, the 84-bit challenge was also solved thanks to a log p improvement made by Kaluđerović et al.
So I'll start by explaining the basic birthday-bound attack of Khovratovich. For this, I'll introduce some notation: let's denote by square brackets of m the sequence of integers from 0 up to m minus 1. Then by some abuse of notation, the Legendre PRF of a sequence x, x plus 1 up to x plus m minus 1 gives a sequence of Legendre PRF outputs. The basic observation that's used in this birthday-bound attack is the fact that the Legendre PRF with key k at an input x is the same as the Legendre PRF with key 0 at the input k plus x.
So the attack proceeds as follows. We make m queries to the Legendre PRF, and then extract from this roughly m sequences of Legendre symbols. We store those sequences in a table indexed by the offset a, and then in the second step, we sample Legendre sequences with a random offset c until we find a collision with something in the table. Now if our sequences are long enough, then with high probability the arguments to the Legendre symbols must be the same, which means that the offset c must be equal to k plus a. This attack has a cost of m plus p over m operations, and it uses about m memory, but it's also possible to make this into a memoryless attack by using memoryless collision search techniques.
Our improvement over this attack is based on the multiplicativity of the Legendre symbol. By this I mean that the Legendre symbol of a times b is equal to the Legendre symbol of a times the Legendre symbol of b.
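The table-based birthday-bound attack described above can be sketched on a toy instance. This is only an illustration under stated assumptions: the prime 10007, the key 1234, the window length 20 and all function names are made up for the example, and a final check against all known outputs is added to discard accidental collisions.

```python
import random


def legendre_symbol(a, p):
    """Legendre symbol of a modulo an odd prime p (Euler's criterion)."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def symbol_sequence(offset, length, p):
    """Legendre symbols of offset, offset + 1, ..., offset + length - 1."""
    return tuple(legendre_symbol(offset + i, p) for i in range(length))


def birthday_attack(outputs, p, length, rng):
    """Recover k from outputs[x] = legendre_symbol(k + x, p), x = 0, 1, ..."""
    m = len(outputs) - length + 1
    # Step 1: store every window of consecutive outputs, indexed by offset a.
    table = {tuple(outputs[a:a + length]): a for a in range(m)}
    # Step 2: sample key-0 sequences at random offsets c until one collides.
    while True:
        c = rng.randrange(p)
        a = table.get(symbol_sequence(c, length, p))
        if a is None:
            continue
        k = (c - a) % p  # a genuine collision means c = k + a
        if all(legendre_symbol(k + x, p) == y for x, y in enumerate(outputs)):
            return k


# Toy parameters (assumptions for illustration only).
p, secret = 10007, 1234
outputs = [legendre_symbol(secret + x, p) for x in range(84)]
recovered = birthday_attack(outputs, p, 20, random.Random(1))
```

With m windows in the table, the expected number of random offsets tried is about p over m, matching the cost stated in the talk.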
And this allows us to rewrite the sequence of Legendre symbols with offset a and step size b in terms of a sequence of Legendre symbols with a step size equal to 1. The attack proceeds in essentially the same way as before. We query m values of the Legendre PRF, and then because we now have parameters a and b, we can extract nearly m squared sequences from this. And then in the second step of the attack, we sample random sequences until we find a collision with one of the sequences stored in the table. If the sequences are long enough, then any such collision with high probability gives us an equality of the arguments. So this means that c must be equal to k plus a, divided by b, and this allows us to recover the key k. The cost of this approach is m squared memory, so more than before, but the time complexity goes down to m squared plus p over m squared, which is indeed lower, especially when the amount of data that's available is not very high, which is a realistic scenario.
A number of optimizations to this attack can be made. One is that we can use consecutive samples in the offline phase. This means that we first compute a sequence of Legendre symbols of length w and then extract nearly w squared sequences from it. This would work very well, but there is one issue with this, which is that the sequences in the table are also extracted in a similar way, so they're not random. And because of this, we'll actually need slightly more queries in the second phase before we find the correct key. However, we still have a number of advantages. We can amortize away the cost of computing the Legendre symbol so that the cost is dominated by the sequence extraction and table lookups. In addition, we can store only the sequences where a is less than b, leading to reduced memory requirements. So in the end, we get a cost which no longer depends on the cost of evaluating the Legendre symbols and uses slightly less memory than the attack I presented on the previous slide.
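The multiplicativity-based attack from the start of this part, without the consecutive-sample optimizations, can be sketched as follows. It relies on the identity that the symbol of k + a + i·b equals the symbol of b times the symbol of (k + a)/b + i. The toy prime, key, window length and function names are assumptions for illustration and are not taken from the authors' implementation.

```python
import random


def legendre_symbol(a, p):
    """Legendre symbol of a modulo an odd prime p (Euler's criterion)."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def symbol_sequence(offset, length, p):
    """Legendre symbols of offset, offset + 1, ..., offset + length - 1."""
    return tuple(legendre_symbol(offset + i, p) for i in range(length))


def build_table(outputs, p, length):
    """Map normalized step-b subsequences of the outputs to their (a, b)."""
    m = len(outputs)
    table = {}
    for b in range(1, (m - 1) // (length - 1) + 1):
        sign_b = legendre_symbol(b, p)
        for a in range(m - (length - 1) * b):
            # Multiplying by the symbol of b turns the step-b sequence at
            # offset k + a into the step-1 sequence at offset (k + a) / b.
            seq = tuple(sign_b * outputs[a + i * b] for i in range(length))
            table[seq] = (a, b)
    return table


def improved_attack(outputs, p, length, rng):
    """Recover k from outputs[x] = legendre_symbol(k + x, p), x = 0, 1, ..."""
    table = build_table(outputs, p, length)
    while True:
        c = rng.randrange(p)
        hit = table.get(symbol_sequence(c, length, p))
        if hit is None:
            continue
        a, b = hit
        k = (c * b - a) % p  # a genuine collision means c = (k + a) / b
        if all(legendre_symbol(k + x, p) == y for x, y in enumerate(outputs)):
            return k


# Toy parameters (assumptions for illustration only).
p, secret = 10007, 1234
outputs = [legendre_symbol(secret + x, p) for x in range(200)]
recovered = improved_attack(outputs, p, 20, random.Random(1))
```

In this toy setting, 200 outputs yield 955 table entries instead of the 181 windows the step-1 attack would store, so fewer random offsets are needed before a collision appears.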
Of course, we also implemented this attack since we wanted to solve those two challenges. So here we are given 2 to the 20 consecutive PRF outputs, and using this attack we broke the 64 and 74-bit challenges. The first row of this table is not that interesting, because it doesn't correspond to an actual challenge, just a test case. The full implementation can be found at the URL shown at the bottom of this slide.
In the remainder of this presentation, I want to discuss three generalizations of the Legendre PRF, which were previously proposed, but have not received a lot of cryptanalysis yet. The first of these is the higher degree Legendre PRF, which was first analyzed by Khovratovich. I also want to discuss the Jacobi and power residue symbol PRFs, which are PRF versions of the pseudo-random generators proposed by Damgård at Crypto 1990.
The ordinary Legendre PRF can be thought of as a degree 1 Legendre PRF. This is because we are taking the Legendre symbol of a polynomial of degree 1 in the input. One possible generalization is to replace this by taking the Legendre symbol of a polynomial of degree d in the input. The coefficients of this polynomial then form the key. So we now have a much longer key, and so it's natural to expect a higher security level. The initial cryptanalysis seems to suggest that this is indeed the case. In particular, it's possible to use an attack similar to the basic approach by Khovratovich to obtain an attack in p to the d minus 1 time, given p queries. Our trick of using multiplicativity also carries over to this setting and allows us to get an attack of p to the d minus 2 time, again using p queries. In fact, Kaluđerović et al. improved this to a p to the d minus 3 attack. But we also give in this paper a number of weak-key attacks, which I'll present on the next slide, and which are probably more relevant for practical cases.
One possible use case for the higher degree Legendre PRF is to use a polynomial which factors completely over the base field into degree 1 polynomials. This basically amounts to taking d Legendre PRFs with a possibly smaller modulus and then multiplying their outputs together in the hopes of achieving higher security. And this is possible, but the security that's obtained is not as good as one would expect based on the previous slide. In particular, as I'll show on the next slide, there's a birthday-bound attack on this construction.
In general, any key for which the corresponding polynomial is reducible could be called a weak key, but the degree to which it is weak depends on the structure of the factorization. In the worst case, we have two factors of equal degree. And in that case, there's a collision attack that we can apply. So here we simply need to do roughly log p queries to the PRF, and then we can find a collision between two sequences, one of which depends on k1 and the other on k2. And if they collide, then with high probability we know what k1 and k2 are. As shown in this figure, the complexity of the attack ranges from p to the d over 2, in the worst case, to p to the d, depending on which fraction of keys can be attacked.
The next variant of the Legendre symbol PRF that I want to discuss is the Jacobi symbol PRF. In this case, we use a composite modulus. For this, we need Jacobi symbols, which are simply defined as the product of the Legendre symbols corresponding to the prime factors of the modulus. However, this generalization turns out to be not very useful, because we can simply make queries for inputs which are a multiple of p, and then this corresponds to making queries to the Legendre PRF with a different key, but modulo the prime q. Using this, we can attack the Legendre PRF to obtain the value of the key modulo q, and then we can do the same thing with p and q swapped to obtain the value of the key k modulo p.
Finally, we simply apply the Chinese remainder theorem in order to obtain k mod p times q.
Finally, I want to discuss the power residue PRF. In this case, we have a prime p, so that some integer r divides p minus 1. We can then define the r-th power residue symbol of an element x of the field F_p to be x to the power of p minus 1 over r. If x is a non-zero r-th power, this evaluates to 1, and otherwise it evaluates to some other r-th root of unity. So this means that we can extract log r bits from each power residue symbol. And that's potentially useful; for instance, it's used in the PorcRoast signature schemes, which are a variant of the LegRoast scheme that I mentioned earlier. The basic attack that we gave on the Legendre PRF of course generalizes, because it's based only on the multiplicativity of the Legendre symbol. However, the cost of this attack does not depend on the value of r, or at least it depends only logarithmically on it. So we would like to show that for large r, the security of this construction also degrades. In the paper, we give an attack which runs in p over m times r time and uses m memory. However, we are not able to find an attack which runs in something like p over m squared times r time.
To conclude, in this paper we give improved attacks on the Legendre PRF which are especially relevant in the low data setting. And this allowed us to break the concrete challenges proposed by the Ethereum Foundation for the 64 and 74-bit primes. We also give improved attacks on higher degree variants of the Legendre PRF. And I think in particular the takeaway here is that there exists a large number of weak keys corresponding to reducible polynomials. Finally, we also evaluate the security of two other variants of the Legendre PRF which originate in the work of Damgård. The first of these is the Jacobi symbol PRF, which turned out to be no more secure than the weakest Legendre PRF corresponding to one of the prime factors of the modulus.
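The reduction of the Jacobi symbol PRF to a Legendre PRF modulo one prime factor can be checked numerically. The sketch below assumes a modulus with two prime factors p and q; the toy primes 10007 and 10009, the key and the helper names are illustrative assumptions. It verifies that queries at multiples of p behave, up to a fixed sign, like a Legendre PRF modulo q with key k divided by p, and that the two residues of k are combined with the Chinese remainder theorem. It requires Python 3.8 or later for modular inverses via `pow`.

```python
def legendre_symbol(a, p):
    """Legendre symbol of a modulo an odd prime p (Euler's criterion)."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def jacobi_prf(k, x, p, q):
    """Jacobi symbol of k + x modulo p*q, as a product of Legendre symbols."""
    return legendre_symbol(k + x, p) * legendre_symbol(k + x, q)


def reduced_key_and_sign(k, p, q):
    """Key and fixed sign of the Legendre PRF mod q seen at inputs p*t."""
    # k + p*t is congruent to k mod p, so the mod-p factor is a constant.
    # Modulo q, k + p*t = p * (k/p + t), and the symbol is multiplicative.
    key_q = k * pow(p, -1, q) % q
    sign = legendre_symbol(k, p) * legendre_symbol(p, q)
    return key_q, sign


def crt(k_mod_p, k_mod_q, p, q):
    """Combine k mod p and k mod q into k mod p*q."""
    return (k_mod_p + p * ((k_mod_q - k_mod_p) * pow(p, -1, q) % q)) % (p * q)


# Toy parameters (assumptions for illustration only).
p, q, k = 10007, 10009, 12345678
key_q, sign = reduced_key_and_sign(k, p, q)
matches = all(
    jacobi_prf(k, p * t, p, q) == sign * legendre_symbol(key_q + t, q)
    for t in range(200)
)
```

In an actual attack the sign and `key_q` are unknown. One would run the Legendre PRF attack modulo q for both possible signs, multiply the recovered key by p to get k modulo q, repeat with p and q swapped, and then call `crt`.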
The second is the power residue symbol PRF which could be used to extract more output bits. But we also show that the security does degrade when the output size increases. Our implementation of the attack on the Legendre PRF is available online at the URL shown on this slide.