Okay, good morning everyone. So, given that this is my first talk at CRYPTO in front of this distinguished audience, I thought it would be wise to start off with a slide that maybe is going to offend everyone in the room in one way or another. So, in this talk, at least to start, I want to highlight a gap in the field of cryptography, and specifically a gap between the way that the theory community views things and the way that the practitioner community views things. In the theory community, there's a relatively liberal notion of efficiency: polynomial time is the standard benchmark. And the security goals that people hope to achieve are usually quite strong; a common thing is a reduction to some hard problem, like the hardness of factoring. In the practitioner community, the algorithms tend to be much more efficient, which makes sense given that people need to use them in practice. And this often comes at the expense of more modest security goals. So, a typical thing here would be arguments showing that known types of attacks will not break a particular construction, but not something like a reduction to the hardness of factoring. Obviously, this is a very, very broad generalization. Practitioners do care about strong security against all types of attacks, and theoreticians do care about more fine-grained notions of efficiency. But there's a trade-off here, and I think it's fair to say that the two communities view it somewhat differently. So, first, I'm going to focus on constructing random-looking functions. What do I mean by that? I mean a family of functions indexed by some key k, where a random member of this family is indistinguishable from a truly random function. Okay, so both communities have an instantiation of this notion. On the theory side, these are typically called pseudo-random functions, going back to the work of Goldreich, Goldwasser, and Micali.
And in the practitioner community, the analogues are block ciphers and MACs. One thing I just want to point out at the beginning is that block ciphers tend to have a fixed input length, and it's not the case that the common modes of operation give you a pseudo-random function. So, even if you had a 128-bit block cipher that was as secure as possible, applying any of the standard modes, like CBC-MAC, there are distinguishing attacks against these. And what we want with pseudo-random functions is that the security grows with the input length n. Okay, so let's look a bit more closely at the gap with these types of constructions. There are two aspects of the gap that I want to focus on. One is the key length, and specifically its relation to the input length n. For pseudo-random functions, all known constructions based on complexity-theoretic assumptions have a key length which is at least quadratic in the input length. The most efficient pseudo-random function, due to Naor and Reingold, is based on the hardness of factoring, and indeed they have a key length which is quadratic in the input length. In contrast, typical block ciphers, including AES, the Advanced Encryption Standard, have a key length which is roughly equal to the input length n. In AES, the longest key length is twice the input length. This parameter is important because it's a lower bound on the efficiency of computing these algorithms: if your key has size n squared, you need to process the whole key, and your algorithm is going to take time n squared. The other aspect of this gap that I want to focus on is the methodology by which these are constructed. So, pseudo-random functions are usually constructed based on one-way functions or pseudo-random generators, and they often involve computationally expensive components.
And a common method for constructing block ciphers, MACs, and hash functions is what's called the substitution-permutation network, which probably most people in the audience are familiar with. I'm going to talk about it in more detail in a minute, but for now, I just want to point out that in the PRF community, this structure is very understudied. There are no known constructions of PRFs using this structure that we're aware of. So, what do we contribute? Our goal in this work is to try to bridge the gap somewhat, although I should warn that we have our feet firmly planted on the theory side of the river when trying to build this bridge. We have two main contributions. One is that we give a number of new candidate pseudo-random functions based on the substitution-permutation network structure. Our candidates are more efficient than previous candidates in a number of models. In this talk, I'm going to focus on the circuit model. This has applications to the natural proofs barrier of Razborov and Rudich; if you're familiar with that, I'm going to say more about it at the end. For our constructions, we don't have something like a reduction to the hardness of factoring, but we do extend some security proofs from the practitioner community and show that they apply to our candidates as well. The second contribution is a type of proof-of-concept theorem for the SP network structure. Specifically, we show that when you instantiate a certain component, the S-box, with a truly random function, then what you get out is a secure pseudo-random function. Of course, this is inefficient because it uses a random function, which, with high probability, is not efficiently computable. But still, this is, as I said, a nice proof-of-concept theorem showing that this structure is plausible for constructing pseudo-random functions.
And this is somewhat analogous to the work of Luby and Rackoff, who showed that with the Feistel network structure, when you instantiate a certain component with a random function, what you get out is secure. Okay, so here's the outline of the rest of the talk. I'm going to give some more details of the SP network structure. Then I'm going to give the details of our new pseudo-random function candidates, at least two of them. Then I'm going to mention something about this proof-of-concept theorem. And finally, I'm going to conclude with the connection to natural proofs. Okay, so what is an SP network? Again, this is probably a review for most people in the room. An SP network is an algorithm on n bits, and it's computed over a number of rounds, r. Each round has three steps. In the first step, the input is broken into chunks, and here I'm going to use b to denote the size of each of these chunks. Each chunk is run through what's called the S-box, the substitution box, in parallel. The substitution box is chosen to have good cryptographic properties, and I'm going to say more about that in a second. It's typically a computationally expensive operation, but we save on efficiency by only applying it to small pieces of the input at a time. The second step in each round is the linear transformation M. This is a linear operation, so it's relatively computationally cheap. The property that we need here is that it has good diffusion. Diffusion is a little bit of a vague word, but I'll say precisely what I mean by this in a minute. And then finally, the third step in each round is the XOR with the round key. This is obviously important because it's the only source of secrecy in the whole algorithm: the S-box and the linear transformation are known to everyone; the key is secret. In all of our candidates, we're going to assume uniform and independent round keys.
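The three round steps just described can be sketched in code. This is a toy sketch only: the 16-bit state, the 4-bit S-box table, and the neighbor-XOR mixing layer are illustrative placeholders I chose for concreteness, not parameters from the talk (a real design would use a linear layer with high branch number).

```python
# Toy SPN round on a 16-bit state split into four 4-bit chunks.
SBOX = [0x6, 0x4, 0xC, 0x5, 0x0, 0x7, 0x2, 0xE,
        0x1, 0xF, 0x3, 0xD, 0x8, 0xA, 0x9, 0xB]  # an illustrative 4-bit S-box

def split_chunks(state, b=4, n=16):
    """Break an n-bit state into n//b chunks of b bits each (most significant first)."""
    return [(state >> (n - b * (i + 1))) & ((1 << b) - 1) for i in range(n // b)]

def join_chunks(chunks, b=4):
    state = 0
    for c in chunks:
        state = (state << b) | c
    return state

def mix(chunks):
    """Placeholder linear layer: XOR each chunk with its cyclic right neighbor.
    A real design would use a matrix with maximal branch number."""
    k = len(chunks)
    return [chunks[i] ^ chunks[(i + 1) % k] for i in range(k)]

def spn_round(state, round_key):
    chunks = split_chunks(state)             # step 1: split into b-bit chunks...
    chunks = [SBOX[c] for c in chunks]       # ...and apply the S-box in parallel
    chunks = mix(chunks)                     # step 2: linear diffusion layer
    return join_chunks(chunks) ^ round_key   # step 3: XOR the round key
```

Note that, as in the talk, the round key is the only secret: the S-box table and the mixing layer are public.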
In practical constructions, this is not typically done; in AES, the round keys are generated via the key schedule. However, this is common in analyses of these functions, and in particular, it's believed to only make the security stronger to use independent round keys, as opposed to generating them from a key schedule. So probably the main security feature of the substitution-permutation network structure is that it is provably resistant to two types of attacks called linear and differential cryptanalysis. I don't have time to go into the details of these attacks, but fortunately, the last talk had an absolutely perfect explanation of linear cryptanalysis. I do want to explain what features of the SPN enable resistance to these kinds of attacks, so I just want to note that there are two parameters, one for each type of attack, and if you can bound these parameters to be exponentially small in n, the input size, this is what's considered provable security against these attacks. The details are here; I'll just mention that the parameter for linear cryptanalysis is the linear hull parameter from the last talk. Okay, so what do we need from our SPN in order to be able to resist these attacks? There are two main things. One is that the S-box itself should be resistant to linear and differential cryptanalysis, by which I mean that if you measure these two parameters for the S-box, they should be exponentially small in b, the size of the S-box. Fortunately, thanks to the work of Nyberg and others, we know that the function which computes inversion in the field of size 2 to the b, which can be accomplished by raising to the power 2 to the b minus 2, does have these good properties against linear and differential cryptanalysis. The second design feature that we need is that the linear transformation M has what's called maximal branch number. The branch number is a measure of the diffusion of the matrix, and it can be computed as follows.
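The inversion S-box just mentioned can be sketched at a toy size. Here I take b = 4 and represent GF(2^4) with the irreducible polynomial x^4 + x + 1; the polynomial choice is mine for illustration, since the talk only requires some representation of the field.

```python
B = 4
MOD = 0b10011  # x^4 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Carry-less multiplication in GF(2^B), reducing modulo the field polynomial."""
    res = 0
    while b:
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if a & (1 << B):
            a ^= MOD
    return res

def gf_inv(x):
    """The inversion S-box: x -> x^(2^B - 2), computed by repeated multiplication.
    This maps 0 to 0 and every nonzero x to its multiplicative inverse."""
    res = 1
    for _ in range((1 << B) - 2):
        res = gf_mul(res, x)
    return res
```

For nonzero x, x^(2^b - 1) = 1 in the multiplicative group, so x^(2^b - 2) is exactly x^(-1); a production implementation would use square-and-multiply rather than this naive loop.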
Consider any non-zero input to M, count the number of non-zero chunks in the input and the number of non-zero chunks in the output, and add them together; the minimum of this number, over all non-zero inputs, is the branch number. It's not hard to see that the highest this can be is one more than the number of chunks, and indeed there are constructions that achieve this; I'll mention more about that in a couple of slides. So the intuition here is that the S-box provides strong security against linear and differential cryptanalysis, and the diffusion properties of the matrix propagate the security to the entire state. This intuition can be made precise, as we'll see in a moment. Okay, so that was the background on SPNs; now let me get into what our new candidates are. I'm going to mention two of them. The first candidate is computable by quasi-linear size circuits, that is, circuits of size n times polylog n. And we show that this has exponential security against linear and differential cryptanalysis. I want to compare this to what I already mentioned is the most efficient pseudo-random function based on complexity-theoretic assumptions, which is the one of Naor and Reingold. Again, they prove security by reduction to the hardness of factoring, which is stronger than what we prove, but the size of the circuits computing their function is at least quadratic because of the key length. Okay, so how do we construct this pseudo-random function? Well, it's basically an SPN with a particular choice of parameters. For the S-box, again, we choose the inversion S-box that I mentioned. We compute it on b bits, where b is logarithmic in n. So n is the input length, and each S-box is computed on log n bits. Because inversion is a polynomial-time function, each S-box takes size polylog n, and there are n over log n of them, so we can compute all of the S-boxes in each round in quasi-linear size.
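The branch number definition above can be checked by brute force at toy sizes. This sketch enumerates all non-zero inputs for a map on four 4-bit chunks; the parameters and the example maps are mine for illustration, and real designs argue the bound structurally rather than by enumeration.

```python
def nonzero_chunks(chunks):
    return sum(1 for c in chunks if c != 0)

def branch_number(linear_map, num_chunks=4, b=4):
    """Minimum over non-zero inputs of
    (#non-zero input chunks) + (#non-zero output chunks)."""
    best = None
    for x in range(1, 1 << (num_chunks * b)):
        chunks = [(x >> (b * i)) & ((1 << b) - 1) for i in range(num_chunks)]
        score = nonzero_chunks(chunks) + nonzero_chunks(linear_map(chunks))
        if best is None or score < best:
            best = score
    return best
```

For example, the identity map has branch number 2 (one non-zero chunk in, the same one out), which is the worst possible; the maximum achievable here is num_chunks + 1 = 5, which is what a maximal-branch-number transformation attains.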
The linear transformation M is constructed using a Reed-Solomon code. This is a very common method of constructing maximal branch number linear transformations. In particular, if you take the generator matrix of a Reed-Solomon code in reduced echelon form and take the second half of this matrix, this gives maximum branch number. To compute this efficiently, we use the fact, noted by Roth and Seroussi in the 80s, that a matrix of this form is a Cauchy matrix, which is a particular type of structured matrix. And there's an algorithm by Gerasoulis to multiply by Cauchy matrices in quasi-linear size; we adapt the algorithm to fields of characteristic 2. So note that each layer of S-boxes is computable in quasi-linear size, and the linear transformation is as well. The number of rounds we also take to be logarithmic in n. So overall, we get a function that's computable by quasi-linear size circuits. Okay, so what security can we prove? Well, I mentioned the intuition about SPN security. There's a very nice theorem going back to the work of Kang et al. around 2000. They show that if the S-box has strong security, the linear transformation has maximum branch number, and the round keys are independent, then you get exponential security. Their work only shows this for two-round SPNs. We extend it to SPNs with r rounds and show that you still have exponential security, provided that the number of rounds does not grow too quickly compared to the size of the S-box. In particular, if the number of rounds is half the size of the S-box, then the security is 2 to the minus n over 2. Okay, and again, note that the S-box we're using does have these good linear and differential cryptanalysis bounds. So plugging the parameters into the theorem, we get that our construction has exponential security. Okay, the second PRF candidate that I want to mention is a much simpler candidate.
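A Cauchy matrix of the kind just mentioned is easy to write down directly: its entries are 1/(x_i + y_j) for pairwise-distinct field elements, where addition in characteristic 2 is XOR. This sketch builds one over GF(2^4); the field polynomial and element choices are mine for illustration, and it does not show the Reed-Solomon derivation or the fast multiplication algorithm from the talk.

```python
B = 4
MOD = 0b10011  # x^4 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    res = 0
    while b:
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if a & (1 << B):
            a ^= MOD
    return res

def gf_inv(x):
    res = 1
    for _ in range((1 << B) - 2):
        res = gf_mul(res, x)
    return res

def cauchy_matrix(xs, ys):
    """Entries 1/(x_i XOR y_j); all of xs and ys must be pairwise distinct,
    which guarantees every square submatrix is invertible."""
    assert len(set(xs + ys)) == len(xs) + len(ys)
    return [[gf_inv(x ^ y) for y in ys] for x in xs]
```

The nonsingularity of every square submatrix is the standard Cauchy-matrix property underlying the maximal branch number claim.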
So this can be viewed as a very extreme setting of the SPN, with basically one round, one S-box, and no diffusion matrix. This PRF candidate is computed in three steps, and there are two round keys, k and k prime. First, we XOR the input with the first round key, k. Then we run it through this inversion S-box, but note here that the inversion is computed on the entire state at once; it's not broken into chunks. And then finally, we take the inner product of what comes out of the S-box with the second round key and output a single bit. I want to note that for block ciphers, outputting a single bit doesn't make any sense, but for pseudo-random functions, this is still a very interesting setting. The security that we're able to prove for this is that it exponentially fools parity tests that look at up to 2 to the 0.9n outputs, and you can make that 0.9 as close to 1 as you want. I think a nice comparison with this, and in fact one of the inspirations for constructing it, is the Even-Mansour cipher. In the Even-Mansour cipher, you XOR a key k at the beginning; instead of this S-box, you have a truly random function, actually, they consider permutations; and then finally, you XOR with the second round key. So we modify this in two ways. First, we replace the random function with an S-box. And second, we replace the second XOR with an inner product. If you only replace the random function with an S-box, there's actually a very simple attack, which we note in the paper. But if you also replace the second XOR with an inner product, then we can actually show that it fools parity tests. As for the efficiency, this is also computable in quasi-linear size, because computing inversion in the field of size 2 to the n is computable in quasi-linear size. Okay.
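The three steps of this simpler candidate can be sketched at a toy size of n = 8 bits (the actual candidate works on the full n-bit input). The field polynomial below is the AES polynomial, chosen only for concreteness; the naive inversion loop stands in for the quasi-linear inversion circuit mentioned in the talk.

```python
B = 8
MOD = 0x11B  # x^8 + x^4 + x^3 + x + 1, the AES field polynomial

def gf_mul(a, b):
    res = 0
    while b:
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if a & (1 << B):
            a ^= MOD
    return res

def gf_inv(x):
    """Inversion S-box on the whole state: x -> x^(2^B - 2) in GF(2^B)."""
    res = 1
    for _ in range((1 << B) - 2):
        res = gf_mul(res, x)
    return res

def inner_product_bit(a, b):
    """GF(2) inner product of two B-bit strings: the parity of a AND b."""
    return bin(a & b).count("1") & 1

def prf(x, k, k_prime):
    y = gf_inv(x ^ k)                     # steps 1 and 2: key XOR, then inversion
    return inner_product_bit(y, k_prime)  # step 3: inner product with second key
```

Replacing Even-Mansour's final XOR with this inner product, and outputting one bit, is exactly the modification the talk describes as enabling the parity-fooling proof.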
So now I want to briefly mention the proof-of-concept theorem. Here we want to consider an SP network where the S-box is chosen uniformly at random, and you can imagine this as being chosen along with the key. This is not going to compute a permutation, but that's okay; we're interested in pseudo-random functions. Okay, so what do we show? We show that if you take an SPN with a random S-box, where the linear transformation has maximum branch number, then any polynomial-time adversary has exponentially small distinguishing advantage, where I mean exponentially small in the size of the S-box. In particular, when the size of the S-box is super-logarithmic, we get super-polynomial security. The bound here is somewhat comparable to the one of Luby and Rackoff: they also show an exponentially small distinguishing advantage in the size of the random function, with a polynomial loss in the number of queries. We exploit the SPN structure to bound collision probabilities in order to prove this theorem. I don't have time to go through the proof, but I just want to give you some intuition. We want to bound the collision probabilities in the final round of S-boxes; here I'm only showing two rounds. In order to do this, we need that the linear transformation M has all entries not equal to zero, and is invertible. In the theorem statement I mentioned that we can use maximum branch number transformations, and in fact, any maximum branch number transformation does have these two properties. So for any set of fixed queries, we can show that the probability that there are any collisions in the final round of S-boxes is small, and clearly, because the S-box is chosen at random, if there are no collisions, then the output is uniform. There's some more work in the theorem to extend to adaptive queries, but this is sort of the central piece. Okay, so now I want to just briefly turn to natural proofs.
So if you've never seen natural proofs before, this is one of those barrier results that attempts to explain our difficulty in proving lower bounds against circuit classes. Razborov and Rudich observed that most lower bound proofs against any circuit class, for example, the proof that AC0 cannot compute parity, do more than show that a specific function is not in that circuit class. They actually give a way to distinguish truth tables of functions from this circuit class from truly random truth tables. Most lower bound proofs, at least up to the point of their work, have this property. The implication is that if a circuit class can compute a pseudo-random function, then it's going to be hard to prove lower bounds against it, or at least it's going to require new techniques. The reason is that distinguishing a truth table from the circuit class from a truly random truth table is exactly a distinguishing attack on a pseudo-random function. So this was a beautiful observation and gave some explanation for why we can't prove some circuit lower bounds. However, there's still a gap here, because the best pseudo-random function that we have has size at least quadratic, but the best lower bound we have is for linear-size circuits. So in particular, this observation does not give an explanation for why we cannot prove super-linear lower bounds. The efficiency of our PRF constructions, if you assume that they're exponentially secure, narrows this gap in a variety of models. In particular, for Boolean circuits of quasi-linear size, we have a candidate computable in that model. We also have a candidate computable by TC0 circuits, that is, constant-depth circuits with majority gates, of size n to the 1 plus epsilon for any epsilon. In particular, this is related to a plan of attack by Allender and Koucký for proving TC0 lower bounds.
And we also show that a very natural generalization of AES is computable in quadratic time on single-tape Turing machines, and quadratic time is the best lower bound that we have for single-tape Turing machines. Okay, so in conclusion, the main thing I hope people take away from this talk is that the SP network structure seems extremely under-explored for constructing pseudo-random functions. In particular, we're not aware of any work that tried to do this before. It lends itself very nicely to being computed by efficient circuits, and also the hardness seems to stem from more combinatorial considerations, as opposed to algebraic ones, which are standard in PRF constructions. As for future directions, a great question is: what's the simplest, most efficient, plausible PRF candidate you can construct, and can SPNs be used to give this? And also, it would be nice to analyze some of our candidates against more types of attacks than just linear and differential cryptanalysis. Okay, thank you very much.