Hi, good morning. Thanks for being here so late in the conference. This is joint work with Wei from UCSD and Tung from Florida State, and this talk is about indistinguishability. The starting point is the fact that proving indistinguishability is a central step in assessing the security of symmetric cryptography. Very often we end up with fairly sophisticated proofs made of complex and error-prone probabilistic arguments, and for this reason several frameworks have been developed over the years to ease this task. Still, a number of very hard technical questions remain open, and for them we have no solutions. So the main purpose of this work was to come up with new techniques that enlarge the set of results for which we can prove indistinguishability.

In particular, we have two contributions. The first is a new framework for indistinguishability proofs, which we call the chi-squared method. We then apply it to a number of constructions, for which we obtain not only simpler proofs but often tighter bounds. For time reasons, in this talk I'm going to focus on one of these constructions, the XOR of random permutations. It's a very simple construction that has eluded a simple analysis for almost two decades, and hence it's a great test-bed for our framework. So I want to start this talk by giving you an overview of this construction and the problem it solves, before moving to our actual contributions.

The problem we are considering is that of building good pseudorandom functions, or PRFs. Remember that this object is just a function which additionally takes a secret key, and if this key is chosen randomly, no adversary, usually called a distinguisher, can tell apart an interaction with the keyed function from an interaction with a truly random function, which returns uniform random outputs. This indistinguishability is formalized by having the distinguisher output a decision bit; we then capture its distinguishing advantage, or PRF advantage, as the difference between the probabilities that the distinguisher outputs one in each of the two experiments. Generally we are interested in the best possible such advantage given a certain distinguisher running time t and a budget of q queries, and of course, for f to be a good PRF, we would like this to be as small as possible for t and q as large as possible.

A related notion is that of a pseudorandom permutation, or PRP, and it is better tailored to block ciphers like AES. Here the context is that our keyed function is a permutation for each value of the key: it is one-to-one and has no collisions in the outputs. So here it is fair to only require indistinguishability from a random permutation, which returns outputs that are random but distinct. Of course one can formalize this again in terms of a corresponding PRP distinguishing advantage.

The important point here is that, from a practical standpoint, what we want are good PRFs, because these are great tools that enable a lot of applications: encryption, authentication, and so on. But what we typically have are good PRPs, in the form of block ciphers like AES, which are assumed to satisfy this property in a strong sense. And the catch is that a block cipher can never be a good PRF once the number of queries exceeds the square root of the domain size.
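To make the birthday-bound point concrete, here is a minimal sketch of the collision-counting distinguisher; the code and the toy block size n = 16 are my own illustration, not the speaker's.

```python
import random

def birthday_distinguisher(oracle, q):
    """Output 1 ("random function") iff two distinct queries collide.

    A permutation never collides, so once q goes past roughly the
    square root of the domain size, a random function almost surely
    does collide, and this decision bit separates the two worlds.
    """
    seen = set()
    for x in range(q):  # q distinct, non-adaptive queries suffice
        y = oracle(x)
        if y in seen:
            return 1
        seen.add(y)
    return 0

n = 16                      # toy block size; AES would have n = 128
N = 2 ** n
perm = list(range(N))
random.shuffle(perm)        # a uniform random permutation of {0, ..., N-1}
func = [random.randrange(N) for _ in range(N)]  # a uniform random function

q = 4 * int(N ** 0.5)       # a bit past the birthday threshold 2^(n/2)
print(birthday_distinguisher(lambda x: perm[x], q))  # always 0
print(birthday_distinguisher(lambda x: func[x], q))  # 1 with high probability
```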
And this is just the birthday bound: beyond that number of queries you will notice the lack of collisions, and hence you can distinguish from a random function. So it is entirely conceivable to have regimes of adversarial complexity where AES is a good PRP and yet a completely insecure PRF.

So a question that was asked for the first time by two works very close in time, by Bellare, Krovetz and Rogaway and by Hall, Wagner, Kelsey and Schneier, both in 1998, is whether one can find transformations that turn PRPs into PRFs while preserving security. A very neat construction, first suggested by Bellare et al. in their paper without a proof, is the XOR of PRPs. The idea is that we obtain a PRF which depends on two block-cipher keys: to evaluate it on input x, we evaluate the block cipher under both keys on the same input x and XOR the outputs. I will refer to this construction in the following as XOR2, because of its two-key nature. You can easily obtain a one-key version of this construction by losing one bit of input length, prepending a zero or a one to the input, invoking the block cipher on these two inputs, and XORing the outputs; we call this one the XOR construction.

These constructions are far from being of purely theoretical interest: there are practical schemes meant to achieve beyond-birthday security that rely on them. In fact, very recently Iwata and Seurin have proposed a modification of the GCM-SIV authenticated encryption scheme that relies on the construction for more secure key derivation from nonces. So it is important, for this reason, to have good bounds on the PRF security of the XOR2 and XOR constructions.

The way we approach proving such a bound on PRF security is typically by transitioning to a simpler-to-handle intermediate world, where we replace the block-cipher instances with independent, randomly chosen permutations. It is easy to bound the distance between the two left worlds just by the PRP advantage of the underlying block cipher, which is usually very small. The problem is then to find a bound for distinguishing the two right-hand-side worlds; combining the two bounds, we get a bound for PRF security. So we typically focus on upper bounding this term, which is the really hard problem, where we have an idealized version of the XOR2 construction, or respectively the XOR construction. Giving such bounds is a purely information-theoretic problem: we don't really know how to exploit computational bounds on the power of the distinguisher, and we typically only exploit the number of queries it makes.

There has been work, technically very involved work, on analyzing this quantity for these constructions. The first such result was by Bellare and Impagliazzo, in an unpublished manuscript from 1999, which gave a bound that essentially implies security up to almost 2^n queries. Unfortunately, that manuscript has some minor errors, which by now we know how to fix, and it remains unpublished. Lucks, one year later, actually published a bound, which is inferior: for the two-key version of the construction it only guarantees security up to 2^(2n/3) queries. It was only in 2008 and 2010 that Patarin published two analyses of the two variants of the construction, which essentially give optimal, or near-optimal, security.
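For concreteness, here is a minimal sketch of the two constructions just described, written directly in the idealized world of the proof, with random permutations standing in for the keyed block cipher; the rendering is mine, not the speaker's.

```python
import random

n = 16          # toy block size in bits
N = 2 ** n

def random_permutation():
    table = list(range(N))
    random.shuffle(table)
    return table

# Two-key version XOR2: evaluate two independent permutations on the
# same input and XOR the outputs.
pi1, pi2 = random_permutation(), random_permutation()

def xor2(x):
    return pi1[x] ^ pi2[x]

# One-key version XOR: lose one bit of input length, prepend a 0 or a 1
# (here: the top bit of the index), and XOR the two outputs of a single
# permutation.
pi = random_permutation()

def xor1(x):
    # Inputs are (n-1)-bit strings. The two permutation inputs differ,
    # so the two outputs differ, and this construction can never return
    # 0; this resurfaces later when the single-key bound is discussed.
    assert 0 <= x < N // 2
    return pi[x] ^ pi[x | (N // 2)]

print(xor2(3), xor1(3))
```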
In this work we give very similar bounds to Patarin's. The point, however, and why this is important, is that Patarin's bounds follow by applying a very heavy hammer, which Bart is going to talk about later on, called mirror theory, and the resulting proofs are very involved: they exceed 50 pages in length, and I think at least the second one is probably not yet published and not fully verified today. In contrast, our proof follows as a much simpler application of our chi-squared method. While for the one-key version we pretty much match Patarin's bound, which is tight, getting only a slightly weaker one, for the two-key version we get a much superior bound, with an exponent higher than one, which of course gives a much smaller advantage in query regimes below 2^n.

I now want to give you an overview of how the chi-squared method works and what it does. To do so, we have to look at the general problem of a distinguisher that attempts to tell apart two interactive objects, or systems, F and G by making queries to them, and we want to upper bound its distinguishing advantage in doing so. It is convenient to look more closely at what an interaction between the distinguisher and a system looks like. The distinguisher proceeds by making queries and receiving corresponding answers from the system F, making a certain bounded number of them, say q, before outputting its decision bit. A neat thing is that for information-theoretic analyses we can usually assume the distinguisher is, without loss of generality, deterministic, meaning it makes no random choices. This allows us to describe the interaction uniquely by just the answers to the queries: if the distinguisher is deterministic, the queries themselves can be uniquely reconstructed from the answers. So I am going to denote the sequence of outputs obtained from the system F by the distinguisher D, which we now fix, as the transcript Y^F. Of course you can do the same for the system G, and you obtain a corresponding transcript Y^G.

Now, here is the point of looking at transcripts. What we know from previous works already, a simple observation, is that if you want to upper bound the distinguishing advantage of D, you can do so by the statistical distance between the transcripts; in fact, this bound is tight if the distinguisher chooses its bit optimally. Here the statistical distance is just one half the L1 distance between the probability distributions, and this observation has been exploited in numerous frameworks for indistinguishability proofs, in particular in the H-coefficient method.

Here, however, we focus on a specific class of proofs, which I call next-output indistinguishability proofs. What that means is that we consider settings where the distinguisher has performed a partial interaction, say with the system F, and has observed the first i-1 outputs; in the picture, i-1 equals three. Given this observation, there is a well-defined probability distribution for the i-th output, the answer to the i-th query. We want to compare this next-output probability distribution with the one we would get if we were actually interacting with the system G and had seen the same partial sequence of outputs so far. And by comparing, we will typically try, for example, to understand and upper bound the statistical distance between these next-output distributions.
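In symbols (my notation, not necessarily the slides'): for a fixed deterministic distinguisher D making q queries,

```latex
\mathrm{Adv}(D)\ \le\ \mathrm{SD}\bigl(Y^F, Y^G\bigr)
  \;=\; \tfrac{1}{2}\sum_{y^q}\Bigl|\Pr[Y^F = y^q] - \Pr[Y^G = y^q]\Bigr|,
```

and the next-output distributions being compared are

```latex
p_F\bigl(y \mid y^{i-1}\bigr) = \Pr\bigl[Y^F_i = y \,\big|\, (Y^F_1,\dots,Y^F_{i-1}) = y^{i-1}\bigr],
\qquad
p_G\bigl(y \mid y^{i-1}\bigr) = \Pr\bigl[Y^G_i = y \,\big|\, (Y^G_1,\dots,Y^G_{i-1}) = y^{i-1}\bigr].
```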
If you can do that, then every cryptographer will probably almost instantaneously try to apply a hybrid argument, to upper bound the overall statistical distance by summing up these next-output statistical distances. This is a more probabilistic version of how you would state a hybrid argument, and you have to be careful: these next-output distances depend on the sequence of previous outputs, so they are random variables, and therefore you need to take an expectation when summing them up. But it's really the usual hybrid argument, and one of the problems with it is that these hybrid arguments are known not to be tight in general.

For this reason, we suggest a different approach, inspired by similar phenomena that have been observed in statistics: instead of upper bounding the next-output statistical distances, we upper bound the chi-squared divergence of the next-output distributions. You don't need to understand chi-squared divergence in detail, beyond the fact that it is some sort of weighted version of the L2 distance, asymmetric because the weights depend on one of the two probability distributions. The key point is that when you do this, we can give an alternative version of the hybrid argument, which is what we refer to as the chi-squared method, and which essentially upper bounds the statistical distance by the square root of the sum, over the number of queries, of the expected chi-squared divergences. So it's very similar, but you have a square root.

Now, of course, I might be cheating you here. This is just a different formula; you don't have any sense yet of how statistical distance relates to chi-squared divergence. Are we getting anything new? I want to give you one abstract example before turning to a more concrete one. Assume, for example, that we can show that the next-output probability distributions are very close in a pointwise sense: for every y and for every partial sequence of outputs, the ratio between the probability that the next output is y in F and the probability that the next output is y in G is very close to one, within epsilon. Now, if we do the standard hybrid argument and compute the statistical distance between the next-output distributions given only this information (you have to trust me, it's a very simple calculation), all you get is an upper bound of epsilon over two. Put this into the hybrid argument, and you get something that grows linearly in the number of queries once you fix epsilon. If you instead apply the chi-squared method and compute the chi-squared divergence given the same information (again a little calculation you have to trust me on), you get epsilon squared as an upper bound, which is much smaller; applying the refined hybrid argument, you now get something that grows with the square root of q rather than q. As q gets large, this is a huge improvement.

Now I want to show you more concretely how the chi-squared method applies to analyzing the XOR construction. In particular, I want to look at the XOR2 construction, which again is easier to analyze; in the paper we also have the analysis for the single-key version, as I will mention shortly. What this means is that we now consider our real system F to be the construction instantiated with two independent random permutations, whereas our system G is the truly random function.
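Written out in the same notation, the two hybrid arguments contrasted above are as follows; the exact constant in the second bound is as I recall it from the paper, so treat it as something to check against the e-print.

```latex
% Standard hybrid argument (not tight in general):
\mathrm{SD}\bigl(Y^F, Y^G\bigr)\ \le\ \sum_{i=1}^{q}
  \mathbb{E}\Bigl[\mathrm{SD}\bigl(p_F(\cdot \mid Y^{i-1}),\, p_G(\cdot \mid Y^{i-1})\bigr)\Bigr].

% Chi-squared method:
\mathrm{SD}\bigl(Y^F, Y^G\bigr)\ \le\
  \Bigl(\tfrac{1}{2}\sum_{i=1}^{q}\mathbb{E}\bigl[\chi^2(Y^{i-1})\bigr]\Bigr)^{1/2},
\quad\text{where}\quad
\chi^2\bigl(y^{i-1}\bigr) \;=\; \sum_{y}
  \frac{\bigl(p_F(y \mid y^{i-1}) - p_G(y \mid y^{i-1})\bigr)^2}{p_G(y \mid y^{i-1})}.
```

In the abstract example above, pointwise ratios within 1 plus or minus epsilon give a per-step statistical distance of at most epsilon/2, hence a bound of q*epsilon/2 from the first inequality, but a per-step chi-squared divergence of at most epsilon^2, hence a bound of epsilon*sqrt(q/2) from the second.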
Now, if we want to apply the method, what we need to do is try to understand these next-output probability distributions. So we imagine we run the execution for a while (there is some fixed distinguisher implicit here) making some queries; it has made i-1 queries, and we have obtained the first i-1 outputs, this boldface y^{i-1} vector. Now we want to understand the probability that the answer to the next query, the i-th query (here again i equals four), equals a given value y. Of course this is a complicated distribution that we will need to understand, but the distribution is easy in the case where we are actually interacting with the random function: there the probability is uniform, so it's 1/2^n. This at least allows us to write down the formula for the chi-squared divergence we are going to upper bound, for every i, and our goal is to bound its expectation. Here we can nicely exploit the linearity of expectation, and the problem becomes equivalent to upper bounding the individual summands in the sum over all outputs y: we can fix an output y, any possible string, and upper bound that term for this y.

An important point, and I want to stress it again, is that this probability is quite hard to understand, and it's the core of the proof. But a neat observation, already pointed out in the work by Bellare and Impagliazzo, is that it's easier to deal with this probability if, rather than conditioning on the sequence of i-1 outputs, which implies a lot of different things, we instead think of conditioning on the internal values that have been output by the two permutations, call them u_1 through u_{i-1} and v_1 through v_{i-1}. If you condition on those, it turns out to be much easier to describe the probability of reaching a particular output. It's not clear you can do that, in fact, but a simple application of Jensen's inequality tells you that, when computing the expectation of these summands, we can pass to this conditioning without making the term smaller, and we will still be able to upper bound the right-hand side by something small. In fact, there is a minor error here, pointed out by Nandi and his student: in the version in the proceedings we claim that these quantities are equal, and they are not, but the inequality can easily be proved with Jensen's inequality, and we are updating this in the e-print.

So we are left with the problem of upper bounding, and really understanding, this probability in order to bound the term. It turns out that at this point this is not very hard. All we need to do is fix some values, namely the permutation outputs so far, and understand the probability that the answer to the i-th query takes the value y. To do this, it's really enough to count how many possible values the i-th output of the first permutation, call it u_i, can take which are consistent with the output still possibly being this fixed y (remember, we fixed y). And we can do that. In particular, if you look at the space of all 2^n possible values, u_i can definitely not be equal to any of u_1 through u_{i-1}, because it's a permutation, so the i-th output of the permutation needs to be distinct from the previous ones. You also have to be careful that there are some other values it cannot take: the value v_i with u_i XOR v_i = y is uniquely determined, and you need to make sure this is also consistent with the fact that pi_2 is a permutation. So there is a second set of values that u_i cannot take.
Okay, because otherwise you will never reach y, guaranteed. And so these are all of them: the two sets are the values you can't have, and everything else is fine. If we denote by D_{i,y} the size of the intersection of these two sets, then it turns out you can use inclusion-exclusion and give a very simple formula, which you don't need to parse (a reconstruction is displayed below), for the probability that the value y is taken, conditioned on the internal outputs of the permutations.

So the key point here is that, you have to trust me, these are all simple calculations. If you plug this in, then through some simple algebraic manipulations, some inequalities, and some working around them, you can upper bound the statistical distance by an expression which is the square root of a double sum of variances of these random variables D_{i,y} that I just defined. Remember that these are nothing but random variables obtained by sampling, twice independently, i-1 n-bit strings without replacement, then shifting one of the two sets by y and looking at the size of the intersection. In fact, it's not hard to see that y does not even matter once you do this, and you can just set y to zero. Again, this is not a hard calculation; you have to work a little bit, but it's something you could ask in a probability class. You upper bound the variance, plug it in, and you get the desired bound. The key point is that, although I omitted a lot of calculations, I really want to stress that the actual steps of the proof (I haven't cheated) are exactly what I wrote here, plus some calculations that require some pain but can be done pretty easily. If you compare this with the existing proofs, that's a massive improvement.

Now, going towards the conclusions, there are more results we present in the paper. As promised, we analyze the single-key XOR construction. Here the analysis is more painful, in particular because you don't have two independent permutations, so there is a higher degree of dependence, and this makes the probability calculations somewhat more painful; but the high-level structure of the proof is very similar to what I have just shown you. The bound is weaker, and the main reason for this is actually not our technique, but simply the fact that this construction can never output zero, which makes a bound of the q/2^n type unavoidable.

Another construction we analyze in our work is the encrypted Davies-Meyer construction, proposed by Cogliati and Seurin here at CRYPTO last year. They proved a bound giving security up to 2^(2n/3); we improve this to 2^(3n/4) security using the chi-squared method. And here I have to advertise that there is going to be a talk later in this session which goes back to mirror theory to show a tighter bound, implying security up to 2^n for the same construction. I also want to stress that it's not clear whether we can tighten our analysis using the chi-squared method; it's still open, and we might be able to do so.

Finally, we also study the swap-or-not cipher, a construction proposed by Hoang, Morris and Rogaway in the context of format-preserving encryption. Here the result is a bit harder to state compactly, but essentially, using the chi-squared method, we prove better trade-offs between the number of rounds of the cipher and the achievable security level.
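Here is the promised reconstruction of the formula mentioned above, derived from the counting argument; this rendering is mine, and the exact form should be checked against the paper. Write A = {u_1, ..., u_{i-1}} for the outputs already used by pi_1, B = {y XOR v_1, ..., y XOR v_{i-1}} for the values of u_i ruled out through pi_2, and D_{i,y} = |A ∩ B|. Each admissible u_i is one of the 2^n - (i-1) unused outputs of pi_1, and it forces the unique partner v_i = u_i XOR y, which pi_2 takes with probability 1/(2^n - (i-1)); by inclusion-exclusion, |A ∪ B| = 2(i-1) - D_{i,y}, so

```latex
\Pr\bigl[\pi_1(x_i) \oplus \pi_2(x_i) = y \,\big|\, u_1,\dots,u_{i-1},\, v_1,\dots,v_{i-1}\bigr]
  \;=\; \frac{2^n - |A \cup B|}{\bigl(2^n - (i-1)\bigr)^2}
  \;=\; \frac{2^n - 2(i-1) + D_{i,y}}{\bigl(2^n - (i-1)\bigr)^2}.
```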
So I want to conclude with a big disclaimer, which is that the techniques I have used here are, from a statistical viewpoint, really nothing new. Some of these techniques appear in previous works, and special cases of them have been used even in computer science. A notable example is a paper by Kai-Min Chung and Salil Vadhan from 2008. They are both cryptographers, but the paper is not on cryptography: it is on analyzing hashing of block sources, and it uses a special case of our framework in its analysis. There is an even older paper by Stam (not the cryptographer, but a statistician) who actually proved something that, rephrased in our own language, amounts to using the chi-squared method and directly implies good bounds for the truncated random permutation. So the methods were somewhat similar to, or special cases of, what we have.

But the bottom line, and I think the lesson we get out of this, at least with the example of the XOR of PRPs, is that when we have problems that are really technically hard, sometimes we are stuck using well-established frameworks to give these proofs. In many cases, thinking a bit more broadly, as people do outside cryptography, in statistics, and using, for example, other metrics, might help us solve the problem much more elegantly, much more compactly, and with better bounds. This is also not an entirely new observation, even in this community: there is a notable example, a paper by John Steinberger from 2012, where he used the Hellinger distance to prove bounds on key-alternating ciphers. So the fact that we can swap metrics in our type of proof is also not something new.

And there are open questions. As I mentioned before, for the encrypted Davies-Meyer construction it would be nice to close the gap with a shorter proof using the chi-squared method. Another interesting question is about our XOR2 analysis. I advertised the fact that having an exponent larger than one gives us better bounds and a better decay, but it's not clear that the 1.5 exponent is tight; in fact, two might be the more reasonable answer. I don't know, and we don't know how to prove it. And of course, finding more applications is also something very important.

Okay, this is everything I wanted to say. Thank you for your attention. The paper is on e-print, and I'm happy to take your questions.