Hello everybody, my name is Itai and I would like to tell you about my paper, Tight Time-Space Lower Bounds for Finding Multiple Collision Pairs and Their Applications. I'll start with a very basic birthday problem. We are given oracle access to a random function f with a domain and range of size n, and the goal is to output some colliding pair, meaning distinct x and y such that f(x) = f(y). This can be done using t queries to the function, or time complexity t, such that t^2 = n, and this is known to be tight; it is basically the birthday bound. Now let's consider a generalized variant of this problem. The setting is essentially the same, but now we are also given a parameter c, and the goal is to output c distinct colliding pairs (x1, y1), ..., (xc, yc) in f, meaning that for each pair, xi is different from yi but they have the same image under f. We also consider a variant of this problem where we are given access to two random functions f1, f2 and again a parameter c, and now the goal is to output c colliding pairs between these two functions, meaning that for every i, f1(xi) = f2(yi). These variants are essentially equivalent, and I am not going to distinguish between them throughout the rest of this talk. So what do we know about this problem? It can be solved in time t such that t^2 is about c times n. If you plug in c = 1 you get back the birthday complexity, and these parameters are known to be tight; this is a generalization of the birthday bound. So far this is not very interesting, so let's make it more interesting by adding a restriction: let's assume that the space of the algorithm is restricted to s bits.
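To make the unrestricted baseline concrete, here is a minimal sketch (in Python, with function names of my own, not from the paper) of the standard birthday-style collision search: store every image you see until two distinct inputs collide. On a random function it needs roughly sqrt(n) evaluations, but the table also grows to roughly sqrt(n) entries, which is exactly the memory cost that the rest of the talk is about.

```python
def birthday_collision(f, domain):
    """Return distinct (x, y) with f(x) == f(y), storing all images seen.

    By the birthday bound, a collision appears after roughly sqrt(n)
    evaluations of a random f with range size n, but the table `seen`
    also grows to roughly sqrt(n) entries.
    """
    seen = {}  # image -> first preimage that produced it
    for x in domain:
        y = f(x)
        if y in seen and seen[y] != x:
            return seen[y], x
        seen[y] = x
    return None  # no collision in the scanned part of the domain

# Toy example: squaring modulo a prime collides on x and 101 - x.
pair = birthday_collision(lambda x: (x * x) % 101, range(101))
```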
So now it's no longer trivial, because in order to find these c distinct colliding pairs, it seems like you have to store the outputs of f, and if your space is restricted it's not clear how to do this. So what's the best known algorithm for this problem? It's called parallel collision search, or PCS, and it's a classical cryptanalytic algorithm published by van Oorschot and Wiener in 1996. It gives you the optimal trade-off assuming s = c: if your amount of space is roughly equal to the number of collisions you want to find, then you get the optimal trade-off. The question is what happens when you have less memory. In this case you can generalize the PCS algorithm, and it gives you the trade-off t^2 times s equals c^2 times n. You can see that if you plug in s = c here, you get back the trade-off which is known to be optimal. However, when s is smaller than c it is no longer clear that this is optimal. I want to give you a sense of how this parallel collision search algorithm works, and I'll consider a very degenerate setting of parameters. Let's assume your space is very small, say 1 or log n bits, and you want to output n collisions. What can you do in this case? You can run a memoryless cycle-based collision-finding algorithm, such as Floyd's classical algorithm, which finds one collision in about sqrt(n) time. If you repeat this about n times, then you get the n collisions you want, and the time complexity is about n^1.5. The parallel collision search algorithm is basically a generalization of this algorithm to a larger amount of memory. Okay, so this is the trade-off that you get from PCS, and the question is whether it is optimal.
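As an illustration of the memoryless building block mentioned above, here is a sketch of Floyd's cycle-finding algorithm adapted to collision finding. It keeps only a constant number of values in memory; it assumes the walk from the starting point has a non-empty tail before entering its cycle (function names are mine, not from the paper):

```python
def floyd_collision(f, x0):
    """Find distinct a, b with f(a) == f(b) using O(1) memory (Floyd).

    Assumes the "rho" walk x0, f(x0), f(f(x0)), ... has a non-empty
    tail before entering its cycle; otherwise no collision lies on this
    walk and None is returned (one would then retry from another x0).
    """
    # Phase 1: tortoise-and-hare cycle detection.
    tortoise, hare = f(x0), f(f(x0))
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(f(hare))
    # Phase 2: walk from x0 and from the meeting point in lockstep;
    # the two walks first coincide at the cycle's entry point, and the
    # two predecessors of that point form the collision.
    prev_a, prev_b = x0, hare
    a, b = f(prev_a), f(prev_b)
    while a != b:
        prev_a, prev_b = a, b
        a, b = f(a), f(b)
    if prev_a == prev_b:
        return None  # x0 was already on the cycle (empty tail)
    return prev_a, prev_b
```

On a random function, each phase takes about sqrt(n) steps, matching the sqrt(n) cost per collision quoted above.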
So before diving into this question I want to motivate it: why is this even interesting? Notice that when s is smaller than c, which is the parameter range we are considering, you cannot even store the output, because you don't have enough space. So is this range interesting in terms of applications? It turns out that it is, and I'll give you an example which is actually quite important: breaking double encryption. Assume we are given a block cipher used in double encryption: to encrypt p, you encrypt it under k1, and the result you encrypt under k2; the ciphertext is denoted by c. Let's assume for simplicity that all parameters are taken from the same set of size n. The setting is that you are given several plaintext-ciphertext pairs encrypted under the same key, and the goal is to recover the key. The best attack on double encryption is the meet-in-the-middle attack, which gives you essentially optimal time complexity of n; however, it requires a large amount of space. Basically, you encrypt p1 under all possible values of k1 and store these results in the middle, and then you try to match from the decryption side. However, this requires a large amount of space, so what happens when your space is limited, say to 1 or log n bits? What you can do in this case is define two functions. The function f1 takes as input the key k1 and is defined by encrypting the fixed plaintext p1 under k1; it corresponds to this downward arrow here. The function f2 takes as input k2 and returns the decryption of the fixed ciphertext c1 under k2; it corresponds to this upward arrow here. Now the problem reduces to collision finding between f1 and f2, because each such collision gives you a key candidate that encrypts p1 to c1.
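Here is a toy sketch of this reduction in Python. The "block cipher" below is an invertible toy construction of my own (completely insecure, purely to illustrate the shape of the reduction), and the collision search is done by memory-heavy brute force rather than by a low-memory algorithm; only the structure, f1 encrypting downward and f2 decrypting upward, matches the attack described above:

```python
N = 257  # toy domain size; a real cipher would have N = 2^k keys/blocks

def E(k, p):  # toy invertible "block cipher": NOT a real cipher
    return (p + k * k + 1) % N

def D(k, c):  # matching decryption: D(k, E(k, p)) == p
    return (c - k * k - 1) % N

# Unknown double-encryption key and two known plaintext-ciphertext pairs.
k1_sec, k2_sec = 13, 101
p1, p2 = 5, 77
c1 = E(k2_sec, E(k1_sec, p1))
c2 = E(k2_sec, E(k1_sec, p2))

# The reduction: a collision f1(k1) == f2(k2) is exactly a key candidate
# that encrypts p1 to c1 (the two walks "meet in the middle").
f1 = lambda k: E(k, p1)  # downward arrow: encrypt p1 under k1
f2 = lambda k: D(k, c1)  # upward arrow: decrypt c1 under k2

# Brute-force collision collection (for illustration only), then
# post-filtering of the candidates against the second pair.
candidates = [(a, b) for a in range(N) for b in range(N) if f1(a) == f2(b)]
good = [(a, b) for (a, b) in candidates if E(b, E(a, p2)) == c2]
```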
However, you are not sure that this candidate is correct: you have to test it on the remaining plaintext-ciphertext pairs. To analyze the attack, you have to estimate how many collisions you have to find between these two functions until you find the right key. It turns out that you have to find almost n collisions, and the reason is that you cannot tell the key candidates apart just by looking at p1 and c1; you have to test each one of them on the remaining plaintext-ciphertext pairs. So the problem reduces to collision finding with c = n. Recall that s is small, say 1 or log n bits. Now if you plug these parameters into the time-space trade-off of parallel collision search, you get t roughly n^1.5, and this is the best known attack with these parameters on double encryption. What's important here is that the space is much smaller than the number of collisions we want to find, so this demonstrates that this is an interesting range of parameters. So going back to the question: is this trade-off optimal? This question has several applications. If this trade-off is not optimal, then you can improve the best known time-space trade-off for breaking double encryption, as I have just shown you. And there are additional applications: if this trade-off is not optimal, you can improve the trade-offs for various meet-in-the-middle attacks, at least in some ranges of parameters, such as breaking triple or multiple encryption, some dedicated meet-in-the-middle attacks on specific cryptosystems, solving the generalized birthday problem, solving the subset sum problem, and more. So if this trade-off is not optimal, it would have far-reaching consequences
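Spelled out, the n^1.5 figure follows directly from plugging the double-encryption parameters into the PCS trade-off:

```latex
% PCS trade-off: t^2 \cdot s = c^2 \cdot n .
% Double encryption: c \approx n collisions needed, s = O(\log n):
t^2 = \frac{c^2 \cdot n}{s} \approx \frac{n^2 \cdot n}{O(\log n)}
\qquad\Longrightarrow\qquad
t = \widetilde{O}\!\left(n^{3/2}\right).
```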
and improve many time-space algorithms for well-known problems. So, summarizing our results: our first result is that we prove that this time-space trade-off for collision search is actually optimal. This is true for all parameters, including values of s which are smaller than c. The conclusion is that the trade-off algorithms for all the applications I discussed cannot be improved by a more efficient collision search algorithm, because such an algorithm does not exist. However, this does not mean that you cannot improve the trade-offs for these applications by some other algorithm. It would of course be very interesting to see if we can improve, for example, the time-space trade-off for breaking double encryption, or rather prove that we cannot improve it, that it is optimal. Unfortunately, proving that the time-space trade-off for breaking double encryption (and the additional applications) is optimal would overcome a long-standing barrier in complexity theory; basically, this means you should expect it to be very hard to prove the optimality of this trade-off. At this stage it should not be clear why we can prove the optimality of the time-space trade-off for collision search but not for these applications; I'll get to this at a later stage. However, I should mention that this barrier only applies in unrestricted computational models. If you restrict the algorithm in some way, then sometimes you can prove time-space lower bounds, and this brings me to the second result: we focus on the application of breaking double encryption and show that under some restriction, the best known time-space trade-off is actually optimal. In the remainder of the talk I'll focus on these two results, and I'll start with the first one. Okay, so we want to prove that this trade-off is optimal for the collision search problem.
In order to do so, we adapt the framework of Borodin and Cook, which was published back in 1982 and was used to derive several time-space lower bounds for interesting problems such as sorting, matrix multiplication, etc. However, this seems to be the first time it is used in the domain of cryptography, which is kind of interesting. I'm going to give you a very high-level intuition of how the proof works. Assume we have an algorithm that outputs c collisions and runs in time t. We divide this long interval of t steps into L short time intervals, each of length t' = t/L, and we say that the algorithm makes progress in an interval if it outputs c' = c/L collisions in that interval. Now consider the mini-problem of outputting c' collisions in time t'. The first stage of the proof is to show that any such mini-algorithm succeeds with very small probability, bounded by some small epsilon, where the probability is over the choice of the random function f. Notice that up to this point the proof is completely independent of the memory of the algorithm; the memory comes into play in the second part. In the second part, we notice that in order to output c collisions overall, the algorithm must output c' collisions in some interval, and hence some mini-algorithm must output c' collisions. Notice that a mini-algorithm in an interval is determined by some initial memory state of the algorithm. How many memory states can there be? The algorithm only has s bits of memory, so there are at most 2^s such states, and therefore at most 2^s mini-algorithms defined by the big algorithm.
So if you take a union bound over all these mini-algorithms, each one succeeds with probability at most epsilon, and there are at most 2^s of them, so you can bound the total success probability of the algorithm by 2^s times epsilon. If you prove that epsilon is much smaller than 2^(-s), you are basically done: you have bounded the success probability of the algorithm. And this is actually possible; this is what we do in the paper. There are a few complications which I don't have time to discuss, so you'll have to take a look at the paper to see how exactly this is done; this was just intuition, not a formal argument. Okay, so let's move to the second part and consider the applications. Why can't we use this framework to prove optimality of, for example, the attack on double encryption? Notice that the proof relied in a very strong way on the fact that the output of the algorithm is long: the number c of collisions should be large. We divided the algorithm into intervals and argued that the algorithm has only a small chance of outputting a small chunk of the output in any single interval. However, if the output is short, you can no longer do this: it's not clear how to measure the progress of the algorithm with a short output towards solving the problem. In fact, there is a long-standing barrier in complexity theory which states that it is very hard to prove a meaningful (by some definition of meaningful) time-space trade-off lower bound for short-output problems in general computational models. But if you restrict the algorithm in some way, then there is some chance, and this is what we do in our second result, which focuses on breaking double encryption.
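The counting argument from the first part can be summarized in one line (this is just the intuition from the talk, not the formal statement in the paper):

```latex
% At most 2^s initial memory states per interval, hence at most 2^s
% mini-algorithms, each succeeding with probability at most \epsilon:
\Pr[\text{some mini-algorithm outputs } c' \text{ collisions}]
   \;\le\; 2^{s} \cdot \epsilon ,
% so proving \epsilon \ll 2^{-s} bounds the overall success probability
% (the union bound over the L intervals is handled similarly).
```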
So the best known algorithm for this problem is based on parallel collision search; I showed you how the reduction works, and this is the time-space trade-off that you get. What we are interested in, of course, is showing that this is optimal. By what I've just said, we cannot really hope to prove this unconditionally, so we have to make some restriction. The restriction we make is to define a new computational model, which we call the post-filtering model. In the post-filtering model, the algorithm gets full access only to part of the input; access to the remaining part is mediated by a post-filtering oracle. For this to be meaningful, given the first part of the input, there should exist many equally likely potential solutions that the algorithm cannot tell apart just by looking at that first part. The algorithm has to feed these potential solutions to the post-filtering oracle, which gets as input a potential solution and outputs true only if it is a correct solution to the entire problem. And if there are many equally likely solutions, the algorithm has to produce many of them to the post-filtering oracle. So the model forces a reduction from a short-output problem to a related long-output problem, and for long-output problems we already know how to prove time-space lower bounds, as I have shown. Let's be more specific and focus on the problem of breaking double encryption. Recall that the best known attack concentrates on p1 and c1 and analyzes this pair, but uses the remaining plaintext-ciphertext pairs only for post-filtering purposes, that is, to post-filter key suggestions. So how is the post-filtering model defined for this specific problem?
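As a concrete sketch of this interface (class and parameter names are hypothetical, not from the paper), the post-filtering oracle for double encryption might look as follows, exercised with a toy, insecure cipher of my own:

```python
class PostFilteringOracle:
    """Sketch of the post-filtering model for double encryption.

    The attacker sees the cipher E and the first pair (p1, c1) in full;
    the remaining plaintext-ciphertext pairs stay hidden behind this
    oracle, which only confirms or rejects full key candidates.
    """

    def __init__(self, E, hidden_pairs):
        self._E = E
        self._pairs = hidden_pairs  # remaining (p, c) pairs, kept hidden

    def query(self, k1, k2):
        # True iff (k1, k2) is consistent with ALL hidden pairs.
        return all(self._E(k2, self._E(k1, p)) == c
                   for p, c in self._pairs)

# Toy, insecure cipher just to exercise the interface.
N = 257
def E(k, p):
    return (p + k * k + 1) % N

k1_sec, k2_sec = 3, 5
pairs = [(p, E(k2_sec, E(k1_sec, p))) for p in (10, 20, 30)]
oracle = PostFilteringOracle(E, pairs)
```

Any attack in this model, PCS-based or otherwise, interacts with the hidden pairs only through `query`, which is exactly the restriction the lower bound exploits.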
Well, the algorithm gets access to the block cipher and, of course, to the first plaintext-ciphertext pair. In addition, it gets access to a post-filtering oracle, which gets as input a key candidate and returns 1 only on the correct key. Notice that this captures the parallel-collision-search-based attack, but also various generalizations of it which we may not have thought of. And we prove that the best known time-space trade-off attack on double encryption, the PCS-based one, is optimal among all post-filtering attacks on double encryption. This is kind of nice because the model is clean: it abstracts away the lower-level collision search problem, so there are no collisions defined in this model at all. The conclusion from this result is that in order to improve this trade-off, you must somehow combine information from multiple plaintext-ciphertext pairs in a non-trivial way in the analysis. So this gives you a hint on what you have to do if you want to improve it. Let me conclude the talk. I've shown you that the best known time-space trade-off for the collision search problem is optimal for all parameters. And I presented the post-filtering model, a restricted model of computation, and showed that under this model the best known time-space trade-off algorithm for breaking double encryption is optimal: you cannot do better in this model. In the future it would be interesting to extend the post-filtering model and prove time-space lower bounds for additional problems. It would also be interesting to try to bypass this model, as I mentioned a minute ago, and improve the algorithms, for example the best known algorithm for double encryption, which would of course be a very nice result; currently we don't know how to rule this possibility out with the current techniques.
So I hope you've enjoyed this talk, and I encourage you to take a look at the paper. Thank you very much for your attention.