Hello everyone, this is Dror Chawin from Tel Aviv University in Israel. Today I'd like to present new lower bounds on the time-memory trade-off of function inversion. This is joint work with my thesis advisor Iftach Haitner and fellow researcher Noam Mazor. First, let's describe the problem. Function inversion is: given a function f and an image y of the function, find some x which is mapped by f to y; basically, find a preimage of y under f. To do so, we are allowed oracle access to f, and we also have access to side information a. You can picture this problem as having two phases. In the first phase, the function is given to a preliminary stage, the preprocessor, which outputs the side information. The side information is given to the online phase, the decoder, which is then given a random image y, can communicate with the function f using an oracle, and hopefully outputs a value in the preimage of y. We only require this to succeed with high probability over a random choice of f and y, where f is chosen uniformly at random from the set of all functions from [n] to [n], and y is simply the image of a uniformly chosen input to f. Now, we don't assume anything about the computational power of the algorithms; the computation is unbounded. In particular, the preprocessor is unbounded in time and space. However, we do have two interesting parameters in this question, on which we prove lower bounds. The first is the advice length s, which is a lower bound on the amount of space needed to perform the inversion. And we also have q, the number of queries performed by the decoder in the online phase, which is an effective lower bound on the time requirements of the algorithm. Another variant of this problem exists: non-adaptive function inversion. The basic problem is the same, except that now all of the queries must be performed in bulk, at once.
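To make the two-phase setup concrete before we continue, here is a minimal toy sketch (the names and interfaces are my own illustration, not from the paper): a preprocessor that outputs empty advice, and a decoder that inverts by querying the oracle on every input, i.e. the trivial s = 0, q = n inverter.

```python
import random

def preprocessor(f):
    """Offline phase: here, the trivial choice of no advice at all (s = 0)."""
    return b""

def decoder(advice, y, f_oracle, n):
    """Online phase: up to q = n oracle queries, scanning for a preimage of y."""
    for x in range(n):
        if f_oracle(x) == y:
            return x
    return None  # y is not in the image of f

# A random function f: [n] -> [n] and a challenge y = f(x*) for a random x*
n = 64
f = [random.randrange(n) for _ in range(n)]
advice = preprocessor(f)
y = f[random.randrange(n)]
x = decoder(advice, y, lambda i: f[i], n)
assert f[x] == y  # the decoder found a valid preimage
```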
To do that, we introduce an intermediate phase, the query selection phase. The query selection algorithm also has access to the side information, the advice string, as well as to the value y which we wish to invert. It then outputs a series of indices, which are the queries to f. Next, in the online phase, when we wish to invert y, the decoder knows y; it also knows the advice string a, and it is given the responses, the answers of f to those queries. As before, it outputs a value which is hopefully in the preimage of y. Now, every non-adaptive algorithm is by definition also an adaptive algorithm. So naturally, any non-adaptive upper bound for inversion also applies to adaptive inversion, and the opposite is also true: any lower bound on adaptive inversion naturally extends to non-adaptive inversion. This problem is interesting for several reasons. First of all, many cryptosystems in practical use today are analyzed under the assumption that the underlying hash or cipher functions are ideal, that they behave like a random function. So any attack on function inversion also applies to real-world uses of cryptographic hash functions and ciphers. Next, we also have black-box separations, which treat the one-way function as a black box, and the function inversion problem explores what we can or cannot do within this model. Finally, another interesting result, published recently, is that new non-adaptive lower bounds would imply new Boolean circuit lower bounds, which are so far out of our reach. So that can also be interesting for some of you. Now, let's start by reviewing what we've known all along. For upper bounds, that is, algorithms, we've got the trivial inverters. For example, I can simply query the entire function until I find a good preimage, and I don't even have to use advice for that. Or I can do the opposite.
I can just keep the entire function in my advice string, and then I don't have to query the function at all; q can be zero. I can also trade off between the two: for example, I keep half of the function in the advice and query the other half if the need arises. Now, specific to the adaptive case, we have one very successful algorithm, first described by Martin Hellman and later developed further by Fiat and Naor. This allows nice trade-offs between space and time, and it admits the possibility of both space and time being sublinear in n. In particular, for the case of a permutation, an even better trade-off is achievable, where both s and q can, for example, be the square root of n. However, for the non-adaptive case, absolutely no non-trivial upper bound is known, which is interesting. Now, for the lower bounds, we have a single interesting bound: Yao's bound. This bound is actually tight for adaptive permutation inversion, as described in the previous slide, and of course it applies to both adaptive and non-adaptive inversion. In the adaptive case, there is still a certain gap from the best upper bound for functions (as opposed to permutations), and we're actually not sure which one is the correct bound; that remains to be seen. But in the non-adaptive case, there is a huge gap. And we believe this gap persists because, as we've just mentioned, any new lower bound improving on Yao's bound would imply new circuit lower bounds, which would be groundbreaking. Alright, so let's review our results in this paper. To start with, let's see how we formulate adaptive inversion. We describe an inverter as a pair of algorithms: a preprocessor and a decoder. We start with the preprocessor, which receives the function and outputs an advice string. Next, the decoder receives the advice string and the element to invert, y, and can make at most q queries to f.
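The store-half/query-half trade-off just mentioned can be sketched as follows (a toy illustration with hypothetical interfaces, not the Hellman or Fiat-Naor scheme): the advice stores an inverse table for the first half of the inputs, and the decoder queries the second half only when the table misses.

```python
import random

def preprocessor(f, s_inputs):
    """Advice: an image -> preimage table for f restricted to the
    first s_inputs inputs."""
    table = {}
    for x in range(s_inputs):
        table.setdefault(f[x], x)
    return table

def decoder(advice, y, f_oracle, n, s_inputs):
    """First try the stored table (zero queries); otherwise query the
    remaining n - s_inputs inputs."""
    if y in advice:
        return advice[y]
    for x in range(s_inputs, n):
        if f_oracle(x) == y:
            return x
    return None

n = 64
f = [random.randrange(n) for _ in range(n)]
advice = preprocessor(f, n // 2)   # keep half the function as advice
y = f[random.randrange(n)]         # y is guaranteed to have a preimage
x = decoder(advice, y, lambda i: f[i], n, n // 2)
assert f[x] == y
```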
We say such an algorithm has high success probability if, over a random choice of f and y, it succeeds in at least half of the cases. This constant is not actually important; it's quite arbitrary. Now, to prove a lower bound, we make a further assumption on how the algorithm operates: suppose the algorithm has linear advice. What does that mean? Linear advice basically means that given any functions f and g, if we have the advice string for f and the advice string for g, we can combine the two strings to figure out the advice for another function, f + g. Here f + g is the coordinate-wise group operation; the group could be anything, actually, as long as the operation is coordinate-wise. And the "plus" on the advice side could be any operation; it doesn't have to behave like addition, as long as we can figure out the left-hand side from the two elements on the right-hand side. Our bound for this case is: suppose there is such an inversion algorithm with linear advice, and it is successful. Then either its advice or its number of queries must be at least order of n over log n, which basically matches the trivial upper bound. We prove this by reduction from set disjointness, as you will see shortly. We have another result, very similar to the first: it naturally extends to the case of additive advice, which covers any case where, given the two functions f and g, we can compute the advice for the sum f + g with little communication. So the previous theorem is just a special case of this one. It states that either the communication required to compute the combined advice, or the number of queries, must be at least order of n over log n. Now, our next result pertains to non-adaptive inversion, and this time the constraint lies on the decoder phase, not the preprocessor. But first, let's formulate non-adaptive inversion. Here we use a triplet of algorithms.
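Before moving on, here is a toy example of what "linear advice" means (my own illustrative choice of linear map, not the paper's): the advice is a fixed linear function of f viewed as a vector over Z_n, so the advice strings of f and g alone determine the advice string of f + g.

```python
import random

n = 16
# A fixed linear map, independent of f: sum f over two coordinate blocks.
# This is a hypothetical advice scheme chosen only to exhibit linearity.
BLOCKS = [range(0, 8), range(8, 16)]

def advice(f):
    return tuple(sum(f[i] for i in blk) % n for blk in BLOCKS)

def combine(a_f, a_g):
    """Recover advice(f + g) from advice(f) and advice(g) alone."""
    return tuple((u + v) % n for u, v in zip(a_f, a_g))

f = [random.randrange(n) for _ in range(n)]
g = [random.randrange(n) for _ in range(n)]
h = [(fi + gi) % n for fi, gi in zip(f, g)]   # coordinate-wise sum f + g
assert combine(advice(f), advice(g)) == advice(h)
```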
We have the preprocessor, the query selector, and the decoder. The preprocessor, as before, receives the function and outputs an advice string. The query selector receives the advice string and the element we wish to invert, and outputs a series of queries, which are then given to the decoder. The decoder is given the advice string and the element to invert, and in conjunction with the query results must output an answer. This time we concentrate on a very limited kind of decoder: an affine decoder. That is, given any fixed values for y and a, the output of the decoder must behave like an affine function of f, if we treat f as a vector. Here we assume that everything happens over a finite field, and the decoder's output is an inner product between some vector, determined by y and a, and f, plus a constant, where the function f is simply treated as a vector. Our result for this kind of decoder is that regardless of how many queries it uses, and under the caveat that the size of the field must equal n, the advice used by the algorithm must be at least order of n, which again basically closes the gap with the trivial upper bound. This is actually quite a degenerate inverter; it can't do much. But the reason it is interesting is that it allows us to modify it and obtain bounds for more complex decoders, which is done in the next slide. Now on to our final result, which is similar to the previous one, except with a slightly more advanced decoder. Here our decoder is not an affine decoder, but rather an affine decision tree decoder. This time, for any given y and a, the decoder outputs a function which behaves like a decision tree over values of affine functions of f. For example, you can see here a toy decision tree: the path is determined by the values of affine functions of f, and these functions, these alphas and betas, are all determined uniquely by y and a.
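As a small illustration of the affine decoder just defined (a toy sketch: alpha and beta stand in for the vector and constant that y and a would determine), the decoder's answer is an inner product with f plus a constant over Z_p, and we can check the affine identity D(f) + D(g) = D(f + g) + D(0):

```python
import random

p = 13  # field size; the theorem assumes the field size equals n

def affine_decoder(alpha, beta, f):
    """alpha and beta are fixed by (y, a); f is treated as a vector in (Z_p)^p."""
    return (sum(ai * fi for ai, fi in zip(alpha, f)) + beta) % p

alpha = [random.randrange(p) for _ in range(p)]
beta = random.randrange(p)
f = [random.randrange(p) for _ in range(p)]
g = [random.randrange(p) for _ in range(p)]
fg = [(a + b) % p for a, b in zip(f, g)]
zero = [0] * p

# Affinity check: D(f + g) + D(0) == D(f) + D(g)  (mod p)
lhs = (affine_decoder(alpha, beta, fg) + affine_decoder(alpha, beta, zero)) % p
rhs = (affine_decoder(alpha, beta, f) + affine_decoder(alpha, beta, g)) % p
assert lhs == rhs
```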
This is interesting because d here is the depth of the tree, and by varying the depth we can actually create an entire spectrum of decoder complexity. For example, when d equals 1, this reduces to the previous case: just a single affine computation over the values of f, which can't do much. However, if I increase d by as little as 1, so that d equals 2, that is already enough to perform multiplication and also a single pointer jump between the different queries. And if I take it to the extreme, for example d equals q, then this can describe any arbitrary non-adaptive inverter, which is the general case and the most interesting one. The result is as follows. Given such an inverter with high success probability and a depth-d affine decision tree decoder, where again the field must be of size n, we have an interesting lower bound on the size of the advice. We can split it into separate cases. For example, if the number of queries is linear in n, then s must be at least order of n over d log n. And if q is sublinear in n, then s must be at least n over d. In the case where d equals q, which is the arbitrary non-adaptive decoder, we find that this simply recovers Yao's bound. Just one comment about the field size: with some tweaking, this result still holds for fields somewhat smaller than n, though we didn't really address this in our paper, while for much smaller fields, this isn't an interesting question in the first place. For larger fields, we believe the result does hold, but we have yet to show it. All right, so now let's go on to prove our first theorem, on adaptive inverters with linear advice. Let's just recap for a moment what the bound was. We require that the preprocessing phase obey the constraint that the sum of the two advice strings equals the advice string of the sum of the functions.
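The claim above that depth two already buys a pointer jump can be seen in a toy sketch (my own example, over Z_p): the root of the tree evaluates the affine function f(0), the branch taken selects the next affine function to evaluate, and the leaf outputs f(f(0)).

```python
import random

p = 11  # field size, playing the role of n

def inner(alpha, f):
    """Inner product over Z_p: an affine function of f with zero constant."""
    return sum(a * x for a, x in zip(alpha, f)) % p

def e(i):
    """The i-th unit vector, so inner(e(i), f) == f[i]."""
    v = [0] * p
    v[i] = 1
    return v

def depth2_pointer_jump(f):
    v = inner(e(0), f)    # depth 1: evaluate the affine function f(0)
    return inner(e(v), f) # depth 2: the branch on v selects e_v, i.e. f(v)

f = [random.randrange(p) for _ in range(p)]
assert depth2_pointer_jump(f) == f[f[0]]
```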
And the sum could be any coordinate-wise sum between the functions. We use a reduction from a classical problem called set disjointness. I guess you know it, but I'll describe it briefly. We have two parties, Alice and Bob, each given an input set, x and y. The two parties may communicate a certain number of bits, and they wish to answer the question: are our sets disjoint? A seminal result of Razborov showed that any protocol, randomized or not, which decides set disjointness must use at least order of n bits of communication between the two parties. This holds even in the more specific case where both sets are exactly of size n over 4 and the intersection is either empty or contains just a single element, and it also holds even when we allow a small error. So let's see our reduction. We assume these two parties each have an input set, x and y, and they're interested in figuring out whether or not the sets are disjoint. In their possession they have a linear-advice inversion algorithm C. First, Alice uses her own set x to define the function fx, defined as follows: for each i in the set, fx of i is 0; for each i not in the set, fx of i is chosen uniformly at random. Bob does the exact same thing with his own input set y. Now Alice computes the advice string for her own function and sends it to Bob. Bob computes the advice string for his own function, adds it to the advice he received from Alice, and then he basically has in his possession the advice for the sum of the functions, even though he doesn't have actual access to this new function. In the next step, Bob takes our inverter C, emulates it, and tries to invert the new function f, which is the sum of both individual functions, at the value 0. So we're now trying to find a value which is mapped by this f to 0.
Now, while Bob is simulating this inversion algorithm, every time he receives a query k from the algorithm, he simply asks Alice. Alice replies with her value fx of k, Bob adds it to his own fy of k, which by definition equals f of k, and he feeds it back to the algorithm. Eventually, the algorithm finishes and outputs a value w. Bob communicates w to Alice, and they both check whether or not w lies in the intersection of the two sets. If it does lie in the intersection, they declare that the sets are not disjoint; if it is not in the intersection, they declare the sets disjoint. Now, before I convince you that this makes sense and is correct, let's see how much communication we've used here. First of all, Alice sent s bits for the advice. Next, each query costs 2 log n bits, the query and its answer, so 2q log n in total, and it doesn't require much to account for the last stage. Okay, now let's see that this is actually correct. The first observation to make is that the protocol has only one-sided error: if there is no value belonging to both sets, there will never be a w that makes Alice and Bob output that the sets are not disjoint, because of the final check. So the error can only go one way. The second key observation is that if there exists some element i in both sets, then by definition f of i must equal 0. Why is that? Because i belongs to x, so fx of i is 0; fy of i is also 0; therefore the sum of both, which is f of i, is also 0. The rationale is that if Bob manages to find a preimage of 0 under this function f, hopefully we will get this element i which exists in both sets, and then we can figure out that the sets are not disjoint. Now, what if C fails and doesn't give a correct answer? Well, we can simply amplify by repeating the protocol several times.
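The reduction just walked through can be simulated end to end in a few lines (a toy sketch: the advice scheme here is the trivially linear "advice = f itself", so combining advice strings is just coordinate-wise addition, and brute-force inversion of all of f's zeros stands in for the algorithm C):

```python
import random

def set_to_function(S, n, p):
    """f_S(i) = 0 for i in S, uniformly random in Z_p otherwise."""
    return [0 if i in S else random.randrange(p) for i in range(n)]

def disjointness_protocol(X, Y, n, p):
    fX = set_to_function(X, n, p)                # Alice's function
    fY = set_to_function(Y, n, p)                # Bob's function
    # Linear advice "advice = f": Bob combines the two advice strings
    # coordinate-wise and obtains the advice for f = fX + fY.
    f = [(a + b) % p for a, b in zip(fX, fY)]
    # Invert f at 0 (brute force over all zeros stands in for C); the
    # joint membership check makes the protocol's error one-sided.
    for w in range(n):
        if f[w] == 0 and w in X and w in Y:
            return False                          # sets are NOT disjoint
    return True                                   # declared disjoint

n, p = 32, 101
assert disjointness_protocol({1, 5, 9}, {2, 5, 11}, n, p) is False  # 5 shared
assert disjointness_protocol({1, 9}, {2, 11}, n, p) is True         # disjoint
```

Note how the two key observations from above show up: a shared element i forces f(i) = 0, and the final membership check means a spurious zero of f can never cause a false "not disjoint" answer.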
Next, another problem: suppose C does return a valid preimage of 0, but not the correct i, just some other value, which is possible since these functions are quite random. Well, this is unlikely to happen repeatedly, because the preimage of 0 under this function is expected to be of constant size, even as n grows. That means that by repeating the protocol a sufficient number of times, we have a pretty good probability of coming up with the single correct i which exists in both sets. Of course, this is treated more formally in the paper, and it's quite simple. A few more issues: the function f should be distributed completely uniformly, which right now it isn't; also, we always wish to invert at 0; and the answer is also partly determined by x and y if the sets are not disjoint. These can all be easily fixed; you can simply open the paper, but it's not very interesting. Finally, we also need every iteration of the protocol to be independent of the others, and we show that in the paper as well, using some very simple tricks. So in total, what happened here is that we assumed the existence of such an algorithm C and employed it to solve the set-disjointness problem using not that many bits of communication. It then follows from Razborov's bound that either the advice or the number of queries must be at least order of n over log n. Alright, so now, to finish off, I'll say a few words on our second result, on non-adaptive inverters with affine decoding. Just to recall the result: given such an inverter, where the decoder behaves like an affine function of f, with high success probability, it requires at least a linear amount of advice, which is almost a description of the entire function. The main lemma on which the proof hinges is as follows. We use an inverter D which has a fixed, or zero-length, advice, but still has an affine decoder. And we define a few random variables.
f is the function chosen for inversion; y1 through yk are various challenges which we try to invert using D; and xi, for each i, is D's output on yi. So we choose a function at random, we have this zero-advice inverter, we give it different y values, and each time it outputs an x value. Next, we define the event Zi, meaning that all of the first i inversions succeeded: D managed to find a proper preimage each time. The main lemma itself is as follows: even when conditioned on the previous i minus 1 successes, there is still not such a good chance that we are able to invert the i-th challenge. Again, this constant is not very important; it's quite arbitrary. And this holds even for quite large values of i. Now, if the queries were independent of the elements asked about at each stage, then the lemma would be quite easy, almost obvious, to prove with an information-theoretic or probabilistic argument. But they are dependent, and this is actually the hard part. We should also mention that for adaptive inverters in general, this is false. Our proof exploits the linearity of the decoder. Basically, it finds a linear equation over the random variable f which is implied by this conditioning; the conditioning is translated into a linear, actually affine, equation over f. And next we show that this equation does not supply enough information about f to find any particular preimage of yi, the i-th challenge. Right, and just to finish off, let's do a short summary. We've shown new bounds on adaptive inversion with linear or additive advice, non-adaptive inversion with affine decoding, and non-adaptive inversion with affine decision tree decoding. The main open questions that remain: how can we deal with arbitrary non-adaptive or adaptive inverters, since there's still a huge gap in the arbitrary case?
Also, what about random permutations for non-adaptive inverters? In the adaptive case, we do find that there is a difference. And finally, finding better time-memory trade-offs for other cryptographic primitives, which could also benefit from these techniques. So thank you very much for listening, and you're welcome to check out our paper. I hope you enjoyed it and that it was clear, and have a good day.