of collision resistance and hash functions, and specifically the parallel complexity of such domain extensions: namely, how many rounds of calls to the underlying function f does the final scheme have to make? As my students taught me, the best way to explain something is by way of examples, so let's review the well-known domain extension paradigms that you are all probably familiar with.

The Merkle-Damgård scheme takes as input a message divided into blocks and an initial vector IV, and applies f iteratively, where each application of f needs the result of the previous application; the output of the construction is the output of the last invocation of f. All in all we have a linear number of iterations, linear in the number of blocks of the original message. So this is a very good domain extension scheme, but not when you consider the parallel complexity. The Merkle tree is an improvement: here f is applied in the form of a binary tree, where at each level we apply f to the results of the previous level, and the output of the scheme is the output at the root of the tree. This is an exponential improvement, from a linear number of rounds to a logarithmic number of rounds of calls to the underlying function. But we would like to do even better than that; we'd like to go constant. Can we have a constant number of rounds? Maurer and Tessaro showed how to obtain a two-round domain extension scheme, and their scheme is almost as good as you can expect: it's two rounds, but it falls a bit short of going fully parallel, namely a single round of calls. It does obtain a stronger property called indifferentiability, a notion of indistinguishability from random proposed by Maurer, Renner and Holenstein. And one drawback of their construction, for me at least, is that I couldn't draw it on a single slide.
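To make the round counts concrete, here is a minimal Python sketch of both paradigms, using SHA-256 as a toy stand-in for the compression function f (the helper names are mine, not from the talk; the tree version assumes the number of blocks is a power of two):

```python
import hashlib

def f(x: bytes, y: bytes) -> bytes:
    # Toy stand-in for the underlying compression function f.
    return hashlib.sha256(x + y).digest()

def merkle_damgard(blocks, iv):
    # Sequential: each call to f waits for the previous one's result,
    # so the number of rounds equals the number of blocks.
    state, rounds = iv, 0
    for b in blocks:
        state = f(state, b)
        rounds += 1
    return state, rounds

def merkle_tree(blocks):
    # Tree: all calls within one level are independent and each level
    # halves the number of values, so the number of rounds is logarithmic.
    # (Assumes len(blocks) is a power of two, for simplicity.)
    level, rounds = list(blocks), 0
    while len(level) > 1:
        level = [f(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        rounds += 1
    return level[0], rounds

blocks = [bytes([i]) * 32 for i in range(8)]
_, md_rounds = merkle_damgard(blocks, b"\x00" * 32)
_, tree_rounds = merkle_tree(blocks)
print(md_rounds, tree_rounds)  # 8 blocks: 8 sequential rounds vs 3 tree levels
```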
And let's go back to our question: how parallel can you go? Can you go fully parallel? This is a very interesting question, especially when you are concerned with hardware implementations of hash functions. So, can we do it? I won't keep you in suspense too long: the main result is affirmative. Yes, we can. We construct a fully parallel domain extension scheme that makes a single round of polynomially many invocations of F, and in the random oracle model it guarantees collision resistance and some notion of being random-like, which we call weak indifferentiability. It's very simple, and it has the nice property that it preserves the algebraic degree of the underlying function F: take any function F of your choice to replace the random oracle, and our construction will have the same algebraic degree as your function. To compare with previous constructions: all prior constructions have at least two rounds, so their degree is at least quadratic in the degree of F.

So let's move on to describing the construction. I will describe a general paradigm for obtaining fully parallel domain extension. You take the original message; now, don't divide it into blocks. First apply a deterministic function to it, which we will call a code, and obtain C(M). You divide C(M) into blocks and you apply F independently to each of the blocks. Finally, you XOR the results of these applications, and that's the whole construction: the output is the XOR of all the applications. It's very easy to see that there is only a single round of calls to the underlying function F, and that the construction is simple: you take the message, you apply a code, you apply F, and then you XOR. So the only question is: what should the code be? What do we need to require from the code?
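The whole paradigm fits in a few lines. Below is a minimal Python sketch; note that `encode` here is only a placeholder (an index-prefixed identity, not the list-recoverable code the construction actually needs), and SHA-256 stands in for the oracle F:

```python
import hashlib
from functools import reduce

BLOCK = 32  # block size in bytes (an illustrative choice, not from the talk)

def F(block: bytes) -> bytes:
    # Stand-in for the underlying function F (a random oracle in the analysis).
    return hashlib.sha256(block).digest()

def encode(msg: bytes) -> list:
    # Placeholder code C: zero-pad the message, split it into blocks, and
    # prefix each block with its index. The real construction requires a
    # list-recoverable code here; this only shows the shape of the paradigm.
    padded = msg + b"\x00" * (-len(msg) % BLOCK)
    chunks = [padded[i:i + BLOCK] for i in range(0, len(padded), BLOCK)]
    return [i.to_bytes(4, "big") + c for i, c in enumerate(chunks)]

def hash_parallel(msg: bytes) -> bytes:
    # One round: every call to F is independent of the others, so all of
    # them can be issued in parallel; the output is the XOR of the results.
    results = [F(b) for b in encode(msg)]
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), results)

print(hash_parallel(b"hello world").hex())
```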
So in order to understand that, let's define security in the random oracle model. Here we consider an unbounded adversary that is bounded only in the number of calls it can make to the oracle F. Having this security definition in mind, we can think about what should be required from the code C. Maybe no code is required at all? How do we find out? Again by way of examples, by way of attacking the scheme. Let me remind you that to attack the scheme, the adversary is required to come up with two distinct codewords X and Y that collide on the scheme: namely, if you XOR F applied to all the blocks of X, and you XOR F applied to all the blocks of Y, you get the same result. The reason I can consider codewords rather than messages is that, given two colliding codewords, an unbounded adversary can recover the original messages without querying the oracle at all.

Okay, so let's try a very simple code. Let's assume we don't need a code at all; maybe the code can be the identity, so every word is a codeword. It's very easy to attack the scheme with this code: just take any codeword that has two blocks that disagree, and take it again with those two blocks interchanged. By the symmetry of the XOR function, no matter what F is, these two codewords are going to collide: the XOR of the results of F on all the blocks is the same for the two codewords, because they are the same blocks. So we found a collision without ever making a single query to the oracle, and this is not good. So let's require that the code is well ordered: no block can appear at two different indices. If a block appears as the first block of some codeword, it cannot appear as the second block of any codeword.
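The block-swapping attack on the identity code can be demonstrated directly. In this sketch SHA-256 stands in for F, but the collision holds for any F whatsoever, by the commutativity of XOR (the attack itself needs no oracle queries; the queries below are only to verify the collision):

```python
import hashlib
from functools import reduce

def F(block: bytes) -> bytes:
    # Any function works here; the attack does not depend on F at all.
    return hashlib.sha256(block).digest()

def xor_hash(blocks):
    # XOR-of-F construction with the identity code: blocks hashed as-is.
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)),
                  (F(blk) for blk in blocks))

# Two distinct codewords that differ only in the order of two blocks.
x = [b"A" * 32, b"B" * 32]
y = [b"B" * 32, b"A" * 32]

# XOR is commutative, so the same multiset of F-outputs gives the same hash.
assert x != y and xor_hash(x) == xor_hash(y)
print("collision found, independent of the choice of F")
```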
Okay, so obtaining the well-ordered property is not too difficult; one way to do it is by augmenting each of the blocks with its index. So we cannot use the identity; let's use the well-ordered identity instead: each block gets a prefix that is an arbitrary string and a suffix that is the index of the block within the codeword. And let's try to attack this scheme. For a codeword X, we take a codeword Y that disagrees with X on the prefix of each and every block, and we can ask ourselves: what is the probability that these two codewords collide? Well, it's very small, 2^(-k), where k is the size of a block, so this is not going to be a very successful attack. However, from these two codewords, that is, from these 2n blocks, the adversary can create 2^n codewords, by simply taking all the choices of a subset of blocks from X and the remaining blocks from Y; there are 2^n such possibilities. Since 2^n is much larger than 2^k, by the pigeonhole principle we are going to have a collision, and the adversary only needed to query the oracle on 2n blocks. This brings us to the next requirement: you cannot construct too many codewords from a few blocks. Another requirement is going to be that the code needs a large Hamming distance between any two codewords; this one is a bit more subtle to see, so I won't get into it. Having these requirements in mind, one may ask: where can we find such a code?
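The mix-and-match attack is easy to demonstrate at toy scale: with k = 8 bits, querying the oracle on only 2n = 20 blocks already yields 2^10 candidate codewords, and since 2^10 > 2^8 the pigeonhole principle guarantees a collision (the parameter choices and names here are illustrative):

```python
import hashlib
from itertools import product
from functools import reduce

K = 8  # tiny output/block size in bits, to make the pigeonhole visible

def F(block: bytes) -> bytes:
    # Truncated hash as a toy F with a K-bit output.
    return hashlib.sha256(block).digest()[:K // 8]

n = K + 2  # 2**n mix-and-match codewords, but only 2**K possible outputs
# Well-ordered identity code: index byte plus an arbitrary payload.
x_blocks = [bytes([i]) + b"x" for i in range(n)]
y_blocks = [bytes([i]) + b"y" for i in range(n)]

# Query F on just these 2n blocks, then mix and match for free.
fx = [F(b) for b in x_blocks]
fy = [F(b) for b in y_blocks]

seen, collision = {}, None
for choice in product((0, 1), repeat=n):  # all 2**n subset choices
    digest = reduce(lambda a, b: bytes(p ^ q for p, q in zip(a, b)),
                    ((fx[i] if c == 0 else fy[i]) for i, c in enumerate(choice)))
    if digest in seen and seen[digest] != choice:
        collision = (seen[digest], choice)
        break
    seen[digest] = choice

# Guaranteed by pigeonhole: 2**n codewords, at most 2**K distinct outputs.
assert collision is not None
print("colliding selections:", collision)
```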
Well, luckily they exist; they are called list-recoverable codes, and let me introduce you to them. They are a generalization of error-correcting codes with unique decoding. Those are codes that take a message M and return a codeword C(M) such that if, for example, Alice wants to send Bob the blocks of C(M) over a noisy channel, where most of the blocks arrive safely but some alpha fraction of the blocks may be corrupted on the way, we are still guaranteed that the decoding algorithm will uniquely recover the original message M. An extension of that is list decoding, where the decoding algorithm no longer recovers a unique message but rather a list of L messages, and we are guaranteed that the original message M lies within these L messages.

Finally, the next generalization is list-recoverable codes. In this scenario, when Alice sends the blocks of C(M) to Bob, Bob does not get just one block for each index, but rather a list of possible blocks for each index; we call the list of blocks for index i, T_i. Now the guarantee is that the original message is alpha-consistent with these lists: namely, for at least an alpha fraction of the indices i, the actual block of the codeword at index i appears in the list T_i. Furthermore, there are not too many messages that are alpha-consistent with the T_i's, as long as the T_i's are not too big; and finally, there is a recovery algorithm that, given the T_i's, recovers the list of all (at most L) messages that are alpha-consistent with them. So let me formally define list-recoverable codes: a code C is (alpha, l, L) list recoverable if for all sets of blocks T_i, each of size at most l, there are at most L strings that are alpha-consistent with T. And as a reminder of what alpha-consistent means: the codeword's block at index i appears in T_i for at least an alpha fraction of the indices. Let me distinguish between two cases: the case that alpha
equals one, which is closely related to unbalanced expanders and was already used by Maurer and Tessaro when they constructed their two-round scheme. In this work we require that alpha be strictly smaller than one, and we show that this is actually necessary to go from two rounds to a single round of calls; this case is strongly related to randomness condensers. In fact, one of the codes that we use is the code of Guruswami, Umans and Vadhan, which was presented in the context of condensers; it is based on the code of Parvaresh and Vardy, which is also list recoverable. Their code is (alpha, l, L) list recoverable, where L can be anything up to 2^(k/c), with k the length of each of the blocks and c some constant. You can see that l, the number of queries the adversary can ask, is practically as good as you can expect, since on average every n blocks allow you to construct only a constant number of codewords. Their code also has the nice property that it is linear over the binary field.

So now that we know that list-recoverable codes exist, we can state the converse of requiring them: they are also sufficient for us. Our main theorem is that if you give us a code C that is (alpha, l, L) list recoverable and you plug it into our construction as the code, then the construction is L^2 / 2^k collision resistant against any adversary that makes at most l queries to the oracle. And this is as good as you can get, since it matches the bound of the birthday attack, which is applicable to any hash function. Plugging in the GUV code, we can get many choices of parameters; one choice is to allow the adversary 2^(k/4) queries, in which case L will still be 2^(k/4), and, more importantly, the probability that the attacker finds a collision is going to be smaller than 2^(-k/4). And our construction is going to be degree
preserving in the algebraic degree of F. But someone may say: okay, you got collision resistance, but you used the random oracle; maybe you should give us something else, like indifferentiability — maybe your construction can act like a random function. If we were optimistic, we might hope that our construction, as is, is random-like in some sense. This would allow us to use applications that are central in cryptography, like the Fiat-Shamir paradigm, which I will just skim through: you take a public-coin three-message proof system and you obtain a single-message proof system; and then you can apply Kilian's protocol and Fiat-Shamir and get a sublinear non-interactive argument for any NP language. So this is really good, and it turns out that we can prove some notion of indistinguishability from random: we define it as weak indifferentiability, a relaxation of the indifferentiability notion I mentioned before, and it turns out that it suffices — we prove in the paper that it suffices for the above applications. So this is a nice addition.

To sum it all up, we have all the applications that require collision resistance, and all the applications I just mentioned that require some form of indistinguishability from random, and all these applications are going to have low round complexity. Specifically, just as an example, the sublinear commitment is going to take one parallel round, and so is the Fiat-Shamir paradigm; the last application, since it uses both, is going to take two parallel rounds. I just need to note that if you compare other parameters, for example with the Merkle tree, then we have some loss in parameters, which depends on the number of queries you allow the adversary to make. So to conclude: we have the first fully parallel domain extension scheme; in the random oracle model it guarantees collision resistance and weak indifferentiability; it is simple and it is degree preserving; and it has all the applications I just mentioned. And a
few open questions. It would be nice to have a cleaner instantiation of F replacing the random oracle, and maybe even a low-degree instantiation, which would make use of the degree-preserving property of our construction. However, we don't really know what the sufficient and necessary requirements from such an underlying function F are; we do know that collision resistance of F is neither sufficient nor necessary for the collision resistance of our construction. And finally, list-recoverable codes seem like a very nice primitive to have; it seems like they should have other applications in cryptography. And that's it. Thank you very much.