So good morning everyone. First I have an announcement: if you're a woman who takes off her necklace when you shower, then come see me afterwards, especially if it's a round necklace. I've been given the task of being the chair for the first time when the sessions are really short, so I don't know what that implies about how Phil perceives me, but I'll try to do my best. I was also given a session with very hard names to read. The first talk is "The Leftover Hash Lemma, Revisited." It's by Boaz Barak, Yevgeniy Dodis, Hugo Krawczyk, Olivier Pereira, Krzysztof Pietrzak, François-Xavier Standaert and Yu Yu, and Yevgeniy is going to be giving the talk. Okay, Yevgeniy, I'm pressing the, oh, you're still not done. Hello. Yeah. Yevgeniy, don't forget your timer there. Okay, thanks. So I'm going to be talking about the Leftover Hash Lemma, revisited, and I'll define all the terms. Let me start by talking about imperfect random sources. We all know that randomness is very important in many areas of computer science, but it is especially important in cryptography, because secret keys have to be random and unknown to the attacker. Unfortunately, in practical situations we often have to deal with imperfect sources of randomness, such as physical sources and biometric data. But even if the data was originally random, there is this whole class of side-channel attacks and other forms of leakage the attacker can get about your secret key, such that conditioned on the attacker's partial information, your key, which was originally random, is no longer random from the perspective of the attacker. Another very important problem is when we do, for example, Diffie-Hellman key exchange, or some kind of key exchange based on public-key infrastructure: we get some kind of group element, and then we need to condense it to a bit string to be used for AES.
So it is very important that we deal with such sources, and of course we need to ask ourselves: what is the minimal assumption we need about our sources so that we have any hope of doing something at all? It turns out the right assumption is that the source has entropy, and the right form of entropy in this case is called min-entropy, which says that for any outcome x, the probability of hitting that particular outcome is at most 2 to the minus m. This number m is called the min-entropy of our source, and the big question, of course, is: can we extract nearly perfect randomness from such realistic, imperfect sources? This question is very well known and has received a lot of attention, and the right primitive is called a randomness extractor. It does exactly what it says: it takes an imperfect source X and outputs a random string R. Unfortunately, that is too good to be true; it turns out that to make it work, you also need a short random seed S to go along with X, and then you extract the key R. Of course you might ask: if I already have a random seed S, why don't I just use S? The point is that S is public while X is secret, so we want the extracted bits to be random even if S is completely public and known to the attacker; even conditioned on S, the extracted bits should be random. This is a very important primitive with many uses well beyond key derivation, but for the purposes of this talk let's really think about our basic problem: given an imperfect source, we want to derive a key, for AES for example. Okay, so very briefly, let's go through the parameters. There are a lot of them, but all the letters will be introduced on this slide. First, the min-entropy m, already defined. Then the output length v: we would like to extract as many bits as possible, hopefully at least 128 for AES. But for most of the talk I will be using an equivalent measure, the entropy loss, which is the difference between the entropy that we have and the number of bits that we extract.
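As a quick aside (not from the talk), the min-entropy of a finite distribution is easy to compute directly; the helper name `min_entropy` below is my own, for illustration only.

```python
import math

def min_entropy(probs):
    """Min-entropy H_inf(X) = -log2(max_x Pr[X = x])."""
    return -math.log2(max(probs))

# A uniform byte has the full 8 bits of min-entropy.
uniform = [1 / 256] * 256
# A "mostly stuck" source hits one value half the time, so it has
# only 1 bit of min-entropy no matter how many outcomes it has.
stuck = [0.5] + [0.5 / 255] * 255

print(min_entropy(uniform))  # 8.0
print(min_entropy(stuck))    # 1.0
```

Note how min-entropy is pessimistic by design: it is governed entirely by the single most likely outcome, which is exactly the guarantee a key-derivation application needs.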
For example, if we have 100 bits of entropy and extract 80 bits, we lost 20 bits. Notice this quantity could also be negative: sometimes we really have only 100 bits, say from some small elliptic curve, but we want to extract 128 bits, and maybe we are not going to lose all the security. So this quantity could be negative if you're really unlucky. Then there is the error epsilon: this is a statistical error, the best advantage any computationally unbounded attacker can have in distinguishing the extracted bits from random bits, and we define the security parameter k as log 1 over epsilon. Finally there is the seed length: how many extra bits we need in order to achieve this task. The optimal parameters turn out to be pretty good: the seed length is logarithmic, essentially linear in the security parameter, and, sorry for this really annoying green color, the entropy loss is 2 log 1 over epsilon. The question, of course, is whether we can match these efficiently, because these are only existential results; they just show that a random function achieves these very good parameters. And sometimes, when a question is asked in '96, you have to look back at some of yesterday's leftovers to see whether we already know the answer. It turns out we did already know the answer, at least some answer, and it is given exactly by the Leftover Hash Lemma, which is really a beautiful, amazing thing. I don't know if the inventors of the lemma anticipated the number of applications it would have, but essentially it says that universal hash functions are good extractors. So let me briefly remind you what universal hash functions are. It is a family of functions such that for any two distinct inputs, if you choose a hash function at random, the probability of a collision is as low as it can get, which happens to be 1 over 2 to the v, the output length.
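To make the universality condition concrete, here is a toy family of my own choosing (not the one on the slides): multiplication modulo a prime. For distinct inputs, a collision forces the random key to be zero, so the collision probability is exactly 1 over the range size.

```python
import random

P = 2**13 - 1  # a small Mersenne prime; range and modulus, for illustration

def h(a, x):
    """Universal family over Z_P: h_a(x) = a*x mod P."""
    return (a * x) % P

# For x1 != x2, h_a(x1) == h_a(x2) means a*(x1 - x2) == 0 (mod P),
# which forces a == 0, so Pr_a[collision] = 1/P: the smallest possible.
x1, x2 = 7, 1234
trials = 200_000
collisions = 0
for _ in range(trials):
    a = random.randrange(P)
    if h(a, x1) == h(a, x2):
        collisions += 1
print(collisions / trials)  # empirically close to 1/P ~ 0.000122
```

Note what is *not* required: nothing cryptographic, no one-wayness, just a pairwise collision bound, which is why such families are so much faster than cryptographic hash functions.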
So the Leftover Hash Lemma says universal hash functions are good extractors. Let's look at the parameters. The entropy loss happens to be 2 log 1 over epsilon, which is optimal, so this is great news. Unfortunately, for the seed length it turns out that, even though the definition of a universal hash function is so simple, these functions must have seed length at least linear in the source length. So if we have some physical source of noise which is, say, one megabyte long, we need a seed of about one megabyte as well, and this is not so good. But on the other hand, despite this, there are a lot of advantages. It is very simple to construct universal hash functions; they are orders of magnitude faster than cryptographic hash functions, so they are very, very fast if you know what you're doing, and they also have some nice algebraic properties, which were recently used for designing public-key cryptosystems resilient to leakage, and so on. But the cons: the first con is kind of forbidding, the seed length is large; and one can also argue that even the entropy loss is pretty large. I'm going to explain both of those disadvantages of the Leftover Hash Lemma in a second; roughly speaking, these are the two parts of our talk, and we are going to improve both of those disadvantages of this beautiful invention. So let's start with the first one, improving the entropy loss: the lemma has a pretty severe entropy loss, and we'll try to improve it in the setting of the Leftover Hash Lemma. Let's ask two questions. The first question: is this important? I already said the entropy loss is small, so who cares? Well, the answer is definitely yes, especially if you talk to practitioners, because many sources of randomness just do not have these 2 log 1 over epsilon, say 160 or 200, extra bits of randomness lying around, like biometric and physical sources, and this matters even when we control the source.
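For reference, the standard statement behind these parameters can be written as follows (my notation, reconstructed from the definitions given so far):

```latex
\textbf{Leftover Hash Lemma.} Let $\mathcal{H} = \{h_s : \{0,1\}^n \to \{0,1\}^v\}$ be a
universal family and let $X$ be a source of min-entropy $m$. Then, for a uniform
public seed $s$,
\[
  \mathrm{SD}\bigl((s,\, h_s(X)),\ (s,\, U_v)\bigr)
  \;\le\; \tfrac{1}{2}\sqrt{2^{\,v-m}}
  \;=\; \tfrac{1}{2}\cdot 2^{-L/2},
\]
where $L = m - v$ is the entropy loss. In particular, statistical error
$\varepsilon$ follows once $L \ge 2\log(1/\varepsilon)$.
```

This is where the 2 log 1 over epsilon figure in the talk comes from: halving the distance exponent under the square root costs two bits of entropy per bit of security.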
For example, if we want to do Diffie-Hellman key exchange on an elliptic curve of the smallest possible size: we can control the size, but we still need the curve to be large enough to extract randomness from, and we don't want to unnecessarily increase the size of the elliptic curve to, say, 320 bits. So this is very important even when we control the source, in some sense. The other motivation is that, while as theoreticians we all love the Leftover Hash Lemma, in practice people are just going to use some cryptographic hash function like SHA, without appreciating that it wasn't designed to be a randomness extractor. We had a paper at Crypto '05 about the soundness of this idea, and if you really make this heuristic random-oracle assumption, such extractors are actually very good. I will put up a formula a little bit later, but just trust me, and I put it in quotes: these idealistic, non-existing cryptographic hash functions have "no entropy loss." So we have steep competition: we have entropy loss 2 log 1 over epsilon, and this guy has no entropy loss. As an end result, practitioners, even though the Leftover Hash Lemma is so beautiful, just say: we don't care, let's use cryptographic hash functions and not worry about anything, everything is fine. So our goal, at least our initial goal, is to reduce the 2 log 1 over epsilon entropy loss of the Leftover Hash Lemma, to make it closer to the zero entropy loss of heuristic extractors. The second question is: well, it's a great goal, you want to compete with the random oracle, good luck, but you just told us that this entropy loss is optimal, so how can we improve 2 log 1 over epsilon if it's already optimal? It turns out the proof of optimality of this entropy loss applies if you care about all possible computationally unbounded distinguishers, as we usually do in cryptography, but actually we care about a restricted class of distinguishers, at least for the task of key derivation, which is the point of this talk.
Informally, and I'll make it a little bit more precise, the kind of distinguisher we care about when meeting a cryptographic definition, say for some signature scheme, is one that plays a game between the attacker and a challenger, some well-defined game such as existential unforgeability, and the distinguisher outputs one if the attacker wins the game. So there are restrictions, there is structure on this distinguisher, and maybe the lower bound doesn't apply. To make it less abstract, let's look at a case study: key derivation for signatures or MACs. Here we know the signature scheme is secure with an ideal key, and what we hope is that if I plug in the extracted key instead of the ideal key, the success probability doesn't jump up too much, from epsilon to some epsilon prime. The key insight is that in the ideal model, the distinguisher almost never outputs one, because if it outputs one, the signature scheme is insecure in the ideal model, so we shouldn't use it at all, not just with extracted randomness but even with perfect randomness. So in this case the distinguisher is very restricted, because we know it almost never outputs one, so maybe we can circumvent the lower bound and improve it, and indeed this is what our results show. Let me first formalize the setting and then briefly state our results. We have an application P which needs a v-bit secret key R. In the ideal model we just sample a uniform key; in the real model we apply the extractor, where the min-entropy of the source is m = v + L, with L the entropy loss. We assume that the application is epsilon-secure in the ideal model, and we would like to conclude that the application is epsilon-prime-secure in the real model.
With this notation it is pretty straightforward to translate the Leftover Hash Lemma; I won't put in all the parameters, but trust me, the bound becomes: epsilon prime is at most the original security plus a term which turns out to be exactly the statistical distance we were talking about. Our result says that for a wide range of applications P, which I'll define two slides from now, the security is improved; in particular there is an extra factor of epsilon under the square root, which gives an improved bound. So the moral is: you might extract more if you know why you are extracting, which is a special case of a general philosophy in life, that if you know what you're doing, you can get better results, here applied to the setting of the Leftover Hash Lemma. Let me give a brief comparison between all the bounds, and take a look at the paper for the details; for now, just trust me that these are the bounds. This is the bound of the standard Leftover Hash Lemma, which says that if you want comparable security, the entropy loss has to be at least 2 log 1 over epsilon, exactly as I said. This bound, unfortunately, is also not meaningful if you need to extract more than what you have: even if you have a super-secure, 2 to the minus million secure application, suddenly all this 2 to the minus million security is lost when you have no entropy loss, when L is equal to zero. For the random-oracle heuristic, this turns out to be the right bound, and it is much better in multiple ways. In particular, as I told you, even when L equals zero you already get comparable security, and when L becomes more than zero you essentially retain epsilon. It is also meaningful even when L is less than or equal to zero, which is what we expect, because the application doesn't suddenly become completely insecure.
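For orientation, the three bounds being compared have roughly the following shape (my notation, constants suppressed; the random-oracle line in particular is my reconstruction of the slide, so see the paper for the exact statements):

```latex
\begin{align*}
\text{Leftover Hash Lemma:} \quad
  & \varepsilon' \;\le\; \varepsilon + 2^{-L/2}
  && \text{(need } L \ge 2\log(1/\varepsilon)\text{)} \\
\text{Random-oracle heuristic:} \quad
  & \varepsilon' \;\lesssim\; \varepsilon \,(1 + 2^{-L})
  && \text{(comparable security already at } L = 0\text{)} \\
\text{This work:} \quad
  & \varepsilon' \;\le\; \varepsilon + \sqrt{\varepsilon\, 2^{-L}}
  && \text{(comparable security at } L = \log(1/\varepsilon)\text{)}
\end{align*}
```

Sanity-checking the last line against the talk: at $L = \log(1/\varepsilon)$ the extra term is $\sqrt{\varepsilon \cdot \varepsilon} = \varepsilon$, giving comparable security, and at $L = 0$ it degrades gracefully to about $\sqrt{\varepsilon}$.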
For example, if the first bit of the key is always zero but the rest is truly uniform, you know you will lose at most a factor of two. So it is nice that the bound borrows a little bit from the application even if we extract more than what we have. And here is our result: we didn't quite match the heuristic random-oracle bound, but we at least made headway, roughly halfway in between. We already achieve comparable security when L is log 1 over epsilon, and just like the heuristic bound, our bound is meaningful for L less than or equal to zero; for example, with no entropy loss, maybe we don't get fantastic security, but at least we get something like the square root of epsilon. All right, so very briefly, which applications? All unpredictability applications, whenever the attacker has to forge something or output a long string: MACs, signatures, one-way functions, and so on. Also, perhaps the most surprising result, which we don't have time to talk about: we actually show that this holds for essentially all important indistinguishability applications. Not all, but most, like chosen-plaintext and chosen-ciphertext secure encryption schemes, and something called weak pseudorandom functions. Unfortunately, it doesn't work for all of them; for example, it doesn't work for pseudorandom functions, stream ciphers, and the one-time pad, which in some sense is expected. But I want to make a very quick note: it is okay to use one of those red applications as a building block for a green application, a good application. So it is okay to derive an AES key, even though AES is a PRP, which is red, if you use it to build a MAC or an encryption scheme; we only care about the final application. And a simple observation: if you do care about red applications, which some people do, here is a quick, simple trick.
You first extract a key for a weak PRF and evaluate it at a public random point; because the weak PRF is green, you get a key which is good for all applications, including the red ones, but not the one-time pad, because we use a computational primitive. So essentially we can cover all applications; the cost is one extra call to AES, and the seed becomes a little bit longer, because it needs to contain the random point. All right, let me move to the second part of the talk, which is improving the seed length, via an approach I call expand-then-extract. First, recall that the best seed length one could hope for is on the order of the security parameter, but unfortunately the Leftover Hash Lemma, as I just told you, requires a really big seed. So what do we do? If we want a short seed but need a big seed, apply a pseudorandom generator. The idea is: assume we have a pseudorandom generator G with a short seed S; we expand the seed and apply the Leftover Hash Lemma on top of it. Now we have a very short seed, which is very friendly to streaming sources, since the source can arrive online, and it can result in very fast implementations, because both stream ciphers and universal hash functions are blazingly fast. The hope is that the extracted bits are pseudorandom; we use a computational assumption, but that's fine. The question is: is this idea sound? At first it appears almost a triviality: of course it is sound, otherwise you break the generator. But the trivial implication is actually not what we want. If you think about it, it only tells you that the extracted bits are pseudorandom even if I give you the long pseudorandom seed. But I don't want to give you a long pseudorandom seed, because then I might as well have used a long truly random seed. What I want is that the output looks random even given the short seed which is used to expand into the big seed.
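The expand-then-extract pipeline can be sketched in a few lines. Everything here is a stand-in of my own choosing, not prescribed by the talk: SHAKE-256 plays the role of the fast expander (in practice a stream cipher would be used), and an `(a*x + b) mod p` family, truncated to v bits, plays the (almost-)universal hash.

```python
import hashlib

P = 2**521 - 1  # Mersenne prime; modulus for the (almost-)universal hash

def prg(short_seed: bytes, out_len: int) -> bytes:
    """Stand-in expander G: short seed -> long hash key.
    SHAKE-256 is used only because it is in the stdlib."""
    return hashlib.shake_256(short_seed).digest(out_len)

def hash_univ(big_seed: bytes, x: bytes, v_bits: int) -> int:
    """h_{a,b}(x) = ((a*x + b) mod P) mod 2^v, (almost-)universal
    for inputs treated as integers."""
    a = int.from_bytes(big_seed[:66], "big") % P
    b = int.from_bytes(big_seed[66:132], "big") % P
    return ((a * int.from_bytes(x, "big") + b) % P) % (1 << v_bits)

def expand_then_extract(short_seed: bytes, source_sample: bytes,
                        v_bits: int = 128) -> int:
    """Expand the short public seed into the long seed the LHL wants,
    then hash the imperfect source sample down to v bits."""
    big_seed = prg(short_seed, 132)
    return hash_univ(big_seed, source_sample, v_bits)

key = expand_then_extract(b"short-public-seed", b"noisy imperfect source data")
```

The subtlety the talk goes on to discuss is precisely that this composition is *not* automatically sound: the extracted bits must look random given the short seed, not just given the expanded one.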
So it is not so obvious, and indeed our first observation is that under a well-established assumption from public-key cryptography, the Decisional Diffie-Hellman assumption, there exists a pseudorandom generator, in fact a very natural one, and a universal hash function, again a relatively natural one, and therefore an extractor, such that this statement is not only false but as false as it can get: we can distinguish with probability almost one, on any source, even the uniform distribution. So unfortunately our brilliant idea is not sound in general. But I'm not going to stop here; despite this bad news, I have some good news. The first good news is that it is okay to extract a small number of bits. More specifically, despite this counterexample, which I didn't show you, it is okay to extract a number of bits which is, roughly speaking, the logarithm of the security of the pseudorandom generator; a bit more formally, the pseudorandom generator should be secure against attackers who run in time exponential in the number of output bits. Another thing, which is subtle but very surprising: remember I told you we expect to settle for computational security. It turns out here we actually still get statistical security. We use the computational assumption on the pseudorandom generator in a weird part of the proof, but we get a regular, standard extractor this way. There is a drop of a square root of epsilon, but that's a bit of a technicality; it is still a somewhat surprising result. The corollary is that if you want to extract a logarithmic number of bits, say one coin flip to decide if you go to a soccer game or to the opera, that's okay; or, if you assume exponentially strong pseudorandom generators, it is okay to extract even a linear number of bits.
Unfortunately, if you look at the seed length, at best it has to be the logarithm of the security of the pseudorandom generator, which turns out to be exactly this, and for that seed length there are already very simple relaxations of universal hash functions, called almost-universal hash functions, which achieve it. So it is nice that we can use this big hammer of pseudorandom generators, and surprising that the result still works given our counterexample, but the resulting seed length is still not the greatest, so I'm not going to stop here. The most surprising result, which is the last thing I'm going to tell you, shows that the expand-then-extract approach is secure in a hypothetical world called minicrypt. Let me explain this result and also convince you why we should care about it. The counterexample we had used a very simple, natural assumption, the DDH assumption, but that is an assumption from the public-key world, and here we extract keys for AES and so on, so it is kind of weird that we used a public-key assumption for a counterexample. The world minicrypt is one of these beautiful worlds of Russell Impagliazzo, where one assumes that symmetric-key cryptography is possible but public-key cryptography is impossible, as a thought experiment. Our theorem says that in this hypothetical world, where public-key encryption doesn't exist but pseudorandom generators do exist, this approach is always secure, and this is true for any number of extracted bits, so you don't need only a logarithmic number. Of course, as we should expect, now we need to settle for efficiently sampleable sources, which is fine, and for computational security, which is also fine and is what we were hoping for from the beginning. It is similar in spirit to some other beautiful results of this kind, but arguably simpler. So let me now expand this one-line theorem into a three-line theorem.
I can prove it in one minute, so the chair doesn't yell at me. So what are we trying to prove? We are trying to prove that the expand-then-extract approach is secure in minicrypt, and we prove it by contradiction. Assume not: assume there exists an efficiently sampleable source, some pseudorandom generator, and some distinguisher who can tell the extracted bits apart from random, so the approach is insecure. We need to get a contradiction, and a contradiction in the world of minicrypt means we need to build a public-key encryption scheme. The public-key encryption scheme is really simple: the secret key is a seed for the pseudorandom generator, and the public key is the output of the pseudorandom generator. To encrypt a bit, here is what I do: to encrypt one, I send a random string; to encrypt zero, I sample my source X, apply the extractor to X with the public key as the seed, and send the output. How do I decrypt? That is, of course, the big question, and here I use the hypothetical distinguisher, who now knows the short seed and can precisely distinguish encryptions of zero from encryptions of one. Semantic security follows from a trivial observation: if you don't know the secret key, only the public key, security follows trivially from the security of the pseudorandom generator. All right, so why do we care about this result? It is kind of a weird result; who cares about minicrypt? Here is a corollary, which is an equivalent restatement of the theorem. Let G be a pseudorandom generator, and assume there exists no public-key encryption scheme of a very restricted form: the secret key is a seed s, the public key is G(s), the encryption has pseudorandom ciphertexts, and its exact security is essentially exactly the security of our PRG. Then we know that the expand-then-extract approach is secure with this pseudorandom generator. And I argue, and this is the last thing before the conclusions, that such schemes are implausible.
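The bit-encryption scheme from the proof can be sketched directly. This is a toy rendering with stand-in components of my own: SHAKE-256 plays the PRG, and the `distinguisher` parameter is exactly the hypothetical adversary the contradiction assumes, so the demo wires in a deliberately broken "extractor" that a trivial distinguisher can tell from random.

```python
import os
import hashlib

def prg(s: bytes, n: int) -> bytes:
    """Stand-in PRG G (SHAKE-256 as an expander, for illustration only)."""
    return hashlib.shake_256(s).digest(n)

def keygen(seed_len: int = 16, pk_len: int = 64):
    sk = os.urandom(seed_len)   # secret key: the short PRG seed s
    pk = prg(sk, pk_len)        # public key: G(s), the expanded hash key
    return sk, pk

def encrypt(pk: bytes, bit: int, extractor, sample_source) -> bytes:
    if bit == 1:
        return os.urandom(16)               # Enc(1): a truly random string
    return extractor(pk, sample_source())   # Enc(0): the extractor's output

def decrypt(sk: bytes, ct: bytes, distinguisher) -> int:
    # Decryption runs the hypothetical distinguisher, which is given
    # the SHORT seed and tells extractor outputs from random strings.
    return 0 if distinguisher(sk, ct) else 1

# Toy demo: an obviously broken "extractor" (constant output), so a
# trivial distinguisher suffices and decryption works.
toy_ext = lambda pk, x: b"\x00" * 16
toy_src = lambda: b"sampled source"
toy_dist = lambda s, c: c == b"\x00" * 16

sk, pk = keygen()
assert decrypt(sk, encrypt(pk, 0, toy_ext, toy_src), toy_dist) == 0
assert decrypt(sk, encrypt(pk, 1, toy_ext, toy_src), toy_dist) == 1
```

Semantic security in the real proof comes from the talk's observation: without the seed, the public key already looks random, so both kinds of ciphertexts look random to an outsider.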
Practical PRGs, like those built from AES or stream ciphers, are unlikely to yield such a restricted and remarkable public-key encryption scheme. First, we don't even know a black-box construction of this very restricted kind. It is also possible, from what we know, that even if public-key encryption exists, it is in general much less secure than symmetric-key encryption; even in practice, RSA keys are around 2,000 bits while AES keys are 128 bits. So it is conceivable that, forgetting about the restricted form, we cannot even achieve this level of security at all. Overall, to build such a thing from AES would be really remarkable. So the moral is that we give formal evidence that the expand-then-extract approach might be secure, which is how practice actually uses stream ciphers. To summarize, we show how to improve the large entropy loss and seed length of the LHL. For the entropy loss, we go from 2 log 1 over epsilon to log 1 over epsilon for a lot of green applications, and using the trick with weak PRFs we can handle any application. For the seed length, the important result is that our approach seems to be secure in practice for all pseudorandom generators which don't imply a public-key encryption scheme. The paper is available on the Internet. Thank you very much. Okay, so we don't have time for questions, because we ran over a little bit. So where is the person