We now come to the randomness session. Anyone interested in randomness, please come in now. Okay, hello everyone. Welcome to the randomness session. The first talk in this session is about fuzzy extractors and when they are possible, by Benjamin Fuller, Leonid Reyzin and Adam Smith, and Ben will give the talk. Okay, thank you. Thanks everybody for coming. As Dominic said, this is joint work with Leo Reyzin and Adam Smith. So before I can tell you when fuzzy extractors are possible, I'll tell you a little bit about what they are. The goal of this talk is to understand when we can get key derivation from noisy sources. Noisy sources are things like physically unclonable functions and biometric data, so what we're considering is a class of entropy sources that have noise in them. We can take some initial reading from them that we're going to call W0, and this is not equal to a later reading: if we measure the same thing again at a later time, it's going to be a different value, W1. But these two readings aren't independent. We're going to assume that there's a bound on the distance, that the distance between W0 and W1 is at most T. For most of this talk we're going to consider the Hamming distance, but you can define this perfectly well for any distance metric. What we want is a cryptographically strong and stable output from these readings: W0 and W1 should map to the same output, and that output should look uniform to the adversary. If we can do this, we can use it to bootstrap authentication. A little more concretely, the setting looks like this: we have our traditional Alice and Bob. Say we have a baby who wants to use Facebook. It takes an initial reading W0 from its biometric data and sends this value over to Facebook. Then at some later time, maybe when it's grown up, it takes a later reading W1. Now Alice and Bob want to communicate, except that there's a non-tampering Eve observing their communication.
And because they have these values W0 and W1, they should be able to agree on a cryptographic key. This problem and variants of it have been studied for 40 years, starting with the seminal work of Wyner on the wiretap channel in 1975. Today we're going to consider the non-interactive setting, where there's a single message from Alice to Bob, and in this setting we're going to use the notion of fuzzy extractors. We have an enrollment algorithm run by Alice called generate. It takes this value W0 and produces a cryptographic key R as well as a non-secret value P. The point of this non-secret value is that we can then run reproduce when the distance between W0 and W1 is at most T: we take W1 as well as this public value P to get back our value R. The correctness condition is that reproduce should give us the same key if W0 and W1 were within distance T of each other. And the security condition is that R should look uniform given this public value, provided the source is good enough. What we're going to talk about in this talk is what "good enough" means. To start, we're going to go to a slightly more familiar setting: rather than asking when fuzzy extractors are possible, we're going to ask when extractors are possible. I hope people here are a bit more familiar with randomness extractors. Here we get rid of this distance T, and normally we call this value P a seed. In the randomness extractor game, we know that an adversary can always try to guess W0 in the distribution. Whatever our source looks like, they can pick some value, run it through the reproduce algorithm, and get some candidate key R. So the necessary condition for security is that every possible value of W0 has to have low probability. We codify this notion in something called min-entropy, which says that every point in the distribution has low probability.
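As a quick illustration (my own, not part of the talk): for a fully known finite distribution, min-entropy is just the negative log of its heaviest point.

```python
import math

def min_entropy(dist):
    """H_inf(W) = -log2(max_w Pr[W = w]): the adversary's best
    single guess at W succeeds with probability 2^(-H_inf)."""
    return -math.log2(max(dist.values()))

# Uniform over 8 points: every guess succeeds with probability 1/8,
# so min-entropy is 3 bits.
print(min_entropy({w: 1 / 8 for w in range(8)}))    # 3.0
# A heavy point dominates: max probability 1/2 gives only 1 bit,
# no matter how many other points there are.
print(min_entropy({0: 1 / 2, 1: 1 / 4, 2: 1 / 4}))  # 1.0
```

The second example is why the *maximum* probability, not the support size, is what matters for the guessing adversary.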
Min-entropy is necessary for security, and the nice thing about randomness extractors is that it's also sufficient, by the leftover hash lemma. We can show that universal hash functions actually work as extractors, although they have some limitations: they have a long seed and some losses associated with them. So for randomness extractors we have a really tight and clean condition that's both necessary and sufficient for security. Now, going back to the fuzzy extractor setting, we want to ask: what's the equivalent condition? In this setting, an adversary doesn't have to exactly guess W0. What they get to do is provide an input W1, and they should provide an input W1 that's near many possible values. So they don't have to guess any of these points; they can provide an input point here, and they win the game if the point they provide was close to the actually generated point. The equivalent condition is that every ball of radius T should have low probability: little weight of the distribution should reside in any ball. So we define an analogous condition that we call fuzzy min-entropy; where min-entropy measures the maximum-probability point, this measures the maximum-probability ball in the metric space. Now you might ask why we should bother with this new notion. We can show that fuzzy min-entropy is at least min-entropy minus the log of the size of the ball, so why not just measure things that way? In fact, that's what fuzzy extractors mostly did: they talked about your starting entropy, and they had losses associated with the log-size of the ball, due to losses from coding theory. The reason for the new notion is that there are many distributions occurring in practice where the log-size of this ball is actually greater than your starting min-entropy. For example, for the human iris, the ball of radius T that you need to correct has about 2^200 points.
Actually, it has about 2^500 points, and the min-entropy of the distribution is about 250 bits. So if you measure things that way, it says that the iris is impossible to secure. But this is what the distribution actually looks like, and if you look at the attacker who just provides an input point, this doesn't look too bad, because they're not going to capture a lot of W0. So our hope is that fuzzy min-entropy is a much tighter measure of whether a distribution is actually capable of being secured by a fuzzy extractor; maybe we actually have the setting where H_fuzz is greater than zero. An adversary can always try this attack, so it's pretty easy to show that fuzzy min-entropy is in fact necessary for security. Our goal is to make sure that that's all an adversary can do. In the computational setting, the answer is yes, using obfuscation; this is the work of Bitansky et al., who showed virtual grey-box obfuscation for NC1 circuits. Basically, inside of generate you pick a random key R, and the public value is an obfuscation of a proximity point program. Then in reproduce, you just send your input to this obfuscated program. For most natural distance metrics you can compute the distance in NC1, so the work of Bitansky et al. gives you a VGB obfuscator and thus a fuzzy extractor. And this construction is secure exactly when the distribution has fuzzy min-entropy. That's evidence that we're on the right path. So the question we ask in this talk is: what about information-theoretic adversaries? We show two things. One is that this measure is sufficient if the constructing algorithm knows the distribution completely. But we also show something I think is a little more interesting, which is that if the constructing algorithm only knows that the distribution comes from a family of distributions, this may actually be insufficient.
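To make the notion concrete, here is a brute-force computation of fuzzy min-entropy over the Hamming metric on short bit strings (an illustrative sketch of my own; the iris example above is far too large for this). The example distribution, uniform over the even-weight strings in {0,1}^4, is my own choice: it has 3 bits of min-entropy, yet every radius-1 ball around an odd-weight center captures half the distribution, so at t=1 the fuzzy min-entropy is only 1 bit.

```python
import math
from itertools import product

def fuzzy_min_entropy(dist, n, t):
    """H^fuzz_{t,inf}(W) = -log2(max_c Pr[W in B_t(c)]):
    the weight of the heaviest Hamming ball of radius t."""
    def ball_weight(center):
        return sum(p for w, p in dist.items()
                   if sum(a != b for a, b in zip(w, center)) <= t)
    return -math.log2(max(ball_weight(c)
                          for c in product((0, 1), repeat=n)))

# Uniform over the 8 even-weight strings in {0,1}^4.
even = {w: 1 / 8 for w in product((0, 1), repeat=4) if sum(w) % 2 == 0}
print(fuzzy_min_entropy(even, 4, 0))  # 3.0 -- plain min-entropy
print(fuzzy_min_entropy(even, 4, 1))  # 1.0 -- one ball grabs 4 of 8 points
```

The gap shows why fuzzy min-entropy, and not min-entropy minus the log ball size, is the right quantity: here min-entropy minus log2(5) would be about 0.68 bits, an underestimate of the true 1 bit.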
And we have several variants of this impossibility theorem. The sufficiency theorem uses universal hashing techniques with a little more sophistication; I'm not going to spend much time on it today. What I want to focus on is the insufficiency. The reason we want to focus on this case of distributional uncertainty is that it's almost always what we're working with. In the randomness extractor setting, you want to provide security for all sources with a certain amount of min-entropy. Here, we'd like to provide security for all sources with a certain amount of fuzzy min-entropy: the same kind of condition as in the extractor setting. There are many reasons why you would want this. The first is that if you're constructing an algorithm, you've seen some samples from a high-entropy distribution; you don't know the entire thing, so there are many possible distributions that could fit what you haven't seen. Second, it may be that Eve knows more about your distribution than you do: once you come up with a construction, she can spend more time and money to learn how the iris works and build an underlying model for it. And the third thing is that you can't hope to have a computationally efficient construction if you have to record all 2^500 or 2^1000 points in the distribution. So this setting of not knowing exactly what our distribution is, is almost always the one we want. We're going to ask whether you can always build such a thing for a family of distributions, and we're going to allow the fuzzy extractor construction to be computationally unlimited. As a warm-up to showing that you can't always build fuzzy extractors for a family of distributions, we're going to look at a slightly weaker object. The reason is that fuzzy extractors are normally built by doing two things. The first is that in the generate algorithm, they use a standard randomness extractor to get the key.
And this extractor is also run in reproduce. The main thing you're trying to do is provide the correct input to the randomness extractor, so there's something called a secure sketch that performs something called information reconciliation. If you haven't heard the terms privacy amplification and information reconciliation, don't worry about them. What a secure sketch does is produce this public value P that, along with W1, gets W0 back. The whole goal of the secure sketch is that P shouldn't reveal a lot of information about W0; it should only contain enough information to recover W0 from a nearby W1. So we're first going to show that it can be impossible to build this for a family of distributions, and that will provide intuition for how the argument works for general fuzzy extractors. But I should say, almost all fuzzy extractors are built this way. So let's look at how a secure sketch would work for a family of distributions. When I'm actually creating the algorithms, the only thing I see is my input point W0. I may have a guess for what other points in the distribution exist, but the only thing I actually see is W0. What I do know, by my correctness condition, is that if I'm given a nearby W1, I have to get this point back, at least with good probability. I don't know what the other points out here are. It could be these points, but I do know that for anything nearby, I have to get this point back. So this correctness condition of mapping back here has to hold whether these are the correct points, these are the correct points, or these are the correct points. If we have a family of distributions that all intersect at W0 but have all these different other points, any of them could be the truth, and I don't know anything about them. We need to recover W0 from the new value W1 regardless of which distribution the original W0 came from, and that's just because of correctness.
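To ground the interface, here is a toy secure sketch in the classic code-offset style, built from a bit-repetition code. This is a minimal sketch of my own for illustration, not the paper's construction, and a repetition code leaks far too much to be used in practice. The public value is W0 XOR-ed with a random codeword; anyone holding a W1 within the code's error-correction radius can shift by P, decode to the codeword, and shift back.

```python
import secrets

R = 5            # repetition factor: corrects up to 2 flips per block
K = 4            # message bits in the random codeword
N = K * R        # length of the noisy readings w0, w1

def encode(msg):
    """Repetition code: repeat each message bit R times."""
    return [b for b in msg for _ in range(R)]

def decode(word):
    """Majority vote within each block of R bits."""
    return [int(sum(word[i * R:(i + 1) * R]) > R // 2) for i in range(K)]

def sketch(w0):
    """Code-offset sketch: publish P = w0 XOR (random codeword)."""
    c = encode([secrets.randbelow(2) for _ in range(K)])
    return [a ^ b for a, b in zip(w0, c)]

def recover(w1, p):
    """Shift w1 by P, decode to the nearest codeword, shift back."""
    c = encode(decode([a ^ b for a, b in zip(w1, p)]))
    return [a ^ b for a, b in zip(c, p)]

w0 = [secrets.randbelow(2) for _ in range(N)]
p = sketch(w0)
w1 = list(w0)
w1[0] ^= 1
w1[7] ^= 1                      # two bit flips, in different blocks
print(recover(w1, p) == w0)     # True
```

Note how correctness is exactly the property discussed above: recovery works for any W1 near W0, regardless of what the rest of the distribution looks like, and the entropy loss of the sketch depends on the code, not on the source.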
Now, to provide security, a good sketch algorithm is not going to make this the only ball with this kind of decoding to the center. The algorithm is going to create other points in the metric space where everything nearby maps to the center of a ball, because those provide other candidates for the value of W0, which is necessary to hide W0 from Eve. So what a good sketch algorithm is going to do is tile the metric space with as many of these points as possible. And it seems like it's done a really good job here: there are a lot of points of this distribution which lie at the centers of these balls. The issue is that I didn't actually know what the other points of the distribution were. If I look at one of those alternate distributions I showed you, those points are no longer at the centers of balls. The only point that's at the center of one of these balls is the actual point W0 that was used in generate. And no matter which distribution W0 came from, it's going to lie at the center of one of these balls. This is the core of the impossibility result: few points from each distribution in the family lie at the center of a ball, but we know the true sketched point is always at the center of a ball. So as an adversary, if we know the exact distribution, we can just look at the intersection of that distribution with the points that lie at the centers of the balls. What this allows us to show is that there's a family of distributions with linear fuzzy min-entropy such that any secure sketch construction that has to handle a linear error rate retains only two bits of entropy. This is bad. And this holds even if the sketch only has to be correct 50% of the time. So this is a really strong impossibility result.
The crux of the main technical result is that we need to build a family of distributions where each distribution has high fuzzy min-entropy and the distributions share very few points. But you might say: we don't need to build a secure sketch; we can do something entirely different, and maybe a lot more clever than this secure sketch construction. So I'm going to give a brief summary of how you might prove an impossibility for a general fuzzy extractor. Consider a fuzzy extractor that just outputs a one-bit key R. Again, we have our lovely metric space here, and what the reproduce algorithm does is partition this metric space into values W1 where reproduce outputs the key zero and values where it outputs the key one. Furthermore, the adversary, knowing the public value P, knows this partition; they can see it. Now, the point is that our value W0 could not have lived near the boundary of this partition. If W0 lived near the boundary, we'd have a ball that crosses the boundary, and all these points in red would no longer give us the correct key. So we can actually rule out everything close to the boundary as a possible input W0, and this is just because of correctness. For the metric space I've drawn, this doesn't look too bad, but you can show that for high-dimensional metric spaces, in particular for the high-dimensional Hamming metric, almost everything lives near the boundary. So this isn't a small fraction of the space; it's almost everything. Combined with knowledge of the distribution, the fact that so many points are ruled out by being close to a boundary leaves very few possible guesses for the value W0.
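The concentration phenomenon this argument relies on is easy to see numerically. The toy computation below (my own illustration, not the paper's proof) partitions {0,1}^n into two classes by majority vote and counts, exactly via binomial coefficients, the fraction of points within Hamming distance t of the other class; even for modest n and t, most of the cube sits near the boundary.

```python
from math import comb

def near_boundary_fraction(n, t):
    """Fraction of {0,1}^n within Hamming distance t of the other
    class, when the cube is split by majority vote (n odd): these
    are exactly the strings of weight in [n//2 + 1 - t, n//2 + t]."""
    lo, hi = n // 2 + 1 - t, n // 2 + t
    return sum(comb(n, k) for k in range(lo, hi + 1)) / 2 ** n

print(round(near_boundary_fraction(15, 2), 3))    # ~0.698
print(round(near_boundary_fraction(101, 10), 3))  # well above 0.9
```

Majority vote is just one partition, but by measure concentration in the Hamming cube, any balanced partition behaves this way in high dimension: the "safe" region far from the boundary is a vanishing fraction of the space.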
So here we get a slightly weaker result, which says that there's a family of distributions with linear fuzzy min-entropy, again handling a linear error rate, such that you can't get even a three-bit key out. I should note one caveat: this result requires the fuzzy extractor to be 100% correct, whereas our previous result only needed about 50% correctness. So that's our second main result. We have a third variant, which builds off a result of Holenstein and Renner from about 10 years ago, who showed the impossibility of key agreement from bit-fixing sources. We get a similar result to our previous theorem, but we can remove the requirement for 100% correctness at the cost of slightly worse parameters, hidden by the asymptotics: the required quality of the sources goes down and they have to support a higher error rate. So, what we show in this talk is a new and necessary notion for deriving keys from noisy sources that we call fuzzy min-entropy. It measures the maximum-probability ball in the metric space, which is exactly the adversary's ideal success if we give them oracle access to the functionality we've provided. This notion is sufficient with computational security if you believe in obfuscation; it's an assumption. It's sufficient information-theoretically if the distribution is known, although our construction there is very much not a polynomial-time construction. And it's insufficient if you don't know what the distribution looks like. This is in contrast to the randomness extractor setting, where you don't have to know the distribution and you're still able to do the task of key derivation; there's a big difference between the two settings. We have several variants of this insufficiency result, both for secure sketches and for fuzzy extractors. And the question I'll leave you with is this: it seems like these results are guiding us toward computational security as the right thing for fuzzy extractors.
But right now the construction we have requires a very strong assumption. So we want to ask: can you provide computational fuzzy extractors for sources with fuzzy min-entropy based on something a little weaker than general obfuscation? With that I'll stop and take questions. Any questions? The next speaker could already come forward. Great talk. So in your theorem, basically, you find a specific family of distributions and you can break the construction for that family. But what if we always work with the largest possible family, which is essentially all possible distributions with a certain amount of entropy? Does your result apply there? So the short answer is no. It is worth saying that, right, a single distribution like the fully uniform one isn't a family of distributions, right? Because it... But for example, suppose we just define the largest possible family: all distributions with at least a certain amount of entropy. Okay, so our proof does not extend to that setting, but I think the problem is still there, and the problem is, at least for secure sketches (I'm not sure about fuzzy extractors), that you're designating points to be distinguished based on being at the center of a ball, and you're choosing those points without knowing what the rest of the distribution looks like. The fact that we had to build a construction for a particular family, where we can show the distributions don't overlap very often, may make it a weaker result, but I think you can extend it to essentially any distribution or any family; we just haven't done it. Can I ask one more question? Actually, we have a bit of time to catch up from the session before, so I suggest that further questions are taken offline. Thanks again for a great talk.