Hello, and welcome to my Eurocrypt 2020 talk on how to extract useful randomness from unreliable sources. My name is João, I'm a third-year PhD student at Imperial College London, and this is joint work with Divesh Aggarwal and Maciej Obremski at CQT at the National University of Singapore, Luisa Siniscalchi, who is now at Aarhus University, and Ivan Visconti from the University of Salerno.

So there's a fundamental connection between randomness and cryptography, and also many other areas of computer science, and a lot of tasks fundamentally require randomness in order to be realized. Additionally, a lot of these tasks also require access to a stream of perfectly random, or uniformly random, bits. But this is not very realistic, because in practice sources of randomness are not perfect. For example, they may introduce correlations between different output bits, or the bits themselves may not be uniformly distributed.

So how do we model such weak sources? A well-known and very well-studied way of doing this is simply to assume a lower bound on the min-entropy of the source, which is a much more relaxed assumption. We say that a weak source has k bits of min-entropy, for some k, if the probability of any outcome of the source is upper bounded by 2^(-k). For example, if we're dealing with n-bit strings, then k = n means we're working only with the uniform distribution, but if k is much smaller than n, then this is a much weaker assumption on the structure of the source.

Now that we have this model for weak sources, a basic and very important question we can ask is whether we can extract perfect randomness from an arbitrary weak source. In other words, can we design a deterministic function, an extractor Ext, that takes as input an arbitrary weak source with k bits of min-entropy and outputs something that is statistically close to uniform? It's easy to see that this is impossible to achieve, so we need to consider alternative settings for randomness extraction. One of the most well-studied settings is what I call here the multi-source setting, where we have access to several independent weak sources, which you may think of as sampled from different devices at different locations. This setting will be the main focus of this talk.

So here's the multi-source setting again: we have l independent weak sources, each with k bits of min-entropy. If we accept this model, then we're implicitly trusting the sampling processes that were used to generate each of these weak sources, and this is especially critical when we're dealing with public or shared randomness, as is often the case in cryptography. The starting point for this work is the question of what happens, or what we can still achieve, if some of these sources are corrupted by an adversary. Note that in this case we lose the guarantee of independence, and we also lose the general lower bound on the min-entropy, because sources that are adversarially corrupted may even have zero bits of min-entropy.

So for the rest of the talk, we're first going to see how to model this adversarial setting, and then we're going to discuss randomness extraction. The way we model this adversarial multi-source setting is through what we call SHELA sources, where SHELA stands for Somewhere Honest Entropic Look-Ahead.
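For concreteness, here is the standard min-entropy definition the talk relies on, written out in symbols:

```latex
% Min-entropy of a random variable X over n-bit strings:
\[
  H_\infty(X) = -\log_2 \max_{x \in \{0,1\}^n} \Pr[X = x].
\]
% X is a weak source with k bits of min-entropy (a "k-source") iff
\[
  H_\infty(X) \ge k
  \quad\Longleftrightarrow\quad
  \Pr[X = x] \le 2^{-k} \ \text{for every } x \in \{0,1\}^n,
\]
% so k = n forces X to be uniform, while k much smaller than n
% allows far more structure in the source.
```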
So a SHELA source is composed of l weak sources, which we'll call blocks to avoid confusion, and for starters we may assume that these blocks are all independent and have k bits of min-entropy. But then we allow an adversary to choose l - t blocks to corrupt; for example, here the adversary could corrupt x2 and xl. So t stands for the number of honest, or uncorrupted, blocks in the SHELA source. When the adversary corrupts a block, it has full control over it and can set its value to be anything. On top of that, there's a sampling order on the SHELA source: sampling goes from x1 to xl, which means that the adversary can fix the value of a corrupted block arbitrarily based on the previous samples. In this example, we sample first from x1, and then x2 is controlled by the adversary, so the adversary can set it to an arbitrary value based on x1. Then we sample from x3, and so on, until we get to the last block, which is again controlled by the adversary, and so the adversary can set its value to be an arbitrary function of all the previous samples. The outcome of this process is the output of the SHELA source.

I'd like to stress that throughout this whole process the adversary knows the positions and the distributions of the t honest blocks, and that these honest blocks are all independent of each other and have k bits of min-entropy. I'd also like to say that, although we motivate this with multi-source extraction, the SHELA model is also a very natural model for randomness extraction from a blockchain.

Before we proceed, I'd just like to mention that SHELA sources fit very naturally into the long line of research on adversarial source models, which started off with seminal works on Santha-Vazirani sources and bit-fixing sources, and which has also seen renewed interest in many recent papers, including another paper at this Eurocrypt. The SHELA model and the type of results we obtain are incomparable to the works I referenced here, so for the rest of the talk I'll show you what kind of new perspectives we bring to this topic.

So the most basic question we can ask about SHELA sources is whether we can extract perfect randomness from them, like we could in the analogous non-adversarial setting. Here we're going to look at the regime where we have a constant fraction of adversarial corruptions and the number of SHELA blocks is larger than some fixed constant. In this case, the answer turns out to be no, we cannot extract perfect randomness. At a high level, we can see this by relating the impossibility of extraction for SHELA sources to the impossibility of extraction for so-called online non-oblivious bit-fixing (online NOBF) sources, which were also introduced recently. In turn, the impossibility result for online NOBF sources goes all the way back to the impossibility result for a special subset of Santha-Vazirani sources, and this impossibility result is quite strong, in the sense that it holds even when the honest blocks of a SHELA source are perfectly uniform.

This negative answer leaves us in an interesting state of affairs, so what we do is change the question: instead of asking whether we can extract perfect randomness, we're going to ask whether we can extract some sort of useful randomness from SHELA sources, where by useful I mean randomness that can still be used to run a lot of applications.
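To make the sampling order concrete, here is a minimal toy sketch of the SHELA sampling process in Python. The names `honest_samplers` and `adversary` are hypothetical placeholders for the honest distributions and the adversarial corruption strategy, both of which the model treats as black boxes:

```python
import secrets

def sample_shela(l, honest_samplers, adversary):
    """Toy sketch: sample one output of a SHELA source x_1, ..., x_l.

    honest_samplers: dict {position: zero-argument sampler}, one
        independent sampler per honest block, each promised to have
        at least k bits of min-entropy.
    adversary: function (position, previous_blocks) -> block, used at
        every corrupted position; it may depend arbitrarily on all
        blocks sampled so far (the online "look-ahead" corruption).
    """
    blocks = []
    for i in range(1, l + 1):
        if i in honest_samplers:
            # Honest block: sampled independently of everything else.
            blocks.append(honest_samplers[i]())
        else:
            # Corrupted block: arbitrary function of previous samples.
            blocks.append(adversary(i, tuple(blocks)))
    return blocks

# Example: l = 3, honest blocks x1 and x3 are uniform 16-byte strings,
# and the adversary simply copies the most recent previous sample.
out = sample_shela(
    l=3,
    honest_samplers={1: lambda: secrets.token_bytes(16),
                     3: lambda: secrets.token_bytes(16)},
    adversary=lambda i, prev: prev[-1],
)
```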
What we'll see is that the answer is yes, because now we'll focus on the next best thing after uniform randomness, which is somewhere-random sources. A somewhere-random source is composed of several blocks, say l' blocks, and the only guarantee we have is that there is a subset of t' good blocks, in the sense that the joint distribution of these t' good blocks is perfectly uniform, while the blocks outside this subset may be arbitrarily correlated with the good blocks. For technical reasons, we'll be working mostly with convex combinations of somewhere-random sources, which we call convex SR sources, but this doesn't hurt us, because essentially any application that works well with an arbitrary somewhere-random source will also work well with an arbitrary convex combination of somewhere-random sources.

So for the rest of the talk, I'll show you, first, that somewhere-random sources are really useful, with some concrete applications, and second, that starting from a SHELA source with very bad parameters we can extract great somewhere-random sources, so we can run a lot of applications starting from SHELA sources.

Okay, so let us start with the applications, which will also allow us to better understand which parameters of a somewhere-random source are particularly important. The most basic application of somewhere-random sources is to the simulation of randomized algorithms with one-sided error. As an example, take a randomized algorithm with one-sided error for deciding a language L. When this algorithm receives some x in the language, it always outputs yes, which means it's always correct, but when x is not in the language, the algorithm may be incorrect with probability one third. In general, this property is only guaranteed under uniform randomness, so if we feed a weak source to the randomized algorithm, then, when x is not in the language, the algorithm may always be incorrect under such defective randomness.

However, there's a very simple way of modifying this algorithm so that, even if we give it a somewhere-random source instead of uniform randomness, it will still be a one-sided error algorithm for deciding the language. The way this works is very simple. We have our somewhere-random source, and we're just going to run the original algorithm several times: first we run it on input x using the randomness from the first block of the somewhere-random source, then we do the same using the randomness from the second block, and so on until we have run it with all the blocks. Now we have a bunch of outputs from the original algorithm, and the way we decide on our final output is simple: we output yes if and only if all the runs of the algorithm output yes. It's easy to see that this is also a one-sided error algorithm for deciding the language L, essentially because we're guaranteed that one of the blocks of the somewhere-random source is uniformly distributed; for that particular block, the algorithm behaves correctly on input x, and that's all we need.

In order to understand which parameters are important, let us look at the runtime of this modified algorithm. If the original algorithm runs in time T_A, the new algorithm runs in time l' times T_A, where l' is the number of blocks in the somewhere-random source.
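Here is a minimal sketch of this modification in Python; `algorithm` stands for the assumed one-sided error decider, taking the input and an explicit block of randomness:

```python
def decide_with_sr_source(algorithm, x, sr_blocks):
    """One-sided error decision using a somewhere-random source.

    algorithm(x, r) -> bool is assumed to be a one-sided error
    decider: with uniform randomness r, it always answers True when
    x is in the language, and answers True with probability <= 1/3
    when x is not in the language.

    We run it once per block and AND the answers.  If x is in the
    language, every run says True, so we say True.  If x is not in
    the language, the guaranteed uniform block yields a run that says
    True with probability <= 1/3, so we wrongly say True with
    probability <= 1/3 -- one-sided error is preserved.
    """
    return all(algorithm(x, r) for r in sr_blocks)
```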
Given this, it's fairly natural that, when we extract somewhere-random sources, we want to minimize the number of blocks in the resulting somewhere-random source, and at the same time we want to maximize the number of bits in each block, because we would then have more random bits to feed into an algorithm or application that needs them.

I'm just going to briefly mention that we can take this one-sided error idea further and apply it to cryptography. In our paper, we use this idea to design non-interactive primitives that work with a somewhere-random source as a CRS, as opposed to a perfectly uniform CRS. In particular, from generic assumptions, we construct non-interactive witness-indistinguishable proof systems and non-interactive commitments, and there's also another work where publicly verifiable proof systems are discussed in the same setting. So, as we can see, there are a lot of applications of somewhere-random sources.

Now that we've seen these applications, we can dig deeper into somewhere extraction from SHELA sources. The goal here is to design a deterministic function, which we call a somewhere extractor, that takes as input an arbitrary SHELA source with given parameters and whose output is statistically close to a convex combination of somewhere-random sources. Based on what we've seen before, we have a good idea of which parameters we're interested in: we want to minimize the number of output blocks l', which is the number of blocks in the resulting somewhere-random source; we want to minimize the statistical error, because this is important especially for cryptographic applications; and at the same time we want to maximize the output block length m, which is the number of bits in each block of the resulting somewhere-random source.

We're going to start with a naive construction. One way of designing a somewhere extractor is just to take any two-source extractor and apply it to every pair of blocks of the SHELA source. The reason this works is that, as long as there are two honest blocks in the SHELA source, the output of the extractor on those two honest blocks will be close to uniform, and this is enough to get us something close to a somewhere-random source. However, there are some downsides to this naive construction. First, the number of blocks in the resulting somewhere-random source is quadratic in the number of blocks of the SHELA source; second, we can only get negligible error, which is what we need for cryptography, when the min-entropy of the honest SHELA blocks is very large.

So, a natural question: can we do better than this? The answer is that yes, we can do much better, and I'm going to present this to you right now. Here's our new somewhere extractor for SHELA sources. I'm going to present it in the high-entropy setting, so the honest blocks of the SHELA source have very high min-entropy. Say we have these five blocks, two of which are corrupted by the adversary. Now, instead of using any two-source extractor, we're going to use a very special object, an unbalanced two-source extractor, which has the property that the left source can have low entropy while the right source must have high entropy. There's a beautiful construction of this object by Raz, but we can also construct such extractors from strong seeded extractors. The way our approach works is through a kind of sliding-window technique.
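As a sketch, the naive construction looks as follows in Python; `two_source_ext` is the assumed black-box two-source extractor:

```python
from itertools import combinations

def naive_somewhere_extractor(two_source_ext, shela_blocks):
    """Naive somewhere extractor: apply a two-source extractor to
    every pair of blocks.  If blocks i and j are both honest, they
    are independent with enough min-entropy, so the corresponding
    output block is close to uniform.  The cost: roughly l^2 / 2
    output blocks for l input blocks.
    """
    return [two_source_ext(xi, xj)
            for xi, xj in combinations(shela_blocks, 2)]
```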
Starting with the first output block, we look at the first two SHELA blocks: we take the first SHELA block as the left source of the unbalanced extractor and the second SHELA block as the right source. For the second output block, we slide the window, which means that now we take the first two SHELA blocks together as the left source and the third SHELA block as the right source. We can keep going in this fashion to get the next two output blocks.

What I claim here is that the output blocks whose right source is an honest block are jointly close to uniform. In other words, I claim that y2 and y4 are jointly close to uniform, because y2 has x3 as its right source, y4 has x5 as its right source, and both x3 and x5 are honest.

So let's see why y2 is close to uniform. The argument is not very complicated. First, we look at the left source and note that it contains enough min-entropy, because x1 is honest. Second, by the properties of the SHELA source, x3 is independent of the previous blocks and has high min-entropy. Now we want to see that y2 and y4 are jointly close to uniform, and this is also simple: we consider an arbitrary fixing of y2 and show that, with high probability over this fixing, y4 is close to uniform. Again, we follow the same type of argument: we look at the left source and note that it still contains enough min-entropy, even when we condition on all the previous output blocks; and if we look at the right source, x5, then again, by the properties of the SHELA source, because the adversary behaves in an online way, x5 is independent of all the previous blocks and has high min-entropy, which means we're done.

We can extrapolate this to the general setting, and what we get is an output that is statistically close to a convex combination of somewhere-random sources with very good parameters. For example, the number of blocks of the resulting somewhere-random source is now linear in the number of SHELA blocks, while in the naive construction it was at least quadratic. The number of good blocks in the somewhere-random source, which is useful for success probability amplification, as in the one-sided error algorithms you saw before, is only one less than the number of honest blocks in the SHELA source, which means that this construction gives a useful somewhere-random source even when there are only two honest blocks in the SHELA source. Besides that, the statistical error and the output block length of our somewhere extractor are essentially the best we can hope for.

Now, the somewhere extractor we just saw has great parameters, but it has the downside that it really only works if the honest blocks of the SHELA source have very high min-entropy.
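Here is a minimal sketch of the sliding-window construction; `unbalanced_ext` stands for the assumed unbalanced two-source extractor (for instance Raz's), treated as a black box over byte strings:

```python
def sliding_window_somewhere_extractor(unbalanced_ext, shela_blocks):
    """Sliding-window somewhere extractor (high-entropy setting).

    Output block y_i uses the concatenation x_1 ... x_i as the
    (possibly low-entropy-rate) left source and the fresh block
    x_{i+1} as the (high-entropy) right source.  If x_{i+1} is honest
    and some earlier block is honest too, the left window has enough
    min-entropy and, by the online corruption order, x_{i+1} is
    independent of it, so y_i is close to uniform.  With t honest
    blocks this gives at least t - 1 good blocks among the l - 1
    output blocks.
    """
    out = []
    for i in range(1, len(shela_blocks)):
        left = b"".join(shela_blocks[:i])   # growing window x_1 .. x_i
        right = shela_blocks[i]             # next fresh block x_{i+1}
        out.append(unbalanced_ext(left, right))
    return out
```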
However, the good news is that we can modify this construction so that it also works for low-entropy SHELA sources, in other words, SHELA sources where the honest blocks have min-entropy delta times n, for delta an arbitrarily small constant. I won't give details, but the rough idea is that we can combine the high-entropy construction with so-called somewhere condensers, which are relaxations of somewhere extractors that actually work for arbitrary weak sources and which have been well studied in the literature; there are several beautiful constructions with great parameters. The idea is that we first run the SHELA blocks through somewhere condensers and then apply ideas similar to what we saw on the previous slide. The outcome is that we get essentially the same parameters, except for some extra constant factors depending on delta, but the important thing is that we now handle low-entropy SHELA sources, and this is a huge improvement over the naive somewhere extractor.

Now, to finalize this talk, I just want to discuss with you one last impossibility result. The somewhere extractors we just saw exploit the structure of SHELA sources in a very fundamental way, so a natural question is whether we can still extract useful somewhere-random sources without exploiting that structure; in other words, can we extract useful somewhere-random sources if we just treat a SHELA source as an arbitrary weak source with comparable length and comparable entropy?

In this case, there is one naive somewhere extractor, obtained as follows: take any strong seeded extractor and apply it to X, where you treat X as an arbitrary weak source with sufficient entropy, and you output one block for each value of the seed. The outcome of this somewhere extractor will be statistically close to a somewhere-random source, but the downside is that extractor lower bounds imply that, whatever extractor you use here, if you want negligible statistical error, then you need a super-polynomial number of blocks in the somewhere-random source. This means that the resulting somewhere-random source cannot be used in efficient applications, so it's not a useful somewhere-random source.
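For illustration, here is a toy sketch of that seed-enumeration construction; `strong_seeded_ext` is the assumed strong seeded extractor and `seed_bits` its seed length:

```python
def seed_enumeration_somewhere_extractor(strong_seeded_ext, x, seed_bits):
    """Naive somewhere extractor for an arbitrary weak source x:
    output one block per seed value of a strong seeded extractor.
    The result is close to a somewhere-random source, but it has
    2**seed_bits blocks, and the lower bounds discussed in the talk
    force this count to be super-polynomial once negligible error is
    required -- which is exactly why this source is not useful.
    """
    return [strong_seeded_ext(x, seed) for seed in range(2 ** seed_bits)]
```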
Of course, this is one very specific somewhere extractor for weak sources, so the question now is: can we do better for weak sources? It turns out that the answer is no. For example, one of the results we prove is that any somewhere extractor for arbitrary weak sources, with length and entropy comparable to SHELA sources, requires a super-polynomial number of output blocks in the resulting somewhere-random source if you want negligible statistical error, which we do for cryptographic applications, and if the output block length of the somewhere extractor isn't extremely small. What this means is that we cannot extract useful somewhere-random sources from arbitrary weak sources. The way the proof goes is that we relate somewhere extractors for weak sources to so-called randomness dispersers, for which very good lower bounds are known, and then we translate these lower bounds to the somewhere-extraction setting. I'd like to mention that, although we present only this result here, in the paper we actually have impossibility results that apply to even weaker notions of somewhere extraction from weak sources, which means we have a very strong separation between somewhere extraction from SHELA sources and somewhere extraction from comparable weak sources.

Now, to finalize, I'd just like to leave you with one interesting open problem. The result we presented here only gives something interesting when m isn't very small, but it would be interesting to prove an analogous result even when the output block length of the somewhere extractor is extremely small, say m = 1 bit. We have some partial results about this in the paper and in follow-up work, but we still haven't settled the general case.

Okay, so we've reached the end of the talk, so let me just sum up what we saw. First, we saw that SHELA sources are a natural model for an adversarial multi-source setting, and that, although we cannot extract perfect randomness from SHELA sources, we can extract somewhere-random sources with great parameters even from a SHELA source with very bad parameters, for example one with only two honest blocks. We also saw that somewhere-random sources are extremely useful for algorithms and for cryptography, and that we really need to exploit the structure of SHELA sources to extract useful somewhere-random sources. That's all from me. Thanks for watching, and bye-bye!