Hi everyone, I'm Mihir. This is joint work with Daniel Kane and Phil Rogaway, and we're interested in using gigantic keys as a means to inhibit key exfiltration. The backdrop for all this, for us, was the Snowden revelations a few years back, which showed large government programs aimed at compromising privacy. And it led us to kind of rethink what adversaries should look like in talks and papers in our community. We kind of view them as little pictures that are more cute than scary, and we set up the rules that they follow. We tell them they have to, in our models, mount only certain types of attacks, do this and not do that, and they obey. And somehow the real world now looks a little different. Not only does the adversary look a little scarier, but it has enormous resources, billions of dollars in budget, tens of thousands of employees, and it's doing things that don't follow our rules. It's planting malware, opening packages and installing things while computers are in transit, planting backdoors in standards, and so forth. So in this brave new world, one thing we thought we might consider is what's sometimes called an advanced persistent threat; according to Wikipedia, it's malware planted on your computer that aims, in particular, to exfiltrate your key. So for example, if you're using a symmetric encryption scheme like here, then the malware sitting here has direct access to your key and could simply try to use your network to send it out, where it would be picked up by an accomplice adversary. And knowing the NSA's capabilities in malware, this looked like something of potential interest. Once an adversary can do this, it may seem like you're lost; there's nothing you can do if your keys can simply be exfiltrated. Adi Shamir disagreed. At an RSA conference a few years back, he suggested that you make secrets incredibly large.
And since you might imagine that there are some limitations on exfiltration (it costs more, it's more difficult and takes longer to exfiltrate a lot of stuff than a small amount, and it's also possible to detect it if it's going on for a long time), this might get us some protection. In the case that the secret is a key, this corresponds to a model that was defined well before the Snowden revelations or Adi's enunciation of that idea, in prescient work in the theory community starting with Dziembowski in 2006. And it evolved through a series of papers into what's now called the bounded retrieval model. And this quote by Alwen, Dodis and Wichs, in a survey from 2010, says exactly the same thing: if you imagine that the malware can exfiltrate, say, 10 gigabytes, you put some limit on it, then by making keys longer than that, you might hope to get some security. And the model has kind of three components. First of all, very large keys. Secondly, leakage, corresponding to what the adversary exfiltrates; the only assumption about it is that it's shorter than the length of the key, but otherwise it's arbitrary information about the key. And you want to maintain security even in the presence of that leakage. When you have a key as big as this, it would be very impractical to have to run a scheme that actually processes all of it for every encryption. And so the third requirement, locality, says that when you use the key for one encryption, you make, say, a small number of probes into it, 500 or so, and thereby, using only a limited part of the key, you get some efficiency. So they treated a large number of problems in this model, including being able to give public-key encryption schemes. So we came into this asking whether something like this could be a practical defense against malware APTs. And in that framework, we decided to look at the problem with a bit of a practical slant.
And for that, for us, that meant, first of all, looking at symmetric rather than asymmetric encryption, as the primitive more commonly used. And hence what would happen is that you would have an encryption function here, which takes the message; it has access to the big key, and the decryption function at the other end also has access to the same big key. On this same machine, there's your leakage, represented by a leak function that takes this key and exfiltrates some function of it. And the adversary gets the ciphertext and the leakage, and it's trying to compromise security of this message. In this case, these three elements correspond to, again, the key being very large, but now a symmetric key; we call it K, little k bits long, maybe a terabyte. The leakage is some arbitrary function of the big key and would be limited, say, for example, to 10% of the length of the key. And locality would correspond to these two functions only making a small number of probes into the key, say about 500. We want in this context to give schemes that are not only efficient, but we want to give concrete, non-asymptotic bounds on adversary advantage, which of course translates also to efficiency, since you can use smaller parameters and you know exactly what you're getting. And we're willing in this context to use the random oracle model, unlike the prior theoretical works. So briefly, what we did is we gave a general, information-theoretic lemma on a problem called subkey prediction, which asks how hard it is for an adversary, given leakage on a big key, to say what the bits are at a small number of random positions in that key. From that, we can build a key encapsulation scheme, which shows how you get small, random-looking keys out of a big key. And then we can get symmetric encryption in the bounded retrieval model. The first scheme is a random-oracle-model one; there's also a slightly less efficient standard-model one. These provide only privacy.
And then finally, we look at what we call hedged encryption as a way to get a little bit of integrity. So in the rest, I'll tell you a little more about all this, mostly focusing on the random oracle scheme. I'll start with a little more precise rendition of the security goal, and this is called leakage indistinguishability. That symbol there is not a one; it's an L, standing for leakage. We're in the same setting over here, and we associate to any adversary a number. This is not asymptotic, so it's just a number: its advantage in breaking this leakage indistinguishability. And informally, what it means is the probability, or advantage, of the adversary in compromising security of encrypted messages when it's given some information about the key, and that corresponds to a function of the key of the adversary's choice. A little more precisely, in our usual cryptographic games, the adversary is aiming to figure out a randomly chosen challenge bit in the following process. We're in the random oracle model, so it gets access to a random oracle. And the first thing it does is specify a leakage function. Importantly, the leakage function itself has access to the random oracle. And it'll map a big key to some amount of leakage, of length somewhat shorter than k. Once that function is specified, a big key is chosen at random, and the leakage is computed by applying the function to that key. And that's returned to the adversary. Now the adversary attacks the encryption in a kind of typical way. It can ask for the encryption of a message, and depending on the challenge bit, it either really gets the encryption of that message under the big key, or the encryption of the string of all zeros. And it's got to figure out which of the two it got. Its advantage will depend on its resources. These include things like the number of queries to this oracle, the maximum length of messages, and its running time.
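The game just described can be sketched in code. This is a minimal illustration, not the paper's formalization: the toy XOR-pad scheme, the function names, and the parameter sizes are all stand-ins I've made up, and the random oracle is omitted. The point is only the order of events: leakage function first, key second, then a left-or-zeros encryption oracle.

```python
import hashlib
import secrets

def toy_encrypt(K, msg):
    # Stand-in scheme for illustration only (NOT the paper's scheme):
    # XOR the message with a hash-derived pad under a fresh nonce.
    nonce = secrets.token_bytes(16)
    pad = hashlib.sha256(K + nonce).digest()[:len(msg)]
    return nonce + bytes(m ^ p for m, p in zip(msg, pad))

def lind_game(choose_leakage, guess, key_bytes=32, leak_bytes=4):
    # One run of the leakage-indistinguishability game: the adversary
    # commits to a leakage function, then the key is chosen, the (short)
    # leakage is computed, and the adversary must guess the challenge
    # bit given a left-or-zeros encryption oracle.
    b = secrets.randbits(1)                 # hidden challenge bit
    leak_fn = choose_leakage()              # leakage function chosen first
    K = secrets.token_bytes(key_bytes)      # key chosen afterwards
    leakage = leak_fn(K)[:leak_bytes]       # arbitrary, but short
    def enc_oracle(msg):
        return toy_encrypt(K, msg if b == 1 else bytes(len(msg)))
    return guess(leakage, enc_oracle) == b

# An adversary that ignores everything wins with probability exactly 1/2.
wins = sum(lind_game(lambda: (lambda K: K), lambda leak, enc: 0)
           for _ in range(1000))
```

The adversary's advantage is then how far its win probability exceeds that trivial 1/2 baseline.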
But also what fraction of bits of the big key it has been able, or allowed, to leak. Okay, so our scheme is quite simply described, and it looks like this. We have our encryption function here, with access to the random oracle. It's given the big key K, to which it must make a small number of probes, and the message M. The first thing it does is pick a random selector, say a 256-bit string, something like that. And now it applies the random oracle to that selector to get probes. So what's a probe? You're just poking your finger into the big key: it's an index between one and the length of the key. And we have p such indices; those indices specify bits of the big key. We collect up all those bits and we call them J. That's the first, pre-key. It will have some level of security, in particular it will be unpredictable, but it won't be pseudorandom. By applying a random oracle to it and R, we get something that looks pseudorandom. And finally, we just use that little key K with some standard symmetric encryption scheme that works with small keys, like some AES mode of operation. We return the ciphertext of the standard scheme plus the selector, because you need the selector in order to decrypt. And that's what the scheme looks like. So now we turn to analyzing the scheme. What our result is about is saying: consider some adversary attacking it. We've defined our goal of security, this leakage indistinguishability, and the adversary has some advantage, and we want an upper bound on it. And we give this upper bound as a function of the different resources of the adversary (the queries, time, messages) and of the fraction of bits leaked. So looking at this bound, it has some conventional terms, kind of corresponding to the symmetric crypto here and the selector, and they can be made small in the usual ways, or are small. And some are not that interesting, so we forget about those.
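The steps just described can be sketched as runnable code. This is a toy rendition under assumptions of my own: SHA-256 (with domain-separation labels) stands in for the random oracle, a hash-counter keystream stands in for the AES mode of operation, and the big key is scaled down from a terabyte to something that fits in a demo.

```python
import hashlib
import secrets

P = 500  # number of probes into the big key

def ro(*parts):
    # SHA-256 standing in for the random oracle.
    h = hashlib.sha256()
    for part in parts:
        h.update(part)
    return h.digest()

def subkey(bigkey, selector):
    # Derive P bit positions from the selector via the "random oracle",
    # then collect those bits of the big key into the pre-key J.
    nbits = 8 * len(bigkey)
    bits = []
    for i in range(P):
        pos = int.from_bytes(ro(b"probe", selector, i.to_bytes(4, "big")),
                             "big") % nbits
        bits.append((bigkey[pos // 8] >> (pos % 8)) & 1)
    return bytes(bits)

def keystream(k, n):
    # Toy counter-mode keystream; a real scheme would use an AES mode.
    out = b""
    ctr = 0
    while len(out) < n:
        out += ro(b"stream", k, ctr.to_bytes(4, "big"))
        ctr += 1
    return out[:n]

def encrypt(bigkey, msg):
    R = secrets.token_bytes(32)           # fresh 256-bit selector
    k = ro(b"key", R, subkey(bigkey, R))  # K = RO(R, J): pseudorandom key
    ct = bytes(m ^ s for m, s in zip(msg, keystream(k, len(msg))))
    return R, ct                          # selector travels with ciphertext

def decrypt(bigkey, R, ct):
    k = ro(b"key", R, subkey(bigkey, R))  # same probes, same derived key
    return bytes(c ^ s for c, s in zip(ct, keystream(k, len(ct))))
```

Note that decryption only needs the selector R and the same P probed bits, so both ends touch just a tiny fraction of the big key per message, which is the locality property.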
And we turn to what's really related to the big-key part of the problem, which is that the adversary advantage decreases exponentially in the number of probes. So as you increase the number of probes, there's some epsilon here, and that will quickly pull down the adversary advantage. These functions are fairly complex. You start from the binary entropy function; this W is some kind of inverse of it, and that's applied to the fraction of bits that are leaked. It's kind of hard to get a sense of these functions from the formulas, but we might plug in some numbers in the following way. Let's say we want the exponent in that denominator to be 256, which would be reasonable security, and we have decided on some fraction of leakage we want to tolerate. Then we can ask how many probes it would take for that to become true. And according to this table, if you want to tolerate 10% leakage, then you would need 468 probes. If you want to tolerate 30% leakage, 845; 50% leakage, about 1500 probes. So this quite concretely tells you what you get out of this. Okay, so in the rest, I'll try to give you some more technical sense of the elements of this scheme. It had three steps. The first one was, by making a small number of probes into the big key, to extract a key that's somewhat secure, unpredictable. And this is the subject of our subkey prediction lemma. The lemma itself is not about encryption; there's no crypto or random oracles anywhere in it. It just considers this simple information-theoretic problem: you pick a large key at random (at this point, the length is just a parameter), and then apply some leakage function to it to get L. You also pick some random probes into the key, meaning some indices in the range one through k, p of them, and look at the corresponding bits of the key and call that J. Now you give the adversary the leakage, and you tell it what positions you've probed.
Of course, you don't give the adversary the key K, but you ask it to figure out the bits of the key at these probes from its leakage. What we're interested in is how well it can do. If you fix a particular leakage function, this represents the best possible probability of figuring out J. So it's the maximum over all adversaries of the probability that they figure out what J is. Of course, the advantage will depend on the leakage function. So here what we do is say, well, what's the best leakage function from the adversary's perspective? So maximize over those, and you get, as a function of the number l of bits leaked, what this advantage looks like. And now this is the thing we want to bound. It depends on the key length and the number of probes. The number one here is just for consistency with a more general version in our paper and can be ignored. And it'll also depend on the fraction of bits leaked. So to upper bound this, we can say, okay, let's look for a c such that this is at most 2 to the minus c. These kinds of problems have been looked at in the literature. There's a lemma by Nisan and Zuckerman back from '96, and an extension by Vadhan. We tried to apply these, but they don't quite apply. And also they're not entirely concrete: they have hidden constants in them, and they're not the best things for getting a precise bound. There is, however, a lemma by Alwen, Dodis and Wichs, hidden in an appendix of their paper; it's a very powerful, elegant lemma. And not only does it apply to this, it doesn't have any hidden constants. So in fact, a solution to this problem emerges directly by applying this lemma. So what do we have to say in addition to that? Well, we're interested in concrete security, and in the value of that constant. And the difference, or novelty, of our work is to give an analysis with a better value of that constant.
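For toy parameters, this best-possible prediction probability can be computed exactly. The sketch below is my own illustration, not from the paper: it enumerates all keys and probe positions and, for a fixed leakage function, has the optimal adversary answer the most likely subkey J given its leakage (probes are drawn here with replacement, for simplicity).

```python
from itertools import product

def subkey_advantage(k, p, leak):
    # Exact success probability of the best adversary for a fixed
    # leakage function `leak`, averaged over the random k-bit key and
    # the p random probe positions.
    keys = list(product((0, 1), repeat=k))
    total = 0.0
    for pos in product(range(k), repeat=p):
        # Count, for each leakage value, how often each subkey J occurs;
        # the optimal adversary answers the modal J for its leakage.
        counts = {}
        for key in keys:
            j = tuple(key[i] for i in pos)
            counts[(leak(key), j)] = counts.get((leak(key), j), 0) + 1
        best = {}
        for (lv, _), c in counts.items():
            best[lv] = max(best.get(lv, 0), c)
        total += sum(best.values()) / len(keys)
    return total / k ** p

# Leak the first 4 bits of a 7-bit key, one probe:
adv = subkey_advantage(7, 1, lambda key: key[:4])  # = 11/14
```

Maximizing this quantity over all leakage functions of a given output length is what the subkey prediction lemma bounds; the brute force here only evaluates one candidate leakage function at a time.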
So if you look at the lemma and plug in our numbers, you get this formula for what that c is. And this represents the formula that we prove; again, these are the functions that correspond to it. Since these things are kind of hard to evaluate analytically, we can plug in some numbers and see the difference. And this table looks at it in two ways. One is the number of probes. So for example, what do you get with 500 probes? According to our formula, the new one, you would get 2 to the negative 274 here, which is a cryptographically good level of security. According to the old formula, you would only get 2 to the negative 5, so really not much security. Even with 1000 probes (and this is 500), this is only about 2 to the negative 10. So you see a pretty large improvement. Another way to look at it is how many probes you need to make to get some amount of security, for example 256 bits. So if you want c to be 256, how many probes do you need? According to our formula, 468. According to the old one, about 25,000. So this is a big enough difference to matter for the efficiency of the schemes. So this is the main, or perhaps only, technically novel thing in the paper, and it's quite subtle. So perhaps I'll give a little bit of a sense of what goes on in the subkey prediction lemma. One piece of evidence of the subtlety is that if you look at this, you might think that the naive strategy for the adversary, of simply leaking a certain number of bits of the key, is the best: if I'm allowed to leak four bits of a seven-bit key, why don't I just leak the first four? Why is that any worse than anything else? So maybe the maximum occurs there. And what we illustrate here is that it doesn't. This kind of leads into how you figure out what the bound actually is, and it makes a connection with Hamming balls and error-correcting codes to show this.
And what happens is that if you look at this example, where the key length is seven and you're leaking four bits, and imagine there's only one probe, you can easily figure out what the advantage for this strategy is, and it turns out to be 11/14. Now consider the alternative strategy, which takes the [7,4] Hamming code, which has 16 codewords, all of them of length seven, and gives you an encoding and decoding function. And the leakage that is going to be provided is to take the key and decode it. So think of the seven-bit string as falling in one of the Hamming balls, which partition the space; consider the corresponding codeword, which is the center of the ball, and then the message that corresponds to that codeword, which turns out to be only four bits. If you leak that message, then there's a simple strategy for the adversary to predict any bit of the original key, which is to re-encode the leakage, get back the codeword, pretend it's the key, and use the corresponding bit of it for prediction. It takes a bit of analysis, but when you do it, you see that you get an advantage significantly better than the naive strategy. Pursuing that sort of avenue, the analysis showing our bound is quite technical. And the result is here, again with that particular function in there. Very roughly, the way the analysis works is to indeed consider the partition of the space into Hamming balls, and then show a number of things. First of all, show, by some kind of discrete concavity argument, that the best case for the adversary is where all these balls are the same size. And then show that in each particular case, you get the best advantage for a particular leakage value when the preimage of that leakage is either a Hamming ball or sandwiched between adjacent ones, and when it corresponds to a set that's monotone, which is something with this definition. And all that is packaged up into this bound.
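This example can be checked by brute force. The sketch below is my own verification, using one standard systematic parity layout for the [7,4] Hamming code (any equivalent layout gives the same numbers): it enumerates all 128 keys and 7 probe positions, and compares the naive first-four-bits leakage against the decode-to-nearest-codeword leakage.

```python
from itertools import product

def encode(m):
    # Systematic [7,4] Hamming code: 4 data bits plus 3 parity bits
    # (one assumed standard parity layout; minimum distance 3).
    d = list(m)
    return tuple(d + [d[0] ^ d[1] ^ d[3],
                      d[0] ^ d[2] ^ d[3],
                      d[1] ^ d[2] ^ d[3]])

CODEWORDS = [encode(m) for m in product((0, 1), repeat=4)]

def decode(word):
    # The code is perfect: every 7-bit word is within Hamming distance 1
    # of exactly one codeword, so nearest-codeword decoding is unambiguous.
    return min(CODEWORDS, key=lambda c: sum(a != b for a, b in zip(c, word)))

keys = list(product((0, 1), repeat=7))

# Naive leakage: reveal the first 4 bits. A probe into those positions is
# answered exactly; a probe into the other 3, only with probability 1/2.
naive = (4 * 1 + 3 * 0.5) / 7  # = 11/14

# Code-based leakage: reveal the 4-bit message of the nearest codeword.
# Re-encoding the leaked message recovers that codeword, and the adversary
# answers with its bit at the probed position.
code_based = sum(decode(k)[i] == k[i]
                 for k in keys for i in range(7)) / (len(keys) * 7)
```

The intuition matches the ball picture: every key agrees with the center of its Hamming ball in at least 6 of 7 positions, so predicting from the codeword beats predicting from a raw 4-bit prefix.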
So the other steps are relatively straightforward. The second step, which is applying the hash function over here, is effectively extraction. One might ask, why not use a standard randomness extractor? And the answer is that because that's information-theoretic, you would be limited to a very small number of extractions; by using a random oracle, something computational, you can get a lot more. The standard-model scheme is very similar. What it doesn't do is apply the random oracle to the selector in order to make the probes; rather, it picks the probes directly at random. But otherwise, it's the same. What it does is derive the key here using, in place of an extractor, a function that's UCE-secure. This is a strong assumption, but it's a standard-model one. And with some increase in ciphertext size, we can get a standard-model scheme. It would be nice in this context to have integrity, or authenticity, but it turns out, by a result of Alwen, Dodis and Wichs on signatures, whose technique also applies here, that that's simply impossible, because the leakage can always include a valid ciphertext. So what we do instead is say that perhaps it would be worthwhile if integrity were provided in the absence of leakage, and privacy in the presence of it, with the rationale that mass surveillance is largely about compromising privacy, not integrity. And these types of hedged schemes we can build relatively easily. So to conclude, some of the limitations of this work include that in this bounded retrieval model, the leakage is upfront. It's not adaptive: the leakage function is specified before the key is chosen, and the leakage is computed before encryption begins, which is certainly a weakness. One thing a lot of people ask is, wouldn't it take a lot of randomness to generate the big key? You can do it from a short seed and then discard the seed. We certainly had as a goal to investigate how practical this might be as an actual defense. We don't think we're really there.
There are lots of issues, including the model, the key distribution, and other things, which say that perhaps this has some scope in practice, but it's not a solution at this point. People also ask whether you can make it more efficient by having probes return words rather than bits, and definitely so; it would be nice to see a bound proven for that. On the more theoretical side, one might try to get compact ciphertexts in the absence of a random oracle as well. Okay, that's it.