All right. So the setting for this talk is encrypted outsourced storage. Imagine you have a client that has some key, which I've denoted here by K1, and it encrypts large volumes of data, a bunch of messages, before uploading them to some outsourced storage service, perhaps a cloud service or some backup. Because we're using encryption, we're worried about confidentiality, particularly of our messages. So we don't want to entrust the encryption keys, or other information that can be used to derive message data, to the storage service. But we do trust the storage service to store our data for us.
One thing that we would like in these settings is the ability to do key rotation. So given our key K1, we can generate a new key K2 for a second time period, take all of our ciphertexts encrypted under K1, and somehow convert them to ciphertexts that are encryptions of the same messages under the new key K2. Key rotation turns out to be a pretty important property in practice, most pragmatically because there's a variety of regulations that require periodic key rotation. In the credit card industry, the PCI DSS standard requires this, say, for encrypting credit card data. More generally, you might want this in the context of post-compromise cleanup. If K1 was exfiltrated, or you had concern that it might have been exfiltrated at some point, you'd like to rotate to new keys. And there are plenty of other good reasons why key rotation is important. So much so that major companies, when they build encryption APIs for their customers, include key rotation functionality and workflows specifically to support doing this efficiently.
So how do we do key rotation? Well, there's a couple of obvious but unsatisfying approaches. The first and most obvious is just to send both keys K1 and K2 up to the storage service, which can then decrypt and re-encrypt everything.
This isn't satisfying because obviously we've exposed the keys to a location where we don't want them exposed. The second trivial approach is just to download all the ciphertexts and do the re-encryption locally on the client. But there's an obvious performance concern: if you have large amounts of data, say terabytes, this is going to be prohibitively expensive.
So in practice what people do is use what we'll call the authenticated encryption hybrid, or AE-hybrid, approach. That just means that a ciphertext has two parts. The first part is an encryption under K1 of a data encryption key that we'll label x. And the second part is an encryption of the message under x. Sometimes we refer to these as key encapsulation mechanisms and data encapsulation mechanisms, or KEM-DEM style constructions. This is nice because when you want to do a rotation, you can do it just by re-encrypting x. So you choose a new key K2, you go fetch this short header, the KEM ciphertext C1 star, decrypt it, and then re-encrypt x under the new key K2. And this is indeed what's being offered right now by the APIs I mentioned earlier.
But this is ultimately a little bit unsatisfying. In particular, the data encryption key x never changes. When I first started thinking about this problem, it was actually because an Amazon engineer came and said, hey, it's a little bit weird that we have these key rotations that aren't actually rotating all of the secrets underlying the ciphertext. Motivated by, I think, that same observation, there's a nice prior work by Boneh, Lewi, Montgomery, and Raghunathan, which I'll refer to as BLMR. They introduced this idea of updatable encryption to achieve this type of key rotation where all secrets are changed.
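As a rough sketch of the hybrid approach and its header-only rotation, the following Python uses a toy authenticated cipher built from SHA-256 and HMAC as a stand-in for a real AEAD such as AES-GCM. The function names (`toy_enc`, `toy_dec`, `hybrid_encrypt`, `hybrid_decrypt`, `hybrid_rotate`) are made up for illustration; they are not from the talk or from any provider's API, and the toy cipher should not be used to protect real data.

```python
import hashlib
import hmac
import secrets

def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # SHA-256 in counter mode; a toy stand-in for a real stream cipher.
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]

def toy_enc(key: bytes, msg: bytes) -> bytes:
    # Toy encrypt-then-MAC. Assumption: stands in for a real AEAD scheme.
    nonce = secrets.token_bytes(16)
    body = bytes(a ^ b for a, b in zip(msg, _keystream(key, nonce, len(msg))))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag

def toy_dec(key: bytes, ct: bytes) -> bytes:
    nonce, body, tag = ct[:16], ct[16:-32], ct[-32:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return bytes(a ^ b for a, b in zip(body, _keystream(key, nonce, len(body))))

def hybrid_encrypt(k: bytes, msg: bytes):
    x = secrets.token_bytes(32)            # fresh data encryption key x
    return toy_enc(k, x), toy_enc(x, msg)  # (short header, bulk body)

def hybrid_decrypt(k: bytes, header: bytes, body: bytes) -> bytes:
    return toy_dec(toy_dec(k, header), body)

def hybrid_rotate(k_old: bytes, k_new: bytes, header: bytes) -> bytes:
    # Only the short header is fetched and re-encrypted; the body is untouched.
    return toy_enc(k_new, toy_dec(k_old, header))

k1, k2 = secrets.token_bytes(32), secrets.token_bytes(32)
header1, body = hybrid_encrypt(k1, b"some large message")
header2 = hybrid_rotate(k1, k2, header1)
recovered = hybrid_decrypt(k2, header2, body)
same_dem_key = toy_dec(k1, header1) == toy_dec(k2, header2)
```

Here `same_dem_key` comes out true: rotation replaces the header, but the data encryption key x and the body stay the same, which is exactly the limitation pointed out above.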
So they also use a KEM-DEM style construction, where they have an encryption of a DEM key using some kind of standard symmetric encryption. But the data encapsulation portion is a little bit unique. They use what's called a key-homomorphic PRG, or pseudorandom generator, that has some special properties. G takes a seed x and maps it to a group element in a way that has this additional homomorphism property: you can take G of x plus G of x prime, for some other seed x prime, and this is equal to G of x XOR x prime. In turn, you can build these from key-homomorphic PRFs. In fact, the BLMR paper was primarily about how to build key-homomorphic PRFs in the standard model, and updatable encryption was just one component of that much broader paper.
So to do a rotation now, we can do something interesting. You get back the header C1 star and use that to recover x. You sample a new data encapsulation key x prime, and then send back to the server a new header C2 star plus this delta token, which is x XOR x prime. The server can then rotate the data encapsulation key by adding G of delta to the data portion of the ciphertext, C1. By applying the homomorphism property, you see that C1 plus G of delta is indeed equal to G of x prime plus m. And so we've effectively rotated all the secrets.
So that's nice: it refreshes everything. It has low bandwidth cost because C1 star, C2 star, and delta are compact. But it requires quite a number of exponentiations, because you're using asymmetric crypto mechanisms underneath these key-homomorphic PRGs. I should mention also that they focus primarily on IND-CPA style encryption. They weren't worried about authentication or CTXT, so they're just using IND-CPA encryption for this. This will be important in a few minutes.
So the status before our work on key rotation is that we have these naive schemes that are ultimately unsatisfying.
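For reference, the server-side update step in the BLMR scheme can be written out as a short derivation. It uses the homomorphism exactly as stated in the talk, with the seed operation written as $\oplus$ and the group operation as $+$; the symbols $c_1$, $\Delta$, $x$, $x'$, and $m$ follow the spoken description and are not copied from the paper.

```latex
\begin{align*}
c_1 &= G(x) + m, \qquad \Delta = x \oplus x' \\
c_1 + G(\Delta) &= G(x) + G(x \oplus x') + m \\
                &= G\bigl(x \oplus (x \oplus x')\bigr) + m \\
                &= G(x') + m
\end{align*}
```

The second-to-last step is the only place the key-homomorphic property is used; the server never sees $x$ or $x'$, only $\Delta$.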
We have a scheme that's used in practice that has had no formal security analysis, and it's not exactly clear what security it achieves. In particular, it wasn't really clear what we forgo by not having complete rotation, that is, what types of attacks arise because of this. And then we had this nice work by BLMR15 that targeted chosen-plaintext attack security. They provided a notion, UP-IND-CPA, that lifts IND-CPA to this key rotation setting, and also a notion of ciphertext independence that I won't go into too much detail on, but which tries to capture this idea that you're refreshing all the secrets.
So we looked at this and saw, well, one, it's clear that we can probably achieve stronger security notions than what they achieve. We also wanted to treat authentication, since in practice we need authenticated encryption, not just IND-CPA security. There was also a subtle bug in the proof sketch of UP-IND-CPA for the BLMR15 scheme that I just showed, and we wanted to fix that as well. And finally, there's the question of whether these schemes that use more expensive operations can be made practical, or whether this is really going to be too prohibitive in practice.
So in our work, we give a treatment of key rotation for symmetric encryption, including authenticated encryption, and introduce three new security notions: an UP-IND notion that's stronger than BLMR15's, an UP-CTXT notion that captures authenticity goals in this setting, and a so-called re-encryption indistinguishability notion that, in our belief, is a bit more intuitive than the ciphertext independence notion and also captures a much broader class of attacks.
And then we use these to analyze both old and new schemes, and perhaps most notably introduce two new schemes: one called KSS, which doesn't achieve UP-REENC security but is very fast, and another, ReCrypt, which is a variant of the BLMR scheme that repairs the issues from before and also extends it to meet our stronger security goals.
Now, sort of embarrassingly, just last week Joseph Jaeger found a bug in some of our proofs for UP-CTXT, and this actually invalidates the security of the KSS and ReCrypt schemes in the context of CTXT, as given in the camera-ready. So that's kind of a bummer, but Joseph was very nice to point this out to us. We have fixes for the schemes and are working on a write-up. Qualitatively, it shouldn't change the takeaways from the paper, and we'll be putting that up on ePrint very shortly. For that reason, I'll be focusing just on the CPA portions of our work. Yeah, I didn't mean that to be a joke, but I can see why that's funny. And also because I didn't have time; that's the real reason, yeah.
Okay, so we'll go through these each in turn. To fix some notation, an updatable AE scheme combines the basic three algorithms that we're used to, key generation, encryption, and decryption, with a re-key generation algorithm that takes two keys, Ki and Kj, and a ciphertext header Ci star, and generates an update token. That token can in turn be used with a re-encryption algorithm to rotate the key underlying a ciphertext. This gives us the right syntax and semantics, and I should say this is building off the BLMR formalization as well. So you can encrypt something, generate a re-key token, use re-encryption, and then successfully decrypt the resulting ciphertext under the key to which it was rotated. We have formalized correctness conditions for all this in the paper.
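One way to picture the five-algorithm syntax is as a typed interface plus a check of the rotation correctness condition. This is only a sketch: the names `UpdatableAE`, `rotation_is_correct`, and `PlaceholderScheme` are hypothetical, the (header, body) split is assumed from the KEM-DEM style schemes in the talk, and the placeholder scheme provides no security whatsoever; it exists only to exercise the data flow.

```python
import secrets
from typing import Protocol, Tuple

Ciphertext = Tuple[bytes, bytes]  # (header, body); an assumed representation

class UpdatableAE(Protocol):
    def keygen(self) -> bytes: ...
    def encrypt(self, key: bytes, msg: bytes) -> Ciphertext: ...
    def decrypt(self, key: bytes, ct: Ciphertext) -> bytes: ...
    # Takes the old key, the new key, and only the ciphertext header.
    def rekeygen(self, key_i: bytes, key_j: bytes, header_i: bytes) -> bytes: ...
    # Run by the storage service, which holds the token but no keys.
    def reencrypt(self, token: bytes, ct: Ciphertext) -> Ciphertext: ...

def rotation_is_correct(scheme: UpdatableAE, msg: bytes) -> bool:
    # Encrypt under k_i, rotate to k_j, and check decryption under k_j.
    k_i, k_j = scheme.keygen(), scheme.keygen()
    header, body = scheme.encrypt(k_i, msg)
    token = scheme.rekeygen(k_i, k_j, header)
    rotated = scheme.reencrypt(token, (header, body))
    return scheme.decrypt(k_j, rotated) == msg

class PlaceholderScheme:
    """NOT encryption: the header is the key and the body is the plaintext.
    Present only so the correctness check above has something to run on."""
    def keygen(self) -> bytes:
        return secrets.token_bytes(16)
    def encrypt(self, key: bytes, msg: bytes) -> Ciphertext:
        return (key, msg)
    def decrypt(self, key: bytes, ct: Ciphertext) -> bytes:
        header, body = ct
        if header != key:
            raise ValueError("wrong key")
        return body
    def rekeygen(self, key_i: bytes, key_j: bytes, header_i: bytes) -> bytes:
        if header_i != key_i:
            raise ValueError("wrong key")
        return key_j
    def reencrypt(self, token: bytes, ct: Ciphertext) -> Ciphertext:
        return (token, ct[1])

ok = rotation_is_correct(PlaceholderScheme(), b"hello")
```

The point of the signatures is the division of labor: `rekeygen` needs both keys but only the short header, while `reencrypt` touches the full ciphertext but needs only the token.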
In terms of confidentiality, we introduce a new class of definitions, all based on a relatively complicated, or at least complicated-looking, security game called UP-IND. This is a left-or-right indistinguishability notion for chosen-plaintext attacks. We generate a bunch of different keys. In fact, there are two sets of keys: ones that are going to be uncompromised, and ones that are explicitly handed over to the attacker. The goal of the adversary is to query a challenge LR oracle, here at the bottom, with two messages and one of the uncompromised keys, get back an encryption of one of the two messages chosen at random, and try to figure out what this bit b is. Additionally, the adversary has access to all these other oracles: a regular encryption oracle to get examples of encryptions that aren't challenge encryptions, plus re-key generation and re-encryption oracles.
Now we have to be very careful here to avoid trivial wins, right? If you query the LR oracle on an uncompromised key, but then immediately get a re-encryption of that ciphertext to a compromised key, one of the ones handed to the adversary directly, there's going to be a problem. You're going to win trivially. So we spent a lot of time trying to figure out exactly how to formalize which queries to re-key gen and re-encrypt are valid and which are invalid. We give these invalidity predicates that define exactly that, and we tried to make them the most permissive invalidity predicates that we could come up with. The BLMR confidentiality notion actually ends up being captured in this framework. It just has much stricter invalidity predicates that rule out any queries to compromised keys. And additionally, it doesn't return a ciphertext header on an invalid re-encryption query.
This last thing turned out to be interesting to us because it surfaced a compromise scenario that we hadn't thought of before, and in fact it allows showing that, in our model, things like AE-hybrid don't even meet this UP-IND security notion. So let me go over that very briefly. Remember, AE-hybrid is just this KEM-DEM thing where we encrypt a data encapsulation key x under our key K, and then the message under x. In this context, our adversary can do the following. Query the left-or-right oracle to get a challenge ciphertext. Then query re-encryption to re-encrypt this ciphertext to a compromised key, say Kt+1, which is one of the ones given to the attacker. At this point, the invalidity predicate will say, oh, this is an invalid query, because you're rotating a challenge ciphertext to a compromised key. But because we're being very permissive, we're going to give back the header, in this case Ct+1 star. Okay, but for this scheme that's just an encryption of x under Kt+1. We have that key, so we can just decrypt to get back x, then recover M sub b from C1, and win.
So this ends up being a perhaps esoteric compromise setting in practice, because you need a certain combination of values to leak, but it was interesting to us that this got surfaced from exploring these definitions formally. And it turns out that we can actually achieve security and avoid these types of attacks with a pretty simple change to AE-hybrid. That change is what we call KEM-DEM with secret sharing. The basic problem from before is that if you get a header for a compromised key, that's enough to reveal the data encapsulation key completely. What we can do is prevent that by secret sharing the data encapsulation key across the two portions of the ciphertext.
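The header-only attack on the hybrid scheme can be replayed in a few lines. This is a toy simulation under stated assumptions: `toy_enc` and `toy_dec` are a made-up SHA-256/HMAC stand-in for a real AEAD, the variable names are hypothetical, and the "oracle" is collapsed into straight-line code that models the permissive game returning the new header even though the query is flagged invalid.

```python
import hashlib
import hmac
import secrets

def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # SHA-256 in counter mode; toy stand-in for a real stream cipher.
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]

def toy_enc(key: bytes, msg: bytes) -> bytes:
    # Toy encrypt-then-MAC, standing in for a real AEAD. Not for real use.
    nonce = secrets.token_bytes(16)
    body = bytes(a ^ b for a, b in zip(msg, _keystream(key, nonce, len(msg))))
    return nonce + body + hmac.new(key, nonce + body, hashlib.sha256).digest()

def toy_dec(key: bytes, ct: bytes) -> bytes:
    nonce, body, tag = ct[:16], ct[16:-32], ct[-32:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return bytes(a ^ b for a, b in zip(body, _keystream(key, nonce, len(body))))

# Challenger: one uncompromised key and one key handed to the adversary.
k_t = secrets.token_bytes(32)
k_compromised = secrets.token_bytes(32)

# Left-or-right challenge under the uncompromised key.
m0, m1 = b"message zero", b"message one!"  # equal-length challenge messages
b = secrets.randbelow(2)
x = secrets.token_bytes(32)                # data encapsulation key
challenge_header = toy_enc(k_t, x)
challenge_body = toy_enc(x, (m0, m1)[b])

# Re-encryption query to the compromised key: flagged invalid, but the
# permissive game still hands back the new header.
leaked_header = toy_enc(k_compromised, toy_dec(k_t, challenge_header))

# Adversary: knows k_compromised, leaked_header, and the challenge body.
x_recovered = toy_dec(k_compromised, leaked_header)
guess = (m0, m1).index(toy_dec(x_recovered, challenge_body))
```

The adversary's `guess` always equals the hidden bit `b`, because the leaked header by itself determines x.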
So now we sample an x and a y. We store x XOR y inside the key encapsulation, and store y with the other portion of the ciphertext. This is, like I said, just a simple secret sharing scheme. When we do re-key generation, we get back the header and decrypt it to get x XOR y. We sample a new y prime and refresh the secret share by XORing y prime into it, sending y prime over to the storage service as well so that it can refresh the share on its side. It turns out we can prove that this is UP-IND secure, and it really has a very small overhead compared to AE-hybrid: just the addition of these small XORs on values that are, say, 128 bits. But note that the data encryption key here is still never rotated, okay? So we're still in the same fundamental place: x is never rotated.
This raises the question of what we're giving up by having schemes that are, say, just UP-IND secure and don't refresh all of the secrets underlying a ciphertext. We spent a lot of time trying to figure out what interesting attack models can take advantage of this lack of refresh of the data encapsulation key. It wasn't at all clear to begin with. We came up with a few different things, and I'll explain just one, an exfiltration attack scenario, where this property seems to be important. We'll do this in the context of AE-hybrid, which is a little bit simpler, but the same thing applies to KSS. In the first time period, the attacker, say, gets access to both K1 and a full ciphertext. At this point, perhaps because in this time period he's under some type of constraint on how much data he can exfiltrate, he can't exfiltrate the whole message, but he can get out some small secret, namely the data encapsulation key x.
And then later we do a rotation, say cleaning up after that compromise and moving to a new key K2, and the adversary later on gets access just to the ciphertext, but not K2, because we've done a better job securing K2. Nevertheless, this x value is still enough to recover the message, right? Because we haven't rotated that key. So we'd like to have security models that deal with this type of attack as well, and we introduce this re-encryption indistinguishability notion that speaks to it. It's stronger than the ciphertext independence notion from BLMR. I'll just again give some intuition about what it's doing for us.
It's again using a game to formalize security. Instead of having a left-or-right challenge oracle that you query with two different messages, we have a left-or-right re-encryption oracle that you query with two different ciphertexts. The idea is that the attacker shouldn't be able to tell which of the two ciphertexts was actually re-encrypted, okay? We have all the same issues with invalidity predicates, and there are other oracles that I'm not showing here: re-key generation, re-encryption, and encryption. So the intuition is that we want to capture the idea that (I don't know what that was) the ciphertext from one time period is basically useless to an attacker in the next time period. And this captures that, because the attacker can't even tell which ciphertext from one period was used in the rotation. In particular, it rules out these exfiltration attacks.
So the question is, can we achieve both UP-IND security and this UP-REENC security? The natural starting point is the BLMR scheme that uses this key-homomorphic PRG. But as I mentioned briefly before, there's a bug in the proof, and it turns out it isn't even provably UP-IND-CPA.
And this turns out to be somewhat easy to fix, but technically it was kind of interesting to understand this bug, so let me go over it very quickly. So recall that the BLMR scheme is using this KEM-DEM style thing with a key-homomorphic PRG, which allows updates by exploiting the homomorphism property of the PRG. They put in their paper basically a theorem that, at a high level, says that if E, the KEM encryption, is IND-CPA and G is a secure PRG, then the scheme meets UP-IND-CPA security. It turns out this is problematic, in particular because in the security games, both in their paper and in ours, the adversary gets access to a re-key generation oracle, which means that they can mount a kind of chosen-ciphertext attack against the underlying KEM encryption E. So IND-CPA doesn't seem like it should be enough to prove security.
That was easy to understand, but trying to figure out whether the scheme is actually insecure took a lot of work. In the end, we weren't able to come up with a direct attack showing that it's insecure for some instantiation of E. What we were able to show is a relativized result: if you could give a proof of UP-IND-CPA security just from the IND-CPA security of E, then this would imply a proof that, for this particular construction, the IND-CPA security of E implies that it's actually circular secure. There's been a long line of work on circular security showing negative results about IND-CPA implying circular security. Those don't directly apply, because we have a very special form of E in our result, but nevertheless it seems like a lot of evidence that you're not going to be able to prove something strong here.
Okay, so it's easy to fix that one issue, because we can replace the IND-CPA KEM with an authenticated encryption scheme. In addition, we add other features to the BLMR scheme to arrive at this ReCrypt scheme that achieves our stronger security goals as well. Using AE for the KEM prevents the type of modeling issue that came up in the attempted proof for the BLMR scheme. We also use the same technique as we did for KSS, which is to secret share the data encapsulation key across the two components of the ciphertext. And when we do a refresh, we now need to refresh not only the key x, but also the secret share y. I won't go into details, but it turns out we can show that this is both UP-IND and UP-REENC secure. In words, this means that we get this stronger confidentiality even with access to the re-key gen and re-encryption oracles, and that ciphertexts from one time period under one key are basically useless to attackers moving forward. So you get this very full security.
So finally, the last question I had was whether we can make these things performant enough in practice. We did a bunch of work to implement ReCrypt in a highly optimized way. Underlying it, we used the Naor-Pinkas-Reingold key-homomorphic PRF, which is a random oracle model construction that uses a DDH-hard group. For that we used Curve25519. Put together, this allows us to encrypt L-bit messages using roughly L over 248 exponentiations, and rotation requires about the same. So the high-level takeaway is that this is, by symmetric encryption standards, excruciatingly slow: about a thousand times slower than using something like AES-NI with AES-GCM, which is what you can instantiate AE-hybrid or KSS with. For short messages you can still do this in milliseconds, but for large bulk data this seems prohibitive at this point.
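To get a feel for the "L over 248" figure, here is a small back-of-the-envelope calculation. It only counts exponentiations using the constant quoted in the talk; the function name is made up, header overhead is ignored, and no timing is implied.

```python
BITS_PER_EXPONENTIATION = 248  # rough figure quoted in the talk

def approx_exponentiations(message_bytes: int) -> int:
    """Approximate exponentiation count to encrypt (or rotate) a message."""
    bits = 8 * message_bytes
    return -(-bits // BITS_PER_EXPONENTIATION)  # ceiling division

for label, size in [("1 KiB", 2**10), ("1 MiB", 2**20), ("1 GiB", 2**30)]:
    print(f"{label}: about {approx_exponentiations(size):,} exponentiations")
```

The count grows linearly with the message, so a rotation costs roughly as many group operations as the original encryption, rather than a constant amount of work on a short header.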
And it's kind of a cool open question whether this is fundamental to getting this key rotation property. Maybe we really need algebraic properties and can prove that you can't do better, or if not, maybe we can come up with something a bit faster.
Okay, so to summarize, we provided an in-depth formal treatment of key rotation for symmetric encryption. We introduced a bunch of new security notions that strengthen the prior work, and investigated these using a bunch of new and old schemes. The high-level takeaway is that some of these are fast and the other ones are slow. And finally, we have this embarrassing bug in our CTXT proofs, so please stay tuned for a fix to that, which we'll be posting shortly. Thank you very much.