All right. Thank you, everyone, for coming to my virtual Asiacrypt talk. I'll be talking today about improving speed and security in updatable encryption schemes. This paper is joint work with Dan Boneh, Sam Kim, and Maurice Shih. So the subject of this talk is key rotation. Now, what is key rotation? Key rotation is when you have some data that you've stored with a third-party cloud, encrypted under some key that you hold, and later you want to change that key so the data is encrypted under a different key. And the question is how you're going to do this. First of all, maybe we should answer why you would want to do this. There are many good reasons to rotate keys. You might want to retire keys that have been compromised. You might want to proactively refresh keys so that you replace them before you find out they've been compromised. It may be a form of access control enforcement, so that giving somebody a key is effectively equivalent to giving them access to your data for, say, one year. Or you might just want to follow the advice of others: key rotation is recommended by NIST, it's recommended by Google, and if you're in the payment card industry, their data security standards actually require it. So how are we going to do key rotation? Well, there are a couple of straightforward approaches that are not going to work out. The first thing you could do is send the old key as well as the new key to the cloud provider, and have the cloud provider decrypt the old data and re-encrypt it under the new key. This certainly works, but it provides no security against the cloud provider, so it won't work for us. A second thing you could do is download the data and re-encrypt it under a new key yourself. So you download it, decrypt it locally, encrypt it again under the new key, and then send it back. This works, and it is secure in the sense that the cloud provider doesn't learn your keys or your data.
But unfortunately, this comes with high communication and computation costs for the client, so it's also unsatisfactory. The question we've got to ask is: can we do better? In practice, what's often done, say, if you use AWS for your key management, is that a data encryption key is used to encrypt your data, and when it's time to rotate, they don't actually rotate the data encryption key. Instead, the data encryption key is stored in their key management service encrypted under a client master key, and they rotate the client master key. So the only thing that gets re-encrypted is the data encryption key itself, which is a small thing. Unfortunately, this doesn't actually rotate the key that encrypts your data. So if we want a stronger security guarantee, we're going to need a new scheme, and this is where a recent line of work on updatable encryption comes into play. Updatable encryption is a paradigm where, instead of sending the keys or sending all of the data back and forth, the client sends a small update token to the server. The server then uses this token to update the ciphertext to the new key without learning either of the keys or the data that's encrypted. So our contributions are, first, some improvements over prior security definitions. In particular, we add a new security requirement. In prior work, it was possible for ciphertexts to reveal their age. What I mean is that you could look at a ciphertext and tell whether it's a fresh encryption, or a ciphertext that's been re-encrypted once, or twice, or three times, or many times. So how many re-encryptions have happened is something that was previously revealed, and we provide definitions that enable hiding this information. Then we have two new constructions of updatable encryption.
One is from nested AES, which is very fast but only supports a bounded number of updates. So you need to know, at the time you first encrypt a ciphertext, how many times it's going to be re-encrypted. We have a second scheme from key homomorphic PRFs based on ring learning with errors. This scheme has some different trade-offs: it's 500 times faster than prior work, but still a little bit slower than our nested AES scheme. And at the end, I'll say a little bit about the performance evaluation we did on the implementation of our schemes and how they compare to prior work. So I want to start off by briefly talking about how we define security. There are four security properties that need to be achieved. There is correctness, which says that you can encrypt something, re-encrypt it to a different key, and it will still decrypt correctly. There's compactness, which is the requirement that the communication between the client and the server, and the client-side computation, be small or constant size. There is a confidentiality requirement, which I'll talk about more in a minute, and there's an integrity requirement, which mirrors standard integrity requirements. So I want to talk a little bit more about confidentiality. I'm not going to go into the details of the definition, but I want to use some pictures to give you a sense of what kind of security we can hope to achieve in this updatable encryption world and what kind of security is out of the question. So imagine we have a ciphertext that's encrypted under key three, which is one of a series of keys, and we're wondering what kinds of security updatable encryption can give us. Something that we know from standard encryption is that if the attacker has access to the key itself, then we cannot hope for any kind of security.
Something else that's true in the updatable encryption setting is that if the attacker has keys and update tokens that give it a path to the key used to encrypt a ciphertext, we also can't get any security. So if an attacker has key one, and also has update tokens to go to key two and then to key three, then we can't expect to hide the data. In our setting, it also works backwards. If an attacker knows key four and an update token to go from key three to key four, the attacker can take the data encrypted under key three, re-encrypt it to key four, and then use key four to decrypt. So when can we expect to get security from updatable encryption? We can expect to get security when the attacker does not have such a path. If the attacker, say, controls key one, and has an update token to go from key two to key three, and an update token to go from key three to key four, it still doesn't have a path to the key that we're using, so we can expect semantic security for our ciphertext. Our definitions additionally require hiding the ciphertext age from an attacker, so that an attacker looking at a ciphertext under key three can't tell whether it was initially encrypted under key three, or initially encrypted under key two and later updated to key three. Those two ciphertexts should look the same. Another caveat I should point out, and this is true of all updatable encryption schemes, is that the cloud must always be trusted not to keep old ciphertexts. When you've rotated the key and you have a ciphertext under the new key, the cloud has to discard the encryption under the old key. Otherwise, if someone gets their hands on the old data and the old key, they could use the old key to decrypt. And this is true of any updatable encryption scheme.
So I want to mention that all of our updatable encryption schemes are in what's called the ciphertext-dependent model. In this model, we break up a ciphertext into a constant-sized ciphertext header and a ciphertext body that's the actual encryption. When it's time to do a key update, the server first sends the ciphertext header back to the client. This doesn't violate our compactness requirement because the header is constant-sized, so it's small. Then the client uses that header to generate a re-key token, which is sent to the server, and the server uses that to do the re-encryption. This is called the ciphertext-dependent model, and all of our constructions are in it. All right, so our first construction is an updatable encryption scheme from nested AES. This scheme is very fast, and it's rather simple: it only requires authenticated encryption and a PRG, both of which can be built out of hardware-accelerated AES primitives. There are a couple of caveats with this scheme. The first is that it only works for a bounded number of re-encryptions. At the time that you produce a ciphertext, you need to know how many times it's going to be re-encrypted over its lifetime. This might not really be an issue if you're doing re-encryption for compliance, where you know ahead of time your data's going to live for 20 years and be re-encrypted once a year. If you're in a setting where you're re-encrypting your ciphertexts more frequently, this might be a less desirable property. Another caveat is that the decryption time will always be linear in the number of re-encryptions. Since we're doing nesting, when it comes time to decrypt, you need to unwrap all of the layers, and the number of layers depends on the number of re-encryptions. So how does the scheme work?
The ciphertexts in our scheme are split into two pieces, the body and the header, and the key that the client keeps is the header key, which decrypts the header. I should mention that both the body and the header are encrypted with an authenticated encryption scheme. So the client has a key that decrypts the header, and inside of the header there's a key that's used to encrypt the body. The body key is stored inside the header; the header key is stored with the client. So far, this is very simple. The next thing that's going to come up is that we're going to want to change the key. Say we're going to go from the black key to the green key. How are we going to do it? Well, the update that the client sends to the server is, first, a new body key, the green body key, and then a new ciphertext header, the green ciphertext header, which has inside of it the green body key as well as the black header key. This goes to the server. The server uses the green body key to wrap the entire previous ciphertext, and then it places the new header on top. If the client wants to rekey again, it does the same thing. It sends over the orange body key and the orange ciphertext header, and the server uses the orange body key to wrap the entire previous ciphertext. The new ciphertext header contains the orange body key as well as the green header key. So re-encryption always just involves wrapping the previous layer, and decryption involves unwrapping all of the layers one by one. The issue with the scheme we have so far is that it leaks the ciphertext age. Each re-encryption causes the ciphertext to grow, so the size of the ciphertext can tell you something about how old the ciphertext is and how many times it's been re-encrypted. I should point out that this is actually not a problem under the definitions from prior work.
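To make the wrapping mechanics concrete, here is a toy Python sketch of the nesting idea. This is not the paper's implementation: the authenticated encryption here is a stand-in built from SHA-256 and HMAC rather than hardware-accelerated AES, all function names are illustrative, and the age-hiding padding discussed next is omitted.

```python
import hashlib, hmac, os

def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Toy CTR-style keystream from a hash; stands in for AES-CTR.
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def ae_encrypt(key: bytes, pt: bytes) -> bytes:
    # Toy authenticated encryption: XOR with keystream, then HMAC tag.
    nonce = os.urandom(16)
    ct = bytes(a ^ b for a, b in zip(pt, _keystream(key, nonce, len(pt))))
    tag = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def ae_decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    if not hmac.compare_digest(tag, hmac.new(key, nonce + ct, hashlib.sha256).digest()):
        raise ValueError("authentication failed")
    return bytes(a ^ b for a, b in zip(ct, _keystream(key, nonce, len(ct))))

HDR = 16 + 64 + 32   # header size: nonce + (body key || prev header key) + tag
ZERO = b"\x00" * 32  # sentinel meaning "innermost layer, no previous header key"

def encrypt(header_key: bytes, msg: bytes):
    # Header holds the body key; body holds the message.
    body_key = os.urandom(32)
    return ae_encrypt(header_key, body_key + ZERO), ae_encrypt(body_key, msg)

def rekey(new_header_key: bytes, old_header_key: bytes):
    # Client side: fresh body key; new header stores it plus the old header key.
    new_body_key = os.urandom(32)
    return new_body_key, ae_encrypt(new_header_key, new_body_key + old_header_key)

def server_update(new_body_key, new_header, old_header, old_body):
    # Server wraps the entire previous ciphertext under the new body key.
    return new_header, ae_encrypt(new_body_key, old_header + old_body)

def decrypt(header_key, header, body):
    inner = ae_decrypt(header_key, header)
    body_key, prev_hk = inner[:32], inner[32:]
    layer = ae_decrypt(body_key, body)
    if prev_hk == ZERO:
        return layer                 # innermost layer is the plaintext
    return decrypt(prev_hk, layer[:HDR], layer[HDR:])   # unwrap one layer
```

Note how each `server_update` adds one layer, so decryption recurses through every re-encryption, matching the linear decryption cost mentioned earlier.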
Under the definitions of prior work, this scheme actually qualifies as secure. But we want to hide ciphertext age as well. So how are we going to do this? Well, the first idea is: we mentioned at the beginning that you need to know the number of re-encryptions up front. So how about you just pad every ciphertext up to a fixed maximum size, based on how many re-encryptions you expect, with random data? As long as your encryption scheme satisfies the property that ciphertexts look random, which is true for the encryption schemes we use, then you can't tell how old a ciphertext is. The problem is that this introduces an integrity issue: if we have all of this random padding, someone could flip bits in the random padding and we would be none the wiser, because our decryption process doesn't say anything about what to do with the padding. The way we get around this is that we generate the random data from a PRG instead of making it truly random, and we include the PRG seed in the header. So when it's time to decrypt, you first decrypt the outer header, then you expand out the PRG and check that the randomness in the ciphertext matches the randomness you'd expect from the PRG. If the check succeeds, you continue with decryption; if it fails, you output bottom. You can look in the paper for the full scheme. This is the gist of it; there are some additional details that deal with the peculiarities of our model, and you can read about those in the paper. All right. So I also want to talk about our second scheme, which is an updatable encryption scheme from key homomorphic PRFs. The properties of this scheme are that it supports as many re-encryptions as you could want, so you don't need to know ahead of time how many re-encryptions there will be, and decryption time does not depend on the number of re-encryptions.
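Going back for a moment to the age-hiding fix for the nested scheme: the PRG-padding check can be sketched in a few lines. This is only the core idea; the seed handling and the interaction with the header layers are simplified, and `shake_128` stands in for an AES-based PRG.

```python
import hashlib

def prg(seed: bytes, n: int) -> bytes:
    # Deterministic expansion of a short seed; stands in for an AES-CTR PRG.
    return hashlib.shake_128(seed).digest(n)

def pad_to_max(ct: bytes, seed: bytes, max_len: int) -> bytes:
    # Pad the ciphertext up to the fixed maximum size with PRG output,
    # so every ciphertext looks the same length regardless of its age.
    return ct + prg(seed, max_len - len(ct))

def check_and_strip(padded: bytes, seed: bytes, true_len: int):
    # On decryption, after the seed and true length are recovered from the
    # header, verify the padding before accepting the ciphertext.
    body, pad = padded[:true_len], padded[true_len:]
    if pad != prg(seed, len(pad)):
        return None   # output "bottom": the padding was tampered with
    return body
```

Because the padding is now a deterministic function of a seed stored under authenticated encryption, any bit-flip in the padded region is caught at decryption time.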
So it solves both of the caveats that came with the previous scheme. And it's actually still quite fast, although it is a little bit slower than the nested scheme, which approaches however fast you can run AES on your computer. This scheme does have a new caveat, which is that it comes with a somewhat weaker integrity guarantee, as well as a weaker age-hiding guarantee, and you can read more about that in the paper. All right. So the tool that we're going to use for this is a key homomorphic PRF. Recall that a standard PRF is one where you can evaluate it at some key k and an input x, and if you don't know the key k, this evaluation looks like a truly random value. A key homomorphic PRF has the exact same security property, but it has some new functionality: it has a homomorphism in the key space. If you take the PRF at some point x under key one, and you take it at x under key two, and you add those two evaluations together, this should be equivalent to adding the two keys together and then evaluating the PRF. One example of this in the random oracle model is the PRF where you hash the input and then raise it to the key. This has the key homomorphic property because if you multiply two of these together, the keys add up in the exponent. All right. So how are we going to build updatable encryption from key homomorphic PRFs? I want to say the construction I'm going to present is essentially a simplified version of the construction of Everspaugh et al. from 2017. So how does this construction work? First, we're going to have a ciphertext header as before, and the ciphertext header, just like before, is going to be an authenticated encryption. What goes inside is a hash of the message and a key homomorphic PRF key k1. And in the body, there's going to be an encryption of the message in counter mode using the key homomorphic PRF.
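The hash-then-exponentiate PRF just mentioned, F(k, x) = H(x)^k, can be demonstrated in a few lines. The prime below is a demo-sized Mersenne prime chosen only so the arithmetic is easy to run; it is nowhere near a real DDH group.

```python
import hashlib

P = 2**127 - 1   # Mersenne prime; demo-sized, NOT a cryptographically safe group
ORDER = P - 1    # keys are exponents, so the key space is Z_{P-1}

def H(x: bytes) -> int:
    # Hash the input into the group (random-oracle style, for illustration).
    return int.from_bytes(hashlib.sha256(x).digest(), "big") % P

def F(k: int, x: bytes) -> int:
    # Key homomorphic PRF under DDH (in the random oracle model): H(x)^k mod P.
    return pow(H(x), k, P)

# Key homomorphism: multiplying evaluations adds the keys in the exponent,
# i.e. F(k1, x) * F(k2, x) = H(x)^(k1+k2) = F(k1 + k2, x).
k1, k2 = 1234567, 7654321
x = b"block-0"
assert F(k1, x) * F(k2, x) % P == F((k1 + k2) % ORDER, x)
```

Here the group operation on outputs is multiplication mod P, and the corresponding operation on keys is addition of exponents, which is exactly the structure the updatable encryption construction exploits.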
So what does that look like? You break up the message into blocks, and each ciphertext block is the message block plus the evaluation of the key homomorphic PRF, under key k1, at the index of that block. So far, this actually looks a lot like our previous scheme. The only difference is that the body encryption is not an authenticated encryption; it's just a counter mode encryption, which is only CPA secure. But inside of the header, we've put the hash of the message. Now the difference comes when we want to do an update. In the update process, the client first downloads and decrypts the header so that it recovers the key k1. Then it picks a new key k2 and uploads the new header, which contains the same hash of the message and the key k2 inside of an authenticated encryption, as well as the update token k_update. And k_update is just the difference k2 minus k1. This is what the server uses to update the encryptions in the body. How does it do this? Each block of the new ciphertext is the old ciphertext block plus the evaluation of the PRF, under the update key, at the desired index. The reason this gives us an updated ciphertext is that we initially had a ciphertext block that was the message block plus an evaluation of the PRF under key k1. Now we've added the evaluation under k_update, which is equal, by the key homomorphism, to an evaluation under k2 minus an evaluation under k1. So the k1 evaluations cancel and you get the message block plus an evaluation under k2. All right, so something I should point out is that in the Everspaugh et al. scheme, they use a key homomorphic PRF based on the DDH assumption, which is the one I showed you before. In our work, we don't actually use a key homomorphic PRF. We use an almost key homomorphic PRF, a new one based on the ring LWE assumption that you can read about in the paper.
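The counter-mode body and the token-based update can be sketched with the DDH-style hash-then-exponentiate PRF. Since that PRF's output group is multiplicative, "plus" in the description becomes multiplication here; the header, the message hash, and the authenticated encryption are omitted, and the parameters are toy-sized. Function names are illustrative, not the paper's API.

```python
import hashlib

P = 2**127 - 1   # demo-sized Mersenne prime; not a real DDH group
ORDER = P - 1

def H(x: bytes) -> int:
    return int.from_bytes(hashlib.sha256(x).digest(), "big") % P

def F(k: int, x: bytes) -> int:
    # DDH-style key homomorphic PRF: F(k, x) = H(x)^k mod P.
    return pow(H(x), k, P)

def encrypt_body(k: int, blocks):
    # "Counter mode": mask block i (encoded as a group element) with F(k, i).
    return [m * F(k, str(i).encode()) % P for i, m in enumerate(blocks)]

def update_token(k_old: int, k_new: int) -> int:
    # k_update = k2 - k1 in the key space; this is all the client sends.
    return (k_new - k_old) % ORDER

def server_update(token: int, ct):
    # The server re-keys without seeing keys or plaintext:
    # c_i * F(delta, i) = m_i * F(k_old, i) * F(delta, i) = m_i * F(k_new, i).
    return [c * F(token, str(i).encode()) % P for i, c in enumerate(ct)]

def decrypt_body(k: int, ct):
    # Strip the mask by multiplying with the modular inverse of F(k, i).
    return [c * pow(F(k, str(i).encode()), -1, P) % P for i, c in enumerate(ct)]
```

The k1 masks cancel exactly as described: after applying the token, every block is masked under k2 alone, so only k2 is needed to decrypt.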
So what's an almost key homomorphic PRF? It's the same as a key homomorphic PRF, but the homomorphism in the key space is noisy. The sum of evaluations isn't exactly equal to the evaluation under the sum of the keys; it's equal up to some small noise. You can look in the paper to see how we deal with this, but the main issue is that if you add noise, it's going to break the correctness of your scheme. How are we going to get around this? In the end, we use a kind of padding. You pad the lower-order bits of each message block, and because the noise isn't going to be that big, when you decrypt you can drop the noise and recover the whole message. It turns out the reward for doing this is that you get performance that's about 500 times faster than using the DDH-based exactly key homomorphic PRF. So a question that might come up here is why lattice crypto is so useful in this scheme. Why does it get us a 500x speedup? The reason is that usually when we use lattice crypto, there's a trade-off: the algebraic operations needed to do the crypto are much faster than, say, elliptic curve operations, but on the other hand, the ciphertexts and the public keys are much bigger. The observation we made here is that, first, this is symmetric crypto, so there are no public keys to worry about. And second, we're using the lattice-based PRF in counter mode, so we don't actually need the entire output of the PRF; we just need the piece that's relevant for us. As a result, we don't pay the price for large lattice outputs. We just pay the price for however large our message is, because it's counter mode after all. The only additional cost, and it's not nothing, is the cost of this padding, which is very small, relatively speaking. And that's why lattice crypto is really good for this setting.
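Here is a toy sketch of how the low-order padding absorbs the noise. The "almost" PRF below is a rounding-based (LWR-style) stand-in, not the paper's ring-LWE PRF: rounding k*A(x) from a big modulus Q down to a small modulus q makes the key homomorphism hold only up to an additive error of at most 1 per addition. The message sits in the high bits, so decryption can round the accumulated noise away. All names and parameters are illustrative.

```python
import hashlib

Q = 2**64   # large inner modulus (keys live in Z_Q); toy-sized parameters
q = 2**32   # ciphertext modulus
R = 8       # low-order bits reserved as padding for noise

def A(x: bytes) -> int:
    # Public pseudorandom value per input (random-oracle style, toy).
    return int.from_bytes(hashlib.sha256(x).digest(), "big") % Q

def F(k: int, x: bytes) -> int:
    # Rounding-based "almost" key homomorphic PRF:
    # F(k1, x) + F(k2, x) = F(k1 + k2, x) + e (mod q) with |e| <= 1.
    return (k * A(x) % Q) * q // Q

def encrypt_block(k: int, i: int, m: int) -> int:
    # Shift the message into the high bits so later noise lands in the padding.
    return ((m << R) + F(k, str(i).encode())) % q

def update_block(token: int, i: int, c: int) -> int:
    # Server-side update; each update deposits a little noise in the low bits.
    return (c + F(token, str(i).encode())) % q

def decrypt_block(k: int, i: int, c: int) -> int:
    v = (c - F(k, str(i).encode())) % q
    # Round away the accumulated noise sitting in the R padding bits.
    return ((v + (1 << (R - 1))) % q) >> R
```

With R padding bits, roughly 2^(R-1) updates' worth of unit noise can be absorbed before correctness breaks, which is the ciphertext-expansion trade-off discussed in the evaluation.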
So I want to tell you a little bit about our evaluation before wrapping up. First, we measured the throughput for encrypting and re-encrypting a batch of 32-kilobyte messages, reported in megabytes per second. We measured these numbers for ReCrypt, which is the Everspaugh et al. construction with an exactly key homomorphic PRF; for our construction with the almost key homomorphic PRF; and for our nested construction, set with parameters for 128 layers of nesting. You can see that for both encryption and re-encryption, the prior work can encrypt something like hundreds of kilobytes per second. If you switch to the almost key homomorphic PRF, we get a 500 times improvement and we're encrypting tens of megabytes per second. And when we switch to the nested scheme, we're actually starting to encrypt gigabytes per second; it's another 30x faster. This is, of course, due to hardware-accelerated AES: the encryption and re-encryption speed is just however fast your computer can evaluate AES. The story for decryption is a little bit more involved, because decryption in the nested scheme gets slower the more times you re-encrypt. Something I want to point out is that all of the times on this graph are in microseconds, so regardless of which scheme you pick, it's going to be quite fast. But we found that the nested construction is faster than the almost key homomorphic PRF construction for up to about 50 re-encryptions; after that, if you're doing a lot of decryptions, you're better off using the almost key homomorphic PRF construction. I'm not showing the ReCrypt scheme on this graph because it's 500 times slower than the key homomorphic PRF construction, and I wanted to make sure the graph shows the trade-off between the two new schemes. One last thing I should mention is ciphertext expansion.
So I mentioned that we have this padding, which incurs some additional space cost, and it turns out that this is where we trade off against the ReCrypt construction. Because we have to account for the noise, we have to pad plaintext blocks with zeros before encrypting. Depending on the choice of parameters in the key homomorphic PRF scheme and the nested scheme, we can get different ciphertext expansion trade-offs. With the nested scheme, if you only want to re-encrypt about 20 times, which is what you'd expect if you're rotating keys for compliance and your ciphertext is going to live for, say, 20 years and be re-encrypted once a year, then you actually get ciphertext expansion that's comparable to ReCrypt. If you want to re-encrypt, say, 128 times or some larger number of times, then your ciphertext expansion is going to look more like the key homomorphic PRF scheme. For that one, the expansion ranges from 20% to over 100% depending on your choice of parameters. For our evaluation, I think we used q equals 60, so you're getting about a one-third blow-up in ciphertext size. So this has been our work on improving updatable encryption. I talked briefly about our improved security definitions for updatable encryption. I presented our two new constructions and our ring LWE-based almost key homomorphic PRF. And I've shown how we've gotten orders-of-magnitude performance improvements over prior work. If you want to read more about the details of our work, the paper is online on ePrint, as is the source code for our schemes and our evaluation. And if you have any questions that you're not able to ask me after this talk, feel free to contact me at saba@cs.stanford.edu. Thank you so much for your time.