Hi, I'm Joseph Jaeger. I'm here to talk about ratcheted encryption and key exchange: the security of messaging. This is joint work with Mihir Bellare, Asha Camper Singh, Maya Nyayapati, and Igor Stepanovs.
So there's growing recognition that ordinary people really should have end-to-end secured communication. After 30 or 40 years of crypto, we have plenty of theory telling us how to build such communication systems. But what tools do we have in practice that might actually be usable? SMS text messaging is typically not secured in this way. Various social media websites don't do end-to-end encryption. Email isn't encrypted by default; you can try to use something like PGP, but that's generally accepted to be somewhat of a pain and hard to use correctly. TLS isn't really appropriate to use here: we're not web servers with certificates, just ordinary people.
Secure messaging apps seem to be a very good answer these days. Here are a few examples of messaging apps. There are many more out there, and these are somewhat widely used. WhatsApp, for example, reports encrypting 55 billion messages a day. So this leaves us with an obvious question: how secure are these messaging apps? This is a big question. Answering it thoroughly would really require a systems-level perspective, worrying about much more than just the crypto involved. We're not trying to look at things that exhaustively; we're just going to look at the crypto, and in particular at one component technique used by some of these apps, called ratcheting. Ratcheting is used by these four.
So before we get to that, quickly: in reality, to have any sort of end-to-end encrypted communication, you need to start with some sort of authenticated key exchange. In practice with TLS, the authentication comes from, say, certificates. In the messaging setting, it may come from some trusted server or from some sort of out-of-band verification.
For our work, we're not going to worry about that, because ratcheting is more about how the keys are used and updated than about how they are initially produced. So we're going to just abstract this away as some algorithm called initial key generation, which we trust to give us the keys that we need.
In the traditional way to use the key, we would essentially just have a single shared symmetric key, which we use to encrypt and decrypt our messages for the entirety of our conversation. Where this becomes a problem is in a setting where you're worried about some sort of compromise of your key. Say an adversary is able to break into your phone, maybe with some sort of malware. If they steal your key in this setting, they can read all of your messages forever: all your past messages, all your future messages, everything.
How can we hope to avoid this? Well, we can try to update our key over time, and this is what ratcheting is all about. Generally speaking, the way we think of ratcheting is that we have a shared symmetric key K1, and then periodically throughout the conversation the parties send update messages to each other, based on which they update, or ratchet, the keys forward to produce K2 and later K3. What we hope for here is that if the adversary steals one of the keys, say K1, then when the key is later updated and the parties are communicating using K2, knowledge of K1 won't help the adversary at all in reading the encrypted messages. Informally, the two goals here are forward security and backward security. Backward security is the one I just described. Forward security is the other direction: if somebody is able to steal K2, we want that to not help them read anything that was encrypted using K1.
Exactly what threats are being protected against by this ratcheting does need careful consideration, because if the threat you're worried about is some sort of persistent malware that can just sit on your phone and exfiltrate all your messages, ratcheting isn't really adding anything to help you there. Where ratcheting might be more useful is if, say, the malware is only able to exfiltrate keys and its presence is somehow being limited by the software security that you have. As we saw in the previous talks, these security notions can also be considered for something like TLS. One reason why they seem to be particularly interesting in the messaging setting, though, is that chat conversations can stay open for long periods of time, so there's this longer period of time to be worried about.
So some quick history. The techniques underlying ratcheting were originally used in this paper by Borisov, Goldberg, and Brewer, in which they introduced the Off-the-Record communication system. This picture on the right here, or left, gives a quick sketch of roughly how the ratcheting works there. They use a Diffie-Hellman group together with a hash function to update the keys over time. The term ratcheting itself didn't come until later, with Adam Langley and the Pond protocol. There was also recently a survey article, a systematization of knowledge on messaging apps, which noted that there are many claims in the literature, including forward secrecy, backward secrecy, self-healing, and future secrecy, but that these terms are controversial and vague in the literature.
So I can talk about what we try to do in our work now. First, we tried to lift ratcheting from just being a technique used in practice to a distinct cryptographic primitive. In fact, we formalized two versions of the primitive. First we have ratcheted key exchange, which solely deals with the updating of the keys over time.
And then ratcheted encryption deals with this updating of keys over time in addition to processing encrypted data with the keys. What do I mean by formalized? Well, two parts: first, defining some sort of syntax, so as to actually define what these objects are that we're talking about, and then providing strong game-based security definitions, specifying exactly what security we're hoping to achieve.
Of the two primitives I discussed, the ultimate goal is, of course, ratcheted encryption; our whole goal here is to have encryption to talk to each other. To bootstrap us along the way there, we show a generic way to compile a ratcheted key exchange protocol together with an AEAD encryption scheme to build ratcheted encryption. Because of that, we only have to focus on building a ratcheted key exchange protocol, so of course we do so in our work. And then a protocol by itself isn't useful unless you're able to show that it's secure. So we provide a proof that the ratcheted key exchange protocol we designed is secure: it achieves our strong security notion under the strong computational Diffie-Hellman assumption in the random oracle model.
A few caveats and remarks before I dig into the details of those contributions. Messaging apps in practice use what we call double, two-sided ratcheting. To abstract out the core of what ratcheting is about, in our work we just treated single, one-sided ratcheting. In the single, one-sided ratcheting model, we think of one of our parties as strictly being the sender and the other party as being the receiver. One important thing to emphasize here is that in this model we can't hope for security against the exposure of the receiver's secrets. If an adversary gets the receiver's secrets, all security will be lost. We're only providing these more advanced security notions against the compromise of the sender's secrets.
In this isolation of the core ratcheting technique, the protocols we give and make proofs about, while inspired by the in-use protocols, are not exactly identical to any of them, and they have some important distinctions. And again, we only treated ratcheting by itself. There's this great other work from 2017 which treats the Signal protocol in particular, in more generality than just the ratcheting component.
Okay, so let's get into our formalism. What do I mean by ratcheted key exchange? We're going to think of both the sender and receiver as having three keys: a static key, which never changes throughout the duration of the protocol; session keys, which get updated with every ratchet; and then the symmetric keys, which are the keys actually output by the protocol to be used. The initial key generation just spits out those keys for both parties. Then we have sender key generation, which performs the actual ratchet on the sender side. It will update the session key, it will produce the new symmetric key, and it produces some sort of update information to send over to the receiver. The receiver then, given the update information, can perform the same ratchet to obtain the same shared symmetric key. And just to emphasize here, the state stored by the two parties, which might be vulnerable to exposures, is exactly those three keys. But again, we're only looking at exposures of the sender state. The receiver state is going to be assumed secure against exposure.
Okay, so we give a definition, key indistinguishability, of what the security we're hoping for is. In this model, we think of the adversary as having complete adversarial control of the communication, which is represented by ratchet-sending and ratchet-receiving oracles. In addition to that, it's allowed at any time to ask to have the secrets stored by the sender exposed to it. This is given by an expose oracle.
Given these different powers, the goal of the adversary then is to distinguish the keys that were produced by the protocol from truly random strings. Those are given by the challenge-receiver and challenge-sender oracles over there.
In this setting, where you have to worry about the various secrets of parties being exposed, it actually takes quite a lot of care to write security definitions that are achievable. When you give out the secret information, there will be some inherent attacks that you just cannot possibly hope to avoid, so you have to write your games to make sure those aren't counted. Two examples here. First, if the adversary exposes a secret key K, clearly that key will no longer look random to them. In our game, we just keep track, in a table called op, of whether an adversary has exposed or challenged on a particular round, and we don't let it do both. Second, if the adversary exposes the information from the sender, they would necessarily be able to produce their own update information to send over to the receiver, and then they would be able to know the key that the receiver generates. So again, we have to just allow that, and we set this flag called restricted, which keeps track of whether the adversary used an exposure to forge update information. In that case, we break the challenge-receiver oracle, and it will only return the real key and never the randomness.
So we showed these three algorithms for ratcheted key exchange. Ratcheted encryption has those three algorithms again and just augments them with encryption and decryption algorithms in a nonce-based vein. So how does it work there? Encryption takes in a key produced by sender key generation, together with a nonce, message, and header, to produce a ciphertext. Then decryption, given that ciphertext, nonce, and header, and the key from receiver key generation, decrypts and obtains back the underlying message.
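The bookkeeping for trivial attacks in the key exchange game described above can be sketched roughly as follows. This is an illustrative simplification and not the game from the paper: the class name `TrivialAttackTracker` and its method names are hypothetical, and a conflicting query is simply refused here, which is one possible way to "not let it do both".

```python
class TrivialAttackTracker:
    """Hypothetical sketch of the `op` table and `restricted` flag."""

    def __init__(self):
        self.op = {}             # round index -> "exp" or "ch"
        self.restricted = False  # set once an exposure was used to forge

    def expose(self, round_index):
        """Record an exposure; refuse if this round was already challenged."""
        if self.op.get(round_index) == "ch":
            return False
        self.op[round_index] = "exp"
        return True

    def challenge(self, round_index):
        """Record a challenge; refuse if this round was already exposed."""
        if self.op.get(round_index) == "exp":
            return False
        self.op[round_index] = "ch"
        return True

    def receive_update(self, round_index, update, honest_update):
        """Flag the game as restricted if a forged update follows an exposure."""
        if update != honest_update and self.op.get(round_index) == "exp":
            self.restricted = True

    def receiver_challenge_returns_real_key_only(self):
        """Once restricted, the receiver challenge never returns randomness."""
        return self.restricted
```

For example, after `expose(1)` succeeds, `challenge(1)` is refused, and a forged update for round 1 sets the `restricted` flag.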
Again, we have a security game. Again, the adversary has complete control of the communication between the parties and is allowed to expose the sender's secrets at any time. The difference comes in what the adversary is trying to do to win. Now, instead of distinguishing keys from random, it has access to both encryption and decryption oracles. It tries to distinguish between the valid output of encryption and random strings of the appropriate length, or it tries to distinguish between the output of decryption and an oracle which just always rejects. As before, there are various subtleties dealing with when the adversary exposes, so as not to allow it to do trivial attacks, and we address them using techniques similar to those in the prior definition.
Okay, now that we know what our primitives are, how do we do this generic compilation? Well, it's somewhat intuitive. You're given the three algorithms from ratcheted key exchange. You're given an AEAD scheme, which has encryption and decryption. These five algorithms are exactly the sorts of algorithms that we needed for ratcheted encryption. You just plug them into the appropriate places. And of course, we have to prove that secure. We're able to prove that this ratcheted encryption protocol will achieve the desired security notion, assuming that the underlying key exchange achieves our KIND security and that the AEAD encryption scheme is multi-user secure. The details behind that: we have some adversary against the encryption protocol, we build adversaries against the underlying components, and we can obtain this shown relationship on their advantages.
Okay, so now that we know how to do that generic compilation, we just have to worry about how you build ratcheted key exchange. The three components that we use in the protocol will be a Diffie-Hellman group G, a hash function H, and a message authentication code F.
The initial key generation algorithm is going to just pick a random symmetric key K0, pick a random MAC key to give to both parties, and initialize a counter to zero. It also picks a random group generator g. For the static key of the receiver, it picks a secret exponent, and it gives the corresponding public value as the static key for the sender. Then, every time we want to perform a ratchet, the sender will create its own secret exponent, from which it creates the public value to send to the receiver. To authenticate this value, it uses the MAC. And then we apply the hash function to the counter, the tag, the value we're sending over, and this part here, the magic sauce that we're hoping is hard for an adversary to guess. We hash them all together to get both the next symmetric key and the next key for our MAC. The receiver then, given this update information, simply checks that the tag is correct and, assuming it is, can obtain the same shared keys using the hash function.
In our paper, we give attacks on a number of variants of this protocol as I've described it. The motivation here is that it both shows why the various parts of our protocol were needed, and it also helps to elucidate what attacks are actually being disallowed when we show that we achieve our security. First we look at various situations where we don't use the MAC and try to be secure anyway. The ultimate result there is that we have this generic attack showing that any scheme without some form of authenticity can necessarily be attacked. We also look at what we call key reuse and key collision attacks, which are attacks where an adversary tries to reuse the secret information across different rounds so that the receiver will repeat keys and lose security.
Okay. Let me quickly mention how we proved our security.
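As an aside before the proof, the ratchet step described above can be sketched in Python. This is a minimal sketch under loud assumptions, not the protocol as specified in the paper: the modulus `P` is a toy 127-bit prime that is far too small for real use, the base `G` is fixed instead of randomly chosen, SHA-512 and HMAC-SHA-256 stand in for the hash function H and the MAC F, the "magic sauce" hash input is assumed to be the Diffie-Hellman value, and all function and field names are hypothetical.

```python
import hashlib
import hmac
import secrets

P = 2**127 - 1  # toy prime modulus (assumption), NOT secure
G = 3           # fixed base (assumption); the talk picks g at random


def _int_bytes(n):
    return n.to_bytes(16, "big")


def _derive(counter, tag, gx, shared):
    """Hash counter, tag, g^x and the shared value into (session key, MAC key)."""
    data = counter.to_bytes(8, "big") + tag + _int_bytes(gx) + _int_bytes(shared)
    digest = hashlib.sha512(data).digest()
    return digest[:32], digest[32:]


def init_keys():
    """Initial key generation: shared K0, shared MAC key, counter, static keys."""
    k0 = secrets.token_bytes(32)
    mac_key = secrets.token_bytes(32)
    y = secrets.randbelow(P - 3) + 2  # receiver's static secret exponent
    sender = {"static": pow(G, y, P), "mac": mac_key, "ctr": 0, "key": k0}
    receiver = {"static": y, "mac": mac_key, "ctr": 0, "key": k0}
    return sender, receiver


def sender_ratchet(s):
    """Sender picks x, authenticates g^x, and ratchets its keys forward."""
    x = secrets.randbelow(P - 3) + 2
    gx = pow(G, x, P)
    tag = hmac.new(s["mac"], _int_bytes(gx), hashlib.sha256).digest()
    shared = pow(s["static"], x, P)  # (g^y)^x
    s["key"], s["mac"] = _derive(s["ctr"], tag, gx, shared)
    s["ctr"] += 1
    return gx, tag  # update information for the receiver


def receiver_ratchet(r, update):
    """Receiver checks the tag and, if valid, derives the same keys."""
    gx, tag = update
    expected = hmac.new(r["mac"], _int_bytes(gx), hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False  # reject; state is left unchanged
    shared = pow(gx, r["static"], P)  # (g^x)^y
    r["key"], r["mac"] = _derive(r["ctr"], tag, gx, shared)
    r["ctr"] += 1
    return True
```

After `sender, receiver = init_keys()`, each call to `sender_ratchet(sender)` yields update information that `receiver_ratchet(receiver, update)` accepts, leaving both sides with the same `key`; an update whose group element has been modified fails the tag check.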
The security is proven assuming that the group G and hash function are such that ODHE is hard and that the MAC is unforgeable under chosen message attacks. ODHE is oracle Diffie-Hellman with exposures; I will discuss that in a moment. Again, we prove it with a reduction: given an adversary D against the key exchange protocol, we build adversaries against the underlying group and hash function and against the MAC, giving that advantage bound.
To very quickly sketch the proof: the proof required really quite a careful eye for detail, because for very subtle reasons the first things you would want to do to prove it secure don't quite work. Anyway, the first step of the proof involved a hybrid argument in which we argue that the adversary won't be able to forge update information without having exposed. How this works is that we first have to preemptively switch the MAC keys to look random, so that we can argue forgeries won't happen. But then we have to switch them back to no longer be random in case the adversary exposes. Then, having done this, we can in one fell swoop switch all of the non-exposed keys to random.
Now let me just finish by saying what this oracle Diffie-Hellman with exposures thing was. It's a multi-query variant, with exposures, of the oracle Diffie-Hellman assumption originally introduced by Abdalla, Bellare, and Rogaway. It's closely related to PRF-ODH, which we will be hearing about momentarily. In our model here, there's a fixed value g^y, and the adversary can ask for as many values g^x to be created as it wants. The adversary, as many times as it wants, can ask for the underlying x value of any of these g^x's to be exposed. It has oracle access to a hash function with that secret value y embedded into it.
And then the goal of the adversary is to distinguish between the output of the hash function and randomness, assuming that it hasn't exposed that x. As with the other two definitions, because of these exposures there are various subtleties about trivial attacks we have to prevent, and we address those.
Quickly: as Abdalla, Bellare, and Rogaway did in the original work, we're able to show that this oracle Diffie-Hellman with exposures can be reduced to the strong computational Diffie-Hellman assumption in the random oracle model. We in fact do this twice in our paper. First we do it with the first proof one would think of doing, a straightforward index-guessing sort of proof. But then, because there's some inherent non-tightness in that proof technique, we redo the reduction using a more clever rewinding argument so that we can obtain a tighter bound on the security.
Thanks for listening.