Thank you, Kenny. Good afternoon, everyone. Thanks for being here on this beautiful, beautiful New York day. So I'm Trevor Perrin. I'm going to be talking to you this session about the classical use case of cryptography, which is of course the encryption of text messages. This is where cryptography started thousands of years ago. It's the only thing it did for most of its history, and it remains a challenging and important use case even today. So I'm going to be giving a sort of sweeping overview of this history and the evolution of message encryption systems, and talk about how we got to where we are today and some of the challenges we have to tackle for the future. I'm going to be doing this from a couple of perspectives. I'm going to be talking first about key management and key distribution, and then secondly zooming in and looking a little bit at the protocols that these keys are then used within. So to dive into it and look at the history of key management and message encryption systems, I think we have to first say a little bit about cryptography. You know, if we go back in time to before the 20th century, cryptography really had not been changing all that quickly for the previous 400 or so years. People would use manual ciphers, which tended to be either not very secure or hard to use or both, or people would use various sorts of codebooks. Those were easier to use, but they were a lot of effort to create, because someone would have to compile a list of hundreds or thousands of substitutions, maybe reverse-index them, and then print all this out and distribute it. So if you think of codebooks as sort of the keys of the time, at this point keys were hard to even create. So you'd make one of them, you'd give it to all of your diplomats, they'd just use the same one for years, and that had all the sorts of security problems you can imagine.
That was sort of okay, because cryptography was not terribly important at this point in time. But coming into the 20th century we got radio and the era of world wars, and the importance of cryptography sort of exploded. There was a huge amount of innovation and research; it was sort of the industrial revolution for cryptography. So by the time of World War II we had cipher machines as sort of the workhorses of military crypto. There was still a lot of use of manual ciphers and codebooks for lower-level or tactical traffic, and then at high levels and for certain point-to-point communications there was use of one-time pads, say for high-level diplomatic or military traffic, or for spies who didn't want to carry around cipher machines. So at this point there wasn't a sort of unified military messaging system. There were a lot of different systems that you could use, but the military case was compartmented and hierarchical anyway, with someone passing messages up and down the chain of command. So that was, you know, sort of okay, or at least what they had. The key management underlying this was simple. Someone created keys and gave them to everyone. So there'd be a center. If Alice and Bob wanted to talk, the center would have to give each of them the same key, and then they'd be able to communicate. So this was a rigid, centrally controlled system, but this is what they had. Moving on to the first decades of the Cold War, cryptography remained a very secretive and military activity. But enough has been declassified that I think we can kind of see the picture. And it's about the cipher machines taking over and becoming electronic; use of one-time pads or codebooks or other things falling off except for special use cases; and symmetric crypto, at least, assuming a pretty modern kind of appearance with small keys, random nonces, and strong security models.
But key distribution and key management really didn't change much in this period. The center created keys and gave them to you, and you'd use one key a day, and so on. It remained a rigid system, and it remained a big logistical hassle. There's a lot of risk here, because there's a lot of opportunity for someone to copy a key list and sell it, or to steal keys, and these were very real threats. So in the 1980s, a lot of attention shifted towards ways of using symmetric keys to protect other symmetric keys. There's not a great single term for this, so I'm going to just call it symmetric infrastructure. And the basic idea is pretty simple. It's just that every party has a unique key. They register it with the center. And then if the center wants to group users together to communicate, it can just choose a group key, encrypt that group key to each user's unique key, and then send them a ticket, which is just an encryption of the group key under a unique key. That sort of evolved in the context of radio systems, where it was known as over-the-air rekeying. It's a pretty reasonable way to manage small radio networks. You can add users to groups, you can rekey groups, and you can revoke users by giving everyone else new keys. And this is still how a lot of voice radios for military or public safety systems are managed today. But it doesn't get us to a large-scale messaging system where anyone can send anyone a message, unless we put the KDC online so Alice can contact it and say, hey, give me tickets for Bob. Of course, that's the Kerberos model. And in this time frame the U.S. government built secure phone systems also using the same model. But it's pretty awkward to have to reach out and contact an online KDC every time you want to send a message or have a phone call with someone.
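To make the ticketing mechanics concrete, here's a toy sketch in Python. The names (`KDC`, `make_group`) are mine, not from any real system, and the hash-derived pad stands in for a real authenticated cipher like AES-GCM:

```python
import os, hashlib

def wrap(user_key: bytes, group_key: bytes) -> bytes:
    """Toy key-wrap: XOR the group key with a hash-derived pad.
    A real system would use an authenticated cipher instead."""
    nonce = os.urandom(16)
    pad = hashlib.sha256(user_key + nonce).digest()
    return nonce + bytes(a ^ b for a, b in zip(group_key, pad))

def unwrap(user_key: bytes, ticket: bytes) -> bytes:
    nonce, ct = ticket[:16], ticket[16:]
    pad = hashlib.sha256(user_key + nonce).digest()
    return bytes(a ^ b for a, b in zip(ct, pad))

class KDC:
    """The 'center': it knows every user's unique registered key."""
    def __init__(self):
        self.user_keys = {}

    def register(self, name: str) -> bytes:
        self.user_keys[name] = os.urandom(32)
        return self.user_keys[name]

    def make_group(self, members):
        """Choose a fresh group key and encrypt it to each member's
        unique key -- one 'ticket' per member."""
        group_key = os.urandom(32)
        return {m: wrap(self.user_keys[m], group_key) for m in members}

kdc = KDC()
alice_key, bob_key = kdc.register("alice"), kdc.register("bob")
tickets = kdc.make_group(["alice", "bob"])
# Each member recovers the same group key from their own ticket.
assert unwrap(alice_key, tickets["alice"]) == unwrap(bob_key, tickets["bob"])
```

Revocation in this model is just issuing a new group key and handing out tickets to everyone except the revoked user.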
You know, it's a performance hit, it's a reliability problem, and it's a security problem if the online KDC gets compromised. So by the 1990s, public key cryptography was mature enough that attention shifted to using PKI. And of course, public key crypto was a revolutionary breakthrough in cryptography, but for key management, the systems were pretty continuous with, or had a lot of similarities to, what came before. So we can look at a PKI as sort of a two-part system. Just like in a symmetric infrastructure, you register your unique key, except it's a public key instead of a symmetric key. And then the center is going to push out information that lets people communicate with each other, in the form of certificates now instead of tickets, and probably revocation information. But these certificates are different from tickets, because they're not issued per group of users, giving them a single key; they're per individual. So the PKI can push out a single certificate per user, put them in a directory, and allow any users to choose to communicate. And now, all of a sudden, we have a flexible system where anyone can communicate with anyone else. That's fairly exciting. That's a big advance. As this idea was popularized in the 90s, the internet was taking off and people were enthusiastic about this. An industry sprang up around it. There was sort of the idea that military applications would drive government use, which would drive big enterprise use, which would drive little enterprise use. Everyone would be deploying PKIs. And you could go further. You could have the certificate authorities issue certificates to each other, to express how much or what they trusted each other for, and then connect the whole system into sort of a global PKI mesh. And then anyone would be able to have secure communication with anyone.
And so that was a very exciting vision to some people. It was a scary vision to other people. The 1990s were the era of the crypto wars, and there was a lot of controversy around what would happen when all this stuff was ubiquitous. But as the decade progressed, it became kind of apparent that this was not just an exciting or scary vision. It was a pretty unrealistic one, because this PKI mesh was just not coming into existence, even when we tried other variations of the concept, like saying, well, what if we take it to a logical extreme, let everyone issue certificates into the directory, and follow a PGP-like web-of-trust model? None of these things really took off. And it's probably for a few reasons. It's probably because building a key management system is a lot of work. Interacting with it as a user is a lot of work. People are only going to do these things when they're strongly motivated, and the systems they build are going to be for specific contexts; they're not going to be easy to connect together. Also, people are not going to want to publish all of their user information into global directories. People are not going to want to publish their entire social network into global directories. And that's even setting aside the fact that there aren't really languages you could use to fully describe the world of human trust relationships anyway. So for a lot of reasons, the expansive vision of PKI never really took off. It was a useful technology in a lot of niches, a lot of important niches, such as the web. But even in enterprises that deployed PKIs, it got a lot more use for things like single sign-on than encrypted messaging. So the 2000s were sort of an era of demoralization, and people sort of worked on other things. We had tried a lot of PKI stuff.
You know, there were some other variations on the concept, like identity-based encryption, that didn't really transform the situation either. But in the current decade, I think there's been some movement again, for a couple of reasons combining together. Firstly, there's just the increased public awareness of people's vulnerability to massive data breaches and mass surveillance. Secondly, there's the emergence of mobile messaging apps, which are managed by some company and have acquired huge user bases. And of course, bullet point number two dramatically increases everyone's worry about bullet point number one, because you're like, well, what if these apps get breached? Are they spying on us? But at the same time, bullet point number two gives us the ability to do something about bullet point number one, because these apps are managed by a company that has an engineering team that's used to pushing out features and competing with all the other messaging apps for a fickle audience who can easily switch apps. So increasingly in the last few years, we've seen a number of these companies look at all the sensitive data that they're handling and that's being stored on their disks, and realize that that's kind of scary for them, it's scary for their user base, and they would like to just encrypt it. And so apps like iMessage and then WhatsApp have gone over to default message encryption systems, where the endpoints generate public keys, the server distributes those public keys to other people as needed, and then the server only sees ciphertext. So that's a good development. It buys you security against the server doing passive monitoring or retroactive decryption. It leaves some big open questions, such as: how do you know the server really gave you the key honestly and is not doing an active man-in-the-middle attack? And a number of different approaches are being taken here.
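A minimal sketch of that baseline architecture, with invented names (`KeyServer`, `encrypt_to`) and a deliberately toy Diffie-Hellman group standing in for something like X25519; the point is only the data flow, in which private keys and plaintext never touch the server:

```python
import hashlib, secrets

# Toy DH group, far too small for real use; illustration only.
P, G = 2**127 - 1, 3

def keypair():
    priv = secrets.randbelow(P - 2) + 1
    return priv, pow(G, priv, P)

def shared_key(priv, their_pub) -> bytes:
    return hashlib.sha256(pow(their_pub, priv, P).to_bytes(16, "big")).digest()

def xor(data: bytes, key: bytes) -> bytes:
    # Toy stream cipher; a real app would use an AEAD here.
    pad = hashlib.sha256(key).digest() * (len(data) // 32 + 1)
    return bytes(a ^ b for a, b in zip(data, pad))

class KeyServer:
    """The server's whole job: hand out public keys and relay
    ciphertext. It never holds a private key or a plaintext."""
    def __init__(self):
        self.directory, self.mailboxes = {}, {}

    def upload_key(self, user, pub):
        self.directory[user] = pub

    def send(self, to, blob):
        self.mailboxes.setdefault(to, []).append(blob)

def encrypt_to(server, recipient, plaintext):
    eph_priv, eph_pub = keypair()  # fresh ephemeral sender key
    k = shared_key(eph_priv, server.directory[recipient])
    server.send(recipient, (eph_pub, xor(plaintext, k)))

# Endpoints generate their own keys; only public halves go up.
server = KeyServer()
bob_priv, bob_pub = keypair()
server.upload_key("bob", bob_pub)
encrypt_to(server, "bob", b"hello bob")
eph_pub, ct = server.mailboxes["bob"][0]
assert xor(ct, shared_key(bob_priv, eph_pub)) == b"hello bob"
```

The open question in the talk is visible right in `encrypt_to`: the sender takes whatever public key the directory returns, so a malicious server could substitute its own.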
The most common is something like what Signal is doing, allowing some sort of out-of-band authentication to be done. So in Signal's case, the conversation has a safety number, which is just a concatenation of public key fingerprints. But users don't know what public key fingerprints are, so it's presented as a safety number, just a single kind of token of information. It's easy to compare safety numbers just by doing a single QR code scan, and you get a yes-or-no answer. You can also convey the safety number through a different channel, and we try to make that easier: if you click on it, a bunch of ways pop up to send it through another messaging app or an email system or whatever. So hopefully the man in the middle isn't in the middle of all of those things too. Most people are not going to do this, but some people will. And if some people do it, and if the system doesn't know who those people are, and it shouldn't, then there's risk in man-in-the-middle attacking anyone. So a smaller number of people performing these checks, you would hope, keeps the larger system honest. There are other approaches. One of the most interesting is the certificate transparency idea, which has been adapted to messaging as the CONIKS concept, where the server publishes a mapping of usernames to public keys. Bob can check and see if there are any unexpected mappings in the public database, and the server can also prove to Alice, when it gives her a key, that it was a key that was published. That's an idea that a bunch of people are working on. I hope it goes forward and becomes a reality. You can imagine enterprises deploying trusted directories as well, which people within their employee base publish their keys into and look up keys from. So there's a variety of things you can do.
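As a rough illustration of the safety-number idea (this is not Signal's actual derivation, which iterates the hash thousands of times and mixes in version and identifier info), both sides can reduce the pair of public keys to one comparable string like this:

```python
import hashlib

def fingerprint(pub_key: bytes, digits: int = 30) -> str:
    """Reduce a public key to a fixed-length numeric string by
    hashing it and taking the result modulo 10^digits."""
    h = hashlib.sha512(pub_key).digest()
    return str(int.from_bytes(h, "big") % 10**digits).zfill(digits)

def safety_number(my_pub: bytes, their_pub: bytes) -> str:
    """Concatenate the two fingerprints in sorted order, so both
    parties derive the identical string to compare out of band."""
    return "".join(sorted([fingerprint(my_pub), fingerprint(their_pub)]))

a, b = b"alice-public-key", b"bob-public-key"
assert safety_number(a, b) == safety_number(b, a)  # both sides agree
```

Because the string is the same on both ends, comparing it over any channel the attacker doesn't control (a QR scan in person, another app, email) detects a substituted key.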
And if we step back and look at this as a model, I think these all kind of follow what I'd describe as an encrypt-then-authenticate model, which is different from a lot of the previous things I was describing, which are more of an authenticate-then-encrypt model. That's a distinction I think we can see at two different levels in these systems. If we look at the user experience of new messaging systems, you use them and you just get encryption for free. It just happens. It's a zero-cost thing. The encryption is there. And then you can spend extra effort if you want to add extra authentication. So from that perspective, they're an encrypt-then-authenticate system. That's very different from a lot of the other things we talked about. Traditionally, if you want to do encryption, you have to register with the center. You're provisioned with a key or a certificate or a ticket, and then that gives you the ability to go and do encryption. So there's just kind of a reversal of that relationship here. If you look at how these systems are being built, engineering-wise, we see a similar thing. People building these messaging apps are building out the encryption layer first, working out all the mechanics of distributing keys, handling group and multi-device communications and so on, and then thinking about how to refine the authentication experience, how to add out-of-band checks and these other more complicated systems on top. Which is very different from how people tried to tackle things like PKI, where the model was: you start off, you build a PKI. You choose a certificate authority, practice statements, revocation authorities, registration authorities, deploy this whole infrastructure, provision everyone with smart cards and keys, and then you build applications on top of it.
There, the authentication is the baseline, the foundation and the framework. These new systems are saying encryption is the foundation. That's what we start with, because that's what we know how to do a little bit better. There's a lot of mechanical problems there we can solve to figure out how practical it even is. And then we're going to work on building these different sorts of authentication mechanisms on top of that platform. So is this a better model? Is it a worse model? I think it's neither; it's just a model that reflects the world changing. The authenticate-then-encrypt model, the traditional kind of key management model, grew up in the 20th century, rooted in a military use case, around a single root of trust. If you're in the U.S. Army, the U.S. Army defines your security policy; it tells you how to authenticate. We're now seeing consumer applications. If you're a WhatsApp user, there's no clear way to say who's defining your security policy, how you should authenticate. That's a question that probably needs to be left to these many different users and user communities. So there's a difference there. I think the traditional models also emerged out of a symmetric crypto world, where you had to distribute keys and do authentication before using them, because that's simply how symmetric crypto works. With public key crypto, you have more options. You can do a forward-secure key agreement and then check afterwards that the keys are authentic. And the traditional systems emerged out of a world where you're protecting the radio, or some similarly simple communications medium. So there was no concept that the communication medium you're using, the servers you're using, might actually be able to help you distribute keys while themselves being removed from the trust boundary.
That was an alien world if you're just trying to protect radio traffic. So I think the world has changed a lot. To draw a broad picture: the 20th century systems evolved in kind of a continuity, in a world where that made sense, and kind of culminated in PKI. But with public key crypto, and with the way new messaging systems work, we're seeing kind of an inversion in how these systems are built. And I think a lot of future systems are going to follow those lines and are going to be defined in terms of just building out encryption layers and then trying to build authentication on top of them. And I think that's a direction we should broadly try to work in: thinking about how we can just encrypt more things and move out horizontally, and then build more elaborate authentication sort of vertically upon that. So that's kind of my pontificating about key management. Let's talk a little bit about the protocols that some of these messaging systems are using. So in the 1990s, when people started applying public key crypto for real, it seemed pretty obvious that there were two classes of protocols. You either sign and encrypt messages, like PGP or S/MIME, or you negotiate a session via some handshake protocol, like TLS or SSH, and then use the negotiated session key. That can give you some better things, like forward secrecy, but it has a cost of interaction. In the 2000s, OTR came along and applied the concepts of an interactive protocol to interactive text messaging, and added some clever ideas. It added the idea of a deniable key agreement, where you can arrive at a key, but you're not actually signing every single message or signing a statement of who you're communicating with.
So you're not producing a record that can be used against you to incriminate you. I think that's a good idea. OTR also pioneered the idea that after you've done this key agreement stage, you then do something that Adam Langley ended up calling a ratcheting protocol, where you just continually update your keys as you send messages. So you're continually upgrading your forward secrecy as you communicate. And that looks something like this: whenever either party sends a message, they send a Diffie-Hellman public key. And when they receive the other person's public key, they replace theirs, and then their counterparty receives their public key and replaces that. So there's just a ping-ponging of public keys, which defines sort of this upgrading sequence of new Diffie-Hellman secrets that you just use for encryption. So it's a nice way to just kind of inline an upgrading of forward secrecy as you communicate. TextSecure, which became the Signal protocol, was worked on in 2013-14. It started with OTR as the base and then tried to adapt it to an asynchronous case with longer-lived sessions. And so there were a few adaptations. One was to change the key agreement so that it did not have to be interactive, by allowing the recipient of a message to publish some of what we called pre-keys, which were like the first messages in an AKE protocol. So you publish a bunch of pre-keys to a server, and when someone wants to do the handshake with you, the handshake is already sort of half done. They just have to retrieve those messages from the server and complete the AKE. At that point, they have a shared secret; once you get the first message, then you have a session. At the same time, we wanted to use these sessions for longer periods of time.
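That ping-pong of public keys can be sketched roughly like this. It's a toy (the tiny DH group and the class name `Ratchet` are mine; real implementations use X25519 and mix each new secret into a chain of keys rather than using it directly), but it shows how each message's fresh keypair keeps upgrading the shared secret:

```python
import hashlib, secrets

# Toy DH group, far too small for real use; illustration only.
P, G = 2**127 - 1, 3

def dh_keypair():
    priv = secrets.randbelow(P - 2) + 1
    return priv, pow(G, priv, P)

def dh(priv, their_pub) -> bytes:
    return hashlib.sha256(pow(their_pub, priv, P).to_bytes(16, "big")).digest()

class Ratchet:
    """Each outgoing message carries a fresh DH public key, and
    each incoming one replaces the stored peer key, so the shared
    secret keeps changing as the conversation goes on."""
    def __init__(self):
        self.priv, self.pub = dh_keypair()
        self.peer_pub = None  # learned from the handshake

    def send(self):
        # Generate a new keypair for this message and derive the
        # secret from it plus the peer's latest public key.
        self.priv, self.pub = dh_keypair()
        return self.pub, dh(self.priv, self.peer_pub)

    def recv(self, their_pub):
        # Replace the stored peer key and derive the same secret.
        self.peer_pub = their_pub
        return dh(self.priv, their_pub)

alice, bob = Ratchet(), Ratchet()
# Initial public keys are exchanged during the handshake.
alice.peer_pub, bob.peer_pub = bob.pub, alice.pub

pub, k1 = alice.send()       # Alice ratchets forward
assert bob.recv(pub) == k1   # Bob derives the same secret
pub, k2 = bob.send()         # Bob ratchets; a new secret again
assert alice.recv(pub) == k2
assert k1 != k2              # forward secrecy keeps upgrading
```

Compromising a current private key doesn't reveal the earlier secrets, because those were derived from keypairs that have already been discarded.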
So we wanted to extend the ratcheting so that if you were to send a bunch of messages in sequence, you would continue replacing the keys with every message sent and received, instead of just replacing them with every round trip. So we added an additional degree of symmetric-key ratcheting on top of the DH ratcheting, and that ended up being kind of the core two-party element of the Signal protocol. Of course, people don't want just two-party messaging. They want multi-party messaging. And so what you'll typically see in the messaging space is people building their multi-party messaging on top of the two-party protocol. So you just send multiple messages to people to communicate to a group. To make that more efficient, you might often want to do things such as what Signal called sender keys, where you send a single symmetric key to everyone in a group and then use that symmetric key for future messages, and also give them an ephemeral signing key that you're going to sign those messages with, so that the server can do server-side fan-out of messages. So you can build some more efficient multi-party protocols on top of the two-party protocol. And then you can build multi-device protocols on top of multi-party, by just treating every device as a different party to be communicated with. So I think that's a pretty common stack that Signal is following, and that a lot of other things are following: take a two-party framework, build multi-party on top of that, build multi-device on top of multi-party. There are still open questions within this framework. For example, managing the sort of metadata in a multi-party communication: who's in the group, who's allowed to modify who's in the group, is that information stored centrally or is it distributed? Systems are making different decisions there.
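The symmetric-key ratcheting layer is essentially a one-way chain of KDF calls. A rough sketch (the labels and class name are hypothetical; Signal's actual Double Ratchet specifies its own HMAC/HKDF inputs):

```python
import hashlib, hmac

class SymmetricRatchet:
    """A hash-chain ratchet: each message key is derived from the
    current chain key, and the chain key is then advanced and the
    old one discarded, so past message keys can't be recomputed."""
    def __init__(self, chain_key: bytes):
        self.chain_key = chain_key

    def next_message_key(self) -> bytes:
        msg_key = hmac.new(self.chain_key, b"message", hashlib.sha256).digest()
        # Advance the chain and forget the old chain key.
        self.chain_key = hmac.new(self.chain_key, b"chain", hashlib.sha256).digest()
        return msg_key

root = b"\x01" * 32  # stand-in for a secret from the DH ratchet
sender, receiver = SymmetricRatchet(root), SymmetricRatchet(root)
keys_a = [sender.next_message_key() for _ in range(3)]
keys_b = [receiver.next_message_key() for _ in range(3)]
assert keys_a == keys_b       # both sides derive the same sequence
assert len(set(keys_a)) == 3  # and every message key is fresh
```

Because HMAC is one-way, stealing the current chain key doesn't let an attacker recover the message keys that came before it; the DH ratchet then periodically replaces the chain key entirely.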
In the multi-device case, there are some interesting design questions: do all the different devices that represent me share the same key, because I've synchronized it across devices, or do they all have different keys, which makes the authentication more challenging because I'm no longer represented by a single key? I'm a cloud of keys, so I need to reflect that in the authentication somehow. So there are a number of kind of mechanical questions being worked out here. I think the predominant question for these sorts of systems, the one that really bugs me and is really challenging, is how you deal with things like metadata, and how we can do a better job of not leaking huge amounts of metadata. That's going to have to be a talk for another time, and no one's really doing a great job of that yet. So let me finish with that and take some questions. Yeah, we do have time for questions. Let me start with one, if I may. You had this slide with the stack, and you're saying that Signal and other systems are going from this two-party to the multi-party setting. Is it possible that you could get better security and efficiency by going straight for a multi-party protocol from the off? That's a good question, and I think there's certainly work on things like multi-party key agreements in the form of ring-based protocols like Burmester-Desmedt, I'm probably not pronouncing that right, or KLL. Those tend to be pretty interactive protocols, so they don't, I believe, work terribly well in an asynchronous setting. I think with those protocols also, once you try to add authentication to them, you're going to end up doing pairwise authentication anyway, so you're not going to get a lot of benefit versus just a sender keys model, where the sender just initially sends one key to a bunch of people and then uses it going forward. So I don't think those are going to give you a lot of improvement. Yesterday there was a talk that outlined using the sponge construction for ratcheting.
I don't know if you have any thoughts about that approach. I think you could insert that at probably a lower level here, because there's this problem of absorbing a bunch of key material, right? And the sponge is kind of a natural fit for that. And that's an interesting thing that you see: in a lot of protocols nowadays, like TLS 1.3 or other things I've worked on, there are a lot of these kinds of multi-stage key agreement protocols that evolve the keys. So there are interesting questions about how you use HKDF to absorb the keys in a really good way, or whether a sponge is maybe a more natural fit for that. And I think those are good questions that I'd like to see more people think about: what are the actual security models for these kinds of symmetric key constructions, and what are maybe more optimized ways of doing them? You mentioned this movement from an authenticate-then-encrypt towards this encrypt-then-authenticate model, and I'm wondering how much of that you think is based on the lowering of cost around the client generating its own key, for example, versus other factors, and what you think those other factors might be? I mean, if you mean the cost of just generating a key, the cost of generating a key is nothing, because keys are cheap. There are a lot of other costs with authentication that are not generating the key. If you want the user to really be involved in the authentication, they're going to have to do something, and that's either use some other system, or choose to use some other system that's not the default system, or do some sort of out-of-band check. So I think that authentication does bring a lot of costs, and those costs are in user interaction or user trust decisions, more than anything really mechanical.
And I do think that eliminating those costs is a lot of why systems are deploying encryption as kind of the baseline, and then authentication in a more end-to-end fashion is being layered on top of that. Yeah, I just wanted to say that this is really based on, for example, mobile devices: early mobile devices didn't really have the capability of generating keys in any good way, and over time you've watched the cost of actually doing that decrease, right? So now you can actually usefully generate a reasonable key on a mobile device, whereas you previously couldn't do that. Yeah, that's very true. I mean, I think it's hard to disentangle that from the fact that we also just use our mobile devices a lot more, and we're using social networks, and we want different things. So there are all those factors in the mobile space happening at once. But it's true that the devices are very powerful nowadays, and basic cryptography, doing simple authenticated key agreement, is really not a challenge at all. It's very cheap. Thank you. With end-to-end encryption in Signal, did you have any problems with abuse or with spam? Signal is maybe not popular enough to be wildly abused or wildly spammed. I think other systems have probably dealt with those problems more than we have. I think Facebook's done some interesting things around the franking mechanism, as a way to demonstrate to the server that you received an abusive message, which they can then take action upon. So I think that's a good mechanism to look at. And these systems still have a lot of metadata available to them, in terms of who's sending messages and the rate at which messages are being sent. So that's a lot of metadata that you can use for spam prevention and anti-abuse.
I think that problem gets a lot worse if you try to move towards metadata-hiding systems and more anonymous systems, where you start needing things like anonymous networks, anonymous credentials, things like that. I think the spam and abuse problem becomes much worse there and much harder to deal with. That's a very open research question.