Okay, so welcome everybody to the 35th Eurocrypt, this year in Vienna. Thanks for coming, and thanks to everybody for organizing, and to the program committee and program chairs for selecting such an exclusive program. I will keep myself short because I will give another talk tonight at the cocktail reception in the town hall. You have the welcome pack, so you have a plan of how to get there and everything; I hope I will see every one of you there. There will also be the awards ceremony then. A few other things that I'd like to mention: there is a cloakroom, so if you have suitcases or jackets or anything, you can leave them there. There is a Google calendar on the web page, so if you want to plan which talks to attend, apparently that's very useful, I've heard. Thanks to Pooya for doing that. And, yeah, without further ado, I would like to hand over to the program chair.

All right. Good morning everyone. Thanks, Christoph. So this year we actually decided to have even more talks than last year. Last year we had 57 talks; this year we went up to 62 talks, distributed again over two parallel tracks. And last year I got quite some heat for my coin tossing at the beginning of the conference. The coin tossing was about which track, the real one or the ideal one, goes to which room. So this year I decided not to repeat this; instead I tossed a coin privately for myself and just decided that the real track goes in this room and the other track is in the room one floor down. But, as last year, we start each morning with the invited speakers and the best paper awards, so we have a plenary session here every morning. On Tuesday there is the best paper session, on Wednesday we have Bart Preneel, who will give the IACR Distinguished Lecture, and on Thursday we have Christian Kohlback, who will talk about obfuscation. And this morning I'm very happy to have Karthik Bhargavan here, who will present his recent work about TLS. So I think we can just jump right into the program and start with the invited talk. Maybe Karthik needs to set up.

Okay, so while Karthik is preparing: Karthik Bhargavan is currently a director of research at Inria. He did his PhD at the University of Pennsylvania, then he joined Microsoft Research before he joined Inria. He holds one of the ERC grants, the prestigious European research grants. And he's currently very active in the TLS 1.3 standardization, very active in terms of input to the standardization process, e-mail discussions, sometimes following those discussions, and a lot of posts there. But he's not only contributing to the standardization itself; he's also very successful in terms of research about TLS. For example, he won this year's NDSS distinguished paper award for his work, last year the distinguished paper award at IEEE Security and Privacy, and, for example, a best paper award at the USENIX Workshop on Offensive Technologies. So these are just a few of his recent success stories, and I'm very happy that today he'll talk about some work related to TLS. Thank you.

Thanks so much, Marc, for that kind introduction. So today I'll be talking about some work that my team and I have been doing for several years now in the context of transport layer security, or TLS. This work is co-authored with many, many authors; in fact, we've been accused of having too many authors on our papers. So I'm just going to suggest that you go read the papers to give credit to the many co-authors.
So we're going to talk about TLS, which is one of the most popular protocols out there. And popular cryptographic protocols like TLS, SSH, and IPsec evolve. That's one of their characteristics, which means that there's version one at one date and then there's version two, or there is a new feature, like elliptic curves, that you want to include in the protocol, so you add that in, or there's an old hash function that is now considered obsolete, so you want to get rid of it and put a new hash function in. So in various ways these protocols evolve, and how do you manage a protocol that is actually deployed but is evolving? You do this using protocol agility, which means that clients and servers will actually support a wide range of protocol modes, and between themselves they'll negotiate the best shared thing that they have in common, and they will use that. When it works, agility can be a great thing. It means that you can continue to interoperate with, be backwards compatible with, old clients and servers, but at the same time you can offer the best fancy new features to new clients and servers. But when it doesn't work, it's kind of a disaster. Because one temptation of an agile protocol is that you can just leave the old legacy crypto in there forever and ever, because it feels innocuous: nobody's using it anymore, but it can still stay in there. But as you'll see, if you do that, you're kind of opening yourself up to attacks. In particular the kind of attack called a downgrade attack, where even if you have a new client and a new server, an attacker might be able to convince them to roll back to an old legacy cipher, which it can then break. So if you've been following the news on crypto in the last couple of years, you'll have noticed there's a whole bunch of attacks that have come out on TLS, and they have hit the news quite prominently, partly because they all have cute names, but also because they usually affect quite a few servers and clients. So if you go down that list, you'll see that a lot of the attacks are on really old legacy things that you might be surprised are still there. Why am I talking about an attack on RC4 in 2016, or on a PKCS#1 encryption padding oracle, which is like a 1998 problem? Why am I talking about export ciphers, which were obsoleted in 2000? So in this talk I'll try to give you a feel for where these kinds of attacks come from, and also for why TLS was unable to prevent downgrades to some of these legacy things even though it actually has a downgrade protection sub-protocol. And more importantly, we'll try to see how we can fix it in future versions of the protocol, like TLS 1.3. How do we prevent these kinds of attacks from coming back? So in particular the ones in red, Logjam and SLOTH, are what I'll be describing in more detail in this talk, but I'll give you a flavor of the rest of them. So a little bit of history. The transport layer security protocol was first called SSL, and I think the first known document about it is from 1994. Since then there have been many versions. The most recent version was standardized in 2008; it's called TLS 1.2. And this year we're hoping to have the next version, TLS 1.3, be standardized. Over its lifetime this has become more or less the go-to protocol any time you want a secure channel on the web.
So most prominently, of course, it's used in HTTPS and websites and so on, but it's also used in a bunch of places you may not be aware of, like Wi-Fi, VPNs, and also for server-to-server communications and so on. The other aspect is that this protocol has many, many implementations; it's actually a poster child for interoperability. It has implementations that you may not even know you're using. If you're using an Apple device, you're using something called Secure Transport. If you're using Firefox, you're using NSS. If you're using Chrome, you're using something called BoringSSL. If you're using any server, you're probably using OpenSSL. If you're using Microsoft devices, you're using SChannel. And the great thing about the design of the protocol, and the way it handles this functionality, is that although these implementations implement different subsets of the features of this protocol, they're still able to interoperate with each other on the common subsets. The flip side is that every year we see many attacks, like the list I showed you, on both the protocol but more commonly on the implementations, and these always result in critical vulnerabilities. But there have also been concerted efforts towards the provable security of things like TLS, and there are many papers published every year on proofs of security for various modes of TLS. So we find ourselves in a funny kind of quandary. On one side, we have proofs of security. On the other side, we have attacks. And there is a gap between those. What we are proving clearly does not cover the attacks, and what is being attacked clearly uses features that we should be covering in our proofs. So we'll talk a little bit more about this gap as we go along. So what does the protocol itself do? Well, in this particular talk, I'm not really going to give you many details of TLS; that's quite boring. Rather, I'm going to look at it from a high level, and whenever we want to look at crypto details, I'll look at simplified sub-protocols wherever possible. So the TLS protocol has four phases. The first phase, which I'm going to call hello, is the negotiation phase. This is where the client and server basically agree: this is the version we will use, this is the cipher suite we'll use, this is the key exchange, these are the elliptic curves, whatever. So they agree upon what they're going to be using. Once they have finished negotiation, they actually jump into an authenticated key exchange, depending on whatever they negotiated before. Once the authenticated key exchange is complete, they have a session key in hand. They have authenticated each other, or at least the client has authenticated the server. And then they move into the finished phase, which is the confirmation phase, really. They use the session key to compute MACs over the entire transcript of the handshake so far, so that they can confirm to each other that each of them knows the key and that they have matching conversations. And once this is complete, finally, they can start exchanging data in both directions. So if you look at it at this high level of detail, if we abstract it away, you can observe that TLS, when you look at it like this, is more like a protocol framework than a protocol itself. If you develop a new key exchange protocol, I can plug it into the second layer up there. If you give me a new authenticated encryption scheme, I can plug it into the last layer up there, and the protocol will still just continue to function.
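To make the framework view concrete, here is a minimal Python sketch of the four phases (a sketch only; the mode strings, the placeholder key, and the helper names are all illustrative, not the real TLS message formats):

```python
import hmac, hashlib

def negotiate(client_offers, server_supports):
    """Phase 1 (hello): pick the client's most-preferred mode the server supports."""
    for mode in client_offers:
        if mode in server_supports:
            return mode
    raise ValueError("no common mode")

def finished_mac(session_key, transcript):
    """Phase 3 (finished): MAC over the entire handshake transcript so far."""
    return hmac.new(session_key, transcript, hashlib.sha256).digest()

# Phase 2 (the authenticated key exchange) and phase 4 (record encryption)
# plug in behind the negotiated mode: any key exchange that yields a session
# key and any AEAD scheme will do, which is why TLS behaves like a protocol
# framework rather than a single protocol.
mode = negotiate(["TLS1.2-ECDHE-AESGCM", "TLS1.0-RSA-AESCBC"],
                 {"TLS1.2-ECDHE-AESGCM", "TLS1.0-RSA-AESCBC"})
key = b"session key from the negotiated key exchange"  # placeholder value
print(mode, finished_mac(key, b"hello||key-exchange messages").hex()[:16])
```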
And that's really the secret of how TLS negotiation and agility works. So if you take a typical TLS client and a TLS server, like your web browser or whatever, and you go and look at what they actually support, you'll find that each of them supports three to five versions of TLS, which I'm listing in decreasing order of preference, usually, with the newest one first. Each supports several kinds of key exchanges, including ones which are forward secret, ones which are not forward secret, and ones which use pre-shared keys for embedded devices and so on, with many kinds of authentication modes, and it also supports different kinds of authenticated encryption schemes, some of which are generic schemes and some of which are very specific to the way they're used in TLS. So the idea, of course, is that over time the ones towards the left, the most preferred ones, will get more and more popular, and the ones towards the right will drop off; they will just go away. In practice, that's not what happens. In practice, things drop off from the right only when you find a concrete attack against the protocol. I mean, that's sort of what happens. So these things on the right just tend to stay in there until somebody says, you know what, there is a serious enough attack, get rid of it. But even if you just look at the most commonly used combinations of these things, you'll immediately note that there are actually hundreds of combinations that you can put together between TLS clients and TLS servers, like web browsers and web servers, nothing special. So that gives you an idea of the trickiness of analyzing this protocol. Even if you manage to do a proof for one particular cipher suite in TLS, how are you going to do proofs for hundreds of combinations? That's much harder. But even for a single instance, there are lots of challenges in the way we have to analyze real-world protocols like TLS, and lots of people in this room have actually addressed this problem. It's worth expanding on that a little bit. So here's a cipher suite which used to be quite popular. It isn't very popular anymore, it's quite old, but it's also the mandatory cipher suite in TLS 1.2. So any TLS 1.2 implementation has to implement it; that is, it has to support it. It doesn't have to prefer it, but it has to support it. If you look at that cipher suite, it breaks down into two parts. On the left-hand side, we have the key exchange part, which in this case is RSA key transport. What that means is that the client is going to generate a session key and is going to encrypt it for the server and send it. This encryption uses RSA PKCS#1 v1.5 encryption, which, as many of you might know, was shown already in 1998 to be vulnerable to a class of attacks which we now call Bleichenbacher attacks. And in response, all the implementations of TLS, and even the specification, mandate countermeasures against this attack. But even though this RSA encryption scheme was quite well understood even in the 1990s, getting a proof of the precise way it is used inside TLS took a really long time. I think the first credible proof of this actually appears only in 2013, due to many people in the room: Kenny, Hugo, and Hoeteck. But even that doesn't end the story, right? Because this year, you might have noticed, there's a new attack called DROWN, which is also an attack on TLS-RSA.
But this attack actually uses a downgrade attack plus an implementation bug in order to break this scheme. And of course, the crypto proof from 2013 doesn't cover implementation bugs or downgrade attacks. So the story keeps going on like this. On the right-hand side, we have the authenticated encryption scheme that is going to be used to actually encrypt the application data. That particular one uses a combination of AES-CBC plus HMAC, but not the usual one that you might prefer. It uses a particular scheme called MAC-encode-encrypt, which is very peculiar and specific to TLS, at least in the way it's implemented. Already in 2002 and 2003 there were attacks against this scheme, padding oracle attacks, and there were a bunch of countermeasures proposed against them. The first crypto proof that actually takes care of all the details of the scheme, as used in TLS, appeared only in 2011. So we're talking like 15, 16 years after the protocol was first standardized, right? But even after that, there have been attacks on implementations and downgrade attacks like POODLE and implementation attacks like Lucky 13. And finally, this gap has been closed in 2016 with the best paper award this year at FSE, where the authors show how to actually verify an implementation of MAC-encode-encrypt, as well as prove that it doesn't have any side channels. So the reason I'm showing you this little bit of history is to show you how these things evolve. There is a long lag between the time a feature is added to the protocol and the time at which cryptographers are able to actually do a proof of the scheme as it is implemented, because the scheme as it is implemented in the protocol is quite different from high-level crypto specs. And even after the crypto proof has been done, there is another big time lag before we can be sure that the implementations are implementing all the countermeasures correctly and are doing things right. And it is in these two gaps, between the protocol and the crypto proof, and between the crypto proof and the implementation verification, that a bunch of these attacks start coming up. So there are two things I'd like to focus on. One of them is that there is actually a modeling gap, I think, between crypto proofs and real-world protocols like TLS. This is most easily seen in the two examples I showed you before: if you take a textbook crypto proof of how to do MAC-encode-encrypt or RSA encryption, it doesn't really apply to the way it's used in TLS, because TLS uses classical constructs, but in non-standard ways. So you need protocol-specific assumptions, protocol-specific definitions, and new proofs. And indeed, over the last decade or so, there's been a concerted effort by a bunch of researchers, and now we finally have these definitions for TLS, for the key exchange, for the record layer. We actually have the definitions that we have to meet. It took a long time. The other aspect of this gap between what cryptographers think of as the proof and what actually is implemented is that there is a gap in the attacks as well. What you might think of as a theoretical attack may not always be exploitable, or at least not immediately exploitable. The general rule of thumb is: if you find a theoretical attack on any construct, in about 10 years it becomes a serious attack on a TLS implementation, so exactly after people have forgotten about it. And this actually is a problem, because practitioners currently only respond to practical attacks.
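To make the MAC-encode-encrypt construction concrete, here is a minimal Python sketch of the encode step as TLS-style CBC records use it (a simplification; the record header and sequence number are omitted, and the key is a stand-in):

```python
import hmac, hashlib, os

def mee_encode(plaintext: bytes, mac_key: bytes, block: int = 16) -> bytes:
    """MAC the plaintext, append the tag, then add TLS-style padding:
    padlen + 1 bytes, each equal to padlen. The result is what CBC
    encryption is then applied to."""
    tag = hmac.new(mac_key, plaintext, hashlib.sha1).digest()  # 20 bytes
    body = plaintext + tag
    padlen = block - (len(body) + 1) % block
    return body + bytes([padlen]) * (padlen + 1)

# The detail that padding-oracle attacks exploit: the padding is checked
# after decryption but is NOT covered by the MAC, so an attacker who can
# distinguish a padding failure from a MAC failure learns plaintext bytes.
record = mee_encode(b"GET / HTTP/1.1", os.urandom(20))
assert len(record) % 16 == 0
```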
So if you show them a theoretical attack on SHA-1, but the way it is actually used in TLS means the attack is not easy to mount, they'll say, well, it's okay, we'll keep going with SHA-1 until you show me a real attack on the protocol. And this is, I think, a communication gap between cryptographers, what they consider attacks, and what practitioners consider attacks. And this is actually one of the big contributions of that long line of attacks I showed you in the previous slide: it is actually closing this gap by showing that, in fact, these theoretical attacks can be used to break the security. So do get rid of these legacy crypto schemes when you can. But it gives you an idea of why these legacy crypto schemes hang around for as long as they do. The other gap, which is more specific to agility, is what I would call the protocol composition gap. Most crypto proofs, because we're doing them by hand, are going to be for single constructs of the protocol, like the RSA key exchange, the DHE or PSK key exchange, or the record layer encryption scheme and so on. But they do not say anything about what happens when all of these are put together. Well, sometimes they do, but really not for arbitrary compositions. But many of the attacks that we are concerned about, and that were in my list, only appear when you compose things together. It's only when you compose an old protocol version and a new protocol version that you're going to discover downgrade attacks. It's only when you compose RSA signatures and RSA encryption using the same key that you're going to discover cross-protocol attacks. Similarly, even for implementations, it's only when you start considering different protocols and how they are composed together that you discover state machine flaws that can actually completely bypass the protocol. And the issue here, of course, is that it's hard enough to do a crypto proof for one particular scheme; if you want me to do a crypto proof for arbitrary compositions of these schemes, that's pretty hard to do by hand. So in my team, in our work, we advocate the use of automated verification tools that can scale up to actually handling various compositions of these protocols, and real-world protocols like TLS. Our main project, our flagship project, is called miTLS. Well, some of us pronounce it "my TLS", but nobody agrees on how to pronounce it. The basic idea is: instead of trying to verify models of the protocol and trying to make the models as precise and close to the protocol as possible, let's just try to verify, straight on, a reference implementation of the protocol. Since it's an implementation and it has to interoperate with other implementations, it has to deal with the nitty-gritty low-level details of the protocol; it cannot ignore them, okay? So we wrote this implementation, which covers about three protocol versions and dozens of cipher suites, though not nearly as big as the full TLS protocol. And we specified its security as a multi-cipher-suite, multi-handshake secure channel protocol using a technique called dependent types. And we were able to verify both the security of the protocol and the correctness of the implementation in one system by using type checking that relies on an external SMT solver for automation. And this kind of effort, since you need to understand the underlying crypto assumptions, as well as how the verification tool works, as well as how programs work, is necessarily a joint effort by a large team.
So it's an ongoing effort; some of us are program verification folks, some of us are protocol analysts, some of us are cryptographers, and we all get together to try to build this implementation. You can get the current version of the implementation, but I would suggest that you wait another month, and we'll have the TLS 1.3 version of this code, which should be much more fun to play with. But in the course of doing these proofs for miTLS, we also discovered attacks, because, I would argue, a lot of these attacks only appear when you start to do compositional proofs of various parts of the protocol and try to plug them together. For example, in 2014, we found the triple handshake attack, which only appears if you compose three different handshakes of TLS in sequence. Nobody had actually considered all three of them in the same system before; once you do, the attack appears. In 2015, we found a class of attacks on implementations called state machine attacks, including one called FREAK. These are all bugs in the composite state machines that implementations have to implement if they're going to do both ECDHE and RSA and PSK in one implementation. How do you compose these together? And there were bugs in that. But in today's talk, I'm going to be talking about the last two kinds of attacks, which are both downgrade attacks on TLS. One is called Logjam, and one is called SLOTH. And we'll see how these exemplify the kinds of things that go wrong when protocols are trying to negotiate strong and weak ciphers at the same time in TLS. So again, I'm going to try to stay very far away from the details of the TLS protocol, because I think the key ideas can be expressed in much simpler, smaller, exemplary protocols. And the most exemplary protocol is from Crypto 101: anonymous Diffie-Hellman. So Alice and Bob are doing anonymous Diffie-Hellman: g^x, g^y, g^xy, and that's as simple as it gets. And as everybody in this room knows, there is a standard man-in-the-middle attack on anonymous Diffie-Hellman, where an active network attacker can tamper with the g^x on one side and the g^y on the other side, and then controls the keys on both connections. Well, yes, this is Eurocrypt, and I'm showing anonymous Diffie-Hellman on the slide. And this is because I think a lot of the attacks I'm going to show you follow exactly this pattern. In other words, the downgrade attacks I'm going to show you are going to take very fancy protocols but actually reduce them to more or less the same security as anonymous Diffie-Hellman. So the standard way of fixing these man-in-the-middle attacks is to use authenticated Diffie-Hellman. And a classic example of this genre is the SIGMA protocol, where the basic idea is that both the client and the server have some private keys, and they're going to use these to both sign and MAC the protocol transcript in some combination. What gets signed and what gets MACed differs between different implementations of SIGMA, but this is the general framework that is used in TLS, IKEv1, IKEv2, and so on. You can actually see this pattern in many of these protocols. So authenticated Diffie-Hellman, let's consider it a solved problem. But what happens if you introduce agility, right? Let's do SIGMA again, but this time Alice supports two Diffie-Hellman groups. One is a nice big 2048-bit Diffie-Hellman group, and the other one is a tiny little 512-bit Diffie-Hellman group.
And Bob also supports both of these groups. You might ask, why are they supporting this 512-bit group? We know very well that that's far too small and is probably broken, but there could be many reasons. One is backwards compatibility. One is just laziness: they put it in there and forgot about it 20 years ago. And the other is export regulations: there might still be export regulations that require them to support this kind of weak Diffie-Hellman group. But in practice, it shouldn't matter, because Alice will say, I support these two groups, strong group, weak group. Bob will say, okay, I support the strong group, so let's just use the strong group. So in practice, it should be the case that the weak group is never used, even though both of them support it. Of course, that isn't the case, because there is a man-in-the-middle attack. What the man in the middle is going to do is see the two groups coming from Alice to Bob, one the strong group and one the weak group, and he's going to nuke the strong group and just pass on the weak group to Bob. So Bob thinks, oh, maybe Alice only supports the 512-bit group; okay, for backwards compatibility, I will say okay. And so Bob will send back the 512-bit group to Alice. And Alice will now say, oh, maybe Bob only supports the 512-bit group, so let's use the 512-bit group. So even though both of them supported the strong group, they have both been downgraded to the weak group. Okay, so the remaining problem for the attacker: all he needs to do is break the discrete log in this weak group, and then he has full control over the keys of this connection, and he can impersonate the client to the server and the server to the client and do whatever he wants. This is the essence of the Logjam attack, which came out last year. There are a lot more details to implementing this effectively against TLS, but this is the essence of it. But then, when Logjam came out, people were surprised, because they were saying, well, but this is a downgrade attack; you took the negotiation and tampered with it. This is not supposed to be possible in TLS, because TLS has a downgrade protection mechanism, one of the things that was added after downgrade attacks on SSL a long time ago. So let's look at this downgrade mechanism. In the finished phase, which is the third phase of the TLS protocol, both the client and the server exchange MACs over the entire transcript, computed with the session key. The idea is, if anybody in the middle had tampered with the transcript, you would be able to detect it by verifying this MAC. But unfortunately, it's too late to prevent this downgrade, because the MAC key, which is derived from the session key g^xy, is already computed in the downgraded Diffie-Hellman group, the weak 512-bit group. So your downgrade protection countermeasure in TLS is itself downgradable. And that's the key observation that makes Logjam work, okay? So this kind of downgrade protection mechanism, which itself depends on downgradable or negotiable parameters, is kind of dodgy, because you have a circularity there, and you have to figure out what it is that you really depend upon in order to get any kind of downgrade resilience.
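Here is a minimal Python sketch of that circularity (illustrative only; the group labels and the "recovered" key are stand-ins for the real Diffie-Hellman computation):

```python
import hmac, hashlib

def finished_mac(key: bytes, transcript: bytes) -> bytes:
    """The 'finished' MAC: HMAC over the handshake transcript."""
    return hmac.new(key, transcript, hashlib.sha256).digest()

alice_offer = b"offer:DH2048,DH512"
tampered    = b"offer:DH512"        # the MITM deleted the strong group
bob_choice  = b"chosen:DH512"       # Bob honestly picks what he saw

# Each side MACs its own view of the transcript. The views differ, so
# honest MACs would not match -- but the attacker has broken the 512-bit
# discrete log while the connection is live, knows g^xy, and can forge a
# valid finished MAC for each side's view of the tampered negotiation.
k = b"g^xy in the weak group, recovered by the attacker"  # placeholder
mac_for_bob   = finished_mac(k, tampered + b"|" + bob_choice)
mac_for_alice = finished_mac(k, alice_offer + b"|" + bob_choice)
print(mac_for_bob != mac_for_alice)  # True: both verify, downgrade undetected
```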
So the only thing that you have to do in order to mount Logjam is to show that you can do a discrete log computation in a 512-bit group while the connection is still live, because you have to complete the connection: you have to compute this MAC, which means you have to compute the key, so you have to be able to do the discrete log while the connection is still live. And this is another place where practitioners traditionally thought, well, this is not possible: it's a 512-bit group, you have to do a discrete log, it takes weeks, you won't be able to do it. And there, again, is the communication gap, I think, between cryptanalysts and practitioners, because the CADO-NFS guys knew for a long time that, in fact, the computation doesn't have to be done all at once. There is a big pre-computation phase that only depends on the prime, and a very tiny phase at the end, called descent, which can be used to compute the discrete log for a particular key. And in fact, in TLS and in IPsec and in SSH, it turns out that the implementations typically all use one or two well-known groups. So if you have one or two well-known groups, for example the one or two well-known 512-bit groups used for DHE_EXPORT, you can do the pre-computation for each of these groups in one or two weeks, and then it takes only 90 seconds per session to compute a discrete log and break the protocol. Okay, so this is, again, one of the cases where there was probably a bit of a communication gap which allowed this attack to slip in. So the lesson of Logjam itself is that the TLS transcript MAC does not prevent Diffie-Hellman group downgrade; you cannot rely on the transcript MAC for that. The only way you can get away from Logjam is by disabling all weak groups, and weak elliptic curves if you know of any. So in response, what browsers are doing, for example, is refusing to accept 512-bit groups, and then they will start to refuse 768-bit groups and so on, until the minimum group size that you see will be quite large. And the reason they have to do it phase by phase, and this is another symptom of the internet, is that if even 1% of internet servers would break interoperability when you made a change, that is too much for implementations. People will refuse to disable a feature if, in fact, one percent of the internet will break. So you really have to do this step by step in order to allow this to happen. But let's focus on the protocol design problem. The transcript MAC doesn't work. Could we have done better if we signed more things from the client to the server and the server to the client? This is the SIGMA protocol; we have a signature and a MAC. The MAC wasn't doing very well for us, so maybe the signature can do something for us. So let's look at some other protocols which took a different approach to SIGMA. In IKEv1, both Alice and Bob sign the negotiation message, which is Alice's offered groups. But for some reason, they don't sign Bob's chosen group, which again opens a tiny little hole that is enough to have a downgrade attack. In IKEv2, both parties sign their own messages: Alice signs the offered groups and Bob signs the chosen group.
But since Bob doesn't sign the offered groups, there's again a tiny little hole which you can use to cause confusion between the two and mount a downgrade attack. SSHv2 and TLS 1.3, the upcoming version, take the brute-force approach: both client and server sign the full transcript, or rather a hash of the full transcript. And this certainly does prevent this kind of downgrade attack, because if you tamper with the messages, it will be detected in the signature verification. So that's fine for Diffie-Hellman group downgrades and other things like that. But we still have this other question: what about signature downgrades? So let's go back to the negotiation table and assume that we have a simple protocol with two messages. Alice sends g^x plus some arbitrary set of protocol parameters like versions, groups, ciphers, and so on. Bob sends g^y and some arbitrary set of protocol parameters. And they negotiate some common parameters between each other. And let's assume that they both compute the transcript as some function of the two messages, m1 and m2, on the left and the right, and Alice is going to sign the transcript for Bob and Bob is going to sign the transcript for Alice. So this certainly would prevent things like Logjam. Okay, but let's look at what we're actually signing. When I say we are signing the transcript, you're never really signing the transcript; you're signing a hash of the transcript. So what properties do we need from this hash function? You can look at it from a proof point of view: what assumption should I put on it? Or you can look at it from an attack point of view: how would an attack on this hash function break the protocol? How weak can this hash function be? Do we need collision resistance, for example, or do we only need something like second pre-image resistance, and that'll be enough? And this is an important question, because if you look at crypto proofs of TLS, SSH, IKEv2, and so on, you'll find that almost uniformly, everywhere, people assume collision resistance for all the hash functions used in the protocol. But at the same time, these protocols are currently implemented with MD5 and SHA-1 instantiating those hash functions. So one of two cases is happening here. Either the cryptographers are lazy and using too strong an assumption, and really they should weaken their assumption and do their proofs again; or the practitioners are wrong, and in fact they need collision resistance for their protocols, but they think they don't. And this is not just an abstract argument. There's actually an explicit argument of this form in RFC 4270 between Bruce Schneier and Paul Hoffman, where Bruce Schneier says, get rid of MD5 right now, and Paul Hoffman says, but why? There is no real attack on the protocol just yet. And that sort of shows you why these legacy things hang on. But in this work, which is SLOTH, which appeared at NDSS this year, we definitively show that in this case the cryptographers are right: you do need collision resistance for almost all uses of hash functions in these key exchange protocols. And the reason is, if you don't have it, you get this class of attacks called transcript collision attacks. Again, we're going back to the anonymous Diffie-Hellman man-in-the-middle attacker: there's a man in the middle.
What he's going to do now, even though you're running SIGMA with negotiation on both sides, is tamper with the g^x on this side, tamper with the g^y on that side, tamper with the params of A and the params of B; he can do whatever tampering he wants. What he has to make sure of, though, is that before the signatures go through on both sides, the hash of the transcript on the left is equal to the hash of the transcript on the right. And that's what we call a transcript collision. If you can achieve this, then Alice's signature on the hashed transcript on this side can be forwarded to Bob, and Bob's signature can be forwarded to Alice, and neither of them will be able to detect that there has been any tampering. And in this way, you can do a downgrade attack, you can do server impersonation, client impersonation; it's a very powerful technique. The key question is, well, how do you compute this transcript collision? Which is basically, if you think about it: there is a message one on the left and a message two on the right, and you're tampering with message one, so you make message one prime, and there's message two prime. And you want to make sure that the two transcripts hash to the same value. The first thing to observe is that this is actually a collision, not a pre-image, because we actually control bits of the transcript on both sides; we can play with both sides of the connection. Given the black bits, we have to compute the red bits. That's basically it. And if we are careful, we can set this up as a generic collision. But if we're really clever, we can sometimes set this up as a shortcut collision, like a common-prefix or a chosen-prefix collision, which typically will be much more efficient to mount. So let's see how you can mount a chosen-prefix collision on that simple protocol that I showed you before. So Alice is sending a message m1 to Bob, and the man in the middle has stopped it. The message, let's assume, has a nice format: it has a length field, it has an ephemeral key, and it has some set of params. Now, Bob is going to send an m2 in response. What the man in the middle has to do is to send an m1 prime on this side and an m2 prime on that side so that if you hash the two transcripts, they are equal. That's his problem. And the way he's going to do it is the following. Let's assume that, in fact, the params contain some uninterpreted places where we can stuff some collision blocks. So it's like a blob, really. Okay, let's ignore the structure of the params, but let's assume that it allows us to stuff some collision blocks in there. And let's assume that both the client and the server are generating fresh ephemeral keys. Now, this setting is where protocol experts and practitioners think, oh, this surely is not vulnerable to a collision-based attack, because both the client and the server are generating fresh values and putting them into the transcripts. There are fresh nonces in there. So it's difficult to figure out how you might be able to mount a collision attack, but I'll show you how. The idea is, when he sees message one coming on this side, the man in the middle is going to start constructing the message one prime for Bob, but at the same time, he's going to start constructing the message two prime coming back to Alice. He's going to start stuffing in whatever material he wants for Bob and whatever material he wants for Alice.
Then he's going to make the lengths of the two transcripts the same by stuffing some zeros into the message to Bob; we assume that there are places where you can stuff this kind of thing. And when the lengths are the same, he's going to ask for a chosen-prefix collision between these two transcripts, and he's going to obtain C1 and C2. So he can stuff C1 into this message and C2 into that message. Now the two transcripts are equal, but we haven't yet sent any messages, okay? So the message for Bob, we're going to wrap up into an m1 message and send off to Bob. And it's going to have the same format as what Bob expects, but inside the blob there is lots of junk material which he doesn't know how to interpret and is going to ignore. So when Bob sends his message back, even though it has fresh material, all we have to do is take it and stuff it onto the end of the message that we are already preparing for Alice. So we're assuming that there are places where we can stuff these things at the end, but in fact, for TLS, IKE, SSH, all of these protocols, they have such flexible formats that there are places you can stuff these things in. And once you've done this, because of the Merkle-Damgård property, of course, the hashes of the two transcripts are going to be the same, and thereby we have computed a transcript collision using a chosen-prefix collision on the first half of the transcript. And if you remember, a chosen-prefix collision on MD5 is currently thought to cost about 2^39 hashes, which we could do in an hour on a powerful workstation in our demos. For SHA-1, it's believed to be considerably slower, but we only expect these attacks to get better over time. In any case, the complexity of these attacks is way, way smaller than what practitioners believed, because they thought they had the full 128-bit security of MD5 in this case. So I showed you an attack on a very simplified protocol, but in fact the very same attack appears if you take the full details of the TLS protocol as well; those details are important for proofs and attacks, but not necessary for this talk. And there are another few quirks that you need in order to mount this attack on TLS. Until TLS 1.1, TLS did not in fact support pure MD5-based signatures. All signatures from the client and the server used a concatenation of MD5 and SHA-1 together, so at least as strong as SHA-1. But in TLS 1.2, while they were upgrading the hash functions throughout the protocol and using SHA-256 for everything, they introduced a signature algorithm extension that allows you to negotiate signature algorithms, and in the list of hash functions allowed for signatures they, I think, accidentally included MD5. So because of this, TLS 1.2 is the first version of the protocol that actually allows you to do MD5-based signatures, even though the previous versions did not. And adding the possibility of MD5 signatures is dangerous even if you never use it, or think you'll never use it, because there's a downgrade attack. If the client says, I support SHA-256, and the server says, I support SHA-256, they should only be using SHA-256 signatures. But a man in the middle, in our setting, can downgrade both of them to use MD5, and then he can do the transcript collision attack to break the protocol. And that's what we showed in our demo.
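The structure of the attack can be demonstrated end to end with a deliberately weak toy hash in Python (a sketch: the 16-bit toy compression function stands in for MD5, so the brute-force search below plays the role of the 2^39 chosen-prefix collision search, and all messages are illustrative):

```python
import hashlib
from itertools import product

def toy_md(data: bytes, state: int = 0x1234) -> int:
    """Toy 16-bit Merkle-Damgard hash, absorbing one byte per step.
    Deliberately weak so a chosen-prefix collision is brute-forceable."""
    for b in data:
        block = state.to_bytes(2, "big") + bytes([b])
        state = int.from_bytes(hashlib.md5(block).digest()[:2], "big")
    return state

def chosen_prefix_collision(p1: bytes, p2: bytes):
    """Find equal-length suffixes c1, c2 with toy_md(p1+c1) == toy_md(p2+c2)."""
    table = {toy_md(p1 + bytes(c)): bytes(c)
             for c in product(range(256), repeat=2)}
    for c in product(range(256), repeat=2):
        if toy_md(p2 + bytes(c)) in table:
            return table[toy_md(p2 + bytes(c))], bytes(c)

p1 = b"m1': attacker params for Bob"
p2 = b"m2': attacker params for Alice"
p1 += b"\x00" * (len(p2) - len(p1))          # equalise lengths by zero-stuffing
c1, c2 = chosen_prefix_collision(p1, p2)

# Merkle-Damgard: once the internal states collide, appending any common
# suffix -- here Bob's fresh, unpredictable reply -- preserves the collision.
fresh_reply = b"Bob's ephemeral key and nonce"
assert toy_md(p1 + c1 + fresh_reply) == toy_md(p2 + c2 + fresh_reply)
print("transcript collision:", hex(toy_md(p1 + c1 + fresh_reply)))
```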
We showed that using transcript collisions, you can break client signatures in TLS 1.2. You have to compute the collision while the connection is still live, and this is a bit hard, but we managed to do it on one workstation in one hour, and we think this could be optimized further. So it's almost practical as an attack, but it was certainly enough to kill MD5 in TLS 1.2 implementations and also in TLS 1.3. And that's not the only use of hash functions, right? I mean, okay, MD5 signatures are obvious; you could say you should not be doing that. But there are various other uses of MD5 and SHA-1 in the protocol, including under HMAC, including truncated HMAC and so on, which are all somewhat dodgy. So there's a bunch of attacks of this class on TLS, SSH, and IKE that you can go see in the paper. So I showed you two attacks, Logjam and SLOTH. Both are kinds of downgrade attacks, but they also use cryptanalysis, and they also use the existence of weak configurations and legacy crypto in the wild. What do we learn from them? The first thing to observe is that legacy crypto can stay around for a long time, a really long time, and often it's because, since nobody is using it, it's invisible, right? But the attacker is going to make sure that you use it, because if he finds it, he's going to make sure that you're downgraded to it. Many people were surprised that so many servers actually support export ciphers or, similarly, MD5; people didn't know that their servers even supported them. The other observation is that it's not enough to just show theoretical weaknesses, because people don't delete legacy crypto based on theoretical weaknesses; but this line of attacks that various people are engaged in, which actually shows how they can be concretely exploited, makes a difference. I mean, that's what you need. And it's also fun, because in some cases you might yourself not be sure whether you need collision resistance for hash functions in your protocol, and this actually is a way of justifying that assumption. It can also be a useful way of motivating cryptanalytic optimizations, as we saw in the Logjam case. But the grand story of this, in terms of the protocol, is that TLS 1.2 does not really prevent all kinds of downgrades. Even though we thought it had a downgrade protection feature, it actually doesn't; there are many kinds of downgrades that escape it. And we don't even know how to state this property of downgrade resilience. So we need a new model that allows us to state what downgrade resilience is, so that we can then start to design a protocol that actually achieves it. So in the last part of my talk, I'm going to talk about some definitions that we came up with, which will appear at Oakland next month, on what downgrade resilience property we might desire from TLS and how we can prove it for TLS 1.3. We're going to consider two-party protocols between an initiator and a responder, and for simplicity, we're only going to consider the negotiation sub-protocol that we want to protect from downgrades. So let's assume that the key exchange takes two inputs, config_I and config_R. The config you can think of as the list of versions, ciphers, groups, et cetera, that I supports, and likewise the list that R supports. They also, of course, have their own long-term credentials, cred_I and cred_R, that they have been given.
And the output of the key exchange, whatever the key exchange actually does, is that it must produce a unique session identifier, a session key, and a negotiated mode, which is the version, cipher, et cetera, that they actually decided to use. So this is an agile key exchange, because it supports different modes, and it's going to do different things based on what mode you negotiate. For this protocol, you can state some standard security goals. Well, I've written them very informally here; we have formal definitions for these, and I'm sure many of you could write them yourselves. You want things like partnering, which means that there is at most one honest partner with the same session ID. You want agreement, which says that if I have a partner, then he and I agree on everything, including the mode, the key, the session ID, everything. And if I have an honest partner, then the key should be known only to me and my partner. Also, if my peer is actually authenticated, then I can use the authentication to say that I have at least one partner with the same session ID. So you can write these kinds of standard properties. But one thing you'll observe is that since this is an agile key exchange, all these properties are preconditioned on the mode being strong. So if you look at the agreement property, we say that I will agree with my partner on the mode as long as the mode only uses strong algorithms. There's this weird circularity there. And that's inevitable, because we are negotiating the protocol mode while we are doing the key exchange. And so the key exchange is only going to give me good guarantees if the negotiated mode actually has strong algorithms. If I negotiate a mode which uses MD5 and RC4 and whatnot, then I'm not really going to get very strong guarantees from this protocol. And importantly, none of these standard guarantees are going to tell me that the mode I negotiate will be strong. They all say: if the mode that you negotiated is strong, then you get something good; if it's weak, you get nothing. And in fact, previous research work on negotiation, including the one on miTLS, all the previous theorems, are of this form: you do some negotiation, and if at the end of the negotiation you end up with something that is strong, then you get lots of good guarantees, but they don't tell you that you will end up with a strong mode. So this kind of definition does not prevent downgrade attacks like Logjam. And in fact, if you want to rely on this kind of agreement to prevent downgrades, the only recourse is to make sure that the intersection of config_I and config_R, that is, all the modes that they could possibly negotiate, contains only strong modes. If there's even a single weak algorithm in that intersection, the attacker is going to downgrade to it, and then he's going to break the protocol. So we propose a new downgrade resilience goal for such key exchanges. And the downgrade resilience goal can be stated very intuitively, although formalizing it takes some time. The idea is: let's define a function called nego, which is what the client and server would have negotiated if they both knew each other's configurations and there was no active man in the middle. Imagine both of them actually met on the street and said, this is my config, this is your config; let's compute the negotiated mode.
That's the ideal negotiated mode. And then the security goal, which is downgrade resilience, becomes: even in the presence of an active man-in-the-middle attacker, the final mode that the initiator and responder compute has to be the mode they would have computed if there were no attacker. So the attacker cannot really tamper with the negotiation. This is the most general definition you can come up with, and it allows the initiator and responder to use their own custom rules for negotiation. We don't say that they have to negotiate the best mode; we don't even have a notion of the best mode. What we say is that they have to negotiate the same thing that they would have negotiated in the absence of the attacker. So now that we have the definition, we can test it: we can write down the negotiation sub-protocols of various famous protocols and see whether they actually satisfy the definition. We wrote down the sub-protocol for IKEv1 and found the known downgrade attacks on IKEv1. For IKEv2 we found a surprising attack on the EAP mode of IKEv2: the normal mode is downgrade resilient, but if you add EAP, it's not. Similarly for ZRTP: the normal mode is downgrade resilient, but if you add the pre-shared mode, it's not. And SSHv2, which is in some sense the strongest of all these protocols in terms of downgrades, is in fact downgrade resilient, but you'd better not use SHA-1, which is actually one of the default hash functions in there; I mean, unless you want to make a strong assumption about SHA-1. And even this downgrade resilience proof that we have for the sub-protocol of SSHv2 is strictly stronger than what has been considered in previous work, because previous work requires that the intersection of the configurations contain only strong algorithms. We even allow weak algorithms, and you still get strong downgrade resilience. So now that we have this definition and have tested it out, can we use it to motivate the design of a new protocol? We've been involved in the design of TLS 1.3, along with a bunch of other researchers, and it's an interesting task, an interesting sort of experience. TLS 1.3 is new in the sense that it makes a surprising number of changes from TLS 1.2. One of the biggest changes is that it reduces agility. It gets rid of a lot of the problems we've been talking about by just killing a lot of key exchanges and a lot of encryption modes, and by requiring that the Diffie-Hellman groups have to be strong, the elliptic curves have to be strong, and the hash functions have to be strong. So removing algorithms has really been a big part of it. But at the same time, it has been adding new features, like the new 1-RTT and 0-RTT modes, which are meant to be faster than TLS 1.2 but require completely fresh analysis. The good news with TLS 1.3 is that, unlike TLS 1.2, which had a big time lag between specification and proof, there is a large community of researchers that has been very actively involved in the development of this protocol, and they're developing proofs side by side with the standardization, sometimes even ahead of the standardization. We are seeing proofs in multiple models, different kinds of key exchange models, including symbolic proofs and computational proofs. And in our team we're also developing a verified implementation of this protocol, in ongoing work.
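The definition itself is easy to state in code; here is a minimal Python sketch (the negotiation rule, the mode strings, and the attacker trace are all hypothetical):

```python
def nego(cfg_i: list, cfg_r: list) -> str:
    """The ideal negotiation: what I and R would agree on with no attacker.
    Here R's (hypothetical) rule is: first mode in I's preference order
    that R also supports; the definition allows any deterministic rule."""
    for mode in cfg_i:
        if mode in cfg_r:
            return mode
    raise ValueError("no common mode")

def downgrade_resilient(run_with_attacker, cfg_i, cfg_r) -> bool:
    """Downgrade resilience: whatever the active attacker does, if both
    sides complete, they must end up on nego(cfg_i, cfg_r)."""
    mode_i, mode_r = run_with_attacker(cfg_i, cfg_r)
    return mode_i == mode_r == nego(cfg_i, cfg_r)

# Logjam violates the definition: both sides prefer DHE2048, so
# nego(...) == "DHE2048", yet the attacked run ends on DHE512.
logjam_run = lambda ci, cr: ("DHE512", "DHE512")   # hypothetical trace
print(downgrade_resilient(logjam_run,
                          ["DHE2048", "DHE512"],
                          ["DHE2048", "DHE512"]))  # False
```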
So this is a good case study in some sense, because this is where a lot of people have been focusing their efforts on getting things right. So the negotiation sub-protocol in TLS 1.3 is pretty complex. I'm going to break it down into three bits, but this is the sub-protocol for which we want to show downgrade resilience. This is not even the full protocol; this is just the negotiation part of it. So there are three features there which are kind of new, at least for TLS. The first one is this thing called a retry message. If Alice sends Bob a Diffie-Hellman key share, Bob might say, you know what, I don't like that group, please use this group instead, and then Alice will send a key share in the new group. Now, obviously this is a ripe place for a man in the middle to send a fake retry message to downgrade Alice and Bob to weak Diffie-Hellman groups. And in an earlier version of the protocol, there was a question of whether these first two messages should be part of the authenticated transcript or not. And we show that, in fact, if you make them part of the authenticated transcript, you can prove downgrade resilience, and if you don't, you actually cannot, under our definition at least. The second feature, which we've talked about before already, is that TLS 1.3, rather than MACing the transcript and signing very little, actually signs everything. So it both signs and MACs the full transcript. Now, this is good because it avoids attacks like Logjam, as we've discussed before. And it only works because, of course, in TLS 1.3, because of SLOTH, all the weak hash functions have been explicitly forbidden: all the hash functions have to be SHA-256 or newer. So this is the second feature of TLS 1.3, and it allows you to base your downgrade resilience proof purely on the signature; you can ignore the rest of the key exchange. But the third feature is, in some sense, even more interesting. Even if you take TLS 1.3 and prove it to be downgrade resilient, secure, and so on, there is always going to be this one problem that is going to hit you, which is that any TLS 1.3 implementation out there, for the foreseeable future, has to support TLS 1.2, because this is how agility works: you have to support old versions for backwards compatibility. Just to give you an example, TLS 1.0 is still the most widely deployed version out there, and in fact very widely supported and preferred as well. So the moment you say that we are also going to support TLS 1.2, we are in a bind, because now we have this version downgrade problem. I have a TLS 1.3 server and a TLS 1.3 client. We have perfect proofs for the protocol, we have excellent proofs for the implementation, but when they are talking to each other, the man in the middle is going to say to each of them: listen, please do TLS 1.2, please do TLS 1.2. They're both going to start doing TLS 1.2, and now he can mount all the known downgrade and legacy crypto attacks that he already had in his pocket. So any TLS 1.3 implementation that also supports TLS 1.2 is vulnerable to this kind of downgrade. But it's meaningless to say that we're going to prove TLS 1.3 secure if you don't consider this issue as well, because that's what is going to be deployed. So a proposal that we put into the standard, and which is, I think, going to go in, is to at least protect the signature cipher suites in TLS from downgrade.
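The mechanism is spelled out in the next paragraph; as a preview, here is a minimal Python sketch of the idea (the sentinel constant follows the encoding that was eventually standardized, but everything else here is illustrative, not a real stack):

```python
import os

# Sentinel embedded in the signed server nonce when a 1.3-capable server
# accepts TLS 1.2.
DOWNGRADE_12 = b"DOWNGRD\x01"

def make_server_random(server_max: str, negotiated: str) -> bytes:
    rnd = os.urandom(32)
    if server_max == "1.3" and negotiated == "1.2":
        rnd = rnd[:24] + DOWNGRADE_12   # the signal rides in the signed nonce
    return rnd

def client_check(client_max: str, negotiated: str, server_random: bytes) -> None:
    # A 1.3-capable client talked down to 1.2 must abort if the signed nonce
    # says the server is 1.3-capable too: only a MITM causes that combination.
    if client_max == "1.3" and negotiated == "1.2" \
            and server_random.endswith(DOWNGRADE_12):
        raise ConnectionError("version downgrade detected")

try:
    client_check("1.3", "1.2", make_server_random("1.3", "1.2"))
except ConnectionError as e:
    print(e)
```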
The idea is that in the server's nonce, which is signed in all versions of TLS, we include a fixed sequence of bytes which indicates the largest protocol version that the server is willing to support. And by providing this signal within the signed value, even in old versions of TLS, we are able to prevent downgrades from TLS 1.3 to TLS 1.2, as long as the server only supports signature cipher suites. We cannot give any guarantees if the server also supports RSA key transport or PSK in the lower protocol versions, but those key transports have other problems as well. If you use only signature cipher suites, you can implement a stack with TLS 1.0, 1.1, 1.2, and 1.3, and you can prevent downgrades from 1.3 downwards. So putting all three of those features together, we can actually prove a theorem that the negotiation sub-protocol of TLS 1.3, so not the full protocol, that's future work, is in fact downgrade resilient. And this is the paper that will appear at Oakland next month, or this month. So let's get back to the beginning. There are attacks on legacy crypto in TLS. We saw a bunch of these, and we saw a little of why they occur: because there are these legacy things left in that people forget about, there are legacy things that people think can't be attacked but can be, and there are downgrades that people thought couldn't happen in TLS, but they can. And we showed how to fix this in TLS 1.3. I'd like to leave you with some final thoughts before taking questions. This line of work basically exposes several things. First, legacy crypto is strangely hard to get rid of, and the only way to get rid of it in practice is to find concrete attacks and force these algorithms to be killed off. But at the same time, I think we can't always assume that we will know exactly what is broken. One of the algorithms we are using right now, which we think is perfect, is going to turn out to be broken at some point. So we need a graceful way in which we can phase out old versions and phase in new versions, and so we need truly downgrade-resilient protocols, so that we can survive this intermediate phase. Prior versions of TLS suffered a large time lag between standardization and proofs, but I'm happy to report that for TLS 1.3, it looks like researchers, many of them in this room, are closing this gap. And that's good news for TLS. A lot of the work I talked about, including demos of the attacks, et cetera, is at that website. Thank you for your attention. I'm happy to take questions.

Thank you. Are there any questions?

I have a question about Logjam. In the original paper, you tried to answer the question: how can the NSA break so much secure communication over the internet? And based on a scan of the Alexa top 1 million websites, you estimated that the NSA could use the Logjam attack to break up to around 20% of the websites. Now, together with my student, Eyal Ronen, we did a re-scan of the Alexa top 1 million websites, but this time we divided it into the top 10, top 100, top 1000, et cetera, because almost all the websites which are of interest to the NSA are going to be in the top 100 or 1000; these are the big websites which carry interesting communication. And we got much less alarming answers: namely, only about 2% of the really top websites are vulnerable to Logjam. So this is about one tenth of what you estimated.
So while I do not dispute anything in the original paper, including all of the top 1 million websites in a single statistic is a little bit misleading, because, you know, website number 1 million doesn't care much about security, and the NSA doesn't care much about it either. So I wanted to give a little bit of good news about Logjam.

Oh, that's pretty good, thanks. I think there was another question.

How do you cover attacks based on certificate validation, the different kinds of certificates, and also attacks based on changing the clock values through the NTP protocol?

So I didn't catch the last bit. Attacks on certificates and? There are attacks also based on changing the value of the clock of the server or the client via NTP. So, a big trusted component of miTLS and a lot of this work is, in fact, the certificate infrastructure, which is pretty horrendously broken in many cases. There have been very serious attacks based on even things like ASN.1 parsing; they continue to happen all the time, and almost every attack on an ASN.1 parser can usually be mapped to a certificate attack. There are also the timing-based attacks, like the one you just mentioned based on the clocks. And there are also other, more sophisticated attacks based on how you use public key pinning and certificate transparency, and mis-issued certificates and so on. That entire can of worms, I think, certainly needs a concerted attempt. One of my ex-students, who is now at Microsoft, is currently building a verified PKI using similar techniques to what we use for TLS. But I think we certainly need more models and more investigation into that aspect of things, because it hasn't been looked at enough.

Karthik? Yeah, hi. Thank you for the great talk this morning. I really appreciate the first point here about the need to keep killing off broken primitives and legacy crypto. Do you think that all the recent attacks have had an educative effect on people, people who are working in the IETF, for example? Do you think we're making progress in educating them about getting rid of legacy crypto sooner rather than later?

So, as you know, engagement with the standards bodies has its own time lag, and you have to spend a lot of time without much academic output. But I think, for sure: the IETF consists of people who write protocols, but a big chunk of the people who come there are people who are implementing these protocols. They're from Google, from Cisco, from all of these places. And these implementers certainly are responding; they have to respond because of pressure from attacks, and they are very much responding to this line of attacks and actually killing off various legacy crypto bits. Now, in protocol design, I'm not sure how much effect this has been having. All I can say is, in TLS 1.3 the researchers have been very, very active in advising the working group on every aspect of it. The working group seems very receptive to avoiding even vaguely potential attacks, ones which nobody can really see as concrete right now. And on occasion they're even willing to accept changes that just make proofs easier, which is actually a very welcome change, so that we don't have to go around and hack our proofs to fit the model rather than the other way around. So this has been a very good effect, I think.
But I'm not sure I would say that this is true all across the IETF or in many different places.

Any further questions? If not, then let's thank the speaker again. And I think we're now on a coffee break, and then we'll start again at 10:40, here and in the room downstairs.