 I've did some research and it was not easy that Diffie-Hellman Key Exchange is so much above my pay grade therefore I'm going to keep it simple. Please welcome, we have Alex Halderman from the University of Michigan and Natja Henninger from the University of Pennsylvania. Thank you. Thank you all so much. It's wonderful to be back again in a 32c3. I'm Alex Halderman from the University of Michigan here again this year with with many of my students from my group here in the audience also speaking. We study security in the real world. So tonight we have a very special story to tell you that I'm very proud to be telling along with my colleague Natja Henninger. We're going to be talking about discrete log Diffie-Hellman and some of the research that we've done over the past year to try to understand how the NSA may be breaking so much of the crypto that we know they are breaking. So this work is joint work with a number of co-authors with 12 other co-authors, three of whom are in this room right now and I'd ask them to stand up but they said they didn't want to. So please a quick round of applause for my co-authors as well. So thank you. So in this very room a year ago at 31c3 Jacob Applebaum and Laura Poitras gave a talk Reconstructing Narratives in which they announced some new results from the Snowden archives. They were able to tell us how the NSA that the NSA was breaking cryptography used in widespread online communication and they later published an article in Der Spiegel in which the article included documents that showed indeed the scope of NSA breaking widely used encryption was significant that NSA is breaking maybe not all crypto but they're able to break widely used crypto for many of the different kinds of services and protocols we care about. What the leaks didn't answer is how NSA is breaking this cryptography and to a technologist well if we don't know how they're breaking it what can we do to stop it. So Nadia and I and our co-authors set out earlier this year to try to through our research start answering those questions. How is NSA likely to be breaking widely used cryptography and what can we and other researchers do to stop governments from being able to attack the crypto that all of us depend on. So let's tell the story. Wait until you see how it ends. So if I were setting out as the attacker to break widely used crypto well there are a few different ways that I could do it. I mean one branch of the decision tree here is through attacking the implementations right either finding vulnerabilities or introducing backdoors what we've all been witnessing over the past week or so with Juniper and their their systems being being compromised. On the other hand there's another prong you could take you could try to attack the crypto algorithms themselves the underlying crypto and the difference is if you're attacking implementations you have to make a big investment in every hardware device and piece of software that you're trying to compromise. If you're attacking the underlying crypto you have just one stop shop to gain access to potentially a very wide swath of all the crypto in use. So a small number of algorithms predominate for both symmetric cryptography things like AES and RC4 but those particular algorithms anyway most cryptographers seem to think that breaking them at least in the general case is pretty hard right now. Instead though we also have a very small number of public key crypto algorithms that are also in use very widely for most of all of the protocols and services we care about things like RSA and Diffie Hellman and Darby Dragons as they say this is the cryptography that we and many other cryptographers think is most likely to be targeted. So I'll hand it off to Nadia to talk about breaking public key. So in order to understand a little bit about how crypt analysis works I'm going to go all the way back to the very beginning of public key cryptography from the 70s and I'll start by explaining a little bit about RSA. This is Revesh Shamir and Adelman up on the screen here from the 70s you can tell and this is sort of the simple example and then we'll talk a little bit more about the actual Diffie Hellman based crypt analysis that we're going actually talking about. So this is the first public key crypto algorithm that was ever published and it is still the most widely used public key cryptography algorithm out there. That shows you I guess something about the naturalness of ideas or maybe the lack of progress that we've had in the past 40 years. So here's sort of the textbook version of RSA encryption. What we really care about is that so Alice and Bob they want to bootstrap communication over an untrusted channel so there's some eavesdropper in between them who's intercepting their messages and in order to get around this they need to somehow figure out how to share a key in order to actually encrypt their communications and the way that RSA accomplishes this is Bob here on the right has a public key which in RSA is a modulus n which is the product of two large prime factors and he sends this over to Alice and Alice uses Bob's public key to encrypt a message like a session key and send it back to Bob and then Bob can decrypt the message, get the session key and then they can communicate using some other symmetric cipher. So this is the big picture of RSA encryption. The reason that we think that RSA is secure is because the best way that we know how to break an RSA public key is to factor the modulus and as far as we know factoring is not very easy. So in particular factoring is what we hope is something like a one-way function, multiplication is easy, factoring the number into two pieces again is hard in some sense and the best algorithm that we have in the general case of say an RSA modulus that's well generated is called the number field sieve. So here's the, this is about as technical as I'm going to get and I'm waving my hands vigorously here but here's something along the lines of what the numbers field sieve algorithm actually looks like. So it's a multi-stage algorithm, it's rather complex, some stages parallelize very well, embarrassingly well, other stages parallelize somewhat less well and so we've got these multiple stages that we go through and at the end of the algorithm we discover we hope a prime factor of the number that we're trying to factor. So how long does it take to factor? Here's one answer. If you ask a number theorist this is the answer that they will give you, they will go through your optimization and they will tell you the answer is a sub-exponential function in the size of the number that we're trying to factor. That I think is not the answer that you guys are looking for. So here's a slightly more concrete answer. So how long does it take to factor different sizes of keys? So for 512-bit RSA the first 512-bit RSA modulus was factored in 1999 by a group of academics, it took them six months and a few supercomputers, now you can rent supercomputers by the hour. So what does that do? Well some students of mine actually were able to factor a 512-bit RSA key for four hours and $75 on Amazon EC2. If you would like to do this too, so you can also do this too, code is online right here, download it, send us bug reports, etc. So as we go up in key sizes, a 768-bit RSA modulus estimate, current estimate is less than a thousand core years and for sort of reasonable sized academic clusters this should take less than a calendar year to finish now. The first 768-bit RSA factorization was done in public in 2009, so that gives you some idea of sort of the progress. For 1024-bit RSA nobody has factored a key of that size in public, the estimate is probably it's about a million core years, which is certainly within range for a large government. So it is against better recommendations to use a 1024-bit RSA key size at this point. The current recommendation is to use a 2048-bit RSA modulus with current algorithms nobody should ever be able to factor something of this size without some kind of major improvement. So there's the big picture for you guys. Now I'll move on to Diffie Helman. So the paper that introduced Diffie Helman was called New Directions in Cryptography. It's one of the seminal papers of the 20th century. How many of you have read this paper? You should all go read it right now. I mean not right now, maybe after our talk. The first sentence of this paper written in 1976 is we stand today on the brink of a revolution in cryptography. This is one of the best opening sentences of an academic paper I've ever read, and they were 100% right about everything they put in the paper. They laid out the entire foundations of cryptographic research for a couple of decades, and to boot, they came up with the first public key exchange that is still one of the most commonly used public key methods we have on the internet. So all that in one paper. So the way that textbook Diffie Helman works is you've got a couple of parameters, you've got a prime P, and some element G less than P. You can think for concreteness, G is 2. It's a good number. And P is some very large primes, so 1024, 2048 prime. So Alice and Bob, same as our previous scenario, they want to bootstrap communication in the presence of an untrusted eavesdropper. So what they're going to do, Alice will generate some secret value A, and she will compute G to the A mod P and send it over to Bob, and Bob will compute some secret value B, and compute G to the B mod P and send it over to Alice. The eavesdropper sees the values G to the A and G to the B. Can't derive anything useful from those, but Alice and Bob can individually take their secrets and derive the values G to the A, B, mod P, both the same values. And that becomes a shared secret, which they can then use as a session key and switch to AES and start symmetrically encrypting their data. So that's Diffie Helman Key Exchange used all over the internet, one of the most commonly used things possible. So one of the reasons that people have been advocating Diffie Helman Key Exchange recently over, say, RSA is because it can provide the property of perfect forward secrecy. So assuming that Alice and Bob generate fresh, random secret values A and B for every single connection, then if, say, some large government agency is collecting all of their communications and later tries to hack into Alice or Bob or break one of their keys in order to decrypt their communications, they can't hack into Alice or Bob's computer later and then discover the key that Alice and Bob used to generate the old conversations that they had, because Alice and Bob will have already forgotten their keys that they used. So as long as Alice and Bob are generating fresh, random values every time, those values should reveal nothing about pastor or future communications. So that's perfect forward secrecy. And a lot of people who really know what they were talking about, including, unfortunately, me on the stage a couple years ago, have said you guys should always use Diffie Helman over RSA Key Exchange because of this property of perfect forward secrecy. So here's a selection of quotes that I found on the internet just for Googling perfect forward secrecy in Diffie Helman Key Exchange. And you find people saying this provides better security, the NSA can decrypt nothing, 1024, but Diffie Helman is arguably better than 2048, but RSA, no matter. And then 1024, but Diffie Helman is better than any key size for RSA. This is a selection of friends and people I respect, and also some random people on Stack Overflow, which is where we know everybody is actually getting the recommendations from. So there's been wide scale advocacy of perfect forward secrecy as like the reason that you should use Diffie Helman. It will protect you against the NSA. I'm sorry. We are wrong. And why are we wrong? I'm going to tell you a little bit more about what the crypt analysis looks like for Diffie Helman. So the underlying problem that we hope is hard for the security of Diffie Helman is called discrete log. And it is exactly sort of the problem of given one of the key exchange values G to the A mod P, compute, say, Alice's secret A. This would allow the attacker to compute the shared secret in the same way that Alice did. And sort of similar to factoring and multiplication, discrete log we think is much harder than modular exponentiation, the computation that actually gives you the value G to the A mod P. And the best algorithm that we have is called the number field sieve. So there's a lot of parallels going on here. So what does the number field sieve for discrete log look like? Hopefully this diagram is somewhat familiar to you by now. It's a multi-stage algorithm. It has many of the same stages as factoring. You can sort of line them up in parallel. The last bit looks a little bit different, but we can sort of ignore that for the moment. Okay. So we have some picture of what the number field sieve looks like. How long does it take? Answer number one, the same answer as factoring. In case you didn't remember, here it is again. This is kind of why everybody has been saying, okay, the security of 1024-bit Diffie-Hellman key exchange is about the same as the security of a 1024-bit RSA key, is because we have this same complicated formula that tells us how hard it is to break. The sort of more subtle answer for, or not more subtle, but the more practical answer for how secure is it is we can say, well, how long do we think it will take to actually compute a discrete log for commonly used key sizes? And the answer is slightly longer than factoring an RSA key of the equivalent size, but not so much longer than an RSA key. And the minimum recommended key size today is 2048 bits. Okay. So 2048-bit Diffie-Hellman, we're good, right? We can all go home. Okay. However, okay, what if you want to break many connections that use the same public parameter P? Do you have to go through this whole computation for every single connection that you want to make? That was kind of the justification for perfect forward secrecy is that every single connection should be as hard as factoring an RSA key of the equivalent size. Except that's not quite the case. Because if you look at where the actual target that we're trying to compute appears in this plot, it's only at the very last stage. So all of this only depends on the prime P. So we can actually divide up the algorithm in two pieces. A few stages that depend on the prime P that we're using. And then the final computation that takes the output of this first pre-computation stage, and that's the only stage that actually matters, that actually depends on the target of our discrete log computation. So we're in trouble. In particular, that means that if many connections are using the same prime P, you can do the pre-computation once, spend a huge amount of effort, and then the actual effort required to break individual connections using those primes is much, much smaller. So here's our current estimates. These are actually somewhat new from our paper of how long the individual log stage takes in practice if you push the parameters as far as you can to make this as fast as possible. And the answer is basically, if you're worried about a government, you better start worrying for reasonable key sizes that people were using. So I'll let Alex continue with the next part of our talk. So this fact that Nadia just told us about Diffie-Hellman and the number field sieve, this was something that the mathematical crypto people knew about. But most of us who did system security, people like me, didn't ever get the memo. So I'm here in part to apologize to everyone I've taught about Diffie-Hellman and crypt analysis who didn't get to hear about this as we were talking about perfect forward secrecy, right? This fact about the crypt analysis and how well it can apply and exactly the scenario that we're worried about this kind of situation involving mass surveillance was news to many of us. But now that we have that fact, how can we exploit it to try to break Diffie-Hellman in scenarios that we all care about? And we're going to talk about two scenarios in the talk today. The first one applies to HTTPS to encrypted web connections. And it applies not only to government agencies, but also just to normal everyday attackers with maybe the same resources that you or I have, right? This is a down to earth kind of attack on HTTPS, and we call it Logjam. Logjam allows us to break the HTTPS connections to many, many popular websites in modern browsers by fooling those browsers into using 1990s-era Backdoor Crypto. So where does this Backdoor Crypto come from? This is from the first crypto wars back in the 90s, when my country, the U.S., had restrictions on what kind and strength of cryptography could be exported and used by people abroad. So U.S. companies were prohibited from exporting products that contained cryptography that had greater than a certain strength. For RSA, that was that the key size had to be less than or equal to 512 bits, and for Diffie Hellman, it was that basically the prime had to be 512 bits or less. So back in the 90s, these were constants, these were strengths of crypto that were chosen, presumably, because these were easy for NSA to break. So if you were an American company making products like, let's say, Netscape Navigator, the web browser that initiated the first SSL protocol, you needed some way to be able to communicate with, from servers in the U.S. to clients, including your own browser, that you had shipped to people abroad, say, here in Germany. And so they came up with a way to use, to have HTTPS automatically select the appropriate key strength, depending on whether the browser was able to support the full strength cryptography or the weakened version for deployment abroad. And the way that they did that was something called export grade cipher suites. So when your browser, whenever it starts a TLS connection for an HTTPS URL, one of the first things that it does is the browser will send to the server a list of the kinds of cryptography that it can speak. These are called cipher suites. And then the server selects one of those that is compatible with whatever cryptography the server has available. And then that negotiated cipher suite is what it's used for the connection. Now, the way that they supported the 90s era backdoor crypto was by having browsers shipped abroad from the United States that could only speak a certain subset of crypto algorithms that were limited in strength to 512 bits or less. Those were the export grade cipher suites with the names you see here. Now, even though no browser has been shipped that is limited to only these suites since probably 2000 sometime when the US relaxed its export regulations, there wasn't just any one day when all of those old browsers from before that era went away. So servers, even now many servers will still accept and speak these weakened cipher suites if that's all a browser has available. Like if you're running an e-commerce site, I'm sure you still want to be able to speak to any customers who have 1998 or a Netscape navigator involved otherwise you'd lose some sales. So there was no reason to turn them off because no modern browser anymore now that those restrictions are lifted would choose these weakened suites. Well, that's what we thought anyway. So over this past year, there were two attacks that exploited these weakened cipher suites and found ways to convince modern browsers to speak them instead of full strength cryptography. The first was the freak attack, which was revealed in early 2015 by a separate group of authors and the freak attack was an implementation flaw in many modern browsers. In order to exploit it, all you need to do is to be able to relatively inexpensively factor a 512 bit RSA key. And as Nadia has told you, this is now a matter of maybe four hours, maybe $75, something like that. And if you did that, you'd be able to break all the connections coming into a weak server for a long period of time to a server that still spoke these cipher suites. So this affected most modern browsers and just shy of 10% of Alexa top million sites that speak HTTPS. Now that left the Diffie Halman export grade cipher suites, which were not affected by freak. But we came out with a paper in May of this year that showed a separate attack, the log jam attack, which is a protocol flaw in TLS and affects all modern browsers. And similarly, allows you to downgrade connections to export grade Diffie Halman and then interceptor modify the contents if the server speaks and still supports these export grade Diffie Halman cipher suites. So now let me give you the hopefully brief technical overview of how the log jam attack works. If you've you've been curious about this, this is the this is the chance to see it. So log jam is a problem that happens during the TLS connection handshake. And the first thing that happens in the handshake at the top of this diagram, so this is just your browser connecting to some website there or some server there on the right, maybe Alice connecting to her favorite website here. So the first stage is this client hello, where, you know, the normal client is going to say, I speak various kinds of cryptography, including full strength Diffie Halman. The server at that point is going to respond by picking some cipher suite, let's say Diffie Halman, and then sending over its certificate chain as well as its Diffie Halman public parameters P and G, the server gets to choose them, as well as G to the A. And then it's going to use its long term RSA key that is the thing that is signed in its certificate in order to make a signature on those Diffie Halman parameters. Okay, then it's going to do complete the negotiation and so on. In the log jam attack, we have a man in the middle attacker who's able to modify some of these messages as they're going by. So the first thing the attacker does is he modifies the client's hello message to replace all of the different kinds of cryptography the client says it supports and just put in export grade Diffie Halman. The 90s are here again. All right, so then, you know, the server will get that and if the server supports export grade Diffie Halman as a about 10% or so of servers still support export grade Diffie Halman, it's going to respond and say, okay, that's what you asked for, I'll take it and it's going to send over its side of the Diffie Halman key exchange, but using a 512 bit prime because that's what is required under these export grade suites. Now at that point, the browser would normally reject this message because it didn't ask for export grade, it really asked for full strength. So instead, the man in the middle has to modify the server's hello message and say that this is full strength Diffie Halman. Well, if it's full strength, why is it only a 512 bit prime that's being used? Well, there's actually no limitation and no distinction between the messages that the server would send in that space with PNG that says normal grade Diffie Halman has to be more than 512 bits. In fact, we've found a handful of real sites that even for normal strength Diffie Halman just happen to use 512 bit or even weaker cryptography. So as long as we modify this earlier message, the server's hello message and make it say normal strength Diffie Halman, there's no way for the client to tell from just the structure of the protocol that anything is amiss. So at this point, there is one last place that we could catch the problem that the client or the server could see that's something wrong, which is that each side sends the other a finished message here at the end that says, basically has a uses the session secrets to add an authentication code to a dialogue of all of the protocol messages that max the handshake dialogue, all the messages going back in each direction so that each side sends the same message to each side. But before have to be the same from each side's view. However, in our case in log jam, the attacker is able to spoof these messages to make them look correct to each side. And he's able to modify that dialogue. Why? Well, because we're using this 512 bit Diffie Halman that we know from using the number field sieve, we are able to efficiently break. So if we perform the discrete log on the servers on the Diffie Halman key exchange that's going by at 512 bit strength, then he can fix up the client and server hello messages and neither side will notice that anything went wrong. So that's log jam in a nutshell. I'm sorry it's a little bit complicated. So how widely shared are these Diffie Halman public parameters? Well, we used internet wide scanning to find out. So my group we also make something called ZMAP that I talked about here a couple of years ago, which lets us quickly probe everything on the internet. And so we did this and tried to make connections to each HTTPS server in the public IPv4 address space and found out what key exchange methods were supported and with what primes. And what we find is that 97% of hosts that support export grade Diffie Halman use one of only 3 512 bit primes. They're that widely shared. Why is this? It's because those parameters are very often either hard coded or encoded in standards that different people implement. The most common of these used by 80% of hosts that support export grade Diffie Halman is a public parameter that's hard coded into Apache 2.2. So it's actually there embedded in the source. You have to recompile Apache to change it. 13% of hosts supported something a second prime that is also 512 bits that's hard coded in mod SSL and the next most popular 4% was in the Sun JDK. Only 10 primes accounted for 99% of all the hosts we found in the public address space that supported export grade Diffie Halman. So if we would like to compromise these, well, Nadia just told you about how long it takes to use the number field sieve to break 512 bit discrete log. Well, we actually went and did the pre-computation for all three of these most widely used Diffie Halman primes. And our colleagues who make a tool called Cado NFS were able to implement the code for that piece of the discrete log version of the number field sieve and they ran the algorithm on these primes on a cluster they just happened to have lying around. It took about a week of time on the cluster for each of these primes. After which, using an optimized version of the last portion of the number field sieve, it takes about 70 seconds for us to break any individual connection that uses any one of these three most popular primes. So log jam and our pre-computations now allow us to break any connection to about 8% of the top million HTTPS sites from Alexa. And when we came out with this attack, it worked in all modern browsers. So mitigation. This is bad, everyone. This is the crypto all of us are using. So we do have some mitigations. This is the actual positive part is that the browser makers have now started to increase the minimum strength of Diffie Halman they will accept. So IE Chrome and Firefox will reject primes less than 1024 bits and Safari less than 768. And the new draft of TLS13 is including an anti downgrade flag that will make it even harder for such attacks to take place in the future. Now back to Nadia. So we promised in our abstract that there would be a hands-on portion of this talk. So you have a couple options. Number one is if you're really into this, you can go and download Cato NFS yourselves. Cato NFS. And run the discrete log algorithms yourself for any prime you wish and then you can compute arbitrary discrete logs. However, since we have already done some of the computations, we figured that we would make them available for you guys if you wanted to play with them. So we have done through the Twitter API. So we have a bot running on Twitter and if you would like to compute discrete logs for any of these widely used parameters, this bot will do so for you. So here's the group generator and the primes in hexadecimal for the three groups that we did the precomputation for. And if you wanted to test it out, you would do something like this. So this is using Sage, which is a Python based open source mathematics package that does a lot of algebra and number theory. If you like playing with this stuff, Sage is super cool. So I set, say, my prime M is this last value in hex there, the mod SSL prime. Then I take two and raise it to the OX1337 power, mod M. And then I print it out in hexadecimal and I get this value. Then I can copy paste it into a tweet at D log bot. Then some stuff happens on our back end. This is running on one of the machines in my lab, so please don't break it. And after a minute or two, you should get back and answer. So there's a queue, only one thing can run at a time, median time 70 seconds, it can vary between 30 and three minutes, 30 seconds and three minutes. So if it doesn't respond to you within like an hour or something, then send us a ping and we'll see if it's still running. Okay. So have fun. Please don't actually use this for malice. We already have some satisfied customers. All right. So we promised there were two exploits that have to do with weakened Diffie Helman. And the first log jam right, anyone can use back doors from the 90s to pwn modern browsers. Well, the second one is a little bit more widespread. So we're going to talk about how Diffie Helman weaknesses can be used for mass surveillance. We believe that governments can probably already right now exploit 1024-bit discrete log to break Diffie Helman for wide scale passive decryption of internet communications. So is breaking 1024-bit Diffie Helman within the reach of governments? Let's look back at these numbers quickly. So we can see that for 512-bit RSA and Diffie Helman, they're both really in reach of basically any effort right now. Any one of you can probably muster the resources to do this. For 1768-bit RSA or Diffie Helman, well, we think this is now in the reach of a concerted academic effort. For 1024, it's a little bit more complicated because the number field save algorithm is complicated enough that even making estimates of the runtime at this size and larger is very, very complicated. And having high confidence estimates is difficult. But we've tried to do the math conservatively and we believe that a conservative estimate at least for 1024-bit Diffie Helman is to break, to do those pre-computations for a single prime P would take about 45 million couriers. Now 45 million couriers sounds like a hell of a lot. But when you start to think about it, if you're going to do an effort that large, there are some optimizations you could start doing. And for instance, maybe instead of running this on general-purpose PCs like these estimates show, if you're going to do an effort on this scale, maybe you're going to tape out some chips. Maybe you're going to use custom hardware. And if we do the math and look at what kinds of gains we can get from custom hardware in other applications that are similar to this, we estimate that we can get maybe a speedup of 80 times just by doing it in custom hardware. Okay, and then we ask what that's going to cost. Well, we estimate that for the, to build a machine that could break one 1024-bit P, pre-compute for one 1024-bit P every year, would cost somewhere in the neighborhood of low hundreds of millions of dollars in a one-time investment. As a result of this, you can churn out pre-computations once a year that will let you break efficiently every connection that uses that P. Now, individual logs then are going to be close to real-time and in fact, you can reuse much of the same hardware to do the computations for individual logs very quickly. So, oh shit, this is what the estimates look like. Now, is NSA actually doing this? This is where we get into the conspiracy theories. So, there have been rumors flying around for many years, I mean, for decades, really, but sort of credible rumors for many years of some large cryptanalytic breakthrough that the NSA made. So, here's an article from James Bamford, one of the, you know, world's experts in the open world of what the NSA's activities are, and he wrote an article in 2012 saying very clearly that he had talked to multiple government officials who said that the NSA made some enormous breakthrough several years ago. Everybody's a target, everybody with communication is a target, and this computing breakthrough is going to give them the ability to crack current public encryption, and it was so secret that no oversight, anybody had sort of access to the details of it, but whatever it was, it was major and massive. Of course, you know, after we saw this, we said, oh my god, you know, what could it possibly be? Are they breaking RSA? Bamford actually goes on in this article to speculate that it's something about AES, which at least to my mind seems less likely than some kind of major public key breakthrough. So clearly, we have sort of these rumors of large breakthroughs by the NSA's tens of thousands of mathematicians. Simultaneously, we can say, you know, we know the NSA is clearly interested in cryptanalysis, is cryptanalysis on the scale of hundreds of millions of dollars within their reach? The answer, thanks to Snowden, is yes. We have some of their budgets, and they spend billions of dollars a year on computer network operations. They spend hundreds of millions of dollars on cryptanalytic IT systems, cyber cryptanalysis, exploitation solutions. In fact, a couple years ago, there was even an increase of hundreds of millions of dollars in their budget for cryptanalysis. Interesting. So $100 million of special purpose hardware is certainly within range of a government the size of ours. Additionally, we can ask what would the impact of doing one of these single pre-computations for a discrete log for a single prime would be? And the answer is actually surprisingly large. So if you did pre-computation for a single 1024-bit prime, that would allow passive decryption of connections to 66% of VPN servers and 26% of SSH servers. This is from internet-wide scanning. We connected to all of these, and we said we would like to speak Diffie-Helman with you. What parameters do you prefer? And these are the servers that preferred a single 1024-bit prime over every other parameter and key size. A second 1024-bit prime would allow passive decryption for 18% of the top 1 million HTTPS domains. These are domains that prefer to speak Diffie-Helman with this fixed prime. And the final piece of evidence for something like this being within range and at least being worth worrying about is actually some of the slides that were released last year by Derspiegel. And in particular, they have a large amount of detail about passive decryptions of VPN traffic. So here's an example. It is clear from these slides that whatever the NSA is doing, they have the ability to passively decrypt VPN connections on a large scale. And they're very happy about it. I think this is my favorite Snowden slide ever. I feel this way when I decrypt things, too. So if we take a look at what, and these slides are specifically talking about IPsec VPNs. If we take a look at what the key exchange looks like for IPsec VPNs, what happens is we have two hosts who want to make a VPN connection with each other. The key exchange actually uses a fixed set of parameters from a small list of possibilities. And so Alice and Bob will negotiate which parameter they're going to use from this list. And then they will do a Diffie-Helman key exchange. From that, they will have a shared secret. They're at G to the AB. And then in the most commonly used mode, some pre-shared key like a password that has been shared in over some other channel. And that Diffie-Helman secret that was negotiated together with the pre-shared key are mixed together to generate the session key. So if somebody wanted to break a connection of this type, one option would be to, say, steal the pre-shared key through some other mechanism and then break Diffie-Helman. That would be a possibility. So if we look at what the NSA's requirements are for their mass-scale decryption efforts, they require finding out what the pre-shared key is, getting both sides of the connection, getting both the asymmetric key exchange and the symmetrically encrypted data, and then having some metadata. These are the requirements for them to get decryption. And we can also take a closer look at what their decryption flow actually looks like. This is somewhat complicated, but in this diagram, so they're getting the Ike key exchange and the symmetric data. They're sending it into one system that they have. They're sending the Ike messages through out to some high-performance computing resources, and they then get sent back with some data from stored databases of information that returns the actual decrypted data. So that's what the decryption flow looks like. We don't have any details of the cryptanalysis, but we have details from the sysadmin's perspective of how the systems that are doing the cryptanalysis are hooked together. And they're doing something that requires high-performance computing that takes in key exchanges and hands out decrypted data. So we can line up sort of the NSA's on-demand Ike decryption with what a discrete log decryption would actually look like, and they are very close. So they would both require the pre-shared key, both sides of the handshake, both the handshake and the symmetric data, and they would send off the data to high-performance computing. So in the same set of slides, they also discuss targeted implants against particular implementations. If you were going to design a backdoor to make your life easy, you would have fewer requirements than this, potentially. There are many kinds of backdoors that you could design, but if you were being clever about it, you might try to make it a little bit easier on yourself to decrypt than this. So I will let Alex finish with this. So to wrap up what we've seen today through the cryptanalysis of Diffie Hellman is probably a mass-surveillance dream. The algorithms that we've talked about would let a government with sufficient resources to invest in these pre-computation attacks break connections on an almost unheard of scale across almost every widely used crypto protocol on the internet. Here are some numbers again. For HTTPS, the top million sites, we're looking at a device like this, one we hypothesized, breaking connections to maybe 56% of them passively. For Ike, for Internet Key Exchange V1 and V2, we're looking at in the 60% of servers are potentially compromisable using this same hardware. For SSH, for IMAP with secured encrypted connections, for SMTP with Start TLS, the encrypted mail transport, all of these protocols are potentially jeopardized by the same kind of attack because everyone fundamentally, so many people fundamentally rely on the same underlying cryptography, often with the very same public parameters that are so widely shared. So what can we do about this? So first, let's go back to the logjam attack. Again, using 90-Zera Backdoor Crypto that lets any of us break connections to modern browsers. Luckily, browsers have already started to mitigate this, as I said, by increasing the minimum strength of Diffie Hellman they support, although there's still a way to go there since they're all still accepting 1024-bit Key Exchange. Our biggest recommendation under here, though, I think the lesson is don't backdoor crypto, right? Because the backdoor crypto of 20 years ago is now coming back to bite everyone. And then we have the second attack, the 1024-bit case that enables so much mass surveillance. Well, to get around this, we're going to have to do some upgrades. Probably the easiest thing to do, and the thing that almost every cryptographer that we talk to recommends now, is to move to Elliptic Curve Crypto. Yes, there's been talk about whether the specific NIST curves may have been backdoored by NSA, but by and large, we think that Elliptic Curve is the most sound choice we have for now. Now, if Elliptic Curve isn't an option, and there are technical reasons why it might not be, at the very least, use Diffie Hellman Prime that's 2048-bits or longer. If even that isn't an option, you're using legacy systems for some reason. Well, they, or Java, yes. If there's anyone here who works for Sun, or for Oracle, please, please, tell them to fix the crypto in Java. But if that's not an option, if that's not an option, the fallback is you can generate, at least, generate your own 1024-bit Prime. Mind you, there are various tricks that you have to make sure you do. When generating a Prime, it must be a safe Prime, but there are implementations of doing this. So it's not exactly free to generate your own 1024-bit Prime, but it's inexpensive, and if you have no other option, at least that makes it so this large government adversary has to spend a lot of pre-computation, a year, perhaps, targeting you individually. And they can't just get this for free. All right, so that is our talk for tonight. We're saving a lot of time for questions. Thank you all very, very much. Nadja. Nadja and Alex, thank you very much. We installed some microphones here in the room, so please queue up. First, Signal Angel, do we have some questions from the net? Yes, we have a lot of questions. First question is, do you think it's possible that the NSA uses quantum shore factorization for 1024 or bigger keys already? I would believe it is much more likely that they're using classical cryptanalysis for 1024-bit keys than that they have a quantum computer that nobody has heard about. And another one? Is it thinkable that the NSA solved the P equals MP problem but keeps quiet? Probably not, but if they have, yes, I think they'd want to keep quiet about it. I hope they would tell us. I hope they would tell us, too, but I doubt it. Okay, and over to number one, please. Two questions. First, is it reasonable to think that they, or is it possible they are attacking individual RSA keys that they can factor individual RSA keys in about a week with custom hardware, and to NSA Suite B came out 2005 and they don't use Diffie Helman, so NSA themselves didn't, they told us in 2005, we won't use Diffie Helman, so is it reasonable that when they change the requirements for top secret that we should follow? Well, to the first part of your question about whether they're factoring RSA, I think the answer for 1024 is very likely, yes, they are for high-value targets. So if you're a major website, at least, and you're using a 1024-bit RSA key, well, it's long past time to change to a higher strength. If the NSA has not factored a 1024-bit RSA key, I'm going to be very disappointed and I'm going to ask where my tax dollars are going. And also, I think actually the point of sort of watching what the defensive side of the NSA is advocating in terms of recommendations is actually a wise thing to do because as far as we know, at least the public recommendations defensively should, I mean, they're making recommendations for people who are building systems that are going to be handling classified data, so they should be solid recommendations as far as we know. What the NSA has told me about those recommendations, by the way, is that as long as you follow them exactly, you're going to be okay. But if you deviate in any small way whatsoever, then they make no guarantees whatsoever. So think about what that might mean in terms of your implementation, the next time you read through those particular recommendations that they make. Okay, then we hope over to microphone 3, please. So for the moment it's elliptic curve-based is Diffie-Hellman secure? I hope so. So does no... It doesn't suffer from the same shape of attack that we've described here. As far as we know, there's not a way to do this same kind of pre-computation for elliptic curve-Diffie-Hellman. So what we didn't mention in the talk is, so, I mean, one of the reasons that elliptic curve keys are so much shorter than, say, finite field Diffie-Hellman or RSA is because we have these super powerful index calculus number field SIV type algorithms for factoring and for discrete log over finite fields and those don't seem... We don't actually have equivalence of those algorithms for properly generated elliptic curves. So that's why those key sizes are shorter and that's why we think that they seem to be more secure. Then we take another one from microphone 3, please. Yes, you said that when doing the pre-computations for known commonly used primes, you can reduce your effort. You have to put in a single connection to about 70 seconds. How is that usable? If my TLS handshake is delayed 70 seconds, I already run away. So we'll refer you to the paper for the full answer to that. But it turns out there are a bunch of tricks that you can do to keep a session handshake open for at least 70 seconds. So this may not be what you want to do to the connection, say, in a web browser that's loading index HTML, but whichever one is loading, say, the, I don't know, the one pixel tracking image in the background that nobody sees, which is also getting the same session cookie, that one you can hold open for 70 seconds without the user noticing. So what we've been able to do is show a variety of ways that we can trick browsers and other implementations into holding the connection open long enough. Also, 70 seconds is just what we were able to do with a few weeks of hacking around and optimization. We think that with not that much more effort, we could get that number down quite a bit more. But 70 seconds we think already is not so bad, and there are plenty of ways that we can exploit it. Perfect concept. Okay, do we have something from the net? How long do you estimate the security of RSA, DHE to sustain, and do you have any idea if and when there are any quantum encryption algorithms that will soon be available to be used by a broad public? Oh, quantum encryption algorithms. You should watch Dan and Tanya's talk from yesterday evening. Yeah, last night was the time to hear about that. The dangers of quantum cryptography. I mean, the short answer is that people who know what they're talking about have said that we should start worrying now and you might see quantum computers within the next 15 years, maybe. But it's really hard to speculate about advances in physics that may be pretty far off. Do we have another one? Sure. What's your opinion on the NIST curves, especially with the current rumors about the curve parameters having a backdoor? There are no known ways that the curves could have been backdoored with the given parameters. But if you don't trust them, you know Dan Bernstein has a curve you can use too. So, do you trust Dan or do you trust the NSA? Over to two, please. It sounded a little bit that you recommend or that you say that Diffie-Helman is worse than RSA now. So, are you saying this? Does it mean I should switch back to RSA using it instead of preferring it instead of Diffie-Helman? With equivalent key sizes, equivalent sizes of your prime or your RSA modulus, yes, we are saying that, that in the 1024-bit case, there's strong reasons that you should distrust the very common repeated primes for Diffie-Helman. But that's not the whole story. So, for longer sizes of modulus, longer, larger strengths of crypto, RSA is probably still okay. But I think either way, switching to elliptic curve for key exchange is probably the thing to do right now. I think the precise statement that we can make is if you're comparing 1024-bit Diffie-Helman to a 1024-bit RSA key, that if you're using Diffie-Helman with the most commonly used parameters, say the Oakley Group 2, that everybody on the internet is using, and you think it is likely that a large government agency has already done the pre-computation for that prime, then breaking an individual connection using that prime with Diffie-Helman key exchange would be much, much, much less effort than factoring a freshly generated 1024-bit RSA key that is unique to you. If that 1024-bit RSA factorization is within range of the NSA, it may not be worth their while to actually factor your key, whereas breaking a Diffie-Helman key exchange, they've already done the hard work to break everybody on the internet, so you're just one more fish. That's the precise statement that we can make about the security. The real answer, use elliptic curves or use 2048-bit Diffie-Helman, probably fine. And over to number one, please. How realistic is it to use or to create a new prime for every exchange or at least every few exchanges? So, unfortunately, the properties that you need for discrete log to be hard, you need to have a safe prime and you would hopefully like it not to be backdoored. Generating safe primes is still kind of effortful on modern hardware. I mean, if you try to do it on your laptop, it'll probably take like, I don't know, a minute or something. So it's actually a lot of effort to generate a new safe prime all the time. Just use a larger safe prime and you'll be better. So we're running out of time, but let's start with number two. Said that elliptic curve cryptography is not susceptible to this pre-computation attack. Is that luck or is it engineered to be that way? Luck? I mean, a combination of both. So as far as we know, you can do pre-computation with elliptic curve. So the sort of generically the best thing that you can say is you can do a lot of pre-computation, but you still have to do a lot of effort sort of for each individual value. So you could do, generically, if you want to break an elliptic curve, you could do like a square root of n attack against the key size. Say n to the two-thirds pre-computation and then you would have n to the one-third online work, if that makes sense to you, but you get less effort as far as we know or less benefit. Okay, we're going to finalize them with number four. What do you think about blacklisting these common primes just in the modern process? Will this get rid of this issue? Just blacklisting the common primes. Well, if you blacklisted the common primes when we first came out with this, you'd immediately break about 10% of websites because there's not a good fallback mechanism if you don't like the prime you got during key negotiation. What the browsers are more likely to do is to phase out this kind of finite field Diffie-Hellman entirely over the next larger number of years. So first they're going to start rejecting things that use unusually weak primes. That's what they're doing already today. But I think in the long term, they're going to encourage the use of elliptic curve as an alternative. If you want forward secrecy, elliptic curve will be the way to get it. Thank you, Alex. Once again, thank you so much. Thank you.