Okay, everybody, it's time to continue. I want to remind you to rate and comment on the presentations today. So let's welcome our next speaker, Huzaifa.
Can you hear me now? Okay. Thank you for coming to this presentation. I know the topic looks a little bit technical, and a lot of people get intimidated by the amount of mathematics involved. Not everyone has a PhD in mathematics, and neither do I. But I'll try to keep it as interesting and as simple as possible. The presentation is on Logjam. If you were in the first presentation, which George gave, one of the big names on the left-hand side of his screen, which he probably did not spell out, was Logjam. This is a vulnerability in Diffie-Hellman key exchange. If you don't know what Diffie-Hellman key exchange is, in just a few slides I'll try to explain it as simply as possible. There is a really interesting issue in Diffie-Hellman key exchange, and I tried to give a little twist to the presentation and mention why government agencies would allow these kinds of issues.
My name is Huzaifa. I am a principal software engineer, and I have been working with Red Hat for 10 years now. I'm a part of various upstream security teams; last I counted, it was around 15. So I'm a part of Mozilla, WebKit, LibreOffice, X.Org, PHP, and a couple of others which I don't remember very well. I spoke about Shellshock last year, and it was very well received. I used to work with Josh before he moved on to the team which has more money now. He mentioned some attributes which security engineers have, so I looked at the back of my mind and figured out that I fit most of the things he talked about, including "does not like to talk" and stuff like that. He also mentioned that we are extremely intelligent. So I would like to... Before I say anything, a small disclaimer.
Last time, management spoke to me, and they did not like certain words which I used during my presentation. So a small disclaimer: anything which I say in this presentation is strictly my opinion and probably is not the opinion of my employer. So let's get started.
We have government agencies out there. And when we talk about government agencies, the first thing which comes to mind is definitely the NSA or GCHQ. There are probably other agencies out there as well. If you look at the Snowden files and a lot of other references, they are actively trying to crack cryptography online. The Snowden files tell us that there are three things which they are trying to attack right now. The first is the SSL/TLS protocol, because a lot of traffic uses that. Till Maas talked about HTTPS and SMTPS and whatever protocols have an S at the end; they are wrapped in SSL/TLS. They are also interested in trying to break VPNs, and when I say VPN, it's mostly IPsec. They are also trying to break into SSH, which is probably 26% of all traffic on the internet.
When you want to break crypto, there are basically two ways of doing that. One is to attack an implementation, and we have seen the recent case of Juniper, in which there was a backdoor or something like that. There are two ways of doing that as well. You pay tons of money to the person who makes the hardware or the software, and you persuade him to put a backdoor in it. That's one way. The second way would be to find a flaw in the implementation and exploit it. There is a really big problem with this kind of approach, which is that not everyone uses the same implementation.
If I'm using a particular hardware device, not everybody on the internet uses the same kind of hardware or software. So whatever effort you spend in trying to crack the implementation is probably only going to affect a small number of people on the internet. The second and better way is to attack the actual algorithms which run these security protocols on the internet. They are generic across all the crypto products we have. Everybody uses AES, so if it were possible to crack AES, that would probably be the holy grail of backdooring all crypto on the internet. Once you crack AES, it's like 99% of the game is done.
Most of the crypto protocols on the internet use a combination of symmetric and asymmetric crypto. Symmetric crypto is hard to break. There are two symmetric algorithms which are mostly used. One is AES, which I just mentioned, and the second is RC4. You don't need to break RC4 because it's already partly broken. Most cryptographers seem to think that symmetric crypto is very hard to break, and we are just going to believe them in this case. On the other hand, asymmetric crypto is probably not that hard to break. And when we talk about asymmetric crypto, right now there are two algorithms which are widely used on the internet: RSA and DH, which is Diffie-Hellman. So we are going to have a brief look at each of them and see which one would be more difficult to break.
So, a very simple representation of RSA. There's Bob and Alice. This is basically public key encryption. Alice wants to send a message to Bob and make sure that the government agencies in between Alice and Bob are not able to see what the message is. The first step is to generate a public-private key pair. Alice needs to think of two prime numbers, P and Q, and there is the product of the two primes, which is n = PQ.
And then there are two more things which come into play, and I'm not going to go into the details of how they are derived right now. There's an E, which is the encryption exponent, and there's a D, which is the decryption exponent. So Alice creates a public and a private key pair, and the communication actually goes the other way: Bob encrypts the data with Alice's public key, the data is sent across the internet, and because Alice has her own private key, she's able to decrypt it. I'm being a little bit fast because we don't have a lot of time.
So what are the best known attacks on RSA? The best known attack is factoring n into its component primes P and Q. In the last slide, we saw that n = PQ, and that's the whole story around which RSA works. So if you really want to attack RSA, you need to figure out a way to factor n into its component primes, and it seems that it's not very easy to do something like this. The best known algorithm which we have is called the number field sieve, and there is a brief description of how the algorithm works. So how much time does it take to break? This is really interesting. If I have a 512-bit RSA key, it's going to take one core-year, which basically means that if you have a machine with one core, it's going to take you one year to factorize n into P and Q. Which is quite silly right now, because my laptop has got four cores or two cores or something like that. So probably if I run some software on my laptop for a couple of hours, I would be able to break a 512-bit RSA key, which is pretty silly right now. 768-bit is supposed to take 1,000 core-years. There is freely available software on the internet.
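As an illustration of the key pair and the factoring attack described above, here is a minimal sketch using deliberately tiny primes. The specific numbers, the function names, and the trial-division factoring routine are assumptions made for this example; real attacks on 512-bit or larger keys use the number field sieve, not trial division.

```python
# Toy RSA with tiny primes. Illustration only: real keys are 2048 bits or more.
p, q = 61, 53                  # the two secret primes
n = p * q                      # public modulus, n = pq
phi = (p - 1) * (q - 1)        # used to derive the exponents
e = 17                         # public (encryption) exponent
d = pow(e, -1, phi)            # private (decryption) exponent, Python 3.8+

def encrypt(message, e, n):
    """Bob encrypts with Alice's public key (e, n)."""
    return pow(message, e, n)

def decrypt(ciphertext, d, n):
    """Alice decrypts with her private exponent d."""
    return pow(ciphertext, d, n)

def factor(n):
    """Trial division: only feasible because n is tiny here."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 1
    raise ValueError("n is prime")

# An attacker who factors n can rebuild the private exponent.
fp, fq = factor(n)
recovered_d = pow(e, -1, (fp - 1) * (fq - 1))
```

Once `factor(n)` succeeds, `recovered_d` equals `d`, which is why the whole security of RSA rests on factoring n being hard.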
If you buy an EC2 instance, it's going to cost you 72 USD to break a 768-bit RSA key. What you need to do is hire EC2 for, say, 4 or 5 hours, and run freely downloadable software on the EC2 instance. Probably in a few hours you would be able to break a 768-bit key. As the key size increases, as you see, it becomes increasingly difficult. A 1024-bit RSA key is supposed to take 1,000,000 core-years, and there is no public proof that it has been done. Probably government agencies have this kind of infrastructure and much better algorithms, and they would be able to break these kinds of keys. The recommended size currently is at least 2048 bits.
Now comes Diffie-Hellman key exchange, probably as widely used on the internet as RSA. Diffie-Hellman key exchange is much simpler to explain than RSA. We again have two parties who want to exchange data with each other. What they need to do is think of two numbers. One number is a prime, P. And there is one more number they need to think of, G, which is called the generator. So Alice and Bob need to think of two numbers: a prime P and a number G. Once you do that, Alice thinks of a private key A, and her public key is G raised to A mod P, which is sent to Bob. Bob does the same thing. Bob thinks of a private key B, and his public key is G raised to B mod P, which is sent to Alice. I know it's a little bit complicated. Now, how hard or how easy is it to break Diffie-Hellman key exchange? Diffie-Hellman key exchange is much harder to break than factorization.
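The exchange described above can be sketched in a few lines. The tiny values for P, G and the private keys are assumptions chosen so the arithmetic is easy to follow. The sketch also includes the final step in which each side raises the other's public value to its own private key to reach the same shared secret.

```python
# Toy Diffie-Hellman exchange. Illustration only: real primes are 2048 bits or more.
P = 23   # public prime
G = 5    # public generator

a = 6    # Alice's private key (kept secret)
b = 15   # Bob's private key (kept secret)

A = pow(G, a, P)   # Alice's public value, G^a mod P, sent to Bob
B = pow(G, b, P)   # Bob's public value, G^b mod P, sent to Alice

# Each side combines the other's public value with its own private key.
shared_alice = pow(B, a, P)   # (G^b)^a mod P
shared_bob = pow(A, b, P)     # (G^a)^b mod P
```

An eavesdropper sees P, G, A and B, but recovering `a` from `A` is the discrete log problem.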
Basically the whole attack comes down to something which is known as the DLP, or discrete log problem, in which we need to solve the equation Y = G raised to A mod P for A. The best known attack is called the number field sieve, and the algorithm is shown on the screen. How much time does it take to break Diffie-Hellman key exchange? Slightly more time than RSA. A 512-bit DH key will take around 10 core-years. 768-bit will take 35,000 core-years. 1024-bit will take around 45 million core-years. And the recommended key size right now is 2048 bits.
So why Diffie-Hellman key exchange? If we have a good asymmetric encryption protocol on the internet, why do we need something else? Enter something which is known as PFS, or perfect forward secrecy. What PFS basically addresses is that, theoretically, the government agencies could record traffic on the internet. If you use RSA to encrypt traffic on the internet, and you use the same key for 2 years or 5 years or something like that, which is quite normal on the internet, then government agencies could record all the traffic which is passing on the internet using that key. And now they have 2 or 5 years, so they have a lot of time to factorize your key. So theoretically, 1 or 2 years down the line, when they are able to factorize your n, they would be able to decrypt all the traffic which they have recorded. So what you need to do is sit on the internet, figure out when the cert is going to expire, and record all the traffic which is passing through the pipe. And once you are able to factorize n, you are automatically able to decrypt all the traffic which you have recorded.
That is not the case with Diffie-Hellman key exchange, because when Diffie-Hellman key exchange initially came out, they basically assumed that each time Bob talks to Alice, a new P and a new G would be automatically negotiated. So when I talk for the first time, I think of a P and a G, then the connection goes down. When I talk for the second time, new P, new G. And when I talk for the third time, new P and new G again. So even if I'm recording all the traffic on the internet, unless I know what private parameters were used previously, it's really difficult to decrypt the traffic. So a lot of people started to advocate Diffie-Hellman key exchange over RSA. A lot of cryptographers also said that if you really want privacy on the internet, then you should use Diffie-Hellman key exchange. I have a few quotes from some famous cryptographers, which I did not get the time to put on the slides.
But there is a problem here. The big problem is that perfect forward secrecy basically says that for each connection, you need to use a new P and a new G, but that is not really the case. We don't normally use a new P for each connection. For many connections, we use the same prime P. And if many connections are using the same prime P, then you can do a lot of pre-computation and you can break Diffie-Hellman key exchange. So I'm going to show you a slide on how you can do a lot of pre-computation, and by doing pre-computation, you can actually quite easily break Diffie-Hellman key exchange.
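To make the pre-computation idea concrete, here is a minimal sketch. It uses baby-step giant-step, a much simpler discrete log algorithm than the number field sieve used in real attacks, chosen only because it fits in a few lines. The prime, the generator, and the function names are assumptions for this example. The point it illustrates is the same: the expensive table depends only on P and G, so it is built once and reused for every connection that shares that prime.

```python
from math import isqrt

def precompute(p, g):
    """One-time work that depends only on the shared prime p and generator g."""
    m = isqrt(p - 1) + 1
    table = {}
    current = 1
    for j in range(m):            # baby steps: store g^j -> j
        table.setdefault(current, j)
        current = current * g % p
    giant_step = pow(g, -m, p)    # g^(-m) mod p, Python 3.8+
    return m, table, giant_step

def solve(y, p, precomputed):
    """Per-connection work: find x with g^x = y (mod p) using the table."""
    m, table, giant_step = precomputed
    gamma = y % p
    for i in range(m):            # giant steps: look for y * g^(-i*m) in the table
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * giant_step % p
    return None

P_SHARED = 1000003                # small prime standing in for a hard-coded one
G_SHARED = 2
TABLE = precompute(P_SHARED, G_SHARED)   # done once

# Two different "connections" that reuse the same prime are both solved
# from the same table.
recovered = [solve(pow(G_SHARED, secret, P_SHARED), P_SHARED, TABLE)
             for secret in (12345, 999999)]
```

If every connection negotiated a fresh prime, `precompute` would have to be rerun each time and the attacker would gain nothing from earlier work.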
So this is probably a gap between the cryptographers and the people who actually implemented the libraries. The cryptographers initially thought that if you really implement perfect forward secrecy, then each time you are going to generate a new P and a new G, but that's really not the case, because you generate a P once and you probably use it for five years or ten years or something like that. So how do we use this to exploit connections on the internet?
Enter something which is known as Logjam. When we talk of Logjam, we are not only talking about government agencies who have large computational resources. We are talking about people like you and me, who have laptops with, say, two cores or four cores, and we can actually use this vulnerability to do an active man-in-the-middle attack and figure out what data is being sent on the internet. In the 1990s, and the years before and after that, the US had a very strange policy in which they did not allow crypto of higher strength to be exported. For Diffie-Hellman, I think they capped it at 512 bits. So it basically means that you could not export crypto whose strength was more than 512 bits for Diffie-Hellman. We of course don't know why that was done, but probably because it's easier for US-based government agencies to break these kinds of low-strength ciphers. Because of that, they invented something which is known as export-grade ciphers. Export-grade ciphers are basically weak algorithms, like 40-bit symmetric keys, or 512-bit RSA, or 512-bit Diffie-Hellman key exchange. During the SSL handshake, the server gets to select which cipher will be used. A lot of the crypto libraries which were made, think of OpenSSL or NSS or GnuTLS, have support for these export-grade, low-strength ciphers as well.
Post 2000, what happened was, I think, the US became a little bit more relaxed. They removed these export restrictions, and you could export cryptographic algorithms which have more strength. However, something interesting happened. The US relaxed its restrictions, but the browsers did not remove the support for these low-quality export-grade ciphers. That was basically done because the servers need to maintain a kind of backward compatibility with these browsers, blah, blah, blah. So post 2000, though the US allowed higher-quality ciphers to be exported, these low-quality ciphers did not get removed from the code which runs in our browsers and on our servers.
In the past two years, there have been two attacks which exploit something like this. You may have heard of FREAK. FREAK is an attack on RSA. And when Josh mentioned that these named attacks have gone down, I basically disagree, because we just saw an attack called SLOTH. I'm not sure how many have heard of that. It is about how TLS uses MD5 to protect some signatures during the handshake, or something like that. So it's definitely not going down, and Josh doesn't look very happy with that. FREAK uses 512-bit factorization. It's an attack on RSA in which a man-in-the-middle attacker can downgrade the RSA connection to 512 bits. It's an implementation flaw, and the one which we are going to look at right now is actually a protocol flaw. A protocol flaw is really more serious than an implementation flaw. This attack uses a fast 512-bit Diffie-Hellman discrete log computation, and it downgrades modern browsers to use a 512-bit Diffie-Hellman key. I have a very good representation here, which I hope you will be able to follow. So this is Alice, and she's trying to connect to her e-commerce or online bank website and make some transaction.
The first four of the five arrows which you see are the handshake. The first handshake message is called client hello. When client hello happens, the client sends a list of cipher suites which it supports, and the server gets to select one of them. A Logjam attack is a man-in-the-middle attack, so there's a man-in-the-middle attacker between Alice and the website which she's trying to connect to. When Alice sends a client hello, the man-in-the-middle attacker will remove all the cipher suites which Alice supports and add only one cipher suite, which is called DHE_EXPORT. So he only adds support for the export cipher suite. That goes to the server, and we know that servers are configured in such a way that the export cipher suites will also be on. The server will select the export cipher suite. And I know it's a bit boring, I can figure that out because Josh is yawning. So let me try to make this a little bit more interesting.
So yeah, you try to contact your bank website, and there's a man-in-the-middle attacker. What the attacker does is downgrade your connection in such a way that the server will only be able to choose the export-grade weak ciphers which we discussed some time back. Now the catch with SSL is that the handshake has to complete. The last packet which needs to go between the client and the server to complete the handshake is called finished: client finished and server finished. The handshake is made in such a way that the finished packet needs to contain a combination of all the packets which were previously sent, and the finished packet needs to be signed. It is impossible for a man-in-the-middle attacker to sign it unless and until he has the private key. So here comes a very, very interesting part.
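The cipher suite stripping step described above can be sketched as a small simulation. This is a simplified model under stated assumptions: the suite names are illustrative stand-ins, the server preference logic and the `mitm_rewrite` function are hypothetical, and no real TLS messages are involved.

```python
# Illustrative suite names, not a real server configuration.
STRONG = "ECDHE-RSA-AES128-GCM-SHA256"
DHE = "DHE-RSA-AES128-SHA"
EXPORT = "EXP-EDH-RSA-DES-CBC-SHA"    # export-grade DHE with a 512-bit prime

# The server still has the export suite enabled for backward compatibility.
SERVER_SUPPORTED = [STRONG, DHE, EXPORT]

def server_select(client_offer):
    """Pick the first suite, in server preference order, that the client offered."""
    for suite in SERVER_SUPPORTED:
        if suite in client_offer:
            return suite
    raise ValueError("no common cipher suite")

def mitm_rewrite(client_offer):
    """Hypothetical attacker: replace the client's list with export DHE only."""
    return [EXPORT]

client_hello = [STRONG, DHE]          # what Alice's browser really offers

honest_choice = server_select(client_hello)
attacked_choice = server_select(mitm_rewrite(client_hello))
```

Without the attacker the server picks the strong suite. With the rewritten client hello it has nothing to choose from except the export suite. The sketch stops at the selection step and does not model the finished message.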
I said earlier that perfect forward secrecy only works if a new prime is generated each time. That is not really the case, because a lot of implementations, including Apache, mod_ssl and a lot of other web servers out there, hard-code primes. What these implementations basically do is think of a good prime number. So let's think of a good 1024-bit prime number and hard-code it in our implementation of Apache. Then each time you use mod_ssl plus Apache to do a DHE, a Diffie-Hellman key exchange, you are going to use the same prime. There are various reasons why this is done. It is very difficult to generate primes. It is very difficult to generate safe primes. And the bigger the prime, the more difficult it is. So if you want to generate a 2048-bit safe prime, it is going to take a lot of time. If you go to your favorite website, facebook.com or gmail.com, and it takes them 2 minutes just to generate the prime, you are not going to use the website.
So the really interesting part is that a lot of these implementations generate primes once and hard-code them. And I have a slide which shows you how bad it is. 97% of all HTTPS servers on the internet use 3 primes. There are 3 prime numbers which were generated, and they are used by 97% of all HTTPS sites on the internet. The top 10 primes account for 99% of all HTTPS sites. So theoretically, if I am able to break 10 primes, I am able to decrypt 99% of all HTTPS traffic on the internet. And if you want to look at these primes: 80% of the hosts use Apache 2.2, and it has got a 512-bit prime which was hard-coded in the year 2005. 13% of all hosts use mod_ssl, and they use a 512-bit prime which was coded in the year 1999. And if you are a proud Java developer, then 4% of all websites which use Java on the internet are using a 512-bit prime which was coded back in 2003.
So basically, if a website on the internet is using a 512-bit prime, then most probably this prime has already been broken, because it is now very, very easy to break these small prime numbers. So how does one attack this kind of implementation? If you go to my previous slide and look at the algorithm which is used to attack, this algorithm can be broken into two parts. There is a first part in which you give P as the input, and you do step one, step two, step three, step four. And at step number five, you will need Y and G. So if I know what P is, and I do know what P is, because we just saw that a lot of implementations are hard-coding P, I can compute steps 1, 2, 3, 4, which is, say, 90% of the algorithm. So it's very easy to attack. I look at the Apache code. I find out what prime is used by Apache. I use this method to do pre-computation, and I create a pre-computation table. Then, when a message is sent on the internet, my work up to the point where Y and G are required is already done. So I just need to know what Y is, which I can see from the packets on the internet. I'll feed Y to the algorithm, and it's going to take me 70 seconds to crack. So I do a lot of pre-computation. Probably my pre-computation will take a few years. If I have a lot of cash, I'll buy a supercomputer, or I'll hire a lot of cloud machines or something like that, and do pre-computation for a few years. And once my pre-computation is done, it takes only 70 seconds for each connection to be broken. So I'll see what Y I have, and it'll take only 70 seconds. I hope that makes sense. Next slide.
This is some work done by some researchers. If I want to break a 512-bit Diffie-Hellman key, my first step is going to take 3 hours. The second one is going to take 15 hours.
The third one is going to take 120 hours. If I combine all of them, this is my pre-computation. And the last step, which involves actually sniffing packets on the wire, finding out what Y is, and finding out what the private key is, is just going to take 70 seconds. So after one week of pre-computation, I have a lot of data, and I need only 70 seconds to break packets on the wire. So Logjam and our pre-computation can be used to break 80% of the top 1 million HTTPS websites on the internet. What are the mitigations which I have? Most of the browsers which we know have raised their limit from 768 or 512 to 1024 bits. TLS 1.3 is supposed to have an anti-downgrade flag in the client random during the handshake. So these are the two mitigations which we could use.
There is a second attack scenario as well. This basically says that even if you have a larger Diffie-Hellman key, even if I have a 1024-bit key, it is quite possible that government agencies may be able to break it. To give you a simple example, doing pre-computation on a hard-coded 1024-bit prime is going to take me 45 million core-years. But this is the kind of research which researchers do. There are a lot of optimizations, and we need not run these algorithms on generic computers. We can run them on specially designed FPGAs, special chips, and once you do that, it is possible to speed up by around 80 times. So if I want to break 1024-bit... and I have 10 minutes, okay. So if I want to break a 1024-bit key, it's going to cost me around 100 million USD. There are reports which say that government agencies are actually doing it. And why would they want to do something like that? Because if I am able to pre-compute a 1024-bit prime, I would be able to decrypt 66% of IPsec VPN traffic on the internet, right?
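The figures quoted above can be sanity-checked with a few lines of arithmetic. The numbers are taken from the talk. The variable names are only for this example, and dividing the core-year estimate by the quoted speed-up is an assumption about how the two figures combine.

```python
# 512-bit case: the three pre-computation stages quoted in the talk, in hours.
precomputation_hours = [3, 15, 120]
total_hours = sum(precomputation_hours)    # one-time cost per prime
total_days = total_hours / 24              # roughly "one week"
per_connection_seconds = 70                # cost per intercepted connection

# 1024-bit case: generic estimate versus specialised hardware.
core_years_1024 = 45_000_000
hardware_speedup = 80
equivalent_core_years = core_years_1024 / hardware_speedup
```

The one-time cost is 138 hours, a little under six days, while the per-connection cost stays at about a minute. That asymmetry is what makes a shared prime so attractive to attack.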
So if I spend 100 million USD, and if I pre-compute a 1024-bit prime, I'll be able to decrypt 66% of all VPN traffic on the internet. I would be able to decrypt 26% of all SSH traffic, right? So there is a lot of motivation for the government agencies to spend a lot of money to do pre-computation and to be able to decrypt the traffic. Pre-computation for the second most common prime will result in decryption of 18% of all HTTPS traffic on the internet, right? So, yeah. What is the mitigation for this? The mitigation is, if it is possible, move to ECC. If ECC is not an option, then increase the bit size. Using 1024 bits is probably not safe, so increase to 2048. If bigger primes are not an option, then see if it is possible to generate a fresh prime each time, which is really difficult to do right now. And this is probably the end of my presentation. I would like to acknowledge that there was a presentation given during 32C3, and I've tried to steal a lot of graphics from that presentation. If you have any questions, I think we have some time. I would like to apologize. It was a little bit technically difficult, but yeah, I tried to...
Yes, theoretically speaking, we could do something like that. Yeah. So the question is, would it be possible to regenerate the primes in each release of mod_ssl, right? That's the question. So theoretically, we could do something like that. I'm not sure how to answer the question, but yeah, that is definitely something which we could do. But my suggestion at this point would be to move to a bigger key size and, if possible, to move to ECC. Yeah. Updated each release. Yeah. Yeah. Yes. How difficult is it to regenerate? Why doesn't everybody generate it every time Apache starts? Because it takes time to generate a prime number. Once the prime number is generated, there are a few requirements which the prime number needs to satisfy.
If you have heard of it, there was a socat security advisory some time back. It seems socat was using a 1024-bit prime number which was not even prime, and they had been doing that for a long time. Right. So it gives you a simple example of how difficult it is to generate a prime number, and how difficult it is to actually check if it's a safe prime number. Yeah. OK. So Josh is basically saying that there is a YouTube video. OK. It's called Numberphile. It contains some information on how you check if it's a safe prime number. Right. Yeah. There's only one side generating it. The server is the one that sends the prime number. OK. Yeah. And if you try that, exactly. Exactly. So if you try to generate a 2048-bit prime with that, it's going to take a lot of time. Yeah. Yeah. Safe. Yeah. Safe prime. Yeah. Subfactor, which a lot of them. Yeah. But that would be a Diffie-Hellman prime versus a DSA prime. DSA uses a much smaller number. Smaller number. Which is still safe. But it's hard to check. And it's much, much faster. For a particular prime, it must be a safe prime. Right. I hate using the word safe, because it means something else to a mathematician. So we should have asked. Yeah. Yeah. But the subprime must be a Sophie Germain prime. Right. Yeah. And this means that the base prime and the prime you are using must both be prime. So this is a very special kind of prime. And generating those takes, for 2048 bits, basically something like 15 minutes on new hardware at four gigahertz. This is basically like once a week, that's the maximum you can do for a lifetime. Yeah. Safe prime. That's a very good sign. I think, I think it's all. Secure microphone. Absolutely. All right. Yeah. In a safe prime, if you take the prime, subtract one and divide it by two, it's another prime number. It doesn't actually have to be a safe prime to be secure in DH.
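The safe prime definition given in the discussion above (P is prime and (P - 1) / 2 is also prime) can be checked with a short sketch. The Miller-Rabin test and the function names below are assumptions for this example and are not taken from any particular TLS library. The test is probabilistic, so it reports "probably prime" rather than giving a proof.

```python
import random

SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)

def is_probable_prime(n, rounds=40):
    """Miller-Rabin primality test with random bases."""
    if n < 2:
        return False
    for sp in SMALL_PRIMES:
        if n % sp == 0:
            return n == sp
    d, r = n - 1, 0
    while d % 2 == 0:             # write n - 1 as d * 2^r with d odd
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False          # a is a witness that n is composite
    return True

def is_safe_prime(p):
    """p is a safe prime if p and (p - 1) / 2 are both prime."""
    return is_probable_prime(p) and is_probable_prime((p - 1) // 2)

def generate_safe_prime(bits):
    """Try random candidates q until q and 2q + 1 are both prime."""
    while True:
        q = random.getrandbits(bits - 1) | (1 << (bits - 2)) | 1
        p = 2 * q + 1
        if is_probable_prime(q) and is_probable_prime(p):
            return p

small_safe_prime = generate_safe_prime(32)
```

At 32 bits the search finishes almost instantly. Because both q and 2q + 1 have to be prime at the same time, suitable candidates get much rarer as the size grows, which is consistent with the long generation times mentioned in the discussion for 2048-bit safe primes.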
The same primes that you use in DSA would work. But in DSA, the way you generate that prime is you create a smaller prime that's still big enough to be safe. And then you publish that smaller prime, and people can look and see, oh, yes, that is actually a factor of P minus 1, therefore I know P is safe to use for Diffie-Hellman. No. None of the protocols for Diffie-Hellman include an ability to send the subprime. So the only way to do this is to create a fully safe prime, and those are more expensive to generate. Yeah.
So if I understand correctly, some software bundles, like, three primes, which are usable. Would it help to generate a few hundred thousand primes and give it to them, you know, to model it there? How much time is it going to take to generate a hundred primes or a thousand primes or something like that? Plus, I think primes have a property, which is that the bigger the number, the rarer the primes become. So probably Zob knows about that. Zob, he's basically asking, is it possible to generate one hundred or one thousand primes? So I told him that primes have a property that the bigger the number, the rarer the primes become. Yes. So it's extremely difficult to generate. Yes. I'm asking, if we have them generated, would that help with this problem? So basically it seems I'm out of time. Well, thanks a lot.
Why? No, when I say it's a protocol flaw, I basically mean the TLS protocol, because the TLS protocol does not specify that a new P and a new G have to be generated. It specifies that you use a P and you use a G, but that's the only thing. The second thing is, there are two parts to Logjam. There is a protocol... Actually hard-coded into this one.