Good morning. My name is Joshua Fried, and today I'd like to talk to you about a kilobit hidden SNFS discrete logarithm computation that I worked on this past year, along with Pierrick Gaudry, Nadia Heninger, and Emmanuel Thomé. So to start, a brief review of finite field Diffie-Hellman key exchange, which you've all seen before, with your usual cast of characters. You have Alice and Bob, who want to compute some shared secret over a public channel. So they use a publicly agreed-upon pair of parameters, a prime P and G, a generator of some subgroup modulo P, for their computations. They compute their public values modulo P, and separately can both derive the same secret. Now, what I actually want to focus on here are the prime P and the generator G that they pick, and where those actually come from in practice. Today, it really depends on the protocol. In several cases, protocol specifications in RFCs actually name a set of primes that implementations should use for Diffie-Hellman: TLS version 1.3 does this, so does IPsec (IKE), which is used for VPNs, and SSH also uses a few of these standardized primes. In other cases, when the protocol leaves it up to the implementation, the primes are actually distributed in the implementations. The Apache web server distributes some primes for TLS versions before 1.3, OpenSSH distributes primes, and the Java JDK also includes a bunch of fixed primes that users of its crypto library can use. And then finally, some users actually generate their own primes, which is possible to do in SSH and in TLS versions prior to 1.3, but this isn't done that often in practice; it's usually a small fraction of users that choose to go this route. And we see, for example with TLS, that about 80% of web hosts using finite field Diffie-Hellman are using just one of 10 common primes. So today I'd like to talk about the possibility of backdooring these common primes, given that they're so standardized.
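As a minimal sketch of the exchange just described, here is the textbook flow in Python. The group parameters here (the Mersenne prime 2**127 - 1 and generator 3) are toy stand-ins chosen only for illustration, not one of the standardized groups discussed in the talk:

```python
import secrets

# Toy finite-field Diffie-Hellman.  p and g are illustrative stand-ins;
# real deployments use much larger, carefully chosen standardized groups.
p = 2**127 - 1
g = 3

a = secrets.randbelow(p - 2) + 1   # Alice's secret exponent
b = secrets.randbelow(p - 2) + 1   # Bob's secret exponent

A = pow(g, a, p)                   # Alice publishes g^a mod p
B = pow(g, b, p)                   # Bob publishes g^b mod p

# Each side raises the other's public value to its own secret exponent,
# so both arrive at g^(a*b) mod p without ever transmitting a or b.
secret_alice = pow(B, a, p)
secret_bob = pow(A, b, p)
assert secret_alice == secret_bob
```

An eavesdropper who sees p, g, A, and B must compute a discrete logarithm to recover the shared secret, which is why the choice of p matters so much.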
If you are picking the prime and baking it into your standard or your software implementation, what sort of advantage can you give yourself for computing discrete logarithms modulo this prime? So: what would backdooring a prime look like? Would it be detectable if you were to do it? What sort of computation would be required for you, the attacker, to exploit your backdoored prime? And what is the impact of the answers to these questions for currently deployed cryptography? So, a brief review of the number field sieve. You just heard the previous talk about this, but to review: there's the first stage of polynomial selection, where you pick a good pair of polynomials that share a root modulo the prime you're targeting. You collect relations by sieving the polynomials and keeping the results that factor completely below some smoothness bound that you choose. You perform linear algebra to solve for the discrete logs of the elements in your factor base. And then finally, given some specific instance of the problem, a target, say Alice's G to the A or Bob's G to the B, you try to write that target in terms of the logs that you have from your computation. So how long does it take to actually run this algorithm? The first answer, using the usual L notation for asymptotic complexity, is L of one third with a coefficient of 1.923. It's important to note that this figure comes mostly from the precomputation stages, and that once you've completed those stages and you're given an individual instance of the problem to solve, like G to the A or G to the B, the coefficient in the L notation actually drops to 1.232. So what does this look like in practice? With a 512-bit prime, it might take about 10 core-years of precomputation to run the number field sieve, and then only about 10 minutes to compute the discrete logarithm of an individual target.
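To get a feel for these numbers, the L notation can be evaluated directly. This is only a back-of-the-envelope sketch: the hidden o(1) term means these figures are order-of-magnitude ratios, not absolute costs, and the specific bit sizes compared here are just the ones from the talk:

```python
import math

def L(bits, c, alpha=1/3):
    """L_p[alpha, c] = exp(c * (ln p)^alpha * (ln ln p)^(1 - alpha))."""
    ln_p = bits * math.log(2)
    return math.exp(c * ln_p**alpha * math.log(ln_p)**(1 - alpha))

# Precomputation (c = 1.923) grows steeply with the prime size:
growth = L(1024, 1.923) / L(768, 1.923)

# An individual log (c = 1.232) is far cheaper than the precomputation
# for the same prime, which is what makes shared primes so attractive
# to an attacker:
per_target_speedup = L(1024, 1.923) / L(1024, 1.232)
```

The ratio between 768-bit and 1024-bit precomputation comes out around three orders of magnitude, roughly matching the jump from 5,000 core-years to an estimated 10 million.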
As you heard in Thorsten's talk previously, for a 768-bit prime, it might take about 5,000 core-years for the first stage and then an average of two days to compute an individual log in the second stage. And finally, for a kilobit-sized prime, we estimate it would take maybe about 10 million core-years to run the number field sieve and then about a month to do the second stage. So jumping back to the first stage of the number field sieve, which is polynomial selection: the goal of polynomial selection is to pick a pair of polynomials that share some common root modulo the prime that you're targeting. A kind of easy way to do this, an algorithm that will certainly produce a pair of polynomials usable for the number field sieve, is as follows. You pick some M that's roughly around the size of the sixth root of P. You write P in base M, and then just take the coefficients from that expansion for your first polynomial. For your second polynomial, you take X minus M, and it's clear that, when evaluated at M, both of these are going to be zero mod P. Using a construction like this, or similar constructions, we expect the coefficients to be about the size of the sixth root of P. And this actually has an important impact on the sieving stage: namely, when you have smaller coefficients on your polynomials, the resulting norms when you sieve are going to be smaller, and they have a higher probability of factoring completely below the bound that you're targeting. This leads us to the case of the special number field sieve, which actually historically preceded the general number field sieve. It was observed that for some numbers that could be expressed with a pair of polynomials with really small coefficients, the number field sieve was much more efficient.
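The base-M construction just described fits in a few lines. This is only the textbook version on a toy prime; real polynomial selection searches much harder for good pairs:

```python
def iroot(n, k):
    """Integer k-th root: largest x with x**k <= n (Newton iteration)."""
    x = 1 << ((n.bit_length() + k - 1) // k)
    while True:
        y = ((k - 1) * x + n // x ** (k - 1)) // k
        if y >= x:
            return x
        x = y

def base_m_polys(p, degree=5):
    """Write p in base m for m ~ p^(1/(degree+1)).  The base-m digits are
    the coefficients of f, and g(x) = x - m is the linear polynomial."""
    m = iroot(p, degree + 1)
    coeffs, n = [], p
    while n:
        coeffs.append(n % m)
        n //= m
    return coeffs, m

p = 2**89 - 1                       # a toy (Mersenne) prime for illustration
f_coeffs, m = base_m_polys(p)

# f(m) reconstructs p exactly, so both polynomials vanish at m mod p.
f_at_m = sum(c * m**i for i, c in enumerate(f_coeffs))
assert f_at_m % p == 0
assert (m - m) % p == 0             # g(m) = 0 trivially
```

Note that every coefficient produced this way is below m, i.e. around the sixth root of p, which is exactly the size bound that drives the norm estimates in the sieving stage.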
So for example, with Mersenne numbers, or numbers that are close to powers of two, and some other numbers similar to these, it's very easy to find a pair of polynomials with small coefficients that share a root modulo those numbers. And the impact on the asymptotic complexity of running the number field sieve is fairly large: the coefficient in the L notation drops from 1.923 to 1.526. In real terms, a discrete logarithm computation for a 768-bit SNFS-applicable prime only takes about 60 core-years, in comparison to 5,000. And in the kilobit case, it only takes about 400 core-years to run the number field sieve, as opposed to an estimated 10 million core-years. So let's take a brief trip back to the 1990s. In 1991, NIST was proposing to standardize the Digital Signature Algorithm, which was one of the first standardized schemes that relied on the discrete logarithm problem. They were considering using primes of 512 bits with 160-bit prime-order subgroups. And it was observed that a trapdoor could theoretically be constructed: you could pick some prime that would be amenable to the special number field sieve, but in a somewhat hidden way, meaning it wouldn't be of the form two to the N minus one, or something else close to a power of two, which would be obvious, but some other kind of number where this structure would not be apparent. So how would you possibly do this? Daniel Gordon in 1992 wrote a paper about trapdooring DSA primes. A kind of easy way to start would be to just pick some random pair of polynomials F and G with small coefficients, compute their resultant and check whether it is prime, and if it is, see if that prime has a subgroup of your desired order. But Gordon actually proposed an improved algorithm in this paper, where you define your problem in terms of some polynomial F, which you pick in advance and which has small coefficients.
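For instance, a prime just below a power of two immediately yields such a pair. A sketch of the shape of the polynomials (the offset c here is an arbitrary made-up value, and primality of p is deliberately not checked; the point is only the structure SNFS exploits):

```python
# p = 2**1024 - c.  Since 1024 = 6*170 + 4, we have p = 16*(2**170)**6 - c,
# so f(x) = 16*x**6 - c has tiny coefficients and shares the root 2**170
# with g(x) = x - 2**170 modulo p.  c = 105 is an illustrative offset only;
# the identity below holds for any c.
c = 105
p = 2**1024 - c
m = 2**170

f = lambda x: 16 * x**6 - c
g = lambda x: x - m

assert f(m) % p == 0   # 16*(2**170)**6 - c = 2**1024 - c = p
assert g(m) % p == 0
```

Compare the coefficients here (16 and c) with the sixth-root-of-p-sized coefficients from the generic base-M construction: this is the entire source of the SNFS speedup.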
You pick the order of your subgroup Q, you try some random coefficient G zero, and then see if you can solve for a G one such that the resultant of your polynomials is prime and has the property that P minus one is divisible by Q. So it's actually a fairly simple algorithm for coming up with such a trapdoored prime. So would a prime generated this way be detectable? The answer is: yes, it would certainly be detectable if your linear polynomial was monic, that is, if the coefficient in front of the X was a one, because the upper bits of the resulting prime would be a direct result of the coefficient G zero and the small leading coefficient of F. Since F has small coefficients, you could brute force over all possible values for the leading coefficient of F and see if you can solve for a G zero. However, if the linear polynomial is not monic, and there's some largish G one used in the construction, then there's actually no known way to uncover this trapdoor. Stating it a different way: if you're given some prime, it's hard to find a pair of polynomials with small coefficients that share a root modulo that prime, even if such a pair exists.
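The structure of such a construction can be sketched as follows. Note this is the brute-force variant, retrying random (G0, G1) pairs, rather than Gordon's actual algorithm, which solves for G1 directly; all sizes here are toy-sized for speed, with q standing in for the 160-bit subgroup order:

```python
import random

def is_prime(n, rng, rounds=25):
    """Standard Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for sp in (2, 3, 5, 7, 11, 13, 17, 19):
        if n % sp == 0:
            return n == sp
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

rng = random.Random(2017)

f = [-3, 1, 0, 0, 0, 0, 2]     # f(x) = 2x^6 + x - 3: fixed, small coefficients
d = len(f) - 1
q = 101                        # toy stand-in for the 160-bit subgroup order

# p = |Res(f, g1*x - g0)| = |sum_i f_i * g0^i * g1^(d-i)|, so f and
# g(x) = g1*x - g0 share the root g0/g1 modulo p.  Retry random (g0, g1)
# until p is prime and q divides p - 1.
while True:
    g0, g1 = rng.getrandbits(32), rng.getrandbits(32)
    p = abs(sum(fi * g0**i * g1**(d - i) for i, fi in enumerate(f)))
    if p % q == 1 and is_prime(p, rng):
        break

r = g0 * pow(g1, -1, p) % p    # the hidden SNFS root, known only to the attacker
assert sum(fi * r**i for i, fi in enumerate(f)) % p == 0   # f(r) ≡ 0 (mod p)
assert (g1 * r - g0) % p == 0                              # g(r) ≡ 0 (mod p)
assert (p - 1) % q == 0
```

The resulting p looks like a random prime with a q-order subgroup, but whoever knows (f, g0, g1) can run SNFS on it.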
So in 1992, they were considering using 512-bit primes with 160-bit prime-order subgroups, and actually constructing a trapdoored prime with these properties was considered to be difficult or impossible. You were basically forced to choose between a trapdoor that gave you a polynomial that was not optimal for running the number field sieve, namely one of the polynomials would have to have degree three, when in principle you really want a degree-five polynomial, or, if you picked a larger-degree polynomial, coefficients so small that it would be easy to brute force over all possible candidates for that polynomial. And this led them to believe at the time, and there was a big panel discussion at Eurocrypt in 1992 about this topic, that the trapdoor wouldn't really be feasible for these types of primes, or that if you were to trapdoor them, it would be detectable. And as such, when DSA was standardized, they noted that perhaps picking the prime should be done in some verifiably random way. They gave a procedure for doing that, but they marked the proof of this as an optional field: you could generate your primes verifiably at random, but you didn't have to. How about today? Today, people are using DSA primes of 1024 bits with 160-bit-order subgroups, and this is actually optimal for Gordon's construction. It allows you to choose a polynomial of degree six, which is good for the number field sieve, and it also allows you to choose large enough coefficients for your polynomial F that actually enumerating all possible polynomials would be very expensive, equivalent to the cost of running Pollard rho on the subgroup of order Q. So it's certainly possible to construct such a prime using the same algorithm that Gordon published in 1992.
So next, what we wanted to show is that it would actually be possible to exploit a prime that was constructed this way. We generated one of these primes using a small script that implemented Gordon's algorithm. It printed out this pretty random-looking prime; it's random whether you look at it in decimal or in hex, and it has a 160-bit prime-order subgroup. You can see the polynomial pair that shares a root modulo this prime, and you can observe that the coefficients of that polynomial are really small, so the special number field sieve applies. So we went ahead and ran the number field sieve for this prime, and you can see that it took a fairly low amount of time. It took about two months of calendar time, split across two clusters, one at UPenn and one at Inria in Nancy. We used on average about 2,000 cores for most of the computation. The sieving took about one month, and the linear algebra took about one month. We'll note that the linear algebra was faster because we were able to do it modulo the 160-bit Q, instead of what it would have been if we'd picked a safe prime. It's also worth noting that the solution step is fairly quick in our case, because we used a neat trick due to Horner to speed it up and have it take a significantly smaller chunk of time. For the individual log, actually computing a log given a target, we also tried to speed things up significantly by throwing a lot of cores at it, and we got it down to about an hour and 20 minutes across some subset of our cluster. So all in all, coming up with one of these primes and computing discrete logarithms for it is very much in range for attackers with pretty modest resources. In case you're wondering what 2,000 cores actually looks like, this is what it looks like: not too many racks, not too many servers. So how about today? Are there actual primes being used in the wild that are amenable to SNFS? The answer is yes; there are some primes in use that are not hidden at all.
Namely, these primes that are close to powers of two. We have a 512-bit prime and a 1024-bit prime, which we discovered using internet scanning of publicly visible services. There are about 120 or 130 hosts that are still using these primes today, as of last week, I believe. For the 512-bit prime, running the special number field sieve actually takes just over three hours. We also ran the special number field sieve computation for the 784-bit prime, which was discovered baked into a crypto library; that took about 23 days on our cluster. And we did not run the special number field sieve for the 1024-bit prime that we discovered; we estimated it would have been three or four times harder than the one that we ran, because it's a safe prime. How about poorly hidden primes? These are primes for which there does exist some pair of polynomials with small coefficients where the linear polynomial is monic. So we collected all the primes that we could find from various scans of internet hosts, and we brute-forced potential leading coefficients of a degree-D polynomial F, for degrees two through nine, with leading coefficients of up to 10 bits. We didn't find any primes that had these special number field sieve polynomials. So how about the remaining primes that are seen in use today? Some of them are verifiably random: the primes that are published in the Java JDK are actually published with the seeds that show how they were generated using the DSA prime generation algorithm. Some are nothing-up-my-sleeve numbers derived from the digits of pi or E; we sort of trust that these numbers are not backdoored, because they seem to have been arrived at in a fairly random way. And finally, there are a bunch of numbers floating around or in use for which we actually have no record of how they were generated. Some of these are actually pretty commonly used.
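The idea behind that brute-force scan can be sketched as follows. If p = f(m) for a monic linear g(x) = x - m and f has a small leading coefficient fd, then m must be close to (p/fd)^(1/d), so we can recover the candidate m and check whether the remaining base-m digits of p are small too. This is only an illustrative reconstruction of the scan described in the talk, with arbitrarily chosen smallness bounds, using balanced (signed) digits so that negative coefficients are caught:

```python
def iroot(n, k):
    """Integer k-th root: largest x with x**k <= n (Newton iteration)."""
    x = 1 << ((n.bit_length() + k - 1) // k)
    while True:
        y = ((k - 1) * x + n // x ** (k - 1)) // k
        if y >= x:
            return x
        x = y

def balanced_digits(n, m, bound):
    """Balanced base-m expansion of n, or None if any digit exceeds bound."""
    digits = []
    while n:
        dg = n % m
        if dg > m // 2:
            dg -= m          # use signed digits so f may have negative terms
        if abs(dg) > bound:
            return None
        digits.append(dg)
        n = (n - dg) // m
    return digits

def find_snfs_poly(p, max_lead=1 << 10, degrees=range(2, 10), bound=1 << 20):
    """Search for f of degree d with small coefficients and g(x) = x - m."""
    for d in degrees:
        for fd in range(1, max_lead + 1):
            m0 = iroot(p // fd, d)
            for m in (m0 - 1, m0, m0 + 1):   # allow rounding slop
                if m < 2:
                    continue
                digits = balanced_digits(p, m, bound)
                if digits is not None and len(digits) - 1 == d:
                    return d, m, digits
    return None

# A deliberately "poorly hidden" example of the right shape, just below a
# power of two (we don't bother checking primality; the scan doesn't care):
target = 2**243 - 3
hit = find_snfs_poly(target)
assert hit is not None
d, m, digits = hit
assert sum(c * m**i for i, c in enumerate(digits)) == target   # f(m) = p
```

Such a number may have more than one small-coefficient representation, so the scan returns whichever it finds first; a prime with no such representation at any tested degree simply returns None.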
Some examples include the groups that are baked into the Apache web server; there's no published record of how those were picked. And the groups that were standardized in RFC 5114 also have no record of exactly how they were generated. If you take a look at RFC 5114, the first group it defines is a 1024-bit group with a 160-bit prime-order subgroup. It's in use by about 900,000 web servers today, which constitutes about 2.3% of web servers using HTTPS, or about 10% of web servers using finite field Diffie-Hellman. And these primes are also being used by IPsec VPN servers; about 13% support these groups. And again, these groups have been published in this document, but there's no record of how they were actually generated, and no proof of verifiable randomness. The document actually says they were drawn from NIST test data that was published. When we released the paper on ePrint, there was a bunch of discussion on the IETF mailing list. Tim Polk from NIST basically said: these probably came from NIST, but we have no idea how we generated them, we have no record, and it would probably be a good idea to deprecate them, given that we can't really trust them. So what about 2048 bits? Gordon's trapdoor construction would still work using modern parameters. DSA primes of 2048 bits that are used today usually have a 224-bit or 256-bit subgroup, which allows us to pick a polynomial of degree seven, which is good for the number field sieve, and use Gordon's algorithm. However, actually running the special number field sieve, even for a trapdoored number, would still probably take about 7 billion core-years. In contrast to our 400 core-years for the kilobit SNFS, that's probably not feasible today, but such a prime is certainly not giving you the 2048-bit strength that you might otherwise expect if it's trapdoored. So, considerations for the future and takeaways.
When you're designing crypto algorithms and protocols, it's always good to try to eliminate the potential for backdoored parameters. We saw with Dual EC that even if it was never actually backdoored by the people who standardized it, it has been weaponized in the real world: the backdoor has been exploited. If you need verifiable randomness in your parameters, it's important to really stress that, and not allow it to end up getting marked as optional, even if an attack doesn't seem immediately feasible. And of course, it's good to account for precomputation in your analysis: if everyone is using the same set of primes, then the cost of backdooring or running the number field sieve on one of these primes is amortized, because it allows you to break many instances of the problem. Thank you.

Thank you, Joshua, for that very interesting talk. Do we have any questions?

Is there any evidence to suggest that Gordon's algorithm is optimal for embedding these trapdoors?

Well, no one has come up with a way of uncovering them yet, so it seems somewhat optimal, but there may be better ways. I mean, you can also create these trapdoors by just randomly picking a pair of polynomials and seeing if it satisfies your criteria. But I'm not sure.

Any other questions?

Have you thought about whether there is an algorithm to detect whether a prime is trapdoored, without actually recovering the coefficients?

Some of my co-authors did spend a considerable amount of time going through possible ways of uncovering the trapdoors and weren't able to come up with anything. It's sort of an open problem; we don't have a proof either way.

So you don't have a proof whether these two problems are separated, or something like that?

We don't have a proof that this is completely undetectable. It could be that someone in the future comes up with some algorithm for detecting it, but as of today, there's no known way of doing so.
I guess my question is: maybe you can detect it, but without spelling out the exact F and G. Would it be possible to do that faster?

I'm not sure.

Okay, thanks.