Hi, thank you for taking some of your time to watch this pre-recorded Crypto 2020 video. Our paper reports on three computational records we have recently obtained: two related to integer factoring, and one related to the computation of discrete logarithms over finite fields. I'm Emmanuel Thomé, and this is joint work with five colleagues; four of us are from the same group in Nancy, France.

When we deploy cryptography, there is a major decision to make, which is the decision on the key size. Depending on your interests, you might have different things in mind. If what you want is for your cryptographic computations to be quick and cheap, then you want short keys. If, on the other hand, you are interested in security above anything else, then you want longer keys. So a compromise is needed. As an end user, you might be very confident and trust the manufacturer to have done the right thing and taken the right decision. But even if you are confident, the sad fact is that many outdated crypto products are still lingering. What everybody can do, or should do, is check that the crypto products we are about to use abide by the recommendations of NIST, for example.

Now, the tricky question, especially for public-key crypto, is how we make these recommendations. We have to base them on hardness assumptions, but these hardness assumptions have to be based on assessments of cryptanalysis for key sizes that are, by definition, out of reach. So how do we make these assessments convincing? We need to base them on hard facts: on state-of-the-art software implementations, on what these implementations give for sizes that are within reach, and we need to go through some effort to obtain computational results that show we have done our best. It means that we have to explore algorithmic ideas that perhaps pay off only for very large sizes. We need to explore whether our algorithms scale, whether we are encountering stumbling blocks, and whether we can harness a large amount of computing power. Overall, we need to show that this cryptanalysis is more than just theory: it is something that can happen for real. It is also important that we make our work reproducible.

Another important aspect of our work is that we address both integer factoring and the computation of discrete logarithms over finite fields. It is actually a common belief that discrete logs over finite fields are a lot harder than factoring integers, based on the observation that, over the years, records for finite field discrete logs have been lagging behind factoring records by several dozens, if not hundreds, of bits. One of the takeaways of our work is that this hardness ratio is not as large as one might think.

So I am going to give a brief introduction to the Number Field Sieve, which is the algorithm we use for our computational records, and then highlight some of the key aspects of our work. The Number Field Sieve is a complicated algorithm that goes through many steps. What I want you to have in mind is that there are two particular steps, relation collection, also called sieving, and linear algebra, and these two steps are the most computationally expensive in the algorithm. They have different characteristics, but the bulk of the computation time goes into these two steps. To give a brief explanation of NFS, I need to start with polynomial selection and explain what NFS does.
Polynomial selection is the first step of NFS. Within polynomial selection, we select a pair of integer polynomials, and one of them defines a number field Q(α). When NFS collects relations, what it does is search for pairs of integers (a, b) such that two fairly exceptional events occur simultaneously: the integer a − bm and the principal ideal generated by the element a − bα in the number field both factor into small things. We say that both are smooth. "Small things" means small prime numbers on one side and small prime ideals on the other, but you do not have to bother with prime ideals and that sort of thing: just keep small things in mind, and the picture you get with prime numbers is close enough to reality.

When we have such pairs (a, b), we say that we have relations. What we want to do with relations is combine a subset of them, a subset which is not easy to guess, so that all the multiplicities that appear in the factorizations are even. This is done by linear algebra. When we have such a combination, with even multiplicities on both sides, we have an equality of squares, and in fact many equalities of squares, and each of these, when mapped back modulo N, gives us a non-trivial factorization of N with probability at least one half. This is how NFS factors integers. The nice thing is that NFS can also be adapted to computing discrete logarithms over finite fields, and there is more or less a dictionary translation from one to the other. The only major difference is in the linear algebra: while in the context of factoring integers we had linear algebra over Z/2Z, here it is over Z/ℓZ, where ℓ is something a lot bigger. So some things change, the balance of things changes, and we have to adapt to it, but the general pattern of the algorithm is pretty much unchanged.

Okay, how do we collect relations? I am going to address two fairly classical aspects of relation collection in NFS, and then describe what we did in order to choose the parameters for our computation. When we search for these exceptional pairs (a, b), we are essentially searching for needles in a very large haystack, so there is a question of how we divide the work: how do we arrange for several computing units to participate in the computation simultaneously? There is a trivial strategy that consists in splitting the search space into rectangles, but its downside is that the yield is pretty unstable, and at the end of the day it does not work too well. Instead, we prefer to do what all computational records have been doing for years, namely use what is called special-q sieving: we constrain a factor q to appear in one of the two factorizations. This defines many independent tasks, one per special-q. The yield becomes stable, and because we have prescribed a factor to appear in one of the factorizations, that is also one thing less that we have to find.

Okay, now how do we actually search within one of these smaller search spaces? How do we find the smooth pairs (a, b)? It is a question of finding the potential prime factors that appear in the factorizations of a − bm and a − bα, and the strategy pretty much depends on the size of these potential factors p. For the prime factors below some bound B, which we can choose freely, we strive to find all pairs such that p appears.
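Before getting into how these smooth pairs are actually found, here is a tiny, hand-sized illustration of the end of the pipeline I just described: combine relations so that every prime appears with an even exponent, obtain a congruence of squares, and read off a factor from a gcd. This is a Dixon-style toy in Python with a made-up modulus and factor base, not NFS and certainly not our actual code; it is only there to make the "even multiplicities, then gcd" idea concrete.

```python
from math import gcd

N = 84923                       # toy modulus, N = 163 * 521
PRIMES = [2, 3, 5, 7]           # toy "factor base" of small primes

# Two toy relations x^2 ≡ prod(p^e) (mod N), found by trial division:
#   513^2 ≡  8400 = 2^4 * 3 * 5^2 * 7   (mod N)
#   537^2 ≡ 33600 = 2^6 * 3 * 5^2 * 7   (mod N)
relations = [(513, [4, 1, 2, 1]),
             (537, [6, 1, 2, 1])]

# The "linear algebra" step is trivial here: summing the two exponent vectors
# gives [10, 2, 4, 2], all even, so the product of the right-hand sides is a square.
exponents = [a + b for a, b in zip(relations[0][1], relations[1][1])]
assert all(e % 2 == 0 for e in exponents)

x = relations[0][0] * relations[1][0] % N      # product of the left-hand sides
y = 1
for p, e in zip(PRIMES, exponents):            # square root of the right-hand side
    y = y * pow(p, e // 2, N) % N

assert (x * x - y * y) % N == 0                # x^2 ≡ y^2 (mod N)
print(gcd(x - y, N), gcd(x + y, N))            # prints the two factors, 163 and 521
```

In NFS the squares live partly in a number field and the right combination is found by genuine linear algebra over Z/2Z rather than by inspection, but the very last step, a gcd with N, is the same.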
Finding all the pairs for a given prime p below B means going to some effort to find, for example, every (a, b) such that 17 appears in the factorization, or such that 73 appears in the factorization. To do so, we use a process called sieving, and this accounts for the name "sieve" in the Number Field Sieve. On the other hand, for primes above B and up to the bound L, which is the maximum size of the primes that we allow to appear in the factorizations, and which we can also choose freely (this is an important aspect of the parameter choice), we are opportunistic: if some of these factors show up, we take them, but if we miss some, it is no big deal.

So what do the relations we encounter look like? Here are some example relations. The blue ones are the large primes, above B. The red one is the factor q that we have constrained to appear in the factorization. Then there is the question of how many blue primes we have, because we know that we are going to have many occurrences of each of the black primes, so we are bound to use some non-trivial linear algebra to find a magic combination that has all the valuations even. The blue primes, on the other hand, are going to be rare, and we want to keep them rare. To do so, we set a limit on the number of blue primes that can appear on each side. It means that the two relations here, I am going to discard, because I consider that they have too many blue primes. And it means that before I do linear algebra proper, I am going to try some cheap linear combinations in order to get a smaller matrix: I am going to try to cancel a blue prime here with a blue prime somewhere else, and this simplifies the linear algebra work that happens afterwards.

Given these observations, it is important to understand that the relations that have only two large primes, or even fewer, are really a blessing, because they participate in the cheap combinations very easily: essentially, such a relation trades one blue prime for another, which helps the set of cheap combinations get going. So we use these relations to do the early filtering, these cheap combinations, and if we only had relations with two large primes or fewer, the filtering would essentially get rid of all of them. Of course, because I have two sides to deal with, it is a bit more complicated than what I am saying, but essentially this is the idea.

One more thing we want to pay attention to is how the special-q participates in all of this. Does q go with the black primes, of which I am going to have many appearances each? Or does q go with the blue primes, the ones that I want to eliminate as early as possible? This matters too. For RSA-240, the strategy we used was to take q from about half the bound B to way above it. For q below B, so q among the black primes, which is exactly the situation I had in the example relations two slides ago, I allow two large primes on side 0 and three large primes on side 1. In contrast, when q gets above B, so that q also counts as a blue prime, I decrease the maximum number of large primes that I allow on side 1, so that the number of large primes, plus q itself, still makes three. So the relations, whether q is below or above B, have more or less the same shape.
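Since sieving keeps coming up, here is a minimal sketch of the idea on the rational side, again in Python. All the numbers below (m, b, the bounds, the survivor threshold) are toy values chosen for illustration only; a real siever works on both sides at once, on large regions attached to each special-q, and with many optimizations that do not appear here.

```python
from math import log

m = 1000003            # toy root of the linear polynomial
b = 5                  # one "line" of the search space: b fixed, a in [0, A)
A = 10000
B = 50                 # toy sieving bound

def primes_up_to(n):
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(sieve[p * p :: p]))
    return [p for p in range(2, n + 1) if sieve[p]]

# Sieving: for each prime p <= B, the a with p | a - b*m form an arithmetic
# progression, so we jump directly from one such a to the next instead of
# trial dividing every candidate, and we accumulate log p at those positions.
scores = [0.0] * A
for p in primes_up_to(B):
    start = (b * m) % p
    for a in range(start, A, p):
        scores[a] += log(p)

# Keep the promising positions: those whose accumulated contribution covers a
# large enough fraction of log|a - b*m| (the 0.4 is a toy threshold; choosing
# the real one is a delicate trade-off).
threshold = 0.4
candidates = [a for a in range(A)
              if a != b * m and scores[a] > threshold * log(abs(a - b * m))]
print(len(candidates), "survivors out of", A, "positions")
```

Positions that pass this rough test are then checked exactly, and only those give relations.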
This special-q and large-prime strategy was really effective in getting rid of most of the primes above B on side 0. On side 1 I still had many of them, but that is not too bad, because in the context of factoring the linear algebra is relatively manageable. Now, for DLP-240, which was our discrete logarithm record, we wanted to go to a lot of effort to reduce the matrix size, because the linear algebra task is much harder in the DLP context. So what we did is choose to constrain factors q that are composite, so that when we write down the factorizations of all the relations, the factors related to q are actually two distinct primes that both belong to the range below B, and they do not interfere with the primes that we want to cancel. This strategy was extremely effective, as we will see, in reducing the size of the matrix, essentially getting rid of all, or most of, the primes between B and L. That was very important.

Another aspect I want to mention, not necessarily related to the previous one but one that played a role in our computation, is what we call batch smoothness detection. It is an alternative to sieving: a fun way to find the B-smooth parts of many integers at once, due to Bernstein in the early 2000s. The idea is to multiply everything together; by doing so, and by keeping track of the tree of sub-products, it is actually possible to find the B-smooth parts of all the a − bm values, and this algorithm does it in time quasi-linear in the input size, thanks to asymptotically fast, FFT-like multiplication algorithms. So this finds all the primes below B, just like sieving does (a small sketch of the idea is given a bit further below). The downside is that it requires some memory, but on the other hand, in the context of NFS, it helps save memory on several other occasions, so overall it is very often a net benefit. We used it in part of the parameter ranges of our computation, not always but pretty often, and this is detailed in the paper.

Okay, now on to linear algebra. Linear algebra, as I said, is the second most expensive part of the computation, and as I also mentioned, there is a huge difference between the factoring context and the discrete logarithm context, namely that the field of definition changes. Linear algebra for DLP is harder. For this reason, our strategy aimed at having a smaller matrix for DLP, and this was pretty effective, as we can see: by spending a lot more effort on finding relations for DLP, we were able to obtain a matrix that was much smaller. It is important to notice as well that in both cases the matrix is very sparse, with very few non-zero elements per row; here, this is fewer than one non-zero entry in a million. With this kind of sparse matrix, the algorithms you want to use are iterative algorithms, meaning that you essentially rely on one key operation: the multiplication of a sparse matrix by a vector. If you want such an operation to scale, it is not sufficient to throw it at the computer cluster in the basement of your CS department building; you want an algorithm that has this scaling ability sort of built in, and this is the case of Coppersmith's block Wiedemann algorithm, which dates back to 1994. This algorithm uses not one but n independent sequences, compared to the original Wiedemann algorithm, and these sequences are shorter.
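Here is the small sketch of batch smoothness detection promised above, once more in Python with toy numbers. It shows a product tree for the factor-base primes, a remainder tree over the values to be tested, and the gcd trick that extracts the B-smooth part of each value; the quasi-linear running time of the real algorithm comes from fast multiplication of huge integers, which this plain-Python toy does not try to reproduce.

```python
from math import gcd

def product_tree(values):
    """Return a list of levels; level 0 is `values`, the last level is [product]."""
    levels = [list(values)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([cur[i] * cur[i + 1] if i + 1 < len(cur) else cur[i]
                       for i in range(0, len(cur), 2)])
    return levels

def remainders(x, values):
    """Compute x mod v for every v in values, by walking a product tree of values."""
    levels = product_tree(values)
    rems = [x % levels[-1][0]]
    for level in reversed(levels[:-1]):
        rems = [rems[i // 2] % v for i, v in enumerate(level)]
    return rems

def smooth_parts(values, primes):
    """Return the smooth part of every value, over the given factor-base primes."""
    P = product_tree(primes)[-1][0]           # product of all factor-base primes
    out = []
    for v, z in zip(values, remainders(P, values)):
        # Repeated squaring of z modulo v makes every smooth prime of v appear
        # with a high enough exponent; the gcd then isolates the full smooth part.
        e = v.bit_length().bit_length()
        out.append(gcd(v, pow(z, 2 ** e, v)) if z else v)
    return out

primes = [2, 3, 5, 7, 11, 13]
values = [2 ** 5 * 7 * 13 * 101, 3 * 11 * 97, 2 * 3 * 5 * 7, 1009]
print(smooth_parts(values, primes))           # prints [2912, 33, 210, 1]
```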
Back to block Wiedemann: with n independent sequences, the scaling of the algorithm is almost perfect with respect to n, the number of sequences. Of course there is a downside, which is that at some point you need to reconcile the work done in the independent sequences. By independent I mean that I could be running one sequence in the US and one in France with no communication at all, so they really are independent; but you do have to reconcile the work, and the more sequences you have used, the harder this gets. You can do a back-of-the-envelope calculation of the complexity of the different steps, and it is important to understand that the underlying operations are of a somewhat different nature: in one case you are essentially measuring the memory performance of your computers, while in the other you have operations that are slightly more delicate to deal with. These are different characteristics.

Okay, good, now on to software and resources. To do these records we used the CADO-NFS software, which we have been developing in Nancy since 2007. It is a huge piece of software, open source, LGPL-licensed, with an open development model; I have a link to the repository here. Relation collection is a huge part of CADO-NFS, and it has undergone many improvements in the last four years, related to parallelism for example, or to the freedom of choice that we have for the parameters. We also have many improvements that are pretty recent, related to our capacity to predict the runtime and to assess the validity of some parameter choices. Linear algebra is also an important part of CADO-NFS, and likewise it has seen many improvements in the last four years, some related to discrete logarithms specifically, for the multiplication of sparse matrices by vectors. More recently, we improved the computation of the linear generator step by using more parallelism, and this was key to enabling the use of many different sequences: it allowed us to have an implementation that scales much further than before. If I draw this kind of picture, with the time to solution as a function of the number of cores that I am using, a perfect algorithm, one that scales perfectly, achieves a straight line. Thanks to our new implementation of the linear generator step, and thanks to the flexibility that the block Wiedemann algorithm offers, we have something that approaches a straight line up to several thousand cores, which is something we are pretty satisfied with.

Okay, we used many computing resources, of course: several clusters in France and in the US, and we also used a computing allocation on the European PRACE infrastructure. Since this means several different computer clusters, it also means several software installations, different job schedulers, different policies, different kinds of bugs; it also means that we had to have recovery procedures in some cases, so that was also an important part of the work.
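To make the key kernel of the iterative linear algebra a little more tangible, here is a minimal sketch of a sparse matrix-times-vector product over Z/2Z, with a small random matrix. The sizes and row weights are toy values, and the real computation is distributed across many machines with a heavily optimized implementation, none of which is reflected here; over Z/ℓZ the structure is the same, with modular additions in place of XORs.

```python
import random

N_ROWS = N_COLS = 1000          # toy size; the real matrices are vastly larger
NNZ_PER_ROW = 20                # toy row weight

random.seed(1)
# For each row, the sorted list of column indices holding a 1 (CSR-like storage).
rows = [sorted(random.sample(range(N_COLS), NNZ_PER_ROW)) for _ in range(N_ROWS)]

def spmv_gf2(rows, v):
    """y = M*v over Z/2Z: each output bit is the XOR of a few scattered input bits."""
    y = []
    for cols in rows:
        bit = 0
        for j in cols:          # the scattered reads of v[j] are the real cost:
            bit ^= v[j]         # on large matrices this step is memory-bound
        y.append(bit)
    return y

# The Krylov phase of block Wiedemann essentially repeats this product over and
# over: v, M*v, M^2*v, ...; independent starting vectors give independent sequences.
v = [random.randint(0, 1) for _ in range(N_COLS)]
for _ in range(5):
    v = spmv_gf2(rows, v)
print(sum(v), "ones in M^5 * v")
```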
Okay, I can draw an approximate timeline of our computation. Here I have the different steps of the three computations that we are reporting, in millions of core hours, so this is the hardness, so to say, of each step; this axis is the timeline, and the area of each block is proportional to the hardness. We started with relation collection for DLP-240, then relation collection for RSA-240; the linear algebra ran over the summer of last year and finished in the fall, because some implementation work had to be done. By the time we were nearing completion of our records, it was pretty clear that we were going to have some of our allocated time left, so we started RSA-250 in the fall of 2019, and this was over in February 2020.

It is also possible to give a total cost in core-years or core-hours, depending on what you prefer, and I want to highlight the fact that if you add up relation collection and linear algebra in both cases, for RSA-240 you reach something slightly below 900 core-years, while for DLP-240 you get something slightly above 3000 core-years. The ratio between the two is only slightly above a factor of three, which is not that large. I also mentioned that our results are reproducible; this can be checked with this link, where we give the parameters of all the steps of our algorithms and computations, and instructions on how to reproduce part of our computation.

As a conclusion, we did more than just records. We developed parameterization strategies that can be used for further computations. We also developed a framework for simulating NFS; it is not perfect, and there is some detail about it in the paper, but it was essential in guiding our parameter choices. We also showed that our implementation scales well and can tackle larger problems, so we have the feeling at this point that we are not hitting a significant technological barrier, and it is possible to go further. Comparisons are always good: we can compare our DLP-240 record, which is 795 bits, to the previous record, DLP-768, which is 768 bits, or 232 digits. In fact, we had access to hardware identical to the hardware reported as having been used for the DLP-768 computation, and what we found is that our harder computation would have taken less time on that hardware than the reported time for the DLP-768 computation, which is something we are pretty happy with. Also, as I mentioned, we learned that the hardness ratio between finite field DLP and integer factoring is not as large as one might think. For future computations, we intend to keep the focus on anticipating computation costs and on our ability to predict them, and we also want to show that we are able to harness a large amount of computing power. Okay, this is all I wanted to say; thank you for listening, and I am looking forward to your questions.