Hello, my name is Gabrielle De Micheli, and I'm going to be presenting some joint work with Pierrick Gaudry and Cécile Pierrot on discrete logarithm algorithms in pairing-relevant finite fields. The discrete logarithm problem is one of the two major mathematical problems on which the security of asymmetric cryptography protocols is based, the other one being factorization. The discrete logarithm problem appears in key exchange protocols such as Diffie-Hellman or ElGamal, and in signature protocols such as ECDSA, DSA, etc. The definition goes as follows: given a finite cyclic group G, a generator g of this group, and a target element h in G, we want to find the exponent x such that g^x equals h. One question that arises from this definition is: which finite cyclic group G can I choose? For cryptographic purposes, we want the group G to be chosen in such a way that DLP is as hard as possible. Commonly used groups are prime finite fields, extension finite fields, or elliptic curves over finite fields, for example. DLP is used in protocols that are widely deployed, such as the ephemeral Diffie-Hellman protocol mentioned before, which appears here, for example, in this TLS handshake. Another interesting example, which we are going to focus on, is pairing-based protocols. Let me go back to definitions: what is a cryptographic pairing? A pairing is a map e from the product of two additive groups, G1 and G2, into a multiplicative target group GT. The map has to satisfy a few properties: it must be bilinear and non-degenerate, and for practicality reasons we also want e to be efficiently computable. In cryptography, G1 and G2 are often chosen as subgroups of elliptic curves over prime fields or extension fields, and the target group GT is often a subgroup of a finite field F_{p^n}. Pairings have been used a lot in older protocols and in more recent ones such as zk-SNARKs, which are used in blockchain applications, Zcash, etc. So it is an interesting example, not only because pairings appear in many protocols, but also because they rely on the discrete logarithm problem both on elliptic curves and in finite fields. One question that we try to answer in this paper, and that we look at in depth, is how to construct a secure pairing-based protocol. The simple answer, which turns out to be not so simple, is to look at DLP algorithms on both the elliptic curve side and the finite field side. Let us start with the discrete logarithm problem on elliptic curves. The best known algorithm is Pollard's rho, which has a complexity in the square root of the size of the subgroup being considered. Not much more is said on this slide because there has not been any major gain in complexity for DLP on elliptic curves in the past few decades. On the finite field side, however, discrete logarithm algorithms have evolved a lot over the past decades. There are many of them, and their complexities often depend on the relation between the characteristic p and the extension degree n of the finite field, which we write F_{p^n} throughout this talk.
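As a concrete illustration of the basic problem g^x = h, and of the square-root cost quoted above for Pollard's rho, here is a minimal baby-step giant-step sketch over a toy prime field. BSGS is used as a deterministic stand-in for a generic square-root method, and all parameters are toy values chosen purely for illustration.

```python
# Toy discrete logarithm: given a generator g of a cyclic group and a target h, find x
# with g^x = h. Baby-step giant-step takes about sqrt(order) group operations, the same
# order of magnitude as the Pollard's rho cost mentioned for elliptic curves.
from math import isqrt

def bsgs(g, h, p, order):
    """Return x with pow(g, x, p) == h, searching a group of the given order."""
    m = isqrt(order) + 1
    baby = {pow(g, j, p): j for j in range(m)}       # baby steps: g^j for j = 0..m-1
    step = pow(g, -m, p)                             # g^{-m} (modular inverse, Python 3.8+)
    gamma = h
    for i in range(m):                               # giant steps: h * g^{-i*m}
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = (gamma * step) % p
    return None

# Toy parameters, assumed for illustration only: p = 101 is prime and g = 2 generates F_p^*.
p, g = 101, 2
x_secret = 57
h = pow(g, x_secret, p)
assert bsgs(g, h, p, p - 1) == x_secret
print(f"recovered x = {bsgs(g, h, p, p - 1)}")
```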
As mentioned before, the complexity of many algorithms for DLP in finite fields depends on this relation between p and n, and one useful way to express this relation is the L-notation. The L-notation is the formula in the green box, which depends on two parameters, l_p and c: it is defined as the exponential of the constant c times (log p^n) raised to the power l_p, times (log log p^n) raised to the power 1 - l_p, that is, L_{p^n}(l_p, c) = exp(c (log p^n)^{l_p} (log log p^n)^{1 - l_p}). What is interesting about the complexities expressed with this notation is that if one makes l_p tend to zero, the whole formula collapses to a power of log p^n, which corresponds to a polynomial-time algorithm. On the other hand, if l_p tends to one, we end up with a power of p^n, which corresponds to an exponential-time algorithm. So this l_p, which varies between zero and one, allows us, through the L-notation, to define what is called a sub-exponential complexity: not as bad as exponential, but not as good as polynomial. Most of the complexities we are going to see (I should be careful with the quasi-polynomial-time algorithms) are expressed using this L-notation, where l_p is a constant between zero and one. The L-notation also allows us to define three families of finite fields, where the characteristic p is itself expressed using this notation. Depending on this relation between p and n, finite fields are said to be of small characteristic, medium characteristic, or large characteristic. What is interesting with the algorithms that solve DLP in finite fields is that, depending on the zone where the finite field lies, these algorithms are different and do not perform equally well, so their complexities also vary depending on which area we are looking at. We are interested in a particular area, which we call the first boundary case: it is exactly the area between small and medium characteristic, that is, the area where the characteristic p is of the form L_{p^n}(1/3). The constant c_p, which is the second parameter in the L-notation, is often ignored when it is not important, depending on the context. Why do we look at this area? For two main reasons. The first one concerns pairings: this is the area where pairings take their values, and since one of our goals is to assess the security of pairing-based protocols, we are of course interested in the area that concerns pairings. The other reason is that it is an area where a lot of algorithms overlap. We have algorithms coming from the small characteristic world and from the medium characteristic world, which both apply at the boundary case, but it is not entirely clear which one performs best, which ones are actually applicable, and up to where. So this area required some study. Going back to the first motivation for studying this boundary case, the security of pairings: in order to have a secure pairing, we want the DLP to be as hard on the elliptic curve side as on the finite field side. So one can look at the known complexities of the algorithms on the finite field side. The first thing one can do is ignore the small characteristic area, because we know there are quasi-polynomial algorithms there with much better complexities. This leaves us with L_{p^n}(1/3) complexities for all the other known algorithms in medium and large characteristic. The idea is then to balance these complexities, and we know that on the elliptic curve side we have the square root of p, which comes from Pollard's rho.
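As a quick numerical sketch of the L-notation just defined (with a toy field size assumed purely for illustration), the snippet below evaluates L_Q(l_p, c) and shows the collapse towards polynomial time as l_p goes to 0 and towards exponential time as l_p goes to 1.

```python
# Evaluate L_Q(l_p, c) = exp(c * (log Q)^{l_p} * (log log Q)^{1 - l_p}) for Q = p^n.
# l_p = 0 gives (log Q)^c (polynomial time), l_p = 1 gives Q^c (exponential time);
# intermediate values, such as the ubiquitous 1/3, are the sub-exponential regime.
from math import exp, log

def L(Q, l_p, c=1.0):
    return exp(c * log(Q) ** l_p * log(log(Q)) ** (1 - l_p))

Q = 2 ** 512                         # toy field size, for illustration only
for l_p in (0.0, 1/3, 0.5, 1.0):
    print(f"l_p = {l_p:.3f}:  L_Q(l_p, 1) ~ {L(Q, l_p):.3e}")
```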
So when we balance all these complexities, we end up with a characteristic p of the form L_{p^n}(1/3), and this corresponds exactly to the boundary case discussed before. Now, what do we focus on in this paper? The idea is to study the behavior of all the algorithms that exist in this area, in order to then draw conclusions for the security of pairing-based protocols. So let us now discuss the algorithms that exist to solve DLP in finite fields. Most of them come from a family of algorithms called index calculus algorithms, and they all follow the same steps. We consider a finite field F_{p^n} and a factor basis F, which is a small set of small elements. The three main steps of any index calculus algorithm are the following. First, a relation collection step, where we find relations between the elements of our factor basis. Second, a linear algebra step, where we solve linear equations: if we have collected enough relations in the first step, we can form a system of equations whose unknowns are the discrete logarithms of the elements of the factor basis, and with as many equations as unknowns, we can solve the system and obtain the discrete logarithms of all the elements of the factor basis. This allows us to proceed to the last step, the individual logarithm step, also called the descent step, where the goal is to solve the DLP for a target element h, using the discrete logarithms of the elements of F computed in the linear algebra step above. One of the most well-known algorithms in this family of index calculus algorithms is the number field sieve. The number field sieve is often illustrated by the commutative diagram given on the slide. It starts by choosing two polynomials f1 and f2 in such a way that this diagram commutes. The polynomials f1 and f2 allow us to define two number fields, one on the left and one on the right, and this is where the name of the algorithm comes from. The idea of this diagram is then to compute relations, where relations come from algebraic norms that are factored in both number fields. If these norms are B-smooth on both sides (B-smooth meaning that, for a given smoothness bound B, they factor into elements smaller than B), then this results in a relation: because the diagram commutes, we get an equality between these products of elements smaller than B, and this is what we call a relation. As mentioned before, if we have enough of those, we can solve our system and obtain the discrete logarithms of the elements of the factor basis. All the technical details are in the paper, or in the literature in general: there has been a lot of work on the number field sieve, and in particular, over the years, many variants have been put forward to improve its complexity.
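As a toy illustration of the relation collection step described above, the sketch below does the analogous computation in a prime field F_p: random powers of g are tested for B-smoothness by trial division, and each smooth value yields one linear relation among the discrete logarithms of the factor basis elements. The parameters are illustrative toy values; the real NFS collects relations from B-smooth norms in two number fields, but the underlying principle is the same.

```python
# Toy relation collection for index calculus in a prime field F_p.
import random

def small_primes(bound):
    """Primes up to bound, by trial division (fine at toy sizes)."""
    return [q for q in range(2, bound + 1)
            if all(q % d for d in range(2, int(q ** 0.5) + 1))]

def smooth_exponents(m, factor_basis):
    """Exponent vector of m over the factor basis, or None if m is not B-smooth."""
    vec = []
    for q in factor_basis:
        e = 0
        while m % q == 0:
            m //= q
            e += 1
        vec.append(e)
    return vec if m == 1 else None

p, g, B = 1019, 2, 30                        # toy parameters, for illustration only
factor_basis = small_primes(B)               # the small elements whose logs we want
relations = []
while len(relations) < len(factor_basis) + 5:        # a few spare equations
    e = random.randrange(1, p - 1)
    vec = smooth_exponents(pow(g, e, p), factor_basis)
    if vec is not None:
        # g^e = prod q_i^{vec_i}, so e = sum vec_i * log_g(q_i) mod (p-1): one relation.
        relations.append((e, vec))
print(f"collected {len(relations)} relations over a factor basis of {len(factor_basis)} primes")
```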
One of these variants is called the multiple NFS (MNFS). The multiple NFS simply considers more number fields than just the two of the classical NFS setup: we again have a commutative diagram, but instead of two number fields we have V of them, each defined by a polynomial. Two of these polynomials are chosen at the very beginning of the algorithm, just as in NFS, and the other ones are linear combinations of these two initial polynomials. This results in some trade-offs, but overall the complexity is lowered by MNFS. Another variant is the tower number field sieve (TNFS), where the setup is again very similar to NFS and the steps are the same as for any index calculus algorithm, but extra algebraic structure is added, again in order to lower the complexity. Yet another variant is the special number field sieve (SNFS), where the difference comes from the characteristic p, which is defined as the evaluation of a polynomial of some degree lambda with small coefficients. In this case p has a special form, which gives its name to the algorithm, and in this particular setup, with p of this particular form, we obtain an algorithm with a lower complexity than the classical NFS. So how do we evaluate the complexity of NFS and all of its variants? NFS, like any index calculus algorithm, has these three steps, and the overall complexity of the algorithm is simply the sum of the costs of these steps. In order to optimize the overall complexity, we want to minimize the maximum of these three costs. This is rather complicated for several reasons. First of all, NFS and its variants have a lot of parameters, some discrete, some continuous. These parameters vary depending on the variant we are considering and on the polynomial selection used to choose the first two polynomials f1 and f2. There are also boundary issues. So this whole optimization problem, for which we use Lagrange multipliers, becomes all the more complex because of the number of parameters involved. When we solve the resulting polynomial systems, we use Gröbner basis algorithms, and on top of all of this, a number of analytic number theory results must be taken into account. In the end, in this work, we have computed the complexity of all these algorithms along with all their variants, both for the algorithms and for the polynomial selections. This resulted in two plots giving the complexities as a function of c_p, the second constant in the L-notation, for all the variants and all the polynomial selections. A surprising fact we noticed when computing these complexities is that not all the variants are applicable at the boundary case. For example, if we want to use the special number field sieve, SNFS, together with the tower setup, this results in norms that are much larger than expected, and thus a much higher complexity. So some variants cannot be combined, precisely at this boundary case. Now, we can also look at algorithms that come from the small characteristic area. Since we are precisely at the boundary case, we are interested in the algorithms that come from the medium characteristic, NFS and all its variants, but we should also look at how applicable the algorithms from the small characteristic area are, and how well they perform, at the boundary case. One of these algorithms is the function field sieve (FFS). The function field sieve is very similar to NFS, except that one should think of function fields instead of number fields, so it can be seen as an analogous variant. One result of our work was actually to lower the complexity of FFS by working in shifted finite fields. The argument is very similar to what is done for the tower setup: if we consider a finite field F_{p^n} with a composite extension degree n = kappa times eta, instead of working directly with the characteristic p, we work with the shifted characteristic p' = p^kappa and view the field as the degree-eta extension F_{p'^eta}. Working in this translated finite field allows us to lower the complexity; all the technical details are given in the paper.
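Below is a small sketch of two structural ingredients mentioned above, with all concrete values being illustrative assumptions: the "special" characteristic targeted by SNFS, here built by evaluating the Barreto-Naehrig polynomial (a standard example of a low-degree polynomial with tiny coefficients) at a small seed, and the composite-degree rewriting behind the tower setup and the shifted FFS, where n = kappa * eta lets the same field be viewed with the shifted characteristic p' = p^kappa.

```python
# Special-form characteristic (SNFS) and composite-n rewriting: toy illustration only.

def is_prime(m):
    """Trial division, fine for toy sizes."""
    return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))

def bn_polynomial(u):
    """Barreto-Naehrig polynomial: p(u) = 36u^4 + 36u^3 + 24u^2 + 6u + 1."""
    return 36 * u**4 + 36 * u**3 + 24 * u**2 + 6 * u + 1

# Search small seeds until p(u) is prime: such a p has the special form exploited by SNFS
# (here the degree is lambda = 4, with coefficients 36, 36, 24, 6, 1).
u = 1
while not is_prime(bn_polynomial(u)):
    u += 1
p = bn_polynomial(u)
print(f"seed u = {u}, special prime p = {p}")

# Composite extension degree: with n = kappa * eta, the field F_{p^n} is the same as
# F_{(p^kappa)^eta}, i.e. a degree-eta extension of the "shifted" characteristic p' = p^kappa.
kappa, eta = 3, 4
n = kappa * eta
assert (p ** kappa) ** eta == p ** n
print(f"|F_(p'^eta)| = |F_(p^n)| with p' = p^{kappa} and n = {n}")
```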
Finally, from the small characteristic world, we should mention the quasi-polynomial algorithms, on which a lot of work has been done in the past decade, up to very recently: in 2019, Kleinjung and Wesolowski proved the complexity given by the theorem stated below. So in the end, what do we have? We have this little picture showing which are the best algorithms depending on the area in which we are located. In small characteristic, on the very left, we have the quasi-polynomial (QP) algorithms. On the right, in medium characteristic, we have all the variants of NFS. These variants are of course not always applicable: it depends on the assumptions made on p and n, whether p is special or not, whether n is composite or not. And then we focus on the boundary case, where not only do we have all the variants of NFS, but also the function field sieve, which is applicable in this area. One part of our work has been to precisely identify the crossover points where FFS stops being the best algorithm and the variants of NFS start outperforming it; similarly, we give the crossover point between FFS and the quasi-polynomial algorithms in our paper. The motivation for all this precise analysis of these algorithms is the security of pairings. We would like to answer the following question, asymptotically: which finite fields F_{p^n} should be considered in order to achieve the highest level of security when constructing a pairing? The goal is to precisely find the p and n that answer this question, and it can be answered now that we have this entire analysis of all the algorithms at the boundary case, which is the area where pairings take their values. The idea here is to look for the value of c_p, again the second constant in the L-notation, that maximizes the minimum between the complexity on the elliptic curve side and the complexity on the finite field side. From this whole analysis, we have seen that the complexities for DLP on the finite field side are decreasing functions of c_p. On the other hand, for elliptic curves we have Pollard's rho, which is an increasing function, and its complexity depends on a value rho that characterizes the size of the subgroup considered. Here, in this plot, we take rho ranging from 1 to 2; details about rho are given in the paper. The optimal c_p that has to be chosen in order to balance the two complexities is then given by the intersection point between the complexity of Pollard's rho and the complexity of the best algorithm for DLP. As one can see, FFS is the curve in yellow, far on the left, so it is not considered in this analysis for pairings, as its complexity becomes much higher for the values of c_p that are relevant. The plot gives the intersection points for all the variants of NFS: not only MNFS, but also MexTNFS when n is composite and we use the tower setup, and also the curves for the special number field sieve. Depending on the assumptions one wants to make on the characteristic p and the extension degree n, one can then use this plot to precisely identify the crossover points.
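The balancing step can be sketched numerically: take a decreasing function of c_p for the finite field side and an increasing one for Pollard's rho, and locate their crossing, which is where the minimum of the two is maximized. The two functions below are illustrative placeholders, not the exponents derived in the paper; only the shape of the argument is shown.

```python
# Sketch of the balancing argument: the finite-field complexity decreases with c_p, the
# Pollard's rho complexity increases with it, and the best c_p is where the curves cross
# (equivalently, where min(both) is maximized). Both functions are illustrative placeholders.

def ff_exponent(c_p):
    # placeholder decreasing function, standing in for the best NFS-variant exponent
    return 2.2 / c_p ** 0.5 + 0.4

def ec_exponent(c_p):
    # placeholder increasing function, standing in for the Pollard's rho exponent
    return 0.5 * c_p ** 0.5

def crossover(lo=0.1, hi=50.0, iters=60):
    """Bisection on ff_exponent - ec_exponent: decreasing minus increasing has one root."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if ff_exponent(mid) > ec_exponent(mid):
            lo = mid          # finite-field side still harder: move right
        else:
            hi = mid
    return (lo + hi) / 2

c_star = crossover()
print(f"balanced c_p ~ {c_star:.3f}, common exponent ~ {ec_exponent(c_star):.3f}")
```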
This is summarized in the following table. For example, if we take n prime, i.e., non-composite, and a normal p with no special form, then one can see that the best algorithm to solve DLP in the finite field is the multiple number field sieve, where A is the polynomial selection being used, and the crossover point is at c_p = 4.45. This allows one to define a precise p, and then the corresponding n, in order to define the finite field that should be used to construct a secure pairing. There were some surprising facts that we noticed when considering the security of pairings. One of them is that using a special form for p does not always make the pairing less secure, as one could think it would, on the grounds that SNFS would tend to lower the complexity. Indeed, if we look at the first line of the table, with a special prime and lambda, the degree of the polynomial used to define p, equal to 20, then the curve of SNFS lies above the curve of MNFS. So SNFS does not perform better for this particular value of lambda, and one can choose values of lambda for which MNFS will outperform SNFS. This was one surprising fact we noticed when studying the security of pairings. So this concludes my talk. Thank you for listening, and I will be happy to answer questions during the session.