Hi, everyone. My name is Andrea Basso and I'm going to present our work on cryptanalysing an oblivious pseudorandom function protocol based on supersingular isogenies by Boneh, Kogan, and Woo. This is joint work with my co-authors Péter Kutas, Simon-Philipp Merz, Christophe Petit, and Antonio Sanso.

I will start by briefly recalling what an OPRF is. Then I will describe the OPRF protocol by Boneh et al. and present two types of attack, a polynomial-time attack and a subexponential-time attack. Finally, I will conclude by describing some implementation results as well as discussing the need for a trusted setup.

An oblivious pseudorandom function, or OPRF, is a two-party protocol where a client and a server jointly evaluate a pseudorandom function F, whose inputs are a value M chosen by the client and a key K chosen by the server. In particular, we want the client to learn the output of the evaluation, that is F(K, M), but nothing else. So the client shouldn't be able to evaluate the function on any other input without interacting further with the server. Conversely, we want the server to learn nothing. So the identity of the client must remain private, as well as its input. On top of that, we want the server not to learn even the hash of the message, because that would allow the server to distinguish when the same input is reused. Moreover, some OPRF protocols have an additional property called verifiability. When an OPRF is verifiable, the server also provides some guarantee, such as a zero-knowledge proof, that the same key K is reused across all interactions.

Oblivious pseudorandom functions are an important primitive because they have several applications.
These include password-authenticated key exchange, with one notable mention being OPAQUE, a protocol that builds a PAKE based on OPRFs, as well as private set intersection, privacy-preserving CAPTCHAs, and many other real-world applications that are used and deployed every day.

Last year, Boneh et al. presented two oblivious pseudorandom functions based on supersingular isogenies. One was developed in the framework of CSIDH and one in the SIDH framework. Our work focuses on the SIDH one, which was also the main focus of their work. The main idea is that we can have an SIDH-type exchange, look at the shared secret, which is the resulting curve, and use that as the output of our pseudorandom function. However, we cannot use an SIDH-type exchange directly, because the client has access to the server's public key, which allows the client to evaluate the OPRF on multiple inputs without further interacting with the server. Similarly, the server can see the curve E_M, which depends exclusively on the message M. So while the result is what we want, the privacy guarantees are not there.

Thus Boneh et al. propose to disregard all the other computations and start only by computing the isogeny corresponding to the input M. Then the client computes an additional isogeny, φ_R, which is called a blinding isogeny. The resulting curve E_MR does depend on the input M, but hides, or blinds, this value. Indeed, once the server sees the curve E_MR, it will not be able to recover E_M or the message M. This curve is then sent to the server, who computes the isogeny φ_K. Note that this isogeny is parallel to the isogeny between E_0 and E_K, and corresponds to that one. The server also provides some torsion information, which allows the client to invert the isogeny φ_R. When the client inverts this isogeny, it unblinds the value and obtains the curve E_MK. In this way, the client can recompute the curve that would have been computed by a shared SIDH computation.
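The sequence of curves in the protocol just described can be summarised in a short diagram. This is only a sketch of the talk's description: the labels φ_M (the isogeny derived from the input M) and E_{MRK} (the curve the server sends back) are names chosen here for illustration and do not appear in the talk.

```latex
% Requires amsmath for \xrightarrow and \text.
\[
\begin{array}{l}
E_0 \xrightarrow{\;\varphi_M\;} E_M
    \xrightarrow{\;\varphi_R\;} E_{MR}
    \xrightarrow{\;\varphi_K\;} E_{MRK}
    \xrightarrow{\;\text{unblind } \varphi_R\;} E_{MK} \\[4pt]
E_0 \xrightarrow{\;\varphi_K\;} E_K
    \quad \text{(the isogeny that the server's step is parallel to)}
\end{array}
\]
```

The client performs the first, second, and last steps; the server only ever sees E_{MR} and returns E_{MRK} together with torsion information.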
Unlike in a plain SIDH exchange, however, the client does not learn the public key of the server, and the server does not learn anything about the client. More formally, we can define the pseudorandom function F to be the hash of the input M, the resulting curve (or its j-invariant), and the public key of the server.

The security of OPRFs is fairly complex because it depends on several assumptions. In our work, we mainly focus on one assumption, the so-called one-more assumption, which states that an attacker who has interacted several times with the server cannot compute one more evaluation of the OPRF on a new input. The assumption is stated in reference to a game which starts with the attacker being able to send points M on the curve E_0. The challenger then computes the corresponding curve E_M and then E_MK, which is the resulting curve from the OPRF. The attacker is allowed to repeat this process several times. Eventually the attacker needs to output the OPRF evaluation corresponding to a point M', where M' is chosen by the challenger.

This assumption differs from the protocol in two main ways. First, in the assumption, the attacker, which in the protocol corresponds to the client, submits points. In the protocol itself, this would reveal the input of the client, so the client needs to compute the chain of isogenies itself to protect its own privacy. In the assumption, the attacker sends points in order to abstract away from the privacy-preserving computations, and because in the protocol the client provides zero-knowledge proofs that its isogeny computations are correct. The other main difference is that in the OPRF protocol, the client hashes its input onto the point M and then uses that for its computation. This is needed to prevent simple attacks where a 2-isogenous neighbor of the curve E_M corresponds to a 2-isogenous neighbor of the resulting curve.
Hashing prevents this type of attack because, even if the client could compute a neighboring curve of the resulting curve, it would not know the input corresponding to that point M. In the assumption, this corresponds to the fact that the attacker needs to output the resulting value for a point M' chosen by the challenger rather than by the attacker.

We will now present an attack that fully breaks this assumption and computes the shared curve for any input point P. In general, we use a two-step strategy. First, we devise a strategy to recover the curve E_K, which would be the public key of the server if this were an SIDH exchange, as well as the subgroup generated by the image of a point M under the private isogeny of the server, for some point M. Once we can do this, we repeat the process a few times and combine several points to obtain the image under φ_K of the 2^n-torsion on E_0. This assumes that the isogenies of the client have degree 2^n. Once we have this, we have a full attack, because for any point P in the 2^n-torsion of E_0 we can compute the subgroup generated by the image of P on E_K. Thus we can compute the output of the PRF from the curve E_K modulo this subgroup, which is the codomain of the isogeny from E_0 with kernel generated by P and K.

So let's see how to do the first step, where we want to recover E_K as well as points on that curve. We start from the curve E_0, query the challenger with a point M, and obtain the curve E_MK. Then we query with the point 2M. The point 2M has order 2^(n-1), and the corresponding curve lies on the isogeny between E_0 and E_M, one step before E_M, so the curve we obtain lies on the isogeny between E_K and E_MK. We can then repeat the process with 4M, 8M, and so on. At the end, we obtain the curve E_K, but we have also recovered all the curves on the isogeny between E_K and E_MK, which is shown in green on the slide. So we know the green isogeny, and hence we know the kernel of the green isogeny.
The kernel of the green isogeny, the one from E_K to E_MK, is precisely the subgroup generated by the image of M under the isogeny φ_K. So we can recover the subgroup generated by φ_K(M) for any point M in the 2^n-torsion of E_0, which means that we can also recover the image of M on E_K up to an odd scalar. We then repeat the process with two more points: N, which must be linearly independent from M, and M + N. This gives us a point M', which is a scalar multiple of the image of M, a point N', which is a scalar multiple of the image of N, and similarly a point R', which is a scalar multiple of the image of M + N. Now, M' and N' form a basis of the 2^n-torsion on E_K, so we can express R' in terms of M' and N'. Write M' = α φ_K(M), N' = β φ_K(N), and R' = a M' + b N'. If we do all the computations, we see that the ratio between α and β is exactly the ratio between b and a, which gives us just enough information to compute what we want. Indeed, given any point P of order 2^n on E_0, we can express it in terms of M and N as P = x M + y N. Then the subgroup generated by the image of P is the one generated by x M' + y (α/β) N'. That is because M' is the correct image scaled by α and N' is the correct image scaled by β. When we multiply N' by α/β, the two β cancel out, and both terms are multiples of α, which is an odd scalar. So the scalar does not affect the subgroup computation, and we obtain precisely what we want.

So, to briefly summarize what we have done so far: we can use O(λ) queries to recover the subgroup generated by φ_K(M) for any point M in the 2^n-torsion of E_0. Once we have three such subgroups, for M, N, and M + N, we can compute the subgroup generated by the image of P for any point P without further queries or interaction with the challenger.
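The scalar bookkeeping in this second step can be checked in a small toy model. The sketch below is not the authors' code: it assumes the 2^n-torsion is modelled as pairs of integers modulo 2^n and the secret isogeny as an unknown 2x2 matrix with odd determinant, and every name in it (`PHI`, `coords`, `image_generator`, and so on) is hypothetical. It only illustrates why the ratio α/β can be read off from the coordinates of R'.

```python
import random

n = 16
MOD = 2 ** n  # toy stand-in: the 2^n-torsion modelled as (Z/2^n)^2


def rand_unit(rng):
    """Random odd scalar, i.e. a unit modulo 2^n."""
    return rng.randrange(1, MOD, 2)


def apply(mat, v):
    """Apply a 2x2 matrix (row-major 4-tuple) to a vector modulo 2^n."""
    a, b, c, d = mat
    return ((a * v[0] + b * v[1]) % MOD, (c * v[0] + d * v[1]) % MOD)


def scale(s, v):
    return ((s * v[0]) % MOD, (s * v[1]) % MOD)


def add(u, v):
    return ((u[0] + v[0]) % MOD, (u[1] + v[1]) % MOD)


def coords(basis1, basis2, target):
    """Solve target = a*basis1 + b*basis2 modulo 2^n with Cramer's rule."""
    det = (basis1[0] * basis2[1] - basis1[1] * basis2[0]) % MOD
    inv = pow(det, -1, MOD)  # det is odd when basis1, basis2 form a basis
    a = ((target[0] * basis2[1] - target[1] * basis2[0]) * inv) % MOD
    b = ((basis1[0] * target[1] - basis1[1] * target[0]) * inv) % MOD
    return a, b


rng = random.Random(1)
# Secret action of the server's isogeny on the torsion (assumed toy model).
while True:
    PHI = tuple(rng.randrange(MOD) for _ in range(4))
    if (PHI[0] * PHI[3] - PHI[1] * PHI[2]) % 2 == 1:
        break

M, N = (1, 0), (0, 1)
alpha, beta, gamma = rand_unit(rng), rand_unit(rng), rand_unit(rng)
# What the first step yields: images known only up to unknown odd scalars.
M1 = scale(alpha, apply(PHI, M))
N1 = scale(beta, apply(PHI, N))
R1 = scale(gamma, apply(PHI, add(M, N)))

a, b = coords(M1, N1, R1)            # R' = a*M' + b*N'
ratio = (b * pow(a, -1, MOD)) % MOD  # equals alpha/beta since gamma = a*alpha = b*beta


def image_generator(x, y):
    """Generator of the image subgroup of P = x*M + y*N, from M', N', ratio."""
    return add(scale(x, M1), scale((y * ratio) % MOD, N1))
```

In this model `image_generator(x, y)` equals α times the true image of P, so it generates the same subgroup. `pow(det, -1, MOD)` needs Python 3.8 or later.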
Being able to compute these subgroups allows us to break the one-more assumption, because we can now evaluate the OPRF on any input of our choosing without further interactions. However, the attack depends entirely on the fact that we can submit points of varying order. While the assumption as stated does not impose any requirement on the order, it is easy to check that the query points have full order, so it is easy to thwart such an attack. Moreover, while this attack applies to the assumption, it does not directly translate to the OPRF protocol, because the protocol also requires the client to provide a zero-knowledge proof to protect against GPST attacks, and this zero-knowledge proof also has the effect of guaranteeing that the corresponding points, and thus the corresponding isogenies, have full order.

So we will now present a new way to recover the values on E_K that only uses queries with full-order points. This attack is subexponential, but it also applies to the protocol. Note that the second part, where we use multiple subgroups on E_K to break the one-more assumption, remains the same as in the previous attack.

Again we submit a point M to the challenger and obtain the curve E_MK. Then we consider the isogeny from E_0 to E_M and backtrack a bit. After we backtrack, we deviate and go to another curve that is also at distance 2^n from E_0. We can submit the point corresponding to this curve to the challenger, because this point has full order, and we obtain another curve on the server's side. Now, since the degree of the secret isogeny is coprime to the degree of the isogenies of the client, the common parent of these two curves corresponds to the image, under the server's isogeny φ_K, of the curve at which we deviated. We can repeat this process several times by querying multiple points and then checking for their common parents.
We can then check for the common parents of those parents, and repeat this several times until we go all the way back to E_K. Once again, we recover all the intermediate curves from E_K to E_MK, so we recover the isogeny from E_K to E_MK, the green isogeny. Once we have that, the remaining part of the attack is exactly the same as in the polynomial-time attack.

The attack relies on building a binary tree to compute all the intermediate curves, and thus the isogeny from E_K to E_MK for a given point M. This attack works because all the curves on the bottom layer are 2^n-isogenous to E_K, so order checking won't prevent it. The complexity of building this tree depends on the number of queries that we use. A higher number of queries means more curves on the bottom layer, and thus more computations are required to build the entire tree. Conversely, if we use fewer queries, we have fewer curves on the bottom layer, and thus fewer computations and fewer steps are needed to get to E_K. However, when we use a higher number of queries, the curves are closer together, and thus computing a common parent is faster. Indeed, to compute a common parent, we use a meet-in-the-middle approach whose complexity depends on the distance between the two curves. If we use fewer queries, we require fewer meet-in-the-middle computations, but the complexity of each is higher. This explains the flexibility of our attack: it allows us to easily trade off queries for computation and vice versa. To show that such an attack always has an impact and always decreases the security, we show that with as few as two queries, we already reduce the complexity of the attack from λ to 2λ/3 bits. Moreover, the attack is highly parallelizable.
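The meet-in-the-middle search for a common parent can be illustrated on a toy graph. The sketch below is an assumption-laden stand-in, not the implementation from the talk: nodes of a plain binary tree (tuples of bits) play the role of curves, `neighbors` plays the role of listing 2-isogenous curves, and the distance between the two curves is assumed to be known. All function names are hypothetical.

```python
def neighbors(node):
    """Neighbours of a node in a toy binary tree. A node is a tuple of bits
    giving the path from the root; this stands in for 2-isogenous curves."""
    out = [node + (0,), node + (1,)]
    if node:
        out.append(node[:-1])
    return out


def sphere(start, radius):
    """Nodes reached from start by non-backtracking walks of exactly
    `radius` steps."""
    frontier = {(start, None)}  # pairs (current node, node we came from)
    for _ in range(radius):
        frontier = {
            (nxt, cur)
            for cur, prev in frontier
            for nxt in neighbors(cur)
            if nxt != prev
        }
    return {cur for cur, _ in frontier}


def common_parent(u, v, distance):
    """Midpoint of the path between u and v, which are assumed to be
    `distance` steps apart (distance must be even)."""
    half = distance // 2
    meet = sphere(u, half) & sphere(v, half)
    if len(meet) != 1:
        raise ValueError("nodes are not at the stated distance")
    return meet.pop()
```

For example, `common_parent((0, 1, 0, 0, 1), (0, 1, 1, 0, 1), 6)` walks three steps from each side and intersects the two sets. Each side explores on the order of 2^(distance/2) nodes instead of 2^distance, which is why closer curves are cheaper to merge.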
Parallelization is possible because the computations that take place in one side of the tree are completely independent of the computations on the other side, which means that we can traverse the tree in a breadth-first manner and parallelize the different branches. On the other hand, if we want to minimize the amount of memory required, we can traverse the tree in a depth-first manner, which requires a higher computation time but minimizes the amount of memory.

To sum up, we have a subexponential attack that builds a binary tree to recover E_K and points on E_K, and then uses the second part of the attack already presented for the polynomial-time attack. Since we have this trade-off between the number of queries and the overall complexity, we achieve a balanced trade-off when the number of queries is comparable to the complexity of the meet-in-the-middle computations. When that happens, the attack has an overall subexponential complexity. Note that this assumes that queries have unit cost. In real-world applications, this is not the case, because the server requires a certain amount of time to process the request and output the value of the OPRF. However, since the attack is highly flexible, we can take into consideration the computation time of the server and find the number of queries that minimizes the running time of the attack.

An interesting point about this attack is that, unlike the polynomial-time attack, it cannot be easily prevented. There are no simple or obvious countermeasures that can thwart the attack, with one possible exception, which is increasing the parameter size. The attack is subexponential, not polynomial-time, so if we increase the parameters enough, we can make this type of attack costly enough. However, if we want to guarantee 128 bits of security, we would have to use extremely large isogeny degrees.
For 128 bits of security, the client would need to use isogeny degrees higher than 2^4000, which is extremely impractical. Of course, there might be new ad hoc and efficient countermeasures, but finding them seems to be highly non-trivial.

To validate the correctness of the attack and show its feasibility in practice, we implemented our attack in SageMath and ran the full attack on a laptop. We remark that this implementation is not particularly optimized, so the results shown here provide a sort of lower bound on what can be achieved. On this side of the table, we see the parameters used, including the size of the prime, the bits of security, and the degree of the isogeny corresponding to the client, where q denotes the exponent of the number of queries. So in this attack, we use between 2^3 and 2^18 queries. In the middle section, we see the parameters corresponding to the meet-in-the-middle part, where the distance denotes the distance between two curves, as well as its memory requirements. Finally, we have the running time of the complete attack. As you can see, the running time does grow quite quickly, but for 67 bits of security, we can run the entire attack in less than two days, which is a complete break of the security guarantees. While we didn't run the full attack for 128 bits of security, we were able to estimate the overall running time based on partial results, and we saw that the attack should take less than six years. Once again, this implementation is not particularly optimized and does not run on a powerful machine, so a better implementation running on a cluster would be able to significantly reduce the running time of the attack. Our code is available on GitHub, so feel free to check it out.

Lastly, we want to discuss how the starting curve is selected. The paper does not specify how the starting curve is chosen, besides saying that it should be random.
However, there is no known algorithm to compute a random supersingular curve at the moment, so some party needs to pick it. We have several possibilities: the client, a third party, or the server chooses the curve, a known curve is used, or we rely on a trusted setup. If the client or a third party chooses the curve, they can backdoor it. If a backdoored curve is used, the server is vulnerable to a key-recovery attack using torsion-point attacks, which would reveal the key K of the server and completely break the OPRF protocol. Alternatively, the server may choose the curve, or a known curve such as E_1728 may be chosen. However, if the server knows the endomorphism ring of the starting curve, this can lead to a simple attack that breaks the supersingular isogeny collision assumption used in the security proof of the protocol. At this stage, it is not yet clear whether breaking the assumption leads to breaking some security properties of the protocol, because the protocol seems to be slightly more involved than what the security assumption covers. However, since the security guarantees are not clear when the server knows the endomorphism ring, and since letting the client or a third party choose the curve can lead to a full key-recovery attack on the server, at the moment a trusted setup is needed.

To sum up, we presented two attacks that break the one-more assumption used in the security proof of the protocol. We also showed that one of the two attacks, the subexponential one, leads to an attack on the OPRF protocol where, after a subexponential number of queries and a subexponential amount of computation, we can evaluate the OPRF without further interacting with the server, which defeats the entire purpose of an OPRF protocol. We also provide an implementation that demonstrates that the attack is feasible in practice and appreciably reduces the security guarantees of the protocol.
Finally, we also discussed the need for a trusted setup. For future work, we want to look into improving the attack complexity, ideally reducing the subexponential attack to a polynomial one, as well as developing new and efficient countermeasures that can prevent this type of attack. We are also interested in studying how a trusted setup can be performed, and in either showing that the attack on the assumption leads to an attack on the protocol or amending the security proof to show that this attack does not actually apply to the protocol. If you have any questions, feel free to check out our paper, available on ePrint, or reach out to me or any of my co-authors. Thank you very much for your attention.