Okay, so B-SIDH is an idea that I stumbled across completely by accident. A little over a year ago I was writing a tutorial on SIDH, and I was trying to use Magma to produce a toy example; in fact, the toy example corresponds to the GIF that you can see on this title slide. In using Magma to derive this toy example, I stumbled across something that was really confusing to me. I thought it was a bug in Magma, but once I realised what was going on, what I thought was a bug gave birth to the idea that ended up being B-SIDH. I now think that down the track, if some problems can be solved, in particular if computing large prime-degree isogenies can be accelerated, B-SIDH might end up being the way to go as far as isogeny-based key exchange goes. It might be as performant as SIDH and SIKE, maybe even more performant. But I'd like to start the talk by reliving the mistake, or what I thought was a mistake, and explaining how it gave rise to B-SIDH.

To set up the toy example I used in the tutorial, I chose the prime p = 431 = 2^4 · 3^3 - 1, which has the same shape as all of the real-world SIDH and SIKE primes, and of course built the extension field F_p^2. I then asked Magma, using its supersingular-invariants function (which only takes the prime as input), to output the set of j-invariants corresponding to all of the nodes in the supersingular isogeny graph. As we know, the number of these is always very close to p/12; in this toy example there are 37 nodes, though maybe one of them is hidden behind me on the slide. Those are all of the j-invariants that are supersingular in this characteristic. Magma output them, so all was fine, and I started drawing the supersingular isogeny graph; no problems yet.

The next order of business in depicting this toy example was to start drawing edges: edges in the 2-isogeny graph for Alice and edges in the 3-isogeny graph for Bob (you'll see those graphs a little later on). To do this, I was going to do things exactly the same way that SIDH and SIKE compute isogenies in practice: find points of order 2 and points of order 3 on curves corresponding to these j-invariants, use those points as generators of the kernel to feed into Vélu's formulas, and then output the nodes connected to each node. So I started going around looking for these points of order 2 and order 3 on each of the curves in the graph. The way to do this in Magma was to use its elliptic-curve-from-j-invariant function, which on input of a j-invariant outputs an elliptic curve in the isomorphism class corresponding to that node of the graph. I did that for a while, computing points of order 2 and order 3 on representative curves at each node. But for some of the nodes in the graph, I wasn't able to find points of order 2 and points of order 3 over F_p^2, which confused me for a while. I was thinking: what's going on? All of these curves are supposed to be isogenous, which would mean they have the same number of points, and yet I can't find points of order 2 and order 3.
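With hindsight, the confusion is easy to reproduce. Below is a minimal pure-Python sketch (all the code and names here are mine, not Magma's) for the toy prime p = 431 and the supersingular curve y^2 = x^3 + x, with j-invariant 1728. The x-coordinates of the order-3 points are the roots of the 3-division polynomial psi_3(x) = 3x^4 + 4Ax^3 + 6x^2 - 1, and those roots are the same for a Montgomery curve B·y^2 = x^3 + Ax^2 + x regardless of B; whether a rational point actually sits above a given root depends on which twist you're on.

```python
# Minimal sketch, pure Python: why points of order 3 "disappear" on some
# representatives.  F_{p^2} = F_p(i) with i^2 = -1 (valid since p = 3 mod 4);
# elements are pairs (a0, a1) representing a0 + a1*i.
p = 2**4 * 3**3 - 1  # 431

def mul(a, b):  # multiplication in F_{p^2}
    return ((a[0]*b[0] - a[1]*b[1]) % p, (a[0]*b[1] + a[1]*b[0]) % p)

def chi(a):  # quadratic character: +1 square, -1 non-square, 0 zero
    if a == (0, 0):
        return 0
    r, base, e = (1, 0), a, (p*p - 1) // 2
    while e:
        if e & 1:
            r = mul(r, base)
        base = mul(base, base)
        e >>= 1
    return 1 if r == (1, 0) else -1

# Take A = 0, i.e. the supersingular curve B*y^2 = x^3 + x (j = 1728).
# The x-coordinates of its order-3 points are the roots of
# psi_3(x) = 3x^4 + 6x^2 - 1, independent of the twisting parameter B.
def psi3(x):
    x2 = mul(x, x)
    x4 = mul(x2, x2)
    return ((3*x4[0] + 6*x2[0] - 1) % p, (3*x4[1] + 6*x2[1]) % p)

roots = [(a, b) for a in range(p) for b in range(p) if psi3((a, b)) == (0, 0)]

def f(x):  # the right-hand side x^3 + x
    x3 = mul(mul(x, x), x)
    return ((x3[0] + x[0]) % p, (x3[1] + x[1]) % p)

# A point with x-coordinate x exists on B*y^2 = x^3 + x over F_{p^2}
# exactly when f(x)/B is a square, i.e. when chi(f(x))*chi(B) = +1.
for side, chiB in (("square B    ", 1), ("non-square B", -1)):
    rational = sum(1 for x in roots if chi(f(x)) * chiB == 1)
    print(side, "-> order-3 points above", rational, "of", len(roots), "roots")
# square B: all 4 roots carry rational points (3 divides p+1 = 432);
# non-square B: none of them do (3 does not divide p-1 = 430).
```

So a representative with a square B has fully rational 3-torsion, while its twist has none, which is exactly the behaviour I was seeing at some nodes.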
So then, when I started to compute the group orders of the representative curves that were spat out by Magma, I saw that there were two different group orders being produced by Magma's elliptic-curve-from-j-invariant function: the group order (p+1)^2 and the group order (p-1)^2. And then I remembered that in SIDH and SIKE we've been making a choice for (p+1)^2; that's the group order we've always used. Originally I was thinking, oh, we need to choose one or the other. But the essence of this paper, and of the idea that came from observing this mistake (for which I guess I have Magma's elliptic-curve-from-j-invariant function to thank), is that we don't need to choose between the group order (p+1)^2 and the group order (p-1)^2. We can use both, because each node in the supersingular isogeny graph corresponds to both: there are elliptic curves defined over F_p^2 with group order (p+1)^2 and elliptic curves defined over F_p^2 with group order (p-1)^2, and both choices are valid at every node. So each node isn't red or green; each node is both red and green, which is why on this slide I'm colouring every node yellow. Roughly speaking, the idea of B-SIDH is that Alice is going to work with the (p+1)^2 group orders and Bob is going to work with the (p-1)^2 group orders, and in doing so there's a whole array of new options for instantiating isogeny-based cryptography.

Just to reiterate: here's Alice, and she's going to work entirely on the A-side. She works on the same set of nodes as Bob, but she works with elliptic curves and torsion points whose orders divide p+1; in other words, with elliptic curves whose group orders are (p+1)^2. I'm calling this the A-side not just because it corresponds to Alice, but because it's where we fell by default in SIDH and SIKE. There, we always write down the system parameters by starting with a subfield curve that's minimally defined over F_p and then lifting it to F_p^2. When you consider supersingular elliptic curves over F_p, there is only one group order over F_p, and that is p+1. So when you lift to F_p^2, you naturally fall onto the A-side by default. But when you consider the possible group orders of supersingular elliptic curves over F_p^2, you get more than one, and that's where we get this possibility for Bob, which is to work on the B-side. I've called it the B-side not just because it corresponds to Bob, but as an analogy to the flip side of a record, which is often the less popular or forgotten side; in this case (p-1)^2 is the forgotten group order that wasn't really used in SIDH and SIKE. So Bob is going to work with elliptic curves that have group order (p-1)^2, i.e. only with torsion points whose orders divide p-1. But again, Alice and Bob work on the same set of nodes; they just work with different representative elliptic curves at each of those nodes.
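To make that concrete, here's a hedged brute-force sketch (pure Python, my own toy code, slow but fine at this size) that counts points on the two quadratic twists B·y^2 = x^3 + x over F_p^2 for p = 431. Only the quadratic character of B matters, so the function takes chi(B) = ±1 directly rather than a concrete B.

```python
# Brute-force point counts on the two twists B*y^2 = x^3 + x over F_{p^2},
# p = 431.  Elements of F_{p^2} = F_p(i), i^2 = -1, are pairs (a0, a1).
p = 431

def mul(a, b):
    return ((a[0]*b[0] - a[1]*b[1]) % p, (a[0]*b[1] + a[1]*b[0]) % p)

def chi(a):  # Euler's criterion: a^((p^2-1)/2) in F_{p^2}
    if a == (0, 0):
        return 0
    r, e = (1, 0), (p*p - 1) // 2
    while e:
        if e & 1:
            r = mul(r, a)
        a = mul(a, a)
        e >>= 1
    return 1 if r == (1, 0) else -1

def count_points(chiB):
    # Each x contributes 1 + chi(f(x)/B) affine points (just 1 if f(x) = 0),
    # and chi(f(x)/B) = chi(f(x)) * chiB; plus the point at infinity.
    n = 1
    for a in range(p):
        for b in range(p):
            x = (a, b)
            fx = mul(mul(x, x), x)
            fx = ((fx[0] + x[0]) % p, (fx[1] + x[1]) % p)  # f(x) = x^3 + x
            n += 1 if fx == (0, 0) else 1 + chi(fx) * chiB
    return n

print(count_points(+1), "=", (p + 1)**2)  # A-side: 186624 = (p+1)^2
print(count_points(-1), "=", (p - 1)**2)  # B-side: 184900 = (p-1)^2
```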
The first piece of good news for our purposes is that essentially no changes are needed to the SIDH or SIKE protocols, or to any of the isogeny arithmetic, if we want to use the B-side. That's by virtue of the fact that the A-side and B-side curves are quadratic twists of each other over F_p^2. On this slide I've zoomed into one node of the supersingular isogeny graph. In green is a representative curve in Montgomery form that Alice is going to use, and in red is the representative curve in Montgomery form that Bob is going to use. There's no difference in the A parameter: they use the same A. The only difference is that Bob is using a non-square B, whereas Alice's B is 1 by default. If you use a non-square B, you get the group order (p-1)^2 that Bob uses; if you use the curve in green, you get the group order (p+1)^2 that Alice uses.

The reason the arithmetic doesn't change is that all of the arithmetic we use in state-of-the-art implementations is agnostic to whether you're working on the curve in the green oval or the curve in the red oval, i.e. on the curve or on its twist. In the first yellow box are the x-coordinate-only scalar multiplications, which pay no attention to the y-coordinate: they use only x-coordinates and the A coefficient of the curve, which is the same for Alice on the A-side and Bob on the B-side. The same goes for the isogeny arithmetic in the second yellow box, which maps x-coordinates and the curve parameter A to the image x-coordinates x' and the image parameter A'; again, the same arithmetic and the same explicit formulas apply whether you're on the A-side or the B-side. And the j-invariant of both of these curves is the same. Technically speaking, the two curves have different group orders over F_p^2, so they're not even isogenous over F_p^2; but if we lift to their quadratic extension field F_p^4, they become isomorphic, which is why their j-invariants agree even over F_p^2. So all of the arithmetic we would ordinarily use to compute isogenies on the A-side immediately applies to any isogeny arithmetic Bob wants to do on the B-side; there are essentially no changes to the underlying explicit formulas that Alice and Bob use.

But probably the best piece of news is that we can now use finite fields that are much smaller than in the original SIDH and SIKE constructions. The first observation here is to recall that the security of SIDH and SIKE depends on the degrees of the isogenies that Alice and Bob compute, much more so than on p, because p is so large compared to those degrees: as the second bullet point says, the degrees are roughly the square root of p. Alice computes 2^a-isogenies and Bob computes 3^b-isogenies, and because we squeeze both Alice and Bob onto the A-side, so that both isogeny degrees divide p+1, the prime is roughly the square of their isogeny degrees. But if we put Bob on the B-side, we only need to squeeze Alice's isogeny degree into p+1, and we can squeeze Bob's into p-1. As soon as we remove the factor 2 that is necessarily common to both p+1 and p-1, whatever remains is automatically coprime.
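Before moving on, here's a sketch of the twist-agnostic x-only arithmetic from the top of this slide: a textbook Montgomery ladder (pure Python, my own hedged toy code, not any production implementation) that uses only x-coordinates and A; the parameter B never appears. The group order still shows through, though: for an arbitrary x0, the point above it lies on the curve or on the twist according to whether f(x0) = x0^3 + A·x0^2 + x0 is a square, and the ladder confirms that [p+1] kills A-side points while [p-1] kills B-side points.

```python
# Twist-agnostic x-only Montgomery arithmetic over F_{p^2}, p = 431, A = 0.
p = 431
def add(a, b): return ((a[0] + b[0]) % p, (a[1] + b[1]) % p)
def sub(a, b): return ((a[0] - b[0]) % p, (a[1] - b[1]) % p)
def mul(a, b): return ((a[0]*b[0] - a[1]*b[1]) % p, (a[0]*b[1] + a[1]*b[0]) % p)

def chi(a):  # quadratic character of F_{p^2}
    if a == (0, 0): return 0
    r, e = (1, 0), (p*p - 1) // 2
    while e:
        if e & 1: r = mul(r, a)
        a = mul(a, a); e >>= 1
    return 1 if r == (1, 0) else -1

A24 = (216, 0)  # (A + 2)/4 with A = 0; 216 = 1/2 mod 431

def xdbl(X, Z):  # x([2]P): note that only A24 appears, never B
    s, d = add(X, Z), sub(X, Z)
    ss, dd = mul(s, s), mul(d, d)
    e = sub(ss, dd)
    return mul(ss, dd), mul(e, add(dd, mul(A24, e)))

def xadd(XP, ZP, XQ, ZQ, xPQ):  # x(P+Q) from x(P), x(Q), affine x(P-Q)
    u = mul(sub(XP, ZP), add(XQ, ZQ))
    v = mul(add(XP, ZP), sub(XQ, ZQ))
    s, d = add(u, v), sub(u, v)
    return mul(s, s), mul(xPQ, mul(d, d))

def xmul(k, x):  # x([k]P) as (X : Z); Z = (0, 0) encodes the identity
    X0, Z0, X1, Z1 = (1, 0), (0, 0), x, (1, 0)   # (O, P)
    for bit in bin(k)[2:]:
        if bit == '1':
            X0, Z0 = xadd(X0, Z0, X1, Z1, x)
            X1, Z1 = xdbl(X1, Z1)
        else:
            X1, Z1 = xadd(X0, Z0, X1, Z1, x)
            X0, Z0 = xdbl(X0, Z0)
    return X0, Z0

x0 = (3, 1)                                  # an arbitrary x-coordinate
fx0 = add(mul(mul(x0, x0), x0), x0)          # f(x0) = x0^3 + x0
side = "A-side (curve)" if chi(fx0) == 1 else "B-side (twist)"
_, Zp1 = xmul(p + 1, x0)
_, Zm1 = xmul(p - 1, x0)
print(side, "| [p+1]P = O:", Zp1 == (0, 0), "| [p-1]P = O:", Zm1 == (0, 0))
```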
So Alice can compute M-isogenies and Bob can compute N-isogenies, where M and N can be the same size as before. The degrees of the isogenies stay the same, but now p is roughly the square root of what it used to be. Ideally, as in this last bullet point, we would still like to let Alice compute 2^a-isogenies and Bob compute 3^b-isogenies, because these small prime-power isogenies are, as it currently stands, the most efficient way to go. Unfortunately, the largest prime where M and N can be 2^a and 3^b is p = 17, which is squeezed between 16 = 2^4 and 18 = 2 · 3^2, and there is no larger prime squeezed between a power of 2 and twice a power of 3. The upshot is that we have to relax the requirement that Alice and Bob compute 2- and 3-power isogenies, and allow them to compute isogenies whose degrees are products of many different prime powers in general.

So instead, our goal is now to find large enough primes p (I'll get to what "large enough" means in a minute) where p+1 and p-1 are both as smooth as possible. In fact this is the same as, or very closely related to, a requirement in the best-paper winner at this conference: the SQIsign construction also requires primes sandwiched between two very smooth numbers. Another way to state the problem is to look for what we call twin smooths: two consecutive integers that are both smooth. The special case we're after is when their sum is also the prime p. I'm going to talk about finding twin smooths in general, because finding twin smooths with a prime sum is just a special case.

It turns out to be rather difficult to find primes of cryptographic size with p plus or minus 1 very, very smooth. Ideally we'd like them as smooth as possible, because their smoothness governs the efficiency of the subsequent isogeny computations. Starting with the largest 3-smooth twins, which we saw on the previous slide: they are 8 and 9, and their sum is 17, the prime from the previous slide. The largest 5-smooth twins are 80 and 81, although this time the sum, 161 = 7 · 23, isn't prime, and in general of course it won't be. The largest 113-smooth twins are both roughly 2^74, and their sum isn't prime either; the largest 113-smooth twins with a prime sum are about 8 bits smaller. In general, what we'd like to do is keep bumping up the smoothness bound until we find twins m of somewhere close to 256 bits, or whatever size we're looking for. But guaranteeing that we've found the largest twins subject to a fixed smoothness bound B requires solving 2^π(B) Pell equations, where π(B) is the number of primes up to B; π(113) = 30, so when I computed the largest 113-smooth twins, I had to solve 2^30 Pell equations to be guaranteed that they are indeed the largest. At least, that's according to the best known methods, which come from a theorem of Størmer and subsequent work of Lehmer showing that it suffices to solve these 2^π(B) Pell equations.
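Here's a small hedged sketch of that vocabulary (pure Python, my own toy code): an exhaustive scan for B-smooth twins in a fixed range, which recovers (8, 9) and (80, 81), plus a count of the 2^π(B) Pell equations for B = 113. Note that a scan only certifies maximality within its range; it's Størmer's theorem and Lehmer's method that certify global maximality.

```python
def is_smooth(n, B):  # trial division up to B (dividing by composites is harmless)
    for q in range(2, B + 1):
        while n % q == 0:
            n //= q
        if n == 1:
            break
    return n == 1

# Largest B-smooth twins (m, m+1) in a toy range, for B = 3 and B = 5.
for B in (3, 5):
    twins = [m for m in range(2, 10**4) if is_smooth(m, B) and is_smooth(m + 1, B)]
    m = max(twins)
    print("B =", B, "-> twins", (m, m + 1), "with sum", 2*m + 1)
# B = 3 -> (8, 9), sum 17 (prime); B = 5 -> (80, 81), sum 161 = 7*23 (not prime)

# The number of Pell equations in the Stormer/Lehmer approach: 2^pi(B).
pi_B = sum(1 for q in range(2, 114) if all(q % r for r in range(2, int(q**0.5) + 1)))
print("pi(113) =", pi_B, "-> 2^pi(B) =", 2**pi_B, "Pell equations")  # 30 -> 2^30
```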
So the upshot is that computing provably optimal parameters is rather difficult, at least as far as we know; the lesson from this slide is that finding optimal parameters is nontrivial. I'm going to return to the twin-smooth problem in a second, but first a brief interlude, because there's something I skimmed over on the previous slide that needs to be discussed. Because we're no longer using 2- and 3-power isogenies, the main difference between the security of B-SIDH and that of SIDH and SIKE is that we're no longer staying inside a fixed ℓ-isogeny graph. Here is the 2-isogeny graph I promised we'd see earlier: this is Alice's graph, where each node in general has three neighbours, and Alice only uses these edges. If we move to Bob's 3-isogeny graph, each node is in general connected to four neighbours; again these edges are fixed and Bob only uses these edges. In B-SIDH, we can no longer hope that both Alice and Bob stick to one prime power each. It might be possible in some constructions for one of Alice or Bob to have a prime-power degree, but in general their isogeny degrees are going to be products of many primes.

It should be said that even in this toy example, where the prime p = 2^4 · 3^3 - 1 was chosen so that the 2- and 3-isogenies are rational, you can still draw the ℓ-isogeny graph for any prime ℓ you like: ℓ = 5, ℓ = 7, ℓ = 11, and so on. For every prime ℓ (as long as ℓ is coprime to p), an ℓ-isogeny graph exists over these same nodes, and for ℓ = 5 and ℓ = 7 I've drawn the outgoing edges from two different nodes. At the ℓ = 5 node you can see six outgoing edges, and at the ℓ = 7 node you can see eight outgoing edges. This is true in general: there are ℓ+1 outgoing edges at a node, at least generically, because the ℓ-isogeny graph is (ℓ+1)-regular.

So the inherent conjecture we're making in the B-SIDH construction is that Alice and Bob are free to switch between ℓ-isogeny graphs from one step to the next. Alice might compute a 5-isogeny, which takes her to one of six neighbouring nodes, and then from the node she now stands on she might compute a 7-isogeny, or an 11-isogeny, and so on. The conjecture is that it's not a problem to do this, and I think, and hope the consensus amongst the experts would agree, that it's fine, because the hardness of the problem depends on the degree of the full isogeny that Alice and Bob compute, and not on its factorization.
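One way to sanity-check that intuition is to count walks. The number of cyclic N-isogenies from a fixed node, and hence (ignoring collisions) the number of destination nodes an N-isogeny walk can reach, is the product of (ℓ+1)·ℓ^(e-1) over the prime powers ℓ^e dividing N exactly. The sketch below is my own illustration of that standard formula, comparing prime-power degrees with squarefree composite degrees of similar size.

```python
# Number of cyclic N-isogenies from a node: prod over l^e || N of (l+1)*l^(e-1).
def num_cyclic_isogenies(N):
    count, l = 1, 2
    while l * l <= N:
        if N % l == 0:
            e = 0
            while N % l == 0:
                N //= l
                e += 1
            count *= (l + 1) * l**(e - 1)
        l += 1
    if N > 1:               # leftover prime factor
        count *= N + 1
    return count

print(num_cyclic_isogenies(2**4))        # 24 walks of degree 16 (toy Alice)
print(num_cyclic_isogenies(3**3))        # 36 walks of degree 27 (toy Bob)
print(num_cyclic_isogenies(5 * 7 * 11))  # 576 walks of degree 385: composite
                                         # degrees reach at least as many nodes
```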
The best known attacks all treat the isogeny problem as somewhat of a black box, independent of the degree's factorization, so really what you're interested in is how many destination nodes Alice and Bob can land on; and here, the number of destination nodes will be roughly the same as in SIDH and SIKE if we make the isogeny degrees the same size. The assumption is that Alice and Bob are fine, at successive steps, to take a step with some prime-degree isogeny and then immediately take a step of a different prime degree. It could turn out that future cryptanalysis shows this is not a good idea; but, as I say in the paper, it could also turn out to be an even better idea than sticking to a fixed ℓ-isogeny graph. At the moment it's not very clear, but I hope the expert consensus would be that it's irrelevant as far as the security, or the best known attacks, are concerned.

So now, returning to our problem of searching for twin smooths to try and instantiate the B-SIDH construction: on this slide you'll see some probabilities related to concrete parameter choices. Suppose we're looking for twin smooths that are roughly 256 bits in size, with smoothness bound 2^16, so they're allowed to contain any prime factor less than 2^16. The blue rectangles mark the parts of the numbers (or the full numbers) that are B-smooth by construction, and the orange marks the parts that are not smooth by construction, where we just have to hope. There are three methods discussed in the paper.

The first is the most naive method one could think of: construct smooth values of m as products of primes less than 2^16, and loop over all such products until we find one where either m+1 or m-1 is also smooth. Basic smoothness estimates put the probability of finding a twin smooth this way at roughly 2^-70, so there's essentially no hope of finding twins of this size with this method.

The second method uses the extended GCD. Here, each of the two numbers is roughly a square root smooth by construction, and we hope that the other halves are smooth: choose a and b coprime, smooth, and roughly 2^128, run the extended Euclidean algorithm, and hope that the numbers s and t it outputs are both smooth as well. The probability here is roughly 2^-50: a little better, but the chance of those two orange chunks both being B-smooth is still rather low.

The best method I could come up with in this paper, the one that was most successful in finding smooth parameters, is instead to search for twins of the form m = x^n and m - 1 = x^n - 1, where n is some small integer like 6. The reason n needs to be small is that if it's too large, there aren't enough x values to search over; and if it's too small, like 2, we get back to one of the situations above where the probability of smoothness is really small. In searching with m = x^6, we're actually only looking for one number x of roughly 2^43 to be smooth for m to be smooth; and the factorization x^6 - 1 = (x-1)(x+1)(x^2+x+1)(x^2-x+1) means that on the other side we're looking for two more numbers of roughly that size, x-1 and x+1, plus the two quadratic factors of twice the bit length, to be smooth. In doing that, our probability of smoothness becomes a lot better.
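Here's a hedged toy version of that third method (my own pure-Python sketch): scan x, test the five factors against a small smoothness bound, and report the resulting twins (x^6 - 1, x^6). At cryptographic sizes, the same structure is run with B around 2^16 over an enormous range of x.

```python
# Toy version of the m = x^6 method: m is smooth iff x is, and
# m - 1 = (x - 1)(x + 1)(x^2 + x + 1)(x^2 - x + 1) splits into small factors.
def is_smooth(n, B):
    for q in range(2, B + 1):
        while n % q == 0:
            n //= q
        if n == 1:
            break
    return n == 1

B = 50  # toy smoothness bound so the scan actually finds hits
for x in range(2, 2000):
    factors = [x - 1, x, x + 1, x*x + x + 1, x*x - x + 1]
    if all(is_smooth(t, B) for t in factors):
        m = x**6
        print("twins", (m - 1, m), "sum", 2*m - 1)
# e.g. x = 2 gives the twins (63, 64) with prime sum 127.
```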
But the best method known to date for searching for twin smooths is subsequent research done with Michael Meyer and Michael Naehrig. Very recently we developed a sieving algorithm that performs much better, both asymptotically and for the sizes of primes we're looking for, so I'll briefly touch on that. The idea is similar to the third method on the previous slide, but instead uses polynomials a(x) and b(x) that differ by 1 and split completely over the polynomial ring Q[x]. There's a degree-6 example on the slide, where a(x) and b(x) are both completely split degree-6 polynomials. This really improves the probability of finding smooth twins at cryptographic sizes. The difficulty in that construction is finding the polynomials a(x) and b(x) themselves, but if you can do that, which we know how to do for certain degrees up to 12, the probability of smoothness is a lot better. One of the best examples we found is the 250-bit prime below, where p+1 and p-1 are both 2^15-smooth. If you want to take a look at that work, the preprint address is given there.

Probably the easiest way to get a high-level snapshot of the security of B-SIDH is to contrast it with the security of the original SIDH construction. In both cases I've fabricated a little mini supersingular isogeny graph, or at least its nodes. Let's say that the red dot in both cases is the starting curve, the starting node, and the blue dots represent the set of possible destination nodes that Alice and Bob could walk to with a secret isogeny. In the case of SIDH, on the left, the destination set is roughly the square root of the number of nodes in the full graph: Alice and Bob can only really reach a square root of the full set of nodes. That's why we don't use the general isogeny-finding algorithms to attack the problem; there are better, specialised algorithms for the SIDH problem. In particular, we use the meet-in-the-middle and claw-finding algorithms, both classically and quantumly, and the security analyses of these give Õ(p^(1/4)) and Õ(p^(1/6)) respectively, if you ignore memory requirements. In practice, we nowadays know that with all things considered, memory in particular, the best classical attack is the van Oorschot-Wiener algorithm. In any case, all of these algorithms are specific, not general: they can't be used to solve the general isogeny problem, but are tailored to the types of isogenies used in SIDH.

In B-SIDH, we now have far fewer nodes in the graph, but roughly the same number of destination nodes; in particular, the number of destination nodes is now much closer to the full set of nodes in the graph. Recall that there are roughly p/12 nodes in the supersingular isogeny graph, and Alice and Bob now take walks that can land on on the order of p destination nodes.
Maybe there's a handful of nodes they can't walk to for technical reasons, but in general they cover a fraction of the full graph that's much closer to one. And this is related to why we can attack the problem using the general isogeny-finding algorithms: in particular, the classical algorithm due to Delfs and Galbraith, which runs in Õ(p^(1/2)), and its quantum variant due to Biasse, Jao and Sankar, which is basically Delfs-Galbraith with Grover's algorithm on top to get the square-root speedup, running in Õ(p^(1/4)). These algorithms solve the general problem, and they're both memory-free: neither requires any sort of large memory to run, both are perfectly parallelisable, and the big-O in their costs just hides the cost of the isogeny oracle. So if you can pin down the concrete cost of an isogeny oracle, classically or quantumly, you can basically write down a concrete cost for the best known algorithms for solving the general problem. And this matches the NIST requirements for level 1 security, for example, matching AES-128, really nicely, because the square-root relationship between the classical and quantum complexities means we can choose parameters quite easily in B-SIDH. There is one caveat, related to the fact that both of these algorithms will just return some path between the two nodes, which will probably not be the path that Alice or Bob actually took. So, to be safe, we're assuming that these paths can be modified to output the paths that Alice and Bob took, and can thereby be used to solve the underlying cryptographic problem; we're making the conservative assumption that those algorithms can be adapted to produce the secret keys.

Of course, at least in this talk, I'm rather biased towards B-SIDH, especially when comparing it to SIDH and SIKE. The pros: the primes can be roughly half the size, and they end up giving smaller public keys, even when you take key compression in SIDH and SIKE into account. In our case there's nothing to be gained from compression, so you get some simplicity in B-SIDH, because the public keys are already as compressed as possible. The secret scalars used to compute those public keys are roughly the same size as the finite field, because Alice's and Bob's walk lengths are roughly the size of p, so you also avoid the computational overhead of compression. The security analysis is arguably cleaner than that of SIDH and SIKE, and certainly than that of other post-quantum primitives. And one other nice thing is the hybrid security: the classical security you'd gain from doing an ECC hybrid over the same finite field matches perfectly with the conjectured classical and quantum security of the post-quantum primitive. So if you also do ECC over the same field you'd use in B-SIDH, the classical ECC security you get is the same as the classical, and comparable to the quantum, security that you get from B-SIDH.
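As a back-of-envelope illustration of those claims (a hedged sketch of my own: the 434-bit and 250-bit sizes are just representative choices, and all polynomial factors, memory costs and the isogeny-oracle cost hidden in the big-O are ignored):

```python
# Rough attack exponents: bits of security ~ exponent * log2(p), ignoring
# polynomial factors, memory costs and the isogeny-oracle cost in the big-O.
params = [
    # (label, log2(p), classical exponent, quantum exponent)
    ("SIDH/SIKE-style, 434-bit p", 434, 1/4, 1/6),  # meet-in-the-middle / claw
    ("B-SIDH-style,    250-bit p", 250, 1/2, 1/4),  # Delfs-Galbraith / +Grover
]
for label, bits, c, q in params:
    print("%s  classical ~2^%-3d  quantum ~2^%d"
          % (label, round(bits * c), round(bits * q)))
# The B-SIDH line sits close to AES-128's 2^128 classical / 2^64 Grover target,
# which is the square-root relationship mentioned above.
```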
The main con right now, which is the main direction for future work, is that the efficiency is nowhere near as good as SIDH and SIKE, at least in the balanced case where Alice and Bob compute isogenies of comparable degree: as I said, the degrees can't be as smooth as they are in SIDH and SIKE, so as it currently stands the efficiency is a lot worse. But some really exciting recent work by Adj, Chi-Domínguez and Rodríguez-Henríquez, which is online as a preprint, shows that this gap is a fair bit smaller than what I was expecting, and is encouraging for future work on B-SIDH. What's unclear at the moment is whether jumping between different prime-degree isogeny graphs, which I was talking about earlier, is a problem for security: whether it's a feature or whether it's a bug. It's also unclear whether the smaller p introduces attacks that I didn't consider in the security analysis; there could be attacks better than Delfs-Galbraith, or better than Biasse-Jao-Sankar in the quantum sense, that apply to B-SIDH.

As far as future work goes, I guess it all comes down to trying to make B-SIDH as fast as possible. The most obvious direction is to look for better parameters, because parameters much smoother than the ones we've found would translate directly into very noticeable speedups: roughly speaking, the efficiency of B-SIDH is heavily dependent on the smoothness bound used to find the parameters, so smoother parameters would be the easiest way to make B-SIDH a lot faster. A more ambitious direction is to see whether the Õ(√ℓ) asymptotic complexity of computing an ℓ-isogeny can be made even better; if that happens, it'll be yet another boost for B-SIDH that might make it as performant as SIDH. Cheers.