Hello everyone. My name is Daniël Kuijsters and I would like to thank the organizers of Crypto 2021 for the opportunity to present our paper, Thinking Outside the Superbox. This is joint work with Nicolas Bordes, Joan Daemen and Gilles Van Assche.
To motivate our work, consider modes. Modes are cryptographic algorithms that take arbitrary-length input and give arbitrary-length output. Internally they make use of a primitive, and the primitives that we have considered are cryptographic permutations. These can be thought of as block ciphers for which the key has been fixed. An example of an unkeyed mode is the sponge construction that you see on this slide. The sponge construction is proven secure if f is a uniformly random permutation, and this means secure against generic attacks, that is to say, attacks that do not make use of any primitive-specific properties. An example of a keyed mode is the keyed duplex. The keyed duplex has likewise been proven secure against generic attacks if the primitive f that is being used is a uniformly random permutation.
These two examples show that the design space is traditionally split in two. On the one hand you have people that design modes built on top of ideal primitives, and on the other hand you have people that try to design primitives that behave like ideal ones. However, the latter cannot be formalized, so in practice assurance has to come from cryptanalytic evaluation of round-reduced versions of f, possibly used within the mode. This requirement of behaving like an ideal primitive is quite strict, often maybe a little too strict, and it leads to primitives that are over-engineered and quite resource-heavy. So a recent trend has been to design modes that take primitive-specific properties into account during the design. An example of such a primitive-aware mode is the one that you see on this slide, namely Farfalle.
So clearly design and cryptanalysis are really intertwined, and this leads to the research question that you see on this slide: how do the different designs of cryptographic permutations affect cryptanalysis? We have actually looked at both linear and differential cryptanalysis. However, for this talk we will restrict ourselves to differential cryptanalysis. So there are two things, design and cryptanalysis, and we start by giving a short overview of differential cryptanalysis, just to make sure that everyone agrees on the same definitions.
Given a permutation f, we call a tuple (a, b), where a is an input difference that propagates to an output difference b through f, a differential. Assigned to a differential is its differential probability, also called the DP, and this is defined as the number of x for which the equation f(x) + f(x + a) = b holds, relative to the total number of different x. An x that satisfies this equation uniquely defines a pair (x, x + a) that is said to follow the differential. Because of the additive property of the logarithm it is often more convenient to work with minus log base 2 of the DP, and this is called the weight of the differential (a, b).
The primitives that we have considered all have a similar structure: they are the composition of, say, r round functions. Each round function consists of a nonlinear layer, which is an S-box layer, that is, the parallel application of a number of S-boxes, followed by a linear layer that is the composition of a mixing layer M and a shuffle layer P. Note that the mixing layer M is allowed to be the identity, and in fact one of the primitives that we have considered uses a mixing layer that is simply the identity. The shuffle layer is just another word that we use for what is normally called a bit permutation. Finally, this is followed by a translation, which is the addition of a round constant to break up any symmetries in the round.
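The definitions of differential probability and weight given above can be made concrete with a small sketch. The 3-bit chi map used here as the permutation f is a toy choice made for illustration only, and the function names are hypothetical; the addition in f(x) + f(x + a) = b is taken to be bitwise XOR.

```python
from math import log2

def chi3(x):
    """Toy 3-bit permutation (assumption): y_i = x_i ^ (~x_{i+1} & x_{i+2})."""
    bits = [(x >> i) & 1 for i in range(3)]
    out = 0
    for i in range(3):
        y = bits[i] ^ ((bits[(i + 1) % 3] ^ 1) & bits[(i + 2) % 3])
        out |= y << i
    return out

def dp(f, n_bits, a, b):
    """Fraction of inputs x with f(x) ^ f(x ^ a) == b."""
    size = 1 << n_bits
    count = sum(1 for x in range(size) if f(x) ^ f(x ^ a) == b)
    return count / size

def weight(f, n_bits, a, b):
    """Weight of the differential (a, b): -log2 of its DP (infinite if DP is 0)."""
    p = dp(f, n_bits, a, b)
    return -log2(p) if p > 0 else float("inf")

if __name__ == "__main__":
    # Print the possible output differences and their weights for each input difference.
    for a in range(1, 8):
        row = {b: weight(chi3, 3, a, b) for b in range(8) if dp(chi3, 3, a, b) > 0}
        print(a, row)
```

Exhaustive counting like this is only feasible for small widths such as a single S-box; for a full permutation the DP has to be estimated through trails.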
Typically the linear and the nonlinear layer are the same for each round, whereas the round constant is used, like I said, to break these symmetries. Note that we have not considered Feistel structures or round functions that are based on ARX, that is to say, round functions that make use of additions, rotations and XORs.
Perhaps the following figure helps to illustrate this composition. Here you clearly see that there is a first nonlinear layer consisting of the parallel application of a number of S-boxes, and then we have drawn boxes for the three other layers, namely the mixing layer, the shuffle layer and the translation at the end, and we see that this structure is repeated for the number of rounds that has been chosen.
Given that the permutation is composed of a number of rounds, it is possible to give a more precise description of how differences propagate through it. In fact we can specify a difference for each intermediate state, and this leads to an (r + 1)-tuple of differences, where r is the number of rounds. This is called a differential trail, also known as a differential characteristic, depending on which paper you read. Similar to the differential case, a DP, or differential probability, is assigned to a differential trail. This is simply the number of input pairs that follow each difference of the trail, relative to the total number of different pairs. Again we define a weight of the differential trail, which is the sum of the weights of the round differentials, and this in turn equals the sum of the weights of the differentials over the active S-boxes; this holds by definition.
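As a minimal sketch of the trail weight just described, the following sums the weights of the S-box differentials over the active S-boxes of each round. The 3-bit S-box table, the state layout (box i occupies bits 3i to 3i+2) and the representation of a trail as a list of (difference before the S-box layer, difference after the S-box layer) pairs are all assumptions made for illustration; the consistency of consecutive rounds under the linear layer is taken for granted here.

```python
from math import log2

SBOX = [0, 3, 6, 1, 5, 4, 2, 7]  # toy 3-bit S-box (3-bit chi), an assumption
N = 3                            # S-box width in bits

def sbox_weight(a, b):
    """Weight of the differential (a, b) over one S-box."""
    count = sum(1 for x in range(1 << N) if SBOX[x] ^ SBOX[x ^ a] == b)
    return -log2(count / (1 << N)) if count else float("inf")

def split(state, n_boxes):
    """Split a state difference into per-S-box differences."""
    mask = (1 << N) - 1
    return [(state >> (N * i)) & mask for i in range(n_boxes)]

def round_weight(a, b, n_boxes):
    """Weight of (a, b) over the S-box layer: sum over the active S-boxes."""
    pairs = zip(split(a, n_boxes), split(b, n_boxes))
    # Skip passive boxes (zero in, zero out); they contribute weight 0.
    return sum(sbox_weight(x, y) for x, y in pairs if x or y)

def trail_weight(rounds, n_boxes):
    """Sum of the round-differential weights of a trail."""
    return sum(round_weight(a, b, n_boxes) for a, b in rounds)

if __name__ == "__main__":
    # Two rounds on a 2-box state: one active S-box, then two active S-boxes.
    trail = [(0b000001, 0b000011), (0b001010, 0b011110)]
    print(trail_weight(trail, 2))  # 6.0
```

An impossible S-box differential anywhere in the trail yields an infinite weight, matching a DP of zero.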
The trails actually partition the set of pairs that follow the differential, and if this partition is non-trivial, so if there are multiple trails that share the same input difference and the same output difference, then we say that these trails cluster within a differential.
The ciphers that we have considered differ in one notion that we call alignment. Actually, the notion of alignment was coined by the Keccak designers in a paper that was presented at the ECRYPT Hash Workshop of 2011, by means of how differences propagate through the round function. That is a different definition than the one that we present in our paper. Intuitively, you can think of the bits being grouped along the S-box boundaries. For example, bits are grouped in nibbles, so 4 bits, if the S-box has size 4, or in bytes, so 8 bits, if the S-box has size 8. When the bits in the round function are consistently processed in these groups, we say that the round function is aligned, and if each round function is aligned we call the entire primitive aligned.
In the paper you can find a more formal definition of what we mean by alignment, and from this definition it immediately follows that there exists something called a superbox substructure. Combining this superbox substructure with a mixing layer that is the parallel application of an MDS matrix, you can reason about the differential properties of the cipher using combinatorial arguments. In particular this makes it possible to easily give bounds on the trail weights, and this was one of the, I guess, selling points of AES, namely that it was resistant against differential cryptanalysis, where the argument was based on these trail bounds. However, of course, to be completely resistant against differential cryptanalysis you need more than just trail bounds. So again this is a figure that hopefully clarifies some of these things.
In this figure you again see a layer of S-boxes. Compared to the previous figure, the mixing layer here is split into a number of sub-functions, if you want to call them that, which are nicely aligned along the S-box boundaries. They are followed by a shuffle layer, so a bit permutation, and what you cannot see in this picture, because it was a bit difficult to draw, is that the shuffle layer actually shuffles the bits in groups. I mean that if the bits belong to one group, then the group as a whole is moved to a different position. The shuffle layer is followed by a group-wise addition of a constant, and then this structure is repeated for the other rounds.
Clearly, if there is an aligned approach, it should not come as a surprise that there is also something that we call an unaligned approach, and in an unaligned approach the idea is to avoid any such groupings when designing the round functions. In general, although there are exceptions, this means that you need computer programs to investigate the trail bounds. This naturally leads to a question: if it seems like there are only advantages to using the aligned approach, why is not every cipher designed with an aligned approach? In fact many ciphers are indeed designed according to an aligned approach, but it turns out that an aligned approach may have some potentially unwanted side effects.
In the paper we mention some of these side effects, and we have generated a lot of data for four different primitives to actually quantify them. The four primitives that we have considered you can see in this table. Actually, in this table there are some block ciphers, but these block ciphers have been transformed into permutations by fixing the key to the constant zero. The first primitive that we have considered is Rijndael, and actually this is a somewhat modified version of Rijndael in that its width is larger than what you would typically see, namely 256 bits.
According to our definition Rijndael is aligned. It has a strong mixing layer and its S-box works on bytes, so it has size 8, and there are 32 such S-boxes, corresponding to a width of 256 bits. The second cipher that we have considered is Saturnin, which can be thought of as a more modern Rijndael. According to our definition it is aligned. It has a strong mixing layer and it works on nibbles, so groups of four bits. There are 64 such S-boxes, corresponding to a width of 256 bits. Third, we have looked at Spongent, which again according to our definition is aligned. It has a weak mixing layer, because the mixing layer is simply the identity function, and it works on nibbles, so four bits, and there are 96 such S-boxes, corresponding to a width of 384 bits. And finally we have looked at the permutation Xoodoo. Xoodoo is an example of a permutation that follows the unaligned approach. It has a strong mixing layer and a rather small S-box size, because it works on groups of three bits. There are 128 such S-boxes, corresponding to a width of 384 bits.
So clearly we have a sample space of only four ciphers, but already this took quite a lot of work. In order to increase the sample space we would like you to also work on this, so please expand this sample space. To help you with this we have made our software available at the following URL.
Let's now compare the aligned approach with the unaligned approach in a more quantitative way. The differential probability of a trail can be approximated by the product of the DPs of the differentials over its active S-boxes, where an S-box is active if the input difference to that S-box is non-zero. If equality holds, then we say that the round differentials are independent. So when does a trail have a low differential probability? Either if there are many active S-boxes, or if the differentials over the active S-boxes have a low DP, or of course both.
The idea behind the wide trail strategy is to ensure that all trails have many active S-boxes. Intuitively, you take your mixing layer and you look at its input and its output, and you want to make sure that in the tuple of both its input and output there are many active S-boxes. In particular, if there are very few active S-boxes in the input a, then you would like to make sure that there are many active S-boxes in the output M(a). And if there are few active S-boxes in the output b, then you would like to make sure that there are many active S-boxes at the input, so in M inverse of b. A natural thought is to consider the whole distribution, over all nonzero a, of the box weight of a plus the box weight of M(a), where the box weight is simply the measure that counts the number of active S-boxes. The minimum is called the branch number, and this gives a way of lower bounding the number of active S-boxes that are always present in the round function.
But the branch number is just the minimum of this distribution, so of something that you could present in a histogram. Here you see two such histograms: on the one hand the bit weight histogram and on the other hand the box weight histogram. The bit weight histogram shows the number of states distributed over the different bit weights, and this gives a measure of the diffusion power of the mixing layer without regard for the S-box layer. A good bit weight histogram starts far to the right, and it is quite flat, in the sense that if you were to think of these histograms as continuous lines, you would like the slope to be quite small. We recover the bit weight branch number as the starting value of this histogram, that is, the smallest weight with a nonzero count. For example, here Saturnin starts at 5, so its bit weight branch number is 5. And then we have the box weight histogram, which shows the number of states distributed over the box weights.
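The bit weight and box weight histograms and the corresponding branch numbers can be sketched by exhaustive enumeration on a toy state. Everything concrete here is an assumption made for illustration: the 12-bit state, the 3-bit boxes, and the hypothetical mixing layer `mix` built from three rotations. None of it is taken from the four ciphers in the talk, whose states are far too wide to enumerate this way.

```python
from collections import Counter

N_BITS = 12  # toy state width (assumption)
BOX = 3      # S-box width in bits (assumption)

def rotl(a, r):
    """Rotate an N_BITS-wide word left by r positions."""
    r %= N_BITS
    return ((a << r) | (a >> (N_BITS - r))) & ((1 << N_BITS) - 1)

def mix(a):
    """Hypothetical linear mixing layer: XOR of three rotations."""
    return a ^ rotl(a, 1) ^ rotl(a, 4)

def identity(a):
    """Mixing layer that does nothing, as in a cipher with a weak mixing layer."""
    return a

def bit_weight(a):
    """Number of nonzero bits."""
    return bin(a).count("1")

def box_weight(a):
    """Number of active boxes, i.e. BOX-bit groups with a nonzero difference."""
    mask = (1 << BOX) - 1
    return sum(1 for i in range(0, N_BITS, BOX) if (a >> i) & mask)

def histogram(m, weight):
    """Count the nonzero states a by weight(a) + weight(m(a))."""
    return Counter(weight(a) + weight(m(a)) for a in range(1, 1 << N_BITS))

def branch_number(m, weight):
    """Smallest weight that occurs in the histogram."""
    return min(histogram(m, weight))

if __name__ == "__main__":
    for name, w in (("bit", bit_weight), ("box", box_weight)):
        print(name, "branch number:", branch_number(mix, w))
        print(name, "histogram:", sorted(histogram(mix, w).items()))
```

Because a box is active as soon as one of its bits is, the box weight never exceeds the bit weight, so the box weight histogram can only sit at or below the bit weight histogram in terms of where it starts.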
Again we want these histograms to be flat and to start far to the right. The smallest value, so the starting value, is the box weight branch number. For example, Rijndael has a box weight branch number of 5. We clearly see that in moving from the bit weight to the box weight, the graphs are shifted upwards. So there is a loss of diffusion in going from the bit weight to the box weight, and we call this phenomenon huddling. In general huddling increases with the S-box size. However, we found that it is more pronounced in aligned ciphers. By the way, the box weight is also a very nice way of predicting what happens in the case of two-round differential trails, the trail weights to be more specific.
In this slide we see the differential trail weight histogram versus the differential weight histogram. On the left we have the number of trails versus the trail weight, and on the right we have the number of differentials versus the differential weight. Again we see that in moving from the trail weight histogram to the differential weight histogram, the histograms are slightly shifted upwards. This is the result of the fact that within a differential there can be clustering, so multiple trails that have the same input and output difference, and they all contribute to the DP of the differential. So clearly the further a histogram is shifted upwards, the more the cipher suffers from clustering, and we found that this effect is again more pronounced in aligned ciphers.
Of course, we have done more than just this. As I already mentioned in the beginning, in addition to the differential propagation properties we have also looked at the linear propagation properties and have computed the two-round histograms of the four ciphers in that case. Looking at these histograms, there were two clear winners, namely Saturnin and Xoodoo.
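The effect of clustering on the weight of a differential can be sketched as follows. Since the trails partition the pairs that follow a differential, its DP is the sum of the DPs of its trails. The sketch assumes that each trail has DP exactly 2 to the power minus its weight, that is, independent round differentials, and that the list contains all trails of the differential; the function names are hypothetical.

```python
from math import log2

def differential_dp(trail_weights):
    """DP of a differential as the sum of its trail DPs (assumes DP = 2**-weight)."""
    return sum(2.0 ** -w for w in trail_weights)

def differential_weight(trail_weights):
    """Weight of the differential: -log2 of the summed trail DPs."""
    p = differential_dp(trail_weights)
    return -log2(p) if p > 0 else float("inf")

if __name__ == "__main__":
    print(differential_weight([5]))     # a single trail: weight unchanged, 5.0
    print(differential_weight([4, 4]))  # two clustering trails of weight 4: 3.0
```

A differential with a single trail keeps the trail's weight, while every additional trail lowers the weight of the differential, which is why more clustering moves differentials to lower weights in the histogram.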
These are the ciphers that performed best with respect to this metric, and for these two ciphers we have computed the three-round trail histograms, just to see how they compare in that regard. Moreover, we have also looked at whether the approximation of the DP of a trail by the product of the DPs over its active S-boxes is an equality or not, so whether this is a good approximation or not. As I already mentioned, this is related to the independence of the round differentials. For three rounds of Xoodoo we have looked at this independence and found that the round differentials are indeed independent.
Moreover, based on available information, so looking at the existing literature and doing some back-of-the-envelope calculations, we have sketched what happens when considering the weight histograms of four rounds and beyond. Here we compare given the same computational resources rather than the same number of rounds, because clearly one round of a cipher implemented on some architecture can be a lot more resource-heavy than a round of another cipher implemented on the same architecture. Given the same computational resources, we have found that Xoodoo performs best with respect to the differential and the linear propagation properties.
This concludes our presentation, and we would like to thank you for your attention.