Hi everyone, my name is Lauren, and today I will be presenting work by myself and my colleague Begül Bilgin on the classification of quadratic Boolean functions. We will actually talk here about vectorial Boolean functions, which are Boolean functions with multiple output bits. In symmetric primitives we often use S-boxes, which are balanced vectorial Boolean functions, and I will quickly recall some S-box properties that we often use.
One way to represent an S-box is with its lookup table, which shows very clearly what the output is for every input. An alternative is the algebraic normal form (ANF), which gives us a Boolean function for every output bit as a function of the input bits. An important property of S-boxes is their algebraic degree, and this is something that we can read directly from the algebraic normal form, because it is the largest degree that occurs in the Boolean functions of the ANF. Finally, we also care about properties for cryptographic security. Two important ones are differential uniformity and linearity, which indicate the resistance against differential and linear cryptanalysis, respectively.
Now, the space of S-boxes or Boolean functions is very large, so an important tool is affine equivalence. Two functions are affine equivalent if and only if there exist input and output affine transformations that transform one function into the other. This equivalence is so useful because it preserves many properties, such as the algebraic degree, the differential uniformity and linearity, and also the multiplicative complexity, which is the minimum number of AND gates that you need to implement a function. Affine equivalence has been used for many years: it has been around since the 50s, and by 1972 all Boolean functions with up to five variables were classified.
Then in 1991 Maiorana also classified the Boolean functions with six input variables, and Fuller proved in 2003 that this classification was complete. When it comes to vectorial Boolean functions, De Cannière demonstrated an exhaustive classification of bijective p by p S-boxes for p up to four, but this methodology was not feasible for larger S-boxes. So in 2017 Božilov et al. restricted their work to only quadratic functions and managed to classify them for five inputs. This restriction to quadratic functions is justified by the recent interest in side-channel secure implementations and also multi-party computation.
The methodologies of both De Cannière and Božilov et al. need an algorithm to find the affine equivalent representative of a class, and they both used an algorithm by Biryukov et al., which finds this representative and works for bijective vectorial Boolean functions. The algorithm chooses as representative the lexicographically smallest S-box in lookup-table (LUT) form. Let me demonstrate this.
We see here on top the lookup table of an S-box S with four input bits and four output bits, so there are 16 possible input values and 16 possible output values. Since S is balanced, each output occurs exactly once. We are now looking for a function R that is affine equivalent to S by constructing affine functions A and B. Since we want R to be as small as possible lexicographically, we start with output 0. In the first step we need to guess the first output of A, and we pick 0. Then we look up what S of 0 is, and that's 1, which means that B of 0 is also equal to 1. In the second step we need to guess again, because we have no other information to go on, and now we guess that A of 1 equals 1. Because S of 1 equals B, we obtain B here, and we can set the output R of 1 to the smallest available output, which is 1.
We repeat this one more time, where we guess A of 2 equals 2, and S of 2 equals 9, which means B of 2 equals 9. Now it gets more interesting. Because of the linearity of A, we know A of 3: it is equal to the XOR of the previous three outputs, so it is equal to 3. This is called a forward sweep. A forward sweep is completely deterministic because there is no guessing involved. We can again look up S of 3, which equals C. Here we cannot just choose the output 3, because the linearity of B already fixes B of 3 and it does not equal C, so we choose the smallest available power of 2.
In the next step we again cannot use the linearity of A, because 4 is a power of 2, but we don't have to guess, because we can do a backward sweep. Since we didn't use the output 3 yet, and we want the smallest possible output here, we can put it in R. We use the linearity of B to determine that B of 3 equals 3, which is the XOR of 1, B and 9. Then we look up the inverse of S at 3, which is 7, and that gives us A of 4 equals 7. So this was a backward sweep, which was again completely deterministic because there was no guessing involved. Then we can again do a forward sweep, because A of 5 is determined by linearity as the XOR of 0, 1 and 7, which gives us 6. We look up S of 6, which is F, and again we need to choose the smallest available power of 2.
This process continues until R is completely defined. But remember, we made some guesses, so to find the absolute smallest representative we have to repeat this process for every possible guess. For example, in this case we chose A of 1 equals 1, and it resulted in a representative that goes 0, 1, 2, 4, 3, etc. Now suppose that we make a different choice here. If we guess A of 1 equals 5 and continue the process, then we obtain a representative that goes 0, 1, 2, 3, 4, etc. So this is lexicographically smaller than the previous one.
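The linearity step behind the forward and backward sweeps can be sketched in a few lines of Python. The helper name `affine_extend` is an assumption made for illustration and not part of the original algorithm's code. It only uses the fact that an affine map is fixed by its values at 0 and at the powers of two, and it is fed the values from the walk-through above.

```python
def affine_extend(known, x):
    """Value at x of the affine map M fixed by `known`.

    `known` holds M(0) and M(2^i) for every bit i set in x.
    For an affine map: M(x) = M(0) XOR (M(2^i) XOR M(0)) over all set bits i.
    """
    value = known[0]
    bit = 1
    while bit <= x:
        if x & bit:
            value ^= known[bit] ^ known[0]
        bit <<= 1
    return value


# Values of A and B fixed so far in the example (hexadecimal as in the talk).
A = {0: 0x0, 1: 0x1, 2: 0x2, 4: 0x7}
B = {0: 0x1, 1: 0xB, 2: 0x9}

forward_a3 = affine_extend(A, 3)   # forward sweep for A(3)
forward_a5 = affine_extend(A, 5)   # forward sweep for A(5)
backward_b3 = affine_extend(B, 3)  # linearity of B used in the backward sweep
```

With these inputs the helper gives A(3) = 3, A(5) = 6 and B(3) = 3, matching the numbers in the example.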
So to complete the entire algorithm, we need to exhaust all the guesses and keep the lexicographically smallest representative.
In order to extend existing classifications to larger functions, we needed to be able to find the representative for non-bijective functions, and in particular for balanced vectorial Boolean functions that may have fewer output bits than input bits. That means that the S-box S is not invertible, which is a problem for the algorithm, because remember that the backward sweeps use the inverse of S. If S is balanced but not invertible, then in each backward sweep there are 2 to the p minus q possible values for the inverse of S. Biryukov et al. estimated that the complexity of such an adapted algorithm would asymptotically follow this formula, and we plot it here for p equal to 5, so five input bits. As you can see, the complexity increases as q, the number of output bits, decreases. As a result, they advise not to use the algorithm for q smaller than p minus 2.
However, when we implemented our algorithm and measured the experimental runtimes, we saw a very different trend. The complexity initially decreases as q decreases and only increases when q becomes very small. Of course, I do want to mention that these observations are for p equals 5, so it could well be different for larger p, and also we only used quadratic functions. But the most important conclusion for us was that it is practical to use this algorithm for non-bijective balanced Boolean functions.
In fact, we noticed that the algorithm could be more efficient. Here we show an example of an S-box with 4-bit inputs and 2-bit outputs. That means that every output value occurs four times. The first difference you might notice in the algorithm is that instead of a lot of guesses we have a lot more backward sweeps than before. This is because we can reuse y values: every output y can occur four times in a representative R.
And since we want R to be lexicographically as small as possible, we can keep using output 0 four times and perform backward sweeps instead of guesses. Backward sweeps are cheaper than guesses because they have 2 to the p minus q options, while guesses have 2 to the p options. This is why we see that the complexity first decreases when q decreases. Eventually it increases again, because as q nears 0 the complexity of a backward sweep becomes the same as that of a guess.
We now start from the methodology of Božilov et al. that was used to classify quadratic 5-bit S-boxes. Their approach was to construct an exhaustive list of quadratic algebraic normal forms and to use the affine equivalence algorithm to get their representatives. The list was approximately 2 to the 23 items long, which is about 8 million items. After eliminating the duplicates from the list of representatives, they ended up with only 76 functions, which represent the 76 quadratic affine equivalence classes for 5-bit functions. Using 16 threads, their method took 3 hours of runtime. The clear bottleneck here is the affine equivalence algorithm, because it has to be performed so many times. So to optimize this methodology, we really need to minimize how often it is used.
Thanks to our adaptation of the affine equivalent representative algorithm, we can do this process iteratively with an increasing number of output bits. Suppose we start from all p by 1 Boolean functions. We can combine them to find p by 2 functions and reduce them to their representatives. Then from the p by 2 classes we can find all p by 3 representatives, and so on. By doing the search step by step, the number of calls to the representative algorithm is much smaller than with the previous method, because we can reduce the group of candidates to their representatives and eliminate duplicates in each intermediate step. I can make this a bit more clear with a graphical illustration.
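The iterative search just described can also be sketched as a toy Python program. This is only a sketch under stated assumptions and not the implementation from the paper: the representative is found by brute force over all affine maps instead of with sweeps and guesses, the function names are made up for illustration, and there is no degree filter because for p up to 3 every balanced Boolean function already has degree at most 2. It is only practical for very small p.

```python
from itertools import product


def affine_permutations(n):
    """All invertible affine maps on n bits, each as a lookup table (tuple)."""
    size = 1 << n
    maps = []
    for cols in product(range(size), repeat=n):  # images of the basis vectors
        lin = [0] * size
        for x in range(size):
            for i in range(n):
                if (x >> i) & 1:
                    lin[x] ^= cols[i]
        if len(set(lin)) == size:  # keep only invertible linear parts
            maps.extend(tuple(v ^ c for v in lin) for c in range(size))
    return maps


def representative(s, in_maps, out_maps):
    """Lexicographically smallest table B(S(A(x))) over all affine A and B."""
    return min(tuple(b[s[ax]] for ax in a) for a in in_maps for b in out_maps)


def is_balanced(s, p, q):
    """True if every q-bit output value occurs exactly 2^(p-q) times."""
    counts = [0] * (1 << q)
    for y in s:
        counts[y] += 1
    return all(c == 1 << (p - q) for c in counts)


def classify(p, max_q):
    """Representatives of balanced p-by-q functions for q = 1 .. max_q."""
    size = 1 << p
    # Assumption: only balanced Boolean functions are appended as new output bits.
    boolean = [f for f in product((0, 1), repeat=size) if sum(f) == size // 2]
    in_maps = affine_permutations(p)
    reps = [(0,) * size]  # the single function with zero output bits
    result = []
    for q in range(1, max_q + 1):
        out_maps = affine_permutations(q)
        found = set()
        for r in reps:
            for f in boolean:
                # Concatenate one more output bit to a p-by-(q-1) representative.
                cand = tuple((r[x] << 1) | f[x] for x in range(size))
                if is_balanced(cand, p, q):
                    found.add(representative(cand, in_maps, out_maps))
        reps = sorted(found)  # the set already removed duplicates
        result.append(reps)
    return result


classes = classify(2, 2)
print([len(reps) for reps in classes])
```

For p = 2 everything is affine, so each step yields a single class. Calling `classify(3, 1)` yields two classes, the affine one and the quadratic one. The point of the sketch is the structure: candidates are built only from the representatives of the previous step, so the expensive reduction runs far fewer times than on an exhaustive list.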
So let's assume that, as a result of previous steps in the algorithm, we have a list of the representatives of p by q minus 1 functions. We also have an exhaustive list of p-bit Boolean functions. By concatenating each function in the first column with each function in the second column, we create a list of p by q functions. We then apply the algorithm to find the affine equivalent representative of each of those functions. Finally, when we eliminate duplicates, we obtain the list of p by q affine equivalent representatives, and this list is exhaustive.
We first applied this new methodology to the 5-bit quadratic Boolean functions. While the 76 5 by 5 functions had already been classified, this new algorithm also provides the representatives of the non-bijective 5-bit Boolean functions. A first version of the algorithm took us 50 minutes on four threads, which is already a big improvement over the previous work, but after some optimizations that we describe in the paper we were able to reduce this time to barely six minutes.
Thanks to this optimization, the classification of 6-bit Boolean functions became possible. Before this work we had no idea how many classes there could be, and the resulting number, 2263 classes, is probably even higher than I expected. Apart from the 6 by 6 S-boxes, we also classified the non-bijective balanced 6-bit Boolean functions. The representatives for all these classes, and also for the 5-bit Boolean functions, can all be found online.
Let's look at some of the properties of the classes that we found. We found that only five classes have odd parity, and all the others are even. 70 classes have a quadratic inverse, and all the others have a cubic inverse. In a follow-up work, which is also presented at this conference, we also found out that all of those 70 classes are affine equivalent to their own inverse. And finally, here is a distribution of the linearity and differential uniformity properties over the classes.
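For readers who want to reproduce such a distribution, the two properties can be computed directly from a lookup table. The following Python sketch rests on a few assumptions: the function names are illustrative, linearity is taken as the largest absolute Walsh coefficient (some authors use half of that value), and the 3-bit quadratic permutation at the end is a small example chosen for illustration and not one of the representatives from the talk.

```python
def differential_uniformity(s, p):
    """Largest number of inputs x with S(x ^ a) ^ S(x) = b, over a != 0 and all b."""
    best = 0
    for a in range(1, 1 << p):
        counts = {}
        for x in range(1 << p):
            d = s[x ^ a] ^ s[x]
            counts[d] = counts.get(d, 0) + 1
        best = max(best, max(counts.values()))
    return best


def parity(v):
    """Parity of the number of set bits of v."""
    return bin(v).count("1") & 1


def linearity(s, p, q):
    """Largest absolute Walsh coefficient over input masks a and output masks b != 0."""
    best = 0
    for b in range(1, 1 << q):
        for a in range(1 << p):
            w = sum(1 - 2 * parity((a & x) ^ (b & s[x])) for x in range(1 << p))
            best = max(best, abs(w))
    return best


# Illustrative 3-bit quadratic permutation (an assumption, not from the talk).
example = [0, 3, 6, 1, 5, 4, 2, 7]
du = differential_uniformity(example, 3)
lin = linearity(example, 3, 3)
```

For this small example the sketch gives differential uniformity 2 and linearity 4. Both routines are exhaustive, which is fine for the 5-bit and 6-bit sizes discussed here.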
The eight best classes achieve differential uniformity 4 and linearity 8, which is not great, but it's also not bad for quadratic functions. Interestingly, the other classes have significantly worse differential uniformity or linearity.
So we can classify balanced quadratic Boolean functions with up to six input bits, but there is a problem: people don't like quadratic functions, because they don't always have good cryptographic properties. As I mentioned before, quadratic S-boxes are preferred for side-channel and MPC applications, but of course we also need S-boxes with good cryptographic properties. Since we need good properties but also want efficient and secure implementations, a good compromise is to use higher-degree functions that can be decomposed into a small number of quadratic functions. And we can adapt our classification methodology for this new goal.
In particular, we will now look for a function f and a representative r2 such that their composition is a particular higher-degree function h. For simplicity, let's first assume that r2 is known and that we are just looking for f such that f composed with r2 gives h. We will do this search iteratively again. In fact, we start from the exact same algorithm as before and change only three things. Firstly, instead of using all Boolean functions on p bits, we filter them and use only those that can be composed with r2 to form h. How do we discard the bad candidates? This is very easy if they have the wrong algebraic degree, or the wrong linearity or differential uniformity. Secondly, the same holds for the list of p by q representatives that we start from, and of course also for the ones that we create. And finally, instead of reducing the candidates to their affine equivalent representatives in each step, we need to reduce them to their left affine equivalent representatives, because we do not want to discard the affine function A that is required for the composition with r2.
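The degree check used to discard bad candidates can be sketched as follows. This is a minimal sketch assuming lookup-table input: the algebraic normal form of each output bit is obtained with a Möbius transform and the largest monomial degree is reported. The names `anf`, `algebraic_degree` and `keep_degree` are hypothetical helpers for illustration, not functions from the paper.

```python
def anf(truth_table):
    """Moebius transform: truth table of a Boolean function -> ANF coefficients."""
    coeffs = list(truth_table)
    step = 1
    while step < len(coeffs):
        for x in range(len(coeffs)):
            if x & step:
                coeffs[x] ^= coeffs[x ^ step]
        step <<= 1
    return coeffs  # coeffs[m] == 1 iff the monomial with variable mask m occurs


def algebraic_degree(s, q):
    """Largest monomial degree over the ANFs of all q output bits of the table s."""
    degree = 0
    for bit in range(q):
        coeffs = anf([(y >> bit) & 1 for y in s])
        for mask, c in enumerate(coeffs):
            if c:
                degree = max(degree, bin(mask).count("1"))
    return degree


def keep_degree(candidates, q, wanted):
    """Filter step: keep only the candidates with the wanted algebraic degree."""
    return [s for s in candidates if algebraic_degree(s, q) == wanted]


quadratic = [0, 3, 6, 1, 5, 4, 2, 7]  # illustrative 3-bit quadratic permutation
identity = list(range(8))              # affine, so degree 1
kept = keep_degree([quadratic, identity], 3, 2)
```

Here only the quadratic example survives the filter. The same pattern applies to the linearity and differential uniformity checks mentioned above.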
When we perform this algorithm for each representative r2, we obtain a list of composed functions with the same properties as h. If h is part of this list, we have found a decomposition. If it is not, then a length-2 decomposition does not exist. Instead of looking for a specific function h, we can also choose a set of desirable properties and compose functions to create S-boxes with those properties. For example, we used the algorithm to generate optimal 5-bit S-boxes with maximal algebraic degree and minimal differential uniformity and linearity, and the best part is that they can be efficiently implemented with a quadratic decomposition of length only 2.
In conclusion, we adapted the affine equivalence algorithm to be suitable for non-bijective functions, which allowed us to optimize the methodology for classifying vectorial Boolean functions. We were able to create the first exhaustive classifications of quadratic vectorial Boolean functions on six bits, both bijective and non-bijective. Even though these classifications are limited to quadratic functions, they are very useful, since recent implementations with side-channel security or multi-party computation are much more efficient for quadratic functions. And finally, for those who do not like quadratic functions, we also show a composition or decomposition algorithm for higher-degree functions.
Thank you for watching this presentation. The full list of representatives can be found on my website.