 Okay, thank you for the introduction. So this is joint work with my colleagues at ANSSI, Henri Gilbert and Joana Treger. So I will talk about the white-box cryptanalysis of the ASASA cryptosystem with expanding S-boxes. First, I want to apologize. This is not really block cipher cryptanalysis. This is more multivariate, public-key cryptanalysis, but it is a public-key block cipher. So what kind of block cipher is this exactly? First, the setting: the challenge is to design a public-key cipher which is close in terms of complexity to a private-key block cipher, so to mix public-key features with private-key features in order to bring the efficiency of the two closer to each other. So the approach was to look at the classic structure of block ciphers that alternates diffusion with substitution, so affine layers and S-box layers. And since the SASAS structure has been broken for 15 years now, the smallest still unbroken candidate was ASASA. So it means three affine layers interleaved with two S-box layers. In the approach of Biryukov, Bouillaguet and Khovratovich last year at Asiacrypt, instead of lookup tables as S-boxes, they use quadratic functions in order to be able to use this as a public-key cipher, which means that since the affine layers have degree 1 and the S-box layers have degree 2, you can just output the composition of all layers, which is made of quartic polynomials, and use this either as a public key or as a white-box implementation of the block cipher. So in this setting, public key and white box are more or less the same, so I will speak about white-box cryptanalysis of this. In their paper, they had two variants of the ASASA block cipher. So the ideal way to do this would be to use some random bijective S-boxes, but since you are using quadratic polynomials, when you compute quadratic polynomials that give bijective S-boxes, you basically lose the randomness.
So they had two solutions to do this. The first solution was to discard the randomness and use a fixed bijective S-box. So in this case, they chose to use the Keccak χ function. And the other idea was to discard the bijectivity and keep the randomness. So they chose to use injective S-boxes in order to have an injective block cipher, and to keep it random, they had to make it expanding. So the S-boxes take four inputs in a finite field and they give eight outputs. So this is not surjective at all, of course, but it is injective with relatively high probability, about one half for a random S-box. In the paper, they propose both schemes, but they sketch an attack on the χ scheme based on Gröbner bases. So they said, we think that the more solid of the two schemes is the expanding scheme. So in this talk, we give an attack on the expanding scheme in the white-box setting, so as a public-key cipher, and we give a very low complexity attack which uses about 60 million very small linear algebra computations. So it's about one million times a public evaluation, more or less. There has been a lot more recent activity on this scheme since its publication last year in December: at least three new cryptanalyses of this scheme. One of them, by people who are mostly present in this room, is on the other variant of ASASA, the χ variant. So this is broken with more or less similar complexity, and this is also algebraic cryptanalysis. The generic ASASA with lookup-table S-boxes is also reported as broken, but without, okay, I forgot about this. So these schemes are protected against decomposition attacks by the use of some perturbations. So we broke the perturbed version. The perturbed χ scheme is also broken, and without perturbation, it is broken even in a generic setting and even for much longer schemes than ASASA, since two of the authors of this scheme broke the much longer version, SASASASAS.
Okay, so how do they do this exactly? First, they start with the finite field F16, so with 16 elements. And they start with 32 elements of this field, which they first mix up with an affine layer. Then they split into eight S-boxes. Each S-box takes four elements as input and outputs eight. So here they get 64 elements instead of 32. They split again into, this time, 16 S-boxes, which are also four to eight. So now they get 128 elements. And the first two outputs of each S-box are absorbed with some perturbation polynomials, which are pseudo-random polynomials of degree four. So they get 128 outputs. Then there is the final affine layer, and the public key is the output. So the public key is just the composition of all these maps. These are all polynomials, so this is a quartic polynomial at the output. So to do the decryption: on the last layer, the S-boxes are perturbed by the perturbation polynomials, but most often, from only the six unperturbed outputs, you can get a small list of candidates for the inputs, and then you can filter with the first layer to keep only the good candidates. So the precise complexity of the decryption is a bit blurry. There is an expansion factor of four: the plaintext is 128 bits and the ciphertext is 512 bits. And okay, so the perturbation polynomials are a bit weird because they are mixed up with the S-boxes, but what we can do is very simple. Since two of the outputs of each S-box are immediately absorbed with perturbation polynomials, we can say, okay, these outputs are zero and the perturbation polynomials take care of the outputs that were absorbed. So for each private key, there is an equivalent key in which two of the outputs of each S-box are zero, and we can move this part of the S-boxes to the perturbation polynomials. So this means that each private key is equivalent to a private key of this shape, which is much simpler. When we move the perturbation polynomials here, the S-boxes are four to six.
The second layer of S-boxes is four to six, so this is 96 outputs, and we complete by adding the 32 quartic perturbation polynomials here and mixing all of them together with the affine layer. We can also cancel the constants in these affine layers and assume that they are linear, by moving the constants into the S-boxes and perturbation polynomials. So this is the scheme that we will attack. It is the same scheme, but the presentation is much simpler. So we will give this attack in two parts. First, we will give a very simple distinguisher, and then we will explain how this distinguisher develops into a full key-recovery attack. So this is a decomposition attack. We compute the middle layer, the output of the first ASA layer, and the tools we use are only linear algebra problems, no Gröbner bases today. This is not a multivariate crypto session. And the total cost of the attack is very low, since the distinguisher is only the rank of one big matrix and the key recovery is the rank of 60 million small matrices. So it's about a couple of CPU hours. So the distinguisher. When we look at the structure of the ASASA polynomials, I will use some names: I call X the inputs, Y the middle layer and PK the output layer, the public key. For simplicity, I will speak only about the homogeneous parts, of degree two here and degree four here, because this is the hardest part to obtain and to analyze. I will speak at the end about the non-homogeneous terms. So I write H_d for the space of homogeneous polynomials of degree d, and I'm mainly concerned with this point of attack, which is the vector space generated by these polynomials. So this is the vector space generated by 64 quadratic polynomials in the space H2 of homogeneous quadratic forms. And actually, instead of this space, we identify something slightly different, which is the set of products of these polynomials by all linear polynomials.
So these are polynomials of degree three, but it's still much smaller than the whole space, because it has about 2,000 polynomials in a space of dimension about 6,000, so we can still identify it. So first, we assume that we have no perturbation polynomials, and we look at how we obtain the cubic terms in the output. So how do we get them? The cubic terms in the output are made by this S layer, and this S layer consists of quadratic functions which use as inputs these quadratic terms here. So to make a term of degree three at the output, you have to multiply a term of degree two here by a term of degree one here. So all the cubic terms in the output belong to this space, because they are linearly generated by the products of the y_j by any linear form. So this space is something that we can attack, that we can identify. Okay, so there are only 128 outputs, so this is not enough to identify the space, but we can generate more elements of this space by looking at the translates of the public key. So what we do is that we translate the input by some constant delta, some differential delta, and we look at the translated public key, which is just a translation of the polynomials. And the property is still true, because if we translate by a constant here, we are not changing this space, because it is all the linear terms, and we are not changing this space, because these are the quadratic, leading terms of these S-boxes, and they are not modified by a translation. So these are the same. And so if we look at a family of n differentials, delta 1 to delta n, we obtain 128 times n elements of this space. So this is good news. It looks like we could obtain the whole space here just by taking enough differentials, but actually these terms are related to each other very simply.
And this is because the differentials are actually differentials, by which I mean that the differentials in the sense of differential cryptanalysis are actually differentials in the sense of polynomials, derivatives of polynomials. And this is because of the Taylor formula. Simply put, the Taylor formula is still almost valid in characteristic two. We need some slight adaptation to cancel out the denominators, but I'm mainly concerned with the first term, for which no adaptation is needed. It means that f of x plus delta is related to f of x plus the first-degree term of the Taylor formula here. And here, since we are looking at the cubic part, the cubic part of f of x plus delta minus f of x is this term, which is the derivative of f, and the part of the derivative of degree three comes from the homogeneous degree-four part of f. So to obtain the cubic terms in the output, we need to look only at the dominant terms, the quartic terms of f. So for simplicity, we now assume that everything is homogeneous of maximal degree, and we completely ignore, up to the last slide, all the inhomogeneous terms. And instead of a translation delta, we look at basically the same thing, which is a derivation, which is written as a formal sum of the derivatives with respect to each variable with some coefficients. And since, because of the Taylor formula, a translation and a derivation are really the same, we still have this differential relation, which is what I explained in the previous slide: for any derivation delta, delta of any of the public key polynomials belongs to this space Y·H1, so the product of the middle layer by all linear forms. So this was the unperturbed case. In the perturbed case, it is still, okay. So for the unperturbed case, we can write a very simple distinguisher. Namely, we take a family of 128 polynomials which are either uniformly random of degree four, or an ASASA public key, or rather its homogeneous part.
So we just write down the matrix where, for each polynomial, we write the first differential, so the derivative along x1 of this polynomial, then the derivative along x2 and so on, and we glue this together with the same matrix for each of the output polynomials. We look at the terms of degree three, so there are about 6,000 such terms, and we have 4,096 columns in this matrix. So as we just saw, if f is an ASASA public key, then all of these terms belong to Y·H1, which has dimension at most 2048. So if f is an ASASA public key, then the rank of this matrix is at most 2048. However, if the f are random polynomials, then we have a heuristic, which we are unable to prove but which we have validated with computer simulations, that says that with overwhelming probability, and really overwhelming is about two to the power of several billion, the rank of the matrix for random polynomials will generally be about 4,000, and only with very negligible probability below 2,000. So this is a distinguisher for the unperturbed ASASA scheme. So for the general case, with the perturbation polynomials, it's actually not much more complicated, because what happens is that if we take a derivative of the public key, it belongs to this space plus the derivatives of all the perturbation polynomials. So this space here still has dimension 2048, and this one has dimension 32 times 32, because there are 32 derivatives of 32 polynomials. So the total space here has dimension 3,072, which is still lower than the full rank of the matrix. So we can still write a distinguisher. We just replace 2,000 by 3,000 here and we still have a distinguisher. So we just compute the matrix of derivatives and the rank of this matrix, and we have a direct distinguisher. And actually we can slightly increase the number of perturbations, to at least 90 and probably 96, to keep it equal to the number of actual outputs.
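As a quick check of the numbers quoted for the distinguisher, here is a minimal Python sketch. It is not from the talk: the function names are made up for illustration, and it assumes that cubic monomials in 32 variables are counted as multisets of three variables, which is what gives the "about 6,000" figure.

```python
from math import comb

# Parameters as stated in the talk (simplified description of the scheme).
N_INPUTS = 32    # input variables over F16
N_MIDDLE = 64    # quadratic polynomials at the output of the first S layer
N_OUTPUTS = 128  # public-key polynomials
N_PERTURB = 32   # quartic perturbation polynomials


def cubic_monomials(n_vars):
    """Number of degree-3 monomials in n_vars variables (multisets of size 3)."""
    return comb(n_vars + 2, 3)


def derivative_count(n_outputs, n_vars):
    """One partial derivative per (public polynomial, variable) pair."""
    return n_outputs * n_vars


def unperturbed_bound(n_middle, n_vars):
    """Upper bound on dim(Y.H1): each middle polynomial times each variable."""
    return n_middle * n_vars


def perturbed_bound(n_middle, n_vars, n_perturb):
    """Add the partial derivatives of every perturbation polynomial."""
    return unperturbed_bound(n_middle, n_vars) + n_perturb * n_vars


if __name__ == "__main__":
    print("cubic monomials:", cubic_monomials(N_INPUTS))
    print("derivatives of the public key:", derivative_count(N_OUTPUTS, N_INPUTS))
    print("rank bound, unperturbed:", unperturbed_bound(N_MIDDLE, N_INPUTS))
    print("rank bound, perturbed:", perturbed_bound(N_MIDDLE, N_INPUTS, N_PERTURB))
```

Both rank bounds stay below the 4,096 derivatives and below the number of cubic monomials, which is what leaves room for a rank-based distinguisher.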
So from this we can write a key-recovery algorithm. The point of attack is the same, because we attack the middle layer Y·H1. We detected the presence of this space in the distinguisher, and now we are trying to compute it. Because of the perturbations, we are going to compute some spaces that are larger than this one, and then we are able to cut away the perturbations. So this is a decomposition attack, and it is made in two steps. And once we have the decomposition, it is known that the simpler ASA problem is broken. We give a specific attack that breaks it with very low complexity. So how do we do the decomposition attack? First we have to grow a space that will contain our attack point Y·H1, the middle layer. So first we use the differential relation that we used in the distinguisher, which says that for any derivation delta, delta of the public key is contained in this space Y·H1 plus the perturbations. So if we take just one derivation, then it will spawn some vectors that will be in this whole vector space, and if we increase the number of derivations, we get a slightly larger part of this space, and we also grow the part of perturbations that we do not want. Actually what happens is that if we take enough randomly generated derivations, then we are going to have enough vectors to generate the whole space of derivatives of the public key, because if we take n derivations, then all these derivations applied to all the public key polynomials give 128 n vectors. So this is the number of vectors we get, and the space they live in has a dimension which grows more slowly with n. So if we take n large enough, then the number of vectors will be significantly larger than the dimension of the space they live in. So at some point, and actually for n equals 22, with overwhelming probability these vectors will generate this whole space. So this is what happens here.
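The value n = 22 can be recovered by a simple count. The sketch below is an illustration, not the speaker's computation: it assumes the space the vectors live in has dimension at most 2048 + 32 n (the 2048 of Y·H1 plus one derivative per perturbation polynomial and per derivation), and the function name is hypothetical.

```python
def min_derivations(n_outputs=128, dim_core=2048, n_perturb=32):
    """Smallest n with n_outputs * n >= dim_core + n_perturb * n.

    n_outputs * n is the number of vectors obtained from n derivations;
    dim_core + n_perturb * n is the assumed bound on the dimension of the
    space they live in.
    """
    if n_outputs <= n_perturb:
        raise ValueError("the vector count never catches up with the dimension")
    n = 1
    while n_outputs * n < dim_core + n_perturb * n:
        n += 1
    return n


if __name__ == "__main__":
    n = min_derivations()
    print(n, 128 * n, 2048 + 32 * n)
```

With the talk's parameters, 21 derivations give 2,688 vectors for a bound of 2,720, while 22 derivations give 2,816 vectors for a bound of 2,752.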
We are generating a space which contains some space of perturbations plus the whole space Y·H1, which we are trying to compute. So when using 22 derivations, we are going to generate a space containing Y·H1. But we also have some unwanted perturbations here. So how do we get rid of them? It is very simple actually. We just do this several times. We get each time a different set of perturbations. So we have 22 perturbations instead of 32. So if we do this just four times, the perturbation part intersects away to zero and we get just the space Y·H1. So, up to a linear transformation, we get the output of the first ASA layer and the input of the second one. The total cost for this is negligible compared to what follows. So what follows is how to solve the simple ASA with low complexity. I will give the example of the inner layer, but the outer one is similar. So the inner layer is written in this form. I cheated a bit: this is not an actual matrix, because these are quadratic functions. So we first apply an affine, or linear, transformation, then quadratic functions by blocks, and then an outer affine layer. But this is not really cheating, because this actually becomes a linear function when you apply a derivation to it: since the S-boxes are quadratic, when you differentiate, they become linear. And so we now have an input linear transformation composed with this block matrix composed with an output transformation, and what is interesting is that these blocks are the derivatives of each S-box. So they use only the values of the differential delta applied to the inputs of this S-box, of which there are four. So the linear maps which are written as blocks depend only on four values of delta, at the inputs of the S-box. So if all of these values are zero, since the dependency is linear, then this block becomes zero. And this whole matrix, instead of having rank 32 in general, has rank 28.
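The rank drop from 32 to 28 can be illustrated on a toy model. The sketch below makes several simplifying assumptions that are not from the talk: the prime field GF(101) stands in for F16, so characteristic-two effects are not modelled; the S-boxes are random homogeneous quadratic maps given by symmetric polar forms; and the input and output linear layers are left out, since invertible maps do not change the rank. All names are hypothetical.

```python
import random

P = 101  # prime field standing in for F16 (assumption made for this sketch)


def rank_mod_p(rows, p=P):
    """Rank of a matrix over GF(p), by Gauss-Jordan elimination."""
    m = [list(r) for r in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] % p), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col] % p:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def random_sbox(rng, n_in=4, n_out=8):
    """Random quadratic S-box, stored as n_out symmetric n_in x n_in polar forms."""
    sbox = []
    for _ in range(n_out):
        b = [[0] * n_in for _ in range(n_in)]
        for i in range(n_in):
            for j in range(i, n_in):
                b[i][j] = b[j][i] = rng.randrange(P)
        sbox.append(b)
    return sbox


def derivative_matrix(sboxes, delta, n_in=4):
    """Block-diagonal matrix of the derivative of the S layer along delta.

    Block k depends linearly on the four coordinates of delta that enter
    S-box k, and on nothing else.
    """
    width = n_in * len(sboxes)
    rows = []
    for k, sbox in enumerate(sboxes):
        d = delta[n_in * k:n_in * (k + 1)]
        for b in sbox:
            row = [0] * width
            for a in range(n_in):
                row[n_in * k + a] = sum(b[a][j] * d[j] for j in range(n_in)) % P
            rows.append(row)
    return rows


def demo_ranks(seed=1):
    rng = random.Random(seed)
    sboxes = [random_sbox(rng) for _ in range(8)]       # 8 S-boxes, 4 -> 8
    delta = [rng.randrange(1, P) for _ in range(32)]    # generic direction
    vanishing = [0, 0, 0, 0] + delta[4:]                # zero on S-box 0
    return (rank_mod_p(derivative_matrix(sboxes, delta)),
            rank_mod_p(derivative_matrix(sboxes, vanishing)))


if __name__ == "__main__":
    print(demo_ranks())
```

In this toy model the 64 by 32 matrix has full rank 32 for a generic direction, up to a negligible failure probability, and the rank falls to 28 once the direction vanishes on the four inputs of one S-box, because that block is then identically zero.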
So we can detect whether the derivation delta vanishes on the four input coefficients of one of the S-boxes. So we just do this repeatedly. This happens with a probability which is large enough, two to the minus 16. And we may detect this very easily, just by computing the rank of a small matrix, of size 32 by 64. And how many matrices do we need to compute? We need 28 equations for each S-box, because it is a space of dimension four in a space of dimension 32. And we need this for each of the eight S-boxes. So the total number of matrices of which we need to compute the rank is about two to the 24 for the inner layer, and for the outer layer, which is a bit larger, we can show that it is about two to the 26 matrices of small size, 96 times 64. Okay, so I'm almost done. Now about the steps that I did not speak about. They are actually the easier steps. The first one, which was swept a bit under the doormat, is how we actually recover this space: I did not explain how we recover this one. So actually we have a heuristic that is somewhat related to the other one, which is that if we take a small number of uniformly random polynomials and we multiply them by all the linear polynomials, then the resulting set behaves, rank-wise, almost like a set of uniformly random polynomials. This is also validated only by computer experiments. But from this we are able to compute this set very easily from this one. And if we do not compute this set very precisely, it's not too bad, we can recover. The second part is that the outer ASA layer is not given explicitly as a quadratic map. It is given with inputs which are quadratic and outputs which are quartic, and how do we compute the quadratic map from one to the other? Okay, we just choose a monomial ordering and it's very easy. There is no linear algebra to be done. And the last part is, how do we recover the inhomogeneous part of the key?
Actually, if you write down the equations for each S-box, you find that the equations are linear in the terms of degree one and degree three. They are linear, and there are many more than enough of them to compute all the terms, so there is just a bit of linear algebra to be done here. So the total cost for all of these missing steps is very small compared to what we just did. Okay, and that's all. Thank you.