Thank you. I'm quite grateful to the previous speaker for introducing linear cryptanalysis and the DES structure. So Matsui's linear cryptanalysis is based on the distribution of the XOR of plaintext bits x1, ..., xs and ciphertext bits y1, ..., yd in Algorithm 1. In Algorithm 2, the bits x are inputs to the second round of the encryption and the bits y are inputs to the last round. We managed to find a method to compute the joint distribution of x and y, that is, of a vectorial random variable, and this was the starting point of our work. Both distributions, Matsui's distribution and our distribution, are approximations, and they depend on small sets of cipher key bits or linear combinations of the cipher key bits. And then, briefly, Algorithm 2 cryptanalysis is applied.

This is the outline of the rest of the talk. First I remind you about Matsui's Algorithm 2 and about the logarithmic likelihood ratio (LLR) statistic. Then I introduce a new statistic, which is more suitable for multidimensional cryptanalysis. Then I show that deciding on the values of the key bits relevant to the statistic may be represented as an instance of an optimization problem, and I introduce a search algorithm to solve this problem. Then I give some details on the implementation of the attack against 16-round DES, show how we manage to compute multidimensional distributions in Feistel ciphers, and then go to conclusions.

Okay, first about Matsui's Algorithm 2 and the conventional statistic. In Algorithm 2 by Matsui, the encryption is split into three parts: the first round of the encryption, the internal rounds, and the last round. Our vector is (x, y), and its distribution depends on some key bits which are relevant to the internal rounds. But the observation of x is not directly available, because it depends not only on the plaintext but also on some key bits from the first round, and the same holds for the observation of y. So one can say that the distribution of (x, y) and its observation depend on all the key bits shown here.

Now, the logarithmic likelihood ratio statistic is used to distinguish two distributions with densities P and Q from independent observations. This is the most powerful test according to the Neyman-Pearson lemma. The test says that we accept P if the value of this function, which is the sum over the observations of the logarithms of the ratio of P to Q, is larger than some threshold. The left-hand side function is called the logarithmic likelihood ratio statistic. Sorry, that was too fast. As already mentioned, the distribution of (x, y) depends on some key bits and the observation of (x, y) depends on some other key bits, so the LLR statistic should depend on all these key bits. This statistic may be used to distinguish the correct value of the relevant key bits from incorrect values. However, to do that we need very many computations of values of the statistic, so it is feasible only when the size of this key-bit set is relatively small. But when the vector (x, y) is large, its distribution and observation depend on very many key bits, and this is not efficient. So instead, in this work, we suggest using a new statistic, and with the new statistic the computation is much faster: in our experiment with DES it is a thousand times faster than with the conventional LLR statistic. This new statistic reflects the structure of the round function; it depends on how the rounds are implemented using the S-boxes.
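As a minimal sketch of the LLR test just described: accept P when the sum of log-ratios over the observations exceeds a threshold. The two distributions, the sample size, and the threshold below are illustrative placeholders, not values from the talk.

```python
import math
import random

def llr_statistic(observations, P, Q):
    """Log-likelihood ratio statistic: sum of log(P(x) / Q(x)) over the observations."""
    return sum(math.log(P[x] / Q[x]) for x in observations)

# Toy distributions over {0, 1, 2, 3}; the numbers and the threshold are illustrative.
P = {0: 0.30, 1: 0.30, 2: 0.20, 3: 0.20}
Q = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}

sample = random.choices(list(P), weights=list(P.values()), k=1000)
threshold = 0.0  # accept P when the statistic exceeds the threshold
print("accept P" if llr_statistic(sample, P, Q) > threshold else "accept Q")
```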
And of course, our new statistic is not optimal, because the LLR statistic is optimal according to the Neyman-Pearson lemma. So this statistic is not optimal, but since we can solve the statistical problem much faster with it, the trade-off is positive.

So now I explain how this new statistic is constructed. We have our vector (x, y) and we know its distribution. We take some sub-vectors of this vector, say m sub-vectors; I call these sub-vectors projections. They may be any functions of these bits. The projections are taken such that the distribution and the observation of each of them depend on a low number of key bits: these are the key bits relevant to the distribution of the projection, and these are the key bits relevant to its observation. Now we construct an LLR statistic for each of these sub-vectors, and then consider the vector of these LLR statistics. As we know the distribution of each sub-vector, it is easy to compute the distribution of this vector of statistics, and asymptotically it is distributed as a multivariate normal random variable. If the key is correct we have one distribution; if the key is incorrect the distribution is close to something opposite. These are the parameters of the distribution: some mean vector mu, a covariance matrix C, and the number of plaintexts N.

Now the problem is to distinguish these two normal distributions. We can again use an LLR statistic to do that. Usually, when you want to distinguish two multivariate normal distributions, the LLR statistic is quadratic. But in this case, because the covariance matrix is the same for both, the statistic degenerates to a linear one, so the statistic to distinguish these two is the sum of the statistics for each of the projections. One can say that this statistic is separable: it is represented as a sum of statistics in smaller numbers of variables. Such statistics were called separable, and the theory of such statistics was developed by Russian mathematicians quite long ago, but here we use only the notion of such statistics and not any theory about them.

Now, because we know the distribution of everything, we can compute the distribution of our main statistic. It is a normal distribution with expectation u and variance u, for some explicit positive u, if the key k is correct; if it is not correct, it is something opposite. Okay. So now, what do we need in cryptanalysis? We need to solve this inequality, that is, to find the values of k such that the inequality is satisfied, and then we accept those as correct ones. But we want to do that without brute forcing these key bits. That can be done because our statistic is a sum of functions each in a low number of variables. In our experiment with DES, the size of this whole set is 54 bits, and the size of these sets for every i is around 20. So this can be represented as solving an optimization problem, and the method to solve this optimization problem is a tree search.
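As a minimal sketch of the separable statistic described above, under one guess of the relevant key bits: it is simply the sum of the per-projection LLR statistics, each computed from the observed value counts of that projection. The projection distributions and counts below are illustrative placeholders, not values from the attack.

```python
import math

def projection_llr(counts, P_i, Q_i):
    """LLR statistic of one projection: sum over observed values v,
    weighted by their counts, of log(P_i(v) / Q_i(v))."""
    return sum(n * math.log(P_i[v] / Q_i[v]) for v, n in counts.items())

def separable_statistic(counts_per_projection, P, Q):
    """The separable statistic: sum of the per-projection LLR statistics."""
    return sum(projection_llr(counts_per_projection[i], P[i], Q[i])
               for i in range(len(P)))

# Two toy one-bit projections (placeholder distributions and counts).
P = [{0: 0.55, 1: 0.45}, {0: 0.48, 1: 0.52}]   # distributions for the correct key
Q = [{0: 0.50, 1: 0.50}, {0: 0.50, 1: 0.50}]   # distributions for a wrong key
counts = [{0: 560, 1: 440}, {0: 470, 1: 530}]  # observed counts under one key guess
print(separable_statistic(counts, P, Q))
```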
Now I discuss the optimization problem and introduce the search algorithm. The optimization problem is to find, say in this small example, binary x1, x2, x3 such that this function is larger than 1.3, where the functions s1, s2, s3 are real-valued functions defined on vectors of length two. This is the definition of s1, the definition of s2, and of s3. You can check that (1, 1, 1) is the only solution to this problem, because (1, 1, 1) chooses this column, this column, and this column, and the sum of the three values is 1.4, which is bigger than 1.3. The solution is found by working over a search tree. We start at the root, then we assign the value zero to x1, look at this inequality, and try to understand whether it is still feasible or not. If x1 = 0 is not feasible, we cut this branch and try x1 = 1 and check whether it is feasible; if it is feasible, we continue. So one has to check six linear inequalities in this example, while brute force takes eight, so this is faster. In the same way we solve what we actually need, this inequality. The output of this tree search is the set of key candidates for the final brute force. Since the distribution of the main statistic is known, we can compute theoretically the success probability of this attack and the number of solutions, that is, key candidates, to brute force. That is done theoretically.

Now I say a few words about implementation details for DES. The input to DES is a 64-bit block, split into two 32-bit blocks. This is the plaintext, this is the ciphertext after 16 rounds, and this is the cipher key. Matsui's best linear approximation for 14 rounds is given here; it is a sum of eight bits. These bits are part of the ciphertext, this bit is input to the last round, and these three bits are input to the second round. Matsui used the linear combination, that is, the XOR of all these eight bits, but we take all eight bits as a vector and add more bits, and we get a vector of length 14. We can compute the distribution of this vector, and by DES symmetry we can use another vector, also of length 14, whose distribution is the same after permuting the bits. As these two vectors incorporate different internal bits of the encryption, we consider them independent. Computing the distributions with our formulas took a few seconds.

Now, for each of these two vectors we take 14 projections; overall 28 projections, represented here. For each projection, the LLR statistic depends on at most 21 key bits, and overall the number of key bits involved is 54. We construct two separable statistics and apply the search algorithm, and the search algorithm reconstructs 54-bit key candidates for the final brute force. How it looks for one projection is shown here. We take projection H1, this vector of length 10, and all the relevant key bits; those key bits are, I remind you, the key bits which affect the distribution of this vector, and those key bits which affect the observation of this vector. This set incorporates 20 unknowns, written here; this is an unknown from the DES master key. So we can compute 2 to the power 20 values of the statistic S1, and the same is done for all the other 27 projections.

Now we want to construct the tree and run the search over it. For this we need to order the relevant key bits. The order is very easy: we take first the variable which appears in the maximum number of these subsets, then the second most common, x19, and so on. The order actually affects the complexity of the tree.
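Going back to the toy optimization problem above, here is a rough sketch of the tree search: a depth-first walk over partial assignments of x1, x2, x3 that cuts a branch as soon as the best value still reachable cannot exceed the threshold 1.3. The tables for s1, s2, s3 are hypothetical (the slide values are not in the transcript); they are chosen so that (1, 1, 1) is the only assignment with sum above 1.3. The node test here is a simple upper bound, not necessarily the exact pair of linear inequalities checked per node in the attack.

```python
# Hypothetical tables: si depends on two of the three binary variables.
s1 = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.5}   # s1(x1, x2)
s2 = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.5}   # s2(x2, x3)
s3 = {(0, 0): 0.2, (0, 1): 0.1, (1, 0): 0.3, (1, 1): 0.4}   # s3(x1, x3)
THRESHOLD = 1.3

def upper_bound(assignment):
    """Best value of s1 + s2 + s3 still reachable given the variables fixed so far."""
    def best(table, names):
        return max(table[bits] for bits in table
                   if all(assignment.get(n, b) == b for n, b in zip(names, bits)))
    return best(s1, ("x1", "x2")) + best(s2, ("x2", "x3")) + best(s3, ("x1", "x3"))

def search(order, assignment=None):
    """Depth-first tree search; a branch is cut when even the best completion
    cannot exceed the threshold."""
    assignment = assignment or {}
    if upper_bound(assignment) <= THRESHOLD:      # node test: branch is infeasible
        return []
    if len(assignment) == len(order):             # full assignment that survived all tests
        return [dict(assignment)]
    var = order[len(assignment)]
    return search(order, {**assignment, var: 0}) + search(order, {**assignment, var: 1})

print(search(["x1", "x2", "x3"]))   # -> [{'x1': 1, 'x2': 1, 'x3': 1}]
```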
Now we run the experiment. We fix the desired success rate, for example 83%; it is close to Matsui's, but a little bit different. Then we solve the equation in N, where N is the number of plaintexts. Another parameter here is the threshold. We choose the threshold for our statistic such that the number of plaintexts equals the number of keys to brute force in the final stage of the analysis. We got that the number of plaintexts is 2 to the power 41.8. Then we run the search tree algorithm. This is the complexity: in logarithmic scale, the number of nodes at each level of the tree. Finally, the number of key candidates in the relevant key bits is given by this number, and this is the number of keys to brute force; it is the same as here. This was done experimentally, which somehow gives confidence that our method is correct. However, the number of nodes in the tree is 2 to the power 45.5, which is larger than the complexity of brute force. But constructing one node is a very simple operation, just checking two linear inequalities with low-precision real numbers, so in bit operations constructing that many nodes in the tree is cheaper than brute forcing that many DES keys. So it improves Matsui's result on DES, which according to the previous talk was 2 to the power 43 with success rate 85%. But remark that, in contrast with Matsui's result, where the success rate was computed experimentally for eight rounds and then extrapolated to 16 rounds, our success rate is computed purely theoretically.

Now I want to say a few words, if I have time, about how to compute multidimensional distributions in Feistel ciphers. Assume that we have r-round DES: this is the plaintext, this is the ciphertext, this is the cipher key. Assume that X is random, so every bit in the encryption algorithm is a random variable, and let E be any event inside the encryption. We want to compute its probability, and I want to formalize this problem, to put it into mathematical form. The construction is the following: X0, X1, ..., X(r+1), the internal blocks in DES, are independent random 32-bit blocks. But that alone does not define a cipher, so we take some auxiliary event C which actually defines the cipher, given by these formulas, where F is the round function of DES and the Ki are fixed round keys. Now the probability of the event is actually the probability of the event in this scheme under the condition C. By the conditional probability formula we can write this ratio, and the probability of the event C is very easy to compute: it is 2 to the power minus this number. So we want to find this probability. But as this event depends on all key bits of the cipher, it is a very complicated function and actually infeasible to compute, so we need some approximation.

That may be done by relaxing the event C which defines DES. We take a larger event, in the sense that the main event C, which defines the cipher, implies the event C_alpha given by these formulas, where alpha_i is a subset of indices for each sub-block in DES, and alpha is the vector of these subsets of indices. The probability of this event is also very easy to compute and is given by this formula. Now, instead of the exact probability, we compute its approximation. Why is this an approximation? Because if we extend, that is, take larger subsets of indices, C_alpha tends to C. But we accept that it is an approximation. For this conditional probability we use the same formula, and then we need to compute the probability of our target event together with this event. That was actually done in Matsui's work implicitly, but I do it explicitly here, and it is more feasible because this event depends only on a low number of key bits.
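To make the construction concrete, here is a toy sketch on a 2-round Feistel cipher with 4-bit half-blocks: X0..X3 are independent uniform blocks, the event C (the round formulas) defines the cipher, and C_alpha keeps only a subset of bit positions of each formula. The S-box, round keys, target event E, and index sets alpha are all made-up placeholders, not anything from DES or from the talk; the conditional probabilities are computed by exhaustive enumeration, which is only feasible at this toy size.

```python
from itertools import product

SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]   # toy 4-bit S-box
K = [0x6, 0xB]                                     # fixed toy round keys

def F(x, k):
    return SBOX[x ^ k]

def masked_eq(a, b, bits):
    """Equality restricted to the bit positions in `bits` (relaxation to C_alpha)."""
    mask = sum(1 << b for b in bits)
    return (a & mask) == (b & mask)

def conditional_prob(alpha):
    """P(E and C_alpha) / P(C_alpha) by exhaustive enumeration of the independent
    random blocks X0..X3.  alpha[i] lists which bit positions of the round-i
    formula X_{i+2} = X_i XOR F(X_{i+1}, K_i) are kept in the relaxed event."""
    num = den = 0
    for X0, X1, X2, X3 in product(range(16), repeat=4):
        c_alpha = (masked_eq(X2, X0 ^ F(X1, K[0]), alpha[0]) and
                   masked_eq(X3, X1 ^ F(X2, K[1]), alpha[1]))
        if c_alpha:
            den += 1
            num += (X2 & 1) == 0      # toy target event E: low bit of X2 is zero
    return num / den

print("P(E | C)       =", conditional_prob([[0, 1, 2, 3], [0, 1, 2, 3]]))  # exact cipher
print("P(E | C_alpha) =", conditional_prob([[0, 1], [0]]))                  # relaxed event
```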
Now I should say something about trails. Assume that we want to compute the distribution of this vector: these two sub-vectors are input to the encryption function, and this is the output after r rounds; those are some indices. We define some event C_alpha, and our approximation depends on how we choose this event. Assume that we take some event C_alpha; then we construct a trail. The trail consists of some input bits of the F_i and some output bits of the F_i. We call the trail regular if this condition is satisfied, where the X_i gamma_i are the input bits relevant to F_i and alpha_i. This condition is very similar to Matsui's condition: in Matsui's condition, if you remember, this beta is the XOR of these two. But here we consider not a linear approximation but vectorial random variables, so we need a slightly different condition, though the sense is the same: we need some condition on the trail. For regular trails, which satisfy this condition, we can compute the distribution of this vector by a convolution-type formula, given on the slide. So Z, I remind you again, is the vector whose distribution we want to compute: we want to compute the probability that Z equals some vector A, which means that this part equals this part, this part equals this part, and so on, under the condition C_alpha. This is an exact formula, a sum of products of probabilities from the round sub-vector distributions, which are computed from the definition of the DES round, and these are the relevant key bits. This formula is quite complicated to handle because there are a lot of summands in it. But due to the round structure of the cipher, we can split the encryption into two parts and compute it by parts, and in this case it is a very easy computation: only a few seconds for 14-round DES to compute the distribution of a rather large vector, actually.

Okay, almost done; conclusions. In this work we suggest a method of computing joint distributions of encryption internal bits x and y for Feistel ciphers; for SPNs it is about the same, the same logic may be applied. We realized that the conventional logarithmic likelihood ratio statistic is not efficient here, so a new statistic was suggested which reflects the structure of the round function. We computed its distribution and are able to predict the success probability and the size of the final brute force. Also, an efficient search algorithm to find the key candidates which fall into the critical region is suggested. So we got an improvement of Matsui's result on DES, at least in bit operations. We were able to check that our predicted success probability is the real success probability for eight rounds, and likewise the number of final key candidates for 16-round DES. And our search algorithm to solve the optimization problem is 1000 times faster than brute forcing all relevant key bits. This is the end of my talk. Thank you very much.
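A minimal sketch of the split-and-combine computation described above: the distribution of the target vector is obtained by summing, over the shared intermediate sub-vector, the product of the distributions computed separately for the top and bottom halves of the cipher. The dictionaries below are toy placeholders, and the bottom-half table is assumed to be a conditional distribution given the intermediate sub-vector; the exact formula from the slide is not reproduced here.

```python
from collections import defaultdict

def combine(top, bottom):
    """Split-and-combine step: top[(x, m)] = P(top-half vector = x, intermediate = m),
    bottom[(m, y)] = P(bottom-half vector = y | intermediate = m);
    returns P(x, y) = sum over m of top[(x, m)] * bottom[(m, y)]."""
    out = defaultdict(float)
    for (x, m), p in top.items():
        for (m2, y), q in bottom.items():
            if m == m2:
                out[(x, y)] += p * q
    return dict(out)

# Toy 1-bit sub-vectors: placeholder numbers, not DES round distributions.
top = {(0, 0): 0.30, (0, 1): 0.20, (1, 0): 0.20, (1, 1): 0.30}      # P(x, m)
bottom = {(0, 0): 0.60, (0, 1): 0.40, (1, 0): 0.45, (1, 1): 0.55}   # P(y | m), keyed (m, y)
print(combine(top, bottom))   # joint distribution P(x, y)
```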
Questions for Igor? I have one. Can you go back to the slide with the number of nodes? Which one? Yep, here. So I can see the number of nodes is much higher than the complexity of brute force. And when you build the tree, do you need to actually sort the nodes? Do you need a binary search tree? It is a binary search tree. We construct nodes, but constructing a node is a very simple operation; it is only a few XORs. Yeah, but for each insertion there will be some more operations according to the depth of the tree, right? No. This is the number of nodes, and the complexity of the tree is, of course, proportional to the number of nodes, but the complexity of constructing one node is a few XORs. Yeah, I mean, constructing the entire tree, is its complexity n times log n? No, no, the complexity of the tree is the number of nodes. We have some node and we get some new node, and we should check two linear inequalities at this node. If they are satisfied, we continue constructing nodes; if not, we cut this branch. So the complexity at each node is checking two linear inequalities with real numbers. Okay, okay. More questions for Igor? If not, let's thank him again. Okay. Thank you.