Hi, and thank you for listening to this presentation about collision attacks on small Keccak. This work was done together with Yann Rotella at the University of Versailles. Keccak is a hash function designed by Guido Bertoni, Joan Daemen, Michaël Peeters and Gilles Van Assche. After it won the SHA-3 competition in 2012, four of its instances were standardized under the name SHA-3. It uses a permutation-based mode of operation called the sponge construction. The underlying permutation is called Keccak-f[b], where b is the state length in bits. For standardized instances, the state has length 1,600 bits, but today we are going to focus on so-called small Keccak, where the state has length at most 400 bits.

So what motivated our analysis of small Keccak? First, the Keccak authors have organized cryptanalysis challenges on round-reduced Keccak instances. Commenting on the results of this contest, they noticed that the smaller versions seemed harder to break. Further, small Keccak hash functions have been proposed for use in constrained environments, such as RFID. Here is a table which sums up the published collision attacks on Keccak hash functions. As you can see, analyses so far have mainly focused on standardized, and thus large, Keccak instances. On the other hand, for the smallest instance of the Crunchy contest, here on the bottom line, collisions have only been found when the permutation is reduced to one round. We designed the first attack on three small Keccak instances with the permutation reduced to two rounds. Our attack, although not practical, is significantly more efficient than the generic attack. It has been implemented and verified on toy versions, and the practical complexities obtained match the theory.

So first I'm going to give a brief overview of the mode on which Keccak is based, the sponge construction. The sponge construction uses a permutation F, which operates on a state of length b. The state is divided in two parts.
The first r bits are called the outer state and the last c bits are called the inner state. r is called the rate and c is called the capacity. The message is first padded, then divided into blocks of r bits. The message is then absorbed by XORing the message blocks into the outer part of the state, with in-between applications of F. Next, in a squeezing phase, output blocks are generated by outputting the outer part of the state, with in-between applications of F. Once a sufficient number of output blocks have been generated, they are concatenated and truncated to the desired output length, which we call d.

So as you can see here, getting a collision on the output may require getting a collision on several output blocks, until their concatenation is big enough to be truncated to this desired output size. And this can be tricky because of the in-between applications of the permutation F. But in fact, this is not the case for standardized instances. This is because the output length is smaller than the rate, which means that to get a collision on the output, an attacker only needs a collision on one output block. On the other hand, in the case of small instances, the output length is bigger than the rate. This means that to get a collision on the output, an attacker would need collisions on several output blocks. Instead, a good strategy is to focus on getting collisions on the inner part of the state, the capacity part of the state. We call such collisions inner collisions. And we are going to show in a minute that inner collisions can very easily be propagated to all of the output blocks, and thus to the output of the hash function. Note that trying to get inner collisions would have been a very ambitious strategy in the case of the standardized instances, because the capacity is twice as big as the output size.
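The absorb-squeeze flow just described can be sketched in a few lines of Python. This is a minimal toy, not Keccak: the 16-bit state, rate 8, capacity 8 and the ad-hoc permutation F are all illustrative stand-ins, and padding is omitted for brevity.

```python
# Toy sponge construction: b = 16-bit state, rate r = 8, capacity c = 8.
# The permutation F and all parameters are illustrative stand-ins, not Keccak-f.

MASK16 = 0xFFFF

def F(x):
    # A small ad-hoc 16-bit permutation (each step is invertible).
    x = (x * 40503) & MASK16             # multiplication by an odd constant
    x ^= x >> 7                          # xorshift
    x = ((x << 3) | (x >> 13)) & MASK16  # rotation
    return x

def sponge_hash(blocks, d_bits=16):
    """Absorb 8-bit message blocks, then squeeze d_bits of output.
    Padding is omitted: `blocks` must already be 8-bit values."""
    state = 0  # outer part = high 8 bits, inner part = low 8 bits
    for m in blocks:                     # absorbing phase
        state = F(state ^ (m << 8))      # XOR block into the outer part, apply F
    out, produced = 0, 0
    while produced < d_bits:             # squeezing phase
        out = (out << 8) | (state >> 8)  # output the outer part
        produced += 8
        state = F(state)
    return out >> (produced - d_bits)    # truncate to d bits

h = sponge_hash([0x12, 0x34])
```

Note how, exactly as in the description above, a 16-bit digest already requires two output blocks here, because the output length d exceeds the rate r.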
On the other hand, for small instances, the capacity is equal to the output size, which means that inner collisions have the same generic security as output collisions. So how can inner collisions be propagated to all output blocks? Suppose you get an inner collision after absorbing the first i blocks of two distinct messages; in this figure i equals 3 and the inner collision is in blue. You can easily choose one last block to make sure that, going into the next application of F, not only the inner parts of the states but the whole states will collide. Thus the states will collide during the whole squeezing phase, and so will the output blocks: the inner collision is propagated to every output.

So at this point, I can give you a general description of the attack. The general idea is to generate inner states that all belong to a proper subset of F_2^c, and then to apply a classical birthday attack on this subset. To generate said inner states, we generate several random long messages and thus get random inner states. Then for each inner state, which we call S in this figure, we exploit the properties of the permutation F to find a message block M such that the inner part of the state obtained by XORing M into the outer part and applying F belongs to a proper subset of F_2^c. In order to understand how to generate a good message block for each random inner state produced, we now need to understand how F works. In the next section, I'm first going to quickly describe how Keccak-f works.

So as you know, F operates on a state of length b. b is in fact divisible by 25, and the state can be represented in three dimensions. For each Keccak instance, both the columns and the rows are five bits long, but the length of the lanes, on the other hand, varies. In our case, we are interested in lanes of length 8 and 16, which means that the states we consider have length either 200 or 400 bits. A round of Keccak-f[b] is the composition of five mappings: theta, rho, pi, chi and iota.
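Going back to the propagation step for a moment, the choice of the last message block can be made concrete on the same kind of toy 16-bit sponge. The states S and S' below are hypothetical values that already agree on their inner part; the permutation F is again an arbitrary stand-in, not Keccak-f.

```python
# Propagating an inner collision to every output block (toy 16-bit sponge,
# outer part = high 8 bits, inner part = low 8 bits; F is an ad-hoc stand-in).

MASK16 = 0xFFFF

def F(x):
    x = (x * 40503) & MASK16
    x ^= x >> 7
    x = ((x << 3) | (x >> 13)) & MASK16
    return x

def squeeze(state, n_blocks=4):
    out = []
    for _ in range(n_blocks):
        out.append(state >> 8)  # output block = outer part
        state = F(state)
    return out

# Suppose absorbing two distinct message prefixes led to states S and S'
# that agree on their inner (low) 8 bits: an inner collision.
S, S_prime = 0xA342, 0x1742  # hypothetical states, same inner part 0x42

# Choose the last message blocks so that the full states collide after the XOR:
# any m works for S if we pick m' = m ^ outer(S) ^ outer(S').
m = 0x00
m_prime = m ^ (S >> 8) ^ (S_prime >> 8)

t = F(S ^ (m << 8))
t_prime = F(S_prime ^ (m_prime << 8))
assert t == t_prime                     # full-state collision
assert squeeze(t) == squeeze(t_prime)   # so every output block collides
```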
We will not describe iota, because it simply consists in the addition of round constants and is not relevant for our attack, since we are studying collisions. So the first mapping is theta. It maps each bit to itself, XORed with ten other bits located in two other columns of the state. The two columns added depend solely on the column to which the bit considered belongs. This means that if two bits are located in the same column, theta will add the same value to them. The permutation rho acts independently on each lane of the state: it rotates each lane. The permutation pi consists in a reorganization of the lanes of the state. Finally, chi is the only non-linear mapping. It is of algebraic degree 2. It acts independently on each row of the state and maps each bit to itself, XORed with the AND of two other bits of its row: the complement of one bit of the row ANDed with another bit of the row.

So eventually, remember, we want to launch a birthday attack and send each inner state to a proper subset. But to understand how we chose this subset, now that we have a better understanding of how Keccak-f works, we are going to go back and study what it means for two states to present an inner collision, and we are going to see which properties of F can be exploited to generate them. So the problem is as follows. From two random inner states S and S', we wish to find two message blocks M and M', such that we obtain an inner collision after applying F. This problem can be modelled by a system of c equations which depend on the bits of M, M', S and S'. Since F is two rounds of Keccak-f, our equations have degree 4. At first sight, it is thus a hard system to solve. Yet we will see that this system is in fact equivalent to a system of degree only 2. Please take a quick look at where the inner state is located when we represent the state in three dimensions. It is represented in blue in the right-hand figure.
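For concreteness, here is chi on a single 5-bit row as specified for Keccak, out[i] = a[i] XOR (NOT a[i+1] AND a[i+2]) with indices mod 5, together with a check that it is indeed a degree-2 permutation of the 32 possible row values:

```python
# The chi mapping on one 5-bit row, as specified for Keccak:
# out[i] = a[i] XOR (NOT a[i+1] AND a[i+2]), indices mod 5.
# It has algebraic degree 2 and is a permutation of the 32 row values.

def chi_row(a):
    return [a[i] ^ ((1 ^ a[(i + 1) % 5]) & a[(i + 2) % 5]) for i in range(5)]

def to_bits(v):
    return [(v >> i) & 1 for i in range(5)]

images = {tuple(chi_row(to_bits(v))) for v in range(32)}
assert len(images) == 32  # chi is invertible on each row
```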
So here we study inner collisions at the slice level. Here in blue, an inner-state collision after two rounds of Keccak-f is represented. Now, since chi is a permutation which acts independently on each row of the state, having a collision on a row after chi is equivalent to having a collision on that row before chi. Pi consists simply in a reorganization of the lanes of the state. Thus, having a collision on a lane after pi is equivalent to having a collision on a different lane before pi, a lane that can be very easily traced back. Last but not least, rho rotates each lane independently. So once again, having a collision on a lane after rho is equivalent to having a collision on that lane before rho. So we have gone back up through one of the two applications of the only non-linear mapping chi, and the previous system S is thus equivalent to a system of degree only two.

It is not so clear, however, how to continue to go back up through the next mapping, namely theta, because it does not simply consist in a reorganization of the bits of the state. However, this difficulty can be overcome by exploiting properties of theta. So recall that if two bits are located in the same column, theta will add the same value to these two bits. This implies that the sum of these two bits before theta will be equal to the sum of these two bits after theta. By exploiting this property of theta, we are able to derive necessary conditions on the difference between two states for them to present an inner collision one round later. We show that if two states collide on a column after the application of theta, then necessarily the difference between these two states before theta is constant on the columns. So here on this slide, we show our property for states colliding on four bits of a column after theta. The reasoning is as follows: if two states collide on certain bits of a column, then they also collide on the sums of these bits.
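The column property of theta invoked here can be checked directly on a small implementation of the real theta mapping, written here for the 200-bit instance with lanes of length 8: since theta adds the same column-parity value to every bit of a column, the XOR of any two bits of a column is unchanged.

```python
# Toy check of the theta property: theta adds the same value to every bit
# of a column, so pairwise sums (XORs) within a column are preserved.
# State: 5 x 5 x w bits with w = 8 (the 200-bit instance), indexed A[x][y][z].
import random

W = 8

def theta(A):
    # Column parities C[x][z], then
    # A'[x][y][z] = A[x][y][z] ^ C[x-1][z] ^ C[x+1][z-1].
    C = [[0] * W for _ in range(5)]
    for x in range(5):
        for z in range(W):
            for y in range(5):
                C[x][z] ^= A[x][y][z]
    B = [[[0] * W for _ in range(5)] for _ in range(5)]
    for x in range(5):
        for y in range(5):
            for z in range(W):
                B[x][y][z] = (A[x][y][z]
                              ^ C[(x - 1) % 5][z]
                              ^ C[(x + 1) % 5][(z - 1) % W])
    return B

random.seed(1)
A = [[[random.randint(0, 1) for _ in range(W)] for _ in range(5)] for _ in range(5)]
B = theta(A)
# For any two bits in the same column (same x, same z), the XOR is unchanged.
ok = all(A[x][y1][z] ^ A[x][y2][z] == B[x][y1][z] ^ B[x][y2][z]
         for x in range(5) for z in range(W)
         for y1 in range(5) for y2 in range(5))
assert ok
```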
Since the sum of two bits of the same column after theta is equal to the sum of these bits before theta, the fact that they collide on sums of bits after theta is equivalent to them colliding on sums of bits before theta. Lastly, it is very easy to show that states colliding on sums of bits before theta means having a constant difference on these bits, again before theta. More rigorously, having a constant difference on k bits of a column is equivalent to satisfying k-1 equations of our system S.

However, recall that our goal was to apply a birthday attack. Therefore, to exploit this property of theta, we need to create a set in which any pair of states has a constant difference on their columns. To do so, we decided to simply produce states that are constant on their columns in value. It is then straightforward that if we generate a set of states that are all constant on columns, then the difference between any two of these states is also constant on columns. Producing such states is not straightforward, because it corresponds to solving a system of c equations of algebraic degree 2, due to the only non-linear mapping chi, and we decided to linearize it. To linearize chi, we used two of its very well-known properties. Recall that chi works on rows. If you allocate a value to an input bit of a row, for example you fix the value of one bit to 0, then two output bits can be expressed linearly, and one output bit takes the same value with probability 3 out of 4. I have now given all the ingredients of our attack. In order to complete the attack, the remaining problem is to optimize allocation strategies in order to construct a good linear system which satisfies as many equations as possible. I won't go too much into the details of how we optimized our allocation strategies, but I will give an example.
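These two properties of chi can be verified by exhaustive enumeration on one row. The choice of fixing a[2] below, rather than some other position, is just for illustration; by the rotational symmetry of chi the same holds for any bit of the row.

```python
# Linearizing chi: fix one input bit of a row to 0 and check the two
# properties used in the talk. With a[2] = 0 and
# out[i] = a[i] ^ (~a[i+1] & a[i+2]) (indices mod 5):
#   out[0] = a[0]            (linear)
#   out[1] = a[1] ^ a[3]     (linear)
#   out[2] = ~a[3] & a[4], which equals 0 with probability 3/4
from itertools import product

def chi_row(a):
    return [a[i] ^ ((1 ^ a[(i + 1) % 5]) & a[(i + 2) % 5]) for i in range(5)]

count_zero = 0
for bits in product([0, 1], repeat=4):
    a = [bits[0], bits[1], 0, bits[2], bits[3]]  # a[2] fixed to 0
    out = chi_row(a)
    assert out[0] == a[0]          # first linearized output bit
    assert out[1] == a[1] ^ a[3]   # second linearized output bit
    count_zero += (out[2] == 0)
assert count_zero == 12            # 12 of the 16 inputs: probability 3/4
```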
This step is very important because we have a very small number of degrees of freedom at our disposal in small Keccak. For example, for Keccak with a 200-bit state and a 160-bit inner state, which is the example I'm going to focus on in a second, we only had 40 degrees of freedom, but 160 equations to satisfy. So I will start by giving an example of an allocation strategy on a slice. Here we allocate the value 0 to 3 bits before chi, the bits in blue. This allows us to linearize the value of 6 bits after chi. This means adding 3 linear equations to our linear system, and costs us 3 degrees of freedom. In order to ensure consistency on the yellow bits, rather than adding 6 equations to our linear system, we only need to add 4: because we do not care about the value of the constant, we can simply construct linear equations by summing the expressions of the yellow bits and setting those sums equal to 0. This allows us to save 2 degrees of freedom. Lastly, note that with probability 5 out of 8, rather than 1 out of 2, we satisfy an extra equation of S.

So let's now give an example of an allocation strategy on a state. Again, the Keccak instance considered here is Keccak with state length 200 bits and capacity 160 bits. On 5 slices, we set 3 bits to 0, and on 1 slice, we set 2 bits to 0. This means that we get a linear system of 39 linear equations. Then, we produce states by solving this linear system, and thus we get an output set such that for any pair of this output set, 21 equations of S are satisfied automatically, and 6 equations of S are each satisfied with probability 17 out of 32, so jointly with probability (17/32)^6. We can thus compute the probability p for any two states of this set to collide. Applying a birthday attack, we get a time complexity of 2^70 times a constant G that we will specify in a minute. Roughly, G can be understood as the cost of finding a solution to the linear system. So how do we compute G?
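As an aside, the 2^70 birthday exponent can be reproduced with a quick back-of-the-envelope computation. One assumption below is mine and not stated explicitly in the talk: the remaining 160 - 21 - 6 = 133 equations are taken to hold with probability 1/2 each for a random pair of states.

```python
# Sanity check on the birthday-attack exponent for the 200-bit instance
# (c = 160). Assumption (not stated explicitly in the talk): besides the
# 21 equations satisfied automatically and the 6 satisfied with
# probability 17/32 each, the remaining 160 - 21 - 6 = 133 equations each
# hold with probability 1/2 for a random pair of states in the set.
import math

c = 160
auto = 21     # equations satisfied for free
biased = 6    # equations each satisfied with probability 17/32
p_pair = (17 / 32) ** biased * 2.0 ** -(c - auto - biased)

# A birthday attack needs about sqrt(1/p_pair) states, i.e. that many
# linear-system solutions, each costing G two-round evaluations.
log2_states = 0.5 * math.log2(1 / p_pair)
print(round(log2_states, 1))  # about 69.2, i.e. 2^70 up to a small constant
```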
First, note that its value does not depend on the rank of the linear system. This is because, to conduct our attack, we solve the linear system a large number of times, and on average, each Gaussian elimination provides 2^(r-e) solutions. Further, we can pre-compute the Gaussian elimination before conducting the attack. This means that looking for a solution to the linear system corresponds simply to a matrix-vector multiplication, that is, e times c operations. G is thus equal to e times c, divided by the number of solutions provided by each matrix-vector multiplication, which is again 2^(r-e), divided by the number of operations of Keccak-f reduced to two rounds, which we call N_0. In the case of our attack example, the final time complexity is thus 2^73. Note that we do not discuss the memory complexity of our attack, because it can be greatly reduced thanks to Floyd's cycle-finding algorithm.

So as a conclusion, we took up the challenge of the Keccak team, who, as I recalled, observed that the smaller versions seem harder to break. Our cryptanalysis suggests that their statement is true, because even two rounds required a substantial effort. Most importantly, we believe that small Keccak instances require dedicated analysis. Thank you very much for listening to this presentation.