On cryptanalysis, we'll have two talks. The first one is "Improved Differential-Linear Cryptanalysis of 7-Round Chaskey with Partitioning", and the paper is by Gaëtan Leurent, and obviously he's giving the talk. Thank you. Thank you. So I'm going to talk about cryptanalysis of Chaskey. So first of all, what is Chaskey? Chaskey is a MAC algorithm that was designed by Mouha et al. and presented at SAC 2014. So what does this mean? Well, a MAC algorithm is something you use to provide authenticity of messages. So it's used when you don't want to encrypt your messages, but you want to make sure they're authentic. It can also be combined with encryption, of course, to build authenticated encryption, and then you get all the security features. So one typical use case would be if you have something like a network of sensors, say you're measuring the temperature somewhere, and you want to collect all the data and make sure it is authentic, that someone is not sending you false data. So you use this MAC algorithm to authenticate the data. And you don't actually need encryption, because a temperature reading is something public; anybody can do their own measurements. So there's no need to encrypt the messages, but you want to make sure they're authentic. That's when a MAC algorithm will be used. The way a MAC algorithm works, you have some function that takes a key and a message and gives you a short tag as output. Then you send the tag together with the message, and the recipient of the message can recompute the tag with his own copy of the key and verify that the tag is correct. So in the case of Chaskey, it's a MAC algorithm that's optimized for microcontrollers. So it's a lightweight design, and actually one of the main design goals was to be 10 times faster than AES on microcontrollers. So that's a very aggressive design goal.
And it's also interesting to look at Chaskey because Chaskey is currently under consideration for standardization by ISO. So it makes sense as a community to look at this design and see how secure it is. In terms of design, Chaskey looks like this picture here. You can see it's a permutation-based design: you have this big public permutation pi, you iterate it, and you XOR message blocks into the state at every iteration. And actually, the way it's constructed, you can see it as CBC-MAC on top of an Even-Mansour cipher. But in terms of design, it's really just this permutation-based thing. The size of the state is 128 bits; that's also the size of the key. In terms of security, you accept birthday security, which means that any attack should have a time-data product of at least 2^128. But more concretely, the designers of Chaskey limit the amount of data you can authenticate with a single key to 2^48, which means that any attack should require time at least 2^80. More precisely, if you want to understand the design, you have to look inside the pi permutation. What you find inside is this thing here. As I said, the state is 128 bits; you consider it as four words of 32 bits, and then you do a series of operations on those words. And you only do very simple operations: additions modulo 2^32, bitwise rotations, and XORs. Those three operations are very simple in software, and they're very fast. When you combine them, you get what we call ARX designs, and you expect that the mixing of those different operations gives you some security. So more precisely, this is exactly the set of operations that you do, and you can see it's the same structure as SipHash. In the case of Chaskey, you do eight of those rounds. So that's Chaskey. Now, in terms of cryptanalysis, if we want to study the design, we'll see what happens.
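As a concrete sketch of the round structure just described, here is one round of the pi permutation in Python. This is a reconstruction from the SipHash-like picture on the slide; the rotation constants are the ones I believe appear in the Chaskey specification (Mouha et al., SAC 2014), so check them against the paper before relying on this:

```python
MASK = 0xFFFFFFFF  # all word arithmetic is modulo 2^32

def rotl(x, r):
    """Bitwise left rotation of a 32-bit word."""
    return ((x << r) | (x >> (32 - r))) & MASK

def chaskey_round(v0, v1, v2, v3):
    """One round of the Chaskey pi permutation: only additions mod 2^32,
    rotations and XORs on four 32-bit words.  The full pi iterates 8 of
    these rounds (12 in the later Chaskey-12 proposal)."""
    v0 = (v0 + v1) & MASK; v1 = rotl(v1, 5);  v1 ^= v0; v0 = rotl(v0, 16)
    v2 = (v2 + v3) & MASK; v3 = rotl(v3, 8);  v3 ^= v2
    v0 = (v0 + v3) & MASK; v3 = rotl(v3, 13); v3 ^= v0
    v2 = (v2 + v1) & MASK; v1 = rotl(v1, 7);  v1 ^= v2; v2 = rotl(v2, 16)
    return v0, v1, v2, v3
```

Each operation here is one cheap instruction on a 32-bit microcontroller, which is where the aggressive performance target comes from.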
So we're going to look at security properties of the pi permutation. A simple way to do this, in terms of the MAC algorithm, is to use single-block messages. When you have a single-block message, the authentication is just this very simple thing here: you take the message, you XOR a key, you permute, and you XOR another key. And you can see it as an Even-Mansour cipher, actually; it's just the encryption of an Even-Mansour cipher. But there's something interesting: you cannot do decryption. I mean, if you have a real Even-Mansour cipher, you expect you can do both encryption and decryption. Here, because it's a MAC, you can only take a message and get the tag; you cannot do the inverse. If you look at previous work on the security of this permutation, the designers showed some bias if you reduce to four rounds instead of eight, and this can probably be extended to an attack on five rounds if you do some extra work. So now, let's see what we do in this work. If you try to do cryptanalysis in symmetric-key cryptography, there are two very important techniques, of course: differential cryptanalysis and linear cryptanalysis. I'm not going to go through the details, but the basic idea is that in differential cryptanalysis, you look at differences. You start with some difference in your plaintexts and you try to predict the difference in the ciphertexts. If you have an input difference and an output difference such that this happens with a good probability, that gives you a distinguisher. On the other hand, in linear cryptanalysis, you look at linear approximations. You select a subset of the input bits and a subset of the output bits, and we call that a mask. What you want is that when you take the XOR of those input bits and the XOR of those output bits, they are correlated. If the correlation is strong enough, this gives you a distinguisher.
In the case of ARX schemes, when you try to apply linear or differential cryptanalysis, your differential trail or linear trail will often look like this. You start somewhere in the middle with a single active bit and then you propagate upwards and downwards. So if you start from a single active bit, maybe after one round you have five active bits, after another round 20 or 30 active bits, the next round maybe 100, and then you cannot do anything anymore because basically everything is active. So you have a few rounds where you can control what happens, and then everything explodes; you just lose everything. Because of this property, it's interesting to look at techniques where you can combine two independent trails. If you have one trail up to somewhere like this, you cannot really extend it, but if you can combine it with another independent trail, then you will be able to target more rounds. One technique that does this is the boomerang attack. That's a very nice cryptanalysis technique where you combine two differential trails. But unfortunately we cannot use it here on Chaskey, because we have no decryption oracle, like I explained earlier, so the boomerang attack is not applicable. However, there's another technique that does something similar: differential-linear cryptanalysis. That's what we're going to use here. Differential-linear cryptanalysis was introduced by Langford and Hellman in 1994 and then extended by Biham, Dunkelman, and Keller in 2002. The main idea is that you divide your cipher into two parts, the top part and the bottom part, and you want to do something independent on each part; of course, that's the main point. On the top part, what you do is try to find a differential trail.
Starting from some input difference delta, you try to reach an output difference gamma with a good enough probability. That's what you do in the top part. Then in the bottom part, you try to build a linear trail: starting from mask alpha, you want to go to mask beta with a good enough correlation. That's how you define your differential-linear distinguisher. Now, the way you use it: you start with a pair of inputs with, of course, the difference delta, because that's the interesting difference. Then you know that with probability p, you reach the difference gamma in the middle. Now, if you take mask alpha of those y and y' values, you know that with probability p you actually have this difference gamma, so you know exactly what you get here. And if you don't follow this differential trail, well, you can expect that you get something more or less random, so that holds with probability one half times (1 − p). This means you actually have a bias of p over 2 when you take this linear mask here in the middle. Next, you look at the bottom part. We know that there's a correlation between y·alpha and z·beta, and there's a correlation between y'·alpha and z'·beta. When you combine those three equations, what you get is a correlation between z·beta and z'·beta. So you start with a pair of inputs with a fixed difference, and then you can say something about this mask taken on the outputs. That's how you use the distinguisher. And if you look at how much it costs to use this distinguisher, the cost is basically 1/(p² ε⁴). So basically you pay the differential price twice and you pay the linear price twice. It's somewhat expensive, but when you cannot extend your trails anymore, it's still a good technique because you can combine two of them. You still get good results, even though it seems very expensive.
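The bias bookkeeping just sketched can be written out explicitly. This is the standard differential-linear heuristic (assuming the two halves behave independently), not a formula quoted verbatim from the talk:

```latex
% Top part: differential \delta \to \gamma with probability p.
% Bottom part: linear approximation \alpha \to \beta with bias \varepsilon.
\begin{align*}
  \Pr[\alpha \cdot (y \oplus y') = \alpha \cdot \gamma]
    &\approx p \cdot 1 + (1-p)\cdot\tfrac{1}{2}
     = \tfrac{1}{2} + \tfrac{p}{2}
    && \text{(bias $p/2$ in the middle),}\\
  \Pr[\beta \cdot (z \oplus z') = \alpha \cdot \gamma]
    &\approx \tfrac{1}{2} + 2^{2}\cdot\tfrac{p}{2}\cdot\varepsilon\cdot\varepsilon
     = \tfrac{1}{2} + 2p\varepsilon^{2}
    && \text{(piling-up over the three relations).}
\end{align*}
% Detecting a bias of 2p\varepsilon^2 takes on the order of
% (2p\varepsilon^2)^{-2} = \Theta(p^{-2}\varepsilon^{-4}) pairs:
% the differential price twice and the linear price twice.
```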
It's actually nice because you can add rounds more efficiently. Now, what we do in this paper, we do it a little bit differently, because when you do this analysis, it's actually quite complex to see what happens in the middle. You can have lots of different trails: starting from delta, you can have many different values gamma here that are good enough and that can interact, and same thing starting from beta at the output, you can have many alphas that are somewhat good. So it's very complex to really understand what happens here. Instead, what we do is divide the cipher into three parts. That's only for evaluating the complexity; the attack is still the same, but we use a slightly different way to evaluate the complexity and to actually build the trails. So we divide into three parts, and we select the positions where there is a single active bit, because we know that most of the time, good differential trails and good linear trails will start with a single active bit somewhere in the middle. You'll have this kind of shape with one bit in the middle, and then it gets complex in one direction and it gets complex in the other direction. If we start from these positions, we can divide into three parts, and now the middle part we consider as a small differential-linear distinguisher. At the top, we have something purely differential; at the bottom, something purely linear; and in the middle, it's already something differential-linear. But the good thing is, since it's small enough, you don't have too many rounds, you can evaluate it experimentally and measure the bias of the small differential-linear distinguisher. When you do it this way, you actually capture all the different ways the trails can interact, so you get a much better estimation of the complexity. And in order to build this distinguisher, what we do is just try all possible positions for the single active bit.
Here in the differential part, here in the linear part; then we evaluate the bias of the differential-linear thing in the middle, and then we build a differential trail and a linear trail. And we just keep the one that's best, of course; we try all possibilities. If you apply this to six-round Chaskey, it actually works quite well. The best position we found was to have four rounds in the middle, then one round at the top and one round at the bottom, and we used those bits as the positions of the single active bits. Then we have this input difference and this output mask. If you try to evaluate the complexity of this differential-linear distinguisher, you find that it's something around 2^34. We actually implemented this, and the experiments really matched this prediction, so this analysis is relatively good. So this is already a six-round distinguisher; that's already nice. But of course we want to attack more rounds. How can we extend this to more rounds? Well, what you usually do, if you look at SPN ciphers: if you have, say, a differential distinguisher on 5 rounds of AES and you want to break 6 rounds of AES, you guess some bits of the last round key, you do partial decryption, and then you test your distinguisher. That's a very common way to do it. In the case of ARX ciphers, it's a bit more complex, because you tend to have differences more or less everywhere, and when you guess key bits, you don't know exactly which key bits are going to affect your addition, because you have carries that can propagate all the way across. So in this paper, we try to give techniques that can be used to do something similar to this partial key guess and partial decryption. The way this is going to work, we're going to guess some key bits; that's the main idea.
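The experimental evaluation of the middle differential-linear bias described above can be sketched with a small harness. Here `middle` is a hypothetical placeholder for the reduced-round middle part of pi (this is an illustration, not the paper's actual code); you feed in the chosen input difference and output mask and count parity agreements over random pairs:

```python
import random

def dl_bias(middle, delta, mask, n_samples=1 << 16, seed=1):
    """Empirically estimate the bias of a differential-linear distinguisher:
    push pairs (x, x ^ delta) through the middle rounds and measure how
    often the output parities under `mask` agree.  States are modelled as
    128-bit integers; `middle` is whatever round function is under test."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(n_samples):
        x = rng.getrandbits(128)
        z, zp = middle(x), middle(x ^ delta)
        # parity of the masked output bits, for both members of the pair
        if bin(z & mask).count("1") % 2 == bin(zp & mask).count("1") % 2:
            agree += 1
    return agree / n_samples - 0.5
```

As a sanity check, with `middle` the identity, the masked parities agree deterministically whenever delta and the mask share an even number of bits, so the estimate is exactly +1/2 there and −1/2 when they share an odd number.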
And then, when we guess some key bits, say bits of the first round key, then from the plaintext you can compute a few bits of the state after the key addition, of course. Then what we do is partition the data according to the values of those state bits, so we have several subsets of the data, and we look at the bias of the distinguisher in each of the subsets. What will happen is, if we select properly the bits of the key that we guess, then some subsets will have a stronger bias and some subsets will have a weaker bias, and this will help us get a better advantage. Those techniques have already been used in some ways, in particular by Biham and Carmeli at SAC 2014, and attacks on Salsa20 are also very similar; the main idea is quite similar. So more concretely, in the case of linear cryptanalysis, let me try to explain a little how this works. If you do linear cryptanalysis of an addition, first of all, an addition X = A + B can be written as this set of equations: each bit of the output is the XOR of the two input bits and a carry bit, and the carry bit is computed as the majority of the previous two input bits and the previous carry bit. So if you try to predict, say, a bit x_i of the output here, it's going to be a_i ⊕ b_i ⊕ c_i, and what you can say is, well, the carry is correlated to any of the input bits, because it's a majority including those bits. So there is some correlation with a_{i−1}. You get this linear approximation, that x_i is correlated to a_i ⊕ b_i ⊕ a_{i−1}, and you have a correlation of one half if you use this linear approximation. Now, when we use partitioning, the idea is to look at the actual values of those bits at position i − 1. In the actual use case, we guess the corresponding key bits, and then we can get those bits from the plaintext bits and the key bits.
And when we know those bits: if we know that both of them are equal to zero, then we know for sure that there is no carry. There's no longer any probability involved; we definitely know there is no carry, and so we definitely have x_i = a_i ⊕ b_i. Similarly, if we know that both bits are equal to one, then we know there is definitely a carry, and then we have this equation. So already, by guessing the two corresponding key bits, we can look at those two subsets where we have an increased bias, and what we do is just throw away the rest of the data, because if those two bits are different then we don't know whether there is a carry or not, and we just ignore that portion of the data. So what happens is you throw away one half of the data, but in your distinguisher you actually gain a factor of four, so in the end you gain a factor of two in total. And this also allows us to analyze the case where those two bits are different. If we have zero and one here, and we also look at the previous bits, if we have zero and zero there, then we know there is no carry coming into step i − 1 and therefore no carry coming into step i. So we have to throw away less data, we can use more of the data, and we get a better complexity. Now, if you really want to do this in a more complex case, you have to look at different bit positions, because you have several active bits; you want to get several bits next to each other, and you can also try to predict what happens at the second non-linear operation, but everything gets really complicated and messy, and it's hard to do a really correct theoretical analysis. So what we do instead is use a more experimental approach. We just look at all the bits that are potentially interesting, that is, all the bits that are next to a bit we are interested in.
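The carry argument above is easy to check exhaustively on a small word size. This little experiment (my own illustration, not code from the paper) verifies both claims: the approximation x_i = a_i ⊕ b_i ⊕ a_{i−1} holds with probability exactly 3/4 overall, but with probability 1 on the subset where the two neighbouring input bits agree:

```python
def bit(x, i):
    """Bit i of integer x."""
    return (x >> i) & 1

def partition_check(width=8):
    """Exhaustively test the partitioning argument on width-bit addition
    x = a + b: the linear approximation x_i = a_i ^ b_i ^ a_{i-1} is exact
    on the subset a_{i-1} == b_{i-1} (both 0: surely no carry into bit i;
    both 1: surely a carry), and holds with probability 3/4 overall."""
    hits = total = 0
    for a in range(1 << width):
        for b in range(1 << width):
            x = (a + b) & ((1 << width) - 1)
            for i in range(1, width):
                approx_ok = bit(x, i) == bit(a, i) ^ bit(b, i) ^ bit(a, i - 1)
                if bit(a, i - 1) == bit(b, i - 1):
                    assert approx_ok  # carry into bit i is known exactly
                hits += approx_ok
                total += 1
    return hits / total  # overall probability of the approximation
```

Restricting to the good subset here turns a correlation of one half into a deterministic relation, which is exactly the factor-of-four gain (for half the data) mentioned above.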
And then we just collect a lot of data, guess those bits, divide the data into subsets according to those bits, and measure experimentally what bias we have in each subset. This also allows us to detect which bits are actually interesting, and then to evaluate the complexity of the attack using those bits. So that's what we do for linear cryptanalysis. On the differential side, we also do something similar. It turns out it's a bit more complex to apply this technique than in linear cryptanalysis, but it still works; you have to use structures and multiple differentials. So it's a bit more complex, and you get a smaller gain than in the case of linear cryptanalysis. But when you combine all this on both sides, if you look at the six-round attack I described earlier, we get a significant improvement, because we actually reduce the complexity to only 2^24 pairs rather than something like 2^34. So we have a really nice gain, and you can see that the gain is better on the linear side than on the differential side. Now, in terms of time complexity, the way you do the attack, basically, is that you make some guess for the subkey and then you compute some distance between your measured bias and your theoretical bias for each key guess. But what happens is you have to compute a whole lot of values L(k). I'm not going to go through the details, but the important thing is that when you compute all those distances simultaneously, you can actually do it with a fast Fourier transform, and then it doesn't cost very much; basically, the time complexity will be very close to the data complexity thanks to this FFT trick. In the end, for this six-round attack, the time complexity is about 2^29, so it's not much more than the data complexity, and it's really much better than the basic differential-linear distinguisher.
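The reason this FFT trick applies is that a statistic of the form L(k) = Σ_x N(x)·(−1)^{g(x⊕k)}, where N counts the data and k is the key guess, is an XOR-convolution, so a fast Walsh-Hadamard transform evaluates it for all 2^n guesses in n·2^n operations instead of 2^{2n}. A toy sketch (the function g and the counts below are made up for illustration; the paper's actual statistic is more involved):

```python
def fwht(v):
    """In-place unnormalized fast Walsh-Hadamard transform over GF(2)^n;
    len(v) must be a power of two."""
    h = 1
    while h < len(v):
        for i in range(0, len(v), 2 * h):
            for j in range(i, i + h):
                v[j], v[j + h] = v[j] + v[j + h], v[j] - v[j + h]
        h *= 2
    return v

def all_key_statistics(counts, g):
    """Compute L(k) = sum_x counts[x] * (-1)**g(x ^ k) for every key guess
    k at once.  Since this is an XOR-convolution, transform both factors,
    multiply pointwise, and transform back (the unnormalized FWHT is its
    own inverse up to division by the length)."""
    n = len(counts)
    s = fwht([(-1) ** g(x) for x in range(n)])
    c = fwht(list(counts))
    return [x // n for x in fwht([a * b for a, b in zip(c, s)])]
```

The division by `n` is exact because the double transform scales everything by the length, which is why integer arithmetic suffices here.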
So that's for six rounds; now, can we go further? Well, luckily, yes, we can. If we want to use the same techniques on seven rounds, the first step again is to look for good differential trails, linear trails, and a differential-linear distinguisher in the middle. So again we tried different ways to do the division and to select where to put the single active bits, and it turns out one that works well is to have four rounds in the middle, one and a half rounds at the top, and one and a half at the bottom. Then you have these positions for the single active bits, and you get this input difference and this output linear mask, and the bias of this differential-linear distinguisher will be around 2^−40. So you have an attack with complexity something like 2^78. Of course this is too high; you're not going to break the security claim with this simple technique. But when we use the improvements based on partitioning, then you get something that's actually better than the claim: the data complexity will be 2^47 pairs, so 2^48 plaintexts, and the time complexity is around 2^67. And again you can see that the gain is mostly on the linear side; this technique is really more efficient on the linear side. So finally, here are the results. We have a nice attack on seven rounds. I forgot to mention, the six-round attack has been implemented completely, with all the tricks, so this gives us good confidence that everything is valid and that the seven-round attack will really work. This means that the security margin of Chaskey is relatively slim, because the full version has eight rounds and we have an attack on seven rounds. Actually, the designers of Chaskey have decided to increase the number of rounds, and they have now proposed a version with 12 rounds. So to conclude on a more general note, the main message of this work, I think, is that differential-linear attacks are really quite efficient against ARX designs. That's an interesting result, and when you
combine a number of tricks to improve it, you can actually gain something significant; basically, we get one more round using all those tricks. We have three main tricks: the first one is to divide into three sections and do something more experimental in the middle; then we use this partitioning technique to improve the data complexity; and finally this FFT trick to reduce the time complexity. So thanks for your attention, and I'll be happy to take questions. OK, so we're slightly over time; if there's one quick question. Thanks, very nice talk. You said this is all implemented; did you put the code up somewhere online so that it can be... Good question. I didn't yet, but yeah, I should put it online, that's right. If you're interested right now, just send me an e-mail and I will send it to you, but it would be good to have it online, that's right. Thanks. Let's thank the speaker again, so we can move to the second talk of the session, which is about reverse-engineering the S-boxes of Streebog, Kuznyechik and STRIBOBr1; I don't know how to pronounce the ciphers. The paper is by Alex Biryukov, Léo Perrin and Aleksei Udovenko.