Hello and welcome to our talk, New Slide Attacks on Almost Self-Similar Ciphers. I'm Orr Dunkelman, and this is joint work with Nathan Keller, Noam Lasry and Adi Shamir. We're going to start with a quick introduction and recap of slide attacks, and in particular we will discuss the issue of generating slid pairs. After that, we'll mention several applications of slide attacks, and we will see that, because of the difficulty of generating slid pairs, most of the attacks are actually against Feistel constructions. Then we're going to discuss how to attack self-similar SP networks. Specifically, we'll start with 1K-AES, which is a generalization of AES with any number of rounds, where the same subkey is used in all the rounds, and we'll see an attack by Bar-On et al. from 2018 that breaks such a cipher. Then we're going to discuss the problems of applying a slide attack to SP networks. There are two of them. The first is generic and is due to the last-round key addition. The second concerns SP networks such as AES, which have a slightly different last round. This already poses a lot of problems for slide attacks, which depend very much on self-similarity all the way through the rounds. To solve these issues, we're going to introduce four new techniques. The first one is slid sets, where instead of sliding only pairs, we slide sets of plaintexts. Then we will show the hypercube of slid pairs, which is a way to take several slid pairs that were built in a specific manner and turn them into many other slid pairs, so you get a hypercube. And then we're going to discuss two more techniques. The first one is suggestive plaintext structures.
In suggestive plaintext structures, instead of having to guess which pairs are the slid pairs, we take a slightly different approach in which, once you work with a plaintext, you automatically get some of the key material. We'll see this later. And at the end comes the substitution slide attack, which involves a lot of playing around with the equations to make the attack even faster. Specifically, for the case of 1K-AES, we're able to attack it using 2^(n/2) known plaintexts, meaning we need only one slid pair for the attack to work, and 2^(3n/4) time, which is significantly better than the trivial approach of trying all 2^n pairs and checking which of them is indeed a slid pair.
Slide attacks were introduced in 1999 by Biryukov and Wagner. They are an adaptation of the related-key attacks, introduced by Biham in 1993 and Knudsen in 1992, to the case where the key schedule generates self-related keys. Specifically, assume for a second that the key schedule is such that the key is the same in all the rounds. If this is the case, and the plaintext P is encrypted to the intermediate value Q after one round, then this intermediate value and another plaintext that happens to equal Q evolve together through the encryption function. Because the value here is equal to the value Q here, and this is exactly the same function with the same key, these two values are going to be the same. And then, again, it is the same function and the same key, so the next two values are the same as well. This continues all the way through, until the ciphertext C, which is equal to the value here. Now, do note that this is independent of the number of rounds. Having more rounds in between doesn't change this property, which has a very nice implication: slide attacks can usually break any number of rounds. And at the end, we have here another round, because P was encrypted one round to reach Q.
That means that the number of rounds here and the number of rounds here is the same, so there is still one last round needed here. So a slid pair is a pair (P, Q) such that P becomes Q after one round, and such a pair satisfies two conditions: Q is equal to the encryption of P, and D is equal to the encryption of C, but through only one round. Now, as you probably know, we use many rounds because of the confusion and diffusion approach, or, if you like, to make things harder and more complicated. That means that breaking a single round is expected to be significantly simpler than breaking the entire scheme, especially as the scheme can have as many rounds as you want, because, as I mentioned before, the attack is independent of the number of rounds. So how does the attack work? First of all, we need to find such a slid pair, P and Q that satisfy this relation, and then we get this relation for free. Then we try to break F_K, the round function, using the slid pair, and we find the key. Now, there's only one problem. In most slide attacks, what happens is that you take a pair without knowing whether it's a slid pair or not, you extract a key from it, then you verify the key, and only then do you know that you found a slid pair, and hence the key. So if you look at it carefully, you will notice that in order to attack the scheme we first need to find a slid pair, but to find a slid pair we need to find the key, and we know that we found the right key only if we found a slid pair. So there is a circular dependency there. This is why many techniques were developed to mitigate this problem. Most of them work mostly for Feistels, and this is the reason mostly Feistels are attacked using slide attacks. Now, how do you generate these slid pairs? In the worst case, let's assume that you have an n-bit block. What you can do is pick 2^(n/2) known plaintexts.
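As a hedged illustration of the slid-pair relation just described, here is a minimal Python sketch. The 16-bit round function `round_fn`, its constants, and the key and plaintext values are made up for the example and do not come from the talk. The only point is that when every round uses the same keyed function F, the relation Q = F(P) forces D = F(C), whatever the number of rounds.

```python
MASK = 0xFFFF  # toy 16-bit block; an assumption for illustration only

def round_fn(x, k):
    """Hypothetical keyed round: XOR the key, multiply by an odd constant, rotate."""
    x = ((x ^ k) * 0x9E37) & MASK          # odd multiplier, so this step is a bijection mod 2^16
    return ((x << 5) | (x >> 11)) & MASK   # rotate left by 5 bits

def encrypt(p, k, rounds):
    """Apply the same keyed round `rounds` times (a fully self-similar cipher)."""
    x = p
    for _ in range(rounds):
        x = round_fn(x, k)
    return x

def is_slid(p, q, k, rounds):
    """Check the ciphertext-side relation D = F(C) for a candidate pair (P, Q).

    The key is used here only to demonstrate the property; a real attacker
    does not have it and must detect slid pairs by other means.
    """
    c, d = encrypt(p, k, rounds), encrypt(q, k, rounds)
    return d == round_fn(c, k)

KEY = 0x1234
P = 0xBEEF
Q = round_fn(P, KEY)   # Q is P advanced by one round, so (P, Q) is a slid pair
```

Calling `is_slid(P, Q, KEY, rounds)` with 1, 8 or 1000 rounds returns `True` in each case, which is the round-independence the talk relies on.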
Now, throughout the talk, I'm going to disregard small factors. Of course, they are in the paper, where we need to be accurate. So if you pick 2^(n/2) known plaintexts, you expect to find one slid pair with good probability, about 63%. Again, the exact details are in the paper. That means that there are 2^n pairs, so we need a way to find which one is the slid pair. For Feistels, you can find such pairs in more efficient ways. For example, for 1K-DES, which is a generic DES-like Feistel construction with the same key going into all the rounds, you can do this using 2^(n/4) chosen plaintexts, and this is already in the original paper by Biryukov and Wagner. For 2K-DES, where K1 and K2 are interleaved, K1, K2, K1, K2, you need more advanced techniques such as the slide with a twist or the complementation slide, and then you can do it with 2^(n/4) chosen plaintexts or chosen ciphertexts. You can even attack 4K-DES using 2^(n/4) chosen plaintexts and ciphertexts, in a way that combines both the slide with a twist and the complementation slide. These are from the Advanced Slide Attacks paper by Biryukov and Wagner from 2000. So this is how you usually generate these slid pairs.
And of course, there is another technique, which is due to Furuya, from 2001. Given a slid pair P and Q and their ciphertexts C and D, the ciphertexts are also one round apart. Let's go back here. What happens if I look at C and D as new plaintexts? C after one round of encryption becomes D, meaning that if I continue to encrypt C and D, then C after one round, which is now D, and D itself continue together. So at the end they are again going to satisfy the condition that they have the same value here, and because we have the same keys all around, we're going to get the same value here.
And because it's the same value here, we again get a slid pair. Furuya noticed that you can iterate this as many times as you want to generate a slid chain. The advantage of the slid chain concept is that we no longer need to attack the round function using only two plaintexts. With a single slid pair, you have just two known plaintexts for the round function, known because we don't control P and Q: they are a slid pair, and we don't control their values. If the round function is sufficiently weak and we can break it using two known plaintexts, then you just apply the slide attack. But sometimes you need more data, and this is why we need this trick of slid chains, to generate more inputs to the round function. So if something is a slid pair, then you automatically get a slid chain, which is very useful. One small technical comment: this makes the attack an adaptive chosen plaintext and ciphertext attack, which some people are less inclined to use.
Now, there are other techniques and generalizations. For example, in 2007 we had a paper about detecting slid pairs using cycles, which carries over to the quantum setting as well. There is work from Crypto '18. There is the reflection attack by Kara from 2008. There is the slidex attack. And there are several other attacks that use these ideas. So the slide attack is actually very useful. Besides playing around with permutations, it can be used to attack artificial constructions like 1K-DES, 2K-DES or 4K-DES. But there is also an attack on MISTY1 from 2015. For KeeLoq, many of the attacks start with a slide attack, including the paper from Eurocrypt 2008. And even several attacks against the format-preserving cipher FF3 start with finding a slide. Now, there is a related tweak there.
But at the end, this is a slide attack that reduces the problem of breaking eight rounds of a cipher with, say, a random function as the round function to breaking only four rounds. This is a huge advantage, because, as you all know, eight rounds are significantly harder to break than four, especially in the case of Feistels, where we know from Luby and Rackoff that eight rounds is very hard and four rounds is slightly less hard.
So here's the generic case we're going to discuss, which is 1K-AES, a generic SPN. We have a plaintext. First there is a key addition, then an S-box layer, an affine layer, a key addition, an S-box layer, an affine layer, et cetera. And at the end we have another key addition, because if there were no such key addition, I could always take the ciphertext and go backwards until the point of the last key addition. So in any case, even if the cipher doesn't have this key addition layer, I can always remove all the unkeyed components until I get to the last operation, which is an XOR with the key. If this is the case, let's look at a slid pair. If P becomes Q at this point, then I can rewrite the relation Q = A(S(P ⊕ K)), which is the application of K on P, then S, and then A, as Q' = P ⊕ K, where Q' is Q moved backwards through A and S. Because S and A are unkeyed, I can always take the values here and go backwards until this point. So I get a very simple mechanism: I can take all the plaintexts in advance and move them backwards. Generally speaking, we will do this trick all the time. We move plaintexts and ciphertexts as far as we can, until we reach something which is keyed and we can't continue. Now, here is the attack by Bar-On et al. Take 2^(n/2) known plaintexts. A slid pair will satisfy two conditions. First of all, this is the condition from the plaintext side.
From the ciphertext side, you have something slightly different, and we will see why in a second. If you write the equation correctly, what happens is that you apply S to C, then A, and then you XOR with K, and that gives D. So, as you can see, you can extract K from both equations, and you get P ⊕ Q' = K, which is also equal to D ⊕ A(S(C)). Now, how do you identify the right key? The problem is that trying all possible pairs is very time consuming, because you have 2^n pairs. So what you do is just change sides: you move Q' here and A(S(C)) there, and you get one side which depends only on one plaintext and its ciphertext, and another side which depends only on the other plaintext and the other ciphertext, and the two are equal. So you can use a hash table to immediately identify when P ⊕ A(S(C)) equals Q' ⊕ D. And Q' is, again, just Q moved back through all the unkeyed functions. As you can see, this is a very simple attack. It takes 2^(n/2) known plaintexts, and because each side is analyzed once and you just put everything in the hash table, it also takes 2^(n/2) time and memory. The Bar-On et al. paper had a few other attacks, which are based on slid chains.
Now, there is a basic assumption in slide attacks that all the round functions are the same. Unfortunately, this is not the case for SP networks, and the reason is that the last round is different. Even in a generic SP network, not AES, which we will discuss in a second, the last round behaves slightly differently. Specifically, let's look at P and Q. P was encrypted to C, and this is how the equation connecting C and D looks. So C and D look slid, but let's see what happens when you try to encrypt C and D as a new pair. C will first be XORed with K. But in this data path, C was first altered by applying S.
So this XOR cancels that XOR, because it is the same key. Writing X for the value just before the last key addition, so that C = X ⊕ K, the value D here is K ⊕ A(S(X ⊕ K)), whereas the value after one round of encrypting C is A(S(X)). With very high probability, these two values are not the same, and if this is indeed the case, then of course the chain will not continue. So any attack which is based on building slid chains, and these are some of the attacks in the Bar-On et al. paper, will not work.
Now, in the case of AES, we have another problem, which is that in AES there is no MixColumns in the last round. Actually, in many SP networks the last round is different. Not only do we have an additional XOR, which already changes everything, but the affine transformation is different too, and now the relation between plaintexts and ciphertexts is significantly harder. So let's think of 1K-AES not as a generic SP network, but as AES itself. If, for example, P and Q are a slid pair, their ciphertexts do not have a simple relation. The relation goes through the XOR with K, MixColumns, AddRoundKey, SubBytes, ShiftRows, AddRoundKey. These are two different relations. This is not like the case of Feistels and regular slide attacks, and it causes a lot of problems in attacks, as we will see later. So we have these two problems, which relate very strongly to the fact that the last round behaves differently. And this is why we need other techniques to generate more slid pairs out of a single slid pair. In the end, the key technique, which is used in three of the four attacks that we're going to present today, is that we find new methods of transforming a single slid pair into many slid pairs. The first one is slid sets. Slid sets take two lambda structures, P and Q (in a second we'll remind everyone what lambda structures are), and we work with sets such that one set is the encryption after one round of the other.
Now if this is the case, and because this is a lambda set, which saturates (again, in a second) over all the inputs to some specific S-box, we are going to get 2^s slid pairs. We may not know which value goes with which value inside the sets. However, it will be very easy to identify which set goes with which set, and then working out the internal ordering is very simple. Because we have a better signal, we can go back and attack schemes. I'm going to show mostly how we attack 1K-AES, for simplicity. In the paper we show that we can break 2K-AES and 3K-AES, with full diffusion or partial diffusion, and there are also several attacks with unknown S-boxes, covering what happens if the AES has unknown S-boxes, or if we're discussing something similar to an affine, S-box, affine structure where the S-box layer is secret.
So what is a lambda set? We usually call it square, or saturation, or integral. You take a plaintext, you fix 15 out of the 16 S-boxes in the case of AES, or in general all S-boxes but one, and you try all possible inputs to the remaining S-box. Now this is one set, and we need to pick many such sets. We also take the second set, which is the slid version. What we do is take Q, which is the slid plaintext, and move it backward through all the unkeyed operations, and we just want to make sure that the set is defined such that we still get a lambda set. So what happens is that the values we actually take are not in an affine subspace. This one is in an affine subspace, this one is not, but if you apply this to all the values in the specific set, you also get a lambda set. Now why is that?
Because if by any chance the non-saturated part of the plaintexts in this set happens to become the non-saturated part of that set, these are actually slid sets. And we change the meaning of a slid set slightly, because of course, due to the saturation, each plaintext from this set will have a corresponding value in that set, as long as the fixed parts also encrypt correctly, meaning that the non-active S-boxes here become equal to the non-active S-boxes there after the key addition. Now we ask for the encryption of the second sets, and then we try to find a slid set, and I remind you that we then have many slid pairs. So if P_i and Q_j are indeed slid sets, we're going to have many slid pairs between C_i and D_j.
So here's how the attack works. First of all, we move forward: you see that we apply S and then A to C_i. This is an attack on 2K-AES, just to clarify, so we take two rounds of AES, but I could have done this for 1K-AES as well. Now we do the standard trick of swapping the key addition and the affine transformation in the second ciphertext set, and we get this lovely equation: if you apply A^-1 to all the ciphertexts in the set D, you get S applied to this set, XORed with a key-dependent constant. If you look very carefully, this means that each of the S-boxes is applied independently. The first S-box here and the first S-box there are independent of the second S-box here and the second S-box there. This helps us in the following way. You could say, okay, let's guess a key here and check what happens. But because of the affine transformation on the key, and we don't assume anything about the affine transformation, you would need to guess the full key in order to identify the slid pairs, which we don't want. What we do instead is link the sets without guessing the key, and this is done by counting multiplicities of different values.
This is a technique from our paper at Asiacrypt 2010. We look, for example, at a specific byte in a specific set, and we're going to see, say, 100 values which appear once, 98 values which appear twice, and so on for the rest of the distribution. If D_j is indeed a slid set with C_i, then in the first S-box there we're also going to see 100 values that appear only once and 98 values that appear twice. They're not necessarily the same values, in fact they are very likely not the same, because there is a cryptographic transformation in between, not a strong one, but a cryptographic transformation. But the numbers are the same. If instead we see there 100 values that appear once, only 50 values that appear twice, and 100 values that don't appear at all, then we have a lot of signal to decide whether this set corresponds to that set, and you can see a quick analysis of this fact in the paper. This allows us to analyze each of the sets C_i independently of each of the sets D_j, which immediately gives us the slid sets P_i and Q_j.
Now we move on to the next technique, which is the hypercube of slid pairs. What happens, again, if we have P and Q which are a slid pair, and we change the input to some S-box of P, say the first S-box, from one value to another? So instead of P we have P ⊕ a, where a activates only one S-box. After one round, this input difference a becomes some a' through the S-box layer, and then there is an affine transformation applied to it. But there are at most 2^s possible values, because there are at most 2^s possible a', and they lie in an affine subspace. So if I take P and Q, then for a given a I can try Q ⊕ A(a') for all the possible output differences a' coming from the same S-box, and I know that one of them will succeed. So with probability 2^-s, I took one slid pair and made it into two.
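The friend-pair step can be sketched on a toy cipher. Everything concrete below is an assumption made for the example and is not taken from the talk or the paper: a 16-bit block, four 4-bit S-boxes (a fixed 4-bit permutation), a bit permutation standing in for the affine layer A, and arbitrary key and plaintext values. The sketch only shows that, for a slid pair (P, Q) and an input difference a confined to one S-box, exactly one of the candidates Q ⊕ A(a') is the partner of P ⊕ a.

```python
# Toy parameters, assumed for illustration only.
SBOX = [0xC, 5, 6, 0xB, 9, 0, 0xA, 0xD, 3, 0xE, 0xF, 8, 4, 7, 1, 2]
PERM = [(4 * i) % 15 if i < 15 else 15 for i in range(16)]  # bit i moves to PERM[i]

def sub(x):
    """S-box layer: apply the 4-bit S-box to each of the four nibbles."""
    return sum(SBOX[(x >> (4 * i)) & 0xF] << (4 * i) for i in range(4))

def lin(x):
    """Stand-in for the affine layer A: a bit permutation, linear over GF(2)."""
    return sum(((x >> i) & 1) << PERM[i] for i in range(16))

def round_fn(x, k):
    """One round of the toy 1K SPN: key addition, S-boxes, then A."""
    return lin(sub(x ^ k))

def friend_candidates(q):
    """All 2^s - 1 guesses Q xor A(a') with a' nonzero and confined to S-box 0."""
    return [q ^ lin(a_out) for a_out in range(1, 16)]

KEY, P = 0x3A7C, 0x51D2
Q = round_fn(P, KEY)          # (P, Q) is a slid pair by construction
a = 0x0007                    # input difference that activates only S-box 0

# The key is used below only to confirm which guess is right; the attacker
# would instead keep all guesses and filter them on the ciphertext side.
hits = [cand for cand in friend_candidates(Q) if cand == round_fn(P ^ a, KEY)]
```

Here `hits` holds a single value, so each individual guess of a' is correct with probability of roughly 2^-s, matching the count given in the talk.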
Great, now this is a huge success, because now we can break the scheme, right? Everybody is happy. The thing is, we can do something even better. Let's assume for a second that we did this trick twice. We had P and Q, which is a slid pair, and we found that P ⊕ a and Q ⊕ A(a') is also a slid pair. So now we have two slid pairs. Now let's assume that we got a second friend pair, P ⊕ b and Q ⊕ A(b'), where a and b are in two different S-boxes. For example, a was in the first S-box and b was in the second S-box. Why does that help? Here is the fun part. If we indeed have this base pair and these two friend pairs, we can generate another pair which we know for sure is a slid pair: P ⊕ a ⊕ b and Q ⊕ A(a') ⊕ A(b') is a slid pair as well. And of course, if I have more such friend pairs, I can generate a hypercube with more related slid pairs, and this again gives us a lot of signal for identifying the right slid pairs.
So I'm going to show you how to attack 1K-AES using this technique, this time with secret S-boxes, and I'm going to describe it for the parameters of AES just to make life easier. Again, the paper has the full results. As a reminder, we have 1K-AES and the S-box layer is unknown. We know the affine layer, for the sake of making things easy, and I'm going to use the AES notation. Here's what we do. First of all, we find hypercubes of dimension five, meaning 32 slid pairs. For each slid pair we generate five friend pairs, each with probability 2^-8, so we saturate over all these five S-boxes. The probability of success is 2^-40, but when we succeed we get 32 slid pairs.
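The closure property behind the hypercube can be checked on a toy cipher. This is a sketch under assumed parameters, not the construction from the paper: a 16-bit block, four 4-bit S-boxes, a bit permutation in place of the affine layer A, and a dimension-3 hypercube instead of dimension 5. The helper `output_diff` uses the key, which an attacker would not have; it is there only to produce the correct friend pairs so that the combination rule can be verified.

```python
from functools import reduce
from itertools import combinations
from operator import xor

# Toy parameters, assumed for illustration only.
SBOX = [0xC, 5, 6, 0xB, 9, 0, 0xA, 0xD, 3, 0xE, 0xF, 8, 4, 7, 1, 2]
PERM = [(4 * i) % 15 if i < 15 else 15 for i in range(16)]  # bit i moves to PERM[i]

def sub(x):
    """S-box layer: apply the 4-bit S-box to each of the four nibbles."""
    return sum(SBOX[(x >> (4 * i)) & 0xF] << (4 * i) for i in range(4))

def lin(x):
    """Stand-in for the affine layer A: a bit permutation, linear over GF(2)."""
    return sum(((x >> i) & 1) << PERM[i] for i in range(16))

def round_fn(x, k):
    """One round of the toy 1K SPN: key addition, S-boxes, then A."""
    return lin(sub(x ^ k))

def output_diff(p, k, a):
    """Difference a' after the S-box layer for input difference a (demo only: needs the key)."""
    return sub(p ^ k ^ a) ^ sub(p ^ k)

KEY, P = 0x3A7C, 0x51D2
Q = round_fn(P, KEY)                       # base slid pair
in_diffs = [0x0007, 0x0030, 0x0500]        # each difference activates a different S-box
out_diffs = [output_diff(P, KEY, a) for a in in_diffs]

# Combine every subset of friend pairs: XOR the input differences on the P side
# and the images A(a') of the output differences on the Q side.
hypercube = []
for size in range(len(in_diffs) + 1):
    for idx in combinations(range(len(in_diffs)), size):
        dp = reduce(xor, (in_diffs[i] for i in idx), 0)
        dq = reduce(xor, (lin(out_diffs[i]) for i in idx), 0)
        hypercube.append((P ^ dp, Q ^ dq))
```

All 2^3 = 8 pairs in `hypercube` satisfy `round_fn(p, KEY) == q`. The property holds because the S-boxes act independently on their own nibbles and A is linear, so differences in distinct S-boxes simply add up.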
Next we go to the ciphertexts to check whether we indeed got a hypercube, and if this is the case, we just need to analyze the input to the AddRoundKey and SubBytes layer, the keyed S-box. It turns out that 45 such hypercubes are needed to fully recover the S-box. Now, what is observing consistency? Because we have a structure of 32 slid pairs, we have different values going into the unknown S-box in the last round, and we have different outputs. For these 32 slid pairs, whenever we see the same input value, we expect to see the same output value, independently of the S-box. We don't know the S-box, but the S-box is deterministic, so the output is always going to be the same. So when we say we check by observing the consistency of the ciphertexts, we take the structure and look at the C part of the slid pairs (I'm using C and D, where C is the ciphertext of the first plaintext of a slid pair, the one on top). Each time the same value appears in two or more of the 32 slid pairs, we expect to see the same corresponding value in the Ds of these slid pairs, and this is how we can actually find the correct slid pairs.
This moves us to the next technique, which is suggestive plaintext structures. As we mentioned before, one of the problems we have is that in order to find a slid pair, or more precisely in order to verify that a pair is a slid pair, we need to check whether the key guess is correct. So most of the attacks try to do tricks, as we saw before, where instead of analyzing pairs we analyze individual values. We apply some transformation to all the plaintexts, we apply some transformation to all the ciphertexts, and then we store everything in a hash table and try to do things efficiently.
In the case of suggestive plaintext structures, we do something slightly different: we build the data in such a way that once you start analyzing a given plaintext, you already have a suggestion for partial key information. So the idea is slightly different. Instead of doing some analysis on all the plaintexts, then some analysis on all the ciphertexts, and then trying to find a match, here we go plaintext by plaintext. We know that if this plaintext is part of a slid pair, then we know something about the key. So we iterate over the plaintexts and we already get partial key suggestions, which are then used for analyzing the slid pair. An interesting artifact of this approach is that we can get a success rate of one. In a regular slide attack you need to hope that something will happen, that there will be a birthday-paradox collision, and this is not the case here. To attack a very simple 1K-AES with a success rate of one, you pick 2^(n/2) plaintexts P_i whose lower half is zero, and then you pick the Q_j such that the upper half of Q', which is just P after the XOR with the key, is zero as well. That means that if P_i is encrypted to Q_j, then the upper half of the key is equal to the upper half of P_i, because in order for P_i to become Q_j, we need the upper half of P_i to become zero after the key XOR, so that it matches Q_j. If this is the case, we know that there is a slid pair, and we know information about the key. This is related to splice and cut: if we take the attack by Bar-On et al. from 2018 and add splice and cut, this is exactly what we get.
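The structure just described can be sketched end to end on a toy cipher. All concrete choices below are assumptions for illustration and are not from the talk or the paper: a 16-bit block with four 4-bit S-boxes, a bit permutation in place of the affine layer A, ten rounds, a last round identical to the others, and two arbitrary probe plaintexts used to confirm key guesses. The matching step is the hash-table relation P ⊕ A(S(C)) = Q' ⊕ D from the Bar-On et al. attack, applied to the two chosen structures.

```python
# Toy parameters, assumed for illustration only.
SBOX = [0xC, 5, 6, 0xB, 9, 0, 0xA, 0xD, 3, 0xE, 0xF, 8, 4, 7, 1, 2]
PERM = [(4 * i) % 15 if i < 15 else 15 for i in range(16)]  # bit i moves to PERM[i]

def sub(x):
    """S-box layer: apply the 4-bit S-box to each of the four nibbles."""
    return sum(SBOX[(x >> (4 * i)) & 0xF] << (4 * i) for i in range(4))

def lin(x):
    """Stand-in for the affine layer A: a bit permutation, linear over GF(2)."""
    return sum(((x >> i) & 1) << PERM[i] for i in range(16))

def encrypt(p, k, rounds=10):
    """Toy 1K SPN: every round is A(S(x xor K)), followed by a final key XOR."""
    x = p
    for _ in range(rounds):
        x = lin(sub(x ^ k))
    return x ^ k

def suggestive_attack(oracle):
    """Return verified key candidates using two structures of 2^(n/2) chosen plaintexts."""
    table = {}
    for upper in range(256):
        p = upper << 8                                   # P_i: lower half fixed to zero
        c = oracle(p)
        table.setdefault(p ^ lin(sub(c)), []).append(p)  # plaintext-side value P xor A(S(C))
    probes = [0x0123, 0xFEDC]                            # arbitrary texts for checking key guesses
    answers = [oracle(x) for x in probes]
    keys = []
    for lower in range(256):
        q_prime = lower                                  # Q' = S^-1(A^-1(Q)): upper half zero
        q = lin(sub(q_prime))                            # the plaintext actually queried
        d = oracle(q)
        for p in table.get(q_prime ^ d, []):             # match on Q' xor D
            k = p ^ q_prime                              # upper half from P_i, lower half from Q'_j
            if all(encrypt(x, k) == y for x, y in zip(probes, answers)):
                keys.append(k)
    return keys

SECRET = 0x3A7C
recovered = suggestive_attack(lambda p: encrypt(p, SECRET))
```

Because the P_i cover every possible upper half of the key and the Q'_j cover every possible lower half, exactly one (P_i, Q_j) is a slid pair for any key, so `SECRET` always appears in `recovered`. The probe check is there only to discard accidental hash-table matches.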
Now we're going to show how to use this technique to attack 1K-AES with incomplete diffusion in the last round. Let's assume again that we're working with AES as AES, and the last round lacks MixColumns. This time we need to pick two structures of Q_j, one where we fix the upper half to zero and one where we fix it to one. The reason is that now the plaintext P_i is going to have a counterpart in the first structure, and there is going to be a friend, P_i ⊕ 001, which will be encrypted to some Q_j in the second structure, and the two counterparts have the same values up to this zero and one. So this generates two slid pairs immediately, and of course you can extend and generalize this to more pairs. Because we now have two pairs, we can take C_i and F_i, which is the ciphertext of the friend of P_i, and we get two values that enter the last, incomplete-diffusion round, and the same from the two Q_j, and we get some differential relation between the two, which can be used to solve the scheme. You have the full details in the paper.
Now let's move on to the last attack, which is the substitution slide attack. We're going to show how it can be used to attack 1K-AES even if the last round has a completely different diffusion, whether it's more or less. In the suggestive structure technique we needed the last round not to have full diffusion, but in the substitution slide attack it doesn't matter: the last round can use a different affine layer A'. The interesting part is that we need 2^(n/2) known plaintexts, as I mentioned before. Because we don't have extra information, there is one slid pair with high probability, and it's the one slid pair we need to identify, and we cannot afford an exhaustive search over all possible pairs, because there are 2^n of them. So how do we do that?
This part is mostly algebra. First we rewrite the key of a slid pair as P_i XORed with something that depends on P_j. This is the standard transformation of Bar-On et al. Now we substitute it into the relation between the ciphertexts, and you see that this relation is significantly more complicated, because the ciphertext C_i has to be decrypted, then you apply the inverse of the last-round transformation, then A, then K, then S. It's a significantly harder transformation, but we can substitute this value for K in all the key additions. This gives us a series of substitutions that ends with this equation, where one side depends on P_j and C_j and the other on P_i and C_i. That is already a huge step forward, in the sense that now we can analyze each side independently. There's only one tiny problem: as you can see, there is still key material here, so we cannot build two tables independently of the key. So what we do is slightly different. First of all, we evaluate one side, the bottom side, for all plaintexts, so we need to do it only once. Then we guess n/4 bits of the key. By guessing these n/4 bits, we achieve two things. First, we're able to evaluate n/4 bits of P_j, because we can apply S to them, and then we can check whether P_j fits, because we can always apply the inverse of the affine transformation to P_j and see whether they match. So we automatically reduce the number of candidate P_j to 2^(n/4). The other thing, which is independent of the P_i, is that we can also evaluate this equation, again on n/4 bits. Now we have n/4 bits of information from the plaintext side and n/4 bits from the substitution side, so we get n/2 bits of filtering, and we immediately find the P_j that corresponds to P_i using hash tables. Once we have a suggestion, we can extract a key suggestion and again verify it.
So, to conclude, we introduced four new slide techniques: slid sets, the hypercube of slid pairs, suggestive plaintext structures, and the substitution slide. They are very useful for substitution-permutation networks, and for other schemes as well, but their main strength is the ability to withstand several kinds of changes in the last round, to the point that even if you have incomplete diffusion, or a different diffusion layer, or merely the extra XOR, you can still attack 2K-AES and 3K-AES. Here's a summary of the results in a very condensed form. I will just mention that we worked on several variants, which you can see in the paper: F is for full diffusion, P is for partial diffusion, and T is for a last round which is not a full round. There are also some results for the case of a secret S-box. Here we do not list the previous work by Bar-On et al., which is incorrect, because it used slid chains to break 1K-AES, and there are no slid chains in the case of SP networks, as I mentioned before. Thank you very much, and you're invited to see the full version.