Hello, I'm Khashayar Barooti from the Laboratory of Security and Cryptography at EPFL, and today I'm going to present our paper titled Cryptanalysis of Plantlet, joint work with Subhadeep Banik and Takanori Isobe. We will start with a brief introduction to small-state stream ciphers, specifically Sprout. Then we're going to talk about some of the cryptanalysis results on Sprout. Later we will talk about the updated version of Sprout, which is called Plantlet. Last but not least, we're going to describe our attack on Plantlet.
So let's start with the introduction. At Asiacrypt 2000, Biryukov and Shamir presented a work in which they showed that the internal state of a stream cipher should be one and a half to two times the size of the secret key; otherwise the stream cipher would be vulnerable to generic time-memory trade-off attacks. But in 2015 at FSE, Armknecht and Mikhalev proposed Sprout, which seemed to contradict the result of Biryukov and Shamir, because its state size is exactly equal to the size of the secret key. They used a key-mixing method to get around these trade-off attacks. Sprout has a Grain-like structure, meaning that the internal state is composed of an LFSR and an NFSR, both of size 40 bits in this case. And it has a much smaller area than any other known stream cipher. This is a diagram of the Sprout circuit.
Now we can talk about some of the cryptanalysis results on Sprout. In 2015, the same year that Sprout was proposed, a related-key distinguisher was proposed by Yonglin Hao. Later on, Maitra et al. proposed a partial state exposure attack, again in the same year. At CRYPTO 2015, a guess-and-determine attack was proposed by Lallemand and Naya-Plasencia. Later, at SAC 2015, a time-memory trade-off attack was proposed by Esgin and Kara. And at Indocrypt 2015, Banik presented a work on improved key recovery with partial knowledge of the internal state. So now let's talk about Plantlet.
Plantlet was proposed at FSE 2017 by the same authors as Sprout. Basically, Plantlet is an updated version of Sprout, designed to be secure against the attacks that we talked about before. First of all, to avoid the guess-and-determine attack, the LFSR size in Plantlet was increased from 40 to 61 bits. To avoid the attack by Esgin and Kara, the key mixing was changed to be fully linear. And a neat trick was used to avoid Banik's attack. Banik's attack used the fact that, with a specific IV, one can set all the bits of the LFSR to zero. To avoid this, the authors proposed two phases of LFSR update. The first one is during the key-IV mixing phase, or initialization phase. In this phase, the 61st bit of the LFSR stays fixed to one, and the rest of the LFSR gets updated. In the keystream generation phase, all 61 bits are updated.
Here you can see a diagram of the Plantlet stream cipher. So let's talk about the LFSR updates. As you can see here, in the first 320 rounds the LFSR update does not touch the 61st bit, and the generated keystream bit is fed back into the register. During the keystream generation phase, all 61 bits get updated. And both LFSR update functions have maximum period this way.
So I think we have talked enough about Plantlet. Let's talk about our cryptanalysis result. The observation that we made is this: the LFSR update function is fully linear, so imagine you have two LFSR states at times t1 and t2. Knowing the difference between these two LFSR states and also the difference between the times t1 and t2, you can recover the values of the LFSR at times t1 and t2. This is some basic linear algebra; we will talk about it later. But basically what it says is that given L_t1 XOR L_t2 and t2 - t1, one can compute L_t1 and L_t2. So we started studying the differential structure of Plantlet.
What we noticed is the following. Imagine we have two times t1 and t2, both 0 modulo 80, and the internal states at these times differ only in the 43rd bit of the LFSR. We found that, with such an internal state difference, we get a 160-bit difference vector in which 45 bits are fixed. This difference vector is not anything strange. For each time ti, we take the 80-bit keystream from time ti onward and the 80-bit keystream from time ti backward; we do this for both t1 and t2, and then we XOR the two together to get this difference. So 45 bits of it are fixed with probability 1.
Another observation that we made, using the same difference, is that for seven other bits of the keystream difference, although the value is not fixed to 0 or 1 with probability 1, it is a fixed linear combination of the LFSR bits l_i. And for one other difference bit, the difference is a quadratic combination of the l_i values. We use these observations to filter out the incorrect candidates for the internal state in our attack.
And now we're ready to present our attack. Our attack has four main stages. First there is an offline pre-computation stage. In this stage, we compute the LFSR values from the differences between the LFSR values and the differences between the times. The second stage is the online stage, in which we collect a lot of keystream such that a certain hypothesis is satisfied. In the third stage, we filter the keystreams that we collected and keep a set of good pairs. Finally, in stage four, we generate algebraic systems from the good pairs that we collected, and we solve those systems.
So let's talk about the pre-computation stage. As we mentioned earlier, given the difference between the LFSR values and the time difference, we can recover exactly what these LFSR values are.
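The recovery step described above can be sketched on a toy example. This is only an illustration under stated assumptions: it uses an 8-bit LFSR with made-up taps rather than Plantlet's 61-bit register, and the names `step`, `advance`, and `solve_difference` are hypothetical helpers, not taken from the paper. It solves (M^d + I) x = e over GF(2), where M is the one-step update and d is the time difference.

```python
# Toy illustration only: an 8-bit LFSR with hypothetical taps,
# NOT Plantlet's 61-bit register or its real feedback polynomial.
N_BITS = 8
TAPS = (0, 2, 3, 4)  # assumed taps, i.e. x^8 + x^4 + x^3 + x^2 + 1


def step(state):
    """One clock: shift right, feed the XOR of the tapped bits into the top bit."""
    fb = 0
    for t in TAPS:
        fb ^= (state >> t) & 1
    return (state >> 1) | (fb << (N_BITS - 1))


def advance(state, d):
    """Apply the linear update d times (this is multiplication by M^d)."""
    for _ in range(d):
        state = step(state)
    return state


def solve_difference(d, target):
    """Return x with advance(x, d) XOR x == target, or None if none exists."""
    # Column j of A = M^d + I is the image of the j-th basis vector.
    cols = [advance(1 << j, d) ^ (1 << j) for j in range(N_BITS)]
    # Gaussian elimination over GF(2), remembering which columns were combined.
    basis = {}  # pivot bit -> (vector, mask of original columns XORed into it)
    for j, v in enumerate(cols):
        combo = 1 << j
        while v:
            p = v.bit_length() - 1
            if p not in basis:
                basis[p] = (v, combo)
                break
            bv, bc = basis[p]
            v ^= bv
            combo ^= bc
    # Reduce the target against the basis; the column mask is the solution x.
    x, t = 0, target
    while t:
        p = t.bit_length() - 1
        if p not in basis:
            return None
        bv, bc = basis[p]
        t ^= bv
        x ^= bc
    return x


if __name__ == "__main__":
    d, e = 80, 1 << 3            # time difference and single-bit state difference
    l_t1 = solve_difference(d, e)
    l_t2 = l_t1 ^ e              # the second state is the first plus the difference
    print(f"L_t1 = {l_t1:08b}, L_t2 = {l_t2:08b}")
```

In the attack this solve is done once per admissible time difference, and the results are stored in a table, so the online phase only performs lookups.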
Plantlet by design does not allow us to generate more than 2^30 keystream bits. So the times t1 and t2 should both be less than 2^30. How many different possibilities for the time difference T do we have? We have 2^30 / 80, because both t1 and t2 are multiples of 80 as well. We want the difference between the two states to be e_43; this is the difference that we care about. Given this, we just have to solve for L_t1 in the equation (M^T + I) L_t1 = e_43, where M is the LFSR update matrix and T is the time difference. We prove that this system always has a solution. This way we can compute L_t1, and adding e_43 to L_t1 we get L_t2. Then we save all of these computed LFSR values in a hash table, where the key is the time difference.
So let's talk about the first online stage now. In this stage, we want to collect enough keystream samples to be sure that, for two of them, the internal states at some point differ only in the 43rd bit of the LFSR. We first ask: if we have N samples, what is the probability that two of them have this property? We prove that this probability is N^2 / 2^102. This is a simple birthday argument. Like we mentioned before, Plantlet doesn't allow us to generate more than 2^30 keystream bits. So N is at most 2^30 / 80 for a single IV, which is approximately 2^23.7. So what is this probability? It is 2^-54.6. How many different IVs do we need? To have one good hit on average, we need 2^54.6 different IVs. So let's talk about the second online stage.
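The IV count from the first online stage can be rechecked with a few lines of arithmetic. This is a sketch that only re-derives the rounded figures quoted in the talk; the constant names are illustrative, and reading the 2^102 denominator as twice 2^101 (a 40-bit NFSR plus a 61-bit LFSR) is an assumption about where that number comes from.

```python
import math

KEYSTREAM_LIMIT = 2 ** 30  # keystream bits allowed, as stated in the talk
BLOCK = 80                 # sampled times are multiples of 80
STATE_BITS = 40 + 61       # assumed: 40-bit NFSR plus 61-bit LFSR

# Number of samples available from a single IV.
log2_n = math.log2(KEYSTREAM_LIMIT / BLOCK)

# Birthday-style estimate: about N^2 / 2 pairs, each hitting one specific
# state difference with probability 2^-101, i.e. N^2 / 2^102 in total.
log2_p = 2 * log2_n - (STATE_BITS + 1)

# Expected number of IVs needed for one good pair is 1 / p.
log2_ivs = -log2_p

print(f"samples per IV : 2^{log2_n:.2f}")
print(f"hit probability: 2^{log2_p:.2f}")
print(f"IVs needed     : 2^{log2_ivs:.2f}")
```

The unrounded exponent for the IV count comes out slightly above the 54.6 quoted in the talk, which is why later figures can differ in the second decimal place depending on where the rounding is done.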
Here we keep a list of tuples (t1, Z_t1, Y_t1), where Z and Y are the things that we defined before, the forward keystream of length 80 and the backward keystream of length 80, and we want to keep the pairs that pass the checks from observations 1 and 2. The important thing here is that if we have this difference in the internal state, we definitely get the 45 fixed bits in the difference of the generated keystream, and also the observation 2 conditions. But the other direction does not necessarily hold, as you can imagine. What we do is that as soon as we see this type of pattern in the generated keystream, we assume that the internal states had the desired difference, meaning that they differed only in the 43rd bit of the LFSR. This is the idea of the attack in this phase. We keep everything that passes these two conditions, we guess that the pair has the 43rd-bit difference in the internal state, and then we solve some equations and see whether our guess was correct.
So what do we do? We can implement this algorithm with a hash table. We first check whether the keystream bits at these two times have the desired difference on the 45 bits that are fixed. Then we assume that the internal state difference is only in the 43rd bit of the LFSR. Then we compute L_t1 and L_t2 by looking them up in the precomputed table, and we check the observation 2 conditions using these values of L_t1 and L_t2. That is, we see whether the difference that we observed and these values are compatible.
So let's see how many pairs pass these checks. How many pairs do we have in general? We had 2^30 / 80 different samples for each IV. So we have N choose 2 different pairs, which is around 2^46.36.
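The hash-table step of the filter might look roughly like the following. This is a hedged sketch, not the authors' implementation: the function name `candidate_pairs`, the sample layout, and the toy positions are all hypothetical, and it assumes the fixed difference bits are all zero, so that two samples form a candidate pair exactly when they agree on those positions.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(samples, fixed_positions):
    """Return all (t1, t2) whose keystream vectors agree on fixed_positions.

    samples: iterable of (t, bits), where bits is a tuple of 0/1 values
             (in the attack this would be the 160-bit forward/backward vector).
    fixed_positions: indices where the pairwise difference must be zero
             (45 positions in the attack; the actual indices are not given here).
    """
    buckets = defaultdict(list)
    for t, bits in samples:
        # Samples with the same projection land in the same bucket,
        # so no quadratic scan over all pairs is needed.
        key = tuple(bits[i] for i in fixed_positions)
        buckets[key].append(t)
    pairs = []
    for times in buckets.values():
        pairs.extend(combinations(times, 2))
    return pairs


if __name__ == "__main__":
    # Toy data: 4-bit vectors, with positions 0 and 2 playing the role
    # of the fixed difference bits.
    toy = [
        (0, (0, 1, 1, 0)),
        (80, (0, 0, 1, 1)),
        (160, (1, 1, 0, 0)),
        (240, (0, 1, 1, 1)),
    ]
    print(candidate_pairs(toy, fixed_positions=(0, 2)))
```

Each surviving pair would then be checked against the remaining linear and quadratic conditions using the precomputed LFSR values for its time difference.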
What is the probability of a pair passing the first test? Those 45 difference bits should all be 0, so it is 2^-45. Then we compute the value of the LFSR, assuming that the internal state difference is only in the 43rd bit. So we have eight bits that we can check: seven linear combinations of the LFSR values and one quadratic combination. For the linear combinations, the probability of them being compatible with the computed LFSR values is 2^-7, and for the quadratic one it is 3/4. So on average, the number of pairs that pass for a single IV is 2^-6.06. How many different IVs did we have? 2^54.6. So the number of pairs that pass in total is 2^48.54.
So let's talk about the last step of the attack. Here we have collected a set of pairs that had the desired differences in the keystream. How many of them did we have? 2^48.54. For each one of these pairs, we create a system of equations. The unknowns in our system are the NFSR bits n_0 to n_39 and the key bits k_0 to k_79, and we plug in the values of the LFSR computed using the lookup table that we had. We have 320 polynomials in n and k. These are the polynomials that we get from the keystream generation function: from time t1 forward for length 80, from time t1 backward for length 80, and the same for t2. Then we feed these equations to a SAT solver to see whether we get the secret key or not.
First of all, we ran some experiments to time the solver. How long does it take for the SAT solver to finish when the system has a unique solution? Here you can see the histogram. Sometimes, though, the assumption that the internal states differ only in the 43rd bit is wrong, although we had the desired difference in the generated keystream.
In this situation, the SAT solver aborts and does not give us a solution. So we also estimated the amount of time that it takes the SAT solver to abort in this type of situation, and this is the histogram of that. And in the end, of course, we measured the amount of time that a single Plantlet encryption takes on the same processor that we used. This is the histogram for the encryption.
So now let's talk about the total complexity of the attack. The pre-computation stage is not really costly. We have matrices of size 61, and we have to do Gaussian elimination on them, so the complexity of each one is approximately 61^3. How many of these do we need to do? 2^23.7, one for each different value of t. The second stage, the first online stage, is the most costly part of this algorithm. What is the complexity of this stage? We need to generate 2^30 keystream bits for how many different IVs? For 2^54.6 different IVs. Hence we have (320 + 2^30) times 2^54.6 Plantlet iterations, where the 320 is the initialization phase. This gives us approximately 2^76.26 Plantlet encryptions.
So let's talk about the solver part. When there is a unique solution, the SAT solver gives us that solution in 2^17.5 Plantlet encryptions on average. And when there is no solution, meaning that our assumption that the difference is only in the 43rd bit of the LFSR is wrong, the SAT solver aborts in 2^17.13 Plantlet encryptions. How many different systems of equations do we need to solve? We have 2^48.54 of them, one for each pair that passed the filter. So what is the total complexity? We have one correct sample, which takes 2^17.5, and we have 2^48.54 - 1 incorrect samples, each of which takes 2^17.13 Plantlet encryptions.
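The filter and solver figures can be combined in a short calculation. This is a sketch that only plugs in the rounded exponents quoted in the talk (the constant names are illustrative); it is not a re-derivation of the paper's exact numbers.

```python
import math

# Rounded exponents as quoted in the talk.
LOG2_PAIRS_PER_IV = 46.36   # N choose 2 with N = 2^30 / 80
LOG2_IVS = 54.6             # number of IVs used
LOG2_SOLVE_HIT = 17.5       # solver cost when the system has a unique solution
LOG2_SOLVE_MISS = 17.13     # solver cost when the system has no solution
LOG2_KEYSTREAM_GEN = 76.26  # cost of the keystream generation stage

# Filter: 45 fixed bits, 7 linear conditions, 1 quadratic condition (prob. 3/4).
log2_pass_per_iv = LOG2_PAIRS_PER_IV - 45 - 7 + math.log2(3 / 4)
log2_systems = log2_pass_per_iv + LOG2_IVS

# Solver stage: one correct system plus all the incorrect ones.
solver_cost = (2 ** LOG2_SOLVE_HIT
               + (2 ** log2_systems - 1) * 2 ** LOG2_SOLVE_MISS)
log2_solver = math.log2(solver_cost)

# The stages run one after another, so their costs add.
log2_total = math.log2(2 ** LOG2_KEYSTREAM_GEN + solver_cost)

print(f"pairs passing per IV: 2^{log2_pass_per_iv:.2f}")
print(f"systems to solve    : 2^{log2_systems:.2f}")
print(f"solver stage        : 2^{log2_solver:.2f}")
print(f"keystream + solver  : 2^{log2_total:.2f}")
```

Because the solver stage sits roughly ten binary orders of magnitude below keystream generation, adding it barely moves the total exponent.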
So in total the solver stage costs about 2^65.7 encryptions. All of these stages are done independently of each other; they are run one after another. So when we combine the complexities, we just add them together. There is no multiplication here. So what is the total complexity? It is completely dominated by the keystream generation phase. So the total complexity would be around 2^76.26 Plantlet encryptions.
We're not really happy with this complexity, because we went through a lot of trouble and we get an algorithm which performs 8 times better than an exhaustive search. And as we said, the most costly part is generating the keystream bits. So we started thinking about whether we can generate fewer keystream bits and still have the same number of pairs, because we just need that many pairs to make sure that we have one pair whose internal states differ only in the 43rd bit of the LFSR. During the talk we always said that t1 and t2 are both equal to 0 modulo 80. But we never actually used the fact that t1 and t2 are 0 modulo 80; we only used the fact that they are equal to each other modulo 80. So what we do is take the different possibilities where t1 and t2 are both equal to some i modulo 80, with i between 0 and 79. Doing this, we can do the filtering and the solving for the different values of i separately, but we need 80 times fewer IVs to get the same number of pairs. So what does this mean? It means that we need to run the 2^30-bit keystream generation for 80 times fewer IVs, which means that the dominant part of the complexity is also divided by 80. So the total complexity would be 2^69.98 Plantlet encryptions, which is way better.
So let's conclude this talk now. We presented an attack on the full version of Plantlet in this work.
The main thing that we used was that an internal state difference in which the NFSRs are equal to each other and the LFSRs differ only in the 43rd bit propagates a really nice difference into the generated keystreams. And the total complexity of our attack was 2^69.98 Plantlet encryptions. So I think that's about it. Thank you very much for listening. I would recommend that you read the paper for more details, specifically on the memory complexity, all the tricks that we used, and the proofs of the lemmas. Feel free to contact us by email if you have any questions. And thank you again for listening.