Hi, this is a presentation of the paper "Lightweight AEAD and Hashing using the Sparkle Permutation Family", which is the result of a collaboration between lots of very fine people from even finer places, namely Christof, Alex, Luan, Johann, Aleksei, Vesselin, Qingju and myself. I'm talking to you from my beautiful living room in a suburb of Paris, on a particularly gloomy day during the second lockdown. But fear not, as I'm sure that crypto will lighten the mood. And with the promise of not making any more puns, I'm going to move on to the main topic of the day, namely: who is Mr. Sparkle?

It's a question you may have asked yourself upon looking at the list of candidates that made it to the second round of the NIST lightweight crypto competition thing. But it's a question which also occurred in a very different context, namely in a Simpsons episode, when Bart and Lisa encountered a very familiar face that turned out to be that of a guy named Mr. Sparkle. I'm only going to address the first variant of this question in this talk, and I'm going to do it in this way. First, I'm going to walk you through our design process for the Sparkle family of algorithms. Then I'm going to tell you the tricks we have used to lose weight without losing strength. And then I'm going to present the result of this whole process, namely the Sparkle family of algorithms itself.

In the beginning, we discussed lightweightness. After all, it's a competition about lightweight crypto, and what does it mean to be lightweight? It's a surprisingly complicated question, which can be understood very simply with this metaphor: you want to be as light as a feather. But what does that mean, given that there are many different kinds of feathers? Do you want to be light in the sense of this one, or this one? It's the same in the case of lightweight crypto. So we have decided to choose a rather narrow definition of lightweightness and to optimize for it. Our targets were as follows. First, we wanted security.
That's the whole point of the thing, so of course it has to be at the top. Second, we wanted to design algorithms that were optimized for microcontrollers, so that they could be implemented in a nice way on 8-, 16- and 32-bit microcontrollers. This means that they should have a small code size and a high speed, but also that they should enable various trade-offs: between implementation speed and implementation size, or between security and speed, these kinds of trade-offs.

From these first ideas we came up with a scope statement. We decided to have an algorithm which is sponge-based, so permutation-based. The idea is to minimize the code size while still providing both authenticated encryption with associated data (AEAD) and hashing. We decided to go for an ARX-based primitive, because modular addition enables quick diffusion, has a very high algebraic degree, and is very, very cheap in software. It's not cheap in hardware, but we decided to optimize for software, so we don't care. Then we had to decide on the round structure, so something that would allow us to have a strong security argument while also having all the implementation properties that were our goals. The way we achieved this is by looking at Alzette and the long trail strategy, so I'm going to define both of these concepts now.

Alzette is this operation. It operates on two 32-bit words. It also has a constant, which is 32-bit as well. And it's this sequence of operations, the boxed plus being modular addition and the circled plus being the XOR, as always. This operation has some very nice properties.
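The sequence of operations is only shown on the slide, so here is a minimal C sketch of what such an ARX-box can look like. The rotation amounts below are an assumption taken from the published Alzette specification, not from this talk, and should be checked against that specification; the names `alzette` and `alzette_inv` are chosen here for illustration. The inverse is included only to show that the operation is a permutation of the two words.

```c
#include <stdint.h>

/* Rotate a 32-bit word right by n bits, for 0 < n < 32. */
static uint32_t rotr32(uint32_t x, unsigned n)
{
    return (x >> n) | (x << (32 - n));
}

/* Sketch of the Alzette ARX-box on the words (x, y) with constant c.
 * Assumption: four rounds with rotation pairs (31,24), (17,17), (0,31),
 * (24,16), as in the public specification. Each round is a modular
 * addition into x, an XOR into y, and an XOR of the constant into x. */
void alzette(uint32_t *x, uint32_t *y, uint32_t c)
{
    *x += rotr32(*y, 31); *y ^= rotr32(*x, 24); *x ^= c;
    *x += rotr32(*y, 17); *y ^= rotr32(*x, 17); *x ^= c;
    *x += *y;             *y ^= rotr32(*x, 31); *x ^= c;
    *x += rotr32(*y, 24); *y ^= rotr32(*x, 16); *x ^= c;
}

/* Inverse: undo the four rounds in reverse order. */
void alzette_inv(uint32_t *x, uint32_t *y, uint32_t c)
{
    *x ^= c; *y ^= rotr32(*x, 16); *x -= rotr32(*y, 24);
    *x ^= c; *y ^= rotr32(*x, 31); *x -= *y;
    *x ^= c; *y ^= rotr32(*x, 17); *x -= rotr32(*y, 17);
    *x ^= c; *y ^= rotr32(*x, 24); *x -= rotr32(*y, 31);
}
```

Only additions, XORs and rotations on 32-bit words are used, which is what makes this kind of operation cheap on a microcontroller.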
If you just apply it once, it has this maximum expected differential characteristic probability and this maximum expected linear characteristic correlation, which are similar to what you would have from one AES round. And when you iterate it twice, which is what I'm going to call double Alzette, you get properties that are actually a bit better than two rounds of AES. It also has a very high algebraic degree because of the carry in the modular addition. We can show that it has quick diffusion, and it also has very good integral properties. But I'm not going to go over the details of how we prove that, because we have a whole paper on the topic, namely this one, which was accepted at CRYPTO this year and for which you can also find a talk on YouTube. The key thing to remember about Alzette is that one iteration does not provide much protection, but two do provide lots of protection.

So then, how do you build a secure round function with the knowledge that you need two iterations of Alzette in a row to be secure? You use the long trail strategy, which we introduced in 2016 when we designed SPARX. The idea is that you have your ARX-boxes, so your Alzette instances, that are denoted A_{c_i} here, and you have here a linear permutation with a good branching number, taken over the alphabet of the branches. What this enables you to have is double Alzette instances for free, because the red part here is one double Alzette when you look at two steps in a row. Same for the blue, same for the orange, same for the green. But it also provides good diffusion, because this has a high branching number.
It's always MDS or almost MDS in our case. So we know that the primitive will have the security from the double Alzette instances while also having good diffusion, which is what we want.

But this strategy also has another advantage for our purpose. If you want to bound the differential probability in an algorithm which is built in this way, the best way is to look at truncated patterns. Since Alzette has a big block size, you don't have many instances of Alzette that are applied in parallel, and so a truncated pattern is described using very few bits. And since in our case half of the output bits of the linear layer are copies of the inputs, you need even fewer bits to describe the changes at each step. What this means is that we can proceed as follows. We are going to loop over all possible truncated differential shapes, so all truncated trails, and then for each of them we are going to divide it into single, double and triple Alzette occurrences. This is one such truncated trail for a simplified permutation. You have nonzero differences here, and nonzero differences that appear here.
They are still nonzero here. At this point you have a cancellation, so the dashed part is zero, and then on this side it's still nonzero. In this case the red part is one double Alzette, the blue part is actually a triple one, and the black is just a single one. So if you have an actual differential trail which fits inside this truncated one, its probability will be upper bounded by the probability for a single Alzette, multiplied by the probability for a double one, multiplied by the probability for a triple one. Using this division, we deduce a bound. Then we repeat this process for all possible truncated differential trails, and from that we deduce a bound for all possible trails, so a bound on the probability of all differential characteristics. We can do the exact same thing for linear trails, and we have also done it in such a way that it's really the same thing, because of the properties of the L that we have chosen.

We can go even further in our case, because we are designing permutation-based algorithms, not block ciphers. What the adversary controls is just a small part of the state, the rate. So for instance, if the adversary finds a good differential trail, one with a high probability that starts in the rate but which in the end has a nonzero difference on this side, they can't do much with it. What they need is a trail where the difference is only on the rate in the input and only on the rate in the output. And since our bound on the differential probabilities works by looping over all truncated trails, we can just constrain the truncated trails to fit such a pattern, so to be such that the input is in the rate and the output is in the rate. And of course, we've done just that, and we have obtained a table like this. So this is for a 384-bit permutation.
That is one where you have six Alzette instances in parallel. This is the probability you have for differential trails that go from anywhere to anywhere. Then, for a rate of 192 and a capacity of 192, if you go from the rate to the rate, the differential probabilities are actually much lower, which is good for us as designers, because this additional constraint on the truncated trail has actually removed those with the highest probabilities. So "r to r" means rate to rate, and "r to n" means that it starts in the rate and ends anywhere.

This was for differential collision attacks, essentially, but we can also have a similar argument for linear correlations in the key stream. If you look at the squeezing phase, a linear trail is going to be useful for you as an attacker only if the mask is equal to zero here and here, because otherwise you would end up with a linear equation that involves bits you know and bits that you don't know, so you can't do anything with it. So we can use the exact same approach, and in fact the same code, to derive the bounds we need for linear correlations in the key stream.

In the end, we have designed three permutations that operate on 256, 384 and 512 bits, and they have these numbers of steps. You can see we have slim and big versions. What's the difference? The big instances can be used in any mode.
So the idea is that if you decide to have your own permutation-based cryptosystem, you can just take the big instance, and we claim it's going to be fine. But they need more rounds. In our ciphers and hash functions, they are only going to be used during the initialization phase, to process the key if you have one, and during finalization, so before squeezing the tag or the digest. The slim instances can only be used in a specific way, so in a sponge-like mode where the rate is where we have said that the rate should be, because that's how we made the computations for the rate-to-rate patterns. These use fewer steps, so they are going to be faster, but we can still have strong security arguments thanks to this truncated trail trick. And of course, since they are the fastest, they are the ones we want to use the most often, and since we can do it in a safe way, we put those in between message block processing for both hashing and authenticated encryption.

A note to people who would want to attack this and who don't feel comfortable attacking permutations: we have thought of you, and we have designed tweakable block ciphers that are built using the same round function. Since they are block ciphers, the definition of an attack is something we all understand and more or less agree on. So you can have a look at those. They were published in the same paper as our analysis of Alzette, at CRYPTO this year.

And this is it. This is Sparkle.
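The slide shows C code at this point. As a rough idea of what such code can look like, here is a sketch of a single function covering all block sizes, with the number of branches and the number of steps as parameters. The round constants, rotation amounts and linear layer are assumptions reconstructed from the public Sparkle specification, not from this talk, so they should be checked against that specification and its test vectors before any real use.

```c
#include <stdint.h>

#define MAX_BRANCHES 8
/* Rotate a 32-bit word right by n bits, for 0 < n < 32. */
#define ROT(x, n) (((x) >> (n)) | ((x) << (32 - (n))))
/* Small linear map used inside the diffusion layer. */
#define ELL(x) (ROT(((x) ^ ((x) << 16)), 16))

/* Round constants (assumed values, to be checked against the spec). */
static const uint32_t RCON[MAX_BRANCHES] = {
    0xB7E15162, 0xBF715880, 0x38B4DA56, 0x324E7738,
    0xBB1185EB, 0x4F7C7B57, 0xCFBFA1C8, 0xC2B3293D
};

/* state holds 2*brans 32-bit words; brans is 4, 6 or 8. */
void sparkle(uint32_t *state, int brans, int steps)
{
    int i, j;
    uint32_t rc, tmpx, tmpy, x0, y0;

    for (i = 0; i < steps; i++) {
        /* Step-dependent constant addition. */
        state[1] ^= RCON[i % MAX_BRANCHES];
        state[3] ^= (uint32_t)i;
        /* ARX-box layer: one Alzette instance per 64-bit branch. */
        for (j = 0; j < 2 * brans; j += 2) {
            rc = RCON[j >> 1];
            state[j] += ROT(state[j + 1], 31);
            state[j + 1] ^= ROT(state[j], 24);
            state[j] ^= rc;
            state[j] += ROT(state[j + 1], 17);
            state[j + 1] ^= ROT(state[j], 17);
            state[j] ^= rc;
            state[j] += state[j + 1];
            state[j + 1] ^= ROT(state[j], 31);
            state[j] ^= rc;
            state[j] += ROT(state[j + 1], 24);
            state[j + 1] ^= ROT(state[j], 16);
            state[j] ^= rc;
        }
        /* Linear layer: Feistel-like mixing of the left half into the
         * right half, then a branch rotation and a swap of the halves. */
        tmpx = x0 = state[0];
        tmpy = y0 = state[1];
        for (j = 2; j < brans; j += 2) {
            tmpx ^= state[j];
            tmpy ^= state[j + 1];
        }
        tmpx = ELL(tmpx);
        tmpy = ELL(tmpy);
        for (j = 2; j < brans; j += 2) {
            state[j - 2] = state[j + brans] ^ state[j] ^ tmpy;
            state[j + brans] = state[j];
            state[j - 1] = state[j + brans + 1] ^ state[j + 1] ^ tmpx;
            state[j + brans + 1] = state[j + 1];
        }
        state[brans - 2] = state[brans] ^ x0 ^ tmpy;
        state[brans] = x0;
        state[brans - 1] = state[brans + 1] ^ y0 ^ tmpx;
        state[brans + 1] = y0;
    }
}
```

With this shape, a call such as `sparkle(state, 4, steps)` would cover the 256-bit permutation and `sparkle(state, 8, steps)` the 512-bit one, with `steps` selecting the slim or the big variant.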
These are all versions of the Sparkle permutations, actually. As you can see, we have as parameters the number of branches and the number of steps. So this C code is the implementation of Sparkle256, both slim and big, Sparkle384, both slim and big, and Sparkle512, both slim and big. Of course, as an implementer you have the usual trade-offs, so you could unroll the loops to go faster, but then the code size will be a bit bigger. You can use one implementation for all three variants of the permutation, and the big versions are just the slim ones with more rounds. So if you have an unrolled implementation of the slim one, you can reuse it to evaluate the big one.

So now, how do we lose even more weight at the mode of operation level without losing any cryptographic strength? We have decided to use the Beetle mode of operation for authenticated encryption. This is the usual duplex sponge. You have your message block m_i and the content of the rate x_i, which become the rate y_i and the ciphertext c_i. This process is described by this very simple operation, where the matrix here has rank one. The Beetle mode of operation is the same idea, except that here, instead of having this rank-one matrix, we have one with full rank, so rank two. That's the only change, but it has far-reaching consequences: basically, the security level offered by this structure is more or less c minus log of r instead of c over 2. So it allows us to have a higher rate and a lower capacity for a similar security level. It was originally intended for lightweight, hardware-oriented ciphers. But why not use it in software as well?
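To make the rank-one versus rank-two remark concrete, here is a toy C sketch on a rate of two 32-bit words. The struct `rate_t`, the function names and the particular full-rank map `feistel_swap` are assumptions chosen for illustration (a Feistel-style swap in the spirit of Beetle), not necessarily the exact feedback function used in the submission.

```c
#include <stdint.h>

/* Toy rate made of two 32-bit halves (illustrative size only). */
typedef struct { uint32_t a, b; } rate_t;

/* Linear map (x1, x2) -> (x2, x1 ^ x2). Both this map and
 * "this map XOR identity" are invertible, which is what gives
 * the feedback matrix full rank. */
static rate_t feistel_swap(rate_t x)
{
    rate_t r = { x.b, x.a ^ x.b };
    return r;
}

/* Plain duplex: new rate y and ciphertext c are both x ^ m,
 * so (y, c) is a rank-one function of (x, m). */
void duplex_feedback(rate_t x, rate_t m, rate_t *y, rate_t *c)
{
    c->a = x.a ^ m.a;
    c->b = x.b ^ m.b;
    *y = *c;
}

/* Beetle-style feedback: c = x ^ m as before, but the new rate is
 * feistel_swap(x) ^ m, so y and c no longer coincide. */
void beetle_feedback(rate_t x, rate_t m, rate_t *y, rate_t *c)
{
    rate_t s = feistel_swap(x);
    c->a = x.a ^ m.a;
    c->b = x.b ^ m.b;
    y->a = s.a ^ m.a;
    y->b = s.b ^ m.b;
}
```

In the duplex case the next rate is exactly the ciphertext, so an adversary who chooses the ciphertext chooses the rate. In the Beetle-style case, y XOR c depends only on x, so the two outputs together determine both x and m. As a back-of-the-envelope use of the "c minus log of r" estimate, with the logarithm taken in base 2: for c = 256 and r = 256 it gives 256 - 8 = 248 bits, which would be consistent with the 248-bit security level quoted in the benchmarks later in the talk, whereas c over 2 would give 128.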
And now for the other tricks that we have introduced, I first need to explain a potential avenue for attacks. This is where we have put the rate on our Sparkle instances. So this is a Sparkle instance where you have the first step, and then all the other steps are here, and we suppose that the attacker wants to exploit a differential trail with probability 2^-s that covers all the steps that they want to attack. Since this is what the adversary controls, we assume that they can just choose the values here. But that's actually a problem. Since they can choose the values, they can separately choose the block they put here and the block they put here in such a way that they follow the trail that they need over the first two steps, on these two double Alzettes. So instead of paying 2^s to find a valid pair, they can brute-force only the remainder of the trail, meaning that they're going to save a factor of 2^64, which as a designer I find bad.

To prevent that, we use two different tricks. The first is very simple; it's what we have called rate whitening. It can only work if the content of the capacity is secret, but that's the case in authenticated encryption because there is a key. What we do in this case is that we just XOR the content of the capacity into the rate. It's very simple, but it means that the adversary cannot choose the values that are input here. They can only choose the differences, and then they have to brute-force. So they cannot save this factor of 2^64 that I mentioned before.

In the case of hashing, we have to use another trick, because we cannot hide behind the randomness of the key. Furthermore, in the case of hashing, an adversary can start from the middle, because there is no secret: they know everything, they can do whatever they want. So what's going to happen is that the adversary is going to start at the middle of the permutation, do some dark magic like a rebound attack, and they're going to have
a differential distinguisher here with probability 2^-t. Then they're going to extend it all the way to the message injection. And this is going to be very cheap, actually, because they don't care about the specifics of the transitions in this one: whatever the difference here is, they will be able to cancel it out. So they only need to pay for this part, but that's just one Alzette instance, which is not very strong. So these two steps only add a penalty of 2^-6 for the attacker, which is not much.

To get back on our feet, we are not just XORing the message directly into the rate. We are using what we have called indirect injection. We first put the message through a linear function, which is the same as here, and then we XOR the output into more branches, because the idea is that the message has a certain length, which is smaller than the length of two Alzette instances. As a consequence of that, when doing the same attack, the adversary is going to need the difference here to be such that such an M exists. This has two consequences. First, they cannot just go through this single Alzette and be done with it; they have to go through the whole double Alzette instance, which is a lot more difficult. But also, since this L has non-trivial properties, especially in terms of branching number, they're actually going to have to go through several double Alzette instances, two in this case. So while with the previous attack extending it by two steps would cost 2^-6, it is now going to cost the price of two double Alzette instances, which is 2^64, which as designers we find much better as well.

Now let's have a look at our algorithms.
We have designed AEAD ciphers, which we have called Schwaemm, followed by the rate and the capacity, for instance Schwaemm128-128. This is the block size, the rate, the capacity, the key size, the tag length, sorry, and the security level. You can see that s is about c, but not quite; that's because of the Beetle mode. And we have also designed hash functions called Esch. We have two variants. They have the same rate of 128 bits, but they have different block sizes. Schwaemm means sponges in Luxembourgish, and Esch is the name of a city close to the campus of the University of Luxembourg.

Now, this is supposed to be lightweight crypto, so it should have good implementation properties. We have worked on some benchmarks of our own, which are focused on sponges, so on the sponge-based permutations that were submitted to the competition. I'm going to talk a bit more here about the benchmarks that have been done by others and submitted to the lightweight crypto mailing list, on various microcontrollers. On ESP32, this is the ranking of the encryption algorithms. If you want to check the details, you can just pause or visit the address. So you have ESP32, ARM M3 and AVR. What you always have is that a variant of COMET is one of the fastest, but COMET does not provide hashing, it's just encryption.
The AEAD Saturnin-Short is pretty good, but it's only for smaller messages because of how it was built. And Schwaemm, even the high-security variants of Schwaemm, are right behind. In the case of AVR, you can see that our 248-bit secure variant is above pretty much all the candidates that are not other variants of Schwaemm or COMET. And you can clearly see that we have a nice trade-off between the speed and the security level. So we have good high-security variants and a nice trade-off between performance and security.

For hashing, same source, on ESP32, ARM M3 and AVR. In this case, for 32-bit platforms, you have Esch256 and Xoodyak, which are fighting each other at the top in terms of speed. On 8-bit AVR, however, Esch is the fastest by far on 16-byte messages. It's much, much faster in this case: it's three times faster than BLAKE2, if I remember correctly. And again, our high-security variant is not that far behind in the ranking.

So, in conclusion, this was our goal, and we managed. We have strong security arguments thanks to the combination of the properties of Alzette, which we understand really well (but that's another paper), and the long trail strategy. We have a nice implementation on microcontrollers. It's very easy to implement: as you could see, on one slide you can fit all the permutations, big and slim, for all the block sizes. We can in fact have a small code size and a high speed, as was confirmed by benchmarks done by other people. And we do provide, and that's something which is quite unique to Sparkle, trade-offs between code size, speed and security, so you can pick and choose the best variant for whatever your need is. And on this very nice note, I'm going to join the other Mr. Sparkle. Thank you for your attention.