My name is Seth Hoffert, and together with my co-authors Joan, Michaël, Gilles and Ronny, we will be presenting Xoodyak today.

An outline of what we'll talk about: first I will talk about what Xoodyak is and what we can do with it, then Gilles will discuss the security claims, Joan will talk about the Xoodoo permutation itself, and finally Michaël will talk about implementing Xoodyak.

So first of all, what is Xoodyak exactly? Xoodyak is a rich API operating on a stateful object. This API allows everything from hashing to session-based authenticated encryption to be implemented. Xoodyak is suitable for embedded devices and is also designed with side-channel attacks in mind. Xoodyak implements the Cyclist mode on top of the 12-round Xoodoo permutation.

So what can we actually do with Xoodyak? Xoodyak is an object with a persistent state. You can initialize it in either an unkeyed or a keyed mode. Examples of unkeyed modes are an extendable output function or a hash function; keyed modes include a stream cipher, a message authentication code, or an authenticated encryption scheme. The idea is that you absorb one or more arbitrary-length byte strings into a persistent state, and this state really represents the digest of all the strings absorbed thus far. We refer to this digest as the history. The object takes care of properly domain-separating all the strings and allows you to optionally combine absorption with an encryption or decryption operation. Once all of the input strings have been absorbed, you then squeeze an arbitrary-length byte string, and you can also ratchet for forward security.

So let's take a look at an unkeyed example. This example implements an extendable output function.
You can see we start by initializing the Cyclist object with no key. We then absorb an arbitrary byte string X, and then we squeeze an n-byte output, and this n-byte output represents the hash of string X.

What about something a little bit more complex? What about multiple strings, for example? Again, we initialize Cyclist with no key. We then absorb strings X1, X2 and X3, and then squeeze n bytes of output. This time the output represents the hash of the tuple of strings X1, X2 and X3, all properly domain-separated.

But what about a keyed use case? What about something like session-based authenticated encryption? Here we initialize the Cyclist object with key K and an optional ID, and this ID serves as a diversifier. Alternatively, you can absorb a nonce string separately, as shown here. We then proceed to handle our first message, which in this case is composed of a metadata string A1 and a plaintext string P1. You can see we are absorbing the metadata and then encrypting plaintext P1, obtaining the ciphertext C1. We then squeeze the tag T1, output the ciphertext and tag, and then wait for the next message to be processed.

In this example the next message looks much the same. We have a metadata string and a plaintext string. We absorb the metadata, encrypt the plaintext, produce the authentication tag, and then output the ciphertext C2 and the tag T2 and wait for the next message to come in.

We are able to handle metadata-only scenarios very efficiently, as you can see here. We absorb metadata string A3, we have no plaintext, and then we immediately squeeze tag T3. Likewise, we can handle plaintext-only scenarios as well.
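The session flow described here can be sketched in code. What follows is a toy stand-in, not Xoodyak or Cyclist: it replaces the Xoodoo-based duplex with SHAKE-256 from Python's hashlib, and the class name `ToySession`, its method names, and the label-based domain separation are all assumptions made for illustration. It only shows how each ciphertext and tag depends on the whole history of the session.

```python
import hashlib
import hmac


class ToySession:
    """Toy stateful object with absorb/encrypt/decrypt/squeeze calls.

    NOT Xoodyak: SHAKE-256 stands in for the Cyclist mode, and every
    name and encoding below is hypothetical.
    """

    def __init__(self, key: bytes, session_id: bytes = b"") -> None:
        self._history = hashlib.shake_256()
        self._feed(b"key", key)
        self._feed(b"id", session_id)

    def _feed(self, label: bytes, data: bytes) -> None:
        # A label plus a length prefix keeps the absorbed strings
        # domain-separated from one another.
        header = bytes([len(label)]) + label + len(data).to_bytes(8, "big")
        self._history.update(header + data)

    def _output(self, label: bytes, n: int) -> bytes:
        # Derive output from a copy so the running history keeps going.
        fork = self._history.copy()
        fork.update(bytes([len(label)]) + label)
        return fork.digest(n)

    def absorb(self, data: bytes) -> None:
        self._feed(b"absorb", data)

    def encrypt(self, plaintext: bytes) -> bytes:
        keystream = self._output(b"keystream", len(plaintext))
        self._feed(b"crypt", plaintext)  # the plaintext joins the history
        return bytes(p ^ k for p, k in zip(plaintext, keystream))

    def decrypt(self, ciphertext: bytes) -> bytes:
        keystream = self._output(b"keystream", len(ciphertext))
        plaintext = bytes(c ^ k for c, k in zip(ciphertext, keystream))
        self._feed(b"crypt", plaintext)
        return plaintext

    def squeeze(self, n: int) -> bytes:
        out = self._output(b"squeeze", n)
        self._feed(b"squeezed", n.to_bytes(8, "big"))  # move the state on
        return out


# Both sides start from the same key and ID (the diversifier).
sender = ToySession(b"K" * 16, b"session-1")
receiver = ToySession(b"K" * 16, b"session-1")

# Message 1: metadata A1 and plaintext P1.
sender.absorb(b"A1")
c1 = sender.encrypt(b"P1: hello")
t1 = sender.squeeze(16)

receiver.absorb(b"A1")
p1 = receiver.decrypt(c1)
ok1 = hmac.compare_digest(receiver.squeeze(16), t1)

# Metadata-only message: absorb, then squeeze the tag straight away.
sender.absorb(b"A3")
t3 = sender.squeeze(16)
```

Because the tag `t3` is squeezed from a state that already contains message 1, a session that had skipped message 1 would produce a different tag for the same metadata.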
In the plaintext-only case, we encrypt plaintext P4, obtaining ciphertext C4, and then squeeze tag T4, returning as output ciphertext C4 and tag T4. We repeat this process for as many messages as we wish to handle.

Xoodyak provides multiple ways of mitigating side-channel attacks. One such way is through the use of sessions. If you consider the keyed mode of a session-based authenticated encryption scheme, the secret state is a moving target: with each string that you absorb, the state changes. There is no long-lived key. In addition, the Cyclist object supports an optional counter parameter. This counter parameter serves as a nonce, except that it is absorbed in a special way: it is absorbed in a trickled manner, and this trickling reduces the degrees of freedom that the attacker has. We also support ratcheting and rolling subkeys.

Let's take a look at how rolling subkeys might work. In this example, we initialize a Xoodyak instance with key K_i, but right after that we squeeze the next key K_(i+1) and replace key K_i with it. We can then perform an optional ratchet, and then we can, for example, absorb metadata A_i, encrypt plaintext P_i obtaining ciphertext C_i, squeeze the authentication tag T_i, and then output the pair of ciphertext and tag and wait for the next message. We increment the counter and repeat this process as many times as needed, each time squeezing a new key.

And now Gilles will discuss the security claims.

Thank you, Seth. So indeed, I'm going to talk about the security claims we made on Xoodyak. Overall we wish to have 128 bits of security. For the underlying permutation, Xoodoo with 12 rounds, we assume it's strong enough so that Xoodyak can reach the generic security level, that is, the generic security of the Cyclist mode. So let me now discuss the generic security of Cyclist. A Cyclist object can be initialized in either unkeyed mode, for hashing, or keyed mode. In unkeyed mode, Cyclist is actually equivalent to the sponge construction.
So with 256 bits of capacity, we can reach a security level of 128 bits against any attack, unless the attack is easier against a random oracle.

When switched to keyed mode, Cyclist reduces to the security of the full-state keyed duplex construction. This construction is similar to the original duplex construction, the main difference being that input blocks can now be XORed into the entire state. Output blocks are still taken only from the first r bits, the outer part of the state, but nevertheless the input blocks can modify the entire state of the object. Another difference is that the input blocks can overwrite the first r bits instead of being XORed.

The analysis of this construction was done in these two papers. The assumption we need to make on the permutation is not that it behaves like an ideal permutation. A sufficient condition is that the permutation behaves as a PRP, a pseudorandom permutation, when at the input and the output the last c bits are blinded with some secret value; this should be indistinguishable from a random permutation. A similar condition applies to what happens at initialization, when the secret key is absorbed together with the first bits.

For Xoodyak, what we did is simply to write down the security bound we have on the full-state keyed duplex construction and adapt it to the parameters of Xoodyak. The bound looks a bit complicated, so let me illustrate it with a couple of use cases.

The first use case is authenticated encryption. In this case, I'm just going to focus on the nominal behavior, that is, the mode uses a nonce.
There is no nonce repetition and no leakage of unverified decrypted ciphertext. Also, there is only a single secret key, and the secret key has kappa bits, with kappa up to 192. We look at the security level for integrity and confidentiality, and we divide the security into the time complexity (or computational complexity, or offline complexity) on the one hand and the data complexity (or online complexity) on the other hand. You can see from these expressions that, for a 128-bit secret key, we can reach the 128-bit security level for both offline and online complexities. We can even go higher than 128 bits if we use a larger key.

So what happens in the case of nonce misuse? For the first block in which two plaintexts differ, you will be able to see the difference, and after that the keystreams will diverge. This is the same behavior as for any other duplex-based authenticated encryption mode.

Now let's assume that there is no nonce, or there is massive nonce misuse, or there is release of unverified decrypted ciphertexts. In that case the adversary has more degrees of freedom. Nevertheless, the key should still be protected and integrity should still be guaranteed. However, given these extra degrees of freedom, the security bound decreases somewhat, but not dramatically: in terms of offline complexity we can still reach 128 bits of security, and for data complexity 2 to the power 64 blocks.

Another use case to illustrate the bound is multi-target attacks. Multi-target attacks are accounted for in this term: q_IV times N over 2 to the power kappa. Here kappa is again the key size in bits, N is the time complexity, and q_IV is the number of keys for which we use the same IV. So clearly q_IV is upper bounded by u, the number of users, and the security can degrade by at most log2(u) bits. What we can do to compensate is to increase the key size by log2(u) bits, and if we do that, we restore the security level we had for single-target attacks. Another way to address multi-
target attacks using Xoodyak is to enforce the use of a unique key identifier. Each key has an ID, and this ID is absorbed at initialization. If we do that, q_IV is guaranteed to always be one, so there is no degradation of security against multi-target attacks and no need to increase the key size. So that's all I wanted to say. Next, Joan is going to talk about the Xoodoo permutation.

Thank you, Gilles. We made the Xoodoo permutation public three years ago at the ECC workshop here in Nijmegen, in November 2017 exactly. The Xoodoo permutation was inspired and triggered by that of Gimli. Gimli was designed by a team of 11 people; I will not read their names, that's too many of them. Gimli did a very good job at being efficient on a wide range of platforms, and that was because of the size of its state, namely 384 bits, which you can encode as twelve 32-bit words. We found that a very good idea, because in Keccak-f we only had Keccak-f[400] that came near this, but it had a native word size of 16 bits and was not so suited for 32-bit CPUs. So we decided to build a permutation that has a state like that of Gimli. But we were not such big fans of the round function of Gimli, so we built a round function that's much closer to that of Keccak, in particular the round function of Keccak-f, the permutation underlying Keccak.

Look at the state of Xoodoo at the bottom of the slide.
We see that it consists of three so-called planes, parallel horizontal planes, and each plane is a rectangular array of 4 by 32 bits. The figure shows kind of a toy version with 8 instead of 32, but you get the idea. You can also see it as a rectangular array of columns, where each column consists of three bits and each bit of a column is in one plane.

The Xoodoo round function operates on this state and has five steps. The nonlinear step is chi, and it's the same chi that we find in Keccak-f or in Subterranean, but in this case it operates on three-bit units, and these three-bit units are the columns. So you can consider it as the parallel application of three-bit S-boxes on the 128 columns.

Theta, the mixing layer, is a column parity mixer, also similar to that of Keccak-f. First we compute the parities of all columns and put them in a plane of the same shape as the other planes, called P. Then we build the so-called theta-effect plane by taking two copies of the parity plane and shifting them over different offsets, and then we add it to the three planes of the state.

So you see that chi and theta make bits within a column interact a lot, and not so much outside the columns. To break this column alignment, we put in between theta and chi a bit transposition that moves the bits of a column to different columns, and that's rho-west. Basically, we do this by just shifting plane 1 and plane 2 over different offsets, and that's comparable to the composition of rho and pi in Keccak-f. Then we do the same between chi and the theta of the next round, and that's rho-east, where we do a similar kind of operation; that's something we do not have in Keccak-f. Finally, we add a round constant just before chi to kill the symmetry.

To evaluate the cryptographic strength of the permutations that come from this round function, we look at trail bounds, and trail bounds are relevant in
differential cryptanalysis and linear cryptanalysis, but they also say something about the general diffusion. What's relevant in differential cryptanalysis is the maximum differential probability (DP) over all possible differentials, defined by differences delta-in and delta-out, and in linear cryptanalysis it is the maximum correlation over all possible linear approximations, defined by masks u-in and u-out. In Xoodoo, for a limited number of rounds, we actually claim that the maximum DP of a differential is closely approximated by the maximum DP over the trails, and this is much easier to investigate, because finding it for differentials is quite hard. It's similar in linear cryptanalysis, where we can look at linear trails. What we report on is not the DP or the correlation but the weight, where the weight is minus the log base 2 of the DP: basically, if the DP is 2 to the power minus x, then the weight is x.

So now let's take a look at what the bounds are for trails over a number of rounds of Xoodoo. We compare with Keccak-f[400], the member of the Keccak-f permutation family that is closest to Xoodoo in size, actually quite close. We see already that for three rounds Xoodoo performs quite a lot better: it has minimum weight 36, while Keccak-f[400] only has minimum weight 24, both for linear and differential trails. So that's 50% better.

If you look at four rounds, we see two numbers, and we know that the minimum weight is somewhere in that interval. For instance, for Keccak-f[400] for four rounds we see the number 48, which means that we scanned the whole space up to weight 48 and did not find any trails. But we did find a trail of weight 63, so the minimum cannot be larger than that; it is somewhere in that interval. We see immediately that in Xoodoo these numbers are quite a lot higher, and for linear trails the difference is even bigger. You see this trend go on for five, six and twelve rounds, where for these numbers of rounds we don't have tight bounds.
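The relation between differential probability and weight is simple arithmetic and can be checked in a few lines of Python. The helper names below are made up for illustration; the weights 36 and 24 are the three-round figures quoted in the talk, and the snippet covers only the differential case described above.

```python
import math


def weight_from_dp(dp: float) -> float:
    """Weight of a differential trail: minus log2 of its probability."""
    return -math.log2(dp)


def dp_from_weight(weight: float) -> float:
    """Inverse relation: a trail of weight w has probability 2**-w."""
    return 2.0 ** -weight


# Three-round minimum trail weights as quoted in the talk.
xoodoo_3_rounds = 36
keccak_f400_3_rounds = 24

# "50% better" refers to the ratio of the weights.
weight_ratio = xoodoo_3_rounds / keccak_f400_3_rounds

# In terms of probability, the best Xoodoo trail is 2**12 times less likely.
dp_ratio = dp_from_weight(keccak_f400_3_rounds) / dp_from_weight(xoodoo_3_rounds)
```

A weight interval such as the four-round one for Keccak-f[400] reads the same way: the lower end is the weight up to which the whole space was scanned, and the upper end is the weight of the best trail actually found.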
For those round counts we know a lower bound on the minimum weight, but we don't know exactly what it is. But we see that we can investigate Xoodoo quite a lot further into the weight scale than Keccak. So you can see that, with respect to trail bounds, Xoodoo is a great improvement over Keccak-f[400].

Let's now take a look at third-party cryptanalysis. There is not so much third-party cryptanalysis of Xoodoo and Xoodyak, which is quite understandable because they have only been published recently. But there is a lot of third-party cryptanalysis of Keccak and keyed Keccak. Actually, it started in 2009, and we report on all the attacks on a page on our website; I invite you to have a look. The strongest attacks are cube attacks that exploit the low algebraic degree, because if you have r rounds of the Keccak-f round function, then you only have degree 2 to the power r. Most of these attacks can be transposed easily to Xoodoo and Xoodyak: you can look at the details of an attack and see that it kind of transposes to Xoodoo and Xoodyak. This already gives a lot of understanding of what we can expect.

There have been two papers with dedicated attacks on Xoodoo and Xoodyak. One is by Song and Guo and was published in 2018, and the other is by Zhou et al. in 2019, so quite recently. Both attacks address Xoodyak with Xoodoo reduced to six rounds instead of the nominal 12 rounds. What we see for the complexities of the attacks is what we would expect, or at least what we were expecting, so there are no surprises yet. So we think that the security of Xoodyak is well understood; we have that impression. But of course all cryptanalysis remains welcome. That's it for Xoodoo and the cryptanalysis. Now I give the word to Michaël, who will explain about implementing Xoodyak.

Thank you, Joan.
I'm now going to talk about the implementation aspects of Xoodyak. But before giving performance figures of Xoodyak in software and hardware, I will give you some advantages that we think are quite unique to Xoodyak when it comes to lightweight applications.

When you design a lightweight application, it doesn't make sense to look, for instance, at the cost of a single primitive or a single service; instead you have to look at the cost of your application as a whole. One big advantage of using Cyclist, the mode that Xoodyak uses, is that it's a mode that provides all the symmetric cryptographic services that you would need. It provides not only authenticated encryption but also hashing, it has full support for sessions, you can derive keys, and so on. Also, we designed Cyclist to be as compact as possible, and you can implement all use cases with very little overhead. In our software implementation you can implement all the use cases that Seth mentioned in the first part while using only 52 bytes of RAM. This means that we don't even need a message queue, for instance, thanks to the streaming mode of Cyclist. You can compare these 52 bytes to, for instance, most AES modes, where you would need at least 48 bytes of RAM for the mode part alone.

Another advantage of Xoodyak in lightweight is that the underlying primitive, Xoodoo, can be efficiently implemented. Not only does the state fit nicely in 12 registers of any 32-bit CPU, but the round function also has a lot of symmetry, which enables circuit reuse, and it uses only efficient logical operations, which is very interesting in hardware.

Finally, another advantage of Xoodyak in lightweight comes from its security design. Even though Xoodyak targets lightweight applications, it also provides a security level of 128 bits everywhere.
It doesn't make sense to target this security level without also taking into account side-channel attacks, and there Xoodyak not only allows for very efficient side-channel countermeasures, thanks to the primitive Xoodoo, but also in the mode we have designed several features that make it stronger against side-channel attacks. For instance, I'm thinking about the fact that in Xoodyak the secret, so the key or the secret state, is constantly updated, which makes it a moving target and hence much more difficult to extract through a side channel.

Now let's have a look at some performance figures in software. We've implemented Xoodyak on two platforms, the Cortex-M0 and the Cortex-M3. On the M0 we focused on code size, so we implemented only one round in a loop, and on the M3 we focused on speed, so we unrolled the 12 rounds in a single loop. At the top of the table you can see the cost in cycles per byte in hash mode, and there you see that for absorb and squeeze the cost is around 145 cycles per byte on the M0 and 40 cycles per byte on the M3. Of course, part of the difference comes from the difference between the implementations, but the main difference comes from the fact that on the M3 all the logical operations can be combined for free with rotations, meaning that all the shifts in the round function of Xoodoo can be implemented for free. So, for instance, rho-west and rho-east can be done for free. One limitation of the M0 that explains the difference is that, although it has the same number of registers, the logical instructions only work on the first eight registers of the CPU, meaning that you have an overhead due to shuffling of registers.

If you move to keyed mode, you get much better performance, and there the difference in performance comes directly from the difference in rate in keyed mode compared to hash mode. Of course, you achieve the best performance in absorb, because there you get full-state absorption, so the rate is maximal, but also when
you consider squeeze, we can use a bigger rate, because we are in keyed mode, and in keyed mode we can use a bigger rate while still targeting the same security level. For instance, for absorb you see we can decrease the cost to 48 cycles per byte on the M0 and down to 40 cycles per byte on the M3. As a matter of comparison, we also give the performance figures of AES-128 in counter mode from the two references below, and there you see that the comparison is strongly in favor of Xoodyak. You can get our software implementation from our repository, XKCP, for which we give a URL at the end of this presentation.

Now let's have a look at performance in hardware. Silvia Mella implemented Xoodyak in hardware in a 14 nanometer technology, and in this case we give only the figures for the architecture with one round per cycle. If you want more figures, you can have a look at the update paper that we submitted to the second round of the lightweight cryptography competition run by NIST. In this table we list the performance in increasing order of area. For the smallest area, about 8.2 kilo gate equivalents, you can achieve a clock frequency of 200 megahertz, and at that clock frequency you get a throughput of 2.7 gigabits per second for authenticated encryption mode and 1.3 gigabits per second for hashing. For a slight increase in area, you can increase the frequency, and the throughput you get is directly linear in the frequency increase. You can go up to a frequency of 600 megahertz, where you can go as high as 8.1 gigabits per second of throughput in authenticated encryption.

So yeah, that's it. Thank you for your attention. Please have a look at the paper.
There is much more detail in the paper, and also the complete specification of Xoodyak, and you will see that it is very compact and clean. If you have any questions, you can ask them during the FSE sessions on November 9, but you can also send an email to our address, xoodyak at keccak.team. Please have a look at our GitHub repositories in the description, XKCP and Xoodoo, for the software and other implementations. We also give the URL of our homepage. Thank you. Bye.