Okay, welcome to my presentation, "Dumbo, Jumbo, and Delirium: Parallel Authenticated Encryption for the Lightweight Circus". My name is Bart Mennink, and this is joint work with Tim Beyne, Yu Long Chen, and Christoph Dobraunig. The presentation is in fact about Elephant, which is a submission to the NIST lightweight crypto competition.
For the design of a new authenticated encryption scheme, the first question this raises is: what is authenticated encryption? Well, in a nutshell, it's a cryptographic scheme that provides both encryption of data, meaning that no outsider can learn anything about the data, and authentication of data, meaning that no outsider can manipulate encrypted data. Mathematically seen, authenticated encryption is just a deterministic function. It gets as input a key, it gets as input associated data and a message, and it gets as input a nonce. The message is encrypted to obtain a ciphertext. The associated data and the message are authenticated with a tag T, and the nonce is there to randomize the scheme in such a way that if you authenticate and encrypt the same associated data and message twice, you still get independent-looking ciphertext and tag. This nonce is optional.
Associated with this function is an authenticated decryption function, AD. The AD function gets as input a key, associated data, ciphertext, and tag, and the nonce that was used for encryption. The goal is to verify whether the tag is correct and, if this is the case, to return the original message. So it needs to satisfy that if the tag is correct, the original message is disclosed, and if the tag is incorrect, nothing about the message is leaked.
And finally, this function needs to be correct, of course, which means that if you encrypt a message using a certain nonce and associated data, and then you decrypt it again with the same nonce and associated data, you should get the original message back.
The NIST lightweight competition is about lightweight authenticated encryption schemes. And when we talk about lightweight authenticated encryption, there are various things we can optimize for, various choices we have to make. For instance, what is the best cryptographic primitive we can take for building our lightweight authenticated encryption scheme? And beyond this primitive, how do we use it in a nice mode to get a secure scheme? Do we want to target a scheme that is very easy to parallelize, or one that has a minimal state size? Do we focus on hardware-oriented designs, or designs that can be efficiently implemented in software? Do we want an efficient nonce-based scheme, or a scheme that is secure even against nonce reuse, or even in the release-of-unverified-plaintext setting? There are various design decisions that we can make, and the goal that we set for lightweight crypto kind of determines the scheme we want to make.
For Elephant, our goal was to minimize the state size and the complexity of the design, so getting as simple a design as possible, while still meeting the expected security strength of 2^112 and the limit on the online complexity of 2^50 bytes as prescribed by the NIST call for proposals. So this is our goal: a scheme with a minimal state size that is as simple as possible.
The first decision we have to make then is which cryptographic primitive to use. We know many popular primitives, like tweakable block ciphers, block ciphers, and permutations. But all of them have their differences.
If you start with a tweakable block cipher, it is basically a function with a data path: some data goes in and some ciphertext comes out. But there is also data coming in from the side: a key coming in from the side and a tweak coming in from the side. And that of course needs to be done in a secure way as well. Likewise, for a block cipher you also have a data path, with a message going in and a ciphertext going out, and there is again a key coming in from the side. So again you need to be careful with constructing this primitive. Compare this with a cryptographic permutation, which is the simplest possible primitive: we don't even have a traffic sign like this. We just have an input to the permutation that is cryptographically transformed into an output. If you think about it from this aspect, the permutation is clearly the best choice. In fact it's a highway; it's the most efficient way of building a cryptographic primitive.
Now that we have fixed our primitive, namely a cryptographic permutation, the next choice is which mode to use. The established mode for permutation-based crypto is the duplex or the sponge. You have a permutation, you have some input going into the state, you permute, you get some output, and so on. It's a very popular approach, especially for lightweight hashing: many lightweight authenticated encryption schemes and lightweight hash functions are based on the sponge. But it is very sequential. And if you want to go for a minimal state size, it turns out that a parallelized approach is easier. So our approach is a parallel evaluation of the permutation. We have a permutation that is evaluated multiple times in parallel: inputs 1, 2, and 3 in this picture go into the permutation and you get some outputs. And the input and the output are masked with a certain mask. Of course, this requires a proper masking.
In addition, we only want to evaluate the permutation in the forward direction, so this also requires a proper mode of use. And if we do so, looking ahead, it turns out that we can minimize the permutation size, even down to a smaller size than what you can achieve with the sponge while meeting the NIST security parameters.
For the masking, we use a simplified version of the masked Even-Mansour construction of Granger et al. The idea is that you take a fixed LFSR phi_1 and define phi_2 to be phi_1 plus the identity. The masking gets as input the key and two counters a and b, and it works as follows. First you compute a subkey from the key: the key is padded with zeros and then permuted. Then you evaluate phi_1 a times and phi_2 b times on it, that is, phi_1^a phi_2^b. The counters a and b will be used to indicate the location of the mask in the scheme; I will come back to this later. The advantage of this masking is that it is constant time, very efficient, and very simple to implement. It's more efficient than the alternatives.
Now that we've set the primitive and the mask, we can go to the mode. This is the Elephant authenticated encryption mode. On the top left, I copied the definition of the mask from the previous slide. I will go through the picture from top to bottom. At the top, we see the encryption. The encryption gets as input a nonce. This nonce is padded with zeros and just plainly fed to all permutation calls, so without a counter yet. The counter is in the mask, just like the key. The counter (a, b) in the mask takes the values (0, 0), (1, 0), (2, 0), up to (l_M - 1, 0), where l_M is the number of message blocks. So the counter value a ranges from 0 to l_M - 1, and b we keep at 0. This gives a keystream, which is added to the message to get the ciphertext.
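The mask and the encryption step just described can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `toy_permutation` is a hypothetical stand-in with no security (it is not Spongent or Keccak), `phi1` is a toy LFSR whose feedback constant is made up and not claimed to have maximal length, and the bit and byte ordering used for padding is an arbitrary choice rather than the one in the Elephant specification.

```python
N_BITS = 160                        # permutation size n, as for Dumbo
N_BYTES = N_BITS // 8
MASK_N = (1 << N_BITS) - 1
KEY_BITS, NONCE_BITS = 128, 96
FEEDBACK = 0x2D                     # hypothetical feedback taps, NOT from the Elephant spec


def toy_permutation(x: int) -> int:
    """Stand-in for P: a bijection on n-bit integers with no security whatsoever."""
    for r in range(4):
        x = (x * 0x9E3779B97F4A7C15 + r + 1) & MASK_N    # odd multiplier plus constant
        x ^= x >> 67                                      # xorshift
        x = ((x << 13) | (x >> (N_BITS - 13))) & MASK_N   # rotation
    return x


def phi1(x: int) -> int:
    """Toy LFSR step, linear over GF(2); stand-in for the real phi_1."""
    carry = x >> (N_BITS - 1)
    x = (x << 1) & MASK_N
    return x ^ FEEDBACK if carry else x


def phi2(x: int) -> int:
    """phi_2 = phi_1 + identity, where + is XOR."""
    return phi1(x) ^ x


def mask(key: int, a: int, b: int) -> int:
    """mask(a, b) = phi_1^a phi_2^b applied to the subkey P(K || 0...0)."""
    m = toy_permutation(key << (N_BITS - KEY_BITS))
    for _ in range(a):
        m = phi1(m)
    for _ in range(b):
        m = phi2(m)
    return m


def encrypt(key: int, nonce: int, message: bytes) -> bytes:
    """Keystream encryption with masks (i, 0); applying it again decrypts."""
    x = nonce << (N_BITS - NONCE_BITS)          # N || 0...0, same input for every call
    out = bytearray()
    for i in range(0, len(message), N_BYTES):
        m = mask(key, i // N_BYTES, 0)          # counter (a, b) = (block index, 0)
        keystream = (toy_permutation(x ^ m) ^ m).to_bytes(N_BYTES, "big")
        out += bytes(c ^ k for c, k in zip(message[i:i + N_BYTES], keystream))
    return bytes(out)


# Example: encrypting twice with the same key and nonce recovers the message.
key = 0x000102030405060708090A0B0C0D0E0F
nonce = 0x0102030405060708090A0B0C
plaintext = b"hello, lightweight circus! " * 3
ciphertext = encrypt(key, nonce, plaintext)
assert encrypt(key, nonce, ciphertext) == plaintext
```

Note that each block depends only on its own counter value, so the loop iterations are independent and could run in parallel, and that the permutation is only ever called in the forward direction.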
Now for the authentication part, we take the following input. We get the nonce and the associated data, which are not encrypted but only authenticated. We concatenate them, append a 1, and pad the result into an integral number of n-bit blocks A_1 up to A_{l_A}. For the ciphertext, we take the ciphertext coming from encryption, we append a 1, and then we pad it into an integral number of n-bit ciphertext blocks. And then we do the same procedure, basically: the inputs are fed through a permutation with a mask. The mask is slightly different now. For the ciphertext blocks, the mask counter (a, b) runs from (0, 1) up to (l_C - 1, 1), where l_C is the number of ciphertext blocks. So the value a in the mask still serves as a counter, counting from 0 to l_C - 1, but now b is 1. For encryption it was 0; here it is 1. This is done for domain separation. Likewise, for the associated data we have a running as a counter, and b is 2. This also gives domain separation; otherwise, you can trivially attack the scheme. Anyway, all these blocks are fed through the primitive, the outputs are all added, and the sum is truncated to get a tag of t bits.
The mode at a high level can be seen as an encrypt-then-MAC design. This is known as the most secure variant, because it gives ciphertext integrity: you don't authenticate the message, but instead you authenticate the ciphertext. At a slightly lower level, encryption is counter mode, which allows for inverse-free decryption, so you don't need P inverse. The authentication can be seen as a Wegman-Carter-Shoup authenticator. The entire Elephant authenticated encryption mode is fully parallelizable. It uses only a single permutation P, and it uses P only in the forward direction.
The masking is chosen in such a way that it can be efficiently updated. To see this, suppose we have computed a mask, for instance mask(0, 0), or in this case mask(i - 1, 0).
Then the next mask, mask(i, 0), can be computed simply by evaluating phi_1 on that mask. There is another feature in the masking: if you encrypt vertically, so blockwise, meaning you encrypt to get ciphertext block C_1 and then authenticate C_1 on the fly, then the two masks mask(i - 1, 0) and mask(i - 1, 1) add up to mask(i, 0). This might be useful as well.
For security, we prove that the Elephant mode is secure up to about 4 * sigma * p / 2^n, where n is the permutation size. We consider any adversary with online complexity sigma, which counts the number of associated data and plaintext blocks, and offline complexity p, which counts the number of primitive calls. This proof is given under the assumption that P is a random permutation, that phi_1 has maximal length, and that, in addition, this condition holds: there are no collisions between the masks mask(a, b) and mask(a', b') for any two distinct pairs that occur in the scheme. The adversary is assumed to be nonce-respecting. Most importantly, this proves that we can instantiate Elephant with a 160-bit permutation and still meet the parameters required by the NIST lightweight call. This is not possible with a sponge.
This brings me to our instantiations. We have three instantiations of Elephant. Dumbo is the first one: Dumbo is Elephant with the 160-bit Spongent permutation. This is the minimalist design. It achieves security up to a time complexity of 2^112 and a data complexity of 2^46 blocks, which roughly corresponds to the 2^50 bytes prescribed by the NIST call. Then we have Jumbo, which is a bigger variant. It uses the 176-bit Spongent permutation and is a more conservative design. It achieves security up to 2^127 queries and a data complexity of 2^46 blocks.
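As a rough sanity check of how the Dumbo numbers fit together, one can plug them into the bound as stated in the talk. The values n = 160, sigma = 2^46 blocks, and p = 2^112 are the ones quoted above; the block size of 20 bytes is simply n/8, and the full bound in the paper may contain further terms that this back-of-the-envelope calculation ignores.

```latex
\frac{4\,\sigma\,p}{2^{n}}
  = \frac{2^{2}\cdot 2^{46}\cdot 2^{112}}{2^{160}}
  = 1,
\qquad
2^{46}\ \text{blocks} \times 20\ \tfrac{\text{bytes}}{\text{block}}
  = 2^{46 + \log_2 20}\ \text{bytes}
  \approx 2^{50.3}\ \text{bytes}.
```

So the bound only becomes void when the adversary reaches both limits at once, and 2^46 blocks of 160 bits indeed sit just above the 2^50-byte mark.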
And I would like to note that the permutation used in Jumbo is ISO/IEC standardized. Finally, we have Delirium. Delirium is instantiated with Keccak-f[200], and this gives a high-security variant: we get security up to a time complexity of 2^127 and a data complexity of 2^70. Keccak-f[200] is specified in the NIST standard. It is apparently not NIST standardized as such, but it is specified in the NIST standard.
Before proceeding, I would like to mention that originally there were pictures here of Dumbo, the elephant from the famous movie, Jumbo, the big elephant, and Delirium, the logo of the Belgian beer. But I replaced them with self-made pictures just to avoid any copyright issues.
Here we have some more details. I will not go over all of them, but what we see is that we have three instances, Dumbo, Jumbo, and Delirium. All have a key of 128 bits and a nonce of 96 bits; the permutation sizes are 160, 176, and 200 bits, and the tag sizes are 64, 64, and 128 bits, respectively. Here we see the details of the primitive that is used. But most important on this slide is that we use three different LFSRs for the three instances: phi_Dumbo, phi_Jumbo, and phi_Delirium. These three functions all operate on 8-bit words and have been chosen to fit the permutation that is actually used. In addition, all of them turn out to have maximal length and to satisfy this property. So all three functions satisfy the criteria that we need for the proof to hold.
Okay, so far this was about our NIST submission, Elephant version 1. If Elephant goes to the third round of the competition, we are planning to propose a tweak to the scheme. The tweak is very subtle, very minimal, and in a nutshell it corresponds to these two changes.
One of the changes is that the permutation evaluation that was here for the first associated data block A_1 is removed and moved to the end. So it's just a move of the permutation from one side to the other. The important thing here is that we move from a Wegman-Carter-Shoup authenticator to a protected counter sum authenticator, which gives better security guarantees. The second change is a reshuffle of the mask parameters, in particular the value b in the masks. Now, for encryption b is 1, for ciphertext b is 2, and for associated data b is 0. It used to be 0 for encryption, 1 for ciphertext, and 2 for associated data. We have done this just for aesthetic reasons: it looks nicer, and it is slightly nicer to implement if we have mask(0, 0) here.
The good thing is that version 2 would retain all the good properties of version 1. In addition, we use a protected counter sum authenticator, which means that we get authenticity even under nonce reuse. So in a nutshell, on the left side of the table we see what we proved for the original Elephant: confidentiality and authenticity in the nonce-respecting setting. Elephant version 2 would achieve the same security goals, but in addition it would achieve authenticity under nonce misuse.
A couple of notes on implementations. Implementations of Elephant can, and in fact should, exploit parallelism. So far this was only done by the Delirium implementation recently made by Campos et al. Currently we are working on new parallel reference implementations for Delirium, where we use an alteration of the Keccak-f[1600] implementation to process up to eight blocks in parallel. This gives quite some speed-up, but I would like to mention that this is still work in progress. More information can be found on GitHub.
For Dumbo and Jumbo, I would like to mention, particularly for software implementations, that it might be interesting to investigate how we can improve the implementations with the bit-slicing techniques that have recently been developed for PRESENT and GIFT.
To conclude, we presented Elephant, a parallelizable lightweight authenticated encryption scheme with a very small state. The mode is provably secure in the random permutation model. The primitives are standardized and well studied. Dumbo and Jumbo are designed more with hardware in mind. Delirium is good in both hardware and software, but it was designed originally mostly with software in mind. That concludes my presentation. I'd like to thank you for your attention.