Good morning, everyone. Welcome to this security and encryption track. I will indeed talk about some cryptographic functions based on Keccak, and everything I will say is actually joint work with my colleagues Guido, Joan, Michaël, and Ronny.

So this is the outline of my presentation. We start with some introduction. Then I will discuss the security properties of symmetric cryptography. The core of the presentation is divided into two parts, keyed applications and unkeyed applications, depending on whether or not we need a secret key. Then I'll talk briefly about the Keccak code package and recap the functions with some inventory.

So the story somehow begins some years ago, in the years 2004, 2005, when the most popular hash functions, MD5 and SHA-1, got severely broken. At the time there was also the SHA-2 hash function designed by the NSA, but it was based on the same design principles as MD5 and SHA-1. So there was fear that maybe these attacks would extend to SHA-2, and NIST decided to create a new hash function called SHA-3, but not on their own: they would do that via an open competition. So between 2008 and 2012 there was indeed the SHA-3 competition. It was an open competition in the sense that every design was public, every team of course had to give the rationale of its design and provide an open-source implementation of it, and every team was allowed to try to break and attack any competing design. There was also some effort to evaluate the performance of all these designs. It took the energy of quite a lot of the cryptographic community; many teams from different companies and universities took part in this competition, and you can see in this graph arrows going from a team attacking the design of another team. That was the situation in 2009.

So our candidate was Keccak. Keccak is a sponge function, which is somehow the generalization of a hash function, and it works as in this drawing. There is a state which is initialized to zero, and then the input is cut into blocks. These blocks are processed one after the other, and between these blocks there is this permutation f. This permutation, called Keccak-f, is really the core of Keccak. It's really where the crypto is happening, where most of the time is spent. So this Keccak-f permutation is really, let's say, the common thread of this presentation.

Okay, so then in 2012 the contest ended with the announcement from NIST that Keccak was chosen as the winner of the SHA-3 competition, and it actually took three more years for NIST to publish the FIPS 202 standard, with the SHA-3 hash functions meant as drop-in replacements for the SHA-2 family of hash functions. But they also standardized extendable-output functions, XOFs, called SHAKE128 and SHAKE256, generalizing the concept of hash function to an arbitrary output size. In the meantime, before it was actually standardized by NIST, the 3GPP consortium published a new standard called TUAK, making use of Keccak internally, for applications in SIM cards. And then more recently, in December last year, NIST published some new standardized functions based on Keccak in the special publication SP 800-185. Alongside this standardization process we also worked on other variants of Keccak. Ketje and Keyak are authenticated encryption schemes based on Keccak, which we submitted to the CAESAR competition.
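As a minimal illustration of the difference between a fixed-size digest and an extendable output, here is a short sketch using Python's hashlib, which exposes SHA3-256 and the SHAKE XOFs; the input string is of course just an example.

```python
import hashlib

# SHA3-256: a fixed-size, 32-byte digest that acts as a fingerprint of the input.
print(hashlib.sha3_256(b"some input message").hexdigest())

# SHAKE128: an extendable-output function (XOF), so you choose the output length.
xof = hashlib.shake_128(b"some input message")
print(xof.hexdigest(16))  # 16 bytes of output
print(xof.hexdigest(64))  # 64 bytes; the first 16 bytes are the same as above
```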
So the CAESAR competition is a currently ongoing competition for new authenticated encryption schemes. Last year we proposed a new hash function called KangarooTwelve, and more recently we have been working on a new set of functions called Kravatte; I will discuss this in a moment. The focus of this presentation is really on these new functions.

So let me discuss the security properties. What I want to show you is that in symmetric crypto, security relies on two different parts. One part is the generic security, so the security of the mode, and the other is the security of the primitive. To take Keccak as an example, the mode is the sponge construction. It tells you to initialize the state to zero, it tells you to cut the input into blocks and to absorb them into the state, it tells you to apply the function f, and it tells you how to get the output. That's the mode; that's how it works on top of the permutation. And then, of course, there is the permutation.

Can we say something about this construction? Yes, we can say something if we abstract away the permutation. We replace this function f, this concrete permutation that you can implement in C or any other language, virtually by a random permutation. And if you do that, then you can start a proof: you can prove theorems about the security of the scheme when the function is replaced by a randomly chosen permutation. And you can even derive some theorems. This is a theorem that we proved at Eurocrypt 2008. Don't worry, I'm going to explain what it means.

So first, let me go back one slide. One important parameter of the sponge construction is the capacity, the value c that appears on the left side of this figure. It's the size of the part of the state that is not directly touched by the input or the output blocks, and the security highly depends on this value. What this theorem says is that if, in Keccak, I replace the permutation by a random permutation, then the probability of some attack being successful is upper bounded by N²/2^(c+1), where N is the time spent by the adversary and the time unit is the time needed to compute the permutation f once. So as long as this time N is much smaller than 2^(c/2), this probability is negligible and we have security of the mode, of the sponge construction. If we want to have, for instance, security at the level of 128 bits, then we need to take a capacity of at least 256 bits. But of course, this does not tell us anything when we replace this black permutation here by the concrete Keccak-f.

The nice thing about this generic security is that you can actually reduce the scope of cryptanalysis and focus only on the permutation, because what it tells you is that if there is a problem in Keccak, it should not come from the sponge construction, it should only come from the permutation. But the problem is that when you talk about the security of the permutation, of the primitive, then there is no mathematical proof that it's secure. So the only way to gain some confidence is first to have an open design rationale for the permutation and then to have some cryptanalysis, and preferably peer review, meaning third-party cryptanalysis. And what's even better is a lot of cryptanalysis: if you want to be confident, then you need a sustained cryptanalysis activity on the function, with the function, of course, not being broken.
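To make the sponge mode itself concrete, here is a minimal sketch of absorbing and squeezing with a rate and a capacity. The toy_permutation below is only a stand-in (it is not Keccak-f and not even a real permutation), the padding rule is simplified, and the parameters are arbitrary, so this illustrates the mode only, not a secure implementation.

```python
import hashlib

RATE, CAPACITY = 16, 32   # toy parameters, in bytes; the security bound depends on the capacity
WIDTH = RATE + CAPACITY   # width of the state the permutation operates on

def toy_permutation(state: bytes) -> bytes:
    # Stand-in for Keccak-f: just scrambles the whole state (illustration only, not bijective).
    return hashlib.shake_128(state).digest(WIDTH)

def sponge_hash(message: bytes, out_len: int) -> bytes:
    # Simplistic padding: append 0x01, then zeros up to a multiple of RATE (not the Keccak padding rule).
    message = message + b"\x01"
    message = message + bytes(-len(message) % RATE)

    state = bytearray(WIDTH)                      # state initialized to zero
    # Absorbing phase: XOR each RATE-byte block into the outer part of the state, then permute.
    for i in range(0, len(message), RATE):
        for j, b in enumerate(message[i:i + RATE]):
            state[j] ^= b
        state = bytearray(toy_permutation(bytes(state)))

    # Squeezing phase: output the outer part, permute, repeat until enough bits are produced.
    out = b""
    while len(out) < out_len:
        out += bytes(state[:RATE])
        state = bytearray(toy_permutation(bytes(state)))
    return out[:out_len]

print(sponge_hash(b"hello sponge", 32).hex())
```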
That is really the only way to gain confidence in the security of most symmetric crypto primitives. So now let's look further at the Keccak-f permutation. Most cryptographic primitives, including the Keccak-f permutation, are actually divided into rounds. A round by itself is a very simple set of operations and by itself it does not provide any security, but if you repeat this round many times, enough times, then the security starts to build up. And there is a point where there are enough rounds for the permutation to be strong, and in this case for Keccak to be secure. What people actually do when doing cryptanalysis is that they try to break one round, two rounds and so on, and then see how far they can go.

So the status for Keccak-f is the following. Keccak-f, the largest permutation, has 24 rounds. Keccak has been practically broken up to five rounds; the best attack on five rounds is a collision attack published last year. If you allow theoretical complexities, so without an explicit example but with an algorithm of complexity less than the generic complexity, then there are collision attacks up to six rounds. And if you allow any kind of attack, then the best theoretical attacks go up to nine rounds. But at nine rounds the complexity is 2^256, which is a huge number; it's even higher than the number of particles in the universe, so that's really theoretical. So we can see from this figure that the 24 rounds of Keccak give a rock-solid safety margin.

Okay, so let me discuss some unkeyed applications. The most obvious unkeyed application of Keccak is simply hashing. A hash function, if I need to recall what it is, is simply a function that takes as input any string of bits of any size, so it can be a small message, a picture or a complete DVD, and then it processes it and you get as output a digest of fixed size, 256 bits for instance, which acts as kind of a fingerprint of the input. Obviously the sponge construction can be used for that: the input is then your big file and the output is your digest.

Now for the performance of the function, you can see that within the permutation f there can be a lot of parallelism that you can exploit, but beyond that the sponge construction by itself is quite serial. If you implement the permutation Keccak-f on a Skylake CPU, you get about 1100 cycles for one application of the permutation. If you now implement this permutation using vector instructions, evaluating four different permutations in parallel, then the time grows only slightly, to about 1400 cycles. So per input byte, you can clearly see that it's more interesting to compute four permutations in parallel than just one.

So what can we do to exploit this kind of parallelism? We can do some tree hashing. Tree hashing is just taking the input, cutting it into pieces and then evaluating the hash as a tree. There should be a slide coming up, but my PC is a bit slow. Yes, here it is. So the message is cut into pieces, each piece is hashed separately, and the digests are then concatenated and hashed again to get the final hash. In the latest standard from NIST, there is this ParallelHash function that exploits this kind of parallelism, and you can see that the number of cycles per byte, still on this Skylake processor, decreases significantly as soon as you can exploit this parallelism. As soon as your input is big enough, you can go down to 2.3 cycles per byte, which is quite fast.
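The tree hashing idea can be sketched in a few lines. This only shows the general scheme (hash the chunks independently, then hash the concatenated chaining values); the chunk size is arbitrary and a plain SHAKE128 is used for the leaves, so this is not the exact ParallelHash encoding from SP 800-185.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK = 8192  # illustrative chunk size, not the one fixed by the standard

def leaf_hash(chunk: bytes) -> bytes:
    # Each chunk is hashed independently, so these calls can run in parallel.
    return hashlib.shake_128(chunk).digest(32)

def tree_hash(message: bytes, out_len: int = 32) -> bytes:
    chunks = [message[i:i + CHUNK] for i in range(0, len(message), CHUNK)] or [b""]
    with ThreadPoolExecutor() as pool:
        chaining_values = list(pool.map(leaf_hash, chunks))
    # The intermediate digests are concatenated and hashed once more to get the final digest.
    return hashlib.shake_128(b"".join(chaining_values)).digest(out_len)

print(tree_hash(b"x" * 100_000).hex())
```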
So that's the existing standard. We can actually go further than this, and that's the purpose of this new KangarooTwelve hash function. I've shown you that 24 rounds are rock solid, so we can afford to decrease the safety margin a bit, and actually more than a bit: we can go down to 12 rounds and still have a comfortable safety margin, and that gives us a factor-two speedup. And then we can have a special mode, a tree hash mode, that is what is called embarrassingly parallel, such that as soon as you have enough data as input, there is as much parallelism as your platform allows you to exploit.

So concretely, what it means is that the input message is cut into chunks, and these chunks can potentially be hashed in parallel. The more chunks you have, the more parallelism you can exploit and the faster per byte it can go. The first chunk has a special status, in the sense that the chaining values, so the intermediate digests that come from the other chunks, are hashed together with it. That's a technique we call kangaroo hopping, and that's where the name KangarooTwelve comes from. All these values, 110, 110* and so on, come from the Sakura coding which we defined, and the idea is that as soon as we use Sakura coding, we know generically that our mode is secure. It's proven that there is no problem on the mode side; if there is a problem, it has to come from the permutation, reduced to 12 rounds, but the cryptanalysis that has been done on Keccak so far still applies to KangarooTwelve. So we rely on this and we somehow inherit all the cryptanalysis that has been done on Keccak and apply it to KangarooTwelve.

So concretely, still on this Skylake processor, if you have a short input then the number of cycles per byte is 3.72, and as soon as you have enough input to fully exploit the parallelism you can go down to 1.22. Yes, I forgot to mention that this idea of kangaroo hopping, where the first chunk is hashed with the same hash function as the final digest, means that if your input is small, then there is no overhead due to the tree hash mode. So this 3.72 is really the speed for a short message, as soon as there is just one block, for instance. On the last line you can see some figures for the Knights Landing architecture. That architecture has bigger vector registers, up to 512 bits, so you can actually compute eight permutations in parallel, and this brings the number of cycles per byte below one, which is really, really fast.

Okay, so now some keyed applications. What I would like to do, before I really speak about concrete functions, is to take a step back and talk more abstractly about pseudo-random functions, what we can do with them and what they can bring us. A pseudo-random function, a PRF, is a function that takes as input a secret key and some input string, and then it produces as many output bits as requested. From the point of view of the people who know the secret key, it's a deterministic function, meaning that they can compute it easily. But from the point of view of an adversary who doesn't know the key, these output bits will look just like random bits: 50% one or zero, all independent. That's what a good PRF should look like.

So what can we do with such a PRF? We can build a stream cipher to do some encryption. If you take the key and then give some input to your PRF, you can use the output as a key stream that you XOR with your plaintext.
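Here is a small sketch of this stream-cipher use of a PRF. SHAKE128 over key || nonce is used as a stand-in PRF, just to illustrate the mode; it is not Keyak or Kravatte, and a real scheme would use a proper keyed construction.

```python
import hashlib

def prf(key: bytes, data: bytes, n: int) -> bytes:
    # Stand-in PRF: SHAKE128 over key || data, squeezed to n bytes (illustration only).
    return hashlib.shake_128(key + data).digest(n)

def stream_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    # The PRF output is the key stream; encryption is a simple XOR with the plaintext.
    keystream = prf(key, nonce, len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, keystream))

key = b"\x00" * 16
nonce = b"connection-42"                              # must differ for every use of the same key
ciphertext = stream_encrypt(key, nonce, b"attack at dawn")
recovered = stream_encrypt(key, nonce, ciphertext)    # decryption is the same XOR
print(ciphertext.hex(), recovered)
```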
So if you do that, the ciphertext will look like garbage, and without the key there is no way to recover the plaintext. The nonce there is some identifier, so that every time you use a different nonce you get a different and independent key stream, and you can reuse the same key, the same long-term key. Okay, so that's for encryption, for confidentiality.

A second application is authentication. What you want is to transmit a message in the clear, but have a way to make sure that it was not changed on the way. So we have maybe a client sending a message to a server, and the server wants to make sure that the message actually comes from the legitimate client. What you can do is apply the PRF to the plaintext, your message, use the output as a kind of tag that you attach to your message, and send the message in the clear plus the tag. The server is then able to recompute this tag and check that it matches. If it matches, that means the plaintext was not changed.

Now we can combine the two. You can use the PRF once to get some key stream, meaning you can encrypt your plaintext into ciphertext, and then you use it a second time to compute a tag over the ciphertext. In that case you achieve confidentiality and authentication at the same time: you have your ciphertext, unreadable, and the tag that protects against any changes an adversary would try to make to the message on the way.

A nice thing that we want to achieve with a PRF is called incrementality. Let's assume that you want to authenticate some packets flowing from a client to a server. First you have this first packet, you compute a tag over this first packet and you send it. Then a second packet needs to be transmitted. What you want is a tag over the first two packets together, but you don't need to compute everything again: you just take the result of the computation with the first packet and you extend it, you extend your computation with the second packet. The tag you get is then a tag over the concatenation of packet one and packet two, and you can go on like this. In this case, the final tag on packet three authenticates not only packet three but the actual sequence of packet one, packet two, packet three. So there is no way that the adversary can reshuffle the packets, for instance; the tag would then not match.

So there are at least two ways we can instantiate this PRF to build schemes. The first one is essentially based on the sponge construction, and the second one is based on a new construction called Farfalle. Let me first start with sponge-like constructions, and the first example is Keyak. Keyak is an authenticated encryption scheme that we submitted to the CAESAR competition. It uses the Keccak-f permutation, but reduced to 12 rounds like KangarooTwelve. It works in sessions, so the idea really extends what I said in the previous slide: the idea is to encrypt and to authenticate the full session. Let's say that you start the session with some session key. It's called SUV because it needs to be secret and unique. So it can be a session key, meaning that it's different every time you open a new connection; it could also be a long-term key, but then you adjoin to it some counter or some connection identifier that changes every time. Oops, sorry. So let's say you have your first plaintext P that comes in, and some metadata A that you don't need to encrypt but that you still want to authenticate.
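The incrementality idea can be sketched like this: keep one running keyed computation, extend it with each new packet, and squeeze a tag from a copy at each step. The keyed SHAKE256 below is a toy stand-in for the PRF, not KMAC or Keyak.

```python
import hashlib

key = b"\x00" * 32
packets = [b"packet-1", b"packet-2", b"packet-3"]

running = hashlib.shake_256(key)          # absorb the key first (toy keyed construction)
for pkt in packets:
    running.update(pkt)                   # extend the existing computation with the new packet only
    tag = running.copy().hexdigest(16)    # tag over the whole packet sequence so far
    print(f"send {pkt!r} with tag {tag}")
# Reordering or changing any earlier packet would change every subsequent tag.
```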
Then what you will get is some ciphertext from your plaintext and a tag covering both the plaintext and the metadata. Then you can go on with more plaintext and more metadata; every time you get some ciphertext and a tag that authenticates the whole session so far. And then maybe here, in this last example, just some more metadata, maybe just a confirmation that does not need to be encrypted, just an "OK". And the tag on this "OK" actually covers the entire session, so including the context in which this "OK" is sent. That's really useful when encrypting a network communication.

Then we submitted a second proposal to the CAESAR competition, called Ketje. It's again based on Keccak, but it's sufficiently different from Keyak to deserve a separate submission, a separate name. Ketje has fewer features than Keyak, but it's also simpler and targeted at lightweight applications like the Internet of Things, and you can instantiate it with smaller permutations, 400 and 200 bits, so that it can be really small in hardware or in embedded software.

Then the second set of constructions I would like to talk about is really new. It's based on the Farfalle construction. The Farfalle construction is this figure; it somehow resembles farfalle pasta, if you wish. The idea here is that we try to exploit parallelism as much as we can. On the left side you have the secret key that is used every time, and then the input blocks: the input message is cut into input blocks M0, M1 and so on. All these blocks are processed through the permutation f in parallel and the results are summed together. This gives us the parallelism and the incrementality, because if more input blocks come in, you just need to add them to the accumulator that you already have at this point. Then, to get some output, you apply f again and then one more f for each output block. So you can get an output of any size, and again, if you want a long output, if you want to do stream encryption for instance, then all these output blocks can be computed in parallel if your platform allows it.

So this is the Farfalle construction, and Kravatte is an instance using the Keccak-f permutation but with the number of rounds really reduced to not much. Typically what we expect now is to have six rounds, four rounds and four rounds, so 14 rounds in total. If you remember the figure, 24 rounds was really rock solid, so 14 rounds between input and output is already quite safe.

So now that we have this PRF, we can just apply it in a simple way, just as I explained in the beginning of this part of the presentation. We can build a concrete PRF to compute a MAC, a message authentication code. We can have different flavors of authenticated encryption schemes, and I wish to point out the last one, which is a wide block cipher. A wide block cipher is simply an authenticated encryption scheme that doesn't have any expansion. If you encrypt a block of a given size, then the output will be of the same size, and if you want to check the authenticity, then you need to rely either on the existing redundancy of your input or, if you really don't have that, you need to add some redundancy yourself beforehand. But it can be really interesting if you don't want any expansion, if you cannot afford to have a tag attached to your message.
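A much simplified sketch of the Farfalle shape, to show the compression of input blocks into an accumulator and the independent derivation of output blocks. The internal function p is a stand-in, the per-block index replaces the rolled key masks of the real construction, and the final key masking is omitted, so this only illustrates the data flow, not Kravatte.

```python
import hashlib

BLOCK = 32

def p(x: bytes) -> bytes:
    # Stand-in for the internal permutation (not Keccak-p; illustration only).
    return hashlib.shake_128(x).digest(BLOCK)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def farfalle_like(key: bytes, message: bytes, out_len: int) -> bytes:
    k = p(key)
    blocks = [message[i:i + BLOCK] for i in range(0, len(message), BLOCK)] or [b""]

    # Compression layer: each masked input block goes through p independently (parallelizable),
    # and the results are summed (XORed) into an accumulator.
    acc = bytes(BLOCK)
    for i, m in enumerate(blocks):
        masked = xor(k, m.ljust(BLOCK, b"\x00"))
        acc = xor(acc, p(masked + i.to_bytes(4, "big")))

    # Expansion layer: apply p once more, then derive each output block independently (parallelizable).
    mid = p(acc)
    out = b""
    j = 0
    while len(out) < out_len:
        out += p(mid + j.to_bytes(4, "big"))
        j += 1
    return out[:out_len]

print(farfalle_like(b"secret key", b"hello farfalle", 48).hex())
```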
So concretely, if you encrypt a picture with a wide block cipher and at decryption you get something which is really garbage, which doesn't look like a picture, then you're sure that your message was tampered with. Any change will diffuse completely over the full plaintext, and your redundancy, the fact that it was actually a meaningful picture, is destroyed completely. So it's all or nothing, and that's quite interesting.

Okay, so that's all for the applications. Now let me say a few words about the Keccak code package. The Keccak code package is simply an open-source set of code implementing the Keccak functions and all the variants I've mentioned, except Kravatte, which we are still tuning and hope to release soon, this month hopefully. So concretely, what can you do? The simplest way is to make a library, so you can just make libkeccak.a, and the prefix there, generic64, generic32 and so on, is really the flavor of the implementation you wish to have. So generic64 is simply a generically optimized implementation for 64-bit platforms, generic32 the same but for 32-bit platforms, and then there are some more specific sets of code for specific targets. If you don't want a library, if you just want the source files to integrate into your project, you do the same but with .pack at the end; it will create an archive with all the files that you need for a given target. Then there are some more things you can do. It comes with a complete set of unit tests, and you can run these unit tests with KeccakTests. You can say "I want this or that test" or "I want some speed measurements"; you can do that from the command line. And of course you can extend it in any way you wish.

So let me give you some more details on how it's organized inside the Keccak code package. The idea is that there is one layer above that implements the modes and the constructions in a generic way. There is no optimization there, it's plain C code, it's portable code, and there is only one implementation. The idea is really to make the life of the user easy, and there is nothing going on there for a specific target. Then there is this interface called SnP, for state and permutation, below which you can hide the details of an implementation of the permutation for a given platform. It does all the state management, so XORing bytes into the state and, of course, applying the permutation. You can then easily substitute one implementation below SnP with another that is more optimized for your given platform, and everything above SnP will still work.

Okay, so I will recap what I mentioned before with some inventory. For hash functions and extendable-output functions, so the generalization of hash functions to any output size, we have some standard, rock-solid instances: the SHA-3 instances and SHAKE from FIPS 202, and then the generalization of SHAKE with cSHAKE in the latest standard. cSHAKE is like SHAKE except that you have an extra input, which is a customization string, and the advantage is that you get domain separation between different hash functions: if you put in a different customization string, you immediately get a new hash function whose output is independent of cSHAKE with another customization string. TupleHash is also a generalization.
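The domain separation that cSHAKE provides can be illustrated with plain SHAKE and a naive prefix encoding; the real cSHAKE uses a specific length-prefixed encoding defined in SP 800-185, so treat this only as a sketch of the idea.

```python
import hashlib

def customized_shake(custom: bytes, data: bytes, out_len: int) -> bytes:
    # Naive illustration: prefix the customization string (plus a separator) to the input.
    # Different customization strings then behave like independent hash functions.
    return hashlib.shake_128(custom + b"\x00" + data).digest(out_len)

h1 = customized_shake(b"key derivation", b"same input", 32)
h2 = customized_shake(b"message digest", b"same input", 32)
print(h1.hex())
print(h2.hex())   # unrelated to h1, even though the hashed data is identical
```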
The idea there is that the input is not just one string of bits; it can be any number of strings, and of course the result will not depend just on the concatenation of all the inputs but really on the exact sequence of inputs that you have. So you can have a sequence of strings and give that as input to TupleHash. ParallelHash, as I mentioned, is the tree hashing mode that has been standardized by NIST, and then KangarooTwelve is the faster version.

Then something I didn't mention explicitly, but another useful unkeyed application of hash functions, is pseudo-random number generation. The purpose of a PRNG is this: if you have some random bits at the input, but these bits are not well balanced, maybe there is clearly a bias, and you want to turn that into a sequence of bits that really look random, uniformly distributed, then the PRNG can do that. In the Keccak code package we also implemented KeccakPRG, which is based on our SAC 2011 proposal, and it has some extra features. If more seed material comes in, you can add it at any time, so that it gets mixed into the state and your output will depend on it. The second feature is forward secrecy: at any given time you can say "okay, forget the past", and if your state is compromised, if let's say the PC is generating secret keys and at some point the memory is recovered by an attack, then the attacker cannot go back before this point. It's irreversible, and the attacker cannot find the secret keys that were derived using this function before this irreversible operation.

Then authentication: there is KMAC, part of SP 800-185. Of course you could use HMAC with SHA-3, but that's suboptimal, not a good idea. You can use Keyak; Keyak is authenticated encryption, but you can also use it for a simple MAC. And then Kravatte, the new construction, can also be used for authentication. And then authenticated encryption: of course Ketje and Keyak, and then all the flavors of Kravatte that I mentioned earlier. And that's all I wanted to say, so if you have any questions, please feel free.

Hello, thank you for this talk. I had a couple of questions about tree hashing. You mentioned that ParallelHash has been standardized since the SHA-3 competition. I was wondering if there is more work coming up, because I remember some draft on the NIST mailing list about a specific tree hashing mode, and when you go to the Keccak website there is a link to a paper describing generic tree hashing and the tree hashing mode that you may be building. So I was wondering where this is at, and also where KangarooTwelve, which is new to me, stands: is it also relevant to tree hashing, with all the parallel things I saw in the slides? If you could clear that up for me, thanks.

Okay, so as far as I know, only ParallelHash has been standardized. It's true that we've worked on tree hashing for quite some time and we proposed many different flavors of tree hashing. What has been standardized is just a subset of that, but of course you can do any kind of tree hashing. If you use Sakura coding, then it's easy, because you can devise any tree topology that you like, that fits well with your application, and then you're sure that the resulting function is secure. About KangarooTwelve, a design decision that we made is to have just one instance. So for KangarooTwelve there is no parametrization: the tree topology is fixed, so that there is no choice. It's easier for the user.
There is just one, one-size-fits-all choice for KangarooTwelve. I don't know if that answers your question.

How does it work with lost bits? Every next iteration of the encryption depends on the previous one, so it's a bit like a hashing algorithm, but if some bits are lost... I've seen this forward secrecy, and that you can reseed it at any moment in time, but if some bits of the ciphertext are lost, does it somehow heal itself?

I'm not sure I understood the question.

Okay, you have the ciphertext, but some bits of it get lost before decryption.

Well, basically if some parts of the ciphertext are lost, then you get desynchronized and, yeah, you cannot recover. There are ways to build self-synchronizing stream ciphers, but that's not the case for Keyak.

I believe AES encryption has been implemented in modern CPU architectures. Is it possible to implement the f function of Keccak like that, and is there any work in that regard?

So your question is about hardware implementations of Keccak? Yes. Okay, so I didn't mention this, but of course Keccak was actually designed in a way that it can be extremely efficient in hardware: for a given size of circuit it can be really fast, or it can consume less energy per bit than, let's say, the other SHA-3 finalists. I don't know if that answers your question.

For instance, Bitcoin mining ASICs for SHA-2 have been built. Is the same thing possible with Keccak?

Yeah, I suppose so. We didn't tailor Keccak for Bitcoin-like applications. Of course, in this case there will be a cost to do some mining, and I suppose that economically the threshold will automatically rise to the point where the cost of electricity makes it not interesting anymore. So I think even if Keccak is more efficient in hardware for Bitcoin-like applications, it will not matter, because the bar will be raised and at some point it becomes economically uninteresting to do more mining. But I'm not really familiar with Bitcoin, to be honest.

Are there plans to include Keccak in TLS?

I haven't followed TLS closely. I think TLS still uses a lot of SHA-2. I don't know exactly why, but I hope that some day they switch to SHA-3 and not something designed by the NSA. Thank you very much.