Hello everyone, this is the presentation of the ToSC paper titled MOE: Multiplication Operated Encryption with Trojan Resilience. It's a joint work between Olivier Bronchain, Sebastian Faust, Virginie Lallemand, Gregor Leander, Léo Perrin, which also happens to be me, and François-Xavier Standaert.

Cryptographic primitives are not the only ones: many of the modern operations that we rely on are implemented using integrated circuits. These are at the heart of all electronic hardware. They are manufactured in cutting-edge foundries, but these are extremely expensive. We are talking about billions of dollars, so most nations do not have any. As a consequence, production is outsourced. The way it works is that the designers of the integrated circuits, which can be any company in the world, design the specific circuit that they need, send its specification to one of these foundries, and in return get the physical circuits. More than 90% of all integrated circuits are produced by only 13 foundries, so saying that there is some concentration here would be an understatement.

Can we mitigate the security risks associated with this concentration at the primitive level? That's the question I'm going to try to answer in this talk. First I'm going to introduce what security risks I'm even talking about, as well as our general approach to solving this problem. Then I'm going to introduce our solution, MOE, giving its specification. Then I will give a bit more detail about the cryptographic properties of the components of MOE and why we chose them. And then, of course, I will give a quick security analysis of the cipher itself.

So why do we care, to put it simply? Why is it potentially a problem that we outsource the production of these integrated circuits? There are several risks associated with this. Counterfeiting is one of them.
The design of the circuit could be reverse engineered to steal some intellectual property. And there could also be some malicious modifications of the circuit. We are going to focus on the third one here. It corresponds to what we call hardware Trojans. If someone embeds a hardware Trojan in an integrated circuit, for a while the circuit is going to behave normally, but then there is going to be some trigger, and that will cause the circuit to do some damage, either in a physical or in a logical way.

So what do we mean by triggers? It could be a physical trigger, like a change in temperature. It could be a specific input: there is a special keyword that you send to the integrated circuit and it changes its behavior. Or it could be just a counter: after, I don't know, exactly 110 calls to the integrated circuit, it starts misbehaving. And what do we mean by damage or misbehavior? It could simply be that the circuit stops working, or that it starts revealing secrets via some subliminal channel or some specific outputs. Of course, neither is desirable for us as designers.

We then need to implement some countermeasures, so that even if there is such a hardware Trojan, we can mitigate what it can do. Obviously, we can try to detect them: we can try to spot the presence of a hardware Trojan in an integrated circuit before putting it in our product. We can use some logic testing. We can do some side-channel analysis. We can do some optical inspection of the device using special microscopes. We can also try to prevent their insertion or their exploitation using split manufacturing or input scrambling. But the problem is that none of these methods is foolproof and, furthermore, all of them are very expensive and time-consuming.

In this context, in 2016, there was a paper accepted at CCS by Dziembowski, Faust and Standaert, which presents a method to protect at least the cryptographic part of an IC against hardware Trojans.
It relies on multi-party computation techniques. The idea is that it will not be possible to activate the Trojan: the trigger will not work because of these MPC techniques. We can also still get some strong guarantees under the reasonable condition that the number of times the device is used is bounded. For instance, given the throughput of the circuit, we know that it will not be used to encrypt a volume of data the size of the universe. Under such bounds, we can make sure that an adversary will not be able to use a hardware Trojan. So even if they manage to put one in the chips, they will not be able to use it.

The downside of the approach presented in this paper is that it requires a testing phase: when you receive your integrated circuits from the manufacturer, you need to do some testing on them. There is also a huge increase in the overall circuit size and in the computational overhead associated with the evaluation of the encryption functionality. So our idea is that, instead of trying to find the best way to adapt existing ciphers to this model, we design a custom, tailor-made cipher which is intended specifically to be used in this context.

Some more details about the model that we are considering. What they consider in that paper from 2016 is that there is a circuit you want to implement, say the AES, later our cipher. This circuit is transformed into a set of sub-devices. The idea is that all these sub-devices, when assembled together by a master circuit, do the same as the initial circuit. So you are kind of splitting the computation into several sub-components, or sub-devices. We can ask a single manufacturer, potentially malicious, to produce all these sub-devices. We do not need to have different manufacturers for the different sub-devices, and that's one of the strengths of this model. We do not need a complicated supply chain.
We can just have everything come from the same spot. Once we have received these sub-devices, we test them. Then we need to assemble them using a master circuit in which we have some trust. Of course, the trusted master circuit is going to be extremely expensive to manufacture, because it basically has to be made by you. So it has to be as small as possible to limit the cost.

So how does this model help? We can prevent time bomb triggering by testing the sub-devices. When we receive the integrated circuits, we can decide ourselves that we are going to test one of them 900 times, another 937 times, etc., so time bombs are not going to work. We can prevent cheat code activation, that is, any of these special keywords, by using techniques from secret sharing and multi-party computation. The idea is that the input of each sub-device is going to be scrambled using secret sharing, so each of them receives an input which is statistically independent from the correct one. And then we can avoid leaks by recombining the outputs from each sub-device. Again, it's a trusted master which does the recombination. So we still assume that there is a part of the circuit that we can trust. The game is to make it as small as possible.

How then do we build a cipher specifically for this? The main bottleneck that we need to overcome is the use of secret sharing. So what we will do is rely only on linear operations, which sounds weird, but bear with me. One round of encryption will consist of two operations, L and M, and the key addition. Both L and M are linear, so we can do the secret sharing each time very simply. Of course, if your cipher is linear, you have a bit of a problem. So L and M are going to be linear with respect to different algebraic structures. In our case, M will be linear over a binary field and L over a modular ring.
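To make the point about linearity and secret sharing concrete, here is a minimal sketch. It is not the protocol from the paper: the 16-bit word size, the random matrix, and all function names are assumptions chosen for illustration. It only shows that a multiplication by 3 modulo 2^n commutes with additive two-share sharing, and that a binary matrix multiplication commutes with XOR two-share sharing.

```python
import random

N = 16                 # toy word size (assumption; the talk uses 128 bits)
MOD = 1 << N

def parity(v):
    """Parity of the set bits of v, i.e. a dot product over F2."""
    return bin(v).count("1") & 1

def mat_mul(rows, x):
    """Multiply the bit vector x by a binary matrix given as row bitmasks."""
    return sum(parity(r & x) << i for i, r in enumerate(rows))

def mul3(x):
    """L-type operation: linear over the ring Z/2^N Z."""
    return (3 * x) % MOD

def shared_mul3(x, rng):
    """Split x additively into two shares, process each share separately."""
    s0 = rng.randrange(MOD)
    s1 = (x - s0) % MOD
    return (mul3(s0) + mul3(s1)) % MOD   # recombination by modular addition

def shared_mat_mul(rows, x, rng):
    """Split x into two XOR shares, process each share separately."""
    s0 = rng.getrandbits(N)
    s1 = x ^ s0
    return mat_mul(rows, s0) ^ mat_mul(rows, s1)   # recombination by XOR

rng = random.Random(0)
# Hypothetical matrix; it does not need to be invertible for this check.
ROWS = [rng.getrandbits(N) for _ in range(N)]

for _ in range(1000):
    x = rng.randrange(MOD)
    assert shared_mul3(x, rng) == mul3(x)
    assert shared_mat_mul(ROWS, x, rng) == mat_mul(ROWS, x)
```

In this sketch, each share on its own is uniformly distributed, which is the sense in which a sub-device sees an input independent from the real one. The two operations need two different kinds of sharing, so a trusted component would have to convert between them.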
Once we have that, we can have different sub-circuits that will each implement a full encryption. That's what you have in these boxes here with the dots: each of the boxes is a sub-circuit. Each sub-circuit will in turn have several mini-circuits, and these mini-circuits are produced via outsourcing. So I will ask one of my potentially evil partners to manufacture a bunch of L's and a bunch of M's. I'm going to use them to build a first sub-circuit, Gamma 1, which will implement a full run of the block cipher. I will also build another sub-circuit, Gamma 2, which will also implement a full run, etc. And then I will use the majority function to actually get my output from this. So again, L and M are implemented using potentially untrusted chips, but there is a master circuit that handles the secret sharing and the recombination in the end.

This greatly simplifies the testing phase from the Dziembowski et al. paper. In our case, we can test the input-output behavior of the cipher itself and not of each component individually, which is a great simplification. We also reduce the communication complexity by reducing the number of communication rounds between the trusted master and the mini-circuits. And finally, since we rely on linear operations, the secret sharing is going to be very easy. In fact, we can use only two shares instead of three in the original paper, which again reduces the hardware cost. So basically, each mini-circuit will rely on fewer L and M applied in parallel.

OK, so now how do we do that in practice? That's MOE. We have decided to build a 128-bit block cipher, where L is going to be a modular multiplication by 3, which we denote A3, and M is going to be a multiplication by a big invertible binary matrix obtained from a random generator. And then a step consists of the following operations.
First you have a key addition, then the inverse of the multiplication by 3, a big matrix multiplication, a key addition, the multiplication by 3, and a big matrix multiplication. You see that we have an extremely simple key schedule with just some round constants. And we claim 127 bits of security, as long as the amount of data that is queried is less than 2^64.

Why did we choose these two main operations, multiplication by 3 and multiplication by a big binary matrix? I will let you read the content of the slide. But basically, if we want to get an abelian group, we can use the Cartesian product of rings of the form Z/p^(e_i)Z. And if we want to have an endomorphism over such a group, we are basically going to have matrix multiplications. In our case, we decided to use two extreme cases. We set p equal to 2. The first group we consider is the one where all the exponents are set to 1, so we only have Z/2Z times Z/2Z, n times. Then we just get the group of invertible matrices acting on F_2^n. At the other extreme, we do not really have matrix multiplications, because we only have one ring in this instance, Z/2^nZ. Those are the two operations we are going to use, and that's why we use them.

But we still need to study these operations, because we want to have some strong guarantees in terms of security, of course. Using modular multiplication in symmetric ciphers is not our idea. This was already done before, including, in fact, in a cipher called IDEA in 1991. Here on this slide, you can see an overview of all the modular multiplications that have been used in the literature, to the best of our knowledge. You can see that there are some differences in the moduli used. In IDEA, for instance, it's 2^16 + 1. But there are also some who did what we did, which is to use 2^n.
So you have that in MARS, in Nimbus, in MultiSwap, in Shabal, and also in Sosemanuk. It's an operation which is not linear over the binary field. It has good confusion and diffusion properties for a low cost in software, because one multiplication is going to be one instruction. And it will have, as we will see, a high algebraic degree, which is a good property to have in order to prevent integral and algebraic attacks.

Why is it that the algebraic degree is high? Let's look at alpha equal to 3, because it's what we are going to use here in MOE anyway. Multiplication by 3 can be written as x + 2x, and 2x in Z/2^nZ is just a shift. So you get this operation. Then y_i is obtained as the XOR of x_i, x_(i-1), and the carry, which we denote mu_i. The carry is given by this induction: the first two carry bits are set to 0, and then each carry is the majority function of the previous two values and the previous carry. So this is a quadratic function. As you can see, then, this bit here is a linear function of the input, in the sense of F_2, and the same goes for this one. This one will be a quadratic function. But then for the next one, the carry will also depend on the previous carry, which is already quadratic, so it will be of degree 3, etc. So overall, the algebraic degree of A3 is equal to n - 1, which is actually the maximum for a permutation. And since it's the maximum, the inverse will also have the same degree. I hand-waved the proof here. We have a proper one in the full paper if you're interested.

Then we turned our sights to the differential properties. Along the way, we found that if you look at the DDT of multiplication by 3 XORed with the identity, you get Sierpinski triangles, which looks cool and which we were quite surprised by. Again, if you're curious about it, I would urge you to read the paper. In terms of security, what's important is this bound.
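As an aside on the algebraic degree argument given a moment ago, the claim that multiplication by 3 modulo 2^n has degree n - 1 can be checked exhaustively for small n. The sketch below is not taken from the paper; the function name and the choice of small block sizes are assumptions for illustration. It computes the algebraic normal form of each output bit with a Möbius transform and returns the largest degree found.

```python
def algebraic_degree_of_mul(alpha, n):
    """Algebraic degree over F2 of x -> alpha * x mod 2^n (small n only)."""
    size = 1 << n
    best = 0
    for bit in range(n):
        # Truth table of one output bit, indexed by the input x.
        anf = [(((alpha * x) % size) >> bit) & 1 for x in range(size)]
        # In-place Moebius transform: truth table -> ANF coefficients.
        step = 1
        while step < size:
            for x in range(size):
                if x & step:
                    anf[x] ^= anf[x ^ step]
            step <<= 1
        # The degree is the largest Hamming weight of a monomial that appears.
        for monomial in range(size):
            if anf[monomial]:
                best = max(best, bin(monomial).count("1"))
    return best

for n in range(3, 11):
    assert algebraic_degree_of_mul(3, n) == n - 1
```

Passing the inverse of 3 modulo 2^n as `alpha` (for instance `pow(3, -1, 1 << n)` in Python 3.8 or later) gives a way to experiment with the statement about the inverse as well.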
It's possible to bound all the coefficients in a given line of the DDT using this formula here. If you take any coefficient, the coefficient at line A and column B, it's smaller than 2^(n+1-c(A)). What is c(A), I hear you ask? It's the number of changes, a notion we have introduced. It's the quantity defined by this formula. Basically, it's the Hamming weight of this vector c, where c_1 of A is the XOR of the first two bits and c_i is the XOR of the bits i and i - 1, but it's set to 0 if c_(i-1) was already set. Concretely, if you have A equal to this value, the first two bits are 0, 1 here, so that's one change. Then you still have 1, then you still have 1, then you have a 0, so that's another change. And then you have 0, 1, which would be another change, but since you had a change in the previous position, it doesn't count. So for this value of A, c(A) is equal to 2. That's the number of changes.

What is very nice is that this bound is independent from the output difference B. So if we can say something about the input difference, then we can bound the probability of a differential transition. And that's why we use the inverse of A3 followed by A3, because then this bound from the theorem tells us something in this direction and in this direction. So what we need is an equivalent of the branch number we have for matrices, but for changes. As soon as we have that, we will have some bounds. And that's exactly what we did. We introduced the change branch number, which is the equivalent of the branch number but for this number of changes. We have some theoretical and some experimental arguments to compute the change branch number, first of a random non-linear permutation, and then we also pushed this analysis further to look at linear permutations. In particular, we expect that a random matrix on 128 bits has a change branch number equal to 24.
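The number of changes described above can be sketched in a few lines. This follows the spoken description only: the function name is hypothetical, the bits are given in the order in which they are read out in the talk, and the example value is reconstructed from that description, not copied from the slide.

```python
def count_changes(bits):
    """Number of changes of a bit sequence, as described in the talk.

    A position counts as a change when the bit differs from the previous
    one, except when the previous position was already counted.
    """
    total = 0
    previous_counted = 0
    for i in range(1, len(bits)):
        changed = bits[i] ^ bits[i - 1]
        if previous_counted:
            changed = 0          # a change right after a counted change is ignored
        total += changed
        previous_counted = changed
    return total

# Reconstructed example: 0, 1, then 1, 1, then 0, then 1.
assert count_changes([0, 1, 1, 1, 0, 1]) == 2
```

The last transition in the example (0 to 1) is not counted because the transition just before it (1 to 0) was already counted, which matches the value c(A) = 2 given in the talk.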
It means that we can reuse the same style of arguments as we would use with the regular branch number and the differential uniformity. We can then show that any characteristic covering the three operations A3^-1, M, A3 will have a probability of at most 2^-22, meaning that four steps will be safe against an attacker with 2^64 data.

As for the security analysis against other attacks, there is a simple related-key property with probability 1, which explains why we claim 127 bits of security instead of 128. I won't go over the details, but it's a consequence of the key schedule. For linear attacks, as well as for differential attacks, which I won't get into, we have some very strong experimental arguments. MOE is defined for 128 bits, but its general structure, multiplication by 3 and big binary matrix multiplication, can be defined for any block size. So what we did is look at much smaller versions, from 8 bits to 16 bits, and then we plotted what you can see here. Essentially, when the quantity that is plotted is under 1, you are safe from linear attacks. I would refer you to the paper if you want the details; I really don't have time for them in this video. But what's important is that, as you can see, starting from 4 rounds we are consistently at most at 1. And in fact, as n increases we get much further from 1, meaning that in our case, since n is equal to 128, we are already very safe.

These experiments are very strong in the sense that we actually computed the LAT of the cipher. So there is no assumption about linear trails in particular. This is not the probability of a completely, fully specified trail. It's really a linear approximation of the cipher, so it's very strong in that sense. For other attacks we of course have results, but again I won't have time to go over them.
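The small-scale experiments mentioned above can be imitated with a toy sketch. Everything specific here is an assumption made for illustration, not the specification of the cipher: the 8-bit block, the seeded random matrix, the use of XOR for the key addition, the arbitrary round keys, and the function names. The quantity computed is simply the largest absolute linear correlation over all mask pairs, which is not necessarily the quantity plotted on the slide.

```python
import random

N = 8                      # toy block size (assumption)
SIZE = 1 << N
INV3 = pow(3, -1, SIZE)    # inverse of 3 modulo 2^N (Python 3.8+)

def parity(v):
    return bin(v).count("1") & 1

def is_invertible(rows):
    """Gaussian elimination over F2 on N row bitmasks."""
    rows = list(rows)
    for bit in range(N):
        pivot = next((i for i in range(bit, N) if (rows[i] >> bit) & 1), None)
        if pivot is None:
            return False
        rows[bit], rows[pivot] = rows[pivot], rows[bit]
        for i in range(N):
            if i != bit and (rows[i] >> bit) & 1:
                rows[i] ^= rows[bit]
    return True

def random_invertible_matrix(rng):
    while True:
        rows = [rng.getrandbits(N) for _ in range(N)]
        if is_invertible(rows):
            return rows

def mat_mul(rows, x):
    return sum(parity(r & x) << i for i, r in enumerate(rows))

def toy_encrypt(x, rows, round_keys):
    """Each step: key, A3^-1, M, key, A3, M (key addition assumed to be XOR)."""
    for k0, k1 in round_keys:
        x = mat_mul(rows, (INV3 * (x ^ k0)) % SIZE)
        x = mat_mul(rows, (3 * (x ^ k1)) % SIZE)
    return x

def max_linear_correlation(table):
    """Largest |correlation| over all input masks and nonzero output masks."""
    best = 0
    for b in range(1, SIZE):
        w = [1 - 2 * parity(b & y) for y in table]
        step = 1
        while step < SIZE:               # fast Walsh-Hadamard transform
            for i in range(0, SIZE, 2 * step):
                for j in range(i, i + step):
                    u, v = w[j], w[j + step]
                    w[j], w[j + step] = u + v, u - v
            step *= 2
        best = max(best, max(abs(c) for c in w))
    return best / SIZE

rng = random.Random(1)
ROWS = random_invertible_matrix(rng)
KEYS = [(rng.getrandbits(N), rng.getrandbits(N)) for _ in range(4)]

for steps in range(1, 5):
    table = [toy_encrypt(x, ROWS, KEYS[:steps]) for x in range(SIZE)]
    assert sorted(table) == list(range(SIZE))   # still a permutation
    print(steps, max_linear_correlation(table))
```

Because the whole table of the toy cipher is transformed, the printed value accounts for all linear trails at once, which is the same spirit as computing the LAT of the cipher directly. On 8 bits the numbers say nothing about the 128-bit design.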
In conclusion, we have proposed a cipher which is tailor-made for the CCS model, to prevent hardware Trojans from being exploitable, and it has better performance than existing ones at the time of publication. Along the way, we have made a comprehensive study of the cryptographic properties of modular multiplication by a constant, in particular 3, and we even found some fractals along the way, which is always fun. But obviously this cipher has a very simple structure which, while it was intended for this specific use case with hardware Trojans, could have other applications, which we would be very curious to hear about. And with that I will conclude this talk. Thank you for your attention.