Good morning. Together with Thomas Peters, I'm going to present our work, Improved Leakage-Resistant Authenticated Encryption based on Hardware AES Coprocessors, which is joint work with Olivier Bronchain and Charles Momin from UCL in Belgium.

As a starting point, I would like to recall what leakage resistance is about. In summary, the idea is that since countermeasures against side-channel attacks are very expensive, it can be interesting to avoid protecting all the parts of the implementation with strong countermeasures and, if possible, to identify which parts of the implementation to protect and by how much. For example, in 2017, Berti and co-authors showed that if you want to encrypt and authenticate a message, it is possible to generate a fresh key with a key generation function that is DPA secure and to generate a tag with a tag generation function that is DPA secure, while all the rest of the computation is protected much more lightly, which can of course lead to substantial performance gains.

Now, if you look at the state of the art, the first question is how to implement the key generation function and the tag generation function. One solution for this purpose is to rely on masking, and this is for example what was proposed by Berti and co-authors in 2019 with TEDT. The advantage of this solution is that it is flexible regarding the overheads, because the overheads are at the implementation level and, for example, if you do not need side-channel security, you have nothing to pay. The downside is that it requires expertise in order to implement masking securely. So, in the direction of making things simpler, you have ciphers like ISAP, where the idea is to obtain DPA resistance thanks to the design of the mode and to rely only on the SPA security of the implementation.
And here, for example, as indicated on the right figure, you can generate a fresh key by absorbing a nonce very slowly, in the extreme case bit by bit. And finally, last year at CHES, there was a paper called Retrofitting, where the idea was to use AES and SHA coprocessors in order to obtain an implementation of a leakage-resilient PRF. The observation there was that SPA security may not be trivial to obtain in a software implementation, and so hardware acceleration can help for that. And of course, if you have that, you can also make the scheme more efficient, especially if you are able to digest the nonce a bit faster.

So this leads me to the outline of the talk. The first thing I'm going to do is describe a flaw in the tag verification of the Retrofitting paper. Then I will come back to obtaining SPA security on such devices, and especially to the evaluation of SPA security. Then Thomas will describe how we can fix the integrity flaw with a new mode of operation called LRBC2. And finally, I will discuss performance, conclusions and other results.

So I will start with the integrity issue. And for this, I will start by recalling the two main solutions to perform tag verification. The first option is to work in the direct way, that is, to recompute the tag and to compare it with the tag that was received. In this case, it is very important that the verification is protected against DPA, otherwise the leakage of the comparison can be exploited to recover valid tags.
Alternatively, it was shown by Berti and co-authors that, when the tag generation function is invertible, the verification can be performed in the inverse direction, and in that case the verification can even be leaked without harming integrity.

Now, the problem with the Retrofitting paper is that the tag generation function is based on a PRF, which is not invertible. As a result, the tag verification is a DPA target, and this DPA target is actually not covered by the theoretical analysis of the mode. This is a problem because the primary use case of the Retrofitting paper is firmware updates, which require tag verification.

Let us see how the DPA against the verification works in practice. On the right, I pasted a piece of code that describes the 128-bit comparison of the Retrofitting paper. It is essentially four 32-bit XORs between the correct tag T and the candidate tag S. The basic attack that the adversary can perform is the following. The adversary chooses a ciphertext. He asks to decrypt this ciphertext with many different candidate tags S. And he then recovers the tag T thanks to the leakage of S XOR T. The results of the attack on an ARM Cortex-M4 are shown here, and we can see that we can recover the tag with confidence with approximately 1000 traces. We also have very similar results with the Cortex-M0 and the Cortex-M33.

This already allows an adversary to push garbage ciphertexts into a device, which could lead to a kind of denial-of-service attack. But we also have a more advanced attack if the adversary knows any valid message-ciphertext pair. In this case, the adversary can also compute the random string R that was used to encrypt, which is just the XOR between the message and the ciphertext.
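The two ingredients of this attack, the word-wise XOR comparison and the recovery of R from a known pair, can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names, the toy firmware strings and the little-endian word layout are hypothetical, the DPA itself is not simulated, and the last step assumes the adversary can have the device reuse the same string R for the substituted ciphertext.

```python
import secrets

def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

def tag_words_xor(t, s):
    """Four 32-bit XORs between the correct tag t and a candidate tag s.

    Every intermediate word depends on the secret t and on the
    attacker-chosen s, which is what makes it a DPA target.
    (The little-endian word layout is an assumption of this sketch.)
    """
    return [
        int.from_bytes(t[i:i + 4], "little") ^ int.from_bytes(s[i:i + 4], "little")
        for i in range(0, 16, 4)
    ]

def tags_match(t, s):
    """The candidate is accepted only if all four XOR words are zero."""
    return all(word == 0 for word in tag_words_xor(t, s))

# Toy data (hypothetical): r plays the role of the random string R.
r = secrets.token_bytes(16)
m = b"genuine firmware"           # 16-byte known message
c = xor_bytes(m, r)               # the matching ciphertext

# Knowing one valid (m, c) pair reveals R.
recovered_r = xor_bytes(m, c)

# With R, a ciphertext for any chosen message of the same length follows.
m_prime = b"malicious update"     # 16-byte chosen message
c_prime = xor_bytes(m_prime, recovered_r)
```

Note that the sketch only produces the substituted ciphertext; the matching tag still has to be recovered through the leakage of the comparison, as in the basic attack.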
The adversary can then choose a malicious firmware M' and compute a valid ciphertext C' that will allow him to push this malicious firmware into the device. So we see that we have a problem with the tag verification of the Retrofitting paper.

Before moving to the solution, I would like to come back to the main assumption of this paper, which is that we can have SPA security for its leakage-resilient PRF. For this, I pasted an example of a leakage-resilient PRF on the left. What we do there is start from the master key K and, at every stage of the PRF, encrypt a zero or a one plaintext depending on the value of the current nonce bit. After 128 stages, we have a fresh key K*. On the right, we have exactly the same kind of construction based on a sponge.

Of course, this rekeying can become more efficient if we are able to digest the nonce faster. If we can digest n_b bits per stage, then we only need 128 / n_b stages. This leads to a trade-off, because doing so also enables an SPA with 2^n_b possible inputs, which raises the question: how fast can we go while maintaining side-channel security? Essentially, this requires assessing the less investigated SPA security of the construction.

As a result, in the paper, we try to assess the side-channel security of AES coprocessors, ideally in a worst-case manner. We use three main steps for this purpose, which are pretty standard. First, we select points of interest based on their signal-to-noise ratio. Second, we reduce the dimensionality thanks to a linear discriminant analysis. And finally, we perform the template attack in the principal subspace. We perform the attacks for 2^n_b inputs and we use different levels of averaging. The results are represented by the two figures: on the left for a 32-bit hardware coprocessor, on the right for a 128-bit hardware coprocessor.
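As a rough illustration of the rekeying trade-off described above, the following sketch derives a fresh key by digesting n_b nonce bits per stage. It is a minimal sketch under stated assumptions, not the construction evaluated in the paper: `block_prf` is a hypothetical stand-in for one AES-128 call (truncated HMAC-SHA256, chosen only to avoid external dependencies), and the way nonce chunks are encoded as plaintexts is an assumption.

```python
import hashlib
import hmac

BLOCK_BITS = 128

def block_prf(key, plaintext):
    """Stand-in for one AES-128 call (NOT the AES): truncated HMAC-SHA256."""
    return hmac.new(key, plaintext, hashlib.sha256).digest()[:16]

def rekey(master_key, nonce, n_b=1):
    """Derive a fresh key K* by digesting n_b nonce bits per stage.

    With n_b = 1, each stage encrypts the plaintext 0 or 1, for 128 stages.
    """
    if len(master_key) != 16 or len(nonce) != 16:
        raise ValueError("master_key and nonce must be 16 bytes")
    if n_b < 1 or BLOCK_BITS % n_b != 0:
        raise ValueError("n_b must divide 128")
    nonce_int = int.from_bytes(nonce, "big")
    key = master_key
    for stage in range(BLOCK_BITS // n_b):
        # Extract the next n_b nonce bits, most significant bits first.
        shift = BLOCK_BITS - n_b * (stage + 1)
        chunk = (nonce_int >> shift) & ((1 << n_b) - 1)
        # The stage output becomes the key of the next stage.
        key = block_prf(key, chunk.to_bytes(16, "big"))
    return key

def stages_and_inputs(n_b):
    """Number of stages, and number of plaintexts seen under each stage key."""
    return BLOCK_BITS // n_b, 2 ** n_b
```

For example, `stages_and_inputs(1)` gives 128 stages with 2 possible inputs per key, while `stages_and_inputs(4)` gives 32 stages with 16 possible inputs per key: fewer block cipher calls, but more observations per key for an SPA adversary.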
In these figures, the x-axis is the value of n_b, so it determines the number of inputs that we can tolerate. The y-axis is the median key rank, which is the security level. And the colors are for the different levels of averaging. What we see is that for the 32-bit hardware coprocessor, we really have to stick with n_b equal to 1 if we want to maintain a 100-bit security level. By contrast, for the 128-bit hardware coprocessor, we can tolerate a slightly larger n_b of maybe 4. What is interesting is that we get slightly different results than the ones of the Retrofitting paper, which suggested slightly larger n_b values.

The take-home message here is that SPA security is quite difficult to evaluate, because it mostly depends on the side-channel attack signal. This is, for example, different from masking, where the security depends on the signal-to-noise ratio, which is easier to evaluate. In particular, the signal can be very sensitive to setup variations and profiling methods. So the main take-home message is that we should take security margins when we want to use this type of construction.

I will now give the floor to Thomas, who will explain how we can fix the integrity flaw in the Retrofitting paper with a new mode of operation called LRBC2. As a bonus, he will show how we can do that by using only AES coprocessors. This is interesting because AES coprocessors are still by far the most popular in the embedded security industry. It will be in two parts. First, he will briefly describe how we can do the message processing with block ciphers. Then he will describe the tag generation function.

Thank you, François-Xavier. It is now my pleasure to present a new one-pass mode. In the design, you can see the three steps. On the left-hand side, you have the key derivation function, which generates a 2n-bit state. On the right-hand side, you have the tag generation function, and we will see in the next slides how we compute t.
But for now, let us concentrate on the middle, the message processing part. The picture actually shows the encryption of a message that can be split into two message blocks, m1 and m2. At each iteration, that is, at each processing of a message block, we of course have to turn the message block into a ciphertext block and to refresh the state. So from K1 and L1, we first get K2 and L2 after the first iteration. The next state must be viewed as a hash of the previous state and the ciphertext block. That means that at the end, we will have a final state, K3 and L3, which is the hash value of the ciphertext.

Now let us take a closer look at each iteration. As you can see, we make four calls to the block cipher. That means that in practice, we will have four computations of the AES in order to process 128 bits of message. This is actually pretty efficient. Remember, we need to have a hash value at the end. That means that at each iteration, we can rely on a compression function. And due to a result from Bart Mennink, we know that in order to have high security when only using calls to a block cipher, we need at least three block cipher calls. So this is our starting point. You can see in the first iteration that all the block cipher calls in black come from Mennink's compression function. We simply had to choose, among all the possible invertible linear maps, those that are fully compatible with the confidentiality of the mode. And in red, we add one more block cipher call in order to create a random n-bit value and to XOR it with the message block.

So we repeat this until we have processed all the message blocks, and we get the final state at the end before going into the tag generation function.

Let us now see how we compute the tag t, and also how we can verify the validity of the ciphertext, so the validity of the tag t. To see how we compute our tag t, let us take a look at the existing solutions.
So in the picture now, A and B represent the final state. If we want to mimic the solution of the FAC paper from two years ago, we first replace their masked block cipher with our PRF, the one that was analyzed in the previous slides by François-Xavier. If we do that, we have an X value, and this X value is next used as the key of the next call to the block cipher. In encryption, this results in a tag value t. But in verification, so in a decryption query, the verification is not directly made on the t value, otherwise you have a simple DPA on that value. So you have to invert that value, and the check is made on the A value. However, since we cannot rely on masking as in the FAC paper, the X value can be leaked by a DPA. Indeed, the adversary can simply make decryption queries with many distinct t, and X will be leaked. So we cannot rely on that solution.

Let us take a look at another solution. The comparison cannot be made on t, and if we invert, it does not work. So let us see what happens if we make one more call to the block cipher in the forward direction. Then we have a solution à la ISAP. That means that the check is not made on t, but on the Z value. Here zero is a constant; it can be any other constant, of course. And actually, this solution works, except that the integrity only holds up to the birthday bound. The reason is as follows. Imagine that the adversary makes many encryption queries. From these queries, he gets ciphertexts with a valid t, of course, and with that t he can simply compute the Z value himself. So he has many good Z values and t values from encryption. Now the adversary also makes many, many decryption queries, and he gets by DPA the Z values of those ciphertexts. These ciphertexts are, for the moment, not valid. But by comparing the Z values of the decryption queries and the encryption queries, there is a good chance that we will have a collision on the t value.
That means that the collision will not occur on the Z value, but before, on the t value. And that means that the adversary will somehow be able to guess a good t for a ciphertext that was only used in a decryption query. So he will win the integrity game, and we have to look for another solution.

The trouble with this solution, the one in the forward direction, is that the Z value only depends on an n-bit value. So let us see what happens if we manage to compute Z from a 2n-bit value. Here is what we get. Now the Z value depends on both the value t and the value y, in blue, which is only computed during decryption in order to verify the validity of the tag and the ciphertext. To compute y in blue, we reuse the ephemeral key x once more, and in the key input of the block cipher call, we simply add a constant. We show in the security proof that as long as the adversary is not able to mount a DPA on t or on x, we have the high security that we targeted. Somehow, (t, y) is a state which is collision resistant. So the value y is something that the adversary can get by DPA in decryption during the verification, and the Z value as well. But we will show that it is not feasible for the adversary to mount a DPA on t and x.

To see that, we have to go back to the final state after the processing of the message blocks. The values a and b are actually not random: we do not have a 2n-bit random final state. Both values a and b are actually the outputs of smaller compression functions. That means that we can hope not to have too many collisions on these values, and this is something that we exploit in the proof. We rely on a usual technique where we define alpha as a multicollision parameter. That means that for all the smaller compression functions, we tolerate alpha-multicollisions, that is, a single value that is repeated alpha times. But if we have one more collision, then we abort in the game.
And of course, the probability to abort decreases very quickly as alpha grows. So we can think that the adversary gets alpha possibilities in order to target x. That means that in the block cipher call that computes t, you will have the same value x alpha times with distinct values of a. Actually, the adversary has more than that. That is because we do not simply have a collision on b, but we might also have a collision before. So the adversary is able to use previous collisions in order to increase the number of repetitions of b with distinct a values. Fortunately, it is not possible for the adversary to have alpha cubed, for instance. We only have to consider what happens just before the last block cipher calls that compute a and b.

So with alpha squared repetitions of b, we will of course have alpha squared repetitions of x. And now the adversary will try to mount something which is not really a DPA. The value a is now fixed, and we simply have 2 alpha squared computations with x, since the value x is used twice in the computation of the tag. So from a theoretical standpoint, we can have high security, really high security. But of course, the more alpha increases, the more we have to hope that the AES in practice will be safe against SPA. In practice, if we want to have 112 bits of integrity, that means that we need an implementation of the AES where we have SPA security with n_b equal to 7. And as we saw in the earlier slides, this is something that we do not have with the implementations.

So we have to look for a situation where the adversary does not have alpha squared repetitions, but only alpha. If we have alpha, everything will be better. To get that, we simply have to add a simple tweak in the design: we add one more block cipher call here, before going into the long PRF. By doing this, we manage to have repetitions of the w value only alpha times.
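The link between the multicollision parameter and the required SPA security can be made concrete with a short calculation. This is only a hedged reading of the argument: it assumes that the number of block cipher computations under a repeated key must not exceed the 2^{n_b} inputs for which the implementation is SPA secure, that the key is still used twice after the tweak, and it uses a hypothetical value alpha = 8, which is not stated in the talk but is consistent with the n_b = 7 quoted above and the n_b = 4 that the talk quotes for the tweaked design.

```latex
\begin{align*}
\text{without the extra call:}\quad
  & 2\alpha^2 \le 2^{n_b}
  \;\Longleftrightarrow\; n_b \ge 1 + 2\log_2 \alpha, \\
\text{with the extra call (assumed):}\quad
  & 2\alpha \le 2^{n_b}
  \;\Longleftrightarrow\; n_b \ge 1 + \log_2 \alpha, \\
\text{hypothetical } \alpha = 8:\quad
  & 2\alpha^2 = 128 = 2^{7}, \qquad 2\alpha = 16 = 2^{4}.
\end{align*}
```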
And of course, with this we have all that we want, and for the targeted bit security we only need the SPA security of the implementation for n_b equal to 4. This is something that, as we saw, we can hope to have. So that means that with one additional block cipher call at the beginning, before going into the long PRF, we have a solution that we can implement today. Of course, from the start it would have been possible to use the big PRF a second time in order to have the high security. But here we see that in the tag generation function, we simply use the long PRF once and, in decryption, four additional calls to the block cipher. And of course, these four calls cost much less than one more call to the PRF. That is all I wanted to say about our mode. I now let François-Xavier end the talk.

Before concluding, let me say a few words about the performance evaluation of our schemes. The two figures at the top of the slide represent the performance of LRBC instantiated with the AES. LRAES2 is exactly the proposal that Thomas just described. LRAES3 is a two-pass variant that provides stronger confidentiality with leakage. We also implemented the Retrofitting proposal with and without a SHA-256 coprocessor. We can see that for short messages, on the left of the figures, LRBC is always better than the Retrofitting proposal. This is mostly because we have a more efficient tag generation function. On the right of the figures, we see that for long messages, LRBC is better if only AES coprocessors are available, and the Retrofitting proposal becomes better once you have a SHA-256 coprocessor available as well. And as expected, of course, the 128-bit coprocessor on the right provides better performance than the 32-bit one on the left.

This leads me to the conclusions of the paper. First, our results recall that securing low-end embedded microcontrollers at the software level is challenging and likely to be very expensive.
In this respect, the LRBC mode of operation aims to offer good off-the-shelf security whenever AES coprocessors are available. More precisely, LRBC2 provides optimal integrity with leakage, and confidentiality with leakage in encryption only. LRBC3, which we describe in the paper, additionally provides confidentiality with leakage in decryption, at the cost of two passes. We hope that these give ready-to-deploy solutions for the IoT. Finally, we note that SPA security is highly non-trivial to evaluate and to ensure, which leads to the open question of how to obtain SPA security in software, for example using the shuffling countermeasure. Thank you for your attention.