The last talk of the session is on white-box implementations and how to attack them, to recover affine encodings, and the talk will be given by Baptiste Lambin.
Hello everyone. I will talk about recovering affine encodings in white-box implementations, and this is joint work with Patrick Derbez, Pierre-Alain Fouque and Brice Minaud. I will start by giving a bit of context, then I will present our generic algorithm, and then say a few words about the dedicated attack we mounted on a specific white-box scheme.
So, what is the model? Classically, when we study a block cipher, we are in the black-box model. For example, you have this black box, which computes AES with a key K, and the attacker wants to recover this key, and he can only use the input and the output of the algorithm. That's the classical model, and it was extended more recently to the gray-box model, where the hypotheses are the same, except that now you also have access to some leakage while the computation is done, such as electromagnetic emanations, power consumption or whatever. And if you go a bit further, you get the white box. The white box is like the ultimate gray box. You have access to the implementation of this box. You know exactly how it's implemented, and you can do whatever you want with it. You can set breakpoints. You can skip an instruction. You can even modify the implementation if that can lead you to the key. And obviously, you can read the implementation. So if the key is just written in plaintext, it's useless. The goal of white-box cryptography is to try to provide an implementation which is secure against such attacks.
The attacker has two goals. Either he wants to extract some key material, and this is the best way to attack it. But he may also just want to compute the inverse of the function.
So, for example, if you're given a white-box implementation of an encryption, you want to get the decryption scheme, even though you're not supposed to have it. The main application of this is probably digital rights management, because you need to do some decryption on the client side. You can also think of it as some form of post-quantum public-key encryption, where your implementation is your public key, but that's more for fun.
There were quite a lot of proposals for white boxes. The first one was proposed by Chow et al. in 2002 and was broken quickly afterwards. After that, several designs came up. Mainly, we have two strategies: table lookups and ASASA-like. As you can see, basically everything is broken. It's a well-known fact for white boxes. A few things still stand. We have the proposal by Biryukov et al. at ToSC 17, which just uses a lot of ASA layers. It tends to grow quickly in terms of the size of the implementation, so it's a bit hard to use. We also have the White Block by Fouque et al. It is provably secure, but the model is a bit odd. It's not exactly what I described just before, so it's somewhat out of scope here. Other than that, everything was broken, and the one we were left with was the one by Baek et al. from two years ago. It's on this scheme that we mounted a dedicated attack.
We focus on table lookups, and table-lookup constructions are based on the framework by Chow et al. The idea is to obfuscate a block cipher. To do that, the block cipher is split into round functions, and you obfuscate each round function in such a way that, once you put them all together, the obfuscation cancels out and you still have the block cipher.
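To make the cancellation concrete, here is a minimal sketch under loud assumptions: the "cipher" is a hypothetical toy made of random 8-bit bijections, and every name (`rounds`, `enc`, `tables`, `white_box`) is invented for illustration. It only shows that chaining tables of the form "next encoding, round, inverse of previous encoding" leaves the first and last encodings in place; it is not the construction of any real scheme.

```python
import random

rng = random.Random(0)
ROUNDS = 4
SIZE = 256  # toy 8-bit "block", so every map fits in a single table

def random_bijection():
    return rng.sample(range(SIZE), SIZE)

def inverse(table):
    inv = [0] * SIZE
    for x, y in enumerate(table):
        inv[y] = x
    return inv

# Hypothetical toy cipher: each round is a random 8-bit bijection.
rounds = [random_bijection() for _ in range(ROUNDS)]
# One secret encoding per boundary between rounds, plus the two outer ones.
enc = [random_bijection() for _ in range(ROUNDS + 1)]

# Published table r computes enc[r+1] o round_r o enc[r]^-1.
tables = []
for r in range(ROUNDS):
    dec = inverse(enc[r])
    tables.append([enc[r + 1][rounds[r][dec[y]]] for y in range(SIZE)])

def cipher(x):
    for rnd in rounds:
        x = rnd[x]
    return x

def white_box(y):
    # Evaluation is just a chain of table lookups.
    for table in tables:
        y = table[y]
    return y

# The inner encodings cancel; only the first and last ones remain.
cancels = all(white_box(enc[0][x]) == enc[ROUNDS][cipher(x)] for x in range(SIZE))
print(cancels)
```

With fully random encodings like these, a real block size would make the tables far too large, which is why practical designs give the encodings more structure.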
To obfuscate these round functions, you generate some encodings, and you implement your obfuscated function with some tables. Evaluating your block cipher is then just a chain of table lookups. And to increase the security a bit, you don't want the attacker to have access to the input and output of the plain block cipher, so you have some external encodings around the block cipher.
The thing is that you can't just generate the encodings at random. If you do that, the implementation will be way too large. So you need to put a bit of structure on these encodings. You split them into two parts, the affine part and the non-linear part of the encoding. You need to do this to get an efficient implementation. But the thing is that the non-linear part can be recovered very efficiently with the algorithm of Baek et al. So, most of the time, the problem comes down to recovering the affine part of the encodings.
If you think about this problem in a very generic way, it was actually solved in 2003. You can see the problem here. You have two bijections, S1 and S2. You want to find affine mappings A and B such that the second bijection, S2, is equal to B composed with S1 composed with A, if they exist. In our case, S2 will be the obfuscated round function and S1 will be the round function itself. This algorithm is well known and it works very well. But the complexity is basically exponential in the number of bits, even with the improvement by Dinur last year. So if you think of applying this to block ciphers, it's basically exponential in the block size. Even 64 bits is way too expensive in practice. But this is the problem we want to solve.
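As an illustration of the affine equivalence problem only, here is a naive brute-force sketch on 3-bit bijections. It is not the 2003 algorithm mentioned in the talk: it enumerates every affine bijection A, derives the B that this forces, and checks whether that B is affine. All names and the 3-bit size are assumptions chosen so the search stays tiny; the cost of this approach explodes with the number of bits.

```python
import itertools
import random

M = 3            # toy size: bijections on 3 bits
SIZE = 1 << M

def apply_linear(cols, x):
    """Multiply the GF(2) matrix with columns `cols` by the bit vector x."""
    y = 0
    for i, col in enumerate(cols):
        if (x >> i) & 1:
            y ^= col
    return y

def is_affine(table):
    c = table[0]
    return all(table[x ^ y] == table[x] ^ table[y] ^ c
               for x in range(SIZE) for y in range(SIZE))

def affine_bijections():
    """Yield every affine bijection on M bits as a lookup table."""
    for cols in itertools.product(range(1, SIZE), repeat=M):
        linear = [apply_linear(cols, x) for x in range(SIZE)]
        if len(set(linear)) != SIZE:
            continue  # singular matrix, skip it
        for const in range(SIZE):
            yield [v ^ const for v in linear]

def find_affine_equivalence(s1, s2):
    """Return tables (a, b) with s2 = b o s1 o a, or None if there are none."""
    for a in affine_bijections():
        # s2(x) = b(s1(a(x))) fixes b on every point s1(a(x)).
        b = [0] * SIZE
        for x in range(SIZE):
            b[s1[a[x]]] = s2[x]
        if is_affine(b):
            return a, b
    return None

rng = random.Random(0)
s1 = rng.sample(range(SIZE), SIZE)
all_affine = list(affine_bijections())
secret_a, secret_b = rng.choice(all_affine), rng.choice(all_affine)
s2 = [secret_b[s1[secret_a[x]]] for x in range(SIZE)]

a, b = find_affine_equivalence(s1, s2)
ok = all(b[s1[a[x]]] == s2[x] for x in range(SIZE))
print(ok)
```

The pair that is found need not be the secret pair; any pair satisfying the equation is a valid answer to the problem as stated.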
But it turns out that, when applied to white-box cryptography on block ciphers, we don't have a generic instance of the problem. We have an instance which looks like this. If you want to obfuscate a block cipher like AES, you will have, in the middle of your round function, a layer of S-boxes. So our middle layer, which was S1 on the previous slide, is not just some random non-linear part. It's really a concatenation of some S-boxes. So there is already a bit of structure in our problem.
To sum up, the problem is the following. You are given the encoded round function F, so you know it. You know it's built as B times this layer of S-boxes times A, with B and A being our affine secrets. And you want either to find a way to invert this function, which in the context of white-box cryptography leads to a decryption function, or even to find exactly which A and B were used, and this can lead to a key recovery. In our case, our generic algorithm solves the first point. Given such a function, we can efficiently get its inverse.
So I will explain how it works. Basically, it's a two-step algorithm. First, you isolate the input and output subspaces of each S-box. It's actually a technique which has been known since 2001, from Biryukov and Shamir, when they cryptanalysed SASAS. Once you have done that, you just need to apply the generic affine equivalence algorithm I spoke about earlier, but this time not on the whole block cipher, only on each S-box, and those are a lot smaller.
To get the input space of each S-box, the first step is to find a space V1. This space is a linear space of differences, and you want the image of this space through A to have m consecutive zero bits. So basically, the difference at the input of one S-box is zero, so its value is constant, and everywhere else it takes all possible values. This leads to a space of dimension n - m after A. Since A must be invertible, V1 must be of dimension n - m too.
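The structure F = B, S-box layer, A and the role of V1 can be sketched on a toy instance. Everything below is an assumption made for illustration: n = 12, m = 4, random 4-bit S-boxes, random affine maps, and helper names such as `apply_linear` and `output_differences`. The sketch uses the secret A to enumerate V1 directly, which an attacker cannot do; it only checks the dimension claims from the talk.

```python
import random

def rank(vectors):
    """Rank over GF(2) of vectors stored as Python ints."""
    pivots = {}  # leading-bit position -> reduced vector
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                break
            v ^= pivots[top]
    return len(pivots)

def apply_linear(cols, x):
    """Multiply the GF(2) matrix with columns `cols` by the bit vector x."""
    y = 0
    for i, col in enumerate(cols):
        if (x >> i) & 1:
            y ^= col
    return y

def random_invertible(n, rng):
    while True:
        cols = [rng.getrandbits(n) for _ in range(n)]
        if rank(cols) == n:
            return cols

N, M = 12, 4                      # toy sizes: block of n bits, S-boxes of m bits
K, MASK = N // M, (1 << M) - 1
rng = random.Random(1)
A, B = random_invertible(N, rng), random_invertible(N, rng)
a0, b0 = rng.getrandbits(N), rng.getrandbits(N)   # affine constants
SBOXES = [rng.sample(range(1 << M), 1 << M) for _ in range(K)]

def sbox_layer(x):
    return sum(SBOXES[i][(x >> (M * i)) & MASK] << (M * i) for i in range(K))

def F(x):
    """Encoded round function: affine B, then S-box layer, then affine A."""
    return apply_linear(B, sbox_layer(apply_linear(A, x) ^ a0)) ^ b0

def output_differences(delta, samples=64):
    xs = [rng.getrandbits(N) for _ in range(samples)]
    return [F(x) ^ F(x ^ delta) for x in xs]

# V1: differences that A maps to zero on the input bits of the first S-box.
V1 = [d for d in range(1, 1 << N) if apply_linear(A, d) & MASK == 0]
dim_V1 = rank(V1)
# For any difference in V1, the output differences stay inside a space
# of dimension n - m, because the first S-box always sees a zero difference.
max_dim_U1 = max(rank(output_differences(d)) for d in V1)
print(dim_V1, max_dim_U1)
```

For a difference that activates every S-box, the same rank computation typically gives more than n - m, and that gap is what the membership test in the talk exploits.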
But now, all S-boxes are bijective. So if the input is constant, the output is constant too. And if the rest takes all possible values, so does the output. And again, this goes through B, and B is invertible. So the resulting space U1 is also of dimension n - m. We will use this to build V1.
So first we want to build such a space V1, and to do this, we just need a very simple test. If we want to test whether a difference delta belongs to V1, we just generate a bunch of random vectors, enough of them, so basically a bit more than n - m. We compute the resulting output differences, so the difference between the image through F of each vector x and the image of x plus delta. And we compute the dimension of the linear space generated by this set U. If it's of dimension n - m, then, as shown here, the resulting space would be of the same dimension, so we guess that delta belongs to V1 with high probability, and we can basically make this probability as high as we want. Then we just need to do this with some independent vectors to build a basis of V1, making sure that the output difference space is always the same.
We do this for V1, which leads to a zero difference on one S-box, S1. But we can do this k times, to get spaces with a zero difference in each of the other S-boxes. Once you have all those k spaces, you can just take the intersection of all of them except one. Basically, this puts a zero difference in all S-boxes except, for example, the first one. The resulting input space will be of dimension m, because, again, everything here is invertible, and the output space O1 will be of dimension m too. So you now have a mapping from an m-dimensional linear space to another m-dimensional linear space. So you can compute some projections, P_i and Q_i, to send F_2^m to the input space and to send the output space to F_2^m.
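The membership test can be sketched with oracle access to F only. This is a toy reading of the procedure described above, under stated assumptions: n = 12, m = 4, random S-boxes, 64 samples per difference, candidates drawn uniformly at random, and a retry loop in case the first reference difference was a false positive. All names are hypothetical, and the secret A is used only at the very end to check the result.

```python
import random

def rank(vectors):
    """Rank over GF(2) of vectors stored as Python ints."""
    pivots = {}
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                break
            v ^= pivots[top]
    return len(pivots)

def apply_linear(cols, x):
    y = 0
    for i, col in enumerate(cols):
        if (x >> i) & 1:
            y ^= col
    return y

def random_invertible(n, rng):
    while True:
        cols = [rng.getrandbits(n) for _ in range(n)]
        if rank(cols) == n:
            return cols

N, M = 12, 4
K, MASK = N // M, (1 << M) - 1
rng = random.Random(2)
A, B = random_invertible(N, rng), random_invertible(N, rng)   # secret
a0, b0 = rng.getrandbits(N), rng.getrandbits(N)               # secret
SBOXES = [rng.sample(range(1 << M), 1 << M) for _ in range(K)]

def F(x):
    """Toy encoded round function; the attacker only calls it."""
    y = apply_linear(A, x) ^ a0
    y = sum(SBOXES[i][(y >> (M * i)) & MASK] << (M * i) for i in range(K))
    return apply_linear(B, y) ^ b0

def output_differences(delta, samples=64):
    xs = [rng.getrandbits(N) for _ in range(samples)]
    return [F(x) ^ F(x ^ delta) for x in xs]

def recover_one_space(max_tries=4000):
    """Basis of one space V_i, found with oracle access to F only."""
    while True:
        first = rng.getrandbits(N)
        reference = output_differences(first)   # should span U_i
        if rank(reference) != N - M:
            continue
        basis = [first]
        for _ in range(max_tries):
            delta = rng.getrandbits(N)
            if rank(basis + [delta]) == len(basis):
                continue  # delta is already in the span of the basis
            # Same output space as the reference: no new dimension appears.
            if rank(reference + output_differences(delta)) == N - M:
                basis.append(delta)
                if len(basis) == N - M:
                    return basis
        # Too few vectors found: treat the reference as a false positive.

basis = recover_one_space()

def inactive_sboxes(delta):
    """Checker only: uses the secret A, which the attacker does not have."""
    y = apply_linear(A, delta)
    return {i for i in range(K) if (y >> (M * i)) & MASK == 0}

common = set.intersection(*(inactive_sboxes(d) for d in basis))
print(len(basis), rank(basis), len(common))
```

Because the S-boxes are bijective, a non-zero input difference on an S-box never gives a zero output difference, so once the reference really spans U_i, a difference whose outputs stay inside it must lie in V_i. Repeating the search while excluding spaces already found would give the other V_i.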
But now, what you have is basically a map on m bits which is affine equivalent to one of the S-boxes. So now you can apply the affine equivalence algorithm, because it will only be exponential in the size of the S-box, so basically eight bits most of the time. This will give you some affine mappings A_i and B_i. You do this for each S-box, so you get a bunch of A_i, P_i, B_i, Q_i, and you know all of them. You just put them together correctly, and this gives you two affine mappings, B-prime and A-prime. You can now write F, our encoded round function, as B-prime times the layer of S-boxes times A-prime. Now, you know B-prime, you know A-prime, and they are affine, so easy to invert. And you know all the S-boxes, they're small and they're bijective, so you can compute their inverses too. So now you can easily compute an inverse of the encoded round function F with this expression. For each encoded round function, we can actually compute its inverse, so we can compute the whole decryption algorithm.
In terms of complexity, to compare a bit with what was done before, since I don't have that much time: basically, everything before was exponential. The complexity of the Baek et al. algorithm may seem polynomial from the first term, but it's actually of high degree, so it's quite inefficient when n is large. In the best case, we have the following complexity: basically polynomial in n, and exponential only in the size of the S-box. This is if all S-boxes are the same. If you have different S-boxes, you just get a linear factor k, which turns the m^2 times n in the last term into m times n^2. And in the worst case, which is actually the case with the AES S-box, you can't use the improvement by Dinur from Eurocrypt, so you have a factor 2 in the exponent of the last term.
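The inversion step at the end is mechanical once B-prime, A-prime and the S-boxes are known. The sketch below assumes they are already in hand and uses randomly generated stand-ins for them (`A_prime`, `B_prime`, the constants `a0` and `b0`, and toy 4-bit S-boxes are all hypothetical); it shows a Gauss-Jordan inversion over GF(2) and the composition of the three inverses in reverse order.

```python
import random

def apply_linear(cols, x):
    """Multiply the GF(2) matrix with columns `cols` by the bit vector x."""
    y = 0
    for i, col in enumerate(cols):
        if (x >> i) & 1:
            y ^= col
    return y

def invert_linear(cols):
    """Gauss-Jordan over GF(2); each pair (v, c) satisfies v == M * c."""
    n = len(cols)
    pairs = [(cols[i], 1 << i) for i in range(n)]
    for j in range(n):
        pivot = next((i for i in range(j, n) if (pairs[i][0] >> j) & 1), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        pairs[j], pairs[pivot] = pairs[pivot], pairs[j]
        vj, cj = pairs[j]
        for i in range(n):
            if i != j and (pairs[i][0] >> j) & 1:
                pairs[i] = (pairs[i][0] ^ vj, pairs[i][1] ^ cj)
    # Pair j now holds the unit vector e_j, so its c is column j of the inverse.
    return [c for _, c in pairs]

def random_invertible(n, rng):
    while True:
        cols = [rng.getrandbits(n) for _ in range(n)]
        try:
            return cols, invert_linear(cols)
        except ValueError:
            continue

N, M = 12, 4
K, MASK = N // M, (1 << M) - 1
rng = random.Random(3)
# Stand-ins for the recovered affine maps and S-boxes (assumed known here).
A_prime, A_inv = random_invertible(N, rng)
B_prime, B_inv = random_invertible(N, rng)
a0, b0 = rng.getrandbits(N), rng.getrandbits(N)
SBOXES = [rng.sample(range(1 << M), 1 << M) for _ in range(K)]
SBOXES_INV = [[s.index(y) for y in range(1 << M)] for s in SBOXES]

def layer(boxes, x):
    return sum(boxes[i][(x >> (M * i)) & MASK] << (M * i) for i in range(K))

def F(x):
    return apply_linear(B_prime, layer(SBOXES, apply_linear(A_prime, x) ^ a0)) ^ b0

def F_inverse(y):
    # Undo B', then the S-box layer, then A', in that order.
    return apply_linear(A_inv, layer(SBOXES_INV, apply_linear(B_inv, y ^ b0)) ^ a0)

round_trip = all(F_inverse(F(x)) == x for x in range(1 << N))
print(round_trip)
```

Doing this for every encoded round function and chaining the inverses in reverse order gives the decryption, as described above.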
In practice, if you look at this for AES parameters, this leads to only 2^33 operations to invert one round function. And if you look at the Baek et al. proposal, which uses twice the block size, it's just 2^35.
I don't have much time remaining, so just quickly: the idea of Baek et al. was to obfuscate two parallel AES to increase the block size, and they made a security claim of 110 bits. But again, you need to structure the encodings, so the matrix A here. This matrix has the following form, in the top right: each star is some 8-by-8 non-zero block, and otherwise it's all zero. The idea is basically the same as in the generic algorithm, except that now you can use the structure to identify exactly which encodings were used, with a meet-in-the-middle technique. And this allows us to recover the key more efficiently. This was implemented, and it's quite short, only 2,000 lines. Its overall complexity is 2^31, and it takes only 12 seconds and almost no memory. It's available online if you want to take a look at it, and it's basically impossible to fix in a reasonable amount of memory. So, yeah, I think I'm done. I will just leave this summary slide, which sums up everything. Thank you.
Thank you for the talk. Any questions or comments? Maybe one question. Do you think that there are possible improvements to affine encodings in order to defeat your attack? Do you think that it's still a good strategy to use affine encodings to implement white boxes, or is there some way to adapt them to defeat your attack?
Honestly, affine encodings are not useful. You would get them anyway if you use some non-linear encodings, because there is already a non-linear part. But honestly, to get a high complexity against our algorithm, you need either a huge S-box, which leads to huge tables, or a huge block size, which basically doesn't scale very well.
So I think the only way is to use some much bigger non-linear encodings, but, again, the size of the implementation grows quickly. So table-based is mostly dead. Another solution could be polynomial-based, like ASASA. That's a proposal by Biryukov et al. But I think the best way to do white-box now would be to focus on finding a new paradigm, actually, or finding new things to do.
Okay. Thank you. Let's thank the speaker again.