All right, so I'm kind of assuming that most of you already know what DPA is, but just in case, I'm going to give a one-minute flash intro about what it is, if that works. Okay, so it was discovered by the CRI research team in the mid-90s. They discovered both simple and differential power analysis. It's a low-cost, non-invasive attack on crypto hardware and software. You can extract keys with it. You can even reverse engineer some things, for example secret S-boxes and stuff like that. It typically applies to all crypto algorithms that use keys, so most of them. You can extract keys out of symmetric key algorithms. You can extract keys out of public key algorithms. Any type of variant: DH, elliptic curve, RSA, anything you want. It affects all types of hardware and software. It started off on smart cards, which are maybe easier to break than other platforms, but in the last few years it turns out that even ASICs and FPGAs and other types of more complex computation systems can also be attacked with it, and the same techniques actually work. You can use different channels as sources for your information. It can be power consumption, but it can also be radio frequency, electromagnetic emissions, et cetera. Okay, so that was published in 1999, and let me tell you how this works. Let me try to start with RSA. This is super simple and it's super visual. In RSA, you exponentiate with a secret exponent D, and this exponent has bits. So if you use a typical, let's say, square-and-multiply algorithm and you don't worry about anything, then you're going to see that for each zero bit you do a square, and for each one bit you do a square and a multiply. So if you plot a power trace of what's going on on your chip, you will see that the multiplies are actually different from the squares.
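That square-and-multiply leak can be sketched in a few lines of Python. This is a toy model, not any particular RSA implementation: the `ops` list stands in for what a power trace would reveal.

```python
def square_and_multiply(base, exponent, modulus):
    """Left-to-right square-and-multiply, also recording the
    operation sequence that a power trace would reveal."""
    ops = []                                  # "S" = square, "M" = multiply
    result = 1
    for bit in bin(exponent)[2:]:             # exponent bits, MSB first
        result = (result * result) % modulus
        ops.append("S")                       # every bit costs a square
        if bit == "1":
            result = (result * base) % modulus
            ops.append("M")                   # one-bits add a multiply
    return result, ops

# Reading the exponent back off the "trace": "SM" -> 1, lone "S" -> 0.
result, ops = square_and_multiply(5, 0b1011, 1_000_003)
assert result == pow(5, 0b1011, 1_000_003)
assert "".join(ops) == "SM" "S" "SM" "SM"     # bits 1, 0, 1, 1
```

The attacker never needs the intermediate values, only the shape of each operation, which is why one single trace suffices.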
The multiplies look different from the squares, so when you see a big peak that's preceded by a smaller one, that's typically a multiply preceded by a square, and if you only see a small one, that's a square only. So essentially what this says is that you can read the exponent off of one single power trace. Okay, so that is simple power analysis. Differential power analysis is a little bit more tricky, for algorithms such as AES or DES. You don't actually see anything on a single trace. You have to use several, and then you go into the domain of statistical analysis. So you do key hypothesis testing: you guess, let's say, 8 bits of the key for AES, and then you choose one bit of the output of, let's say, one of the first S-boxes. With your key guess, you can pre-compute, for different plaintexts, what that output bit should be. And depending on the value of that bit, you can sort your traces and put them into two different buckets. This assumes that if you compute with ones or zeros, you don't consume the same power on your chip. So you put them into two different buckets, you average all the traces in each bucket, and then you compute the difference between the two averages. And what happens is that if your key guess is correct, you will see a difference, because you will be sorting correctly. Essentially, think of it as weights: if you put all the heavy weights on one side and all the light weights on the other side, the two sides differ. This is the power trace that you see at the bottom of the slide, so you see peaks appearing. It's totally visual and you can just decide, okay, this is the right key guess. For an incorrect key guess, here's what it looks like.
When you start sorting according to an incorrect 8-bit key guess, you're sorting your traces completely randomly into the two buckets, and what ends up in one bucket and in the other is on average the same thing, so you see nothing. In the bottom trace here there are no peaks, nothing. You can't distinguish anything interesting, and therefore you know that this is an incorrect key guess. So you see the difference: here's the bigger one, here's the smaller one. Okay, so that's SPA and DPA in a nutshell, in less than five minutes. Okay, so what do you do to try to counter these types of attacks? You have very classical countermeasures that have been found and described in more than 10,000 papers, I think, over time. You can do several things: you can do obfuscation, you can try to reduce the leakage, you can try to balance your designs, you can try to introduce differential logic styles. You can add temporal noise, amplitude noise, dummy operations; you can add randomness, which means that you try to decorrelate your data from the secret. So you can do all sorts of implementation variations, but all of these suppose that you can change the actual algorithm implementation that you're using. And in the real world, that's not always the case. Sometimes we have people coming to us and saying, hey, can you help us secure our platform and our designs, but please don't change anything in the implementation. Okay, thank you, great, no problem. So what you do then is you go one step further and say, okay, can I change anything at the protocol level, maybe? So not touching the design of the AES or whatever, just trying to change things at the protocol level. And this is what I'm going to try to explain in the next few slides.
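The bucket-and-average attack just described can be simulated in a few lines. This is a toy sketch under stated assumptions: a Hamming-weight leakage model, a shuffled stand-in for a real S-box, and function names that are mine, not from the talk.

```python
import random

random.seed(1)
SBOX = list(range(256)); random.shuffle(SBOX)   # stand-in for the AES S-box

def hw(x):
    return bin(x).count("1")                     # Hamming weight

def trace(pt, key):
    # toy leakage model: power ~ Hamming weight of S-box output, plus noise
    return hw(SBOX[pt ^ key]) + random.gauss(0, 1)

SECRET = 0x3C
data = [(pt, trace(pt, SECRET))
        for pt in (random.randrange(256) for _ in range(3000))]

def difference_of_means(key_guess, bit=0):
    # sort traces into two buckets on the predicted S-box output bit
    ones  = [t for pt, t in data if (SBOX[pt ^ key_guess] >> bit) & 1]
    zeros = [t for pt, t in data if not (SBOX[pt ^ key_guess] >> bit) & 1]
    return abs(sum(ones) / len(ones) - sum(zeros) / len(zeros))

# the correct guess sorts the buckets correctly, so its difference "peaks"
best_guess = max(range(256), key=difference_of_means)
assert best_guess == SECRET
```

With a wrong guess the buckets are effectively random partitions and the difference of means shrinks toward zero, which is exactly the flat trace on the slide.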
Okay, so the idea here is that you say, okay, if each time I use the key I'm going to leak a little bit of information, my goal is going to be to not use the key on too many different plaintext or ciphertext blocks. So I'm going to use the key very carefully and only very few times. What I'm going to describe next mostly applies to symmetric cryptography; it doesn't apply so much to RSA or anything here. What I'm describing is for symmetric-key-based challenge response, encryption, decryption, and stuff like that. So you're assuming that you can find a way to only apply your key to very few plaintext or ciphertext blocks at a time. And then each time you think, okay, I've used it enough, you change it, evolve it, rotate it, and choose a new key. The way you do this is that you typically use something called a key tree. You start off from your shared secret, which is noted as K_root here, and then you define something that's called a path, and I'll come back to what the path is. And then you walk down a tree of keys, meaning that depending on your path bits, you're going to choose to go one direction or the other. If it's zero, you go one way; if it's one, you go the other way. F0 and F1, the functions applied to your key to make it evolve, are one-way functions, and you're going to use something at the bottom of the tree to do your actual computation with. So as a first example, I'm going to show a challenge response using this technique. Normally what you do is you send a challenge to your device when you want to authenticate it, and the device replies with a symmetric crypto function of your root key applied to the challenge. That's a very classical way, and if you want to do it in a DPA-resistant fashion, you do it using your tree of keys. So what we call the path here is going to be, for example, the SHA-256 hash value of the challenge as received on the device's side.
And then from your root key, K_root, you're going to go down your tree to define what the response to the challenge will be. So F0 and F1, again, are two distinct one-way hash functions with low and bounded leakage implementations, and essentially for every bit of the path you're going to choose one direction or the other, and in the end what comes out is the response. The interesting thing here is that no matter how many times you traverse this tree, each individual intermediate key that you compute is only ever involved in three different computations. Take the key K_root,1 here in red: either it's being produced, coming down via F1, or it's being used, going down from K_root,1 through F0 or F1 depending on the next path bit. So you can do anything you want to this protocol. You can attack it any way you want. Every individual key in the tree only ever gets used with three different values, which is not enough to run a DPA attack. Okay, so this is a very simple challenge response, but you can do something a little bit more complicated. You can actually do leakage-resistant encryption and decryption. And leakage-resistant encryption and decryption also means that you're going to use authenticated encryption to make sure that you're not vulnerable to chosen ciphertext attacks. Okay, so where do you use this? Well, encryption and decryption are used in many different contexts: firmware decryption, packet encryption and decryption, storage. And when you have something stored on a disk, for example, you're going to repeatedly decrypt the same thing, which would be fine in general, but obviously the main issue here is going to be that an attacker is able to submit chosen ciphertexts. So even if you don't intend to decrypt any other ciphertext with your key, somebody is going to submit different ciphertexts. Even if the MAC fails in the end, it doesn't matter: the ciphertext will get rejected, but the attacker has already collected the traces needed for power analysis.
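The key-tree challenge response described above can be sketched as follows. This is a minimal sketch: it builds the two one-way functions by prepending a constant byte to SHA-256 (the construction the talk suggests later for H0/H1), and the function names are mine.

```python
import hashlib

def f(direction, key):
    # F0 / F1: two distinct one-way functions, built here by prepending
    # a constant byte (0 or 1) to the input of SHA-256
    return hashlib.sha256(bytes([direction]) + key).digest()

def tree_walk(k_root, path):
    """Walk down the key tree, one F0/F1 step per path bit (MSB first)."""
    key = k_root
    for byte in path:
        for i in range(7, -1, -1):
            key = f((byte >> i) & 1, key)
    return key

def respond(k_root, challenge):
    # the path is the SHA-256 hash of the challenge; the leaf key at the
    # bottom of the tree is the response
    return tree_walk(k_root, hashlib.sha256(challenge).digest())

k_root = b"\x00" * 32
assert respond(k_root, b"hello") == respond(k_root, b"hello")  # verifiable
assert respond(k_root, b"hello") != respond(k_root, b"world")
```

Each intermediate key here is touched by at most three computations, produced once by its parent and fed once each into F0 and F1, which is the property that starves a DPA attacker of traces.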
Okay, so our approach here is to use every key for only a very small number of blocks and to change keys all the time. All right, so how does this work? First of all, here is the encrypted data format. You choose a message identifier at random, and you put it together with something called a validator, which is basically a MAC, and I'll come back on how to compute that, into a header. And then you start encrypting chunks, which are noted O1 to ON. The red part is the ciphertext corresponding to your encrypted plaintext. And the gray portions are hashes: each block carries a hash of the next block, and the last one carries a hash of the full plaintext. I'll describe this in a little bit more detail in a moment. So first of all, let's say we start off from a message-specific key, K_message, and I'll explain how you compute that one later. You start off from the K_message key, and what you do is that you hash it as you go, so that it changes every few plaintext blocks. And by plaintext blocks, I mean not necessarily just one AES block; it can be a few, but not too many. You can set a specific threshold here according to how much your actual AES implementation leaks. So you hash the keys one after the other, and you change every so many AES plaintext blocks. Okay, that's the first step. Now you encrypt each block Pi with the corresponding Ki. Once you're done with all that, you take all the plaintext blocks, hash them together, and put the result at the end of the last encrypted ciphertext chunk, on the complete right-hand side: H of the full plaintext. And then you go backwards and create a hash chain over the ciphertext. So you create the hash of the full last block and include it in the previous block; that's H of O3 in this picture. And then you do the same thing going backwards, and you end up with the very first hash, H of O1 here, which we're going to keep aside for a second. All right. This is the first step here.
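That first step can be sketched roughly as below, under loudly stated assumptions: the key evolves every chunk here rather than every threshold of AES blocks, a SHA-256 counter keystream stands in for the actual AES encryption so the sketch is self-contained, and `encrypt_body` and friends are my names, not the talk's.

```python
import hashlib

H = lambda *parts: hashlib.sha256(b"".join(parts)).digest()

def xor_stream(key, data):
    # stand-in for AES: a SHA-256 counter keystream XORed into the data,
    # just so this sketch runs without a crypto library
    out, ctr = b"", 0
    while len(out) < len(data):
        out += H(key, ctr.to_bytes(4, "big"))
        ctr += 1
    return bytes(x ^ y for x, y in zip(data, out))

def encrypt_body(k_message, chunks):
    """Step one of the scheme: evolve the key for every chunk, then
    chain hashes backwards through the ciphertext blocks."""
    key, cts = k_message, []
    for p in chunks:
        cts.append(xor_stream(key, p))       # encrypt chunk with current key
        key = H(b"next", key)                # hash the key as you go
    blocks = [cts[-1] + H(*chunks)]          # last block carries H(full plaintext)
    for ct in reversed(cts[:-1]):
        blocks.insert(0, ct + H(blocks[0]))  # each block carries H(next block)
    return blocks, H(blocks[0])              # H(O1), kept aside for the validator

blocks, h_o1 = encrypt_body(b"\x01" * 32, [b"chunk-one", b"chunk-two", b"chunk-3"])
assert xor_stream(b"\x01" * 32, blocks[0][:9]) == b"chunk-one"  # XOR decrypts back
```

In the real scheme the rotation threshold is set per implementation, depending on how much the AES core leaks per use.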
The second step is, okay, how do I derive the message-specific key? Well, it's very simple. Very similarly to the challenge-response protocol, you're going to traverse a tree of keys and compute a message-specific key K_message by applying the message ID, the randomly chosen message ID, as the path. You end up with one key that completely depends on your message ID and only on that. The two different functions, H0 and H1, are one-way, and they need to be different. So for example, you can use SHA-256, prepending 0 or 1 or some other distinct constant value to what you're hashing. Okay, so from the message ID you now have a K_message. And now, to finish off the protocol, how do you finally compute the validator, the MAC that will check that nobody submitted a wrong ciphertext to you? You take your K_message, and you take the hash of the O1 value, remember, the one you computed hashing backwards. And now you apply this hash value as a path to the K_message key, again in a tree fashion, and out comes at the end the value that we call the validator. Okay. So now that everybody's confused, let me try to show what this thing looks like. As a summary of the whole protocol: you generate a random message ID and plug it in at the beginning, then hash the keys as you go and encrypt each chunk, hash the full plaintext and put it at the end, hash backwards on each block, take the final hash, which is the hash of the very first block, walk down the tree again to compute the validator, plug it in there, and that's your full encrypted data. For decryption, you go the other way around. You derive the message-specific key K_message by walking down the key tree from the root key with the message ID, and then you verify the validator first. So once you know that the validator is correct, you can start decrypting.
You decrypt each chunk and hash it as you go to make sure that each chunk has not been modified, and at the end, you check the final hash to make sure it corresponds to the hash of the plaintext. So essentially, what you're doing in this protocol is that every time you touch a key with something that's message dependent, you walk through a tree of keys, which means that every single key is only applied to very few values, and if anybody tries to submit a wrong ciphertext to you somewhere, something is going to change, which makes sure that some of the values somewhere will not match up. OK, so this is a typical protocol-level countermeasure for the real world. You don't have to change pretty much anything in your AES implementation. You just need to make sure you can actually change the protocol itself. That's not always possible, but sometimes you can be lucky and be in a scenario where you don't have to follow standards or be interoperable with anybody other than the other side of your own system. And so this works nicely in some settings. For example, a typical application is bitstream encryption on FPGAs. You have an encrypted bitstream sitting somewhere in memory, and when you load it, this is how you decrypt your bitstream and start running it on your FPGA. This is kind of a closed system where both sides can decide to run the same protocol, and so you can apply this kind of countermeasure. OK, so let's see, I have a few minutes left. Let me talk about something a little bit different now. Something else we're working on at the company right now is trying to change a little bit how you do product testing, side-channel resistance testing, for actual products. And for that, we propose a somewhat new methodology, which is called test vector leakage assessment methodology.
The goal of this is to achieve some kind of repeatable and very precise method for testing, which is more automated and less subjective than pure lab analysis, which is low cost, and which allows fast time to market. The reason for this is that in general, at least in the smart card world, people have so far chosen to go down the evaluation route, which is very similar to Common Criteria evaluation, for example, where you have labs essentially testing and trying to break into your products, and if they succeed, then you fail, and if they don't succeed, then you win. What we're trying to propose here is a little bit different and goes in the direction of validation. This means you try to write a specification, which could hopefully be standardized, and which says: OK, if you apply these specific test vectors and you run these vectors on your device with specifically designed and standardized tools, then if the result corresponds to some expected output, you know that your product is secure. So it's a slightly different approach. And the second reason you want to do this is to say: OK, if I find a leak I might not necessarily be able to extract keys, but if I don't find a leak, then for sure I know I can't extract any keys. So this goes somewhat further than trying to break keys out of a product over and over again. Each of these methods has pros and cons, obviously, but the one that we're describing here is a little bit more standardizable, and we're hoping that this could end up in ISO standards. There are already efforts to do that, and maybe I can show you a little bit of where the standards are at, but it's on its way. OK, so essentially, this is a test vector approach, similar to known answer tests in some sense. You want to change from an attack-style methodology to a white-box validation, and you define exactly what the test vectors should be for every single algorithm.
So essentially, you're going from giving a lab all of the work to try to hack into each new product, to having experts sit together and define what the perfect set of test vectors is for each specific algorithm. That mostly depends on the algorithm, and once this is done, you can have everybody run the same kind of tests and be validated in a comparable way. It does require a lot of control over the keys and everything that gets loaded into products. So your product and your test setup need to be a little bit open, so that you can set specific keys, set specific data, run it, and then measure what the outcome is. But once you have that implemented, then it's all pretty much automated and standard. OK, so what it then gives you is that you measure leakage, and if your leakage is below a specific threshold, you're reasonably assured that your product is secure. What we propose is to use something called Welch's t-test. You have two sample populations that you're going to compare. You have their means, you have their standard deviations, and you compute the t value for that, which is the equation on the right. Then what you're trying to do is to assess, when you observe two different populations, whether there's any significance in the difference of their means. Remember, DPA attacks are about comparing sets of traces to see if there's any difference. So here we're trying to assess whether that difference shows up, not visually but statistically, and whether that difference means anything or not. And there's a certain threshold, which is set here at five nines of confidence, so 4.5 standard deviations: if you're below that, it means that you statistically can't see any difference between the two sets, and therefore you know there's no leak.
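A minimal sketch of that test, with the t statistic written out and synthetic stand-ins for the trace populations; the 4.5 bound is the threshold the talk mentions, everything else here is illustrative.

```python
import math
import random

def welch_t(a, b):
    """Welch's t statistic: t = (ma - mb) / sqrt(va/na + vb/nb)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)   # sample variances
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    return (ma - mb) / math.sqrt(va / na + vb / nb)

random.seed(0)
fixed = [random.gauss(10.0, 1.0) for _ in range(5000)]   # fixed-input traces
rand  = [random.gauss(10.0, 1.0) for _ in range(5000)]   # random-input traces
leaky = [random.gauss(10.2, 1.0) for _ in range(5000)]   # a device that leaks

THRESHOLD = 4.5   # the ~five-nines confidence bound from the talk
assert abs(welch_t(fixed, rand)) < THRESHOLD    # no detectable difference
assert abs(welch_t(fixed, leaky)) > THRESHOLD   # the means differ: a leak
```

In practice the test is run point-wise along the traces, so a leak shows up as the |t| curve crossing the 4.5 line at some instant in the computation.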
And if there's no leak, then for sure you know that you can't extract any keys out of this. What this requires, though, is specific data sets and test vectors. You have to study the algorithms a little bit to figure out what the best test vectors are, to find out where the best attacks might be, and then standardize or define the vectors that you want to submit your product to. Again, there are different techniques; it takes a bit of time to explain all of that, but I guess you get the idea. For instance, you compare fixed data against random data, random keys, et cetera, and try to see if there's any difference between those two sets. Okay, two independent data sets: this just tells you that you need to collect quite some data. In order to have higher confidence, what you can also do is collect two different data sets and run the test separately twice. If both tests fail at the same point in time, then for sure you know there is a leak, and if they don't, then you can decide that there is no leak. From a vendor perspective, the requirement is that you submit documentation to explain what your algorithm looks like, what your modes and use cases are, how you implement countermeasures, and what your rationale is, so that doesn't change too much. But then you submit a device to be tested which allows the tester to set any keys he wants, set data the way he wants, set messages the way he wants, get trigger signals, and essentially run the standardized test vectors as described. The other thing is that this type of approach is defined to be time bounded and analysis bounded. You only get a certain number of data collection points, let's say one day of measurement, or 100,000 traces or something equivalent, and then you can only do bounded-time analysis, so that it's repeatable and measurable in the same way on different devices.
Okay, so this is the kind of approach that we propose, and let me tell you a little bit how far along we are in the last minute, I guess. How many minutes, one? Five, okay, 10? Okay, five. So let me tell you how far along we are with this initiative. In terms of standardization, there are a number of standards that look at this approach: ISO 19790, ISO 17825, and ISO 20085 with its two parts. The first one, ISO 19790, is pretty far along, but there's a little caveat at the end; let me try to explain what that is. This standard defines security requirements for cryptographic modules. It has four security levels defined. In section 7.8, which talks about non-invasive security, read: side-channel attacks, it says that modules may implement various techniques to mitigate these types of attacks, and then it goes on to describe mitigation techniques. There are two types. Those that are not referenced in something called Annex F, where it says they will be validated when requirements and associated tests are developed. So future work, not ready. And then there are the ones that are referenced in Annex F, and they have to meet specific requirements. For security levels three and four, where three is called strong and four is called stronger, it says they shall be tested to meet the approved non-invasive attack mitigation test metrics referenced in Annex F. Great, so far so good. And then you move on and you arrive at Annex F, titled approved non-invasive attack mitigation test metrics, and there it says: there are no approved non-invasive attack mitigation test metrics defined at this time. So we're almost there. We just need to fill in Annex F, and then we'll be done. Okay, but there is actually another standard which does define test metrics, so there could be a link between the two at some point. Who knows?
That one is called testing methods for the mitigation of non-invasive attack classes against cryptographic modules. It does actually talk about test metrics for security levels three and four, as mentioned before. It talks about side channels such as power, electromagnetic, and timing attacks, and it also leaves open the possibility of future side channels, such as photon emissions and acoustic signals. It describes how to collect and analyze measurements, and it gives the test limitations that I talked about before as well. So remember, a test is limited in time and limited in the number of waveforms. For example, at level three, you can have six hours maximum collection time per test and 72 hours total acquisition time max, with 10,000 waveforms; at level four, it's a little bit more: 24 hours per test, 288 hours maximum total acquisition time, and 100,000 waveforms. And then when you've done all your tests, the device fails if a bias exceeds a specific leakage threshold, and otherwise it passes. Okay, so we're almost there. We just need to link this standard with the first one. Then there's one more standard, which has two parts, but it's still under development. To be complete in this world, you also have to have tools, and these tools have to be calibrated, and these tools have requirements on them. So there's this third standard, which talks about how to define test tools and test techniques, how they should work, how they should get calibrated, and what that apparatus should look like, essentially. Okay, so we're almost full circle. And this, let me skip that one, leads me to my conclusion. So we're almost there. What I've shown in this presentation is a protocol-level countermeasure for the real world, for when you can't change anything in the implementation but still want to make your device secure.
If you're able to change the protocol (of course, sometimes that's not possible), this is a really nice approach. It lets you keep standard legacy crypto cores and crypto software, with no changes needed in the implementation. In the second part, I talked about an alternative statistical side-channel resistance testing approach called TVLA, which is undergoing standardization at this point. At the two links below here, on our website and at the upcoming International Cryptographic Module Conference, you can find documents that show what these test vectors look like, and there's documentation for AES, DES, when used in HMAC mode, RSA, elliptic curve, and so on. And I guess I'm almost on time. That's it, thank you.

Okay, Greg, you're standing between us and coffee, so you get one question.

In your authenticated encryption model, everything started with the random message number. And the current direction is to try to get away from the necessity for randomized IVs, rather than counters or nonces or whatever. And of course, the usual solution to that is you just encrypt a counter and now it's randomized, or it looks randomized, except that doesn't work for you, I don't think. So do you have a scheme for coming up with good randomized message numbers?

Right, so I said randomized, but what I actually mean is that it just has to be unique per message. You don't want to repeat per message, so you could just use a counter directly.

Oh, okay. No need to encrypt, because otherwise, yeah, chicken-and-egg problems. Fantastic, thank you.

So just the counter is fine.

I was just wondering, on this test vector analysis for power: it sounded like generating the test vectors is still sort of a manual process of looking at the algorithm and figuring it out. Have you looked at automated methods for determining the worst-case test vectors, in order to then have those be the standardized tests?
Well, so I described it a little bit as magic, right? You have to look at the algorithm and find the best ones. You can actually be very systematic, but that will make for quite a big number of tests. You could look at each bit, each byte, each output, each round, et cetera, things like that. So you want to be a little bit more precise and faster, and that's why we look at the algorithms, but you could do them all. It would take quite a long time, but there's certainly room for research to build such things.

I guess I was wondering if you could use simulators or something to, without exhausting the space, determine maybe not the worst case, but find bad test vectors that you could then test on the implementation.

I think that pretty much relates to cryptanalysis, so I think there's room to go together there. If you cryptanalyze the algorithm, you'll figure out what the worst test vectors are that you could submit this thing to.

Okay, let's thank Linda again. Thank you.