The final talk of this session is given by Michael Tunstall, on masked tables as an underestimated security risk, which is joint work with Carolyn Whitnall. The floor is yours.

Yes, so, continuing the theme of side-channel analysis, I'll start with the conduct of a simple differential power analysis. We base our attack on the observation that an instruction manipulating data may consume more or less power depending on the Hamming weight of the data being manipulated. For example, for this instruction here you can see different power-consumption levels depending on the Hamming weight of the data it is manipulating. What we typically do with this is take a set of acquisitions and, for each one, predict the Hamming weight of the S-box output in the first round for a given key hypothesis. We make a hypothesis on one key byte; if our hypothesis is correct we see a significant correlation at the point in time where that value is manipulated by the processor we're attacking, and for incorrect hypotheses we see something like these grey traces in the background, where no significant correlation is observed. As we increase the number of traces we get a better and better distinction between correct and incorrect hypotheses. The countermeasures for this that I'll be looking at are much simpler than the countermeasure described by Thomas just now. The first thing that was proposed was just a simple XOR with a random value: you XOR one random value with your plaintext going in, then you choose another random value to XOR with your key bytes, and you then manipulate your plaintext and key such that every intermediate state is masked. This poses problems, as described by Thomas, when you have a nonlinear function, so for the S-box of AES you are required to construct a table in memory.
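The correlation-based DPA outlined above can be sketched in a few lines. Everything below is an illustrative simulation, not the measurement setup from the talk: the S-box is a stand-in pseudo-random permutation, and the "traces" are synthetic Hamming-weight leakage plus Gaussian noise.

```python
import random

# Toy S-box: a fixed pseudo-random 8-bit permutation standing in for the
# AES S-box (the attack only needs a known nonlinear lookup table).
rng = random.Random(0)
SBOX = list(range(256))
rng.shuffle(SBOX)

def hw(x):
    """Hamming weight of an 8-bit value."""
    return bin(x).count("1")

def corr(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = sum((a - mx) ** 2 for a in xs) ** 0.5
    sy = sum((b - my) ** 2 for b in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def cpa_key_byte(plaintexts, traces):
    """Return the key-byte hypothesis whose predicted HW(SBOX[p ^ k])
    correlates best with the measured leakage."""
    return max(range(256),
               key=lambda k: abs(corr([hw(SBOX[p ^ k]) for p in plaintexts],
                                      traces)))

# Simulated acquisitions: one leakage sample per trace, taken at the point
# where the first-round S-box output is manipulated, plus noise.
key = 0x3C
plaintexts = [rng.randrange(256) for _ in range(500)]
traces = [hw(SBOX[p ^ key]) + rng.gauss(0, 1) for p in plaintexts]
```

With 500 noisy traces the correct hypothesis stands out clearly, mirroring the red-versus-grey correlation traces on the slide.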
So you have the S-box S, where you XOR the index with a random value r, and the value at that index is XORed with another random value s, to produce a table S' in memory, which you can then use to compute AES quickly. Another proposition was to use affine masking for AES, where the structure of AES is used to mask. The most difficult part is moving to an additive mask, where you are obliged to construct one table that maps x to (r · x) ⊕ r', and a second table S' is constructed where the index and the data itself are masked using that map. In all cases this has to happen before you can actually compute AES. Moving on to higher-order masking schemes: as Thomas was saying, we often have more than one share if we want higher-order resistance. In a second-order masking scheme you have two random values r1, r2 that mask the index, and two shares s1, s2 that mask the data, but in this instance the table has to be constructed every single time you want to do a table lookup, as each lookup produces the result as a single plain value, and you then have to change the random values to avoid leakage. So these things can be very, very costly. Looking at these schemes: they have been proposed, they can be shown to be secure — there are proofs for some of them — but we have to construct these tables in memory, we have a known index going from 0 up to 255, and the loop iterations are visible in the power consumption. Here you can see a repeating loop, between the black lines, of an implementation of Boolean masking, where it's creating a table in which the index and the data being stored are both masked with some random value; and here the same for affine masking, where it's generating the entries of the map that takes x to (x · r) ⊕ r'.
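The masked-table constructions just described can be sketched as follows. The S-box here is a toy bijective placeholder (the real scheme would use the AES S-box); r, s and the share names follow the talk, and everything else is illustrative.

```python
SBOX = [(x * 31 + 17) % 256 for x in range(256)]  # toy bijective lookup table

def masked_table(sbox, r, s):
    """First-order Boolean masking: build S' with S'[x ^ r] = S[x] ^ s,
    so that a lookup on a masked index directly yields a masked output."""
    t = [0] * 256
    for x in range(256):
        t[x ^ r] = sbox[x] ^ s
    return t

def masked_table_2nd(sbox, r1, r2, s1, s2):
    """Second-order variant: the index is protected by shares r1, r2 and the
    output by s1, s2. As noted in the talk, this table has to be rebuilt with
    fresh randoms around every lookup, which is what makes it so costly."""
    t = [0] * 256
    for x in range(256):
        t[x ^ r1 ^ r2] = sbox[x] ^ s1 ^ s2
    return t
```

Note that in both cases the table-generation loop runs over the full known index range 0..255, which is exactly the structure the attack below exploits.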
So you can then cut these out into sub-traces and use those to conduct a DPA attack on the masking values themselves, which then allows you to conduct a DPA afterwards. We have two implementations, one on an ARM7 microprocessor and one on an 8051 microcontroller. As you can see, the mask recovery is almost perfect, this being the number of bits that are correct. For the address mask — the random value XORed with the index going in — we are always correct. For the data mask there is always one bit wrong, for some reason that we were never able to pin down. But in any case this completely removes the masking, and you can conduct a standard DPA afterwards, taking the mask values into account; we have similar results for affine masking. So as countermeasures, what could you do? The idea would be to propose some function f that controls the route taken through the indices from 0 to 255, so that an attacker cannot form a hypothesis on the index — which is harder than you'd imagine. We propose three different things. The first is a random start index: when you generate your table, you choose a random value k, add it to each index, and generate your table in that order. The second is a random walk: rather than some fixed function, the table is generated in a random order governed by lots of random values. The third is a random computation that you use repeatedly; here our random computation involves a multiplication and an addition modulo the size of the table, and we use the same random parameters again and again. Let's look at the random-computation countermeasure in more detail. Recall that we can cut out individual sub-traces like this, so we have 256 different sub-traces, each with a known index going in, during the computation that generates the table from which we want to retrieve the mask. So if we superimpose some of those traces…
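The three countermeasures can be sketched like this. SBOX, r, s and the parameter names are placeholders; only the indexing strategies follow the talk, and the exact index function used in the real implementation may differ.

```python
SBOX = [(x * 31 + 17) % 256 for x in range(256)]  # toy bijective lookup table

def build_random_start(sbox, r, s, k):
    """Random start index: the loop still runs 256 times, but processing
    begins at a random offset k, so loop step i no longer reveals the index."""
    t = [0] * 256
    for i in range(256):
        x = (i + k) & 0xFF          # index actually processed at step i
        t[x ^ r] = sbox[x] ^ s
    return t

def build_random_walk(sbox, r, s, order):
    """Random walk: `order` is a random permutation of 0..255 (costly to
    generate on a card) governing the sequence of table writes."""
    t = [0] * 256
    for x in order:
        t[x ^ r] = sbox[x] ^ s
    return t

def build_random_computation(sbox, r, s, w, y):
    """Random computation: the visited index is derived from the loop counter
    by a fixed random affine step, here (w*i + y) mod 256 with w odd so that
    the map is a permutation; the same w, y are reused for the whole loop."""
    t = [0] * 256
    for i in range(256):
        x = (w * i + y) & 0xFF
        t[x ^ r] = sbox[x] ^ s
    return t
```

All three produce the same masked table; they differ only in the order the entries are written, which is what the attacker observes.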
So, superimposing them: these sub-traces are 200 to 2,000 points long, and I've superimposed four of them here so you can see they have the same form. The points of difference we're looking for you can see in red, green and magenta; those differences are where the correlation we're trying to get at will appear. Given that the processor is obliged to compute this index, we can divide the computation into its different operations. x is known — it's our index, going from 0 to 255 — and w is some fixed random value for a single trace. So for the 256 possible values of w we can compute a correlation trace; here the correct value of w is the red trace, with the incorrect values being the grey traces hiding behind it — there are 255 grey traces there. I'm sure you can see what's going to happen next as we move along to the next operation. This here will be the next instruction after the output of the multiplication; it gives us the correct value for u in the same way, the red trace against the grey traces for the incorrect values. This larger peak here is in fact a multiplication by 1 — that is, the result w · 1 — and what we know from our previous observation, when we derived w, is that it occurred here, so obviously we want a sample from the next instruction. Then, doing the same thing for the addition of y, we have exactly the same observation, where we can derive the correct value of y — and just before it we have an addition of 0. In the next part of the attack we have the XOR with z and, finally, the XOR with our address mask; but we don't need to derive z and the address mask separately — we just want either the address mask or the data mask. So we look for the XOR sum of z and m1, and we look at the output of S, because the S-box output has a nonlinear relationship to all of the preceding functions: we won't get any peaks beforehand, and we get a good clear result for the correct value of z ⊕ m1, which also confirms all of our previous hypotheses.
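The first step of this procedure — recovering w from the 256 sub-traces by correlating, for each hypothesis, the predicted Hamming weight of x·w against the leakage at the multiplication — can be sketched as below. The leakage model, noise level, and all names are illustrative assumptions, not measurements from the talk.

```python
import random

def hw(x):
    """Hamming weight of an 8-bit value."""
    return bin(x).count("1")

def corr(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = sum((a - mx) ** 2 for a in xs) ** 0.5
    sy = sum((b - my) ** 2 for b in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def recover_w(leakage):
    """Rank the hypotheses for w: the index x of each sub-trace is known
    (0..255), exactly as in the attack described above. w = 0 is excluded
    (it would not be used, as x -> x*w must be a permutation)."""
    return max(range(1, 256),
               key=lambda w: abs(corr([hw((x * w) & 0xFF)
                                       for x in range(256)], leakage)))

# Simulated sub-trace samples at the point where x*w is manipulated;
# w is odd so that x -> x*w mod 256 is a permutation.
secret_w = 0xB7
rng = random.Random(2)
leakage = [hw((x * secret_w) & 0xFF) + rng.gauss(0, 0.25)
           for x in range(256)]
```

Once w is fixed, the same correlation step is repeated on the following instructions to peel off y and then z ⊕ m1, as the talk describes.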
If we see no clear peak here, clearly something has gone wrong and we have to go back and look at the lower-ranked values for the previous masks. Finally we can get hold of our data mask — again, this is the trace — this being the value that you would store and then use to enable your DPA attack afterwards. You have to do that process for each trace you acquire — say, a thousand traces — and then use the mask values corresponding to each trace to conduct the DPA. Obviously, if we have a random start index we're essentially doing the same thing, but it's much simpler because we just have one addition with a random value beforehand. As you can see, the error rate is quite small: in the majority of cases we have zero or one bit in error, which is negligible when conducting a DPA afterwards — and these numbers were generated with a thousand traces. Then for the random walk, with permutations: since a permutation of length n is repeated, the first element of each repetition will be the same, so we could pick out those points and try to extract some information about which mask bits and which permutation values were used. This was done by generating lots and lots of candidate values, storing those that fitted best, identifying more and more parts of the random permutation, and building up what looked like the best combination of permutation and mask values. In the case where the permutation was 32 elements long, we went up to about 16,000 combinations kept in memory while trying to get hold of the data mask at the end.
Here are the error rates for the permutations. For short permutations the error rates are quite low; as the permutation length increases, it tends towards a uniformly random distribution, so when the permutation length is the same size as the table we would have a perfectly uniform distribution and there would be no information to get hold of. But these values here, only going up to length 32, should be adequate for the DPA afterwards — the bias at length 32 is about half a bit. As for countermeasures that would make this nearly impossible, we did rather run out of ideas. As I was saying, the permutation can't really be simplified: to generate such a thing you would need on the order of 256 true random values, which on a smart card is going to be very difficult to generate, so the computation time may be prohibitive. And the success of the attack assumes that 256 traces are sufficient to achieve it; we have an analysis and an evaluation of the signal-to-noise ratios in our paper, for those who are interested. That's it, thank you.

Are there any questions?

Q: Did you try to compare the results of your attack with more classical second-order attacks?

A: No, because it's not a second-order attack.

Q: No, but did you compare the efficiency of the attacks? You could mount a second-order attack on these implementations, right?

A: Sure, but with a second-order attack you end up with a correlation coefficient of about 0.15 or so, whereas when you remove the mask you're back up to 0.7, as you would be for a first-order attack, because you're just pushing the mask to one side.

Q: Yeah, that's right.

A: So you should have a better distinguisher for a smaller number of traces, if you assume that's possible.
Q: That's right, if you manage to remove the mask. I mean, you assume that you're able to retrieve the mask exactly during the table recomputation; on very noisy devices you may not be able to do so. Also, one of your devices is very low-noise — there is almost no noise — and I think a second-order attack would be very fast there, because as we know its efficiency is very much related to the noise level. But, as you said, it would cost something anyway, so if you can retrieve the mask completely, then for sure this first-order attack would be more efficient.

A: Yeah. For the second order on the ARM, I have done second-order analyses, but I can't remember how many traces you require.

[Chair] All right, thank you. Are there further questions?

Q: If I have understood correctly, you recover the masks, right? In just 256 traces — do you try to recover the mask for every trace, or do you do something else?

A: You have one trace, from which you take 256 sub-traces, so the mask is fixed for those 256 sub-traces.

Q: The mask is fixed for all sub-traces?

A: Yes, because they're from the same trace: the random value within one trace is fixed, but it's used multiple times.

Q: Then you do mask reuse, right? You have the same mask for the subsequent S-boxes in one round, for instance?

A: Well, yes, for standard Boolean masking, for example.

Q: You refresh the S-box masks — the masked table, let's say — not so frequently, right? Well, it makes a difference, since for every sub-trace, let's say, you need to know the mask.

A: But the mask will be the same in each sub-trace.

Q: Then how frequently do you recompute the masked table? Never?

A: Well, is there a need to recompute it? Not on demand, anyway. I don't understand the question.

Q: As a designer, you have a design, right?
Q: In this design, you refresh the mask — sorry, refresh the masked table. How frequently do you do it?

A: The thing is, you would pick one instance where the masked table is refreshed or created, and you attack directly afterwards. So it could be refreshed or recreated afterwards as often as you like.

Q: And you attack the part where the masked table is refreshed? You recover the masks that were just used to refresh the table, and then go back to it, right?

A: So, for something like the second-order masking, when you're creating the table there is only one table creation to look at.

Q: Yeah.

A: You attack that, and then you do the DPA on the other side.

Q: Got it.

[Chair] Further questions? We still have five minutes' time. Otherwise, if there are no further questions, let's thank all the speakers of the session again.