Hi, welcome to the presentation of the paper "Masking in Fine-Grained Leakage Models: Construction, Implementation and Verification". I am Marc Gourjon, and this is joint work with Gilles Barthe, Benjamin Grégoire, Maximilian Orlt, Clara Paglialonga and Lars Porin. This paper and this presentation are about masking and verification. Masking is a countermeasure against side-channel attacks, and it is a great countermeasure because it comes with an information-theoretic security guarantee, which allows us to establish proofs that a certain algorithm is secure against certain side-channel attacks. This allows us to really rule out entire classes of side-channel attacks, which is great because it provides strong resilience.

Now, the problem is that every proof comes with assumptions, and in reality these assumptions are often violated. An example is the leakage behavior considered in the proof versus the leakage behavior actually occurring on a real device: in reality, the leakage which can be observed or exploited in side-channel attacks is much greater, much more diverse, and moreover different from what is considered in the proof. This means there is a gap between provable resilience and resilience in practice. This gap is unfortunate, because provable resilience is actually quite valuable: it can rule out entire attack classes. Furthermore, implementation is challenging, because we want to rely on provably secure algorithms, but we do not only want to implement them correctly and securely, we furthermore want to harden them such that we achieve actual resilience in practice; that is, we always want to land in the green interval shown here. And this is very tedious to do: implementing means that one has to come up with an implementation, measure on a real device, perform test vector leakage assessment, and so on.
And this is very tedious and time-consuming, and it really restricts the level of creativity, or the optimizations we can explore, because it takes so much time to perform. So in our work we shrink this gap: we narrow the gap between provable resilience and resilience in practice, such that verification can actually deliver something to people who want to implement masked algorithms on concrete hardware. We do this by performing the verification of masking security on concrete implementations, on executables so to say, at the assembly level. We also consider leakage models which are much more fine-grained and able to capture what is actually observable, or leaking, in practice. As a last ingredient, we also perform this verification under stronger security definitions than the ones we already have. Overall this is very beneficial, as it aids the construction and implementation of the masking countermeasure, with practical resilience as the end result, and this not only at first order but also at higher orders.

In this presentation I will briefly go into hardened masking, then explain how we make verification in fine-grained leakage models possible and how we automate the verification there. In the end I am going to show how, in our case study, we were able to very efficiently harden a PRESENT S-box and explore so many more optimization strategies that our second-order PRESENT S-box, hardened and practically resilient, was as fast, in number of cycles, as the first-order implementation built with the naive strategy of composing hardened gadgets. So we were essentially gaining a security order for free. Stay tuned for how we did it.

Okay, a quick recap of side-channel attacks.
A side channel is a physical effect: for example, a processor performing some computation, say XOR-ing a sensitive value X with a value P. Executing this instruction on a device causes a certain power consumption, and this power consumption is data-dependent due to the charges involved in the processor. An adversary is able to measure this power consumption, obtain data-dependent power measurements (this also works with electromagnetic measurements), and thereby launch side-channel attacks which allow the sensitive value X to be retrieved.

Now, against a t-th order side-channel attack, which exploits t measurement samples, there is t-th order masking. It works by taking the sensitive value X and producing multiple shares of it, splitting the original sensitive value X into multiple pieces X1 to Xt. Then there is provable security, in the sense that in probing security we can say that no t observations may reveal anything about the secret, and we can make a proof based on these shares that this is indeed the case. You can already observe that here it says observations and there it says measurement samples, so there is a difference between the two. In our work, what we actually do is narrow the gap between these measurement samples and observations by working in very fine-grained, expressive leakage models. So let's quickly look at what such an algorithm and an implementation of it look like.
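Before looking at the gadget, the masking recap can be made concrete with a small Python sketch. This is my own illustration, not the paper's notation: the function names `share`, `unshare` and `masked_xor` are hypothetical, and the common instantiation with t+1 shares for t-th order security is assumed.

```python
import secrets

def share(x: int, t: int, width: int = 32) -> list[int]:
    """t-th order Boolean masking: split x into t+1 shares.
    The first t shares are uniform randomness; the last one is
    chosen so that XOR-ing all shares recombines to x."""
    rnd = [secrets.randbits(width) for _ in range(t)]
    last = x
    for r in rnd:
        last ^= r
    return rnd + [last]

def unshare(shares: list[int]) -> int:
    """Recombine a sharing by XOR-ing all shares together."""
    x = 0
    for s in shares:
        x ^= s
    return x

def masked_xor(a: list[int], b: list[int]) -> list[int]:
    """Share-wise XOR gadget: c_i = a_i ^ b_i. Each output share
    depends on one share of each input, never on the recombined secrets."""
    return [ai ^ bi for ai, bi in zip(a, b)]
```

Any t of the t+1 shares are uniformly distributed and independent of x, which is what probing-model proofs build on.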
On the left-hand side we have a masked XOR gadget which computes the XOR of the shares ai and bi share-wise: it takes a number of shares as input and produces a number of shares as output. Usually such a gadget, which performs an XOR computation, comes with a correctness proof, saying that the gadget computes an XOR of the shared values, and also with a security proof, for example in a probing model, which specifies that for observations in a specific model, where for example a0 and b0 can be observed, or the XOR sum, or any of the other sums, no observation set consisting of multiple such observations is able to reveal the secret. That is quite great, because we have a provably secure algorithm here. But this algorithm does not execute on most devices, so there needs to be an implementation, and this implementation may, for example, have to be at the assembly level.

On the right-hand side we have the implementation, a masked XOR in software, and it is quite different. It is easy to observe that it operates on shares which are stored in memory: these shares have to be loaded, the XOR operation looks quite different (there a three-address XOR, here the destination register is shared with an operand), and the results have to be stored back to memory. Moreover, the problem is that when this is executed on an actual device, it might be provably secure, but there is additional leakage behavior from the processor: the processor executing this will perform additional combinations and allow additional values to be observed by an adversary. This means, for example, that the store in this line will leak a combination of the value which is to be stored and the value which was stored by the last store instruction; this is very common on our Cortex-M0+ device. Such effects also exist between loads, and between ALU instructions, where for example
the value in r5 here and the r5 in line 8 will be combined as well, or possibly combined. The question then is: does the proof still hold? What we want is to come up with verification techniques that allow us to assess this in an automated manner. The usual countermeasure is to insert additional instructions to prevent these leakages, but this is an overhead, because the additional instructions here and here cost extra cycles, and we want to minimize them; this is possible with our verification approach as well.

Our approach to verifying this is a domain-specific language which allows us to represent both the side-channel behavior of a specific device and a concrete implementation at the assembly level. That is, we have a prototype tool called scVerif which takes a masked implementation, for example in assembly format, and also takes as input the semantics of the assembly instructions as well as a side-channel behavior specification for each of these instructions. This is used to represent the implementation and, later on, to check masking security based on the existing maskVerif checker. The problem is that maskVerif and most other verification tools are not able to work at the level of assembly implementations, which for example use memory, or are incapable of working in diverse leakage models and instead commit to one fixed leakage model or a few different ones. So this is our approach, and I will go through the different stages: the representation, the partial evaluation, and later on our benchmark.

Okay, our first task is to represent the semantics and the side-channel behavior of an instruction. Take for example an XOR instruction with three registers: a destination register and two operand registers. In the usual setting, an
implicit representation would make it quite easy to specify the semantics of what this instruction does: it assigns to the destination register the Boolean XOR of the two operand registers rn and rm. But the problem is what is leaking here; this is not made explicit, and it is usually deeply embedded into the language, by specifying for example that every XOR operation will be observable by an adversary.

We take a different approach: an explicit representation. We have the same semantic specification, but it is now leak-free; no side-channel observation is possible here in our formal world, it is completely free of observable side channels. Instead, we have additional statements to specify that a certain value is observable and has to be considered, for example, in the proof of masking security. This is the leak statement, which takes, in curly braces, the expressions or values which are to be considered in the proof and which represent the capability of a physical adversary to measure something in a power trace. A very common model is the Hamming weight model, where the Hamming weight of the computation result is leaking. The problem here is that we would then be specifying that exactly the Hamming weight is leaking, and this is a bit unfortunate because in reality this is rarely the case: rather a weighted Hamming weight, or the most significant bit, or some other combination is leaking. Instead, in our models we usually take the approach of leaking the pure term, that is, the full result, all the bits involved in the XOR of rn and rm. This specification covers all the different kinds of observables: this could be the most significant bit, the least significant bit, or an arbitrary bit combination of the result, and this is much more faithful to what happens on actual devices. The approach is not limited to this: we can also model transition leakages.
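To make the leak-statement idea concrete, here is a small Python simulation of my own (scVerif's actual IL syntax differs): three variants of the XOR instruction, leaking the Hamming weight, the pure term, and a transition, respectively.

```python
trace = []  # what the formal adversary may observe, one entry per probe

def leak(*values):
    """Stand-in for the explicit leak {...} statement: everything
    passed here must be accounted for in the masking proof."""
    trace.append(values)

def hw(v: int) -> int:
    """Hamming weight of v."""
    return bin(v).count("1")

def xor_hw(regs, rd, rn, rm):
    """Commits to the Hamming-weight model: exactly HW(result) leaks."""
    regs[rd] = regs[rn] ^ regs[rm]
    leak(hw(regs[rd]))

def xor_value(regs, rd, rn, rm):
    """Leaks the pure term: the full result, so any bit or weighted
    combination of its bits is covered by the proof."""
    regs[rd] = regs[rn] ^ regs[rm]
    leak(regs[rd])

def xor_transition(regs, rd, rn, rm):
    """Transition leakage: one probe reveals a combination of the old
    destination value and the newly assigned one (more conservative
    than fixing the leak to their Hamming distance)."""
    old = regs[rd]
    new = regs[rn] ^ regs[rm]
    leak(old, new)
    regs[rd] = new
```

Note how `xor_transition` hands the adversary two values for the price of one probe, which is exactly the stronger adversary discussed next.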
For example, we can specify that the Hamming distance between the value stored in rd prior to the assignment and the value which is to be assigned is leaking. Again, this is not a very good idea, because now we are specifying that exactly the Hamming distance is leaking, whereas we would rather say that some combination of the destination register's value prior to the assignment and the value to be assigned is leaking and has to be considered in the proof. With the two terms here, this essentially allows a side-channel or probing adversary to observe two values at the cost of one probe, for those familiar with these terms.

Given this ability to specify semantics and side-channel behavior in a somewhat independent form, we can now very easily construct models of individual instructions: we just define a macro in our language which has the name of an instruction, takes its operands, performs the semantic operation, and also contains a few annotations of what is leaking, for example the transition leak, again with the Hamming distance. Given models of several such instructions, it is very easy to represent the disassembly of a larger program, that is, to represent an entire program. If you look at a line of a disassembly, it usually comes with the address of an assembly instruction and the actual disassembled instruction, its name and its arguments. In our representation of low-level programs this becomes very simple: just the definition of a label and the actual XOR representation, that is, the macro definition shown to the left applied to the arguments given above. It is very simple, but very powerful in the end. Our language is quite small: it has a few control-flow constructs like if-then-else and while-loops, and it features labels and gotos to make the modeling of assembly jumps possible, but
apart from this it is really quite simple. It is important to know that the entire language is free of leakage, that is, it does not specify any side-channel behavior, with the sole exception of the leak statement: only the leak statement is able to express that an adversary can see or observe some information. Despite being so small, the language allows us to represent an entire instruction-set architecture, for example the Cortex-M0+, including the flags for carry, overflow and so on, which are used to express, for example, control-flow operations in Arm assembly.

So the goal of our language and of this model is to represent the assembly instructions and to model the leakage of executing a program. We can do this by specifying, for example, that there are global variables r0 to r12 which represent the global registers, a program counter which mimics the device's program counter, and the flags. Then we can move on: this is again our XOR instruction as we have seen it before, and we model how it behaves in terms of the semantics in line 9 and its leakage behavior, which appears up here. We have already modeled the transition leak, but actually there is more leakage behavior of this XOR instruction in practice. Maybe the most relevant is the revenant leakage effect, which is a generalization of a behavior we have observed multiple times: if you have two XORs in sequence operating on certain data, then what we see in practice is that during the execution of the second XOR there is a combination of the values which were used as operands in the prior instruction. That is, in this instruction there is a combination of c and a leaking, and the same on the right-hand side with b and d. We can express this quite easily by introducing additional state, global shadow registers, which we usually denote
as leakage state. We specify that in this instruction, for example, the value a is cached in the global register opA, and then in this instruction here we actually have a leak of opA in combination with the operand c. We specify this with an abstraction: a small helper macro we denote "emit revenant leak", which takes the leakage state and the value which is leaked in combination, always first leaks the combination of the two, and then assigns the new value to the leakage state; that is, in this XOR, opA would receive the new c. We can then annotate our XOR with this and we get a more complete model. In our case we actually specified an additional worst-case assumption: a leak of all four of these values at the cost of one single probe, that is, an adversary is able to observe any combination of those four values.

In the end we arrive at a leakage model which is sufficiently complete for our use cases and specifies the side-channel behavior of XOR, load and store, where load and store each need one additional leakage state. This model is then sufficient to achieve side-channel resilience and to pass TVLA up to one million traces. But before we go into the results, let's briefly look at the automated verification. What I have shown you before was this stage of scVerif: the representation of a masked concrete implementation, say in assembly, in a specific leakage model and for specific instructions. Now we have a representation in our domain-specific language, using only those constructs, and it turns out that maskVerif, the checker we want to use, uses a strict subset of IL. So we have an additional stage in between the two, where we remove all the language constructs in IL which maskVerif is not able to understand, and this is performed
by partial evaluation, which you might also know under the name symbolic execution. The partial evaluator is able to remove all these constructs by symbolically executing the program; all we have to do is preserve the side-channel behavior, and this is quite easy because it is captured by the dedicated leak statement, which we just have to preserve. There are some limitations here, because partial evaluation comes with its own requirements and is in general not complete.

In the end we have a proof-of-concept tool called scVerif which is able to verify the masking security of concrete implementations at the assembly level, using user-provided leakage models and instruction semantics, and where the verification stage and the verification tool, maskVerif, are completely decoupled from the leakage behavior provided or specified by the user. This scVerif tool is able to verify a number of security definitions, for example non-interference and strong non-interference, but we also came up with our own security definition, stateful (strong) non-interference. This is a refinement of non-interference in the sense that we ensure that there is no residual state left after the execution of a gadget. That means, if we turn back to our original XOR gadget, that after the execution ends, all registers which might have contained sensitive data have to be cleared; here, the contents of r4, r5 and r6 are purged to remove all shares they might contain. This greatly helps when constructing larger compositions.

We have used our tool to evaluate how well it performs, and we have implemented multiple different versions of a PRESENT S-box: masked at first order, using two shares, and masked at second order, using three shares. Our goal was always to reduce overhead as much as possible, for example by
removing dummy instructions which had to be inserted, but also by coming up with new combinations of gadgets, new gadgets so to say, to reduce the overhead. This really worked well: in our first-order implementation, compared to the composition approach, where you develop one gadget for each primitive operation and then compose them together, our optimized approach, which may combine multiple operations into one gadget, reduced the overhead massively, down to just 40 percent of the original implementation. This is a fair comparison, because both are hardened implementations. The composition approach is the one we usually take, because hardening an implementation is so much effort that we do not want to perform specific compositions or build specific new gadgets and do more work than necessary. But with our verification approach we are really able to explore more optimization strategies, because the verification helps us come up with implementations which are then, as expected, practically resilient, using a verification tool with a much faster response time. In total we have been able to save a lot of dummy instructions: 72 percent at first order and 86 percent at second order, and the ratio of dummy operations to the semantic operations which are actually needed also goes down quite a lot. We can also observe that in the second-order setting, with our optimized constructions, the number of cycles we need is lower, due to our optimizations, than what you would expect from the composition approach at first order; that is, we gain an entire security order for free by using our verification technology and our optimized composition strategies. We have performed a lot of physical evaluation, both at first and second
order: TVLA using one million versus one million traces, so quite a lot of traces, on two devices with multiple implementations. In the end, what we can say is that there appears to be a certain completeness of our model, which we have shown in the paper: there is a link between provable security and the fact that for every provably secure implementation in our final model, no leakage is detected anymore in the physical evaluation. Clearly this is an empirical link which we cannot prove and which might not hold in other settings; it is true for our devices, our implementations, our instruction-set-architecture subset and our model. But it is quite nice to see that this actually works out, and that we have, in general, the ability to come up with models which are this complete. On the other hand, our models are also not overly conservative, in the sense that they are quite precise: whenever we remove proof-relevant countermeasures, leakage is detected. There is a lot more to be found in the paper. I hope you enjoyed this presentation, and you can ask all your questions in the live session at CHES. Thanks a lot.