All right, welcome everyone to this last session of the conference, the second session on side-channel analysis and leakage resilience. The first talk is "A Tale of Two Shares: Why Two-Share Threshold Implementation Seems Worthwhile, and Why It Is Not" by Cong Chen, Mohammad Farmani, and Thomas Eisenbarth from Worcester Polytechnic Institute, and the presentation will be given by Cong.

Thanks for the introduction. In this work, we attempted to use only two shares to achieve a first-order threshold implementation, and we discuss what benefits we get from this share reduction and what other prices we must pay.

First of all, the motivation of this work. The current deployment of the Internet of Things has some special needs in terms of cryptography because of its resource-constrained nature. Compared with conventional cryptography, lightweight cryptography is considered better suited for these constrained applications. In addition to resisting mathematical analysis, the physical implementations of those algorithms should resist physical attacks, for example side-channel attacks, which are of interest in this work. Of course, side-channel leakage resistance does not come for free, and usually the countermeasure introduces a lot of overhead. Motivated by the need for more efficient and secure implementations, in this work we study one of the most popular countermeasures and aim to bring down its cost in terms of area and randomness used.

The state-of-the-art countermeasure is the masking scheme called threshold implementation (TI), which is based on secret sharing and multi-party computation. The basic idea of threshold implementation is to split the intermediate variables and functions into different shares, and a valid sharing in TI should satisfy three requirements: correctness, non-completeness, and uniformity. One of the most important is
non-completeness, which means that each shared function should be independent of at least one input share. For example, in the first-order threshold implementation in this figure, the shared function f1 only processes input shares two and three and is independent of share one, and similarly the shared functions f2 and f3 are also non-complete. Uniformity means that the output shares z should also be uniform, so that the TI is still valid in the next stage.

Of course this comes with an increase in overhead. It is not exactly triple the unprotected implementation, but it gives you an idea that the overhead is related to the number of shares used in your TI implementation. It has been shown that, in the d-probing model, at least t times d plus one (td + 1) shares should be used to provide d-th order protection. Here t is the algebraic degree of the nonlinear function and d is the protection order you want to achieve. Two recent works have shown that we can actually reduce the number of shares to d + 1 if we relax some restrictions on constructing a TI, and when you bring down the number of shares, the area of the implementation can be greatly reduced.

In this work we continue in this direction of reducing the number of shares, but with some differences. First of all, we are not going to give a generic construction for any arbitrary function or any arbitrary protection order; we focus on first-order protection only. Second, in those previous works the area is reduced, but the amount of randomness is increased, so in this work we are trying to reduce the randomness required for TI. And finally, we are particularly interested in lightweight cryptography, and we want to show that TI is actually a very nice fit for lightweight cryptography to counteract side-channel attacks.

Here we use the NSA cipher Simon as a case study. Simon is a very straightforward block cipher; it is
very simple and consists of only three cyclic rotations, one multiplication in the binary field (a bitwise AND), and three exclusive-or operations. It has many interesting features that make it a very good target for threshold implementation. As you can see, the AND operation only has an algebraic degree of two, and Simon can be fully bit-serialized, which means our hardware implementation can process only one bit per clock cycle, and this way we can minimize the resources used to implement Simon in hardware.

Now I'll go to how we can use two shares to implement a TI. This is very easy, even trivial, for linear functions. For example, the XOR in a three-share TI is very straightforward: we just use three XORs, and we also split the variables into three shares. As long as the input shares are uniformly distributed and independent, this is a valid TI and there will be no side-channel leakage. The two-share version is just as straightforward: simply remove one of them, and the three requirements for constructing a valid TI still hold. It is correct, non-complete, and uniform. Actually, this is not surprising, because even in today's three-share TI people always use two shares for linear functions.
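A minimal sketch of the two-share linear case just described, in Python. It models share-wise XOR and rotation on 16-bit words; the word size and the helper names (`share2`, `unshare`, `xor_shared`, `rotl_shared`) are illustrative assumptions and do not come from the talk.

```python
import secrets

WORD_BITS = 16                      # assumed word size, for illustration only
MASK = (1 << WORD_BITS) - 1


def rotl(x, r):
    """Cyclic left rotation of a WORD_BITS-bit word by r positions."""
    return ((x << r) | (x >> (WORD_BITS - r))) & MASK


def share2(x):
    """Split x into two Boolean shares (m, x ^ m) with a fresh uniform mask m."""
    m = secrets.randbits(WORD_BITS)
    return (m, x ^ m)


def unshare(s):
    """Recombine two shares into the unshared value."""
    return s[0] ^ s[1]


def xor_shared(u, v):
    """Share-wise XOR: output share i only ever touches input shares i."""
    return (u[0] ^ v[0], u[1] ^ v[1])


def rotl_shared(u, r):
    """Rotation is linear too, so it is applied to each share separately."""
    return (rotl(u[0], r), rotl(u[1], r))


x, y = 0x1234, 0xBEEF
xs, ys = share2(x), share2(y)
print(unshare(xor_shared(xs, ys)) == x ^ y)      # correctness of shared XOR
print(unshare(rotl_shared(xs, 2)) == rotl(x, 2))  # correctness of shared rotation
```

Because each output share is computed from a single share index, non-completeness holds by construction, and the output sharing stays uniform as long as the input masks are uniform and independent.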
It's only when it comes to nonlinear functions that the shares are extended to three or more. So the challenging part is the nonlinear function. For example, in Simon we have this nonlinear part, which is a times b plus c. One way to develop a two-share TI is to force each variable to have only two shares, so we can compute z by computing z1 and z2 using those two equations. But the problem here is that the two shared functions are not independent of the input shares anymore. For example, z2 now depends on a2 and a1, which are the two shares of a, and this might be a problem when glitches exist in our circuit, which may leak at first order.

To solve this problem, our solution is to pipeline this operation, which means we break the calculation into two stages. In the first stage we only compute the part in the parentheses, and notice that this calculation depends on only one share. You can see the intermediates are a2 times b2 plus c2 on the one hand and a1 times b1 plus c1 on the other. So up to now, non-completeness is satisfied, because each shared function is independent of at least one input share, and uniformity is also achieved, because c is uniformly split, so the output of both shared functions is uniform. After that, the results are saved in two registers, and in the second stage we finish the rest of the calculation. You can notice that the second stage is also non-complete, uniform, and correct. In this way, by breaking the calculation into two stages, we satisfy all three requirements.

Now we compare this with the three-share implementation. Right now we only have two shares for each variable.
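The two-stage computation can be sketched in Python as follows. The exact equations on the slide are not in the transcript, so the stage-2 terms used here (`a1 & b2` and `a2 & b1`) are an assumption: they are one decomposition consistent with the description (stage 1 computes a1·b1 + c1 and a2·b2 + c2, a register sits between the stages). The check is exhaustive over single bits and only covers value-level behaviour, not glitches.

```python
from itertools import product
from collections import Counter


def share(x, m):
    """Two-share Boolean sharing of bit x with mask bit m."""
    return (m, x ^ m)


def stage1(a, b, c):
    """First stage: each intermediate uses shares of one index only."""
    t1 = (a[0] & b[0]) ^ c[0]
    t2 = (a[1] & b[1]) ^ c[1]
    return (t1, t2)          # these would be stored in two registers


def stage2(a, b, t):
    """Second stage (assumed form): add the remaining cross terms."""
    z1 = t[0] ^ (a[0] & b[1])
    z2 = t[1] ^ (a[1] & b[0])
    return (z1, z2)


def wire_distributions(a, b, c):
    """Count how often each wire is 0 or 1 over all masks, for fixed secrets."""
    dist = Counter()
    for ma, mb, mc in product((0, 1), repeat=3):
        sa, sb, sc = share(a, ma), share(b, mb), share(c, mc)
        t = stage1(sa, sb, sc)
        z = stage2(sa, sb, t)
        assert z[0] ^ z[1] == (a & b) ^ c        # correctness
        for name, value in zip(("t1", "t2", "z1", "z2"), t + z):
            dist[(name, value)] += 1
    return dist


reference = wire_distributions(0, 0, 0)
first_order_independent = all(
    wire_distributions(a, b, c) == reference
    for a, b, c in product((0, 1), repeat=3)
)
print(first_order_independent)
```

The final check says that every single wire (t1, t2, z1, z2) has the same distribution regardless of the unshared a, b, c, which is the first-order property the talk is after. This version uses four ANDs and four XORs in total.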
So of course we need fewer random numbers to share the intermediates, and we also need fewer logic operations, because here we only use four AND operations and four XORs, whereas with three shares we need more. But we need two extra flip-flops here to store the intermediates. This seems to cost more, but here we have not counted the flip-flops needed to store all the shares of the inputs and outputs; in sum, we still save some storage. Another thing is that the computation uses two stages, but if this calculation can be pipelined, we don't lose too much.

So far everything seems good, and two shares seem to beat three shares. But one potential problem is that the two-share TI shows very strong second-order leakage. Here I use the sharing of one bit as an example. Suppose we have a binary value x, which can be 0 or 1. If we use two shares to share x = 0, the shares can be (0, 0) or (1, 1). Assuming the Hamming weight model and Gaussian noise, the blue curve in the left figure is the probability density function of the leakage for x = 0, and if x = 1 with two-share sharing, the leakage density function is the red curve. Both distributions have the same mean, which means that at the first-order moment they are equal. But at the second order they can be easily distinguished, which means that if you use two shares, x can be easily distinguished using the second-order moment. This is not the case for three-share TI. As you can see, the shapes of the two distributions are actually the same; one is just the mirror image of the other.
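The one-bit example can be worked out exactly with a short Python sketch. It enumerates every Boolean sharing of a bit and computes the mean and variance of the noise-free Hamming-weight leakage; the function names are illustrative, and leaving out the Gaussian noise is a simplifying assumption (independent noise would add the same variance to every case).

```python
from itertools import product
from statistics import mean, pvariance


def hw_leakage(x, n_shares):
    """Hamming weights of all n_shares-bit Boolean sharings of the bit x."""
    return [
        sum(shares)
        for shares in product((0, 1), repeat=n_shares)
        if sum(shares) % 2 == x          # XOR of the shares equals x
    ]


def moments(x, n_shares):
    """(mean, variance) of the noise-free Hamming-weight leakage."""
    leak = hw_leakage(x, n_shares)
    return (mean(leak), pvariance(leak))


for n in (2, 3):
    for x in (0, 1):
        print(n, "shares, x =", x, "->", moments(x, n))
```

With two shares the means agree but the variances do not (x = 0 gives weights 0 or 2, x = 1 always gives weight 1). With three shares both the mean and the variance agree, and the two distributions differ only in their skew, which matches the mirrored curves described above.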
So the means and the variances are the same. Surprisingly, three-share TI is intended for first-order protection, but it can provide a certain level of second-order resistance. We will use a practical side-channel leakage analysis to verify this later.

Now we want to briefly introduce our application to Simon. The first application is round-based Simon, which means our implementation processes the whole block in one round, unlike the bit-serialized version. This calculation is also broken into two stages: here the solid lines show the first stage and the dotted lines show the second stage, and in both stages we make sure that the three requirements are satisfied to make it a valid TI implementation. But this implementation is not a pipelined version, because at each clock cycle the output is written back to the registers, which means each round uses exactly two clock cycles. So the whole encryption uses twice as many clock cycles as the unprotected Simon.

The other implementation is the bit-serialized one, which means the implementation processes only one bit per clock cycle. The left part is the shift registers used to hold the block, and all they do is keep rotating the bits for the combinational logic. Now let's focus on the combinational logic. Here you can see a register inserted into the logic to divide it into two stages, just like what we discussed, so it satisfies the three requirements. It is bit-serialized, which means it can be small, because all the logic here processes only one bit, not the whole block. And it can be pipelined.
This means it uses just one more clock cycle to process one round compared with the unprotected bit-serialized Simon.

Next we want to present the implementation results in terms of FPGA numbers. Here we have two implementations, round-based Simon and bit-serialized Simon, and for each one we implemented the unprotected version, which is in yellow, the two-share version, which is in red, and the three-share version, which is in gray. You can see that our 2-TI implementation saves about one-third of the slice registers. It is not exactly one-third, because both implementations share some common control logic and we also inserted some registers for our pipelining. For slice LUTs we have similar results. In terms of throughput, the 2-TI round-based Simon needs two clock cycles for each round, so the middle one, the 2-TI, has only half the throughput of the unprotected version. For the bit-serialized version, because we use pipelining, the throughput is very close to that of the unprotected bit-serialized version.

Now the leakage analysis, which has two parts. The first part is the theoretical analysis using simulated power consumption. Here we do not use Simon anymore; we use another lightweight block cipher, PRESENT. We implemented a 2-TI PRESENT and targeted the S-box output. We use the Hamming weight leakage model, assuming there is no noise, and we perform first-order and second-order analysis using the t-test and a practical CPA attack.

Here is the first-order analysis of the PRESENT S-box. The left one uses the first-order t-test, and the red curve is always below 4.5, which means that our 2-TI PRESENT doesn't show any first-order leakage. Similarly, in the CPA the red curve corresponds to the correlation between the leakage model and the leakage using the correct key; as you can see, it
cannot be distinguished even when you use about one million traces. But at the second order things are different. Using the second-order t-test, you can see that with only 200 traces the t-value already goes beyond 4.5, which means that there is strong second-order leakage in our 2-TI PRESENT. Similarly, the second-order CPA also takes no more than hundreds of traces to recover the correct key.

As the next step we performed a practical analysis. We have three targets: the 2-TI round-based Simon, the 3-TI round-based Simon, and the 2-TI PRESENT. We ported our designs to one of the popular side-channel evaluation platforms, took measurements, and performed the leakage analysis.

The first analysis is the first-order and second-order t-test on the measurements from the 2-TI Simon. On the left is the first-order t-test. The x-axis shows the number of traces we used, and even with 10 million traces the t-value is still below 4.5, which indicates there is no detectable leakage. In comparison, in the second-order t-test, with fewer than 600 traces we already see a very large t-value, showing that there is strong second-order leakage.

In comparison, the 3-TI Simon doesn't show any first-order leakage, which is not surprising because it is meant to protect against first-order attacks. But it can also provide a certain level of second-order side-channel leakage resistance, because, as you can see, when the number of traces increases to 10 million the t-value is still below 4.5, whereas for 2-TI it is way above 4.5. This means that even though our 2-TI saves some area and randomness, in terms of side-channel protection 3-TI has an advantage over 2-TI.

So far the analyses used only the t-test.
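To make the first-order versus second-order t-test concrete, here is a small Python simulation. It is a toy sketch, not the measurement setup from the talk: it assumes a single two-share bit leaking its Hamming weight plus Gaussian noise, 2000 traces per group, a noise standard deviation of 1.0, and a second-order preprocessing that squares each trace after subtracting its group mean. All names and parameters are illustrative.

```python
import random
from statistics import mean, variance


def welch_t(g0, g1):
    """Welch's t-statistic between two groups of samples."""
    return (mean(g0) - mean(g1)) / (
        (variance(g0) / len(g0) + variance(g1) / len(g1)) ** 0.5
    )


def simulate(x, n_traces, sigma, rng):
    """Noisy Hamming-weight leakage of a two-share sharing of the bit x."""
    traces = []
    for _ in range(n_traces):
        m = rng.getrandbits(1)               # fresh mask bit
        hw = m + (x ^ m)                     # Hamming weight of both shares
        traces.append(hw + rng.gauss(0.0, sigma))
    return traces


def centered_square(traces):
    """Second-order preprocessing: (trace - group mean) squared."""
    mu = mean(traces)
    return [(t - mu) ** 2 for t in traces]


rng = random.Random(1)                       # fixed seed for repeatability
g0 = simulate(0, 2000, 1.0, rng)             # traces with x = 0
g1 = simulate(1, 2000, 1.0, rng)             # traces with x = 1

t_first = welch_t(g0, g1)
t_second = welch_t(centered_square(g0), centered_square(g1))
print("first-order  |t| > 4.5:", abs(t_first) > 4.5)
print("second-order |t| > 4.5:", abs(t_second) > 4.5)
```

In this toy model the group means are identical, so the first-order statistic should stay small, while the variances differ (2 versus 1 with the assumed noise), so the second-order statistic grows with the square root of the number of traces.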
We now want to exploit this leakage using a practical CPA. The first target is the 2-TI Simon, on which we perform a second-order CPA. With about two million, or rather close to three million, traces the correct key can be distinguished, which means this leakage can be exploited with about three million traces. For the 2-TI PRESENT we use fewer traces, about one million, to recover the eight-bit subkeys. The thing is, with the t-test we only needed hundreds, fewer than 1,000, traces to show that there is leakage, but the practical CPA uses millions of traces. I think one of the reasons is that CPA is not the most efficient attack, because we are assuming a Hamming weight model, which is not the best model. So we hope that if we perform a template attack or another profiled attack, we can improve the efficiency of our attack.

In conclusion, we attempted to introduce a 2-TI, a two-share threshold implementation, to reduce the area and the randomness required while maintaining the performance, and the results show that we achieved our goal. Second, threshold implementation is a good fit for lightweight cryptography, since the degree of the nonlinear functions is very low and the AND operation is quite simple. We also showed that our 2-TI can protect against first-order attacks, but it shows strong second-order leakage compared with the traditional 3-TI. That's it. Thank you.

Thank you. So we have just a little time for questions or comments.

Thank you. Can you go to the slide where you showed the PDFs, the distribution functions, of the second-order masking and third-order masking?

Yeah. I think the microphone isn't... does it work?
Yeah, okay. On the right side, with the three-share TI, as far as I understood you have a simulation of second-order Boolean masking, right? Then you don't have second-order leakage here for sure, as your distributions here show: they have the same standard deviations, right? But in the case of TI you have three shares, and the quadratic functions that you implement in TI will give you second-order leakage. This means that these figures are not related to TI; they are just second-order masking and first-order masking. In the case of TI with three shares you still have second-order leakage, because you have the quadratic functions, and the standard deviations of the distributions will definitely be different in the case of 3-TI. You explained here that this is a 3-TI and there is no second-order leakage, but actually there is second-order leakage in 3-TI.

Okay.

One more point is the bit-serialized architecture that you made for Simon. Can you go back to the slide with the bit-serialized architecture of Simon with two shares, the 2-TI design?

Yeah, I believe it's this one.

In the case of 2-TI, if you don't use fresh randomness, which as far as I know you don't use here, and you have the bit-serialized architecture, you are overwriting the data on itself, right? I mean that you are shifting the data. In the first rounds of Simon this is not a problem, because the data, the masks, are completely independent of each other. But after a couple of rounds they are completely mixed, and then the adjacent data, the neighboring data that are saved in the registers, are not completely independent of each other with respect to the masking, and then if you store them on top of each other,
I mean, if you overwrite them because of the rotation, because of the shifting, you will exhibit first-order leakage. This is already known. And actually this will happen in the last rounds of Simon. If you consider the t-test only on the first rounds, let's say the first couple of rounds, it doesn't happen, but in the last rounds, because the shares are more mixed together, you exhibit first-order leakage.

So it means we should add a refreshing layer, maybe in the last few rounds?

Or you should have a pipelined architecture for this one as well, meaning that two consecutive plaintexts of Simon go into the system, and then when you are shifting during the rotation you are not overwriting the data of the same state on each other.

Yeah, okay. Thank you.

Thank you. Any more questions or comments? All right, let's thank the speaker again.