 Hello everyone. I will now present our work on revealing the weakness of addition chain-based masked S-box implementations. My name is Ming Jingdian, and I am the first author of this paper. I am from the Institute of Information Engineering, Chinese Academy of Sciences. This presentation is divided into four parts, and I will start with the introduction and previous work.
Side-channel attacks exploit various physical leakages, such as running time, power consumption, and EM emanations. Using these leakages, the adversary is able to recover the sensitive data of a cryptosystem. Among all countermeasures against side-channel attacks, masking is one of the most widely used because of its good performance. Specifically, masking randomizes the dependency between a sensitive intermediate and its corresponding leakage by splitting the sensitive value into d + 1 shares. When protecting a cryptographic algorithm with masking, the linear operations are simple to mask, because for a linear function it is sufficient to compute it on each share separately. The non-linear operations, however, are difficult to mask.
There are mainly two ways to solve this problem at acceptable cost. The first is a lookup table-based implementation, which requires generating a masked table. The second is to compute the non-linear functions over a finite field. In this solution, the S-box is realized by several masked computations over a finite field. Here is a table of the running times of protected implementations of AES. It can be seen that the lookup table-based solution costs at least four times more than the computation-based one. Thus, in this paper, we focus on computation-based implementations.
Addition chains are widely used in computation-based masked S-box implementations. Specifically, the S-box can be expressed as a sequence of squarings and multiplications. The non-linear multiplications can then be implemented using previously known schemes such as ISW.
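To make the idea of a sequence of squarings and multiplications concrete, here is a minimal unmasked sketch in Python. The non-linear part of the AES S-box is the field inversion, which equals x to the power of 254 over GF(2^8). The sketch assumes a widely cited chain with 4 multiplications and 7 squarings; whether it is exactly the chain shown on the slide is an assumption, and the function names are illustrative. In a masked implementation, each `gf_mul` call would be replaced by a masked multiplication such as ISW, while the squarings, being linear, would be applied to each share separately.

```python
# GF(2^8) arithmetic with the AES reduction polynomial x^8 + x^4 + x^3 + x + 1.
AES_POLY = 0x11B


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements (shift-and-add with modular reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:          # degree reached 8: reduce
            a ^= AES_POLY
        b >>= 1
    return result


def gf_sq(a: int) -> int:
    """Squaring; it is linear over GF(2), so it is cheap to mask."""
    return gf_mul(a, a)


def inverse_by_chain(x: int) -> int:
    """Compute x^254 with 4 multiplications and 7 squarings."""
    x2 = gf_sq(x)               # x^2    (1 squaring)
    x3 = gf_mul(x2, x)          # x^3    (multiplication 1)
    x12 = gf_sq(gf_sq(x3))      # x^12   (2 squarings)
    x15 = gf_mul(x12, x3)       # x^15   (multiplication 2)
    x240 = x15
    for _ in range(4):          # x^240  (4 squarings)
        x240 = gf_sq(x240)
    x252 = gf_mul(x240, x12)    # x^252  (multiplication 3)
    return gf_mul(x252, x2)     # x^254  (multiplication 4)
```

Every intermediate value in this chain (x^2, x^3, x^12, x^15, and so on) is a monomial of the input, which is why the leakage of the intermediates matters and not only that of the final output.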
Here is one of the most popular addition chains for the AES S-box. Many masking schemes have been applied to this addition chain, for example Boolean masking, mixed additive and multiplicative masking, and inner product masking. Most studies focus on analyzing the final S-box outputs. However, addition chain implementations introduce many extra computations of monomials. In particular, nearly half of the monomials over the finite field are not balanced. Note that a function is said to be balanced if every output element has the same number of pre-images. So a natural question is: what if the computations of some intermediate monomials, especially unbalanced ones, leak more than the S-box outputs?
Here is an example for the 4-bit case. It is a simulated higher-order CPA experiment: the simulated leakage of each share follows the Hamming weight model, and the combined leakage is obtained by a normalized product. It can be seen that the results can be divided into four groups.
Next, we introduce our resistance metric, which helps explain this result. Transparency order, TO for short, was proposed to capture the intrinsic resistance of S-boxes. Namely, TO quantifies the DPA resistance of an S-box, so we introduce DPA briefly. In DPA, the leakage traces are divided into two groups based on the j-th bit of the S-box output, and the difference of their expectations is the distinguisher value. If the leakage is assumed to follow the Hamming weight model and we denote t as the plaintext, then the distinguisher can be expressed as this equation. However, if the function is unbalanced, the target bit may always be 0 or always be 1, so the leakage cannot be divided into two groups, and the subsequent calculation of TO becomes meaningless. Fortunately, this problem is not hard to fix.
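The balancedness definition given here can be checked by brute force. The following Python sketch counts pre-images for every monomial x^e over a small field and compares the result with the standard criterion that x^e is a permutation exactly when gcd(e, 2^n - 1) = 1. The reduction polynomials (0x13 for 4 bits, 0x43 for 6 bits) and all function names are assumptions chosen for illustration; balancedness itself does not depend on which irreducible polynomial is used.

```python
from collections import Counter
from math import gcd


def gf_mul(a: int, b: int, n: int, poly: int) -> int:
    """Multiply in GF(2^n) defined by the reduction polynomial `poly`."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a >> n:             # degree reached n: reduce
            a ^= poly
        b >>= 1
    return result


def gf_pow(x: int, e: int, n: int, poly: int) -> int:
    """Square-and-multiply exponentiation in GF(2^n)."""
    result = 1
    while e:
        if e & 1:
            result = gf_mul(result, x, n, poly)
        x = gf_mul(x, x, n, poly)
        e >>= 1
    return result


def is_balanced(e: int, n: int, poly: int) -> bool:
    """True if every output element of x -> x^e has the same number of pre-images."""
    size = 1 << n
    counts = Counter(gf_pow(x, e, n, poly) for x in range(size))
    return len(counts) == size and len(set(counts.values())) == 1


def unbalanced_exponents(n: int, poly: int) -> list:
    """Exponents e in 1..2^n-2 whose monomial x^e is not balanced."""
    return [e for e in range(1, (1 << n) - 1) if not is_balanced(e, n, poly)]


print(unbalanced_exponents(4, 0x13))   # → [3, 5, 6, 9, 10, 12]
# For 8 bits, the gcd criterion avoids the brute-force search:
print(sum(1 for e in range(1, 255) if gcd(e, 255) != 1))   # → 126
```

For the 8-bit field, 126 of the 254 monomials are unbalanced, which is consistent with the statement that nearly half of the monomials are not balanced.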
In unbalanced functions, some output bits may always be 0 or always be 1, and such bits are useless for distinguishing the secret key. So, when an output bit is constant, we define its differential value to be 0. Then we have this equation. From the distinguisher values on all bits for all key hypotheses, we obtain a new notion that is able to measure the DPA resistance of all functions, balanced or not. This notion is named polygon degree in our work.
Polygon degree has three properties. First, the smaller the polygon degree of a function, the stronger its resistance against side-channel attacks. Second, for a function f, its polygon degree is higher than 0 but lower than 1. Third, polygon degree is also valid for higher-order attacks, because the expectations of the combined leakage follow a linear transformation of the Hamming weight distribution.
Next, we introduce how we verify the soundness of polygon degree, which we call PD for short. Step one is to calculate the PD values for all monomials over the finite field. Then we perform simulated higher-order CPA on all monomials, with the leakage again following the Hamming weight model. Finally, we compare the PD values with the simulated attack results. If we classify the results for the 4-bit case, we find that, based on the PD values, the resistance of class 3 is lower than that of class 7, which is lower than that of class 5, which in turn is lower than that of class 1. It can be seen that the PD values match the attack results well.
We additionally verified this on the 6-bit case. We find that the powers have the same PD value if their exponents lie in the same cyclotomic class. There are 62 monomials in total for the 6-bit case, and the results are shown as a histogram. The y-axis shows the number of traces needed for the guessing entropy to fall below 4. It can be seen that smaller PD values come with stronger resistance, which verifies the property.
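The cyclotomic classes mentioned here can be enumerated directly. Two exponents lie in the same class when one is obtained from the other by repeated doubling modulo 2^n - 1, which corresponds to repeated squaring of the monomial. Since squaring is a linear bijection over GF(2), it is plausible that such monomials share the same resistance value. The Python sketch below is illustrative only; the function name is an assumption.

```python
def cyclotomic_classes(n: int) -> dict:
    """Group exponents 1..2^n-2 by repeated doubling modulo 2^n - 1.

    Returns a dict mapping each class leader (its smallest member)
    to the list of exponents in that class.
    """
    modulus = (1 << n) - 1
    classes = {}
    seen = set()
    for e in range(1, modulus):
        if e in seen:
            continue
        members = []
        cur = e
        while cur not in seen:
            seen.add(cur)
            members.append(cur)
            cur = (2 * cur) % modulus   # x^(2e) is the square of x^e
        classes[e] = members
    return classes


print(cyclotomic_classes(4))
# → {1: [1, 2, 4, 8], 3: [3, 6, 12, 9], 5: [5, 10], 7: [7, 14, 13, 11]}
print(len(cyclotomic_classes(8)))   # → 34
```

The four leaders for the 4-bit case (1, 3, 5, 7) correspond to the class names used in the comparison above, and the count of 34 for the 8-bit case agrees with the number of classes quoted in the talk.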
The number of traces needed for the monomial with the lowest PD is nearly three times that for the one with the highest PD. As for the 8-bit simulation, there are 254 monomials and a total of 34 classes, so we use inverse functions to fit these results. It can be seen that the results match the PD values as well.
Besides, we also tried other side-channel metrics. Information-theoretic metrics are widely used as indicators for side-channel attacks, and mutual information is a well-known one, so we use it in our work. The results for the 4-bit and 6-bit cases are shown. It can be seen that monomials with the same output size fall into the same class, and the mutual information metric does not match the attack results as well as PD does.
Next, we introduce our practical experiments. In our work, the AES S-box is the case study, because AES is one of the most popular block ciphers and the AES S-box can be expressed simply over the finite field. Moreover, there are many public masked implementations of the AES block cipher.
Here comes the first problem: how to find all feasible and most efficient addition chains. The most efficient addition chains have 4 multiplications and 7 squarings. Step 1 is to find the addition chains that include 4 multiplications. We randomly choose two elements of the exponent set to add and obtain a new set, then take the union of the new set and the initial exponent set. After four additions, we check whether 254 belongs to the final exponent set. If it does, this addition chain includes 4 multiplications. Step 2 is to count the number of squarings in the addition chain. Our approach is to sum the numbers of squarings from the red to the orange in each cyclotomic class. We then get 1330 addition chains with 4 multiplications and 7 squarings, and none with fewer squarings.
Moreover, we assume that there are two types of adversaries. Adversary 1 has limited computational resources.
So he is only able to find the leakage corresponding to one sensitive intermediate. For this adversary, the measurement is the maximum PD value. Adversary 2 has sufficient computational resources, so he is able to launch higher-order attacks on all sensitive intermediates and then sum the results together to achieve a higher success rate. Thus, the measurement for this adversary is the sum of the PD values. With these two measurements, we find three typical addition chains. The first is the weakest addition chain for both adversaries. The second is the strongest addition chain for both adversaries. The third is a strong addition chain that can be processed in parallel.
Here is our experimental setup. The power traces are collected with a ChipWhisperer-Lite board, and the code runs on an ARM Cortex-M4 based microcontroller. The leakage is shown here, and we can see that this is clearly a low-noise scenario. However, the noise levels of different monomials also differ, which will affect the attack results. The EM traces are collected with an Agilent oscilloscope and an EM probe. The code also runs on an ARM Cortex-M4 based microcontroller. Clearly, this is a high-noise scenario: the correlation coefficient between the leakage and the corresponding sensitive value is lower. Moreover, the difference in noise level between monomials becomes more obvious, which may affect the attack results even more.
Here is our power analysis. It can be seen that all the implementations can be broken with a small number of traces. Moreover, the two strong addition chains are always better than the other two. Most importantly, in the worst case, the resistance is close to that of an unprotected AES S-box. And here is the result of the EM analysis. Since it is not straightforward to read, our main results are summarized in a table. In this table, we can see that in the worst case, the resistance is also close to that of an unprotected AES S-box implementation.
Besides, we find that the attacks on the other three addition chains become less efficient for adversary 2. This is because the inefficient attack results on some monomials are combined in and negatively affect the final attack results.
What's more, we also launched profiled attacks, namely the template attack and the deep learning-based attack. In the template attack, we get the probability of each share in each trace, and from these the probability of the sensitive data in each trace. With the help of the inverse map, the probability of the sensitive intermediate can be mapped to the secret key. We add up the probabilities of the key over all traces, which gives the probability of every key hypothesis, and the key with the maximum probability is our result. For the deep learning-based attack, training uses a CNN model, and the last fully connected layer has as many neurons as the function f has outputs. Our targets are two typical functions, x to the power of 85 and the S-box. It can be seen that with increasing noise, the attack on x to the power of 85 becomes more efficient. And because of the smaller output size of x to the power of 85, the cost of storing the templates and running the attack is also lower. This is the experimental environment and the CNN architecture. The network architecture may not be optimal, as our goal is to compare the different addition chains rather than to find optimal parameters. It can be seen that with increasing noise, the attack on x to the power of 85 becomes more efficient as well.
Finally, we conclude our work. In our work, we target the addition chain implementation using Boolean masking. We propose a new notion named polygon degree to quantify the side-channel resistance of a function. We also try mutual information as a metric. And last, we validate our metric using higher-order CPA, the template attack, and the deep learning-based attack. However, this work is not yet complete.
Besides the addition chain implementation, there may be other implementations that include unbalanced functions. Moreover, unbalanced functions may also affect the security of other masking schemes, such as multiplicative masking, inner product masking, Shamir's secret sharing, and others. We may then need other metrics to quantify the side-channel resistance and validate them with other distinguishers. This concludes my presentation. Thanks for watching.