Hello, I'm Weijia Wang, and I'm going to present the work entitled "Packed Multiplication: How to Amortize the Cost of Side-Channel Masking?", which is joint work with Chun Guo, François-Xavier Standaert, Yu Yu, and Gaëtan Cassiers. Here I list what I'm going to present. First, I will give the background and motivation of this work. Then I will present the packed multiplication, which is the main contribution. After that, I will give the performance of the AES S-box application. Finally, I will give the conclusion and discuss future work.

Here I'm going to give a short introduction to side-channel attacks. As we know, a cryptographic primitive has to be implemented and run in a given environment, which may have some side-channel leakage such as power consumption, running time, fault information, and so on. For example, we can implement a cryptographic primitive that takes a plaintext as input and outputs a ciphertext. During the computation, an adversary may use the side-channel leakage to recover the secret key inside the primitive by means of statistical analysis. Masking is one of the most investigated countermeasures against side-channel attacks, where each secret-dependent sensitive variable is randomly encoded into several shares.

Here I present more details about the masking technique. A masking scheme is made up of two ingredients. The first one is called the encoder, and it randomizes each sensitive variable, say x here, into a number of shares such that any d shares are independent of x. So we can see that the encoder provides security for a sensitive variable such as a key. Besides, a cryptographic primitive is usually a computation from some input to some output, so we need to secure the computation, which requires the second ingredient, the private computation. Here I give an example. Say we want to compute x + y·z, where x, y, and z are sensitive variables.
What we can do is transform each elementary operation into its counterpart in the masked domain, whose inputs and outputs are both sharings. Here addition is transformed into an addition gadget, and multiplication is transformed into a multiplication gadget. After these transformations, we have turned an unprotected computation into a protected one, ensuring that any d intermediate values are independent of the sensitive inputs. We call this kind of security d-privacy or d-probing security. So that is the basic idea of masking.

As masking transforms each gate into the corresponding gadget, we are interested in the construction of different types of gadgets. The first one is the linear gadget, which performs a linear operation in shared form. As the encoding is homomorphic with respect to linear functions, a linear gadget can be correctly and securely constructed by applying the linear function to the shares of the same index, which we call the trivial implementation of a linear function. But it becomes more difficult for nonlinear gadgets implementing nonlinear functions such as multiplication.

In the following, I will introduce the well-known ISW multiplication using an example with three shares. We introduce the ISW multiplication because it is quite famous in the side-channel countermeasure community, and our proposed scheme is also based on its concept. The ISW scheme was proposed by Yuval Ishai, Amit Sahai, and David Wagner 17 years ago. The inputs are two three-share sharings corresponding to two sensitive variables, and the output is a three-share sharing. The gadget implements multiplication in shared form. First of all, we calculate the outer product of the input shares, resulting in a 3×3 matrix like this. We can see that the sum of all entries of this matrix is the sensitive output of the multiplication, and the sum of the entries of each row is one share of the output. To secure the summation, some random variables have to be added in: here, here, and here, and again here, here, and here.
And finally, we sum the entries of each row, here, here, and here, to get the output. So the procedure of the ISW multiplication can be abstracted as outer product, then refresh, then compression. This is the basic idea of the ISW multiplication.

So now we have the masked linear operation here, and we also have the masked multiplication here. Theoretically, we can transform any cryptographic algorithm into masked form. Of course, we may need to add some mask refreshing between the gadgets to avoid improper propagation between adjacent gadgets, but I omit this due to lack of time.

We can see that masking comes with some overhead. The overhead of linear operations is reasonable and seems hard to mitigate: the cost of the computation increases by a factor of the number of shares. But when it comes to multiplication, the computational cost of the ISW multiplication increases by a factor of d², and it requires d² bilinear multiplications. Besides, the multiplication also needs uniformly distributed random variables, which are expensive to generate in practice. So we are interested in reducing the implementation overhead of masked nonlinear operations, which is a challenge for the practical use of masking techniques. We consider both the computational complexity and the randomness complexity.

A natural idea is to concentrate on designing more efficient gadgets. Many previous works are devoted to this direction and try to push the limits of gadgets. Overall, this approach considers every gadget separately, and we can call it an isolating approach. It considerably simplifies the situation and has achieved many good results. On the other hand, we note that a cryptographic algorithm typically consists of executing a basic function many times in parallel. For example, AES evaluates an S-box 16 times within each round.
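As a reference point for the gadgets and the quadratic cost just discussed, here is a minimal sketch of Boolean masking over GF(2^8): an encoder, the share-wise linear gadget, and the ISW multiplication in its outer product, refresh, compression form. The function names are illustrative, and the choice of GF(2^8) with the AES reduction polynomial is an assumption made for the example, not something fixed by the talk. With d + 1 shares the sketch performs exactly (d + 1)² field multiplications, which the talk rounds to d², and draws d(d + 1)/2 random bytes.

```python
import secrets


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return result


def encode(x, d):
    """Split x into d + 1 Boolean shares whose XOR equals x."""
    masks = [secrets.randbelow(256) for _ in range(d)]
    first = x
    for m in masks:
        first ^= m
    return [first] + masks


def decode(shares):
    """Recombine a sharing (for checking only; a masked device never does this)."""
    x = 0
    for s in shares:
        x ^= s
    return x


def masked_xor(xs, ys):
    """Linear gadget: apply XOR to the shares of the same index."""
    return [a ^ b for a, b in zip(xs, ys)]


def isw_mul(xs, ys):
    """ISW multiplication. Returns (output sharing, field mults, random bytes)."""
    n = len(xs)
    # Step 1: outer product of the two input sharings (n * n field multiplications).
    prod = [[gf_mul(xs[i], ys[j]) for j in range(n)] for i in range(n)]
    # Step 2: refresh with one fresh random byte per pair i < j.
    r = [[0] * n for _ in range(n)]
    randoms = 0
    for i in range(n):
        for j in range(i + 1, n):
            r[i][j] = secrets.randbelow(256)
            randoms += 1
            # The parentheses fix the order of operations, as in the ISW scheme.
            r[j][i] = (r[i][j] ^ prod[i][j]) ^ prod[j][i]
    # Step 3: compression, summing each row into one output share.
    out = []
    for i in range(n):
        acc = prod[i][i]
        for j in range(n):
            if j != i:
                acc ^= r[i][j]
        out.append(acc)
    return out, n * n, randoms
```

For example, with d = 2 (three shares), `isw_mul(encode(x, 2), encode(y, 2))` uses 9 field multiplications and 3 random bytes, and decoding the result gives the field product of x and y.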
At a higher level, many modes of operation are designed to support running several primitives in parallel. For example, CTR mode encrypts several blocks in parallel. Facing this situation, our work takes a global view and applies an amortization technique, which aims at reducing the average complexity of masking several operations. The more parallel multiplications there are, the lower the average cost. As a result, we propose a new construction named packed multiplication, which computes multiple masked multiplications in parallel.

In the following, we call the vector of shares related to a sensitive variable a sharing. So, for instance, the Boolean shares of a sensitive variable x are x̂_0 to x̂_d, and we call this vector of d + 1 shares a sharing. The number of parallel multiplications is denoted L. As shown on the left, using the classical local approach, the two input vectors are given as several pairs of sharings, and each pair is processed independently. In contrast, as shown on the right, our packed multiplication considers multiplications over a finite field of size q, and it has two steps. First, each input vector is re-encoded as a packed sharing using a linear code. When the field size q is greater than the number of multiplications L plus the security order d, the data size is compressed from 2(d + 1)L to 2(d + L). Then a multiplication over the packed sharings is calculated, resulting in Boolean sharings, and the data size returns to (d + 1)L.

Here I give an example to illustrate the theoretical performance gain for the 16 masked S-boxes in one AES round using our approach. The AES S-box is designed around the inversion in the finite field GF(2^8), which can be decomposed into four multiplications and several linear operations.
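The decomposition into four multiplications can be checked on unmasked values. The sketch below computes the inverse as x^254 using one well-known addition chain; the specific chain and the function names are assumptions for illustration, since the talk only states the count. Squarings are linear over GF(2), so in a masked implementation they are applied share by share, and only the four calls to `mul` would become multiplication gadgets. The affine transformation that completes the AES S-box is also linear and is left out here.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return result


def gf_square(a):
    """Squaring in GF(2^8); linear over GF(2), hence cheap to mask."""
    return gf_mul(a, a)


def gf_inverse(x):
    """Return (x^254, number of nonlinear multiplications used).

    x^254 equals the inverse of x for x != 0 and maps 0 to 0, as AES requires.
    """
    mults = 0

    def mul(a, b):
        # Count only the nonlinear multiplications, not the squarings.
        nonlocal mults
        mults += 1
        return gf_mul(a, b)

    x2 = gf_square(x)                   # x^2
    x3 = mul(x2, x)                     # x^3   (multiplication 1)
    x12 = gf_square(gf_square(x3))      # x^12
    x15 = mul(x12, x3)                  # x^15  (multiplication 2)
    x240 = x15
    for _ in range(4):                  # four squarings: x^15 -> x^240
        x240 = gf_square(x240)
    x252 = mul(x240, x12)               # x^252 (multiplication 3)
    x254 = mul(x252, x2)                # x^254 (multiplication 4)
    return x254, mults
```

With 16 S-boxes per round and four multiplications each, this is where the count of 64 multiplications per round comes from.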
When we use the isolating approach with the ISW multiplication, we need to implement the 16 S-boxes separately, which requires 64·d² bilinear multiplications and 64·d(d + 1)/2 random bytes. With our packed multiplication, only four packed multiplications, each with L = 16 parallel multiplications, are needed. Thanks to the amortization, this requires a much smaller number of bilinear multiplications and random bytes.

In the following, I will introduce the construction of the two steps, packing and multiplication. Recall that the packing converts Boolean sharings into packed sharings of smaller total size, and the multiplication operates on packed sharings and returns Boolean sharings.

The first part is the packing. The input is L Boolean sharings, each of which consists of d + 1 shares, and the output is a packed sharing with L + d shares. The basic idea is to transform each sharing of d + 1 shares into another set of d + 1 shares such that the last d shares are common to the different sharings. For example, the first sharing is transformed into q̂_{x_1}, which is a variable in the finite field, together with û_1 to û_d. The second sharing is transformed into q̂_{x_2}, which is a different variable in the finite field, and, most importantly, the same û_1 to û_d. We can see on the right side that only the first shares, one for each input Boolean sharing, are different, and the last d shares are identical. Clearly, this would not be secure if the outputs were Boolean sharings: for example, knowing q̂_{x_1} and q̂_{x_2}, an adversary could eliminate the mask and recover the XOR of the first two sensitive variables. Instead, to keep probing security, the output sharing in our construction is similar to inner product masking.
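To make the shape of such a packed sharing concrete: each sensitive value is hidden as its own first share plus the inner product of the d common shares with a per-value constant vector. The sketch below only illustrates this encoding and its size, L + d field elements instead of L(d + 1). It encodes directly from plain values, which a real implementation would never do, and it does not implement the packing procedure from Boolean sharings. The constant vectors produced by `demo_constants` are hypothetical placeholders with no security claim attached, and all names are illustrative.

```python
import secrets


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return result


def inner(a, u):
    """Inner product of two vectors over GF(2^8) (addition is XOR)."""
    acc = 0
    for ai, ui in zip(a, u):
        acc ^= gf_mul(ai, ui)
    return acc


def demo_constants(L, d):
    """Hypothetical constant vectors a_k = (k, k^2, ..., k^d) for k = 1..L.

    Placeholder only: the talk does not specify how the constants are chosen.
    """
    consts = []
    for k in range(1, L + 1):
        row, p = [], 1
        for _ in range(d):
            p = gf_mul(p, k)
            row.append(p)
        consts.append(row)
    return consts


def packed_encode(values, d):
    """Encode L values as (q, u, a) with values[k] = q[k] + <a[k], u>."""
    a = demo_constants(len(values), d)
    u = [secrets.randbelow(256) for _ in range(d)]   # d shares common to all values
    q = [x ^ inner(a[k], u) for k, x in enumerate(values)]
    return q, u, a


def packed_decode(q, u, a):
    """Recover every value from its first share and the common shares."""
    return [qk ^ inner(ak, u) for qk, ak in zip(q, a)]
```

For L = 16 and d = 8, the packed form holds 16 + 8 = 24 field elements, compared with 16 × 9 = 144 shares for sixteen separate Boolean sharings.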
The sensitive variable equals q̂_x plus the inner product of (û_1, ..., û_d) and a constant vector, whose value differs for the different Boolean sharings.

So, concretely, how do we design the packing procedure? Here I illustrate one possible construction with an example where the security order is d = 2 and the number of multiplications is L = 3. First, we generate a uniformly random d × d matrix Q. Then we obtain the common vector û from the columns of the matrix Q. After that, for each input Boolean sharing, we multiply a constant vector by the matrix Q, resulting in a vector of length d. Of course, for different input Boolean sharings we use different constant vectors: for example, here we use y1 and here we use y2. Then we add this vector to the last d shares of the input sharing, resulting in a vector of d shares. Then we sum the elements of this vector and add the first share, and the result is q̂_x. It can be checked that the sensitive variable equals q̂_x plus the inner product of (û_1, ..., û_d) and the constant vector. Now we have the packing.

The next sub-gadget is the multiplication of packed sharings, which takes two packed sharings and performs the multiplications in the masked domain. First of all, the outputs are Boolean sharings, which means that each sharing corresponds to one sensitive variable. We therefore consider each Boolean sharing separately, and here I give the procedure for the k-th output Boolean sharing, with k from 1 to L. The multiplication follows the paradigm of outer product, then refresh, then compression, as in the ISW scheme. So, first of all, we compute the outer product of the input shares related to the sensitive variables x_k and y_k, which are q̂_{x_k}, q̂_{y_k}, the vector û, and the vector v̂. If we take d = 2, the outer product is this matrix. We can see that the lower-right 2 × 2 submatrix is the outer product of the vectors û and v̂.
Then we refresh this matrix by adding a random symmetric matrix to the lower-right submatrix of the outer product. We also replace the upper-right row with the diagonal of the random matrix. At last, we do the compression. For each row, we calculate the inner product of the last d entries and the constant vector a_k, and then add the first entry. After that, we sum the first row, here and here. Finally, we get the output. It can be checked that the output is a Boolean sharing of the product of the input sensitive variables. And the most important point of the procedure is that it works for all values of k with the same random variables, which is how we achieve the amortization.

We showcase our new method on the AES SubBytes step, which consists of 16 S-boxes. The S-boxes are implemented with security orders 4 and 8 on the ARM Cortex-M architecture. We can see that when the security order is d = 8, our implementation achieves a gain of up to 30% in total speed and saves up to 16 random bytes compared with the state-of-the-art bitsliced implementation reported at ASIACRYPT 2018. Admittedly, the code size of our implementation is larger, which is due to loop unrolling.

Finally, I would like to give a short conclusion and discuss future work on our scheme. Our work focuses on the overhead of side-channel masking techniques. We consider L masked operations with probing security at order d and reduce the average randomness and computational cost via the amortization technique. Regarding future work, first of all, we think a hardware implementation might be more suitable for our approach, since the field multiplication and the linear transformations can be optimized at the bit level.
Second, as we have proposed a new technique for masking, we think that combining our approach with the well-investigated formal methods and automated tools for masking is quite a promising direction, because for now our security proof is done by hand. It might also lead to a more efficient scheme if we can apply verified proofs. Third, since for now we only prove abstract probing security, which is a necessary first step, we think it is an interesting open problem to look at the concrete security of the proposal. That's it. Thank you for listening.