I am going to talk about our paper, "Lightweight Authenticated Encryption Mode Suitable for Threshold Implementation." I am Takeshi Sugawara, and this is joint work with Yusuke Naito and Yu Sasaki. This is the outline of my talk.
Our motivation is to design an authenticated encryption scheme suitable for threshold implementation. Here, "suitable" means minimizing the memory size in a threshold implementation. We follow the approach of PFB, which improves the performance with TI by reducing the ratio of the nonlinear part that uses many shares. In the previous work, the size of the nonlinear part was the same as the security level in bits. We extended the work to design two TI-friendly AE modes. The first one is PFB+, and the size of its nonlinear part is half of the security level, which is half of that of PFB. This figure on the left shows how the memory size is multiplied in TI in PFB. We can trade half of the nonlinear part for a linearly updated state, as shown in the figure on the right for PFB+. Since we need fewer shares for protecting the linear part with TI, the total memory size for PFB+ can be smaller than that of PFB with TI. PFB omega is a generalization, and the size of its nonlinear part is even smaller: it is the security level divided by a positive integer omega. We give conditions on the internal parameters that PFB omega should satisfy, and give concrete parameters up to omega equal to 3.
We want authenticated encryption for resource-constrained IoT devices, so the design goal here is to minimize the implementation cost without compromising security. As a result of the recent advances in lightweight cryptography, registers or memory dominate the hardware cost rather than combinational circuits. Here we show an example. A 4-bit S-box, which is common among lightweight block cipher algorithms, uses 20 to 40 gates. In contrast, a register for storing 128-bit data needs 600 to 900 gates.
There is a threat of side-channel attacks on IoT devices, because they are sometimes operated in a hostile environment in which the owner attacks the device. The NIST lightweight cryptography standardization process is ongoing, and it optionally considers security against side-channel attacks. So there is a need for a lightweight and side-channel-attack-resistant AE. In other words, the performance of an implementation with a side-channel attack countermeasure can be an important performance metric for AE schemes. The goal of our work is to design an AE that has a minimal memory size with threshold implementation.
Threshold implementation, or TI, is a side-channel attack countermeasure suitable for hardware. It is based on multi-party computation, and it has the non-completeness property that ensures security in the presence of glitches, which are transient signal propagations through combinational circuits. The number of shares multiplies the memory size. A d-th order TI needs td + 1 shares for a target function with algebraic degree t. So for first-order TI, the minimum number of shares is 2 for linear functions and 3 for nonlinear functions. So the memory related to nonlinear operations is multiplied by 3 in a threshold implementation, while the linear part multiplies the memory size by 2.
PFB is the previous design, and what we learned from PFB is that different parts of a tweakable block cipher use different numbers of shares. The state for data randomization uses 3 shares, the linear key schedule needs 2 shares, and the public tweak needs no protection. And we can change the ratio using a beyond-the-birthday-bound scheme. The resulting scheme has a small memory size when implemented with TI. The figure on the left is a block-cipher-based design called SAEB, and the one on the right is the TBC-based design, PFB. So what PFB did is convert half of the nonlinear part into a linear, public tweak.
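
The share-count rule quoted above (td + 1 shares for a d-th order TI of a degree-t function) can be written out as a minimal sketch. The function names are hypothetical, and the choice of degree 2 for the nonlinear layer is an assumption made here so that the result matches the 3 shares mentioned in the talk.

```python
def min_shares(order: int, degree: int) -> int:
    """Minimum share count for an order-d TI of a degree-t function: t*d + 1."""
    if order < 1 or degree < 1:
        raise ValueError("order and degree must be positive integers")
    return degree * order + 1


def shared_register_bits(bits: int, order: int, degree: int) -> int:
    """Register bits needed to hold `bits` of state once it is split into shares."""
    return bits * min_shares(order, degree)


# First-order TI of a linear function (degree 1).
print(min_shares(order=1, degree=1))  # 2
# Assumption: the nonlinear layer has algebraic degree 2, which gives the
# 3 shares quoted in the talk for first-order TI.
print(min_shares(order=1, degree=2))  # 3
# A 128-bit register feeding nonlinear logic versus one feeding linear logic.
print(shared_register_bits(128, order=1, degree=2))  # 384
print(shared_register_bits(128, order=1, degree=1))  # 256
```

The same 128 bits therefore cost 384 or 256 register bits depending on which kind of logic they feed, which is why moving state from the nonlinear part to the linear part pays off.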
The total amount of memory is the same without TI, but because of the different multipliers, the memory sizes with TI differ between the schemes.
We compare different design approaches and their memory sizes with TI. In a permutation-based design, like duplex, the capacity C should be 2S for the security level S because of the birthday bound. So the nonlinear size equals the permutation size, and it is 2S plus the rate r. In a block-cipher-based design, like SAEB or SUNDAE, the block size B should be 2S, again because of the birthday bound. Here the nonlinear size becomes 2S with a block cipher using a linear key schedule. Finally, in a tweakable-block-cipher-based design, like PFB or Romulus, the block size B equals S because of beyond-the-birthday-bound security. So the nonlinear size equals S with a tweakable block cipher using a linear tweakey schedule.
Here is our approach for TI-friendly AE. We extend PFB to reduce the ratio of the nonlinear part even further while keeping the S-bit state for security. This video shows the ratio of the nonlinear and linear parts and how they are multiplied in TI. The one on the top is PFB, and it is our baseline. We replace the nonlinear part with a linear part within the S bits. The total memory size is constant without TI, but we can reduce the memory size with TI because the nonlinear and linear parts have different multipliers.
These are our results. We extended PFB. The first one is PFB plus, whose security level S equals double the block size. That means the size of the nonlinear part is half that of the original PFB. The second one is the generalization we call PFB omega, and the size of its nonlinear part is S divided by a positive integer omega. We give conditions under which PFB omega becomes secure. Next, we extend Skinny with a larger tweakey for PFB plus. It is what we call Skinny E, with a 64-bit state and a 256-bit tweakey. Finally, we gave a concrete hardware performance evaluation by combining PFB plus with Skinny E.
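
As a rough illustration of the comparison above, the sketch below computes the nonlinear size for each design approach and the first-order TI cost of the S-bit state when only 1/omega of it is nonlinear. The function names and the 64-bit rate in the example are hypothetical. Only the S-bit state is counted, and the key and tweak are left out.

```python
from fractions import Fraction

NONLINEAR_SHARES = 3  # first-order TI, nonlinear part
LINEAR_SHARES = 2     # first-order TI, linear part


def nonlinear_bits(design: str, s: int, rate: int = 0) -> int:
    """Size of the nonlinear part for security level s, per design approach."""
    if design == "permutation":   # capacity 2s plus the rate
        return 2 * s + rate
    if design == "block_cipher":  # block size 2s (birthday bound)
        return 2 * s
    if design == "tbc":           # block size s (beyond the birthday bound)
        return s
    raise ValueError(f"unknown design: {design}")


def state_multiplier(omega: int) -> Fraction:
    """TI size of the S-bit state, as a multiple of S, when 1/omega is nonlinear."""
    nonlinear_fraction = Fraction(1, omega)
    return (NONLINEAR_SHARES * nonlinear_fraction
            + LINEAR_SHARES * (1 - nonlinear_fraction))


# The rate of 64 bits is a made-up value used only for illustration.
print(nonlinear_bits("permutation", 128, rate=64))  # 320
print(nonlinear_bits("block_cipher", 128))          # 256
print(nonlinear_bits("tbc", 128))                   # 128

# omega = 1 is PFB, omega = 2 is PFB plus, omega = 3 is PFB omega.
for omega in (1, 2, 3):
    print(f"{omega}: {state_multiplier(omega)}")
# 1: 3
# 2: 5/2
# 3: 7/3
```

The multiplier simplifies to 2 + 1/omega, so it approaches the linear-only cost of 2 as omega grows.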
I briefly recall the TBC-based scheme PFB. This figure on the top left is a TBC, which is a family of B-bit permutations indexed by the key and the tweak. It offers many distinct permutations by changing the tweak, and using this functionality, we can obtain a highly secure and lightweight AE like PFB. This figure on the top right is PFB, composed of an iterated structure of TBCs. It uses a nonce and a counter value as a tweak for each TBC call, so they are distinct permutations. This structure ensures that the security level is equal to the block size B.
To further improve the security level from B to 2B bits, we modify PFB by adding this B-bit state that uses linear operations. This is PFB plus, and the additional B-bit state is defined by XORing each TBC output. Now the internal state size is double the block size, and it ensures that the security level is equal to double the block length in the nonce-respecting setting. The ratio of the nonlinear part in the internal state is a half, which is smaller than that of PFB.
This is the generalized mode we call PFB omega. The additional (omega minus 1) times B-bit state is defined by processing each TBC output with these linear operations. The extended internal state ensures that the security level is equal to the internal state size, given by omega times B, in the nonce-respecting setting and under some conditions that the coefficients in the linear operations should satisfy. The ratio of the nonlinear part in the internal state is 1 over omega, which is even smaller than that of PFB plus when omega is larger than or equal to 3.
To instantiate PFB plus, we need a TBC with a 4B-bit tweakey space for the block size B, because we need 2B bits for a key and another 2B bits for a tweak. There is no previous lightweight TBC that satisfies this requirement. For example, Skinny supports up to a 3B-bit tweakey space.
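
The tweakey requirement can be checked with a few lines of arithmetic. This is only a sketch of the sizes stated in the talk, with hypothetical function names, and it says nothing about how the key and tweak are actually laid out inside the cipher.

```python
def pfb_plus_tweakey_bits(block_bits: int) -> int:
    """Tweakey bits PFB plus needs: 2B for the key plus 2B for the tweak."""
    key_bits = 2 * block_bits
    tweak_bits = 2 * block_bits
    return key_bits + tweak_bits


def skinny_max_tweakey_bits(block_bits: int) -> int:
    """Largest tweakey the original Skinny supports: three B-bit tweakey states."""
    return 3 * block_bits


block_bits = 64
print(pfb_plus_tweakey_bits(block_bits))    # 256
print(skinny_max_tweakey_bits(block_bits))  # 192
# The 64-bit gap corresponds to one more B-bit tweakey state.
print(pfb_plus_tweakey_bits(block_bits) - skinny_max_tweakey_bits(block_bits))  # 64
```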
So we designed Skinny E, which is an extension of Skinny that supports a 64-bit block and a 256-bit tweakey. Skinny is based on the TWEAKEY framework. Skinny-64 is a variant with a 64-bit block and can have up to three tweakey states, namely TK1 to TK3. So the 192-bit tweakey is the maximum, and the number of rounds is at most 40. These three tweakey states are linearly and independently updated. In Skinny E, we extended the original Skinny by adding a new tweakey state, TK4, along with a new linear tweakey schedule. Skinny E supports a 256-bit tweakey with a 64-bit block. To ensure security, we extended the number of rounds to 44. Please see our paper for details of the security analysis.
We did a hardware performance evaluation of PFB plus combined with Skinny E. For comparing its performance, we also implemented PFB instantiated with the original Skinny with a 128-bit block and a 256-bit tweakey. We implemented both schemes with the same design policy. These are the diagrams of the datapath architectures of the two schemes. Both of them use the serialized Skinny implementations from the original Skinny paper. We added some extra selectors and AND gates to support commands for processing blocks for PFB and PFB plus. Another thing you might be interested in is that various numbers of shares coexist in these implementations. The blue region is extended to three shares for Skinny's nonlinear operations, and the orange regions use two shares for the secret keys and the PFB plus extended state.
These are the implementation results. This table compares the number of registers and the circuit area between the two schemes with and without TI. The two schemes are almost the same without TI, both in register size and circuit area. In contrast, PFB plus combined with Skinny E is smaller by 1009 gates with TI compared to PFB. This difference comes from the difference in the number of registers: PFB plus is smaller by 64 bits.
This is the conclusion of my talk.
We designed two TBC-based, TI-friendly AEs in which the ratio of the nonlinear part is less than one. The first one is PFB plus, and its nonlinear part is half of the security level S, which is half of that of the conventional PFB. The second one is PFB omega, and its nonlinear part is only S divided by omega. We gave the conditions on the internal parameters needed for satisfying the security. Then we designed Skinny E with a 64-bit block and a 256-bit tweakey by extending the original Skinny with the fourth tweakey state. Finally, we made a hardware performance evaluation of PFB plus combined with Skinny E, and PFB plus was smaller than the original PFB by 1009 gates with TI.
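
The 64-bit register difference mentioned in the implementation results can be reproduced with a simplified accounting. This is a sketch under several assumptions. Only the cipher state, the extended state, the key, and the tweak are counted. The 256-bit tweakey of PFB with Skinny is assumed to split into a 128-bit key and a 128-bit tweak. Control registers and other implementation details are ignored. The layout names are hypothetical.

```python
# Shares per kind of register under first-order TI, as described in the talk.
SHARES = {"nonlinear": 3, "linear": 2, "public": 1}

# PFB with Skinny: 128-bit block, 256-bit tweakey.
# Assumption: the tweakey splits into a 128-bit key and a 128-bit tweak.
PFB_SKINNY = {
    "tbc_state": (128, "nonlinear"),
    "key": (128, "linear"),
    "tweak": (128, "public"),
}

# PFB plus with Skinny E: 64-bit block, 2B-bit key, 2B-bit tweak,
# plus the extra B-bit linearly updated state.
PFB_PLUS_SKINNY_E = {
    "tbc_state": (64, "nonlinear"),
    "extended_state": (64, "linear"),
    "key": (128, "linear"),
    "tweak": (128, "public"),
}


def register_bits(layout: dict, with_ti: bool) -> int:
    """Total register bits for a layout, with or without first-order TI."""
    return sum(bits * (SHARES[kind] if with_ti else 1)
               for bits, kind in layout.values())


print(register_bits(PFB_SKINNY, with_ti=False))         # 384
print(register_bits(PFB_PLUS_SKINNY_E, with_ti=False))  # 384
print(register_bits(PFB_SKINNY, with_ti=True))          # 768
print(register_bits(PFB_PLUS_SKINNY_E, with_ti=True))   # 704
```

Under these assumptions the two layouts are equal without TI and differ by 64 bits with TI, in line with the register difference reported in the talk.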