AES-LBBB: AES Mode for Lightweight and BBB Secure Authenticated Encryption

I am Takeshi Sugawara, and this is joint research work with Yusuke Naito and Yu Sasaki. In this paper, we propose a new authenticated encryption mode called LBBB for AES accelerators. Our motivation is to provide higher security with lower memory overhead by using AES accelerators, which are, and will continue to be, available in many computational platforms.

We propose a new mode of operation, LBBB, and its instantiation, AES-LBBB. BBB stands for beyond-the-birthday-bound security, and AES-LBBB achieves almost 128-bit security using AES. L stands for lightweight in terms of memory size. The diagram on the bottom shows the operational unit for processing a 128-bit block in AES-LBBB. We use a 256-bit internal state flowing from left to right: the upper half is the 128-bit AES secret key, and the bottom half is the 128-bit AES internal state. To realize an internal state large enough to resist internal collision attacks, we update the AES key as we process the message blocks, which is the key point for reducing the memory size.

Lightweight cryptography has been a hot topic in cryptography for more than a decade. The motivation is to design crypto algorithms that achieve better performance on resource-constrained IoT devices. In particular, NIST is running a competition to choose a new lightweight standard, which has made this research area very active in the last few years. Most candidates in the competition use new lightweight primitives such as Ascon-p, SKINNY, and GIFT instead of AES because they are better in various ways. There were only three AES-based candidates in the previous round, and unfortunately none of them survived to the final round. The new lightweight primitives are reasonable choices for rigorous optimization. However, AES-based schemes are still valuable, especially because AES accelerators are getting more and more common in many computational platforms.

AES accelerators are everywhere these days. Most modern processors
including x86, Arm, and RISC-V have AES instructions. AES coprocessors are getting very popular in general-purpose microcontrollers, especially for the recent Armv8 architectures. We can expect that these AES accelerators will remain available in the future for backward compatibility, interoperability, and standardization. People are still using RC4 in WPA2 Wi-Fi security, which is a good example of the resilience of a standardized algorithm. AES-GCM is the most common AES-based scheme, and its security is limited by the birthday bound and is 64 bits only. People are moving toward 128-bit security, and designing a better AEAD for AES is our motivation.

There are conventional AES-based modes for BBB security. In particular, ALE, which appeared in 2013, is an AES-based AEAD claiming 128-bit security. It uses the AES round function as a basic building block and processes a 128-bit message block for each 4-round AES operation, achieving a very high data rate. We can use AES instructions to implement these AES round functions. Meanwhile, AES coprocessors usually finish the entire AES encryption at once instead of a single round function, so they are not suitable for ALE. Unfortunately, there is an attack, and ALE does not provide the claimed security level.

Another candidate is Remus-N2, a relative of Romulus, which is a NIST LWC finalist. It achieves BBB security, and since it uses a block cipher as a primitive, we can also use AES coprocessors to accelerate Remus-N2. The diagram on the bottom shows Remus-N2's basic unit for processing a 128-bit message block. It uses a 384-bit internal state flowing from left to right. The upper and middle lanes represent the AES key and the AES internal state. Remus-N2 uses an additional 128 bits for the bottom lane, and eliminating these 128 bits is the key challenge we are tackling in this paper.

Before coming to the main part, I want to emphasize the importance of memory size in lightweight cryptography. Memory, or registers, dominates the entire circuit area in compact hardware implementations. Here are quick examples. A 4-bit
S-box typically uses 20 to 40 gates. A 128-bit register alone needs 600 to 900 gates, and reducing this essential memory is almost impossible through hardware design. Memory is much cheaper in software, but it is still crucial. The memory size determines the chip cost, and there is continuous pressure for low-memory software implementations even today. Recent microcontrollers have a special memory for security, which is even smaller. For example, the SAM L11 microcontroller we use for our benchmarking has TrustRAM, which is limited to 256 bytes only.

OK, we are coming back to the motivation of this work. We designed the scheme with these design goals. First, we want beyond-the-birthday-bound security, achieving almost 128-bit security using AES. Second, we want to achieve 256-bit memory, meaning no extra memory outside AES. Third, we do not want to compromise the speed too much and want to maintain rate-1 performance. This means that we want to process a 128-bit message block for each AES call. Finally, the table on the bottom compares ALE, Remus-N2, and ours, and we can see that ours satisfies all three design goals.

Here is the list of our contributions. We propose the new mode of operation LBBB. We specify AES-LBBB, an instantiation of LBBB using AES-128. We benchmark the software and hardware performance and compare them with the state of the art. We implement the software on a microcontroller with an AES coprocessor and implement the hardware for ASIC. We also implement Remus-N2 for comparison. The tables on the right show the software and hardware performance, in which AES-LBBB performs better. Finally, we discuss design extensions for further performance.

These figures compare the basic processing units of LBBB and Remus-N2. In these diagrams, a box with a black bar is the block cipher, and the input toward the bar represents its secret key. A 256-bit internal state is necessary for the resistance against internal collisions needed to achieve 128-bit security. A block cipher's internal state is 128 bits in the case of AES, which is not
enough. Remus-N2 uses the additional 128-bit state. The challenge is to further reduce the state size, and we achieve this by using the block cipher's key in addition to the internal state. To achieve randomness in this 256-bit internal state, we feed the block cipher's output to the key. We mix them through the pi and lambda functions, which are linear functions with several requirements. In the instantiation with AES, we set both pi and lambda to a constant field multiplication, so we can implement them very efficiently.

This figure shows how we construct LBBB's hashing and encryption. It is basically a simple iteration of the basic operation unit, but there are some important points. First, we can feed a 256-bit associated data block at once, so this scheme's rate becomes 2 in hashing. We use yet another function, eta, for domain separation, and eta is a simple LFSR in the final instantiation. After feeding all the message blocks, the final key state in the block cipher is used as a tag.

We prove the security in the ideal cipher model, that is, assuming the block cipher is ideal, under the nonce-respecting setting, in which no nonce is repeated between messages. We use the nAE security notion, which claims indistinguishability from an ideal system consisting of a random-bit oracle and a rejection oracle. The random-bit oracle returns random ciphertexts, and the rejection oracle always returns rejection in tag verification. As long as we interact with the ideal system, we obviously learn nothing about the plaintext, and forging a tag never succeeds. We are going to prove that LBBB is as good as the ideal one.

For the nAE security of LBBB, we can prove that LBBB achieves n minus log2 n bit security. They are indistinguishable up to 2 to the power of n data blocks in all queries, or 2 to the power of n over n local complexity. As a result, the final security becomes 121 bits with AES, since n is 128 and log2 of 128 is 7.

In the next few slides, we are going to explain the security proof. This is a proof sketch for the encryption. In this proof, the goal is to show the indistinguishability
between the LBBB encryption and the random-bit oracle. For this purpose, we should analyze how the scheme maintains the randomized state. The first two block cipher calls randomize the entire state, and we get a 2n-bit completely randomized internal state. Since we consider the nonce-respecting scenario, we can propagate it to the last block, and the internal state values are all independent and random as long as no state collision occurs. The collision complexity determines the security, and by the birthday analysis on the 2n-bit state, the collision complexity is on the order of 2 to the power of n ciphertext blocks, so LBBB achieves n-bit security regarding encryption.

We are going to explain the next one, the proof sketch for decryption. Forgery attacks are what matter here, and we analyze two attack cases. The first case is guessing a tag in decryption. The key point is that we get a fresh tag for each new input block to the last block cipher. In this case, the tag is almost random, so the success probability is 2 to the power of minus n for each decryption query. As a result, the forgery attack in this case requires 2 to the power of n decryption queries, satisfying n-bit security.

The next forgery attack is guessing some internal state in decryption. LBBB uses a 2n-bit internal state, and half of it, the one shown in red in this figure, is public because it is exposed through the ciphertext. If the adversary can obtain the remaining n bits, there is no secret anymore, and forging a tag is trivial. So the key point is the complexity of recovering the remaining n-bit secret, shown in blue in this figure. Using the multi-collision technique on the public part, we can ensure that recovering one of the secret values requires 2 to the power of n over n data blocks in decryption queries, or 2 to the power of n over n local complexity. So LBBB achieves n minus log2 n bit security regarding decryption.

AES-LBBB is an instance of LBBB using AES-128. For the linear functions, we use a constant multiplication by 2 to the power of 8 for both pi and lambda. Constant multiplication is widely used as a
lightweight linear function in lightweight cryptography. We use 2 to the power of 8 because we preferred a bytewise operation. Another good thing is that we can combine pi and lambda using the distributive property to further reduce the computational cost.

We are moving on to performance evaluation. We use the Microchip SAM L11 microcontroller as a target platform. It is an Arm Cortex-M23 MCU and has an AES coprocessor that works at 57.2 cycles per byte. We implement the entire AEAD operations with the SUPERCOP API. We use a memory-aware implementation. The depth of the nested functions is a tricky part because deep nesting increases stack memory. For a fair comparison, we limit the depth to one level from the top-level functions. We put sensitive values only in independently acquired global memory, assuming a special memory, and never put them on the stack. We implement the current state of the art, Remus-N2, using the same design policies and compare it with AES-LBBB.

This table compares AES-LBBB and Remus-N2 in RAM, stack, ROM, and speed. AES-LBBB uses 16 bytes, or 128 bits, less memory because of its smaller state. Although this is small relative to the 88-byte stack, the secure memory needed for storing sensitive values can be more crucial, as I mentioned earlier. AES-LBBB also outperforms Remus-N2 in ROM and speed. Since AES is very efficient on this platform, non-AES operations, such as the linear functions or elementary operations such as moving and XORing, dominate the performance. AES-LBBB uses simpler and fewer non-AES operations, which contributed to this better performance.

We are moving on to the hardware implementation. This diagram shows the circuit architecture for AES-LBBB. AES obviously dominates the entire hardware cost, and for a compact circuit area we use the popular byte-serial architecture. What is special about it is the column-oriented serialization and the integrated inverse key schedule. We should care about byte ordering with a mode of operation, and we choose column-oriented serialization, which respects the AES-native ordering. On-the-fly key
scheduling is very popular among compact hardware implementations. It updates the key in place, which is a problem for various modes that require the original AES key in the following operations. So we should recover the original one from the last round key. We efficiently achieve this by adding a small amount of circuit components in the key array, as shown in blue. We also integrated the constant multiplication by 2 to the power of 8 needed for LBBB in the key array. We also implemented Remus-N2 using the same AES circuit for a fair comparison. We evaluated their circuit areas using the NanGate 45 nm standard cell library.

This table compares the circuit areas of AES-LBBB, Remus-N2, and the baseline AES. AES-LBBB is smaller by more than 1000 gates, which is a great improvement. This mostly comes from eliminating the 128-bit extra state, which uses more than 1000 gates in Remus-N2.

Finally, we discuss the extension of LBBB regarding the inverse key schedule. As I mentioned in the hardware implementation, we should recover the original AES key after the on-the-fly key schedule. We update the key state K in place using a key schedule function, KSF, within the block cipher call. Before proceeding to the next step, we should apply KSF inverse to get K out of KSF(K). Then we can move on to the next processing with the pi function. This is necessary for AES, but we can improve it for non-AES primitives.

The idea is to integrate KSF into the pi function instead of the constant multiplication in AES-LBBB. By doing this, we can completely skip KSF inverse. However, the pi function should satisfy certain requirements, and the AES key schedule does not satisfy them, unfortunately. The KSF of some block ciphers satisfies the requirements, and Karan is one example. So Karan-LBBB can be very small, although the resulting scheme will have 64-bit security because of Karan's block size.

Another approach is a simpler KSF. More recent block cipher designs use a simpler key schedule, with which KSF inverse is very efficient. That is in contrast with the AES KSF, which incorporates S-box operations. GIFT-128
is such a case, and GIFT-128-LBBB achieves almost 128-bit security. Another candidate, which is something in between, is using a modified AES having a better KSF, such as the one by Khoo et al. We can still accelerate such a scheme using fine-grained accelerators, such as AES-NI.

I'm concluding my talk. LBBB provides beyond-the-birthday-bound security at the smallest memory cost for block ciphers. AES-LBBB enjoys the power of AES accelerators, and it outperforms the conventional state of the art, Remus-N2, in both software and hardware benchmarks. Finally, we discussed the design extensions. Thank you very much for watching.
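
Supplementary sketch: two technical points from the talk can be checked in a few lines of Python. The first is the security arithmetic, n minus log2 n, which gives 121 for n = 128. The second is the constant multiplication by 2 to the power of 8 used for pi and lambda, together with the distributive property that lets two multiplications be merged into one. The talk does not name the field representation, so the reduction polynomial x^128 + x^7 + x^2 + x + 1, the integer bit ordering, and all function names below are assumptions made for illustration only; this is not the authors' implementation.

```python
import math

# ASSUMPTION: GF(2^128) defined by x^128 + x^7 + x^2 + x + 1.
# The talk does not state which polynomial AES-LBBB uses.
REDUCTION = 0x87              # x^7 + x^2 + x + 1
MASK = (1 << 128) - 1


def xtime(a: int) -> int:
    """Multiply a 128-bit field element by x (that is, by 2)."""
    a <<= 1
    if a >> 128:              # bit 128 set: reduce modulo the polynomial
        a = (a & MASK) ^ REDUCTION
    return a


def mul_2_pow_8(a: int) -> int:
    """Multiply by 2^8 = x^8 by doubling eight times (hypothetical helper)."""
    for _ in range(8):
        a = xtime(a)
    return a


def security_bits(n: int) -> int:
    """Security level n - log2(n) stated in the talk (n a power of two)."""
    return n - int(math.log2(n))


if __name__ == "__main__":
    print(security_bits(128))

    a = 0x00112233445566778899AABBCCDDEEFF
    b = 0xFFEEDDCCBBAA99887766554433221100
    # Distributive property: c*a XOR c*b equals c*(a XOR b),
    # so one multiplication can replace two.
    print(mul_2_pow_8(a) ^ mul_2_pow_8(b) == mul_2_pow_8(a ^ b))
```

Under these assumptions, multiplying by 2 to the power of 8 amounts to a one-byte shift of the 128-bit value plus a small correction derived from the byte shifted out, which is consistent with the remark in the talk that a bytewise operation was preferred.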