Welcome! This talk presents the message authentication code DoveMAC. My name is Eik List, and this is joint work with Tony Grochow and Mridul Nandi from ISI Kolkata.

Message authentication codes (MACs) take a message M and a secret key shared by the legitimate sender and receiver. The purpose of a MAC is to produce unforgeable tags that guarantee the authenticity of the transmitted messages. MACs can be stateful, randomized, nonce-based, or stateless deterministic. Here we focus on the latter. For the usual MAC security notion, the adversary has the goal to forge a valid tag for a message that has not been authenticated before. For this purpose, the adversary can ask authentication and verification queries. However, it is more convenient to target PRF security instead: an adversary who cannot distinguish valid outputs from random bits also has a hard job forging.

A variety of MACs is built from classical block ciphers. Many are variants of either the sequential CMAC-like design or the parallelizable design of PMAC. Many of those constructions are limited in their security to the well-known birthday bound, which suffices in many cases. However, sometimes this is not the case, for example in applications where the primitive is small. To provide higher security, a single accumulator is not enough, at least not for stateless deterministic MACs. Instead, a second accumulator is then used, both in sequential constructions like 3kf9 and in parallel constructions like PMAC+.

Tweakable block ciphers are keyed families of permutations that take an additional public input called the tweak. They are useful not only for MACs. The additional public tweak can be used, for instance, as a domain separator or to process more message material per call. Various existing MACs already make good use of tweakable block ciphers. At ProvSec 2015, Naito proposed PMAC_TBC1k and PMAC_TBC3k. The primitive there ensured n-bit security by having different domains for every call. In 2017, Cogliati et al.
proposed four MACs in total, two among them based on tweakable block ciphers, called Hash-as-Tweak (HaT) and Nonce-as-Tweak (NaT); our focus, however, is on stateless deterministic constructions. Both use a universal hash function to first compress the message and then use a second hash of the message or the nonce as the tweak, and they also obtain n-bit security. At CRYPTO 2017, Iwata et al. proposed ZMAC, an also highly secure and even more efficient construction. Moreover, there are some hashes in authenticated encryption schemes that build on tweakable block ciphers.

But let's focus for the moment on ZMAC. The innovation of ZMAC lay in its internal tweakable block cipher-based hash function, called ZHASH. ZHASH processes the message input in both the state and the tweak in a nearly fully parallelizable manner. Therefore, ZMAC could be more efficient than previous constructions. The state is accumulated in two lanes and, at the end, is processed in a complex finalization that consists of two sums of two independent permutations each to obtain the final output. One problem for lightweight applications of ZMAC is that it needs to hold the tweak state, the message state, the masks, and the accumulators. This increases the overall number of variables and therefore the size of the state. Sequential MACs can be made lighter, which may be desirable for some applications. Our motivating question was therefore whether we can turn ZMAC into a more sequential construction that sacrifices neither the high security nor the high rate that ZMAC offers.

In our work, we revisited the structure of ZMAC and proposed DoveMAC. We arrived at a sequential tweakable block cipher-based design that accumulates the state in two lanes. We could obtain a security of min(n, (n + t)/2) bits. The hash function DoveHash processes n + t bits per call to the primitive, similar to ZMAC-like designs. Every block is processed in a single call to a tweakable block cipher, where, in each block, both lanes are used as the input.
The t-bit part is added to the top lane as the tweak input. The remaining n bits are XORed to the lower lane's input. The feed-forward ensures that the primitive output always affects both lanes. The pad function extends or truncates the n-bit output to t bits before the next t-bit value is added. At the end, our hash function needs a checksum to ensure high security, which adds all t-bit values of a message. Otherwise, the modification of a single tweak input could produce collisions at the birthday bound.

Our construction is an instance of Hash-as-Tweak by Cogliati et al., or of its generalization Hash-then-TBC. The finalization uses the (n + t)-bit output of the hash function and generates a random bit stream. For this purpose, it employs the same tweakable block cipher as the hash function under a distinct key. The construction is easily extendable to a variable-output PRF by adding a counter to every block to produce a longer bit stream. Moreover, we could easily derive a single-key version by reserving, for example, one more tweak domain bit or by fixing one bit of the tweak input.

In the following, we would like to give a brief overview of our proof idea. Our analysis consists of three parts and an easy initial step. First, we replace the primitives with ideal ones. Then, we reduce the construction to the fixed-output-length PRF security of Hash-then-TBC. For this purpose, we need two properties of the hash function: first, an upper bound on the collision probability among all messages, and second, a low probability of truncated collisions in X. In the following, we briefly sketch the ideas of those three steps.

Before that, we briefly recall our notions. By Coll, we mean the maximal probability of a collision among any two of q pairwise distinct messages of at most m blocks each and sigma blocks in total. By (t, epsilon)-truncated almost universality, we mean that the probability that two messages collide in the first t bits of their outputs is at most epsilon.
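To make the data flow of the hash and the finalization described above more tangible, here is a minimal Python sketch. It is not the DoveMAC specification: the toy tweakable permutation (a 4-round Feistel network built from SHA-256), the tiny sizes n = 16 and t = 24, the 10* message padding, and the exact wiring of the feed-forward and the checksum are all assumptions chosen for illustration, and the sketch makes no security claim. All function names are hypothetical.

```python
import hashlib

N_BITS = 16  # toy block size n (assumption; real TBCs use 64 or 128 bits)
T_BITS = 24  # toy tweak size t (assumption)


def toy_tbc(key: bytes, tweak: int, block: int) -> int:
    """Toy tweakable permutation on N_BITS bits: a 4-round Feistel network.

    Stand-in for a real tweakable block cipher; NOT secure.
    """
    half_bits = N_BITS // 2
    mask = (1 << half_bits) - 1
    left, right = block >> half_bits, block & mask
    for rnd in range(4):
        data = key + tweak.to_bytes(8, "big") + bytes([rnd, right])
        f_out = hashlib.sha256(data).digest()[0]  # 8-bit round output
        left, right = right, left ^ f_out
    return (left << half_bits) | right


def pad(value: int) -> int:
    """Extend or truncate an n-bit value to t bits."""
    if T_BITS >= N_BITS:
        return value  # zero-extension: same integer, viewed as t bits
    return value >> (N_BITS - T_BITS)  # keep the t most significant bits


def split_blocks(msg: bytes):
    """Split a message into (t-bit, n-bit) pairs after 10* padding (assumption)."""
    block_bytes = (T_BITS + N_BITS) // 8
    padded = msg + b"\x80"
    padded += b"\x00" * (-len(padded) % block_bytes)
    blocks = []
    for i in range(0, len(padded), block_bytes):
        chunk = int.from_bytes(padded[i:i + block_bytes], "big")
        blocks.append((chunk >> N_BITS, chunk & ((1 << N_BITS) - 1)))
    return blocks


def toy_hash(key: bytes, msg: bytes):
    """Sequential two-lane hash: one primitive call per (n + t)-bit block."""
    top, low, checksum = 0, 0, 0
    for t_part, n_part in split_blocks(msg):
        tweak = top ^ t_part        # t-bit part enters the top lane (tweak)
        inp = low ^ n_part          # n-bit part is XORed to the lower lane
        out = toy_tbc(key, tweak, inp)
        low = out                   # output feeds the lower lane ...
        top = tweak ^ pad(out)      # ... and, padded, the top lane (assumed wiring)
        checksum ^= t_part          # checksum over all t-bit parts
    return top ^ checksum, low


def toy_mac(hash_key: bytes, final_key: bytes, msg: bytes) -> int:
    """Hash-as-Tweak style finalization: hash output X as tweak, Y as input."""
    x, y = toy_hash(hash_key, msg)
    return toy_tbc(final_key, x, y)


if __name__ == "__main__":
    print(hex(toy_mac(b"hash-key", b"final-key", b"hello world")))
```

The sketch keeps only the two lanes and the checksum between calls, which is the property the talk emphasizes: no masks and no separate message or tweak state have to be stored.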
First, we replace the tweakable block ciphers with two independent ideal tweakable block ciphers over the same spaces. Next, we view our construction as an instance of Hash-then-TBC, so we can apply its security statement. For a single output block, Hash-then-TBC needs two security properties from the hash function, as said. One is a maximal collision probability of epsilon_1 for any pair of distinct messages. The other one is that the probability that any two among the messages collide in X has to be at most epsilon_2. Then, we have a relatively simple statement left. What remains is to show appropriate upper bounds for epsilon_1 and epsilon_2.

To upper bound the collision probability, we consider structure graphs. Those represent the message walks, where vertices are state values and edges are transitions from one state value to the subsequent one. Labels are the block values (T_i, M_i). A cycle represents a collision. We define a number of bad walks that represent the options in a single message to produce a collision. We distinguish between multiple cases depending on the size of the prefix and the size of the cycle. In the first case, we consider a structure graph that has a prefix and a cycle of length exactly one edge. In the second case, we consider a walk that has no prefix and again a single-edge cycle. In the third case, the walk has a prefix and a cycle of length at least two edges. Finally, we consider a bad structure graph that collides with the start. In the first two cases, we have m possible blocks that can lead to a single-edge cycle collision. Therefore, we have m as a factor. In the latter two cases, any two blocks can collide, so we have m choose 2 as a factor. In the former two cases, we have a single n-bit random variable that needs to be hit by an ideal permutation output.
Therefore, we have a denominator of 2^n - m, since the outputs have not collided before and are therefore sampled randomly from a set of at least this size. In the latter two cases, we have at least two independent random variables. Therefore, the probability for a collision is at most 1/(2^n - m)^2. The sum of all terms yields the following bound for bad walks.

We also have to study good walks. Those are walks of two distinct messages that do not collide with themselves, as the bad walks do, but instead produce a collision between them. Those messages can have a common prefix. Here we distinguish four cases by the length of their paths from the point where both walks deviate until they meet again. In the first case, one message has a diverging path of length at least three edges. In another case, one walk has a length of two edges and the other one of at least two edges, and the difference in their tweaks is non-zero afterwards. Another case matches the second case in the length of the diverging path but has a zero difference Delta T after the diverging path. And we have a case where the walk of one of the diverging messages has only a single edge until they meet again. In all cases, arbitrary blocks can collide. Therefore, we have m choose 2 as a factor everywhere. Moreover, in all cases we have two independent random variables that must be hit. Since no non-trivial collision has occurred before, they are sampled at random from sets of size at least 2^n - 2m. Since we have two random variables, we have (2^n - 2m)^2 in the denominator in all cases. Finally, since t may be smaller than n, we can have an additional factor in every term. This leads us to the following bounds for a collision in good walks. Summing up the collision probabilities from bad and good walks, we obtain the following upper bound.

We conducted a similar study for upper bounding the probability of collisions in only X.
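Before turning to collisions in X, the counting argument for full-state collisions can be collected in one place. The following LaTeX fragment (assuming the amsmath package) only restates the factors and denominators spoken above for a single message of m blocks; the exact bounds shown on the slides, including constants and the factor that appears when t < n, are not reproduced here.

```latex
% Bad walks: a single message collides with itself.
\begin{align*}
  \text{single-edge cycle, with or without prefix:} \quad
    & \frac{m}{2^n - m} \\
  \text{longer cycle, or collision with the start:} \quad
    & \binom{m}{2} \cdot \frac{1}{(2^n - m)^2}
\end{align*}
% Good walks: two distinct messages collide with each other.
% Two independent random variables must be hit, each sampled from a set
% of size at least 2^n - 2m, so every term has the denominator
% (2^n - 2m)^2.
```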
Again, we considered good and bad walks. We call walks bad that have a loop in a single message or a collision in the full state in some blocks between two messages M and M'. Clearly, non-trivial tweak and input collisions are also collisions in X, where non-trivial means that the two messages are not simply identical. Since we already considered such full-state collisions before, we can exclude them here and only give a term for correctness. For both cases of bad walks, arbitrary block indices i and j can collide with each other. Therefore, we have a factor of m choose 2 in both terms. In both cases, we also have a random variable that must be hit, where the new input is sampled at random from a set of at least 2^n - 2m values. Over all messages, this becomes sigma again, and we have the following upper bound for the probability of bad walks.

On the other hand, we also studied good walks. Here, two messages collide in X without any bad events. We distinguish between two cases: the difference in the checksum is either zero or not. If it is not equal to zero, then we need a certain output difference from the last primitive call. Since the input did not result from a collision, the probability to hit such a certain output difference is at most 1/(2^n - 2m). In the other case, the difference in the checksum is zero. Again, for good walks, the inputs of the final primitive call are fresh, and the outputs are chosen randomly and independently from a set of at least 2^n - 2m values. Since we truncate X after the output of the last primitive call, we can have a factor that is larger than 1 if t is smaller than n. We did not have that factor for the bad walks since there we considered a collision in the full primitive outputs. Here, we obtain a probability as shown here. Finally, we obtain the following bound for the truncated almost universality of our construction.
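As a compact restatement of the terms named above for collisions in X, and not of the final bound on the slide, the per-message pieces can be written as follows (LaTeX, assuming amsmath). The zero-difference case and the truncation factor for t < n are left out because the talk does not spell them out.

```latex
\begin{align*}
  \text{bad walks (each of the two cases):} \quad
    & \binom{m}{2} \cdot \frac{1}{2^n - 2m} \\
  \text{good walks, non-zero checksum difference:} \quad
    & \frac{1}{2^n - 2m}
\end{align*}
```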
Then, in sum, we can add all terms to obtain the following bound on the PRF security of DoveMAC with ideal primitives. This corresponds to min(n, (n + t)/2) bits of security.

We also implemented our scheme. As for the original ZMAC, it is beneficial to use an efficient tweakable block cipher. Moreover, a long tweak input is good when the tweak schedule is efficient. Deoxys-BC or SKINNY appeared as natural choices of a primitive. We implemented our construction with SKINNY-64-128 on two widespread AVR microcontrollers: an ATmega2560 and an ATmega328P. We chose an existing implementation of SKINNY for AVR as a basis and adapted it for our MAC. Since our motivating construction for comparison was ZMAC, we also implemented ZMAC1 with the same variant of SKINNY, that is, the variant by Naito from 2018, which also replaced the more complex original finalization with Hash-then-TBC. Our goal was to have lower state sizes in the end. We used an existing approach of filling the stack and RAM with random values in advance and checking how many of them were changed to identify how much RAM was used. While we did not achieve the smallest possible footprint of fully space-optimized implementations, we could reduce the space for the masks and the accumulators. More optimizations are certainly possible, for example, by avoiding function calls and inlining the code instead. We tried to keep both implementations comparable. We also obtained a slightly faster implementation for DoveMAC; however, this was not really our goal. Our code is publicly available and can be compared and optimized further.

In conclusion, we proposed DoveMAC, a sequential variant of ZMAC with a high rate of n + t bits per primitive call. It uses both the message and the tweak input to process the message material. We could achieve a high security of min(n, (n + t)/2) bits without any nonces or random values.
DoveMAC could be easily extended to variable output lengths; however, this is not an innovation here. We used two keys, but fixing a single bit or using a single tweak bit for domain separation could yield a single-key variant.

There are certain limitations of our design that we do not want to hide. The checksum at the end is necessary to obtain high security. We propose in our work that it is possible to add the value of the checksum as a final message block. However, we note that this has to be done with caution since tampering with this value can reduce the security to only the birthday bound.

Since our work, we observed that simpler and smaller high-rate schemes have appeared, for example, ZOCB and ZOTR by Bao et al., Romulus by Iwata et al., and, very recently, AET-LR by Guo et al. The NIST lightweight competition has motivated some of them. They also exploit both the message and the tweak input to process much message material in a single primitive call. What matters here is how their designs process associated data, not how they encrypt, since they are authenticated encryption schemes. Here we sketch in an abstract way how they all authenticate. Their designs are smaller than our proposal, which is actually great. Nevertheless, we stress that they need nonces. Otherwise, the security is always limited by the birthday bound. Much of the hassle and the larger size of our construction compared to those stems from the fact that we explicitly targeted a stateless deterministic design, that is, a nonce-free and randomness-free design with beyond-birthday-bound security. Therefore, we fully acknowledge their efforts and let designers choose the way they prefer.

This is our proposal so far, and we thank you for your attention.