Hello, my name is Mustafa Khairallah, and today I will be presenting the paper "Security of COFB against Chosen Ciphertext Attacks". First, a short introduction to COFB. COFB was designed by Chakraborti, Iwata, Minematsu, and Nandi in 2017. The abbreviation COFB stands for Combined Feedback mode, and it is the basis for GIFT-COFB, which is a finalist in the NIST Lightweight Cryptography standardization project. The encryption algorithm works as follows. First, it takes a nonce and encrypts it to generate the initial state, and also to generate a half-block mask L. After that, the associated data is absorbed block by block through the linear function ρ; each time, the internal state is masked with a version of the mask L multiplied by a different constant, and the block cipher encryption is applied. After all the associated data blocks have been absorbed, we start encrypting the message blocks: we encrypt the internal state, then absorb the message using the same linear function ρ, but this time it outputs two blocks, one of which updates the internal state while the other is a ciphertext block. Then, again, we mask the internal state in a similar way using the mask L. After all the message blocks have been encrypted, we encrypt the internal state one more time to generate the authentication tag. In the original publication at CHES 2017, the authors claimed security as long as the number of forgery attempts, or decryption queries, is less than 2^(n/2)/(n/2), and the number of encrypted and decrypted blocks is less than 2^(n/2). In practice, these numbers are 2^58 and 2^64. The forgery bound looked as shown here, and we can see that it is a birthday bound in terms of the encrypted blocks and the decrypted blocks here, and it also has a logarithmic term here, n, which is related to the number of forgery attempts. Here we assume that all the block cipher calls have been replaced by a random function.
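The data flow just described can be sketched in Python. This is only an illustrative toy, not the real COFB or GIFT-COFB specification: a hash-based function stands in for the block cipher, a byte rotation stands in for both the GF(2^64) mask doubling and COFB's linear map G, and the masking positions follow the verbal description only loosely.

```python
import hashlib

NBYTES = 16  # block size in bytes (n = 128 bits)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def E(key, block):
    # Stand-in for the block cipher E_K (hash-based); illustrative only.
    return hashlib.sha256(key + block).digest()[:NBYTES]

def dbl(mask):
    # Placeholder for multiplying the half-block mask L by a constant;
    # real COFB doubles L in GF(2^64), here we just rotate bytes.
    return mask[1:] + mask[:1]

def G(y):
    # Placeholder for COFB's linear map G (also a byte rotation here).
    return y[1:] + y[:1]

def rho(state, m):
    # Feedback function rho: ciphertext block C = Y xor M,
    # next state = G(Y) xor M.
    c = xor(state, m)
    new_state = xor(G(state), m)
    return new_state, c

def cofb_like_encrypt(key, nonce, ad_blocks, msg_blocks):
    state = E(key, nonce)          # nonce -> initial state
    L = state[:NBYTES // 2]        # half-block mask L
    half_zero = bytes(NBYTES // 2)
    for a in ad_blocks:            # absorb AD block by block
        L = dbl(L)                 # fresh constant multiple of L per block
        state = E(key, xor(xor(state, a), L + half_zero))
    cts = []
    for m in msg_blocks:           # encrypt message blocks
        state, c = rho(state, m)   # rho outputs next state and a ct block
        cts.append(c)
        L = dbl(L)
        state = E(key, xor(state, L + half_zero))
    tag = E(key, state)            # one final call gives the tag
    return cts, tag
```

The sketch does reproduce one structural property that matters later in the talk: under the same nonce and associated data, equal first message blocks give equal first ciphertext blocks, and different first message blocks give different ones.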
Later, the authors presented an extended paper in the Journal of Cryptology in 2020, where they improved the security bounds: the birthday bound in terms of the encrypted blocks and the decrypted blocks still exists, but there is no longer a logarithmic term. As part of the NIST Lightweight Cryptography effort, the designers of GIFT-COFB presented a bound similar to the original one, which still had a logarithmic term; it differs in some parts, but the overall conclusion is similar to the CHES 2017 paper. A few observations that motivate this work relate to schemes that are secure up to the birthday bound. Many such schemes have a security bound of the form σ²/2^n, and this bound is close to 2^(-n) when σ is small. On the other hand, schemes with a security bound of the form σ/2^(n/2) are less secure, because the bound is higher when σ is small, and they usually have (n/2)-bit tags. COFB has a security bound of this form, yet its tag size is n bits. Moreover, the Journal of Cryptology 2020 security bound improves the security to σ²/2^n. This raises the question: can this bound also be adopted for GIFT-COFB, and is the improvement even correct? So we asked two research questions. First, can we break COFB with only 2^(n/2)/(n/2) forgeries, or at least break it with 2^(n/2) forgeries and a negligible number of encryption queries? Second, since COFB has a bound that looks like that of schemes with (n/2)-bit tags, can we show that COFB behaves similarly to schemes with short tags? Normally, in authenticated encryption security models, once a forgery happens, that is already considered a success for the adversary and the security game is stopped.
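The gap between the two bound shapes can be checked numerically; the quick Python sketch below (with n = 128 and a few illustrative values of σ) shows that σ/2^(n/2) is the larger, i.e. weaker, bound for every small σ:

```python
from math import log2

n = 128
for log_sigma in (16, 32, 48):
    sigma = 2.0 ** log_sigma
    birthday = sigma ** 2 / 2.0 ** n   # bound of the form sigma^2 / 2^n
    half_n = sigma / 2.0 ** (n / 2)    # bound of the form sigma  / 2^(n/2)
    print(f"sigma = 2^{log_sigma}: "
          f"sigma^2/2^n = 2^{log2(birthday):.0f}, "
          f"sigma/2^(n/2) = 2^{log2(half_n):.0f}")
```

For example, at σ = 2^32 the first bound is 2^(-64) while the second is only 2^(-32); the two shapes meet only at σ = 2^(n/2).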
But in the next section, we will look at a sort of misuse scenario where we allow forgeries to happen but still require the adversary to break privacy, so just getting a successful forgery is not enough. In order to do so, we have to slightly modify the standard authenticated encryption definition. In this case, encryption is similar to an ideal-world versus real-world scenario, where the encryption oracle chooses whether to output a correct encryption or a string of random bits, depending on a bit b. Ideally, we would have the decryption oracle always output an error; but here the decryption oracle always performs the correct decryption. This allows the adversary to obtain successful forgeries, but these forgeries are not counted as wins unless the adversary can still guess the bit b. We can see that this is relevant because not every forgery helps break such a security game. For example, consider a simple forgery adversary that encrypts a message, changes one bit of the ciphertext, and tries to guess the new tag. Such a forgery will eventually succeed, but it will not lead to a break in this security model, because in the nonce-respecting case this nonce has already been used in an encryption query and cannot be used to generate new encryptions. Why does this matter? It matters because we usually have two security claims for an authenticated encryption scheme: an integrity-of-ciphertexts claim and an indistinguishability claim against chosen-plaintext adversaries. In practice, the security levels of these two claims may differ, one being higher than the other. Say we have an integrity claim that is very low compared to the indistinguishability claim.
If we can use the forgeries to mount an indistinguishability attack against chosen-ciphertext adversaries, this affects privacy in practice, even though the indistinguishability against chosen-plaintext adversaries is still high. This happens because in the chosen-plaintext scenario, the number of decryption queries, or forgery attempts, is irrelevant. Such an attack was presented informally on the NIST Lightweight Cryptography forum by Alexandre Mège in 2019: take a scheme with a short tag, say only a 64-bit tag. You start decrypting random ciphertexts with a nonce that was never queried before during encryption, and at some point you will guess the tag correctly. Once the tag is guessed, this leaks some information about the keystream bits associated with this nonce. At a later time, the adversary observes a message encrypted with the same nonce, which allows them to partially decrypt it. Here we describe a class of nonce-respecting authenticated encryption schemes where, for two queries with associated data of equal length and messages of equal length but different content, encrypting both queries with the same nonce has the following property: if the first blocks of the two messages are equal, then the first blocks of the two ciphertexts are also equal, and if the first message blocks differ, then the first ciphertext blocks also differ. Given schemes of this type, we can describe an attack similar to the one just described: the adversary tries to decrypt a random ciphertext many times until the tag is guessed correctly. Once that is done, they ask for the encryption of the decrypted message after modifying some part of it beyond the first block, and they observe whether the first ciphertext block is equal or not.
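The short-tag attack just described can be demonstrated end to end against a toy stream-cipher-style AE scheme. Everything here is illustrative and not any real scheme: the keystream and tag are hash-based, and the tag length τ is shrunk to one byte so the brute-force loop finishes instantly.

```python
import hashlib
import itertools
import os

TAU = 1  # tag length in bytes; tiny here so the demo runs fast

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def keystream(key, nonce, n):
    # Toy keystream derivation; stands in for any stream-like AE scheme.
    return hashlib.sha256(key + nonce).digest()[:n]

def enc(key, nonce, m):
    c = xor(m, keystream(key, nonce, len(m)))
    tag = hashlib.sha256(key + nonce + c).digest()[:TAU]
    return c, tag

def dec(key, nonce, c, tag):
    # Decryption oracle: returns the plaintext only if the tag verifies.
    if hashlib.sha256(key + nonce + c).digest()[:TAU] != tag:
        return None
    return xor(c, keystream(key, nonce, len(c)))

key = os.urandom(16)
nonce = b'never-encrypted!'   # fresh nonce, never used by the honest user
c_star = os.urandom(16)       # random ciphertext to forge on

# Phase 1: brute-force the short tag using decryption queries only.
m_star = None
for t in itertools.product(range(256), repeat=TAU):
    m_star = dec(key, nonce, c_star, bytes(t))
    if m_star is not None:
        break

# The successful forgery leaks the keystream for this nonce.
leaked = xor(c_star, m_star)

# Phase 2: later, a ciphertext under the same nonce can be read.
c2, _ = enc(key, nonce, b'secret plaintext')
recovered = xor(c2, leaked)
print(recovered)  # -> b'secret plaintext'
```

With a real τ-bit tag the first phase costs about 2^τ decryption queries, which is exactly the complexity quoted next.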
This breaks the privacy of the scheme with about 2^τ decryptions, where τ is the tag size, and only one encryption. If we were to apply this attack to COFB directly, it would require 2^τ decryptions with τ = 128 bits. However, the security bound of COFB looks like that of a scheme with a 64-bit tag. So the question is: can we find a similar attack on COFB that requires only 2^64 decryptions? First, we show in the paper some forgery attacks on COFB. However, all three of these forgery attacks fall into the category of forgeries that cannot be used for such indistinguishability attacks. Here we show one example, the best performing of the three. In this attack, the adversary selects a random message and a random nonce and encrypts them to get a ciphertext and a tag. Then they try to guess the value of the mask. Once they guess the mask, they can apply a linear transformation and generate a forgery that decrypts successfully. However, because this forgery is based on a nonce that has already been used, the information it yields cannot be used to ask for another encryption query. This attack requires one encryption and about 2^(n/2) decryptions, and it has a success probability of q_d/2^(n/2). While this attack does not lead to indistinguishability problems, it is important because it contradicts the security proof in the Journal of Cryptology 2020 paper, so it shows an error in that security bound. So in the paper, we give a fourth attack, which also breaks privacy, with two encryption queries. Let us recall how COFB works in this diagram. If we have a known-plaintext encryption query, then we know the blocks of M and also the blocks of C. This gives us partial information about the internal state. From, say, M2 and C2, we can get the full output of the block cipher call before that point.
And from M1 and C1, we can get half the input to that block cipher call. To get the full input and output of the block cipher call, we have to guess the mask, and this will be the basis of the attack. So first, we make one known-plaintext encryption query and select one block cipher call. We get all but half of its inputs and outputs, and we use this information to construct the forgery. In the first phase of the attack, we initialize N with a random string and M with a random string, and we ask for the encryption of N and M. We initialize our forgery message to ⊥ and our mask guess to the all-zero (n/2)-bit string. In the second phase of the attack, we set a variable v equal to the tag of the encryption query, and from the message block and the corresponding ciphertext block we set p this way, where L is our guess for the mask. We set the mask for the forgery to the first n/2 bits of v. Then we assign the nonce for the forgery in this way, and the tag for the forgery is equal to v. We select the associated data for the forgery and the ciphertext for the forgery in this way, and then we ask for the decryption. If the decryption is successful, this loop stops; if not, we update our guess for the mask and perform the operation one more time. In the third phase, after the forgery has succeeded, we extend the decrypted message with n random bits, and then we ask for an encryption using the nonce we used in the forgery; most likely, this nonce has never been used in an encryption query before. We then compare the first n bits of the resulting ciphertext with the ciphertext of the forgery, and based on that we can tell whether the oracle is outputting random bits or real encryptions. If we look at this main loop, we see that its success depends on guessing L correctly, because everything else, m, c, t, is known.
So its success probability will depend on guessing L correctly. There is also a small detail here: this zero vector that is appended to N. Ideally, the nonce length is equal to n, so this zero vector is not included; it appears here for correctness in cases where the nonce length is not equal to n. If the nonce length is not equal to n, the attack may not succeed directly, and more encryption queries are required to make it succeed. This extension of the attack is described in the paper, and you can refer to it if you are interested. Here we are talking about the case where the nonce length equals n, so this vector has length zero and the nonce for the forgery is simply p. If we guess L correctly, the attack succeeds. The probability of guessing L correctly is 1/2^(n/2), which means that in practice, after 2^64 forgery attempts, we will have guessed L correctly. Some comments about this attack. As mentioned, the attack is optimal when the nonce size equals n; if the nonce size is less than n, more encryption queries are needed. For the NIST version of GIFT-COFB, the nonce size is 128 bits. The attack works similarly to the attack on schemes with short tags that we described earlier. That attack, of course, targets schemes with short tags and requires only the decryption queries and one final encryption query, while our attack requires one extra encryption query at the beginning, so two encryption queries in total. Also, the structure and number of possible forgeries that can be generated by each attack are different, and the complexity here depends on the nonce size and the mask size, not on the tag size. Nevertheless, for sensitive applications, we can conclude from this attack that COFB offers little security beyond that of schemes with (n/2)-bit tags. In other words, practical implementations do not lose much by truncating the tag of COFB to 64 bits.
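The expected cost of the main loop can be illustrated with a scaled-down experiment: guess an unknown (n/2)-bit mask by sequential trial, counting one decryption query per guess. Here n/2 is shrunk from 64 to 12 bits (a hypothetical toy parameter, purely so the loop finishes quickly); each attempt succeeds with probability 1/2^(n/2), so on average about half the mask space is searched.

```python
import random

HALF_BITS = 12   # scaled-down n/2; the real attack has n/2 = 64
TRIALS = 200
random.seed(2022)

total = 0
for _ in range(TRIALS):
    L = random.randrange(2 ** HALF_BITS)    # the unknown mask
    attempts = 0
    for guess in range(2 ** HALF_BITS):     # one forgery attempt per guess
        attempts += 1
        if guess == L:                      # the forgery verifies: mask found
            break
    total += attempts

# Average attempts is close to 2^(HALF_BITS - 1); for the real mask size
# this scales to about 2^63 attempts, i.e. roughly 2^64 decryption queries.
print(total / TRIALS)
```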
To conclude this presentation: we have shown that COFB can be forged with about 2^(n/2) attempts and one encryption query. We have shown that the tag size of COFB offers little advantage beyond n/2 bits. And we have shown that the attempt in the Journal of Cryptology 2020 paper to improve the security bound is inaccurate. In related work, Inoue and Minematsu showed that a forgery can also be done with 2^(n/2) encryptions and one forgery attempt, and later they showed that privacy can also be broken with 2^(n/2) encryptions and zero decryptions. One remaining open problem in the security analysis of COFB is improving the logarithmic factor in the security bound; none of these attacks takes advantage of this logarithmic factor. In the paper, we discuss why this factor is likely artificial, but we leave proving a better bound out of the scope of this paper. Thank you for watching this video, and I hope to see you at the conference if you have any questions.