This talk is about the security analysis of CTR-DRBG, a NIST standard for random number generation. I am Viet Tung Hoang, and this is joint work with Yaobin Shen. As I mentioned, CTR-DRBG is a random number generator, or RNG for short. Syntactically, an RNG is a stateful algorithm that keeps refreshing its state via inputs of high min-entropy and provides pseudorandom outputs upon request. The RNG that we will discuss today is CTR-DRBG, which is the most popular standardized RNG. It has been used in numerous cryptographic libraries and operating systems, such as OpenSSL and Windows 10. The security of these systems crucially depends on CTR-DRBG.

Despite the importance of CTR-DRBG, for a long time there were very few papers analyzing this scheme. These limited analyses, however, consider only some simplified components of CTR-DRBG and therefore cannot support the security claims in the standard's documents. In 2019, Woodage and Shumow pointed out that some options in the overly flexible specification of CTR-DRBG might be exploitable. Shortly after, their theoretical observation was confirmed by the work of Cohney et al., which gives a side-channel attack on real-world implementations of CTR-DRBG. The lessons learned from these two papers are as follows. First, one should deprecate the insecure options in CTR-DRBG. Next, if you are a developer, you should be mindful of misuses such as using leaky table-based AES, failing to refresh the state periodically, or using low-entropy inputs. Still, there remains a big question: if we adopt all these recommendations, is CTR-DRBG secure? Our work gives an affirmative answer to this question.

Before we get into the technical details, let's define what we mean by security. Our target is the robustness notion of Dodis et al., which is the standard goal for RNGs. Informally, an RNG should still provide security even in the face of state compromise or adversarial inputs.
In particular, we will consider a pair of adversaries, A and D. The sampler D generates the inputs; the adversary A tries to compromise the state and also to distinguish the outputs from truly random strings. If the RNG is based on an ideal primitive π, we give A oracle access to π. The sampler D, however, doesn't have access to π. In other words, the RNG has a huge seed, namely the encoding of the ideal primitive π. In addition to π, the adversary A is also given other oracles. For example, it can get the state of the RNG, or it can set the state of the RNG to a value of its choice, or it can force the RNG to refresh the state. The adversary, however, doesn't control the random inputs. Finally, the adversary A is given a real-or-random oracle that provides either the real outputs of the RNG or random strings of the same length. This oracle, however, will provide real outputs if not enough min-entropy has been supplied since the last state compromise. We define the advantage of the adversaries A and D as the normalized probability that A can guess correctly whether it receives real outputs of the RNG or random strings.

Let me now give you a bird's-eye view of CTR-DRBG. This construction is based on a randomness extractor that we call Condense-then-Encrypt, or CtE. CTR-DRBG consists of three algorithms: one to derive the initial state, another to refresh the state, and yet another to generate outputs. The state of CTR-DRBG consists of a key K and an IV V for AES. For example, if we want to derive the initial state from an input I, we apply CtE on I, and then use CTR mode under a zero key and a zero IV to encrypt the extracted string. The resulting ciphertext is then parsed into K and V accordingly. If we want to generate an output, we use CTR mode to encrypt a zero string under the key K and the IV V from the state. The resulting ciphertext is parsed into an output R and an updated state K and V accordingly.
Our result bounds the robustness advantage of CTR-DRBG via two terms. The first term measures the pseudorandomness quality of the outputs produced by CTR mode. The second term measures how well we extract randomness from CtE. Here's our bound for the pseudorandomness quality of the outputs produced by CTR mode. In particular, because we use CTR mode on many keys, the first term bounds the multi-user PRF advantage of CTR mode, and the second term bounds the advantage of guessing one of the keys of CTR mode via at most q attempts. To bound the quality of extracting randomness from CtE, we use the generalized leftover hash lemma of Barak et al. In particular, if each random input has λ bits of conditional min-entropy given the other inputs, we recommend λ to be at least 216.

For the rest of the talk, I will elaborate more on randomness extraction in CTR-DRBG, namely the CtE construction. This picture is a blueprint of CtE. In particular, given an input I, it adds some prefix-free encoding and padding. It then iterates CBC-MAC up to three times with different constant IVs. The resulting output is then parsed into a key and an IV for AES. We use this key and IV to encrypt a zero string under CBC mode, and the ciphertext is the extracted output string.

As mentioned earlier, CtE uses CBC-MAC as a building block. Conventionally, CBC-MAC uses the zero IV, but in CtE the IV is nonzero. A classical guide for using CBC-MAC as a randomness extractor comes from the work of Dodis et al. In particular, if you want pseudorandom outputs, each input should have high conditional min-entropy given the past inputs. This recommendation is, however, violated in CtE. In particular, CtE uses CBC-MAC on essentially the same input multiple times. This is one of the biggest challenges in analyzing the security of CtE as a randomness extractor.
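To make the CtE blueprint concrete, here is a minimal Python sketch of the condense and encrypt steps. Several details are assumptions made for illustration only: `block_fn` is a hash-based stand-in for AES, the encoding in `encode` is a hypothetical prefix-free format (a length header, the data, a marker byte, zero padding), the CBC-MAC key is a fixed public constant, and the constant IVs are simply the integers 1, 2, and so on. The standard's exact byte layout differs.

```python
import hashlib

BLOCK = 16   # block length in bytes (the AES block size)
KEYLEN = 16  # key length in bytes (AES-128 is assumed for illustration)

def block_fn(key: bytes, block: bytes) -> bytes:
    """Stand-in for one AES block encryption. This is NOT real AES."""
    return hashlib.sha256(key + block).digest()[:BLOCK]

def encode(data: bytes, out_len: int) -> bytes:
    """Hypothetical prefix-free encoding: lengths, data, 0x80 marker, zero padding."""
    s = len(data).to_bytes(4, "big") + out_len.to_bytes(4, "big") + data + b"\x80"
    return s + bytes(-len(s) % BLOCK)

def cbc_mac(key: bytes, iv: bytes, msg: bytes) -> bytes:
    """CBC-MAC with an explicit initial chaining value iv."""
    chain = iv
    for i in range(0, len(msg), BLOCK):
        block = msg[i:i + BLOCK]
        chain = block_fn(key, bytes(c ^ b for c, b in zip(chain, block)))
    return chain

def cte(data: bytes, out_len: int = KEYLEN + BLOCK) -> bytes:
    s = encode(data, out_len)
    mac_key = bytes(range(KEYLEN))              # assumed fixed public key
    n_iters = -(-(KEYLEN + BLOCK) // BLOCK)     # 2 for AES-128, 3 for AES-256
    # Condense: CBC-MAC the same encoded input under distinct nonzero constant IVs.
    temp = b"".join(
        cbc_mac(mac_key, i.to_bytes(BLOCK, "big"), s)
        for i in range(1, n_iters + 1)
    )
    key, x = temp[:KEYLEN], temp[KEYLEN:KEYLEN + BLOCK]
    # Encrypt: CBC mode on an all-zero string, so each block is E_key(previous block).
    out = b""
    while len(out) < out_len:
        x = block_fn(key, x)
        out += x
    return out[:out_len]
```

Note that the condense step hashes the same string `s` in every iteration, which is exactly the repeated use of CBC-MAC on essentially the same input discussed above.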
To get around this obstacle, we observe that the outputs of CtE are used to derive a key K and an IV V for the counter mode. Therefore, if we model AES as an ideal cipher, these outputs only need to be unpredictable instead of being pseudorandom. As a result, we can circumvent the requirements in the classic work of Dodis et al. As an added plus, because we only need the outputs of CtE to be unpredictable, we can reduce the min-entropy requirement from 280 bits to 216 bits.

Before we analyze the unpredictability of CtE outputs, let's define an unpredictability notion for a keyed hash function H. In this game, we first sample a random input I of λ bits of min-entropy. We then generate a key K uniformly at random, independent of I. We then hash I under the key K to produce an output Z. We define the guessing advantage of an adversary A against H via q queries as the probability that A can guess Z within q guesses.

Analyzing the unpredictability of CtE is rather tricky. Let's begin with an intuitive approach. In particular, we first show that CBC-MAC is an almost-universal (AU) hash function. That is, if we pick two distinct strings X and Y and sample a uniformly random permutation π, then hashing X and Y under CBC-MAC with π is unlikely to result in a collision. Next, we employ the generalized leftover hash lemma of Barak et al., which essentially says that any good AU hash function is also a good randomness extractor for unpredictability applications.

As I mentioned earlier, we want the collision probability of CBC-MAC to be small, but how small is enough? If we use the classic analysis in the work of Dodis et al., it gives a good collision bound, but only for X and Y of the same length. If we use this analysis in the context of CtE, we are effectively assuming that the length of each random input is leaked to the adversary A before it makes its guesses. The length, however, is part of the entropy of the inputs.
This means that we would be wasting entropy of the random inputs, which is undesirable. Alternatively, we can use the analyses in the work of Bellare, Pietrzak, and Rogaway, or of Jha and Nandi. These analyses can handle arbitrary X and Y, but the resulting bound is inferior for our purpose.

Let's now try to find a different way to give an unpredictability bound for CtE. As shown in this picture, the output of CtE is the ciphertext of CBC encryption. In order to predict this ciphertext, one needs to guess both the key and the IV of CBC. In our failed attempt, we only required the adversary to predict the key, and as a result the bound is poor. We can have a better bound if we require the adversary to guess both the key and the IV of CBC.

Realizing this observation translates to a multi-collision property of CBC-MAC. In particular, we need to show that if we pick two distinct strings X and Y, sample a random permutation π, and hash X and Y under CBC-MAC with π under two different IVs, IV_0 and IV_1, then the chance that we have a double collision is small. In our work, we show that the chance of this multi-collision is at most 64 L^3 / 2^256, where L is the maximum block length of X and Y. This multi-collision property then implies that it is hard to predict both the key and the IV of CBC encryption inside CtE. As a result, CtE is a good AU hash function, and we can prove that without squandering entropy of the random inputs. Finally, we employ the generalized leftover hash lemma to show that CtE is a good randomness extractor for unpredictability applications. Here is the bound for the guessing advantage of an adversary against CtE with q guesses.

In conclusion, our work shows that if you adopt the recommendations in the work of Woodage and Shumow and of Cohney et al., then CTR-DRBG is robust. Moreover, our work also sheds some light on the design of CTR-DRBG. In particular, this construction looks quite cumbersome at first.
However, underneath that, it contains very neat design ideas for getting around the limitations of using CBC-MAC to extract randomness.
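For a sense of scale, the multi-collision bound quoted in the talk, 64 L^3 / 2^256, can be evaluated in log form. The short sketch below only does this arithmetic; the function name and the sample value of L are hypothetical choices for illustration, not parameters taken from the talk.

```python
import math

def log2_multicollision_bound(max_blocks: int, block_bits: int = 128) -> float:
    """Return log2 of 64 * L^3 / 2^(2n), with L = max_blocks and n = block_bits."""
    return math.log2(64) + 3 * math.log2(max_blocks) - 2 * block_bits

# Hypothetical example: inputs of at most L = 2^10 blocks.
exponent = log2_multicollision_bound(2 ** 10)
```

With L = 2^10 the exponent is 6 + 30 - 256 = -220, so the double-collision probability is at most 2^-220 in that case.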