Hi, welcome to my talk on super-efficient entropy accumulation. My name is Zhiye Xie. I'm a first-year PhD student at New York University Shanghai, and this is joint work with Yevgeniy Dodis, Siyao Guo, and Noah Stephens-Davidowitz. Our work studies real-world random number generators. In 2019, Microsoft published a white paper on Windows 10's random number generation infrastructure. In their design, two steps are needed in order to get random bits. The first step is entropy accumulation: multiple low-entropy sources are gathered together to form high-entropy states. Then, in the second step, called randomness extraction, those high-entropy states are sent to a cryptographic hash function to generate random bits. In our work, we have modeled the entropy accumulation procedure of real-world random number generators and built the theoretical foundations for it. Prior to our work, entropy accumulation was often modeled by iterated hashing with a cryptographic hash function. However, this is not realistic in the context of real-world random number generators, because many practical entropy sources, such as interrupt timings, arrive at a very rapid pace but each carries relatively low entropy. So, running a cryptographic hash function every time we receive an input is too expensive. We would like to know the best way to design extremely fast and practical entropy accumulation procedures that accumulate entropy as quickly as possible. To answer this question, it is helpful to see what is typically done in practice. Windows 10 implements the following rotate-then-XOR procedure. For example, on a 32-bit CPU, a register R of 32-bit length is used to accumulate entropy. The entropy source X, such as an interrupt timing, is also a 32-bit string. To update the register R, one cyclically rotates the bits of R by a fixed rotation number, 5, and then XORs the input X into the result. In total, 32 entropy samples are accumulated into R.
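The rotate-then-XOR update just described can be sketched in a few lines of Python. This is an illustrative model only, not Microsoft's actual implementation: the function names `rotl` and `accumulate`, the zero initial register, and the choice of a left rotation are assumptions of this sketch.

```python
def rotl(r: int, alpha: int, n: int) -> int:
    """Cyclically rotate the n-bit value r left by alpha positions."""
    mask = (1 << n) - 1
    return ((r << alpha) | (r >> (n - alpha))) & mask

def accumulate(samples, alpha: int, n: int, r: int = 0) -> int:
    """Fold each n-bit sample into the register: R <- rotl(R, alpha) XOR X.

    On a 32-bit CPU the parameters would be n = 32, alpha = 5, with 32
    samples; on a 64-bit CPU, n = 64, alpha = 19, with 64 samples.
    """
    for x in samples:
        r = rotl(r, alpha, n) ^ x
    return r
```

For instance, `accumulate(interrupt_timings, 5, 32)` would model one round of accumulation on a 32-bit machine, where `interrupt_timings` is a list of 32 timing samples.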
Similarly, on a 64-bit CPU, R receives 64 entropy samples and the rotation number is 19. While this design appears reasonable, it raises several questions that I would like to answer. The first question is: can rotation indeed accumulate entropy? In other words, given the rotate-then-XOR procedure, after receiving 32 or 64 samples, we really want to know whether the register R will converge to a high-entropy state. If rotation works, then we want to know how Microsoft selected those mysterious rotation numbers alpha. In particular, are Microsoft's choices of rotation numbers, 5 for the 32-bit CPU and 19 for the 64-bit CPU, reasonable? Finally, is rotation really the best way to accumulate entropy? Can rotation be replaced by a better permutation? To start answering these questions, the first priority is to model the input sources. According to the white paper, interrupt timings are the primary entropy sources of Windows 10. So, as the first modeling assumption, we will assume interrupt timings are independent. While this might not be entirely accurate in practice, we believe it captures a key feature of interrupt timings, which do not appear fully adversarial. Second, to minimize the number of parameters and focus on the high-level picture, we will assume that the entropy of each independent sample is lower bounded by some parameter k. The key point is that our entropy accumulation procedure doesn't know the input source in advance. So, our goal is to find a rotation number alpha whose corresponding entropy accumulation procedure works as well as possible for all possible k. Finally, we will further restrict each sample to come from some family of distributions. In our paper, we give a counterexample showing that the rotate-then-XOR procedure is too simple to work for arbitrary distributions of entropy k.
So, we must choose a proper family of distributions that both includes most natural distributions and captures the features of interrupt timings. Notice that the lower-order bits of interrupt timings change more rapidly than the higher-order bits. So, the intuitive requirement is that these distributions should have most of their entropy in the lower-order bits. Accordingly, we can define a very wide class of distributions which we call 2-monotone. 2-monotone distributions are n-bit distributions whose probability mass function has at most one peak. This is a large class, and it includes discrete Gaussian distributions, discrete exponential distributions, and uniform distributions over the first k bits. These three natural distributions are often used to model interrupt timings. The simplest 2-monotone distribution is the uniform distribution over the first k bits, where the k least significant bits are uniformly random and the remaining n-k bits are fixed. It is obvious that this distribution has most of its entropy in the lower-order bits. More surprisingly, our Main Lemma shows that any 2-monotone distribution does in fact have most of its entropy in the lower-order bits. To summarize, we will instantiate our family of distributions to be all 2-monotone distributions on n bits having entropy at least k, and we will allow arbitrary independent but not necessarily identical samples from the family. With entropy sources being modeled properly, we are now prepared to present our results. First of all, since any 2-monotone distribution has most of its entropy in the lower-order bits, we can use this property to show that any rotation on n bits with rotation number alpha coprime to n can accumulate nearly n bits of entropy within n steps. Although we typically think of rotation by one as the worst condenser, our result implies that rotation by one can also condense 2-monotone distributions to linear entropy within n steps.
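The "at most one peak" condition defining 2-monotone distributions can be made concrete with a short check. This is an illustrative sketch, assuming the PMF is given as a list of probabilities over the n-bit domain; the name `is_2_monotone` is mine, not notation from the paper.

```python
def is_2_monotone(pmf) -> bool:
    """Check that a probability mass function (a list of probabilities over
    {0, ..., N-1}) is unimodal: nondecreasing up to some peak, then
    nonincreasing, i.e. it has at most one peak."""
    seen_decrease = False
    for a, b in zip(pmf, pmf[1:]):
        if b < a:
            seen_decrease = True
        elif b > a and seen_decrease:
            return False  # the PMF rose again after decreasing: a second peak
    return True
```

For example, the uniform distribution over the first k bits, a flat block of mass followed by zeros, passes this check, as do discretized Gaussian and exponential shapes.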
What's more, this result immediately generalizes to cyclic permutations, so that any cyclic permutation can condense to linear entropy within n steps. The first result justifies the use of rotation, but only if we are willing to wait n steps. It fails to distinguish between different rotation numbers alpha. One example is rotation number alpha equal to one: even if the input already has very high entropy, we still need roughly n steps in order to accumulate nearly n bits of entropy. So if we wish to do better, we must somehow distinguish between different rotation numbers. To do this, we introduce a simple, efficiently computable quantity which we call the covering number. Intuitively, the covering number C(alpha, k) is the smallest number of samples needed for rotation by alpha to accumulate full entropy from the uniform distribution over the first k bits. Equivalently, it is like covering a black segment of length n using a shorter red segment of length k: each time, the red segment is cyclically rotated by a constant alpha, and the goal is to count how many steps are needed for the red segment to cover the entire black segment. In our work, we show that the covering number is the right measure of the number of samples needed to accumulate nearly full entropy from any 2-monotone distributions with entropy at least k. We justify this statement by both theoretical proofs and empirical data. To sum up, our second result suggests comparing rotations according to their covering numbers. It effectively reduces a very difficult problem to a simple calculation. As we discussed earlier, our rotate-then-XOR procedure doesn't know the input source in advance. Our goal is to find a rotation number alpha such that the covering number C(alpha, k) is relatively small for all possible k. We can use this theory to review Microsoft's choices of rotation numbers. In the next two slides, we will see the covering numbers of different rotations when the length of the register is 32 or 64.
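The segment-covering picture above translates directly into a small simulation. This is a hedged sketch, not code from the paper: the function name `covering_number` and the convention of counting the first segment placement as step one are my own choices.

```python
def covering_number(alpha: int, k: int, n: int):
    """Smallest t such that segments of length k placed at cyclic offsets
    0, alpha, 2*alpha, ..., (t-1)*alpha (mod n) cover all n positions.
    Returns None if the positions are never fully covered (which happens
    when gcd(alpha, n) is too large relative to k)."""
    covered = set()
    for t in range(1, n + 1):  # at most n segments suffice when coverage is possible
        covered.update(((t - 1) * alpha + i) % n for i in range(k))
        if len(covered) == n:
            return t
    return None
```

For example, with n = 32 and k = 8, rotation by alpha = 8 tiles the circle perfectly in the optimal n/k = 4 steps, while rotation by 1 with k = 1 needs all 32 steps.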
For the 32-bit CPU, we selected four different rotation numbers, 5, 7, 9, and 13, and plotted their covering numbers versus k, the min-entropy of the sources. The red lines represent the covering numbers of the respective rotations, while the black line, n/k, represents the optimal covering number. We can see that Microsoft's choice of 5 is reasonable, since it is very close to the optimal line, but it is generally worse than 7. Similarly, for the 64-bit CPU, we also selected four different rotation numbers, 15, 19, 23, and 27. We can see that Microsoft's choice of 19 performs reasonably well for all k. Our analysis of the covering numbers of rotations extends immediately to any fixed cyclic permutation. So can we find a better permutation to replace rotations? The answer is affirmative. We have constructed a permutation that we call bit-reversed rotation, tau_n, where n is a power of 2. For example, when n is 8, we can write out the cycle notation of tau_8 in the following way. First, write down 0 to 7 in cycle notation. Second, convert each integer into a binary string of length 3. Then, reverse each of these binary strings and convert them back to integers. The resulting cycle is tau_8. Bit-reversed rotation has a very nice property: if k is a power of 2, then its covering number is n/k, which is optimal. So it has optimal covering numbers for all k that are powers of 2 simultaneously. In this slide, we compare the covering numbers of bit-reversed rotation against the covering numbers of the rotations used by Microsoft and the optimal value n/k. We see that bit-reversed rotation seems to perform at least as well as rotation, and better in several regions. So we leave it to practitioners to determine whether implementing our new permutation would be favorable in the context of their random number generators. Our study suggests that it seems to be the most natural choice from a theoretical perspective. Now let's move on to the proof section.
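The cycle construction of tau_8 described above can be sketched as follows. This is an illustrative reconstruction with my own helper names (`bitrev`, `tau`); it realizes the permutation through the equivalent rule that tau sends bitrev(i) to bitrev(i + 1 mod n), which reproduces the reverse-each-binary-string construction from the slide.

```python
def bitrev(x: int, bits: int) -> int:
    """Reverse x viewed as a binary string of the given length."""
    out = 0
    for _ in range(bits):
        out = (out << 1) | (x & 1)
        x >>= 1
    return out

def tau(x: int, n: int) -> int:
    """Bit-reversed rotation on {0, ..., n-1}, for n a power of 2.
    Its single cycle is (bitrev(0) bitrev(1) ... bitrev(n-1)):
    for n = 8 that is (0 4 2 6 1 5 3 7)."""
    bits = n.bit_length() - 1
    return bitrev((bitrev(x, bits) + 1) % n, bits)
```

Following tau from 0 for n = 8 traces exactly the cycle obtained by reversing the 3-bit binary strings of 0 through 7.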
In this part, I will present our proof techniques, our Main Lemma, and a brief proof sketch answering why rotation works. Suppose each sample is independently drawn from some 2-monotone distribution of min-entropy k. After receiving n such samples, we denote the distribution of the register by D_A. Here, A is rotation by alpha. To see whether D_A has linear entropy, we need to compute its min-entropy. However, we can hardly compute the min-entropy from the probability mass function, because the convolution form is infeasible to compute. Fourier coefficients arise naturally in our setting because they interact nicely with both convolution and linear transformations. So our goal becomes computing the sum of the Fourier coefficients of D_A. Since the entropy sources are independent, each Fourier coefficient of the compound distribution D_A can be decomposed into a product of multiple Fourier coefficients of 2-monotone distributions. The Fourier coefficient of a 2-monotone distribution has a very nice feature, which is formalized by our Main Lemma. Our Main Lemma says that the absolute value of the Fourier coefficient of any 2-monotone distribution at some vector w is small if the i-th bit of w is 1 for some small index i. For example, if D is uniform over the first k bits, then as long as the first k bits of w contain a 1, its Fourier coefficient is 0. This formally captures the intuition that the lower-order bits of 2-monotone distributions should have high entropy. So we instantiate our Main Lemma by taking i to be 0. This means that if the first bit of a vector is 1, then the absolute value of its Fourier coefficient is upper bounded by a small constant. For any vector w, we consider the set of n vectors generated by w and A-transpose. We only need to count the number of these vectors whose first bit is 1, and then use the Main Lemma to upper bound their Fourier coefficients. For the remaining vectors, we use 1 to upper bound their Fourier coefficients.
It turns out that if A is rotation by alpha and alpha is coprime to n, then the number of vectors in this set whose first bit is 1 is equal to the Hamming weight of w. So we can take a summation over Hamming weights and easily show that the entropy of D_A is linear. As a summary of our proof technique: if the first bit of w, the red dot in the illustration, is 1, then the absolute value of its Fourier coefficient is upper bounded by 2^(-k+1). Then we repeatedly apply A-transpose to w and check whether the red dot is 1. After n steps, the register accumulates linear entropy. In this case, we used too many 1's to upper bound Fourier coefficients. Can we reduce the number of steps while still having strong condensing? The answer is affirmative. If the first k/2 bits of w, the red segment in the illustration, contain a 1, then we instantiate our Main Lemma and see that the absolute value of its Fourier coefficient is upper bounded by 2^(-k/2). Then we repeatedly apply A-transpose to w and check whether the red segment contains a 1. If so, the absolute value of its Fourier coefficient is upper bounded by 2^(-k/2); otherwise, it is upper bounded by 1. In this case, we only need covering number C(alpha, k/2) steps for strong condensing from 2-monotone distributions. Unfortunately, in our proof, we lose a factor of 2 in k. In fact, we also show by empirical data in the paper that roughly C(alpha, k) steps are already enough for very strong condensing. So we believe the factor-of-2 loss is an artifact of the proof. As we reach the end of this talk, I just want to quickly recap what we have seen so far. We justified the use of rotation in Windows 10's random number generator and proved that rotations satisfying some number-theoretic condition can accumulate linear entropy within a few steps. In addition, we justified Microsoft's choices of rotation numbers 5 and 19 and introduced the theory of covering numbers.
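The counting step above, that for alpha coprime to n exactly Hamming-weight-of-w of the vectors w, Aw, ..., A^(n-1)w have first bit equal to 1, can be checked empirically with a small script. This is an illustrative sketch with my own function names; it models the action of A-transpose simply as a cyclic rotation of bit positions, which suffices here since the transpose of a rotation is again a rotation by a number coprime to n.

```python
def rot_vec(w, alpha: int):
    """Cyclically rotate a bit-tuple w by alpha positions."""
    n = len(w)
    return tuple(w[(i + alpha) % n] for i in range(n))

def first_bit_ones(w, alpha: int) -> int:
    """Count how many of the n vectors w, Aw, ..., A^(n-1)w have first
    bit 1, where A acts as rotation by alpha. When gcd(alpha, n) = 1,
    position 0 visits every coordinate of w exactly once, so the count
    equals the Hamming weight of w."""
    count = 0
    for _ in range(len(w)):
        count += w[0]
        w = rot_vec(w, alpha)
    return count
```

For a non-coprime alpha the orbit of position 0 misses some coordinates and repeats others, so the count can differ from the Hamming weight, which is why the coprimality condition appears in the theorem.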
Finally, we suggested replacing rotations by bit-reversed rotation in real-world random number generators and gave some empirical data. This concludes my talk. Thanks for listening, and bye-bye.