Our next talk is on the plain simple reality of entropy, and we all know that you need randomness and entropy if you want to do something like encryption or key generation. And if you don't want to do it the XKCD way, using only four as the random number, you need a cryptographically secure random number generator. What that is, how it works, and where you can find one will be the topic of this talk. So I present to you Filippo Valsorda, on how he learned to stop worrying and love urandom. Hello. Okay. I'm very glad so many people showed up, even if I essentially gave away the entire content of the talk in the description. I wanted to leave at least something for the talk itself. Okay. No. Anyway. I'm Filippo Valsorda. I work for Cloudflare. I do cryptography and systems engineering. I recently wrote the DNSSEC implementation of the Cloudflare DNS server. And maybe in April 2014 you used my Heartbleed test. Anyway. Well, thank you. Okay. Anyway, I'm here to tell you about random bytes. So here are some random bytes. They're pretty good. You can use them. But if you need more, Amazon sells this excellent book full of a million random numbers. Anyway, more seriously, random numbers are central to a lot of the security protocols and systems of our modern technology. The most obvious example is encryption keys. You obviously want your encryption key to be random, to be really hard to predict, and you want your encryption key to be different from that of the person next to you, unless you're doing key escrow, which, well, we don't want. And a lot of other systems use randomness to prevent all kinds of attacks. One amongst many: DNS, using random query IDs to prevent Kaminsky-style cache poisoning attacks. So what makes a stream, a source of random bytes, good? What are we looking for when we look for good random bytes? First of all, we look for uniform random bytes.
Every time we draw a random byte from our source, we want the same probability of getting each value from 0 to 255. For example, you don't want your distribution to look like this; this is RC4. But that's not enough. You also want your random bytes to be completely unpredictable. And here is where the task actually becomes difficult. Because if you think about it, we are programming computers. Computers are very deterministic machines, even if they don't feel like they are. They're essentially machines built to sequentially execute always the same set of instructions, which we call code. And when we ask them at some point to do something that is completely different every time they do it, and that two identical computers should do differently, we get in trouble. So where can a computer source its randomness? Where can a computer find unpredictability if it can't have its own? Obviously in our messy meat world, in our physical world, where everything is not always happening the same way. So, user inputs: every time you type on the keyboard, you do that with different timings. When you move your mouse around, you do that differently every time. Or disk reads: every time your computer reads something from disk, it takes a slightly different amount of time. Or interrupt timings, I/O, you get the idea. All these events are visible to the kernel. The kernel is the component of your system which controls all these interactions with the outside world, and it can measure and observe them with the right precision. And each of these events can have a wide or narrow range of different possible values. For example, a disk read might take from 0.17 to 1.3 nanoseconds; I made those numbers up. How wide this range is, is what we call entropy. Essentially it's how many different values there are and how spread apart they are, which also means how hard they are to predict.
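That "how spread apart the values are" idea can be quantified as Shannon entropy, measured in bits. Here is a toy sketch of that measure; the function name and the sample data are mine, not from the talk, and a real kernel can only ever estimate this, since it sees individual events, never the true distribution:

```python
import math
from collections import Counter

def entropy_bits(samples):
    """Shannon entropy of an observed sample distribution, in bits."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A fair coin flip carries exactly 1 bit of entropy per event...
print(entropy_bits([0, 1] * 500))          # 1.0
# ...while a heavily biased event carries far less (about 0.08 bits here),
# i.e. it is much easier to predict.
print(entropy_bits([0] * 990 + [1] * 10))
```

The second distribution has the same two possible values as the first, but because one outcome dominates, an attacker guessing "0" is almost always right, which is exactly why narrow or skewed event timings contribute little entropy.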
But something that these events definitely aren't is uniform. Because, as I said, a disk read might take values in some specific range, definitely not uniformly from 0 to 2.5 nanoseconds. And usually they're not enough on their own to satisfy all our random byte needs. So now we have some unpredictability. We have some events that we can observe from our system. And we want to turn that into a stream of random bytes that we can use to generate SSH keys and DNS query IDs, et cetera. Enter the CSPRNG. Cryptographers like their acronyms very long. It's a cryptographically secure pseudo-random number generator. It's not that hard to pronounce. Okay, it is. Anyway, it's nothing else than a cryptographic tool that takes some input and generates an unlimited, well, reasonably unlimited, stream of uniform random bytes, which depend on all and only the inputs you gave to the CSPRNG. So you can see how we can use this. We have this handful of random events, we feed them into the CSPRNG, and we get out random bytes that we can use for everything. To understand how a CSPRNG works, I decided to simply present you with a very simple one, one based on hash functions. I assume that everyone in the hall knows essentially what hash functions are. The properties we care about today are these. First, the output is uniform: if you take the output of a hash function, all the bits should be indistinguishable from random if you don't know the input. Second, it's impossible to reverse a hash function: if I give you the output of a hash function, you should know nothing more than before about what the input was, unless you can guess the input and try the hash function on it. And finally, it takes an arbitrary amount of input and produces a fixed amount of output. These are the properties we're going to use to build a CSPRNG out of hash functions. So this is how it works. We start with a pool, an array of bytes we call the pool, and we fill it with zeros to start.
And every time a new event comes in, for example you move the mouse around, we take that event and serialize it to some binary format, it doesn't really matter how. For example: the mouse is at position (15, 135). And we hash the concatenation of our pool, which for now is just zeros, and the serialization of this event. We hash them together, we get an output, and that's our new value of the pool. And we repeat. Now, instead of zeros, we have the output from before. A new event happens: a disk read happens, and it takes exactly 1.27589 nanoseconds. And we hash together the old contents of the pool and this information, "a disk read happened, and it took this amount of time". We hash them together, and we get a new value of the pool. You see where this is going? We keep doing this. Every time a new event comes in, every time the mouse moves, every time a CPU interrupt is raised, every time a disk read happens, we call this stirring function to mix the event into the pool. And what do we end up with? We end up with what we call an entropy pool. To figure out this value, you need exactly all the events that led to it. If you are an attacker and you really want to figure out what my entropy pool is, you're not supposed to have any better way than to guess all the different hard disk timings and mouse movements that happened all the way up to now. So now we have this essentially unpredictable value, but we want to generate keys out of it, and we can't just use these few bytes directly. So we use hash functions again, the same hash function. We take the entropy pool and we hash it with a counter. You want 5,000 random bits? Sure. You hash the entropy pool concatenated with zero, then with one, with two, three, and so on. You get all these outputs, you concatenate them, and now you have 5,000 bits which are as unpredictable as all the events that were stirred into the pool.
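The scheme just described can be sketched in a few lines. This is purely an illustration of the stir-and-extract idea, not something to ever use for real; SHA-256 stands in for "a hash function", and all the names are mine:

```python
import hashlib

# Toy hash-based CSPRNG: an entropy pool stirred by events,
# with output extracted via pool-plus-counter hashing.
POOL_SIZE = 32            # SHA-256 output size in bytes

pool = bytes(POOL_SIZE)   # start with all zeros

def stir(event: bytes) -> None:
    """Mix a serialized unpredictable event into the pool."""
    global pool
    pool = hashlib.sha256(pool + event).digest()

def extract(nbytes: int) -> bytes:
    """Generate output by hashing the pool with an incrementing counter."""
    out = b""
    counter = 0
    while len(out) < nbytes:
        out += hashlib.sha256(pool + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:nbytes]

# Stir in some (fake) observed events, then ask for 5,000 bits.
stir(b"mouse at (15, 135)")
stir(b"disk read took 1.27589 ns")
key_material = extract(625)   # 625 bytes = 5,000 bits
```

Note that every call to `stir` completely replaces the pool with a hash of its old value and the new event, so the final pool depends on every event ever mixed in, in order.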
Let's think about it for a second. We said that hash functions are not invertible, so even if you know one of the outputs, you can't get back to the entropy pool. We said that with hash functions, all the bits of the input affect all the bits of the output, so even if just the counter changes between one run and the other, the outputs are completely unrelated. So did we get what we wanted? It's uniform, because, as we said before, hash function outputs are uniform. It's unpredictable, because the only way an attacker can figure out what the output will be is to imagine, or brute-force, or observe, I guess, all the hard disk timings and user inputs, which is impossible for a third party. And it's unlimited, because we can keep incrementing that counter forever. Now, really, please don't go implement this scheme and say, "Filippo told me it was okay." No, also because that's exactly not what this talk is about. So: if we have this tool, the CSPRNG, to turn some unpredictable events into an unlimited stream of random bytes, which is what we need, and we have all these unpredictable events observed by the kernel, doesn't it make sense to just put a CSPRNG in the kernel, and have the kernel run the CSPRNG when we need random bytes? It's such a good idea that it's exactly what Linux did, and all the other operating systems. In Linux, it's called /dev/urandom, and it looks like a file. You read it like a file; it's technically a character device. And every time you read, say, 100 bytes from it, it runs a CSPRNG on an entropy pool not so different from the one I presented. And this entropy pool is stirred with all the events that the kernel observes from its privileged position. Other operating systems have something similar. OS X and the BSDs have /dev/random, which is exactly what /dev/urandom is on Linux. And on Windows, you can get something similar with the CryptGenRandom call. One last thing: putting the CSPRNG in the kernel is not only about convenience.
It's also about security. First of all, the kernel is the entity that can observe the unpredictable events. If you take a CSPRNG, which is just code, so you can implement your own, and you implement it in your library or in your application, now you have the problem of how to get the unpredictable events from the kernel into the application. This is something that you can easily forget to do, or do wrong. Moreover, the kernel can protect the memory space of the entropy pool much better than applications can. For example, applications can fork; there's a whole lot of different things that applications can get wrong. And finally, you have one single centralized implementation that is reasonably easy to audit. I don't know. Was anyone managing Debian servers in 2008? Just asking. Unrelated. Right. So, yeah, /dev/urandom: better. So we have a solution, right? We have a tool to turn unpredictable events into an unlimited uniform stream of random bytes. We have a source of unpredictable events. Sold. What is everybody talking about? Why is there even a need for this talk? Well, sadly, there are some common misconceptions in the field, which is also why I'm here giving this talk. One of the most common is fueled by the very Linux man pages. The recent versions are better, but they still give you the impression that if you want real security, you should be using /dev/random. Because /dev/urandom is okay, but only kind of, and, well, we want real security, right? But you might ask yourself: okay, if /dev/urandom is a CSPRNG, and a CSPRNG is all I need, what else could I want? What more does /dev/random have? Well, the idea of this talk is to give you the knowledge to figure out by yourself whether you need /dev/random or not. So, first I explained how a CSPRNG works. Now I'm going to go a bit into the details of how /dev/urandom and /dev/random work. This is taken directly from the kernel source.
Both /dev/urandom and /dev/random are, yeah, sorry: essentially everything I'm going to say now applies to both /dev/urandom and /dev/random. They are both based on a pool of 4,096 bits, not dissimilar from the one of the CSPRNG we played with before, which is implemented as a series of 32-bit words, I think. The pool is mixed with all the unpredictable events using a CRC-like function. This is not a cryptographically secure hash function, but this is just about how the unpredictable events, the interrupts, the disk timings, are stirred into the internal pool. Every time one of these events happens, this very fast function kicks in and stirs the event into the pool. Then extraction, the actual random byte generation, happens with SHA-1. So, you want some random bytes from the kernel. What the kernel does is run SHA-1 on the pool, give you the output, and also take the output and feed it back into the pool using that mixing function. This is a bit different, you might have noticed, from our design, which used a counter, because keeping counters, it turns out, is still hard. You know, they can reset, you can lose count, and that's bad. Also, this design has better security properties against compromise. So what it does is simply that when it generates output, it also stirs the output back in, and if you need more output, it runs SHA-1 again on the new pool, gives you the output, and stirs it back into the pool, so that the pool keeps changing. Now, both /dev/urandom and /dev/random do the exact same thing. Same code, same sizes, same entropy sources: literally, in the source, random_read is a call to extract_entropy_user, and urandom_read is a call to extract_entropy_user. The only difference, and I finally get to what's special about /dev/random, is that it tries to do a couple of really hard and weird things. First, it tries to guess how many bits of entropy were mixed into the pool with each unpredictable event. This is already very hard, because, think about it: a disk read took 1.735 nanoseconds, great.
We don't know how many different values this might take. We don't know if this is a spinning-rust disk, which has timings all over the place, or an SSD, which almost always takes the same time. So we don't know how unpredictable this is. So this is already hard, figuring out how unpredictable the pool is. Anyway, it keeps a counter, an arbitrary number, of how many bits of entropy, how much unpredictability, there is in this pool. And then, when you run the hash function on the pool, it decreases this count; it reduces this number. And if this number gets too low, it blocks you. So you're reading from /dev/random, this number dwindles, and now you're still reading from /dev/random, but you're blocked, until more unpredictable events happen. This is useless in the modern world, because entropy does not decrease, entropy does not run out, and meanwhile everything freezes. Once the pool has become unpredictable, because enough different events contributed to what the entropy pool looks like, it's forever unpredictable, because the attacker doesn't learn anything from the output, obviously, unless the CSPRNG is broken and is leaking information about the entropy pool. However, saying that CSPRNGs are broken is equivalent to saying that a lot of cryptographic constructs are broken. It's saying that stream ciphers are broken. It's saying that CTR mode is broken. It's saying that TLS and PGP are broken, because they're both about reusing the same key for multiple packets or messages. So if cryptographers didn't know how to build a secure CSPRNG, it would mean that cryptographers weren't able to build most of the things we're relying on today. It would mean that cryptography was doomed. Now, I'm not DJB. I can't tell you if cryptography is doomed. But I can tell you that if cryptography is doomed, your problem is not your CSPRNG. So cryptography relies on being able to build secure CSPRNGs. And that, on the other hand, makes /dev/random's blocking useless, obviously.
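In application code, then, none of this entropy accounting matters: you just ask the kernel CSPRNG, which never blocks once it has been seeded. A minimal sketch in Python, assuming a Unix-like system for the direct device read (`os.urandom` is the portable wrapper for the same kernel CSPRNG):

```python
import os

# The recommended way to get cryptographically secure random bytes:
# ask the kernel. On modern Linux, os.urandom uses the getrandom()
# syscall under the hood.
key = os.urandom(32)      # 256 bits of key material, returns immediately

# Equivalent on Unix: read the character device directly.
with open("/dev/urandom", "rb") as f:
    chunk = f.read(100)   # need 100 random bytes? read 100 bytes
```

Either call returns immediately with uniform, unpredictable bytes; there is no counter to watch and nothing to "top up" first.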
Blocking can be unacceptable, too, because you get a TLS request and you're like: I have that HTTP page, but wait a second, I need someone to start typing on the keyboard of the rack before I can serve it to you. And it can even be dangerous, because you're essentially giving away information about what other users on the system are doing to, you know, other users. On the other hand, /dev/urandom is safe for any cryptographic use you want to put it to. You want to generate long-term keys? My GPG keys are generated from /dev/urandom. And I'm not the only one saying this. BoringSSL, Python, Go, Ruby all use /dev/urandom as the only source, the only CSPRNG; Sandstorm even replaces /dev/random with it. And here is a long list of people saying exactly what I'm here on stage to tell you. So, I hope that at the end of this you see that you don't actually need /dev/random. Just as you don't need to keep measuring how much entropy you have in the pool. You don't need to refill the pool with things like haveged, or however you pronounce it. Actually, I've even seen people take output from /dev/urandom and pipe it back, as root, into /dev/random so that, you know, the entropy does not run out. Which is exactly what the kernel is already doing. And which is, obviously, you know, a pretty upvoted answer on Stack Overflow. Anyway. And finally, random number quality does not decrease. There are no premium-grade random numbers that kind of rot after you use them for a while. No, that's not a thing. Okay. So there is only one small case in which /dev/urandom does not do exactly what we would expect it to do, which is early at boot. If you think about it, everything we said is about using unpredictable events to build up unpredictability. As soon as you boot the machine, you haven't observed enough events yet. This has bitten embedded devices; it bit the Raspberry Pi recently. Essentially, it's a Linux shortcoming which by now is too late to fix.
Namely, /dev/urandom will not block even at boot, before being initialized. The solution, in most cases, is just that the distribution should save the state of the pool at power-off and reload it at power-on, or block until the pool is initialized. So your distribution probably solves this for you anyway. So, to sum up: CSPRNGs are pretty cool, and they work. You don't need /dev/random. You shouldn't use userspace CSPRNGs, because they're very easy to get wrong. And if you need 100 random bytes, read 100 bytes from /dev/urandom. That's it. I glossed over a lot of different ways to do it wrong, so if you have questions about why not this other thing, please come forward. Okay. And because the people on the stream can't be here in person, we will start with questions from the internet. The first question is: regarding what you explained about /dev/random versus /dev/urandom, how do you explain the fact that on a 4.3.3 kernel, /dev/random output is identical to something from /dev/input-something? Someone claimed that. Sorry, can you repeat, on a what? On kernel 4.3.3, someone claims that sometimes the output from /dev/random or /dev/urandom is identical to something that comes from /dev/input, like an input device. I'm not sure I got what system. Oh my God, what system? Linux, Linux 4.3.3, the guy claims. That sounds like a pretty bad bug, but I don't know. If that's the case, I'm not aware of it, because I read the kernel source and it's really a call to extract_entropy_user. File a bug report, maybe? No, I mean, I'm joking. I want to talk about this offline. Is there another question from the stream? Yes, I have two more questions. One is: what do you think about hardware entropy generators, or hardware random generators? Aha! I have a slide for this. So, hardware random number generators, very quickly. Some CPUs and some platforms have real random number generators which essentially, I'm told, just read electrical noise to give you actual randomness.
Linux has support for them, and if they're loaded, they will immediately be used to refill the pool, and they will also be used as the initialization vectors for the SHA-1 of the extraction. So if they're turned on, you don't have to worry about them, and they will make /dev/urandom work even better. Yeah. Okay, one more quick question from the stream: what's your opinion about these entropy gathering daemons, like haveged? There was probably a time when they had a reason to exist, maybe because the Linux implementation of this entropy gathering was not that good. Today, they don't really have a reason to exist. Okay, thank you. And a microphone, please. Hello, I wanted to ask about the early boot problem. You say that we should save the state of /dev/urandom. What happens if a machine crashes? Wouldn't you restart from an earlier state of /dev/urandom? Yeah, I think that the correct way to do this is that as soon as, even before, the saved state from the last shutdown is used to initialize the pool, it should be deleted from the disk and the disk flushed. Yes, it's kind of hard. Yes. Okay, unfortunately we are running out of time, because we have to clear this room, and you have a short announcement. Oh yeah, tomorrow at 15:30 I am giving a quick workshop about how to implement a Vaudenay padding oracle attack, in Hall 14. I think it doesn't take as many people as there are in here now, so maybe I shouldn't have said that. Okay, then thanks again, Filippo, for your talk.