Jeff is back up on the stage, so please welcome him to the ToorCamp stage. Thank you, thank you. So, yeah, Jeff Koslow. I gave a talk yesterday on building and sailing a small boat, and my boat is now on its trailer somewhere down there, so if you ever want to talk to me about boat building, we can do that. But right now we're going to talk about entropy, and entropy is this weird thing. This came out of some of my work when I had to do a terrible, horrible thing, which is put a product through Common Criteria and/or FIPS certification. If you can avoid that, avoid it. But I did learn something out of it, and I'm hoping to transfer that information to you with a lot less pain.

So entropy is a measurement of the amount of information in a system. And any time I think about measuring something, I want to be able to gauge it, so we're going to start off with some password stuff. I wrote this for a slightly less technical audience, so we'll breeze through it. Entropy is measured as the log base two of however many possible combinations you have. So it makes sense that if you flip two coins, you have four combinations, and the log base two of the number of combinations is your entropy. All right, that's pretty simple. If you roll two six-sided dice, you have 36 combinations, and the log base two of 36 is a little over five. So we're starting to be able to measure this sort of thing. Generally speaking, this room is pretty good at powers of two, so I'm not too worried about that.

So I thought, okay, if we're measuring the amount of information in a system, let's talk about passwords, right? Because we can all kind of gauge how strong our password is, or maybe we're wrong, but entropy is kind of funny because it measures things in different ways. I thought about this for a while: if your password is eight characters from a 26-character alphabet, that's all lowercase, you get about 37 bits of entropy out of that password. That seems like a decently large number, but in practice you could crack it fairly quickly. I have speaker notes that say what the numbers are. If your password is eight characters from a 72-character alphabet, so that's upper, lower, special characters, and I think I threw in a few other things, that gives you just about 50 bits of entropy. So that dramatically increased it; two to the 50th is a lot bigger.

Then the problem is when I happen to know something about your password, and this is where entropy measurement gets kind of weird, because the measurement changes. If I know that it starts with a capital letter and ends with punctuation, and you can check my math if you feel like it, you're down to roughly 45 bits of entropy. You just lost some information in your password. And we're going to keep going with this: suppose your password is a dictionary word from a 4096-word dictionary. That gives you 12 bits of entropy. You can see what this is leading up to in just a second, because for each character you have a chance to capitalize it or substitute one special character, but let me stay on track: 12 bits of entropy if it's in the dictionary. That's dramatic, right? There's a quick sketch of that arithmetic below.
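Since it's all just log-base-two arithmetic, here's a quick sketch of those numbers in C. The alphabet sizes, lengths, and the assumption of roughly ten punctuation characters for the "known structure" case are only the illustrative values from the slides above; compile with -lm and check the math yourself.

```c
/* Back-of-the-envelope password entropy: bits = length * log2(alphabet),
 * or log2(word count) for a dictionary word. Values match the examples
 * above and are illustrative only. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* 8 chars, lowercase only (26 symbols) */
    printf("8 x 26-char alphabet     : %.1f bits\n", 8 * log2(26));   /* ~37.6 */

    /* 8 chars, upper/lower/digits/specials (~72 symbols) */
    printf("8 x 72-char alphabet     : %.1f bits\n", 8 * log2(72));   /* ~49.4 */

    /* Known structure: capital first, ~10 punctuation marks last
     * (an assumption), 6 free characters in the middle */
    printf("known structure          : %.1f bits\n",
           log2(26) + 6 * log2(72) + log2(10));                       /* ~45.0 */

    /* One word out of a 4096-word dictionary */
    printf("one 4096-dictionary word : %.1f bits\n", log2(4096));     /* 12.0 */

    return 0;
}
```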
You really drop the number of possible passwords, so you've got terrible entropy there compared to 37 bits or whatever. If I know your password is in the dictionary, I'll run a dictionary cracker against it. The Terahash guys down here would know all about this, but I find this really interesting. Now let's pretend you're doing a little one-to-one substitution, some crazy leet-speak, and I happen to kind of know what it is. Well, you're really clever there, dude. You really upped it and made it hard for me to guess, at 15 whole bits of information, right?

Okay, this is where this was leading: the famous XKCD cartoon. When it first came out, a lot of people were really fussing over it, and it's pretty simple. Instead of the 4096-word dictionary I used, he used a 2048-word one, which is two to the 11th, so 11 bits per word. Doesn't he say that somewhere in here? Well, he says 44 bits of entropy, and he selected four words, so that's 11 bits each. And the moral of the story is, of course... well, I think this is actually where I butt in and say the real moral of the story is that we've leaked so many passwords over the years, and we have such a good statistical model of what your password looks like, that you need some sort of algorithm like this. I didn't want to turn this talk into how to make a good password, because that's kind of a lame sort of talk, so we're going to move past it. What I wanted to get across was the measurement of the entropy. Some people say you should have 75 bits of entropy in your password, and if we back up a little bit, we can get 50 bits with eight characters, so we have to go further. I think you have to go out to about 12 characters to hit 75, something like that; I'd have to do the math. And then finally I make my little recommendation down at the bottom: just use a password manager. It'll generate a password for you, and it's got nice random algorithms in it.

Okay. All right. So this, again, is more interesting. Entropy is a measurement of information, so I'm going to talk now about the Linux kernel's entropy pool. You can cat /proc/sys/kernel/random/entropy_avail and it will tell you exactly how many bits you have in your entropy pool. That number tops out at 4,096. Generally speaking, when you cat that out, you're going to see somewhere between 200 and 300 bits, maybe up to about 2,000 bits of entropy, something like that. There's a little sketch of reading that counter below. What does that number mean, though? Does it mean that there's some little pool that's growing? No, it doesn't really mean that at all. What it really means is there's an array of integers, and the counter is measuring how much entropy has gone into that array. The array is a fixed size; I believe it can hold 4,096 bits of actual entropy. Entropy gets put into the system mostly on software and hardware interrupts: whenever one of those happens, the kernel takes some hashed version of the interrupt timestamp, or the interval between interrupts, and stirs that into the entropy pool. If it believes it added a couple of bits, it credits the counter by that couple of bits; otherwise it just stirs the pool over and over again. But the stirring of the pool is completely deterministic.
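If you'd rather watch that counter from a program than from the shell, here's a minimal sketch; it just reads the same /proc/sys/kernel/random/entropy_avail file, at least as it behaves on kernels of the era this talk describes.

```c
/* Read the kernel's entropy estimate -- the same number you get from
 * `cat /proc/sys/kernel/random/entropy_avail`. */
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/sys/kernel/random/entropy_avail", "r");
    int bits = 0;

    if (!f) {
        perror("entropy_avail");
        return 1;
    }
    if (fscanf(f, "%d", &bits) == 1)
        printf("kernel entropy estimate: %d bits (pool tops out at 4096)\n", bits);
    fclose(f);
    return 0;
}
```

Put it in a loop and you'll run into the same observer effect mentioned later: starting a process pulls a little entropy out on its own.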
We have to be really clear about that. So how is entropy used in a modern system? We just had a very nice talk on ASLR: every time you start up a process, you're going to pull a few bits out of the entropy pool for ASLR to lay out that program. Every time you make a TCP connection, you're going to have an initial sequence number that's random, and that's actually really important to get correct; you're going to pull a little bit of entropy from the system for that. And then user-level applications or server applications are doing CSRF tokens and cookies. Generating encryption keys is super, super important, and we're going to talk about that in a little bit, because if I know the state of the entropy pool, I can guess what your keys are going to be, because keys are deterministically generated from it. So this is really important.

Up top there, that's random.c, the single C file in the kernel that does all of this; the comments in it are horribly outdated, actually. I already said the pool is just a large array of ints with a certain state. Let's pretend that state is all zeros, and I know that it's all zeros. Or maybe it's all ones, I don't know. But if I know the state, how many bits of entropy are in that pool? Zero. Zero bits of entropy, because I know the state of it. And since everything is deterministic from there, if I ever did know the state of that pool, then I can know everything downstream. Even if I only know the state of the pool at one particular point in time, I can build a tree: maybe three or four interrupts happen in between, and they may have added one bit each, so I can build myself a little tree of candidate states to walk down. Clearly I don't want to keep a 4,096-bit snapshot for every single candidate, but the pool-stirring algorithm is completely deterministic.

So I mentioned this: to add entropy, you stir that pool with some new bytes and then declare how many bits you're adding. Usually, like I said, that's done by the kernel whenever it gets an interrupt. A lot of people think that you can just cat something into /dev/random and it will automatically add entropy to the system. That is not correct. It does stir the pool, but it doesn't add any entropy; there's a counter on the side, and the counter doesn't move at all, even though the pool contents change. To remove information from the pool, you remove some number of bits, and you should hash those bits, because you really don't want somebody getting a look at any particular portion of the state of the entropy pool. And then usually what happens is you seed a PRNG to get more random numbers.

I don't really want to talk a lot about PRNGs here, but there's been a lot of study on them. Pseudo-random number generators: essentially you give one a seed value and it will generate, usually on the order of 32 billion random bytes, before it cycles around, but it does cycle around. Curiously enough, there was a really awesome Black Hat presentation about three years ago where a guy went through, for most of these things like the CSRF tokens and session tokens I had up there, and found the applications that might generate them, like ASP.NET, PHP, and Python web applications, and figured out how to put their random number generators into a program.
What he could do was push the little button on your website that says "please recover my password," and it would send back a token, and he would run that token through all these algorithms trying to figure out if he could guess the generator's state. If he could guess the state successfully, then he would just type in root and push the button, and of course the mail would go to root or whoever, but he would know the token, so he could go and change their password. I thought it was an ingenious, rather impractical, but really ingenious little hack. If I ever meet that guy, I'll buy him a beer. Is it? Okay, yeah. So what was the name of it again? Untwister is the name of the project, and Joe DeMesy and Dan Petro are the two authors. Okay, thank you; just recorded that to get it on there.

Let's see. Okay, so how do you fill the pool? This is where we get a little more technical. It happens with the random intervals between interrupts, so we kind of already talked about this. Interestingly enough, I've never been able to measure this quite correctly, and one of the reasons why is that if you go back here and just do something like cat /proc/sys/kernel/random/entropy_avail and put that in a while-true loop or a watch loop, it keeps executing over and over again, so as you're observing it, you're actually changing it. I could write a program to fix that, but I just haven't gotten around to it. Anyway, the real point here is that interrupts fill the pool at a rate of only bits per second. I haven't been able to measure it, but believe me, it's bits per second; it does not grow quickly. Sometime on startup, just go ahead and try to create a 4096-bit RSA key, or worse yet, Diffie-Hellman parameters, and wait for a while: sooner or later it's only going to have network interrupts and maybe a few disk interrupts coming in, nothing's going to happen, and you're going to hang there at your console. I've done it more times than I care to admit.

The pool gets stirred with a function that works similarly to a PRNG or a hash. This is one of my favorite properties of a cryptographic hash: one bit of change in the input will flip, on average, half the bits in the output. That's a fundamental property of cryptographic hash algorithms, and it's a pretty cool thing, right? If there were any bias in one direction or the other, you'd be able to guess some of the output, and you'd be able to crack hashes even faster. So that's what this does; it keeps stirring things into the pool. I didn't bother to put notes in here about the actual algorithm that's in the random.c file, but you can go ahead and look at that. There's been a lot of research on PRNGs, and it's pretty cool stuff; there are lots of algorithms for this. Oh wait, sorry, I went too far, didn't I? One bit of input changes, on average, half the output.

Okay, pseudo-random number generators. This is just a quick diversion into PRNGs. Given a seed, we can generate pseudo-random numbers for a long while without repeating, but eventually it will loop around. The seed is the really hard part. Remember back in the days of Mozilla 3.0, when that came out, literally Mozilla 3.0, they were seeding the random number generator with the date and time, and a lot of programs did that, and that was so hard to guess, right? Here's a little sketch of just how hard.
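A toy illustration of the problem, not anything those browsers actually shipped: if the seed is just the current time, an attacker who knows roughly when a token was generated only has a few thousand candidate seeds to try. C's srand()/rand() stands in here for whatever PRNG the application really used.

```c
/* Why "seed with the current time" is weak: the whole seed space is the
 * attacker's uncertainty about *when* you seeded. srand()/rand() is just
 * a stand-in PRNG for illustration. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    /* "Victim": seed with the clock and hand out one token. */
    time_t now = time(NULL);
    srand((unsigned)now);
    int token = rand();

    /* "Attacker": knows the token was minted within the last hour, so
     * there are only 3600 seeds to try. (A real attack would check a few
     * more outputs per guess to rule out coincidences.) */
    for (time_t guess = now - 3600; guess <= now; guess++) {
        srand((unsigned)guess);
        if (rand() == token) {
            printf("recovered seed %ld after %ld guesses\n",
                   (long)guess, (long)(guess - (now - 3600)) + 1);
            break;
        }
    }
    return 0;
}
```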
And we already know that if you can guess something, its entropy is essentially zero, so you could guess a lot of the output coming out of these things. The seed is the hard part, and the current time is far too easy to guess.

Here's another quick diversion on this: entropy does not equal random numbers. A lot of people kind of think that way, and you can sort of wrap your brain around it, but it's not exactly true. Generally speaking, pulling random numbers out of the entropy pool reduces the bit count in the pool, but you should never actually take the contents of the pool directly, because, as we already talked about, you'd be pulling out the state, and anyone who sees the state may be able to work backwards through it. That seems hard, but it's probably not. The hash is there to protect the state of the pool; use a good hash for it. The kernel takes care of all this for you. And if the state is known, the entropy of the pool is effectively zero; we've been through that.

So how do you get random numbers, then? /dev/random is out there. You can go ahead and cat /dev/random and you'll see it go for a little while and then stop. The reason is that it blocks when the pool gets low on entropy. This is considered cryptographically strong, and we're going to talk about that in a minute. It's considered suitable for key generation and just about anything else, and again, this is sort of where FIPS and Common Criteria come into play: they happen to be really picky about how you generate keys. Imagine that, it's FIPS. They care about how keys are generated, what quality they are, and how long they're going to live. This does block, so if you're using /dev/random to generate a key on startup, like I said, you'll hang the box for a while because it doesn't have enough entropy, and your startup scripts are just going to sit there stopped.

Your second choice is /dev/urandom. It does not block when the entropy pool is low, and we'll talk about how it does that in a minute. As a matter of fact, if you read the man page on it, it says this is not considered cryptographically strong and should not be used for cryptographic purposes. I'll come back to that; it's often used anyway.

So I did a little experiment. There's a program out there called ent, which essentially runs a whole bunch of different tests. I'm not really going to go through them all, but I want you to look at the screen. I took the entropy of a file of /dev/random output, two megabytes maybe, I'm not sure if I put the size in there. It measured the entropy as very nearly eight bits per byte, and eight bits per byte is what we'd expect of something completely random. Optimum compression would reduce it by zero percent, which also seems like a good property for random numbers to have. Then there's the chi-square distribution; there are a bunch of different ways to measure this. So just watch what happens: the next screen is a /dev/urandom file, and I'm just going to go back and forth between them. There's almost no difference in these. I think this one was different in the sixth significant digit; this one wasn't different at all; this one is slightly different. And it turns out, in this one sample that I did, there's the random file and there's urandom, and urandom was actually a little bit better. If you want to try the same thing yourself, there's a little sketch of that bits-per-byte measurement below.
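The ent tool does this properly, but the headline number is easy to approximate yourself. Here's a rough sketch that reads a couple of megabytes from /dev/urandom and computes the Shannon entropy per byte from a byte-frequency histogram (compile with -lm).

```c
/* Rough version of ent's first metric: byte-frequency Shannon entropy of
 * a chunk of /dev/urandom. A good source should land very near 8 bits
 * per byte. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    unsigned long counts[256] = {0};
    unsigned char buf[4096];
    unsigned long total = 0;
    FILE *f = fopen("/dev/urandom", "rb");

    if (!f) { perror("/dev/urandom"); return 1; }

    for (int chunk = 0; chunk < 512; chunk++) {      /* ~2 MB total */
        size_t n = fread(buf, 1, sizeof buf, f);
        for (size_t i = 0; i < n; i++)
            counts[buf[i]]++;
        total += n;
    }
    fclose(f);

    double bits = 0.0;
    for (int b = 0; b < 256; b++) {
        if (counts[b]) {
            double p = (double)counts[b] / total;
            bits -= p * log2(p);
        }
    }
    printf("%lu bytes read, %.6f bits of entropy per byte\n", total, bits);
    return 0;
}
```

Point the same code at /dev/random instead and you'll mostly see the blocking behavior described above rather than a result.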
Now, knowing what we know about this, why should I not use urandom? At some point in the past, probably, it was because those random numbers might cycle around and you might actually use that many. But we've come a long way since the mid-80s on PRNGs, and like I said, it's 32 billion bytes or numbers like that, and these algorithms are really good at generating random numbers now. So I take a little bit of issue with that warning about urandom. The FIPS and Common Criteria guys don't happen to believe me, but that's kind of up to them, and you're probably not going to be able to get around them, so you may have to do what they tell you to do. After I did this talk once or twice, I found a really awesome website where this guy, I don't know him either, but I'd love to meet him and buy him a beer sometime, went through this whole subject and came to the same conclusion I did, which is that urandom is probably a pretty good source of randomness. You may want to look into it.

So here we've got our entropy count, our little bit counter, and over here we've got our pool. Every time we get randomness from one of several sources, which really means the interrupts, the kernel estimates the amount of entropy it can credit from those, bumps the counter, and stirs the bits into the pool. But if you're pulling from /dev/urandom, you've just got the cryptographically secure pseudo-random number generator generating numbers for you, seeded out of the pool. That's the picture for later versions of the kernel; there's another diagram for earlier versions, but that's not interesting to us.

Okay, so I think we already kind of went through this. Pretend the kernel just booted up, and since I already said we generate entropy at a rate of bits per second, pretend the pool has like 20 bits in it. But you have to generate a 2048-bit RSA key early on, on your first boot, especially coming out of something like manufacturing. What's going to happen? You have to generate 2048 bits and you only had 20 bits of entropy: how many keys are you going to be able to generate?
Two to the 20th possible keys, if you even get that many. I actually did this experiment one time. I don't think I ran it long enough, a weekend, and I think it rebooted on the order of thousands of times, and I never got the same key to regenerate. I scrubbed all the saved /dev/random seeding and things like that, so I haven't been able to prove this in a practical sense, but I believe it; somebody else can set up a reboot loop if they're interested.

So we have a problem here: we can only get bits per second into the entropy pool. How do you get more entropy into that pool, faster, if you really, really want to use /dev/random, so you don't deplete it and block? Well, there are hardware RNG devices. Up until a few years ago they were rather uncommon; you had to buy one from Cavium or from various other vendors, and Cavium was kind of one of the big ones. But Intel, a couple of years ago, added the RDRAND instruction, and it's in most server-class CPUs; you can cat /proc/cpuinfo and it will show you the flag, I believe it's rdrand. In any case, that does pretty well. Did I put that in here? These generate gigabits per second of entropy because they've got hardware that will do it for you. Usually it's some sort of white-noise source, and then they take a hash of the white noise, and hash it again, in order to keep getting random numbers out of it. We already talked about why you want to hash random sources, right?

Okay, so this is my slide on, since we're technical in here: you can open /dev/random, I've got some errors there, open /dev/random and select on it, and when you wake up, the entropy pool will have gone down below a certain preset, I think it's like 64 bits or 128 bits, I'm not sure exactly, but when it goes below a certain level it'll wake you up as a process. Then, as long as you're root, you can run this ioctl right here, it's a little hard for me to see, but that ioctl will go ahead and add some bits into the pool. Earlier we said that catting something into /dev/random stirs the pool but doesn't add any entropy; this ioctl is the way you can actually add entropy, and you can only do it as root. You can read the man page for it. A minimal sketch of that select-plus-ioctl pattern is below.
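Here's a minimal sketch of that pattern, assuming the classic rand_pool_info / RNDADDENTROPY interface from linux/random.h. get_hw_random_bytes() is a made-up placeholder for whatever hardware source you actually trust, and the constant fill it does here is only so the sketch compiles and runs.

```c
/* Feed real entropy to the kernel: wait (via select) until the pool runs
 * low, then credit fresh bytes with the RNDADDENTROPY ioctl. Root only. */
#include <fcntl.h>
#include <linux/random.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/select.h>
#include <unistd.h>

#define CHUNK_BYTES 64

/* Placeholder: pretend these bytes came from a hardware RNG you trust
 * (RDRAND, a Cavium card, ...). Never credit entropy for constant data
 * in real life. */
static void get_hw_random_bytes(unsigned char *buf, size_t len)
{
    memset(buf, 0xA5, len);
}

int main(void)
{
    int fd = open("/dev/random", O_RDWR);
    if (fd < 0) { perror("/dev/random"); return 1; }

    struct rand_pool_info *info = malloc(sizeof *info + CHUNK_BYTES);
    if (!info) { close(fd); return 1; }

    for (;;) {
        /* /dev/random becomes writable when the kernel's entropy estimate
         * drops below its write-wakeup threshold, so select() is our
         * "the pool is getting low" alarm. */
        fd_set wfds;
        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);
        if (select(fd + 1, NULL, &wfds, NULL, NULL) < 0) {
            perror("select");
            break;
        }

        get_hw_random_bytes((unsigned char *)info->buf, CHUNK_BYTES);
        info->entropy_count = CHUNK_BYTES * 8;   /* bits we claim to add */
        info->buf_size      = CHUNK_BYTES;       /* bytes in buf[]       */

        /* Unlike writing to /dev/random, this both stirs the pool AND
         * credits the entropy counter. */
        if (ioctl(fd, RNDADDENTROPY, info) < 0) {
            perror("RNDADDENTROPY (are you root?)");
            break;
        }
    }

    free(info);
    close(fd);
    return 0;
}
```

This is essentially the loop that rngd from rng-tools runs for you, feeding a hardware source into the pool whenever the kernel says it's hungry.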
Okay, let's summarize this. Entropy is a measure of information. We went through and talked a little bit about passwords and how to measure entropy, and then we talked a lot about measuring entropy in terms of what you can do with it. The kernel keeps random state in a pool of integers and bounces a counter up and down, and you can watch that counter if you want. For cryptographically secure purposes, use /dev/random is the official line, though you may want to see what else you can do. There are a few other options if you don't have something like Cavium hardware number generators or Intel RDRAND instructions available to you. A friend of mine, I'm blanking on his name right now, Stephan Mueller in Germany, wrote something called CPU Jitter. It was a kernel module, I believe, but the kernel guys wouldn't accept it from him, because they like that little ioctl approach, so he moved it out into user land. Essentially what it does, and I was a bit amazed when he told me this, is look at the jitter between CPU instructions: he knows there's a certain amount of jitter between each instruction as it executes, down in the micro-instructions underneath the Intel architecture, which I don't know that much about, and he's able to generate random noise from that. I never got a good measurement out of him, but I think he claimed megabits per second that he can add into the pool. Caveat emptor on that one; make sure you've done some good work there. Again, I don't think the FIPS guys are going to buy it, but they might. I happen to like the hardware solution, but curiously enough, it doesn't work very well on virtual machines. That's my summary slide, so are there any questions?

I'm not really a hardware guy; my understanding is that a lot of times they'll take, as another example, a source of white noise, so I'm wondering what that source would be; all you just gave us is one, but that's not the only one. So, on an interesting level, the guys at Cloudflare have set up a wall of lava lamps, and they actually have webcams on it, and they claim they're using that as a source of entropy. My particular feeling on that is, even with a wall of lava lamps, you're generating, let's call it kilobits per second, because probably what they're doing is hashing every frame. Well, actually, they can probably do better than that: if you hash every frame, no matter what it is, you're going to get a certain hash size, SHA-256 or SHA-512, but you could also chunk the frame up and hash the chunks. Anyway, do the math and you can figure it out; I don't know exactly what they're doing, and they do have a blog post, but it's been a long time since I read it. And yeah, the Cavium, I don't know exactly how RDRAND works either. Generally speaking, what I've heard is that if you leave something like a radio antenna floating, you're going to get white noise from it, and you can hash that white noise. Now, that seems way too simplistic to me: can I just approach your device broadcasting on a certain frequency and affect your random number generator? That seems a little bit not right to me, but somebody's done all the calculations on this, and it's not me, so hardware engineers may want to chime in on that one. It's thermal? Thermal noise is the answer; you can detect small variations.

Okay, so the question was, if your CPU is at low load, or at high load all the time, would it generate zero entropy because it's always maxed out? I believe it's probably taking a delta each time. I wonder whether, in the thermal case, you'd still have some fluctuation in those thermistors. It's closer to the ground state, so yes, it would have a lot fewer states accessible. Sure, okay; it's good to have a hardware engineer here.

Did I explore any of the functionality of the hardware random number generators? No, I have not explored the TPM chips nor the secure-enclave stuff. I'm interested in both of those, but I haven't had a chance. Right, rng-tools on Linux lets you use that as your hardware seed, yeah. Kind of looping back on that theme, and I hate to keep hitting this, but in a previous gig, when I had to do some FIPS and Common Criteria stuff, I said, okay, I'm going to use some Cavium, it happened to be Cavium at the time, random number generators. They were unsure of it because they hadn't tested it, and so essentially what they told me to do was this. Every
time I put in 2048 bits of what I knew was good entropy, and you saw my slide on the ioctl there, they said divide that by 256 and only credit that much entropy to the pool, just to be safe. So I'm stirring a lot of bits in and crediting very few, and that's how they chose to handle it. Any other questions? Over here, yes.

You're asking whether, when I ran ent, that was well after startup. Yes, that box had been up a while; it was a Unix box with an uptime of a month or so. I don't know for certain, but yeah, that's an interesting question; I don't know if there would be any difference there. Again, you can watch the kernel deplete the pool, and it moves around quite a bit; there are a lot of sources pulling entropy out of the pool.

Yes, there should be an init script that runs on shutdown to save the current state to disk and then restore it on the next boot. If that's missing, then it's a vulnerability in that distribution, and that bug has popped up: one distribution, before it was released, had the bug where it didn't save off that entropy. So the pool should be different on every startup, unless it's a hard shutdown, in which case that shutdown script doesn't run, and in that case I don't know how it refreshes the entropy pool. But even if you have two machines starting up at the same time, if I recall correctly, the kernel includes a lot of timestamp information as part of the mixing, so even if you have two systems running on the same underlying hardware with very similar timings, if they're started at slightly different times they'll end up different. These are high-precision timers, and high-precision timers have jitter in them, and all sorts of things. It's a lot different than in the days, like I said, of Mozilla 3.0, where it was literally catting the date into its random number generator, something terrible like that. Okay, I think I'm out of time, so thank you very much.