We are here for the talk "Wheel of Fortune: Analyzing Embedded OS Random Number Generators" with Jos Wetzels and Ali Abbasi. A short introduction: they're both researchers, and they work with the distributed and embedded systems group at the University of Twente. Jos Wetzels works on hardening systems and is a hands-on teacher for offensive security. Ali Abbasi comes as a researcher from Ruhr University Bochum, where he is with the Chair for System Security. Before that, he was head of the vulnerability analysis and penetration testing group at the Sharif University of Technology in Tehran, Iran. With that, I'll just hand over to both of them. We'll have a great talk, and questions and answers later.
Yes, thank you. OK, thanks. All right, welcome everybody to our talk, Wheel of Fortune: Analyzing Embedded OS Random Number Generators. So, starting with Jos, you want to introduce yourself?
So I'm Jos Wetzels, as was already introduced, and I'm a researcher at the Distributed and Embedded Systems Security Group in Twente. And this is Ali Abbasi.
Yeah, I'm Ali Abbasi, a PhD student at the Distributed and Embedded Systems Security Group of the University of Twente and a visiting researcher at the Chair of Systems Security of Ruhr University Bochum in Germany. So we start with an introduction to embedded OS random number generators today. Then we give an overview of some embedded challenges, both in general and specifically for OS random number generators. We will have some case studies showing how these challenges affect already existing embedded OSes or real-time operating systems. This work is actually ongoing research. It's part of your thesis, which I am supervising. So first of all, embedded systems are now everywhere: consumer electronics, medical devices, critical infrastructure, military equipment, aviation. You really see them everywhere. Besides that, there is a drive for connecting these embedded systems as an Internet of Things.
These devices were originally not designed to be connected to the internet, so now there is a gap which vendors have to fix, and this can cause problems. You can see those problems, for example, with random number generators, which is what we specifically work on. For example, millions of embedded devices use the same hard-coded SSH and TLS keys, which made some noise in the media. So, the OS random number generators. Well, what is randomness? I don't want to get into the philosophical discussion of randomness here, but generally, in a stream of bits, if you cannot predict the next bit with a probability of more than 50%, then let's call it random. Another side of it is entropy, which is very interesting. It's a measure of information, of unpredictability: how unpredictable the information is. Usually, high entropy means higher randomness. And why are these things important? Because random numbers are a fundamental part of the other security ecosystems we have. For example, we use them for cryptography, for keys and nonces, and for exploit mitigations, which maybe you never heard about. But we actually are using them for exploit mitigations such as ASLR or stack-smashing protection. So randomness is critical for the other systems which build upon it. So how can we generate random numbers? Physical, true random number generators use things like radioactive decay or shot noise. You can implement them in two ways: as external devices such as a TPM or an HSM, or as an integrated device within, for example, Intel Ivy Bridge CPUs or certain smart cards. But there are some downsides. For example, they are expensive, or there are portability issues when the vendor tries to move to the next architecture. So because you can't always use that expensive hardware, you use software random number generators.
Well, these are deterministic algorithms which stretch a seed into a sequence of random-looking bits. But the problem is that not all of these random number generators are designed to be used for security purposes. For example, rand(), which, as we are going to show later, is used by some people. It's not designed for security purposes, but some people use it anyway. So because of that, we have secure random number generators, and usually they have three properties. First of all, the output must be indistinguishable from uniform. Second, there must be forward security, which means that in case the internal state of the system is compromised, the past outputs must still appear random. And third, there is backward security, which means that if the internal state is compromised, the future output must again appear random, provided that the reseeding is done with sufficient, good-quality entropy. So designing random number generators is not easy. Because of that, you have certain standards for it, like NIST SP 800-90A, which assumes access to a possibly biased source of seed entropy. But the problem is that this standard leaves some hard problems open: for example, the initial seed entropy, reseed control, or the quality of the entropy source. So some other people thought about it and designed something else, such as Yarrow or Fortuna, which are already implemented in some advanced OSes, such as OS X, FreeBSD, or iOS. So the problem here is actually a chicken-and-egg problem, because to generate random numbers, you need some kind of entropy or randomness, but that randomness again needs another source of randomness. Ideally, you want to use physical phenomena: quantum randomness, such as radioactive decay or shot noise, or non-quantum randomness, such as thermal noise, atmospheric noise, and sensor values.
Well, this is, let's say, not always available. So practically, on general purpose computers, you can use unpredictable system events. One good source is the user itself. So keystroke timing, mouse movement, or disk access can be used in general purpose computers as a source of entropy. And therefore, we believe that secure randomness should be provided as a system service, because it's hard to implement, and of course people will make mistakes if you don't provide it. Many OSes already provide such secure randomness as a system service. For example, you have /dev/urandom in Unix-like systems, or the CryptGenRandom API on Windows. Another important thing is that, again, lots of other security products are built upon them. For example, OpenSSL assumes that there is a secure random number generator service from the operating system. And again, based on OpenSSL, you have OpenSSH and OpenVPN, so it's very, very important. So now I think you'll start with that.
All right, so we've taken a look at the general background of random number generators. Now we'll take a look at why it's so difficult to get these right in the embedded world. The common advice when you need random numbers in the general purpose world is to just use /dev/urandom and draw from there. But it's not as easy in the embedded world because of various design issues, which mean that operating system random number generators in the embedded world are often absent or broken, as we'll explore in this talk. The three main areas of constraints are polyculture, resource constraints, and low-entropy environments. We'll discuss all of them in detail, along with how they relate to particular implementation difficulties. So the first one is that in the embedded world, you have a polyculture of operating systems.
In the general purpose world, you've got Linux, you've got Windows, you've got your Mac OS X. In the embedded world, you have many different kinds of operating systems, ranging from high-capability microkernels to very small monolithic systems, all with different constraints and different capabilities, and catering to different systems. So if you design an operating system random number generator, it's very hard to have it standardized all across the board because of this variety. The same, of course, applies to the hardware spectrum, because you have a wide range of microcontrollers and microprocessors, all with different capabilities. And it's not uncommon to see older or functionally stripped-down versions in newly fielded devices. So if you design a random number generator based on the assumption that you have a hardware random number generator or some physical source of entropy, then your operating system cannot be deployed across a wide range of chips. That's definitely a major design issue. And of course, embedded devices are designed to have a small footprint and to be resource efficient, and this translates to various limitations when designing random number generators. For example, limitations with regard to CPU speed translate to lightweight cryptography requirements. Power consumption limitations, especially for battery-operated devices, mean you need a simple design with limited entropy polling activity. And memory limitations mean you need a small entropy pool and a small internal state in your random number generator in order to implement it on these constrained devices. And of course, there's the issue of the embedded world just generally being very boring. There is little activity going on, and what activity is going on is usually very predictable.
And there's a limitation with regard to common entropy sources. As Ali discussed, in the general purpose world you often use disk activity timings, keyboard events, and mouse events. But in the embedded world, you often have to deal with diskless nodes. You don't have peripherals, you don't have a user, and you don't have hardware random number generators. Even commonly available sources like interrupt request timings are often not that good because they're too periodic. These conditions are usually worse during boot time, because you have predictable boot sequences and there's little activity going on at boot. Some entropy sources you might want to rely on are simply not available yet, because they have to be initialized by the system after the boot sequence. Yet non-blocking interfaces to random number generators, such as the /dev/urandom interface, allow drawing from the random number generator even when insufficient entropy is actually available in the system. This results in something called the boot-time entropy hole. This is particularly bad because a lot of embedded devices generate cryptographic keys on the first system boot. So if you have a system with generally low-entropy conditions and an initial state predetermined in the factory, combined with these poor boot-time conditions, and it then generates a key, this results in very serious cryptographic issues. A common solution for dealing with boot-time entropy in the general purpose world is using so-called seed files. A seed file is basically a file with collected randomness, which the random number generator draws from at startup; when the system shuts down, it writes to the file again. But in the embedded world, it's kind of hard to deploy this solution generally, because how are you gonna deal with diskless nodes?
How are you gonna draw your entropy before a file system is mounted, which is often required? And still, this doesn't solve the first-boot problem. So some common embedded workarounds you encounter are including an initial seed file in the firmware, and this initial seed file obviously had better be unique and unpredictable per firmware image, otherwise it doesn't do you much good. Or using personalization data, such as the MAC address or serial number of a router, as seed entropy, which is also a very bad idea, as shown in the research mentioned at the bottom. Or using other dubious sources of entropy such as clock timings, process IDs, foreign MAC addresses, et cetera. Or simply including hard-coded, pre-generated keys, which is also a bad idea, as shown in the LittleBlackBox project. So now that we have an idea of why it's hard to get embedded operating system random number generators right, we're gonna look at a couple of case studies of operating systems fielded in various embedded...
The cool part.
Yeah, the cool part. Various embedded systems and how they get this wrong. So the first system we'll look at is QNX, which is a Unix-like, POSIX-compliant real-time operating system initially released in 1982 and later acquired by BlackBerry. It's basically the underlying operating system for BlackBerry OS. It's used a lot in automotive systems as well, particularly the entertainment systems. You also encounter it in carrier-grade routers, military radios, and some nuclear power plants. It provides a custom /dev/random implementation which is always non-blocking, so you don't have the blocking versus non-blocking distinction you have in most Unix-like systems. It's implemented as a user space process addressed by a kernel resource manager, because QNX is a microkernel. An interesting thing to note is that the random number generator is always started after boot by a startup script, so that's a thing to keep in mind when designing something for QNX.
We reverse engineered the implementation of the random number generator, and it turned out to be based on Yarrow by Bruce Schneier, John Kelsey, and Niels Ferguson. But it turned out to be based on an older draft implementation of Yarrow and not the reference Yarrow-160 design described in the document which accompanied the paper release. So it only has a single entropy pool, with no fast and slow pools, and no block cipher is applied to the PRNG output at all. The output is drawn directly from the internal state. QNX Yarrow in turn diverges from this older implementation as well, by mixing PRNG output back into the entropy pool and by having some reseed control divergences, which we'll discuss later on. So this is the design of QNX Yarrow, which is relatively simple. You have your boot-time entropy and your runtime entropy, which are drawn into an entropy pool, and then you have the output function, which draws from the entropy pool and also feeds back into the PRNG state. We first tested the randomness quality of the device output using the Dieharder and NIST Statistical Test Suite tools, and it passed both of these test suites. But this only tells us something about the quality of the PRNG output. The source entropy can still be heavily biased, as we'll see later on. After reverse engineering the boot-time gathering routines, we found that it draws from four sources only, and these are static and non-configurable: the system time, the clock cycle count, the currently active process IDs, and the currently active device names. They concatenate these and pull them through the SHA-1 hash function, and the resulting digest is used to initialize the QNX Yarrow initial state. Because these sources sounded kind of dodgy, we decided to evaluate the boot-time entropy, because if it is very biased, it might be feasible for an attacker to replicate the PRNG internal state after a reasonable number of guesses.
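To make the concern about guessable boot-time seeds concrete, here is a minimal sketch. It is not the QNX code: the field layout, integer widths, PID values, device names, and the function names `boot_seed` and `recover_seed` are all hypothetical. Only the overall recipe (concatenate the four sources, hash them with SHA-1) comes from the talk. The sketch assumes the process IDs and device names are identical on every boot, so an attacker only has to enumerate plausible time and cycle-count values.

```python
import hashlib
import struct

# Hypothetical static inputs: in a firmware image, the same processes and
# devices tend to appear in the same order on every boot.
STATIC_PIDS = [1, 2, 4099, 4100]
STATIC_DEVICES = ["/dev/null", "/dev/ser1", "/dev/random"]


def boot_seed(system_time: int, cycle_count: int) -> bytes:
    """Hash the four boot-time sources into a 20-byte initial state."""
    h = hashlib.sha1()
    h.update(struct.pack("<Q", system_time))   # assumed 64-bit little-endian
    h.update(struct.pack("<Q", cycle_count))   # assumed 64-bit little-endian
    for pid in STATIC_PIDS:
        h.update(struct.pack("<I", pid))
    for name in STATIC_DEVICES:
        h.update(name.encode("ascii"))
    return h.digest()


def recover_seed(target: bytes, time_range: range, cycle_range: range):
    """Enumerate the only varying inputs until the derived seed matches."""
    for t in time_range:
        for c in cycle_range:
            if boot_seed(t, c) == target:
                return t, c
    return None
```

In this toy setting, the search space is the product of the two ranges, so the narrower the window of plausible boot times and cycle counts, the fewer guesses are needed. A real attack would need the exact input encoding, which this sketch does not claim to reproduce.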
So our quality measure is the min-entropy, which basically expresses how likely you are to guess a particular value on the first try; 256 bits of uniformly random data have 256 bits of min-entropy. We used the NIST entropy source testing tool to evaluate this data. We collected 50 boot runs by instrumenting the random number generator and logging the raw data that was collected during boot time. The average min-entropy is 0.0276, which isn't good at all, because that means it has far less than one bit of min-entropy per eight bits of raw data. You can see this in the visualization on the right, where the dark spots are low-entropy spots, with a particularly low-entropy spot at the bottom left, at the top left, I mean. We also evaluated the cross-boot entropy, because even if a system has relatively good entropy quality during a single boot run, if the data is consistent across boot runs, that is also behavior you don't want. And it turned out that apart from having less than stellar single-boot entropy gathering, it also had a very consistent and predictable pattern across the 50 boots, as you can see in the visualization on the right. You need to consider that operating systems like these are deployed in firmware images, so processes always spawn in the same order and the same number of processes spawns. So all these process IDs that are fed into the boot-time entropy are usually static, and the same goes for the device names. Really the only randomness here comes from the clock time and the clock cycles, and even there, there is less variation than you would want because of the real-time nature of the system. So this is what the runtime entropy collection engine looks like after reverse engineering. On the left you've got your high-performance clock measurements.
On the top you have system information, which is basically process IDs, thread IDs, flags, all these kinds of process variables, which are fed through SHA-1 into the Yarrow input function. In the bottom left you have your interrupt timing source. In the bottom right there is an undocumented function, which is a callback option to a library, possibly for true random number generator support, but there's nothing in the documentation about it and it's not clear from the code either. So, some thoughts on this runtime entropy. The system information polling has some problems because there's lots of static information. Things like user ID, flags, and priority are not likely to vary at all between different runs, and the stack and program base will only vary if you enable address space layout randomization, which is disabled by default. Time-based or program-state-based randomness is really the only randomness you're gonna get from this source. When it comes to the interrupt timings, one of the big problems is that it really puts the burden on the developer, because the developer has to select which interrupts to draw the entropy from. That means they have to decide whether these are quality sources, whether they are not triggered too periodically, et cetera. But this doesn't really matter, because in almost all QNX versions there is no reseed control. After reversing the binary, we saw that they implemented the functions but never actually call them. This means that runtime entropy is accumulated but never actually mixed back into the state, and boot-time entropy is the only entropy you'll find in the entropy pool of a QNX Yarrow implementation. This is very dangerous, especially if we consider the quality of the boot-time entropy we saw earlier. In the latest version, QNX 6.6, there was an attempt to fix this by integrating some form of reseeding into functions called during initialization and output.
So whenever the PRNG produces output, it reseeds from part of the pool, but an issue is that no entropy estimation is actually done. And this is what Yarrow was initially designed to do: entropy estimation on your entropy pools, so you only reseed when you have proper entropy quality in these pools. But that isn't what it does; it just reseeds all the time. Luckily, we disclosed some of these issues to BlackBerry, and based on our suggestions they drafted a new Fortuna-based PRNG. Fortuna is the successor of Yarrow, for those who don't know. It's available in patches for QNX 6.6 and it will be the default random number generator for the upcoming QNX 7, which should be released, I think, in January or something like that. So this brings us to the next operating system, which we can't name because it was studied under NDA. It's a POSIX-compliant real-time operating system used in highly sensitive systems such as the Joint Strike Fighter, the JTRS military radio system, and the International Space Station. It has a random number generator available via a /dev/urandom interface, and it has two associated functions: U-RandomRead, which fills a buffer with N bytes from its random function, and U-RandomWrite, which reseeds the PRNG using only the first DWORD from the buffer you provided, so that gives you an idea of the quality of this thing. We reverse engineered this as well and took a look at what the underlying PRNG actually was, and it turned out to be the glibc BSD random() function with custom constants. As the documentation clearly states, this is not a secure random number generator, so I don't know why they implemented it there, but it's there.
That's not the worst thing, because we also discovered a local reseed attack. The /dev/urandom device was writable, which means that anyone on the system can force a PRNG reseed regardless of their privileges. So a very low-privileged user can simply write a seed to the random number generator and then control the PRNG output all across the board. So, yeah, that's nice. As if that wasn't enough, we also discovered a known-seed attack, because if you reverse engineer the initialization routines, you see that there's no seeding at all. There's just a static 32-bit seed, which is the same across all deployments of this operating system. It doesn't vary between firmware images; it's just the same sequence over and over again, and there's no actual entropy in the system at all. [inaudible] So, because PRNGs are deterministic functions, an attacker who knows the PRNG seed also knows all corresponding PRNG output consumed by crypto applications. As you can see on the slide here, the SSH key generator simply draws from the same output we saw earlier, produced by this known seed. So consider a remote attacker, not a local attacker, who has a public key generated on this target operating system and who knows the initial PRNG seed. They simply clone the random number generator, seek to the appropriate state offset, read from the random number generator, and generate a corresponding public and private key pair. If the public key matches the target public key, then obviously the private key matches as well; if it doesn't, they iterate to the next state offset. Because these state offsets are determined by how many bytes have been read from the /dev/urandom device, this is bounded by a very reasonable brute-force upper bound, because I don't think more than four gigabytes will have been read from the random number generator before generating your keys.
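The clone, seek, generate, and compare loop described above can be sketched as follows. This is an illustration under loud assumptions, not the attack code from the talk: `ToyPrng` is a simple linear congruential generator standing in for the vendor's generator, `KNOWN_SEED` is a made-up value, and `toy_keypair` replaces real SSH key generation with a hash so the example stays self-contained.

```python
import hashlib

KNOWN_SEED = 0x12345678  # hypothetical stand-in for the static 32-bit seed
KEY_BYTES = 32           # hypothetical amount of PRNG output a key consumes


class ToyPrng:
    """Toy 32-bit LCG; NOT the vendor's generator, only a deterministic stand-in."""

    def __init__(self, seed: int):
        self.state = seed & 0xFFFFFFFF

    def read(self, n: int) -> bytes:
        out = bytearray()
        for _ in range(n):
            self.state = (1103515245 * self.state + 12345) & 0xFFFFFFFF
            out.append((self.state >> 16) & 0xFF)
        return bytes(out)


def toy_keypair(random_bytes: bytes):
    """Stand-in for key generation: private = PRNG bytes, public = their hash."""
    private = random_bytes
    public = hashlib.sha256(private).digest()
    return private, public


def recover_private_key(target_public: bytes, max_offset: int):
    """Try every state offset, i.e. every count of bytes read before keygen."""
    for offset in range(max_offset + 1):
        prng = ToyPrng(KNOWN_SEED)  # clone the generator from the known seed
        prng.read(offset)           # seek to the candidate state offset
        private, public = toy_keypair(prng.read(KEY_BYTES))
        if public == target_public:
            return offset, private
    return None
```

For clarity, the sketch re-seeks from the seed for every candidate offset, which is quadratic in the offset bound; a practical search would advance a single generator incrementally or precompute candidates, as the talk hints.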
So we can even precompute a lot of this. We can't do a live demo, but this is a screenshot of an attack on the SSH daemon of the device itself, where we recover the private key corresponding to the SSH host key within a couple of tries. And yeah, that's basically it for this operating system. You're welcome.
Yeah, and the funny thing is that I don't think they are even going to patch it. So the last case study is VxWorks 6.9. It's a real-time operating system initially released in 1987. It's used in, for example, the Mars Curiosity rover, Apache helicopters, X-47B drones, and lots of telecommunication equipment. Well, VxWorks doesn't provide any secure random number generator, and in libraries such as OpenSSL, wolfSSL, or cryptlib, there is no support for it either. And this has predictable consequences too. If you search the internet for developers asking questions about this on VxWorks, you can see that somebody comes and says, hey, I use the rand() function as a seeding source for, I don't know, OpenSSL. As you remember, we said at the beginning that it's not secure and shouldn't be used for security purposes. And well, yeah. And if you are thinking that these three operating systems, or at least the first two, which had problems and which you were laughing about a lot, are the worst: actually, they are not. Among embedded OSes and real-time operating systems, the majority do not support any secure random number generator, and VxWorks is far from the only one with these problems. But VxWorks is one of the biggest ones, and you would expect them to provide something, which they don't. Do you want to do the takeaways?
Yeah, so what are the takeaways from this talk?
Well, the first one is that the embedded world is harsh, because there are constraints everywhere and low-entropy issues are very serious. It's hard to deal with these in an operating system that seeks deployment across various kinds of chips with different capabilities. CSPRNG design is not a joke, and secure randomness should be provided as an operating system service whenever possible. Please don't put the burden on developers, because they will screw up, as these issues from the mailing lists have shown. More scrutiny is required, because the advice "just use /dev/urandom" should not land the developer in trouble, as it would with these operating systems. And too much of the embedded security world is still unexplored terrain, so we need more offensive research into these embedded operating systems to get them almost up to speed with the general purpose world. If you're looking for more technical details on embedded security, I recommend our talk at USENIX Enigma in 2017. And if you've got any questions, you can ask them now.
Okay, thanks a lot first. For questions and answers, please use the microphones. Just line up behind the microphones and I'll pick you for your questions. I'll start with you.
So this is not a question, this is a nice story that might be replicated.
Please, only questions at the moment.
No, this is about randomness. I talked to an HP Labs engineer from Bristol and he told me the following story. Because they had so much bandwidth on the internet, they decided, well, let's see what happens if we send a random bit stream to /dev/zero in a foreign country. It took just two weeks for GCHQ to knock on their door and ask them what the heck they were doing there. And the random number generator they used was a noise diode.
Okay, thank you. A question from the internet: is radio noise from a software-defined radio a good source of entropy for a random number generator?
I'm sorry, can you repeat the question?
Radio noise that you get from a software-defined radio chip: would that be a good source of entropy?
Yeah, I mean, I guess it depends on your threat model, because is it possible that the attacker controls the radio signals around a device that draws solely from the RF chip?
I mean, a specific frequency range, yes. If the attacker can control a specific frequency, then, well, it's not. It really depends on the case.
It could be good, but it really depends on your threat model. Is it a remote attacker you're dealing with, an attacker with physical access, et cetera?
In one of the slides you mentioned...
A little bit closer to the microphone, please.
In one of the slides you mentioned early-boot, low-entropy attacks. Do you have any best practice recommendations for application developers? Because on Linux, /dev/random blocks. On the BSDs, for example, /dev/random does not block unless it hasn't been seeded yet, and OpenBSD seeds it in the bootloader stage, before even the kernel main function runs. So are application developers supposed to know every operating system?
Well, the problem with the blocking thing, and this came up in communications with BlackBerry as well, is this. If you say, I only provide randomness when I have sufficient entropy in the pool and I'm sure it is high quality, that is not as easy to do in the embedded world. It means that, for example, if you need some source of secure randomness during boot and you don't have enough entropy, then boot times get really slow, and especially if you have devices under real-time constraints, this is an engineering hurdle right there. So I'd say a paper was published, I think, at Security and Privacy 2013; I think it was called something like "Welcome to the Entropics" or something like that.
And there were some good best practice recommendations for the embedded world there, mainly in the form of seed files. But as we mentioned already, if you have diskless nodes, that doesn't work, so it's kind of an open problem to design a really good embedded operating system random number generator all across the board.
Second microphone back there, please.
Thank you. You mentioned the NDA you have with the vendor. What is the nature of your relationship with this vendor? How did you end up with an NDA with them? And how are you so sure that they're not gonna fix it?
[inaudible] I can just tell you that it took one year for them to get back to us, because, well, they claim security via obscurity. To be honest, that's how it is with lots of real-time operating systems. They are living in 1998.
So it's an academic effort, not a consulting assignment?
Okay, thank you. Okay, then you, please.
Thanks for the cool talk. First question. ARMv8 relies on UEFI random for initializing kernel address space layout randomization. Do you think that relying on the bootloader is better for an operating system than trying to solve the randomness problem with ugly means? I mean, don't make random number generation the responsibility of the operating system, but put it in the bootloader.
Yeah, I mean, I guess that depends on your design model. For example, that would require you to have a bootloader that's standardized all across the board wherever you want to deploy it. If there's no bootloader that's suitable for the kind of embedded systems you want, but you do want an operating system and it doesn't have a random number generator, then you're left without a random number generator. So it's one of the biggest problems, and this is where the polyculture argument comes in. In the general purpose world, you can make some assumptions: the hardware roughly looks like this and roughly has these capabilities, and the same goes for the software.
But in the embedded world, there is so much diversity that it's probably better to optionally include functionality all across the software stack than to simply say, oh, we assume it's there in the bootloader, and then it turns out that it's not there in the bootloader.
Okay, thanks. Another idea: can we rely on the manufacturer, who makes a small device without random number sources, to feed the device with random input during manufacturing? I mean, attach the device, before selling it, to some expensive random number generator, get the random input, and then go.
Yeah, that's essentially a seed file, and it's one of the things I mentioned in the slides. If you include a seed file with your firmware image and you make sure it's random per firmware image, then that might be a workaround. But that depends on how well the vendor understands what they're doing. A case in point would be the Western Digital self-encrypting drives, where they actually generated keys based on the libc rand() function. So.
Or VxWorks.
Yeah, that kind of stuff. So it could be a solution, but it depends on the vendor whether they do it well.
Thanks. Thank you.
Okay, we have time for one short last question.
Okay, you have three pretty bad examples here. Do you have a broader view of the field? Is it a general problem that they are all this bad, or did you just pick the three worst ones that you stumbled upon? Are there some operating systems that do it right, whatever right might be?
No, actually these are the best. You got the best.
Okay, thank you very much. A warm round of applause for them again, please.