Okay, wait, I haven't given the talk yet. And I have a tendency to talk too long, so I have taken measures, preventative measures. Unfortunately, I am as sick as a dog, but I have a background in theater, and as we say, the show must go on. So I'll give it my best shot here. I hope you can understand me, though. Hello? Hello? Yeah, it's okay? Okay. I'm Andy Tanenbaum, and I'm going to talk about Minix 3. I'm sort of running the project, but there's a whole bunch of other people too, and I didn't have enough space to list everybody, so take my word for it: there's a whole bunch of other people. Okay, let me start with a brief history of Minix. It's got a somewhat strange history, and it's confused a lot of people. In 1975, Bell Labs released Unix Version 6. It was the first one that got out of Bell Labs, and it was a huge success. John Lions, a professor at the University of New South Wales, wrote a little book describing Version 6, and it's become a real classic. People taught university courses from it for a number of years. And then AT&T, in its wisdom, when it released Version 7 in 1979, put a clause in the contract saying, thou shalt not teach this in courses: it's our product, we want to keep it secret, we don't want anybody to know about it. So that was a brilliant move by AT&T's marketing department. That's why Unix rules the world now. In 1984, I was teaching an operating systems course, and since we were forbidden by contract from teaching Unix, I decided to write a clone of it myself, which was probably a crazy idea, but I did. In 1987, I actually finished it, released it, and wrote a book about it. And so it got out there; it was mostly for teaching purposes, to get around the stupid contract from AT&T. Then there was a second edition of the book 10 years later, and then we got interested in doing research on reliable operating systems.
In about 2004, another version of the book came out, and then in 2008 everything sort of changed: I got a grant from the European Union, from the European Research Council. A couple of words about the grant. The grant was two and a half million euro, okay? So that's a fair amount of money. And the goal was to develop a highly reliable operating system. So apparently they think that reliability is lacking now, if they're willing to give me two and a half million euro to try to make one. They at least thought this was worth some real money. In 2009, there was a discussion within the EU, which didn't come to fruition, about having software fall under the standard product liability laws. If you're a tire manufacturer and one tire in 10 million explodes, you can't say, tires explode sometimes, that's the way it is, get used to it. That doesn't work, okay? It's got to actually work all the time. With software, if it doesn't work, people say, it doesn't work, that's the way it is. If software fell under liability laws like everything else does, all of a sudden companies would be liable for selling software that doesn't work. That really changes the situation. And I've been interviewed a number of times where people say, what difference does it make? I say, suppose they go to Microsoft and say, by law your software has to work, and they say, it can't be done; you can't make a law saying pigs can fly. But if the EU could say, well, Tanenbaum did it, why don't you do it that way? Then they would have to say, well, we don't want to. And that's not as strong as, it can't be done. So that's sort of the direction we're going. A couple of words about software reliability. Hackers, like all of us, have very much the view: if God had wanted software to be reliable, he wouldn't have created reset buttons, okay?
Your grandma, however, thinks: why isn't it like a TV? You buy it, you plug it in, and it works perfectly for the next 10 years. Why aren't computers and software like TVs? We all know why: because the software is changing every 45, 90 seconds, and so on. But a lot of people kind of want it to be like that. And I think, if your computer crashes once every three months and you hit the reset button, you think it's really reliable. Suppose your car stopped working once every three months, at random, for no particular reason, and you knew from experience: you get out of the car, take the key out, open the hood, close the hood again, get back in the car, and it works. Would you say, no big deal? I think most people wouldn't accept that from their car, even if the car manufacturer said, what's the big deal, it costs you 10 seconds every three months. But we're used to software not working; grandma isn't. So the question is, can we make software that grandma actually might think is pretty good? I think the main cause of the problems is bloat. There's all that code, and it's getting bigger and bigger. Windows and Linux and everything else, they're growing like crazy. I saw an article in PC Pro magazine just a couple of months ago, and the headline is: Linus Torvalds says Linux is bloated and huge. This is Linus saying this, not Bill Gates, okay? If Bill Gates had said Linux is bloated and huge, okay. If Linus had said Windows is bloated and huge, okay. But even Linus thinks that Linux has got out of hand. There's certainly nobody who understands all of Windows. Maybe Linus understands all of the Linux kernel, maybe. But things have just gotten, I think, out of hand.
So I have the feeling that there's a need to rethink operating systems, and the research that we're doing is in that direction. We have basically infinite hardware now. My little notebook here has 5,000 times the computing power of the PDP-11 I sort of started with. It's smaller too, not to mention a small fraction of the price, and it has 1,000 times more memory and a disk 1,000 times bigger. Nevertheless, booting the notebook takes about four minutes, and booting the PDP-11 took five seconds, on a machine 5,000 times slower. What's wrong with this picture? There are infinite cycles, infinite RAM, infinite bandwidth, and infinite bloated, useless, crappy software. And so, with all this bad software, to become more like a TV, I think future operating systems have to become smaller, simpler, and more modular, which is very important. Modularity is how people have built things more complicated than operating systems. An aircraft carrier is much more complicated than an operating system, but aircraft carriers are very, very modular. The designers understand that. If a toilet gets clogged on the aircraft carrier, it doesn't begin firing missiles, okay? Because the toilet system and the missile system are separated very carefully. They understand you don't want problems in one part spilling over into another part. And likewise, if an incoming missile is detected, the toilets don't start flushing. They really understand about building modular systems in most other fields, and we don't. And we have to make it reliable and secure, and I think self-healing is a major part of this: it has to fix itself, because people don't understand it anymore. So that's where our stuff is going. I'm not talking about intelligent design, at least as applied to operating systems. I'm well known for espousing microkernels, and I still believe in that. Ours is about 6,000 lines.
L4, I think, is about 10,000 lines. There's QNX, and industrial systems, you know, Green Hills and PikeOS; there are lots of industrial systems in avionics and automotive that are microkernels. And they're all on the order of five or 10,000 lines of code, versus 6 million for Linux. And I think Windows is probably above 100 million now, all together. There have been a lot of studies about how many bugs there are per line of code. As a hacker, if you find a bug, you just fix it; done. In industry, that's not the way it works. Many companies have careful bug-tracking systems: if you find a bug, you have to report it using some automated bug-reporting system, so they get a log of how many bugs they found. And the experience is about five to 10 bugs per thousand lines of code in industrial projects where they're really careful about quality control. There's a study of FreeBSD, which has very, very good quality control: three bugs per thousand lines of code, and that's considered very, very good. Okay, at that rate, Minix would have maybe 18 bugs in the kernel, and Linux would have 18,000. Now, of course, not all the bugs are fatal; some are maybe spelling errors in messages and minor stuff. But if you've got 18,000 errors, there are going to be some that are important, even though many aren't. Also, there was a study at Stanford of the Linux drivers, and they have about three to seven times more bugs than the rest of the kernel. Because everybody wants to look at the virtual memory algorithm; that's really cool and neat and fun. But nobody wants to look at the driver for some obscure printer, okay? Because it's no fun at all. It's a real mess; it's a hundred pages of yuck, okay? And unfortunately, 70% of the code is the drivers, and they have an error rate three to seven times higher. That's where the problems come in. In Windows, it's known that 85% of all crashes are due to drivers not written by Microsoft.
Microsoft gets the blame for it, but it's some random guy in Taiwan who was in a big hurry to get his driver out and wasn't very careful, and that's where the trouble comes from. So in a modular design, with walls around the pieces, you've got a chance to do something about this. So our OS runs as multiple user-level processes; the operating system runs basically in user mode. Here's the basic architecture of Minix 3. The microkernel is at the bottom. It's about 6,000 lines of code, and it handles the interrupts, a very basic notion of a process, scheduling, IPC, the clock, and this thing called the system task, though we've already moved that out of the kernel. So basically it's just handling the interrupts and the inter-process communication and a very primitive notion of a process. It doesn't really manage the processes: somebody tells it, there's a process, go run it, and it runs it without really understanding what it's doing. One level up, in user mode, are the driver processes, and each one is a separate process. So the disk driver is running as a user-mode process, in protected mode with the MMU turned on, all by itself as a non-privileged process; and the terminal and the network and the printer and all these things run as separate user-mode processes with relatively little power. On top of that is another layer of user-mode processes with the file systems and the process manager and the virtual memory manager and all that stuff, again running as user-mode processes. And the top layer is the regular user programs. From the kernel's point of view, they're all just user processes. The file system is no different from make or the shell; it's just another user process. In all Unix systems, the shell is just another user program, and there are many shells: the Bourne shell and the Bourne Again shell and ksh and the C shell, a whole bunch of shells. In Windows and many other systems, the shell is sort of built into the operating system.
It's hard for those people to understand how the shell could be a user program, and in the same way, for many Unix people, how could the file system be a user program? But it is, and the virtual memory manager is a user program, and so on. That's sort of the basic design. It makes it very modular, and it has a number of interesting properties. First of all, the kernel has some calls that these drivers and servers can make, and these calls are different from the POSIX calls; they're internal calls for the benefit of drivers and servers. We also have the POSIX interface for user programs, but the kernel calls are low-level things. For example, a user process can't read or write an I/O port; it has no access to the I/O system. If you want to read or write an I/O port, you've got to ask the kernel: hey, here's a bunch of I/O ports, go read them for me. And it'll check to see if you're authorized to do that, and if you are, it'll go read them for you, and so on. There's stuff for setting interrupt vectors, and there are calls for copying between address spaces. The file server runs in its own address space; it can't touch anybody else's address space. So if somebody says, give me a block, it can't actually deliver the block, because it doesn't have the authority to get outside its own address space. It has to ask the kernel: go give him the block. There are all kinds of checks, which I'll describe later. There's DMA mapping, assigning memory maps, setting up privileges, and all these things that are internal kernel calls, about 35 or so of them, for the benefit of the drivers and the servers. Everything uses the principle of least authority. They're running as user-mode processes, and everybody's got very carefully granted powers, so they can't just do anything. For example, nobody can execute privileged instructions, and they can't touch the privileged registers. They're time-sliced, so if one of these things gets into an infinite loop, it doesn't hang the system.
It simply wastes a certain fraction of the time, but not all of it. None of them can touch kernel memory, so they can't corrupt kernel memory due to a bug. They can't touch other address spaces. There's a bitmap per kernel call saying which kernel calls you're allowed to make. So if you're an audio driver, for example, there's no reason for you to fork, and you can't make the kernel call that is, sort of, the primitive for fork. There's also a bitmap saying who you can send to. So again, if you're the audio driver, you've got no business talking to the network driver, and you can't: you'll try to send to the network driver, and the kernel will return an error saying you're not authorized. There's no direct I/O; you can't touch I/O ports. So the disk driver can't get to the disk. It has to ask the kernel. It's a humbling experience for the disk driver, not being able to touch the disk, but that's life. It says, can I please write on the disk? And the kernel says yes. But if you're the audio driver and you say, can I write on the disk? The answer is no. The consequence is that if, for example, a virus or something should get into your system and take over the audio driver, it can make really weird noises, but it can't take over the disk, because when it tries to use the disk, it's told: sorry, no permission. So there's a lot of security value in this too. The drivers run in user mode, as separate processes. They don't have any kind of superuser power. The MMU is turned on, so they're confined to their own address spaces. They don't have access to the I/O ports. To do anything, they've got to ask the kernel, which checks. To copy to other address spaces, they've got to ask. They're really not very powerful, which is intentional. Then there's a bunch of user-mode servers. There's one or more file servers, a process manager, a virtual memory server, a data store, an information server, a network server. The X server, of course, is always a user process.
And the reincarnation server, and I'll talk a bit more about these things in a couple of minutes. Let's start with the file server; that's one of the more interesting ones. Here's a picture of the file server. So here's a user program, and it wants to do a read. It doesn't actually read: read is a library routine, and the library routine sends a message to the file server saying, I want to read 512 bytes from file descriptor six, or whatever. And here's the file system's little cache, these little colored thingies. If we're lucky, the file system will have the block in the cache, and it'll call the system task in the kernel to say, please copy the block to the user's space, because it's not allowed to do that itself. The system task will say, okay, done. And then the file system replies to the user saying, read completed, with an error code or no error code. So there are like four messages needed to do this. You might ask, how long does a message take? It's about, I don't know, 500 nanoseconds, that order of magnitude. So there's a little bit of overhead, but it's not immense. Let's look at a more complicated case, when the block isn't in the cache. The user sends the read request to the file system. The file system looks in the cache, can't find the block, and sends a message to the disk driver saying, go get the block. The disk driver sends a message to the system task saying, here are some I/O ports, please write these values to them so we get the block. Then it waits. Then there's a notification from the hardware to the disk driver, which is like a message, which says, basically, okay, the disk has finished some activity, it wants something from you. And so then it does some reads from the I/O ports and finds out what the status is. And if all is well, the disk driver notifies the file system saying, I did the I/O for you, and then the file system calls the system task to copy the block to user space, and it copies it to user space, and then it tells the user it's done.
So there are like nine messages in here, at 500 nanoseconds each. So we're talking about burning up four and a half microseconds in this kind of overhead, plus a bit more. But we've actually read a block from the disk, and that takes several milliseconds. So another five microseconds, more or less, really isn't a central issue here. The process manager manages processes. It contains the logic for starting and terminating processes; I think signals are in there too, they had to go somewhere. But the basic logic for keeping track of things is in this process called the process manager. The virtual memory manager has the logic of virtual memory, but not the mechanism. The mechanism is in the kernel: the virtual memory manager says to the kernel, here is the memory map for this process, and the kernel just says, yes sir, and takes it. If the map is stupid, the process will crash; the kernel doesn't know, it just follows instructions. All the logic for keeping track of the pages and setting up the maps and so on, that's all done in this user-level virtual memory manager. So it knows where the free pages are; it knows who's got what. Page faults get redirected to it, and it figures out what to do about the page fault. So all the smarts and all the algorithms are in the virtual memory manager, as a user process. When it's all done, it builds the appropriate page map and just tells the kernel, here's the page map for process number six. The data store is a little name server where you can say, here, save this, and then ask for it back later. And there are reasons for that, which I'll come to in a minute. It can be used, and in fact is being used, for recoverable drivers, okay? Very briefly, what that means is this. If you want the system to be reliable and able to survive a failure in a driver, well, some of the drivers have a small amount of state and some a large amount of state. For example, an audio driver knows what the volume levels are for the audio devices.
And it stores those in the data store. If the audio driver ever crashes, it's replaced by a new audio driver, which goes to the data store and says, give me my data. It gets back the audio levels, and the new driver can set them to whatever the old driver had set them to. The information server is basically for debugging: all the function keys in Minix display various debugging dumps, so you can see what's going on. There's a network server, a complete TCP/IP stack; it runs entirely in user space, the whole thing. Then there's the reincarnation server, which is kind of a fun thing that most systems don't have. The reincarnation server is about reviving the dead. It's the parent of all the drivers and the servers. Whenever a driver or a server dies, the reincarnation server collects it and sort of fixes things up. There's a table it looks at that tells it what to do. It also pings the drivers and the servers frequently. So the reincarnation server will go to a driver and say, hey, disk driver, how are you doing? And the disk driver will say, fantastic, I did 57 megabytes a second in the last second, the disk is really humming, we're doing great. And then three seconds later it'll ping the disk driver again: how are you doing? No answer. And the reincarnation server says, hmm, not good. Disk driver, how are you doing? Okay, I'll give you one more chance: either you answer or you're toast. You get three chances, sorry about that. Okay, I'm killing you. So it kills its child, goes and fetches a new one, and starts it up again. The new one goes to the data store and says, do I have any state? (We haven't really got this working for complicated state, but we have it for simple state.) This driver doesn't have any state, so it's easy: it just starts up a new one.
The new one tries to patch up the pieces, and depending on the nature of the driver, you may or may not mess up one or more processes, but you don't take down the whole system. In most cases, you don't mess up anything. In some cases it's tricky; that's what the research is about. For example, the file server knows it gave a command to the disk driver and didn't get an answer, so we have to deal with that, and I'll show that in a second. So here's the recovery, for example, for disk drivers. The user sends a message to the file server saying, read from a file, okay? And the file server sends a message to the disk driver saying, read this block, and the disk driver now crashes. Okay, so what happens next? The reincarnation server detects this, because the driver doesn't answer its pings, and it fetches a new driver. You might ask, how does the reincarnation server fetch the disk driver from the disk when there's no disk driver? And the answer is that it has a shadow copy of the disk driver in memory all the time. So it can always get to the root device from its memory copy, and once you've got a working root device, you can keep all the other drivers on the root device, so you can always fetch the rest of them. So even a failure of the disk driver isn't fatal. Then a message is sent to the file server saying, I have some bad news for you: your friend the disk driver has passed away, we're really sorry about that. The file server says, all right, I wonder if there's a new one. So it goes to the data store and says, by any chance, is there a disk driver? And the data store says, you're in luck, a new one just appeared a couple of microseconds ago; here's its address, go for it. And the file system says, oh, new disk driver, I'm trying to write a block here, do you think you could do it for me? And the disk driver says, I'd love to write blocks; it's my favorite activity in the whole world, besides reading blocks.
So yes, I'll do that, and you're back in business. And the user process doesn't notice any of this; it's all transparent. So you're replacing parts of the operating system on the fly, while it's running, without the user process being disturbed. That's sort of the whole idea. Now, we're not finished with all of this yet; we can only do the simple parts now. We're working on the file server going down, with all that state; that's much more complicated. We're not really dealing with stateful things yet, except for small amounts of state, like the audio driver's. But this is the basic idea of where we're going. So the system is self-healing: it detects its own problems, and it can fix itself, to some extent. What about crashes of other drivers? Well, Ethernet is very easy, because Ethernet is unreliable to start with; it's best-effort. If the driver goes down, you'll lose some packets, but nobody ever guaranteed that packets were going to be delivered. So for the higher-level protocols: if it was a UDP packet you lost, well, nobody ever guaranteed that UDP packets were going to be delivered, so nothing's wrong. If it's a TCP packet you lost, TCP itself times out and tries again. So that's really easy. The printer driver goes down: you don't know where you were, you're sort of halfway in the middle of a file or something. So all you can do is have the printer daemon detect the problem and just print the file again. There's really no way out; it's theoretically impossible to do better. The audio driver doesn't know where it was either, so it could play the song again. So sometimes you can recover completely gracefully, and sometimes it's hard to do. But still, if the audio driver goes down and the result is a little glitch and then the song plays again from the beginning, that's better than an operating system crash. No, it's not totally transparent; it's very hard to make it totally transparent. But we're trying to get as close as we reasonably can. The kernel has many reliability and security features.
Less code means fewer bugs. If you're talking three bugs per thousand lines of code, and there are thousands of lines of code less, there are going to be fewer bugs. It also means the trusted computing base, the stuff that actually has to work, is much smaller than in conventional systems. There's no foreign code in the kernel. If you go out with your Linux or Windows or whatever PC and you buy a FireWire card, it comes with a CD-ROM. You put the CD-ROM in the drive, and guess what happens? Random code written by somebody in Taiwan gets installed in your kernel. It may or may not work. The kid may or may not have been in a hurry. Management may or may not have cared about the quality of any of this stuff. And there is this foreign code sitting in your kernel now. You really don't want that; it's a very, very bad design. In the Minix design, you have a new user process started with the FireWire code in it, which might or might not work; but if it didn't work, it wouldn't take the system down, it would just take FireWire down. It's the modularity principle, and I think it's in principle a much better design. We also have static data structures. RAM is cheap, so there's no malloc in the kernel; the tables are all fixed size. The number of processes you can have is fixed by a constant in some header file; I think it's 256 processes. If you ever went above that, it would say, table full. But that means no malloc, no free, and no memory leaks. So we waste a little bit of RAM by over-dimensioning various tables. In this day and age, when you can't buy a computer with less than a gigabyte of memory, wasting a couple of thousand bytes on bigger tables is really not a bad trade-off. Of course, moving bugs to user space doesn't reduce the number of bugs; it's the same amount of code. It just makes them less powerful: a bug in a user-space process can do less damage. So we have the same number of bugs as everybody else does.
But these bugs are sort of emasculated; those castrated bugs up there can't do very much. IPC, reliability and security. We have fixed-length messages, although we're having to rethink that a little bit now; so there are no buffer overruns. Initially, we had a rendezvous system: you sent somebody a message, and he did a receive; the message was transferred, and everybody was happy. We're having to move away from that, for reasons I'll give in a second. It had no lost messages and no buffer management; it was a very simple scheme. Interrupts and messages are unified: if an interrupt happens, it's turned into a message by the low-level kernel, and the driver just gets a message saying, message from this hardware. And then it's up to the driver to figure out what to do next. There's one reliability problem we've run into. The client sends a message to the server, and the server tries to respond, but the client has died. That hangs the server, because it can't do the reply. And so we're having to go over to asynchronous messages a little bit, which I don't like, but it's really necessary to avoid hanging servers. Everybody worries about sick servers, but we also worry about sick clients, and so we may have to go over to that. The drivers are untrusted: we regard drivers as untrusted code. In every other system, the drivers are regarded as trusted code, and experience shows that we're right: it is untrusted code. And bugs and viruses and such can't spread from module to module easily, because the modules are so isolated. None of the driver or operating system processes can touch the kernel data structures, so they can't mess them up. If a bad pointer occurs, and in C, bad pointer errors are very common, it wipes out one driver: it crashes. Its parent, the reincarnation server, is informed via a signal that one of its children has died. It says, oh, the disk driver has died; let me look up in the table what I'm supposed to do. And it runs a shell script.
The shell script is likely to log the event somewhere, possibly take the corpse, the a.out, the core dump, and save it somewhere for future debugging. It might send an email to an administrator; you can even set it up to send an email to the manufacturer. There are all kinds of things you can do, since it's running a shell script. When it's all done, it starts the driver again. If a server gets into an infinite loop, then when the reincarnation server asks how it's doing, there's no answer, because it's in this infinite loop. And after three, or whatever the constant is, tries, it kills it and starts a new one. And so we can survive infinite loops inside critical components of the operating system. Nobody else can do that. And again, because these things do not run as superuser, just as regular user processes running at the lowest possible privilege level, they can't do a lot of damage if something goes wrong. Okay? Let me describe memory grants, which are an important concept that we have. The file server and some other processes need to access memory in other processes. If you say to the file server, read a block, it's got to write the block into your address space. But it can't touch your address space, because it's just a humble user process. So how do we do this? Every process that wants to have somebody else write into its address space builds a memory grant table and makes a kernel call saying, here is my memory grant table, here's the starting address, and it's got so many entries in it. So the kernel knows where everybody's memory grant table is. And the process can make an entry in its memory grant table saying: the disk driver, process nine, may write in my memory from byte 1400 to 1499. It puts in the exact range, accurate to the byte; there's no page alignment required. Okay?
Then, when the user wants to talk to the file server, or the file server wants to talk to the disk driver, it passes an index, saying: you have the authority to write in my address space using memory grant number one. And when the disk driver wants to write into the file server's address space, it says to the kernel, I want to use this memory grant to write into his address space. The kernel checks the table to see: is this guy authorized, and for which bytes? And if so, the kernel does the copy, accurate to the byte. So we can protect memory down to byte granularity, by having the kernel do the copying instead of the other guy. Now, this introduces a little bit of overhead, that's true, but it protects the address space: you can say to somebody, you can only write in this little piece of my memory and no more. And after the operation is completed, normally you erase the grant, so it can't be used again. So this protects memory down to byte-level granularity. Fault injection. We ran experiments injecting faults into drivers, to see how good the system was. There's a program you can get which generates faults in binary code to test it, and we injected 800,000 faults into each of three different (I think it was Ethernet) drivers. This fault injection program runs in real time, as another process, using ptrace-like calls. It generates bugs in your code. It doesn't just generate junk: it analyzes the code, understands the machine architecture, and then does things like, if you said move one register to another register, it will swap them. So it simulates the error where you wrote i = j and you meant j = i. It's simulating semantic errors that programmers might make. It looks for conditional branches and changes the condition: it finds a branch-less-than and turns it into a branch-less-than-or-equal.
This simulates the error of writing for (i = 0; i <= n; i++) when you meant for (i = 0; i < n; i++). A typical error programmers make. So this program generates errors that are common programming errors, and we injected like 800,000 faults into these things. The way we did it was we injected 100 at a time. So we took the driver, put 100 faults in it, changed 100 pieces of code, and we waited one second to see if it crashed. Many times it didn't crash, because it never executed the code we messed up in that one second, or the way we messed it up wasn't really important. If it didn't crash, we injected another 100 faults, and repeated the process until it crashed. Well, we eventually got 18,000 crashes out of this, but the operating system never crashed, despite running two and a half million trials. We found a lot of things. We found bad hardware sometimes in the 18,000 crashes. We found ways to hang the PCI bus that couldn't be recovered from other than pulling the plug out. We found all kinds of wonderful things, mostly crappy hardware, but the operating system never crashed in all these trials. So we think it's relatively robust, because we crashed the drivers all the time, but not the operating system. Okay?

So that's sort of the basic technical story. There are a lot of academic projects where they build a little tiny kernel and it does whatever, but it only works when the grad student is there, and it can't actually run anything, because it's too much work to do all the rest of the stuff. But we've actually made a real effort, maybe too much of an effort, to make it an actual, sort of usable Unix system. It's not state-of-the-art. It's not nearly as fancy as Linux or BSD or any of those things, but it's a usable, recognizable Unix system, so you could actually believe it could work. For the screen, there's X11, and we have a simple desktop program, EDE.
We don't have kernel threads yet, though it's on our agenda, so we can't run things like GNOME and KDE, but EDE gives you a desktop if you want. It's our experience, though, that a lot of programmers just want X11 and a bunch of xterms, and we have that. We have a bunch of shells: Bash, the public domain Korn shell, the Z shell, and I don't know, there are other shells. We've got compilers for C and C++ and Python and Perl and PHP and a number of the other standard languages. We actually have two C compilers. We have GCC, which is a big pain, and we have ACK, our own compiler, which is not as fancy and only implements ANSI standard C. I wish GCC implemented ANSI standard C, but it doesn't. And it's much, much faster. We can build the entire operating system, all of Minix, the kernel, all the drivers in user space, all the servers, in about eight or nine seconds with ACK. So that's a relatively fast build: 125 compiles and 11 links in less than 10 seconds. Editors: we have Emacs, we have nvi, Vim, Vile, NEdit, you know, standard editors. Photos: we have ImageMagick, the JPEG package, xv. Utilities: we have all of the Version 7, you know, standard real Unix utilities. We also have all of the GNU utilities, and we also have the BSD utilities. We put them in different directories. So if you put /usr/gnu as the first thing in your path, then it'll find the GNU utilities first, and you're getting GNU cp and GNU ls and GNU everything. If you put the BSD directory at the front of your path, then you get the BSD utilities as first choice. If you don't put either of these, you get our utilities, the Version 7 utilities, as the first choice. So you can sort of choose which environment you want by setting your path with the appropriate directory at the beginning. For the web, we have some stuff: Apache, Dillo, Links, Lynx.
We don't have Firefox. We'd like that, but I think it needs kernel threads, and it's a very big, hairy program, and I don't think it's very portable, but we'd like to have it. If somebody wants to port Firefox, that'd be wonderful. We'd really love it. Mail programs: Pine, PopTart, Xim, Sim, and so on. Databases: Postgres, SQLite, the MySQL client; we don't have the server. QEMU, which is nice: you can run an emulation on it. You can run Windows 98 on top of Minix, and it runs at the same speed Windows 98 ran at on the hardware. We can't run Windows 7; it's just too big and complicated. It barely runs on current hardware. But Windows 98 ran on 60 megahertz Pentiums, and QEMU slows you down by a factor of 10, so a modern machine is like a 300 megahertz Pentium II, and Windows 98 runs perfectly on that. So it's running at a better speed than Windows 98 ran at on the hardware it was designed for. Then we have MPlayer and NetHack, and we have Subversion, and there are about 600 programs available for it. So it's not a full-blown system with thousands of packages, but it's enough to prove the point that it's doable and you could actually use it.

Does anybody care? Does anybody use this stuff? Well, we log the traffic to the website, minix3.org. Here are the number of visits per month we've had over the last year. It's running about 25,000 visits a month. We've had 1.6 million visits since the site went up at the end of 2005, about four years ago. Compared to Linux, that probably isn't very much. But for those of you who work on other open source or academic projects, 1.6 million visits is probably a fair number compared to other comparable open source, low-budget projects. Downloads per month: it looks like we're running 12,000 downloads a month for the last year. So we've had 610,000 downloads in about four years. Again, compared to Linux this is nothing, but compared to other open source projects of other kinds, it's 600,000 downloads.
There is some interest out there in the world. The current team is: me. I've got five PhD students working on this. There's one postdoc. I've got two student assistants. I've got three paid full-time programmers. I've got a couple of master's students doing work on it. We had four Google Summer of Code students last year, and there are various volunteers around the world working on all kinds of things. So that's sort of the group.

This is an ad, which is why I came here, of course. We're always looking for help to work on new things. Volunteers, for example, to port programs. Porting programs, I don't know. Everybody here has probably taken a course in software engineering somewhere along the line, and all of you forgot it the day after the exam. It's just amazing how much free software is so crappy. It's not portable. You run ./configure and it always fails. Always. What happens? It was looking for, I don't know, Perl 5.2.3.26.9.4b, and if that wasn't there, it gives up, even though the application you're trying to install doesn't use Perl at all. These configure scripts of 20,000 lines of incomprehensible code, generated by a program which itself was generated by a program: it's just not the way to do things, when the application in question didn't even need this facility. So, porting programs: typically it's one or two lines that have to be removed, but it takes some effort to figure them out, so that's a problem. Porting libraries of all kinds would be useful. We have a wiki; adding documentation to the wiki would be fantastic. All kinds of things are not well documented. Translating the wiki into other languages is very welcome. And now the real thing. I actually have money from this EU grant to hire more people. I spent most of the money on the PhD students and so on, but I'm looking for a fourth full-time paid programmer.
So if you're looking for a job, or you have a job now you don't like, you're doing web design for a company making bathroom equipment or something and you don't really like the job much, and you'd rather do kernel hacking and be paid for it, this is your chance, so get ahold of me. I'm also looking for a fifth programmer for another related project, so basically I have two openings for paid programmers in Amsterdam. If you're interested, try to see me later, or at least email me your CV. Just type my name into Google; you can find my homepage pretty easily. So you can do all this hacking as a full-time paid job, Amsterdam's a lovely city, and so on. There's more on my homepage; just Google me and find the homepage, okay?

What's the current work? Work on live update. We want to replace pieces of the system while it's running. If you can replace a component after a crash, doing it intentionally is easier in some sense; in some ways it's harder. In most operating systems, if you're using version 6.2.4.13 and there's now a .14 that's come out, they tell you: reboot the system. We don't want to reboot the system; we want to go over to the new version while it's still running and while all the processes are running. Say your TV used to be hardware, and now your TV is software: it's got a little box with 40 megabytes of code running in there, and it's the Super Bowl and it's the last couple of minutes, and you get this message, "rebooting", and it takes seven minutes and it reboots with the new version of the operating system, and you've missed the final couple of goals. You really want to replace the system while it's running. Banks don't like going down, and so on. So we think it's important to be able to update the system, component by component, while it's running and while things are going on. During the update process it might slow down a little bit for a couple of seconds, but you don't have to have a reboot.
Multi-core has become very common. Now, we have a multi-server operating system: the operating system itself runs as many processes. Suppose you had a lot of cores and you had a lot of processes. Well, it might occur to you: gee, we have a lot of processes and we have a lot of cores, can't we sort of match those together somehow, like run each process on its own core, for example? It's not so easy, but we're looking at that whole area of multi-server meets multi-core. We're rethinking the file system. The current file system is basically the MULTICS file system from 1965. The only major change really is that in MULTICS they used the greater-than sign as the separator, and now we use the slash, and Windows uses the backslash, but other than that it's the MULTICS file system, essentially. It's now 2010. Let's rethink the file system. So we're looking at that. And we're trying to recover stateful services, which is a bit tricky.

The license, which is always an issue: it's the BSD license, which says do whatever you want, except sue us, but other than that, do whatever you want. I know there are religious wars. I was a keynote speaker at the Linux conference in Australia a couple of years ago, and I didn't have this slide then. And somebody asked me at question time, what's the license? And I said, it's the BSD license. I was expecting tomatoes to be heaved in my direction. There was a huge cheer from the audience: a Linux crowd cheering the BSD license on, anyway. But for better or worse, it's the BSD license. We have a lot of GPL packages, but they're all sort of add-ons. In theory, if they got annoyed at us, we could put all the packages on a separate CD-ROM; you'd have to have two CD-ROMs.

Okay, positioning of Minix. We're trying to show that multi-server systems work; it's not about microkernels, it's about multi-server systems.
Show they're reliable; demonstrate that drivers belong in user mode; highly reliable and fault-tolerant applications; possibly future $50 single-chip, small-RAM laptops for the third world, the next generation of One Laptop Per Child; embedded systems. We have a logo. That's a raccoon, a wild raccoon; well, everybody has to have an animal. The logo is small, it's cute, it's clever, it's agile, and most important, it eats bugs.

Okay, conclusion. Current systems are bloated and unreliable. This is an attempt to build a reliable operating system. The kernel is quite small. The OS runs as a collection of user processes. Each driver is a separate process. The operating system components have restricted privileges. You can replace the drivers on the fly. We have a website, minix3.org; go there and find all kinds of stuff. We have a Google newsgroup, you know, you can talk to us and ask questions and the whole thing. We have a wiki; please contribute to it. I also have CD-ROMs. I didn't realize the scale of this meeting, so I didn't bring very many; uh, there'll be a feeding frenzy. But you can download it and make your own CD-ROM. Only these are the official ones with, you know, our little thing on them, but it's the same thing as on the website. So I have the CD-ROMs and whatnot. Okay, questions?

Hello. Are there any similarities between what you're trying to achieve and what Chrome OS is trying to achieve? With what, which is trying to achieve? Are there any similarities between the design philosophy you're taking and the one Chrome OS is taking, or are they completely different? Which, I mean... Chrome OS, Google Chrome OS. Chrome? Chrome OS. Chrome, I think, is trying to run basically only one program, their own browser. And for many people, that may be enough. All they want is a reliable browser.
I mean, internally, I don't know, I haven't seen the source of it, but it's a similar kind of idea, only this is a real operating system: you can run all of the Unix stuff on it, and not just the browser. So there's philosophically a certain difference, but I don't know what the architecture of Chrome is, so it may or may not be technically similar. But the idea of not being so big, I think that's probably in there, yeah.

You said that per message you have a delay of about 500 nanoseconds, saying that it's not much delay, but we have people, especially the audio guys, who nag and say that we have a delay of five microseconds and would like to get it down to two microseconds, or one microsecond. Grandma doesn't know what a microsecond is, but I mean, if you're the guy who wants to get every drop of performance out of it, you know, this is not for you. When I got my first 60 megahertz Pentium, I thought it was fantastic, because it was still 100 times faster than my old PDP-11, and for some people, whatever you have, it's not enough. This isn't for that crowd, I'm afraid. But if you want a reliable system, that's sort of our direction. For the last drop of performance, this is not gonna be it. I don't know how much the performance loss is; we never measured it. The L4 guys, who have a similar kind of system, have actually measured it; they've really done a lot of work to optimize it. And they say that the microkernel approach costs them about five to 10% in total performance. So if five to 10% is much too much for you, then this is no good.

First, thanks for the presentation, because it was really, really interesting and, well, quite funny actually. I have a question in the line of the previous question, regarding performance. You go for a pure microkernel approach, but how about the hybrid approach, and how do you place yourself with respect to the level of performance that you can achieve using a hybrid approach?
Again, performance is not on our agenda. Say you have a three gigahertz machine. I've never seen this experiment, but suppose the average grandma or regular user calls up Dell and says, I want a new computer. And Dell says, we have two models. We have the high-speed model, which runs at three gigahertz and so on, and crashes once in a while. And we have the two gigahertz model, which never crashes. I bet they'd sell a lot of the two gigahertz models. I mean, I don't have any actual data, but I have a very strong feeling that there's an awful lot of people who'd give up a third of their performance, and personally I'd maybe give up 50% of my performance, if you could guarantee it would never crash. Right off the bat. So if people want the highest performance, this is not your thing. But I think an awful lot of regular people don't care about that. Things are fast enough now.

Yes, also thanks for the presentation. What happens when the reincarnation service goes down? Can you recover from that? No, we're dead in the water. In theory, we could have three copies of it running and have a triple modular redundancy scheme where all three copies were checking on each other, and any two could outvote the other one. So if one of them failed or began acting weird, the other two could kill it and then start a new one. The techniques for doing that, triple modular redundancy, are well understood. But the reincarnation server is so simple in terms of its code, only 10 pages, that we haven't had any problems with it. But we could go to TMR if that became an issue.

So, you compared to standard operating systems and said you want to boot faster. How fast does Minix boot? I don't know. The actual booting of the basic operating system is maybe 10 or 15 seconds. Most of the boot time, normally when a computer boots, it starts up doing the BIOS checks of the hardware, and for Minix, nearly all the time is the hardware checks.
I mean, by the time the hardware is actually finished and starts booting Minix, it's only a couple of seconds. So it might be 30 seconds to boot, of which 22 or 25 seconds are the hardware checks of the memory and so on. But the actual booting of the operating system is fairly short, a couple of seconds.

A question about scalability. You mentioned that a lot of permission checks are done with bitmaps. Can you, at compile time for your kernel, go beyond 32 servers or devices, or are you just limited to 32 devices? Are 32 enough for the world? At the moment, it's in fact a bitmap with 32 bits, although there's no reason we couldn't make it a bitmap of 64. It has to do with which servers you can talk to, and the number of servers is limited. The number of servers and drivers is on the order of 15 or 20, something like that, so we haven't run into that limit. But if we ever ran into it, we'd have to change something from being a 32-bit number to a 64-bit number. It's not an issue if you have lots of user processes. The user processes see the POSIX interface, and they can't send messages to each other; they just have the normal POSIX interface. If they want to talk, they use pipes. It's only an issue if you had more than 30 or 40 drivers running; then we'd have to have more bits in the bitmap. But it wouldn't be very hard. I mean, if words were 64 bits, which we don't support yet, you'd have more bits, or you could use two words. So it's a small point, but it could be dealt with.

What happens when the data server goes down, and maybe afterwards, a driver? The data store? The data store. You know, I talked about a smaller TCB, and a smaller TCB means we're trying to get the amount of trusted stuff as small as possible, okay? The TCB includes the reincarnation server, the data store, the kernel itself. There's a certain amount; it's maybe 20,000 lines of code.
That's absolutely crucial to have the system work. The data store is indeed part of that. Again, because the data store doesn't have a lot of traffic going to it, you could have, again, triple modular redundancy, having three data stores that could vote and so on. So at the cost of a little more overhead, you could make that reliable, if that were really important. But we don't currently, because the amount of code in there is only a couple of pages, and it's fairly simple and fairly reliable. But one could do that, you know, for even more reliability. Yeah.

Given you're talking about TVs and things, any work on non-x86 ports, say ARM, for example? We've tried to get an ARM port, but one of the problems, unfortunately, with volunteers is we've had two people start the ARM port, and both sort of got bored and drifted off. So our ARM port isn't finished yet, and if someone does an ARM port, we'd be really happy. To get into the embedded world in a serious way, we'd probably have to get the ARM port actually finished. And I read somewhere recently that more than 75% of all the Linux code was written by paid programmers working for IBM, Red Hat, and a couple of other companies. The story that the users contribute the stuff is, you know, a nice story, but it isn't actually the reality. So if somebody wants to work on the ARM port for us, we'd be very, very grateful. That's why I'm trying to hire another paid programmer: the volunteers have other things in their lives sometimes, and the programming sort of becomes second choice. So I think we need the ARM port to get to the TV world and the embedded world. We don't have it yet, although we've tried.

Say an aircraft carrier launches a missile, the driver crashes, and the reincarnation server is going to restart the process, and it's going to launch a new missile.
So can I avoid this from happening? Is this programmable, in a way that I can say, for this specific process, I do not want this to happen, I don't want it to retry every time? Yes. As I said, when a driver or server dies, the reincarnation server gets a message that it died. It then looks up in a table what it's supposed to do, and what it normally does is run a shell script, and that shell script could be set up to say: stop, don't start it again, just send a message to someone, send an email or whatever. Or it could restart it. So you could easily put into the script: don't restart this one, take some other action.

Hello? Oh. Okay, so most people that write software have the experience of having a QA department that files five identical bugs of the form: I clicked this button and it crashed, then I clicked it again and it crashed, and then I clicked it again and it crashed. And it seems great to restart, for example, the disk driver, but then the file system is told to send the same request that crashed the last one. So what are you doing to make sure that these ripple failures don't happen? If the failure is a true algorithmic failure, and the driver is itself incapable of converting the linear block address to the correct cylinder, head, and sector, then there's no way we can fix it on the fly by restarting it. But our experience, and everybody else's experience, is that most errors you run into are transient errors, caused by weird timing, or two things happening at the same moment, or something like that. The true algorithmic bugs, where it never works, typically get caught before they ship the thing. So we mostly can recover from what are effectively transient errors, and that works most of the time.
What is possible, though we don't have it, and it wouldn't be hard to do, is this: if you had two or three copies of the disk driver that were actually different code, different algorithms, written differently, the reincarnation server could, after number one crashes, run number two next. So the mechanism is there. We can't debug the code and fix it on the fly, but if you have a backup, we do have a way to run a different one, if one wanted to do that. In practice that hasn't been needed.

What is Minix doing, or what can it do, to protect against hardware that does a bad DMA, or a driver that programs the DMA incorrectly? I mean, if the code in the driver is incorrect and it's programming the DMA wrong all the time, there's nothing we can do about that; somebody's got to fix the code. If there's another driver available, maybe an older one, we could run the older one or something. So we can't fix program bugs on the fly. Hardware errors, of course, we can't do anything about either. If the DMA controller is actually broken, well, nobody can fix that, short of putting in a new controller. The best we could do is conceivably have a series of drivers: run this one, and if it fails, the second choice is to run an older version. That we can set up easily; the shell script just says, on the first failure, run this driver, on the second failure, run that driver, and so on. So that mechanism is easy. Okay?