And we will start, I will start by asking the panel members to introduce themselves, and then I have a whole list of questions here. I'm very good at getting these people to talk; it actually doesn't take a whole lot. But what I would really like is to get questions from the audience as well, so that we can ask these developers what it is that you are interested in hearing about. So somewhere around here there will be a person with a mic if you want to ask a question. Raise your hand or throw something, and the mic should come to you, and we can go from there. As was said, my name is John; I've already been introduced. So what I'd like to do is just start with Mel, and if each of you could just introduce yourself briefly for the audience.

Hi, my name is Mel Gorman. I work with SUSE Labs. I work on both the distribution kernels and the upstream kernels. I specialize in memory management but move around a small bit when time allows. So sometimes I'll work with scheduling-type stuff and sometimes I'll work with IO stuff, but the bulk of the work I do would be memory management related.

I'm Greg Kroah-Hartman. I work for the Linux Foundation. I release the stable Linux kernels. I'm also responsible for the driver core and USB and a lot of other things that nobody cares about, like serial. And our breakfast cereal.

I'm Jens Axboe, and I work for Facebook on Linux kernel storage related issues. I've done a bunch of work in Linux kernel storage in the past, including the IO scheduler that people love to hate, CFQ.

I'm Dave Chinner. I work for Red Hat. I'm the XFS maintainer. I work on XFS, obviously; generally file systems, VFS, a bit of memory management, really whatever needs fixing.

My name is Matthew Garrett. I work for Nebula, where I do cloud security. My kernel-related work is mostly firmware related: ACPI, UEFI, stuff that makes me swear, stuff that makes other people swear. Also drink. That's after the panel.
Okay, so we've heard where each of you works, but I wanted to get into something that we got into briefly in the interview that went around a week or so ago. Almost all of the people who work on the kernel are paid to do this. We have something on the order of 10 or 20% volunteers, depending on when you look, which development cycle; everyone else is doing it on somebody else's paycheck. Which is awfully nice for kernel developers. So what I wanted to ask is: why is it that somebody is actually paying you to do this work that so many of us have shown we're plenty willing to do for free? Sorry about that. What is it that they expect to get from it? We'll start at the far end. Matthew?

At a completely cynical level, we can't rely on people to volunteer to make changes to the kernel that make us money unless they're getting something in return, and so we need to make some of our own changes. But in terms of then getting that upstream, rather than keeping stuff in-house, it means that we benefit from other people helping us, giving us advice on coming up with the best solution. It's often the case that I've written some code that solves our immediate need, and then I've posted that upstream and I've received feedback of the form: well, we'd like these changes if it's going to go upstream. And it'd be easy to say, well, I've scratched my itch, why should I continue caring? Why should I go to the effort of making these changes when there's no direct corporate need? And the real answer is that by doing that, you're indicating that you're willing to spend time helping other people. And that typically results in them being much more enthusiastic about helping you. Because I work upstream, because I care about getting my patches upstream, I know that I can go to many other people in the kernel community and they will be more enthusiastic about working with me, about engaging in collaboration. I thought I should just get collaboration in there.
And if I help people, they help me. And that means that if we hit a memory management issue, I know that I can go to Mel and he will be able to give me good advice. If I hit an XFS issue, I can go to Dave and he will be able to take a look at what's happening and tell me, oh yeah, it's that bug. And if we weren't engaged with upstream, then there'd be no reason for these people to help me. So, somewhat selfish, but then altruism in general, you can argue, is somewhat selfish. Dave?

Well, the way I look at it is that I've got to be doing something in life that pays the bills. And if I can do something that I actually enjoy doing, that helps others, and that pays the bills for me, then I've won. So whenever I have a problem and I fix it for myself, I stop and think how many other people have just suffered from that bug or that problem or wanted that feature. And by giving that effectively away to the wider community, I get more from that than just the benefit of fixing my own systems. And then I get paid to do that. So what we've got here is a great big positive feedback loop. People feel good for doing this. People benefit from this, not just me, but everyone in the room; everybody uses Linux. They all benefit from the work that we do. And we're finding now, more and more, there's actually a commercial imperative to give your software away to help others, because then they help you, and then everyone benefits. And coming originally from a background where everything was kept in-house and no secrets were shared and everything was hush-hush and you couldn't talk to anyone, being able to walk up to anyone in the community and say, I'm having this problem, and to talk openly to pretty much anyone, has massive benefits. There are so many people that can help you solve a problem. And we've got a community here; the collaboration (got to get that in) is a great benefit.
And I think some companies got it sooner than others, but we're getting a much wider community that actually understands that this is the way to do business. So they pay people to do business the right way.

Here we are. Yeah, if you look at it from the point of view of the company, I think, as Dave brought out, back in the day open source was viewed as: great, somebody did all this work for us for free, now we can steal it and run with it. I think these days companies realize that if you build large infrastructure on top of open source stuff, it economically makes sense to contribute back and ensure that the project goes in some of the directions that you'd like to see it go. And I think that's what some of the employers are hoping: they can get a bit of influence, perhaps. And outside of that, they probably just view some of the guys that they bring in as proven good developers, and hopefully those are useful in lots of other regards than just working on that particular project.

So I'm especially interested with regard to Facebook, because one can say, I understand why a distributor would want to pay people to help us support our customers and so on. But something like Facebook is selling a very different service, right? It's selling all of our private lives to the rest of the world. I'm going to get fired now. So why does Facebook want to hire a bunch of storage and file system level developers?

Well, I think, as Facebook has demonstrated with their Open Compute platform initiative, it actually makes economic sense to try and develop this in the open and not just improve your own data centers, but hopefully improve data centers elsewhere. And I'm not quoting any private numbers here, but I think they publicly said they saved over a billion dollars with Open Compute. And I view this as pretty much the same thing, just on the software side.
So I think, in particular, maybe recently when Facebook's gone after a couple of kernel folks, they want to ramp up a little bit more on the software side, similar to what they've done on the hardware front.

So are we going to have to get a minimum number of likes to get a block patch merged now? That's the intent. I'm setting up a review group on Facebook now. Very good. Greg?

What they said. I mean, it does make business sense, but now I work for a nonprofit, so we don't make any money. I do stable kernels for everybody, so I guess that's about why. I guess it benefits everybody that way.

So I think, on the personal level, I don't think I'd be willing to work on this for free anymore. Memory management does that to you. That's a certain element of it, but there are two reasons for that. Primarily, once I reached a particular level of experience and started working on this stuff, the level of emotional and personal energy required to keep working on it is quite high. I think if I was working on this as a hobby, I would no longer have the beans available. By the time I'd finished my work day, if I went back and became a Java developer again or something equally horrible, I don't think I'd have the emotional energy to work on this anymore. But the bills have to be paid. In terms of why companies pay people to work upstream, there are a few reasons for that. We keep getting paid to keep working on this because things keep changing: workloads keep changing, the characteristics of machines keep changing. It's not unusual for people to buy new hardware and then find that it performs nowhere near their expectations, and they need access to people. It's easy to say, oh, why don't we just use this kernel for free? But the expertise required to fix it if anything goes wrong, or to even analyze why it's going wrong, means that what you're really paying for is access to people who are available, not just the code in isolation.
Now, in terms of why you would focus on upstream, it's because it spreads the risk around quite a lot. If it was only us working on our own kernel, we would effectively be sitting in an echo chamber. The workloads that our customers are running are not the same as what anybody else is running; we could paint ourselves into a corner by accident. But by continually working upstream, not only do we get correlation on bug reports across the whole community and a deeper understanding of how our software is actually being used, but it also means we have access to a far greater pool of expertise. Like some of the other panelists said, this is a barter economy to some extent: I will contribute my expertise to you if you contribute your expertise to me. And working upstream is extremely important to keep that relationship going.

One of the things that has always struck me about how this community works is that we have a lot of people who are working for companies that don't necessarily like each other, or at least are intensely competitive with each other. But somehow we manage to keep almost all of that out of the way our community works. We're able to be very open in our discussions, we're able to get things done, and while company agendas do appear, they're surprisingly absent from a lot of what we try to do. How does that work? How are we able to do that and maintain it, especially when a lot of other projects seem to sometimes be more driven by one specific company's objectives?

I can answer that from personal experience. I used to work for a distro company. I used to tell people our engineering department is also our competitor's, and it's that simple. We can't do this alone. We have to rely on them. Engineering-wise, we couldn't do anything to make them mad, and we also had the fun thing that people couldn't pit us against each other, because we knew when they were trying to play us off against each other, and it didn't work.
So engineering-wise, we all know each other. We all work for different companies. We all know we're working towards the same common goal. Let the managers and the marketing people fight it out. That's fine, but the companies rely on each other. Competitors rely on each other in order to survive. It's simple. It's a crazy ecosystem compared to what it used to be, but it's that simple.

I think part of it is also culture. You might work for company X and somebody else works for the same company; you work together on some sort of project, then you go to your worst competitor, and you're still working on some of the same things. So the collaboration (see, I did it too) sort of continues across the company boundaries, and, at least for me, often you don't feel like an employee of company X, Y or Z. If you go to conferences, you're just that guy that works on serial, or something equally unfortunate.

I think for the most part we do, as you said, see most companies that are of similar sizes contributing similar amounts of work, and it's very rare that we see large companies effectively violating that implicit social contract: taking the work that other people are doing and contributing very little back in return. And in the small number of cases that we have seen, there have tended to be overall harmful effects on the community. People change the way they release software in order to avoid giving their competitor a competitive edge, and that's harmful. It would be nice for those situations not to arise. I don't have a great answer for how to avoid them, but when that does happen, I think it does also bring a certain level of social awkwardness. If there are people working for a company that I feel is almost behaving unethically within the community, there's less incentive, even outside of work, to be quite as nice to those people as you would otherwise be. But for the most part, as Greg and Jens have said, there's very good collaboration between companies.
We very rarely see people doing that. And as a result, these people are my friends. I have an incentive to help them; they have an incentive to help me. Even if somebody is working for a competitor, it's hard to really think of them that way when you've been struggling, walking in circles around Prague, trying to figure out how you're supposed to get home. So we like each other. It's good to keep it that way.

I guess in many cases as well, we have common goals. So even though the other companies are competitors, it's also a symbiotic relationship. I think we reached a point quite a long time ago where there is no single company that has access to enough expertise in all areas to completely go it alone, and each of us recognizes this. Now, there are certain off-limits topics. You would never try and find out what your competitor's release schedule is, or something like that. We know implicitly to stay away from information that is sensitive from a competition point of view, insofar as that is possible. But the rest of the time, we have the same types of goals, and we're better off working together to try and accomplish them, and then going back to competing from a sales perspective.

Okay, well, let's talk about collaboration at a bit of a different level. You said it. Did I cut someone off? No, you said collaboration. Oh. It's like the Knights who say Ni, or something. All right. Where was I? Sorry. Oh, of course. Collaboration. There was a bit of an upset on the mailing lists of the PostgreSQL database community recently, because they felt that the Linux kernel community wasn't really paying attention to what they needed; that their needs were treated as pretty much irrelevant, and we tended to break their code fairly often with the changes that we made, mostly breaking it from a performance point of view.
So this was actually made much better, I think, during this very week, when some of those developers came to the Storage, Filesystem, and Memory Management Summit and we were able to talk about some of these issues. I'd like to ask: how are we doing in general with talking to and collaborating with our user communities, our users? By users, I'm thinking about projects that develop software to work on top of the kernel that we are creating. Are we talking enough to them, or are we kind of leaving them out in the cold? And if so, what can we do to do better there?

I think it's very dependent on the subsystem as to how much interaction there is with various application developers, users, and so on. On the XFS side of things, we have an awful lot of people that tend to lurk on mailing lists and in IRC channels and so on, and pop up at the weirdest times asking questions. And one of the things that becomes clear after you've seen this happen time and time again is that there are a lot of people actually following what we do, but not necessarily actively participating. So they see what we're doing, but we don't quite see what they're doing until we break something. And then we find out that they've been watching what we're doing, but they haven't been telling us anything. I'm not sure how we can change that, but often the way that I've treated that problem is to say: well, that's great, I didn't know you were doing this. Can you describe your workload better so that we can understand it? And now, in understanding it, we can take that into account when we design new features or have to change the functionality that we currently have. If I don't understand what you're doing, I can't take it into account. You've got to talk to us as much as we're talking to you. Listen, participate, collaborate. I've done it again. And really, what it comes down to, I think, is that this has got to be a two-way street.
If we don't know what each other is doing, then we can't possibly know that we're doing something bad for you.

So how should some random application project communicate with us? Are we going to tell them to go onto the Linux kernel mailing list?

No. I mean, we get that all the time. I hear from users and programmers. When we break something, people are not shy about letting us know. I see it all the time. Maybe I'm just breaking everything, I don't know. Serial? Yeah, serial. I just got a bug report today. No, I mean, we are very visible. It's very easy to find us, maybe not on the Linux kernel mailing list, but it's easy to find the subsystem list for what's going on. I mean, our email addresses are out there. But the Postgres guys are great, because they've been coming to our conferences for a long time and they've been telling us about problems. And that's a good thing to do.

I'm not sure I fully agree with that. I felt, at least for the Postgres community, and not exclusively them, that there was a disconnect between the upstream community itself and some of the other projects. Now, many of us individually are familiar with what some of the users are doing, based on their interactions with our respective companies. But the upstream community itself, I feel, is disconnected. It's kind of like one step removed. One thing, at least when the Postgres people came along, and in the discussion that led up to that: I got the distinct impression that a number of kernel developers felt that it was a solved problem and that Postgres was completely fine. And 350 mails and a lot of reading later, that was completely untrue. And from talking to one or two of the Postgres people, they felt that they had tried to get some of this fixed and then just gave up. They kind of lost the energy to do it anymore.
And I think it's something we need to revisit: maybe be more open to being contacted by the other communities and working with them, or, just as happened with Postgres, invite them to show up.

Yeah, I mean, sometimes it's an architectural or design issue, and other times it's just that we change something in, say, an I/O scheduler or the VM, you get drastically different behavior, and that just ends up meaning that performance really sucks on their workloads right now. So I think, as kernel developers, we like nothing better than people showing up and saying: you know, version X worked, why doesn't this one? Here's a nice test case that shows that performance dropped 20%. Because after that, we have this test case; we can run it through. I mean, when it comes to users, it's hard to say what a user is, because they could be using the latest RC kernels and discover a bug instantly (say, they remove a device and it crashes), or it could be a year before they see that bug, because they run distributions, and the whole time from release to actually being in production for that particular user is a lot longer. So I think it all depends on what the class of issue is. At least for Postgres, it seems like, yeah, they'd sort of given up and went off and tried to architect their way around things a little bit, and that didn't work as well. But for that particular issue, it sounds like we made good progress.

Well, I think the important point that Jens just made there is the test case. It was only a couple of hours ago that one of the Postgres guys posted the test case that showed the problem we talked about. So we now have a test case for that, and we should be able to reproduce the problem and make sure it doesn't happen again; it goes away and it stays fixed. So test cases actually make our lives a whole lot better, because a test case is a direct encoding of the problem that users have, and we can make sure it doesn't come back once we've fixed it.
I think there are some cases where, when some user space developers come to us and say this kernel behavior is causing us problems, our immediate reaction is: no, the kernel's fine, you're using it wrong. No, never. And sometimes that's true. And sometimes, in fact, we've created an interface that's impossible to use correctly. But I think we tend to lean a little too far towards: the user space programmer knows less about the kernel than we do (which is probably true), and therefore the user space program is wrong (which is much less true). It would be nice if we didn't jump to that conclusion quite so quickly sometimes. Also, there have been some cases where we've added features to the kernel and then gotten upset when user space started using those features, which seems a little bit unfair, again. We should learn from those: if there are interfaces that someone wants to add to the kernel, and we think they're going to be easy for user space to abuse, or they're going to tie us down into long-term commitments that we don't want to make, we need to be more vocal about preventing those interfaces from being merged in the first place. And so people should really be keeping a very keen eye on what goes into RC1, making sure that any newly exposed interfaces work in the way that we would like them to, and making sure we can rip those out again if it's impossible to make that work, if we're going to end up making commitments that we don't want to stick to. We should give user space the credit of assuming good faith. They should be able to assume that features we add to the kernel are there because we think they're good, useful features, and we shouldn't then get upset at user space developers for using them.

There are still, however, a lot of people who appear to really, really hate cgroups. That actually leads to another question that I had.
If you've watched the recent battle over the choice of init systems in the Debian distribution, or if you've watched other discussions recently having to do with containers and the use of cgroups, some people are saying, well, cgroups are nasty; we shouldn't be designing other features around the use of cgroups. We heard some of that again yesterday with regard to out-of-memory handling and whether that should be tied to control groups. We see it with other features as well. We tend to see it fairly often when we do something new that wasn't really envisioned in 1970s-era Unix. Part of this, I think, just comes with the territory of blazing new trails and trying to do things that haven't been done before. But sometimes it seems like there's a real conservatism out there, in the user community but also within the kernel community, about anything that's new, anything that's not traditional Unix. Am I seeing that wrong, or do we suffer from that? And what can we do about it to make progress in our interfaces?

It's hard. I mean, we are blazing new trails. We've been doing that for a long time; we're way past what Unix could ever do. And that's hard. It's hard to do. But it's our job to say no. I mean, it is. It's our job as maintainers and developers to say no, to push back on things that are half-baked. So it's hard to reconcile that with the fact that when you finally do say yes, you have to really embrace it. I mean, it's a tough dichotomy to live with.

I think one of the reasons why I would be resistant to introducing new interfaces, or even modifying some of the existing ones (taking the cgroup OOM handling as an example), is that my first reaction is always: well, why do you want that? And the answer is: well, we're working around a problem whereby X, Y, and Z does not behave as expected, or the wrong stuff is getting killed.
And my response to that is: well, if you have a problem because IO is interfering with loads of different workloads, I don't think the solution is to slap cgroups onto it and build a whole pile of infrastructure around it. How about we just fix the stuff that we have, rather than bolting new stuff on to work around it? And it's much the same for the cgroup OOM handling. I was kind of asking: why was the kernel able to get into such a situation that this was even necessary, and should we be fixing what we have rather than trying to add new stuff? Now, of course, it's always possible that sometimes additional infrastructure or interfaces are needed, but there are certain features that I think are working around flaws or bugs in the kernel, and I don't think that's appropriate. But then it's hard to tell which it is sometimes.

It can be very hard with something that has never been done before. We're always going to make mistakes when there's something new; we don't know exactly how it's going to be used five years down the track. The architecture that we come up with for the first iteration solves some particular problem; with cgroups it was CPU sets and big NUMA machines and HPC. It wasn't designed originally for systemd to control everything in the box with. That's a completely different use of the same technology, and it certainly wasn't architected for that use in the first place. And so we don't get things right the first time. We certainly can't read the future and know how something is going to be used five or ten years down the track. So sometimes, with what we do now, we think we're doing the right thing, but in hindsight, sometimes we've done the wrong thing.

It should be remembered that the Linux kernel is one of the largest collaborative software projects in the world, in the history of the world, and has almost nothing in the way of formalized management structure.
We have people who have a strong operating systems background contributing code, and then we have people like me. I have a background in fruit fly genetics, and yet someone lets me get code into the Linux kernel. This seems wrong, but anyway, I'm not complaining. And then we have people who are genuinely kids in their bedrooms sending us code, and somehow, it's a miracle, it works as well as it does. We should be astonished at the fact that we are able to get it so right so much of the time. The fact that sometimes we end up with bad interfaces that don't work the right way is unfortunate, but given how much we get right otherwise, I think it's astonishing that it's as rare as it is. And to go back to John's question of whether we're too conservative: there's maybe an argument that we are. There are certainly cases where there are features that I think are justifiable, that are worthwhile, that bring real benefits, where instead we'll have a three-year argument on LKML, and then finally someone will get bored, and then two years later someone will re-implement it. And it'd be nice to avoid those. I think that is possibly something where... Wouldn't it be nice if someone could actually go and do some research into the sociology of kernel feature development, and then write a thesis on it? And then if other people would actually read that? Which, okay, I'm sorry, that's clearly not going to happen. We should do better than we do. We do much better than I think there's any reasonable right to expect us to do, but we should still try to find ways of doing better.

We still do it better than anybody else ever has. Sure. I mean, so give us credit for that. If you look, all other operating systems have the same problems, have the same issues with their interfaces. They change them all the time, and they fail, just like we fail at times. We just happen to fail less. That's to our credit.
Okay, in the area of things we could do better: some years ago I remember discussions about testing, about testing releases and regression testing, and I saw people saying, well, you know, testing isn't really applicable to the kernel, because all of our problems come from workloads we don't have and hardware we don't have, and that's what we have users for. In the last couple of days of discussions on the storage, file system, and memory management side, there's been a lot of talk about testing and automated testing, and about how every time somebody adds a new test, they find new bugs, and we are hopefully able to avoid those classes of bugs in the future. So what changed, that we now seem to be getting better at testing? And how can we do better yet? Because I don't think anybody thinks we're anywhere near as good as we should be.

So, you asked what changed. Speaking from the file systems point of view: if we go back even two or three years, of all the file system developers sitting down the front here, very few were using an automated regression test suite for every single one of their changes before they pushed stuff up to Linus and so on. When I asked yesterday morning, now pretty much every file system developer that was at the LSF conference is using an automated regression test suite to test every change that they're making. And the result of that is that we're getting much better initial test coverage, and it's catching all the silly brown-paper-bag bugs that would otherwise have been missed, because things like off-by-one bugs are hard to see when you're reviewing code, but if you've got a test that tests all the corner cases, you catch them straight away. You catch them before you even send the code out for review. And so from that perspective, I think we're doing a lot better job, and we're less reliant on eyeballs to find hard-to-find bugs.
The bugs just don't get out of the developer's machine, because they're found before reviewers even consider the code. And so what we're doing is lifting the base: the bar has gone up, and the code coming out of the developers has improved. It's better quality. So I think we're actually getting better at doing this; we're having fewer silly problems. That's not to say we're doing as well as we can. We're getting more tests all the time, and as we find out, you know, we fix one bug and then find that it was covering up another five bugs. That just happens. But we still need better test coverage. We still need to improve, because, as you said, every time we add a new test, more bugs are found, which means we're still not doing a good enough job of it. It's better than it was, but we still need to improve.

I think it's a really hard problem to solve, because depending on where you're developing your features or making your changes, getting full-coverage testing would just take a lot of hardware. So I would love to have some place where you could submit patches or a kernel and have it run through all the things, all the benchmarking, and get results back. And everybody, I think, would love to have that, but it seems nobody's willing to front the bill or the warm bodies to actually implement it.

We are getting something. Intel is providing that through... I was going to mention the zero-day testing, which is excellent. So that improves the code quality, and the branch quality, I'd say, of what's being pulled in. For people that don't know: Intel is running every visible branch out there through a system that will compile every step of it. So when you submit it, you can actually be certain that it has gone through every bisection point in your branch, and it won't screw up Linus's tree once it goes in.
So that's kernel-compile-level testing, but I'm talking about things that impact performance or introduce race conditions, bugs, that sort of thing. Our own little QA department with hardware, that'd be nice. Well, that infrastructure that Fengguang has actually does do performance testing, and we are getting performance regression reports coming out of that infrastructure. It's running xfstests, so we're getting regression reports for file system functionality coming through it. We're getting performance reports for things like IO benchmarks and whatnot coming through it. So there is actually a lot of testing being done behind the scenes. The fact that so few people know that it's being done means that we're not actually creating a lot of problems, because otherwise you'd know it was being done. I know it's being done because I've been getting bug reports from it. That explains it, yeah. Well, at the kernel summit, we talked about this. We wanted to have "make tests" in the kernel, and there is a bunch of tests in there. So actually a few days ago I went and ran them, and they failed. I was like, oh, is my kernel broken or are the tests broken? It turned out I had to run some as root, so I ran them all as root, and they still failed. So the tests we have, even in the kernel tree, don't work. We need tests for the tests. No, we need tests that work. But like you said, all the file system tests, I wish I knew about those. When I do a stable kernel release, I'd like to run those. Where are those tests? Why aren't they in the kernel tree? Do you want me to put them in the kernel tree? Yes. We agreed at the kernel summit that we would take them. Okay, I wasn't at the kernel summit. We said we would take them. I mean, we want that. When I do a stable kernel release, I'm relying on my local build and on a few other random people to do testing. As the maintainer of xfstests, we can look at moving it towards the kernel tree.
Yeah. And keep it there. That'd be great. So on the one side, we've got the problem that dealing with a bunch of workloads is difficult. If you're the kind of person who spends most of your time thinking about server-level hardware, that's probably the most relevant thing: trying to come up with a situation where your tests represent the majority of real-world workloads. If you're like me and you're mostly working on firmware, the problem with testing is that, well, okay, I'll make this change that fixes this bug, except now we're slightly changing how we interact with the system firmware, and then we find that someone's 15-year-old laptop no longer boots. That's the kind of testing that's really difficult, because it's impractical for us to actually have every single computer ever made and then boot the kernel on all of them every time we make a change to make sure it still boots. And the kind of sad thing there is that, for a lot of users, that's often the most user-visible thing. They upgrade their kernel, and their laptop doesn't boot anymore, or it doesn't suspend, or video output doesn't work properly. And I don't know that we have any way of having a good answer for that without manufacturers thinking of Linux as a first-class citizen. And it's really not practical for us to expect that from most hardware vendors at the moment, which is a shame. I don't have a good solution for that. Well, to some extent, I don't think you can solve that problem. I mean, even if the hardware vendors ultimately cared about Linux, they're not going to run every RC kernel and whatnot. So we're always going to have little bugs like that. Microsoft benefit from the fact that they only release one update every three years. We do it every three months. A related thing: as part of that three-month development cycle, we used to have people who were doing formalized tracking of the regressions reported for each development cycle.
We haven't had that for a few years now, because the people involved found bigger and better things to do. My question to you is, how can we release kernels without knowing whether we fixed the regressions or not? Is this hurting us, or are we somehow muddling along anyway and not breaking things? I think at the moment we have a far greater degree of automation than we used to, and so we're catching more bugs at the developer level before they reach the point where they actually hit users. Now, that said, the last time I was checking, there are several areas where we've been slowly degrading over time, or else we've fixed one thing and then broken something else, so it looks flat even though it should have been improving. Even if we had somebody tracking regressions on a per-kernel-release basis, I'm not sure those would have been caught. Previously, when we had a dedicated regression tracking guy, it was functional regressions that he was tracking. Now, my impression is that we have fewer of those, but then again I am a million miles behind on LKML at the moment, and I don't see it anymore, and because I don't see it anymore, I think it's fine. But I think the greater degree of automation we have now, in comparison to four or five years ago, has really cut down on the number of problems that are found at a per-release level. But again, that doesn't fix the "it breaks somebody's laptop" problem, that sort of thing, and we're not really tracking those now. So how is it that we're producing kernels that still work? Well, it's up to the individual maintainer. I mean, there was a nasty USB 3 bug that broke a bunch of people's machines, but it didn't break any of the USB developers' machines. So we kind of blew it off for a release, and then it hit a real release, and we got yelled at, badly, because we broke a lot of people's machines, and we fixed it eventually. But I mean, sometimes it's...
That was my fault for not pushing back and backing that out, so it's up to the individual maintainer of the subsystem to stay on top of this. That's part of a maintainer's job. And there's a SCSI patch that needs to be merged, James. Speaking of regressions. Is this patching by embarrassment? No, I'm sorry. It's already made. It's already written. But anyway, that was the regression that we hit, and it needs to go in. It's outstanding. You need to resend it. Okay, it might already be there. So the embarrassment-based scheme works, and it doesn't seem like anybody's all that worried that we're not... I think it depends on what you work on, as Greg said. Yeah. Because I used to work on CD-ROM stuff, and that was horrible. You had a standard that defined how you talk to the hardware, and everybody interpreted it a little bit differently. And any sort of change you'd make, as Matthew described, would fix one thing and break other drives. So I solved my personal issue by just moving further down the stack and getting away from device drivers and hardware as much as I could, because it is really hard. CD-ROMs are still hard? It's not my problem anymore. I remember CD-ROMs. Okay, it looks like we're actually about to run out of time. I wanted to ask one question really quickly. Some of you may not know that Matthew, down at the end there, was just given the Free Software Foundation's award for the advancement of free software for his work on support for UEFI Secure Boot. So I wanted to say, first of all, congratulations. How are we doing with that now? Does it actually work? Secure Boot? Yeah. Yep. How many people are using systems with Secure Boot enabled now? You and James? It's the first thing I turn off on a new machine. You could just try it, except I guess most of you probably actually haven't bought new hardware for years, because you're kernel developers. We used to curse BIOS developers, right? I mean, it's the same thing.
Yes, we still do. Well, the three of you who are using it can thank Matthew that it works. Are we concerned that we now have a system that essentially, in some sense, gives Microsoft veto power over future kernels? Because, I mean, I see a lot of restrictions and so on justified on the basis that if we allow this, then Microsoft will cancel our key and we'll no longer be able to do that. And this is always going to be a per-distribution decision. Distributions should make their own policy choices based on conversations they've had with Microsoft. If people don't want to be signed by Microsoft, they don't have to be. You're under no requirement. There's no obligation. We can do whatever we want with the kernel, and Microsoft have no say over that. It's only in the cases where people want to be part of this Microsoft secure boot ecosystem that they have to think about what they're doing, think about those rules. For some people, that's a worthwhile trade-off, and for some others, it's not. I don't think there's any case where we should change the behavior of the kernel in such a way that everybody is forced to abide by a set of restrictions if they aren't choosing to be part of that ecosystem. And in the work I've been doing there, I've been very careful to try to avoid any situation where a change I make prevents someone from using some functionality unless they're in this situation. But part of it is also designing things in such a way that it's very easy for users, let alone developers, to change the system into a mode where those restrictions aren't enforced, where people still have the freedom to use whichever features they want. Personally, I think this is a good balance. We make sure that users can choose between freedom and some other security. In some cases, accepting certain security features implies that, right now, we have to disable some other functionality, partially because that was really bad functionality to begin with.
A lot of this stuff is dated, but a lot of the things that we're talking about turning off in this secure boot environment are features that are only there because way back in the 90s, we decided that graphics drivers should be in user space. That wasn't, with hindsight, a great decision. And we, to a large extent, made that decision because we wanted to be able to share graphics drivers with Solaris, and that also wasn't a great decision. We didn't really end up benefiting from that in the slightest. There are a couple of features where, well, okay, kexec is an issue in this environment, but we have ideas for how to fix that. It's not ideal that right now my patch disables that feature. I would prefer not to, but there's work being done that with luck will be mergeable, and that will result in you not having to make that choice. You'll be able to continue using kexec and be secure simultaneously. Okay, well, I have opinions on this topic. All right, very good. Well, I've run us a little bit over time and I don't want to interfere with the collaborative drinking part of the conference. So I think it's time for us to thank our panelists for joining us, and to thank you all for listening. Thanks.