Introducing Jonathan Corbet, standing right there. I've actually got some notes this morning, because he's so impressive I don't want to get any of this wrong. If anyone's reading the bio, put your hand up every time I stuff it up. Jon here is a kernel contributor, co-founder of LWN.net, the lead author of Linux Device Drivers, third edition, and, if I can read my own writing, he's been on the Kernel Summit program committee for some years, apparently, and on the Linux Foundation Technical Advisory Board. So if you would, please give a very loud Australian welcome to Jonathan Corbet.

Thank you. All right. Well, thank you all very much. What a nice welcome. People who have seen me give talks before know that I tend to talk a whole lot about how well the kernel development process works. And in fact it does work well. We're doing four or five releases a year, one every 80 days, pretty much like clockwork anymore. Every one of these releases incorporates something like 10,000 changes coming from over a thousand developers. All this work shows up in everything from very large systems to your phone and your toaster and, next year, your toothbrush perhaps. The whole thing actually works pretty well, and that's what I tend to talk about. But I'm not here to talk about that kind of stuff today. That's boring, talking about that. It's time to talk instead about when things don't work quite as well as one might like, when things go wrong.

And so one might ask, well, why is it that you want to do that? There are a whole lot of reasons for looking at failures and analyzing them, trying to figure out what went on, starting with the problems we've had at times with high-profile failures giving the kernel process a bad name. People come away saying you can't work with these people, things go wrong.
It's not a friendly development process, that sort of thing. You see stuff that shows up in the media sometimes. So we had this terrible thing, that a key contributor has admitted, admitted, that the process can be intimidating and hard to break into. It came from a reputable source, although it was actually pointing somewhere else; the actual source came from LCA a little while back. Maybe it's time to start saying some other things for a while. Gotta be careful.

But more to the point, what I'm really after here is that you learn from failure, right? You can learn from when things go wrong. This is another quote from another highly influential figure in our community. This is not from me; it comes from someone else, who says that yes, it's great to celebrate success, but it's better to heed the lessons of failure. So that's what I'm here for.

And the quote that really motivated this is something I read many years ago in grad school, from a book called The Sciences of the Artificial by Herbert Simon. He was a Nobel Prize winner in economics, did various things like that. Very good book for students. This is a book really about artificial intelligence and the study of the functioning of the brain. But he says that a road or bridge under normal conditions just serves as a flat surface that you can drive vehicles over. It's only when it's overloaded that you learn something about how it is built. In general, a working system is a black box. It's just something that does what it's supposed to do; it's boring. As soon as it fails, you learn something about what's going on inside. So my purpose here is to look a bit at what's going on inside.

One note that I want to put up first.
There are a lot of interesting figures in the kernel development community, right? But I'm not really here to talk about them at all. I'm not here to poke fun at the stranger folks who tend to hang out on the edges of our mailing list and so on. I'm gonna be naming names, because I'm gonna be looking at where things went wrong, and I don't see any way to do it without naming names. But every developer I talk about here is someone that I really respect, somebody who's contributed to our community, someone who I hope will continue to. Often they won't be, but I wish they were. I'm not here to shame anybody, because we all make mistakes. The point is to simply say that yes, I made a mistake, I found another way to not solve your problem, and we'll go on from there. So if we're ready, we'll hit the road and start looking at where things go wrong.

So, example number one. Back a few years ago (a couple of these have to do with file systems), in the effort to get a new next-generation file system, a guy named Daniel Phillips, who's shown up at this conference in the past, though I haven't seen him for a few years, came out and said, okay, I'm gonna create this file system called Tux3, because after all Tux2 was going to be the great file system that took over the world, but then it didn't, for various reasons. But Tux3 was going to do it. It had a lot of very interesting architectural ideas built into it. He came out and in July 2008 posted the initial version of the file system. There were a lot of interesting discussions with Matt Dillon of DragonFly BSD, who did the HAMMER file system, about how you might design file systems. It looked like it was going somewhere a few months later.
He had it working well enough to boot a Linux system as the root file system, and then things kind of went quiet and slowed down. If you look in their repository, which is still out there, you'll see that in 2009, a year and a half ago, was the final commit, and even the last commits for a while had been sort of trickling off into fairly desultory little cleanup things. And the project is dead. Tux3 is not in the mainline kernel. I don't think it ever will be at this point.

So what happened? As this process was going along, some time after he was saying that it boots as root and so on, Andrew Morton came to him and said very clearly: do not fall into the trap of continuing to add stuff to an out-of-tree module. Get it merged first. Otherwise, you just make it harder, and we have a whole lot of examples of when this has happened. Well, we had another one, because this merger did not happen. Instead, work continued to happen outside of the mainline tree, and that really killed the project. In fact, even Daniel came around and said, yes, I should have simply followed through with what Andrew said and put it into the mainline.

The lesson from this is a lesson I've made in other settings, in other places, many a time. Code that's outside of the mainline kernel tree is invisible to a great extent.
It doesn't get users. It doesn't get contributors. It just doesn't have the momentum that code in the mainline tree has. When you're outside of the mainline, you are really pushing against the wind. If you've ever watched a bike race, you see how the racers ride in a peloton all together, because if you're in the middle of a group like this, the momentum of the group and the air that goes with it carries you along. It's really an exhilarating experience to ride in a group of bicyclists like that. As soon as you go off to the side and on your own, you're pushing through the wind by yourself, and it is much harder. It's really the same thing with kernel projects, or really, I think, with any large open-source project, not just the kernel. If you're not in the mainline, then you're not partaking in that momentum, and you just have to work a whole lot harder, and as soon as you stop pushing you fall behind. So that's what happens. So the lesson is clear: go for the mainline. If you look at the Btrfs file system, which was under development at pretty much the same time, Chris Mason merged that code very early on, even though it was nowhere near ready for production use. It still isn't. But that project has just continued to thrive since, and it's only accelerated. And that's really the way that you have to do it.

Next example is a project that did get into the mainline, back in 2005. This is the em28xx Video4Linux driver, a webcam driver, something that you might not think is all that huge a thing. But this code came in written by a guy named Markus Rechberger back in 2005. He was actually working for the manufacturer at that point. So he put it in; it looked like a typical vendor driver. Let's just, um, we don't really need to get into that. But it was there for a while. Various things happened. If you look through the Video4Linux mailing lists and you want to find some fairly incendiary stuff, look for this guy during this time and you will find it.
Okay. We come around to a couple of years later, and we see the last change that the original author made to it. And then less than a year after that, he disappeared from the community altogether, and we haven't seen him since. He was a capable developer. He's someone we wish we had. But he went out of it, and the key to what went wrong here is a mistake that a lot of people make, especially people who are writing code on behalf of corporations, who are sometimes not used to working with the community process. In the middle of one of these sort of high-temperature discussions, he told the Video4Linux maintainer that people who are submitting code do have to be aware of the fact that they will lose control over that code. And the fact is that yes, they will. That's the point that I'm trying to get at.

Let's look at another example. In 2004, Hans Reiser, who had put the Reiser3 file system into the kernel, saw a patch coming from elsewhere, from Chris Mason actually, to add access control list and extended attribute capability to the ReiserFS file system. And he said, no, no, you cannot merge that into my file system. Don't do it. I don't want it. I want that file system to be stable. I want people working on Reiser4 instead. I'll come back to Reiser4. But he lost, okay? All that functionality was merged into the ReiserFS file system, which extended its lifetime for several years, because you need access control lists for various security policies; you need extended attributes to work with SELinux, things like that. So we needed to have it. It went in, and he simply lost.

The lesson is pretty clear. This is not true of just the kernel; this is true of any true open-source project, right? If you contribute your code, you've lost control over it, and others will do things to it. Say you understand the process and you like it.
Then you think that's really one of the very best things. I love to see code that I put into something kind of take off and fly away and become something that I really never would have made it be myself. I think that's the beauty of the system. But if you want to keep it in that cage, then don't contribute it, because it's not going to work. It doesn't go that way. And you see this also at the maintainer level, where people who are maintaining subsystems think, okay, this is my code, I have control over it. But it's free software. It's not anybody's code. It's not really even Linus's code. And so anytime you run into a situation like this, where somebody is trying to stake out territory and say "my code, don't touch", you're gonna lose. You don't want to do it that way. And again, that's true of any properly functioning open-source project.

So, a different story, starting back further now, in 2002. If you think back to the beginning of the 2.5 development kernel series, the position of IDE subsystem maintainer was vacant at that time, because that particular body of code was widely held to drive people insane any time they tried to work with it. And so we'd had a few people already kind of go and leave. So another guy came along; his name was Martin Dalecki. He said, here's a set of cleanups for the IDE subsystem, and they went in, and things were okay. Less than a month later, he'd gotten up to version 18 of his cleanup patches, 18 iterations. That also went in. At that point, he made himself the maintainer of the IDE subsystem. It was now his sort of thing.
This went on for quite some time, until by July of that year he was invited to the Kernel Summit to talk about his work with IDE and so on. In August of that year, August 9th, the 115th IDE cleanup patch went in. One week later, all that work was torn out of the kernel and thrown away. The IDE subsystem was reverted back to where it was at the beginning of the 2.5 development series, and Martin left. Just like that, it was all gone.

So what happened there? Anybody who was actually working with 2.5 in those days quickly learned that if you wanted to run a 2.5 kernel on a system and still have the system, you needed to have SCSI disks. Okay? Because Martin had a very interesting approach to how you improve a particular body of code. So things tended not to work. You'd install a system, and then you're back to restoring from backups. And that's not something that people really like to do, especially if you end up doing it repeatedly.

So the lesson once again is clear, right? If you have a scorched-earth policy towards improving a subsystem, you will lose. And this is much, much more true now than it ever was back in the 2.5 days. At this point, if you break something with a patch that goes into the mainline kernel, and you haven't fixed it within about a week, chances are very good your code will simply go out of the kernel. We've become very, very intolerant towards regressions as the whole process has become faster and, one might say, more professional. So you just can't do that. If people are telling you, hey, this is causing me pain, you need to listen to them, because other people will. And if you can't respond to that, then things will go wrong.

Related example; this is a local example. Hope I don't get in too much trouble. Con Kolivas, by training, of course, is not a kernel developer. He's a doctor.
He's an anesthesiologist. But despite that, he trained himself and became actually a very capable, very talented, and very useful developer who did a number of things all over the core kernel, somebody who was quite productive. Sometime in 2007 he took a look at the scheduler, and he was very interested in desktop interactivity. The scheduler that we had was allegedly designed for that purpose, for interactivity and so on. If you looked at the scheduler during those days (you could go back to an old kernel in the repository and look at it), what it had become was this incredible mess of heuristics that people had added: okay, well, let's do it this way, boost somebody's interactivity here, something like that. It was very complicated. It was very hard to read, very hard to work on. Nobody really wanted to go near the scheduler anymore at that point, and with very good reason; it was very hard to change.

So his idea was: let's take all of that code and just throw it away, and we'll start over and make a very simple scheduler that is based on strict fairness. If you have four processes running, contending for the CPU, each process gets 25%, period. We won't worry about interactivity stuff. We won't worry about any of these heuristics. We will simply divide the processor up fairly between the tasks, taking only priorities and nice values to shift things when you want to do that. The interesting thing was that he got better interactivity out of this, after having thrown away all of these heuristics and so on. So he posted this, and it drew a whole lot of attention. In fact, almost right away, Linus said, yeah, I actually like this idea.
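
The strict-fairness idea described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code of any scheduler mentioned in the talk: the function name `fair_shares` and the weighting rule (each nice level changes a task's weight by a factor of 1.25) are assumptions made for the example.

```python
def fair_shares(nice_values, step=1.25):
    """Split one CPU among runnable tasks in proportion to their weights.

    Hypothetical model: a task's weight is step ** (-nice), so a lower
    nice value means a larger share. No interactivity heuristics at all.
    """
    if not nice_values:
        raise ValueError("need at least one runnable task")
    weights = [step ** (-nice) for nice in nice_values]
    total = sum(weights)
    return [w / total for w in weights]


# Four tasks with equal nice values each get a quarter of the CPU.
print(fair_shares([0, 0, 0, 0]))  # [0.25, 0.25, 0.25, 0.25]

# A "nicer" task (higher nice value) gives up part of its share.
print(fair_shares([0, 5]))
```

The point of the sketch is what is missing: there is no guessing about which task is interactive, only a fixed rule for dividing the processor.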
He liked throwing away all that code, and he thought it could be merged once it was in shape for merging into the mainline kernel. But two weeks later, Linus was singing a very different tune, for the simple reason that the scheduler, although it worked better for Con on his system and for other people as well, worked a whole lot worse for others. It worked worse for people running large systems. It even worked worse on a lot of desktop sorts of systems. It was creating regressions for other people. Other people were complaining, Con was not fixing this, and so people started to get increasingly frustrated with the approach that he was taking. This kind of reached its peak in April of that year, when Ingo Molnar did what has since come to be called "Ingo-ing" somebody (I've heard this term), where he kind of disappeared for a day and wrote his own version of it and posted it and said, well, how about this version instead? It was based on the same ideas, right, complete fairness. In fact, it was called the Completely Fair Scheduler. But it was a different implementation, which was intended to work better for a much wider class of users, and not just the particular set of users that Con was aiming for. That drew attention very quickly. It was CFS; it was merged for the 2.6.23 kernel, and shortly after that Con left the kernel development community. And he left in a very public and very unfortunate sort of way: "I'm out of here, I can't stand it." He gave exit interviews and everything. And, you know, this was bad publicity, but it was really bad because we lost a really good developer, right? We lost somebody that we can't afford to lose. We can't afford to lose people like that, no matter how strong we are, no matter how well things go. It's this kind of thing that has really motivated me to give this kind of talk all over the world for some years now.

So what do you learn? How do we avoid creating more Cons?
Well, if you want to change the kernel, one of the first things you learn is that you have to improve the kernel for everybody, not just a specific set of users. And if you don't improve it for everybody, you must at least not make it worse for anybody, right? You cannot regress the kernel. We have to take it forward and make it better. If you have a patch that makes things better for some people but creates losers over here, you will have trouble. We just don't want to create losers in the process. Now, that is of course an ideal. If I said there were never any losers in changes that go into the kernel, I would be laughable, because that's not the real world. But that's really what we shoot for.

Beyond that, there's the simple fact that some parts of the kernel are really hard to change. Con was aiming at parts of the kernel that are, first of all, in the core, that are important for everybody, but which also tend to be very heavy on heuristic code, things that have been tuned and adjusted over the course of many, many years, and where we have learned that if you make changes that seem to make things better, then you discover you've destroyed somebody else's workload a year from now when they finally get around to testing it. So people are very leery of changes in memory management. There's stuff he wanted to do in memory management that never did get merged, because it's really hard. It's just plain hard. If you want to play in that particular piece of the kernel playground, you have to have a lot of patience, and you have to be prepared to really show, over a long period of time, that your code doesn't make things worse for people.

Beyond that, communications. Con tended not to hang out on the linux-kernel mailing list, which is certainly an understandable position to take, because having 500 emails in your mailbox every day is painful sometimes. So, um, you know, we can blame your call man, whatever. But, you know, it's hard, and you get a lot more work
done, or so it seems, if you let that all happen without you, especially since a lot of it seems irrelevant. Some of it seems really unfriendly. You may hear things you don't want to hear, in various ways. But if you are not part of the discussion, you're not part of the discussion. You're not hearing the things that you need to hear. You're not speaking to the people that you need to speak to. And so you're out of the process; you're not really there anymore. Con liked to hang out on his own list, where he had people who really liked his work. Those are the ones who were motivated to subscribe to his particular list. So he was hearing all these people saying, yeah, go Con, you're doing great stuff, we love it. But the larger community, which was saying, yeah, we like this stuff, but we've got a problem here, was not getting through to him. And so there was a real disconnect there. That kind of disconnect creates problems for people. We've seen this a lot of times, where you have separate little communities that don't participate in the mainline discussion, and it hurts. You really need to find ways to avoid having that happen to you.

And then finally, and this is actually one of the key points of this whole episode: when you're working on any software development project, you need to aim for the solution, not for the merging of any specific body of code. If you look at what happened with the CFS scheduler, Con won. He got what he wanted. He got a completely fair scheduler into the kernel. He got the credit for having driven that idea. Nobody was pushing that till he did it. Everybody knew that was his work, even if it was not his code. But that was not what he wanted; he wanted the code merged. This happens a lot, okay? And it's something that you really want to avoid if you're working with the kernel, because kernel developers will try to merge the best solution that they can find.
I mean, developers in any project will, if they're thinking about what they're doing at all. So you need to aim for that. If you see a talk that Dan Frye gives about how IBM works with the open source community, he'll say that one of their internal policies is that if you as a developer push a solution forward for the kernel, if you make things better, then you are credited for that internally in your performance reviews, regardless of whether it was your code that was merged or somebody else's. It's a really good policy for a company to have. And I've seen this work with IBM a couple of times, where people have lost out on the inclusion discussion, and they picked themselves up and moved with whatever did get merged, because that was where their incentives were, and they were working for the kernel as a whole. So if you take nothing else away, take this away: what you're looking for is a solution to the problem, not the merging of a specific piece of code, no matter how nice it is to see your name in the changelog.

So, you know, by 2002 it was becoming clearer that we needed a next-generation file system, even if Hans was a little bit ahead of the rest of us on this. We needed something new; the ext-whatever and so on were reaching the end of their life. So Hans had all kinds of very interesting ideas about how you would do a file system, and he got some funding and he started developing Reiser4 to implement some of those, even if Reiser4 wasn't actually his end vision for how systems should work. July of 2003 he'd posted the first version of the code. No, rather, in 2002 he posted it; by 2003 he was trying to get it merged for the 2.6.0 kernel, saying, okay, it's ready, let's put it in.
There were some very interesting discussions at that point about how he was solving all of our problems and how the code should go in. But it didn't at that point, because Linus was not really interested in merging stuff. The feature freeze had already been on for a couple of years, and we merged way too many features during the feature freeze. And so that one didn't get in. It's the only one, perhaps, that didn't get in, but it didn't get in. But in 2004, he did get it into Andrew Morton's -mm tree, which was seen as a track for merging into the 2.6 kernel at that point. But it kind of languished there for a long time. He tried to get it into 2.6.14; he tried to get it into 2.6.19. Every now and then he'd come back and say, it's ready, please merge it. But it didn't go in. And then, as of the time of his arrest, it really had no way to go in, and since then it has pretty much died, even though there are still people who post a patch to it every now and then to try to make it work on current kernels. But for all practical purposes, this is a dead project.

So why is that? Why did we lose what can really be thought of as an innovative body of code from a very brilliant and talented developer?
Well, there are quite a few problems, not all of which are even listed here. Reiser4 was not really a POSIX file system, even if it was meant to plug into a POSIX system. It's the only file system I've ever found where you can change your working directory into a plain text file and then read out the metadata as little files. You know, you'd cat the creation date and you'd see when the file was created, things like that. And very much more: there was a transaction engine and an interpreter built into the file system, although some of that stuff got ripped out over time as he tried to get the code merged. It was a very strange thing, trying to implement a vision of the operating system future that was not Unix, right? It was something totally different. And that's a problem, especially, you know, as soon as you start trying to break things or change behavior, you run into trouble.

Lots of technical difficulties: people were finding ways to deadlock the system. There were a lot of implementation problems that came about largely due to the fact that the file system was developed really behind a company wall until he was ready to post it, and by then it was too late to change a lot of fundamental design decisions, and that can be a lot of trouble.

Hans's approach to benchmarks was creative, and other people could not get the results that he did. In fact, I had a long discussion with him because I posted some results that didn't match his, and it turns out that you had to create your files on the file system in a specific order to get the kind of results that he was getting, and things like that, which tend to make the benchmark less real-world than it even was before. And tied to that was a very antagonistic approach to dealing with others when people raised criticism.
He would bite back at them. He would accuse people of conspiring to keep his work out of the kernel, accuse companies of conspiring to keep his work out of the kernel. It was a very aggressive approach that he took towards people who said stuff that he didn't like, and that very quickly made people not want to deal with him anymore. And so it cut him out of a lot of the conversation.

And finally, people remembered the Reiser3 episode that I mentioned before, right? And they thought, okay, we're gonna merge this file system, but then you're gonna move on to Reiser5. You're not gonna want to maintain it anymore. We're gonna be stuck with it, and we don't like that. So for all these reasons, he really had to push uphill to try to get this file system merged. And he maybe could have eventually gotten there, but he didn't, right?

So, lots of lessons. Linux is not a research system. Okay, there's a whole lot of innovative code in Linux, people doing very interesting things. But this is a system that people use to do real work, right? We're not doing research with it. So if you are trying to take it off in very strange directions, you're gonna run into resistance. It's gonna be very hard to do, especially if you're doing things that change behavior visible to user space. If you're really trying to change the way the system functions at that level, it's gonna be very hard to do. Not impossible, but very hard.

No matter how smart you are, and no matter how interesting your vision is, if the implementation is not good enough, it won't go in, right? Say you've got a visionary file system that will deadlock the system.
You lose. It's not going to go in. You have to have the technical side there. Quite simple, right?

If you see conspiracy theories in what's going on around you, you're gonna have a hard time. You still see this. I've seen a couple of episodes of this just in the last few months, of people saying, you know, this is company X's agenda to keep my code out of the kernel, to promote this and promote that. And while I would not say that this never happens in the kernel development community, I will say that it is very, very rare. If you look especially at the upper levels of the community, the people who are in the maintainer roles, these are people who have been working on the kernel for 10 or 15 years or more. They all fully expect to still be working on the kernel five years from now, but they have no idea who they'll be working for five years from now, right? So in a real sense, they're working for the kernel, while trying to keep their employers happy at the same time. There's really not a whole lot of conspiring to promote company agendas at the expense of others in this way. It wouldn't fly. It doesn't happen. People are human; things will happen.
But if you think this is going on, you're probably wrong.

And finally, the community remembers what has happened in the past, and they think far into the future. They think, okay, what is going to be the situation five years, ten years from now, because we'll be stuck with this code? And if they don't like what they see when they think of that, then again, you'll have trouble. Maintainability is a key issue, in terms of what the code looks like and whether you'll be there to stay with it.

All right, SystemTap. Back in 2003, Sun Microsystems came out with DTrace, this fancy dynamic tracing environment, and said, okay, this is great. We've got visibility into our kernel that nobody else has; this is why you should all be running Solaris now. And they promoted it pretty heavily. That inspired a certain amount of activity within the Linux enterprise community in particular, trying to come up with a credible alternative to this. And so it took a little while, but in 2005 a RHEL 4 update included a thing called SystemTap, which was really aimed at solving the same problem: trying to place probes into the kernel at arbitrary places, into a running production kernel, right, not just a debug kernel, collect the data out, perform statistics on it and do aggregation, and come up with very nice, clear pictures of what's going on inside your system. So that was there. It was under development. In fact, it had a very large development team, about a dozen developers working on it full time for four years over this time. But it still never quite got into the kernel. Instead, in 2008, we merged a thing called ftrace, which was a very simple function tracer.
It would just trace function calls in the kernel. Ftrace has since grown to pick up all kinds of other sorts of tracing functionality. In 2009 we merged perf events, which is another piece of this: a mechanism for getting events out of the kernel and performing statistics and aggregation, all the sorts of things that you want to do with a tool like this. And perf events, too, has grown amazingly and has developed all kinds of new capabilities. Between the two, they're growing into what really looks like the next-generation tracing functionality within the Linux kernel, even though in 2009 we saw the 1.0 release of SystemTap, and just a couple of weeks ago we saw the 1.4 release. But I don't really expect to see SystemTap in the mainline ever at this point. Which is too bad; it's a big project, worked on by a lot of people who are really trying to do something useful.

So think back to 2008 again, at the Kernel Summit in 2008, when we were talking about tracing. Somebody asked the crowd there: how many of you have tried to use SystemTap? About half the people raised their hands, saying, yeah, I tried to use SystemTap. How many of you succeeded? And most of those hands came down. But 20% of the people, 20% of the kernel development community, had actually succeeded in using SystemTap. All right, these are not random users. This is sort of the top level of the kernel development community. If these people can't do it, then it's gonna be really hard to make work.

Yes? Of the 20% who succeeded, how many of them were working on the project themselves? And the answer is, some of them. But there were other people as well; there were people who had actually succeeded in doing that. If you worked at it hard enough, you could do it, okay, especially if you just, say, installed a distributor's kernel.
It was already built to do that. So, you know, this is a bad sign, right? If this crowd can't make it work, it's going to be really hard for your average system administrator to get something useful out of it. This was sort of reiterated more recently by Ingo Molnar, who said, in short, that we really shouldn't be focusing on requirements from CIOs or whatever. What we have to focus on is usability for developers and so on; we need to really aim at what they're after and nothing else. So what it comes down to, quite simply, is that if the kernel development community doesn't see the value of something, it's not going to go into the kernel, right? The kernel developers didn't see the value of SystemTap because it did not work for them. It did not solve their problem, and it was a pain in the butt to try to make work in the first place. So for all these reasons it didn't go in, beyond the sort of technical objections that also exist, but which could somehow be worked around if there were motivation to do that.

So, a related example, sort of showing the same thing; my last example. This was called, at the time, TALPA. It showed up in 2008. TALPA was a security-oriented subsystem, the idea being to provide hooks for virus-scanning sorts of utilities. They wanted to put in a hook so that whenever some process on the system would open a file, the TALPA daemon could actually intercept that open operation, go scan the file, look it over really quick, decide if there's anything evil in there or not, and then either allow the operation to proceed or block it forevermore, depending on what it found. So this stuff was posted. It did not go in, and there are a lot of reasons why it didn't go in. But one of them was that the kernel developers really did not like it. They said, well, why should we bother implementing a really broken security model to protect systems that are not Linux from viruses, right? Why should we do this? Why should we bother?
This is not something that we're interested in. Beyond that, they didn't get their requirements right. The requirements said, basically, we need TALPA. They couldn't say what they were trying to defend against, that sort of thing. They didn't really define their problem very well. And so for all these reasons they were just told, you know, go away, we don't want this, that sort of thing.

But if you look more recently, back to last August, there's a thing called fanotify that was merged into 2.6.36. Well, I need to put an asterisk on this, because there have been some system call interface issues that have actually kept it disabled. You can't actually use it yet, but it has been merged. And the main purpose of fanotify is to provide hooks for virus scanners. Sounds very familiar, all right? And in fact, it's the same code. Well, it's not the same code, but it's a derivative; it's a descendant of that same code, done by the same people.

So what changed? There are two things that changed in here. One was that the notification mechanism was not only cleaned up, but the developer went and took the existing two notification mechanisms that were already in the kernel, unified them, cleaned them up, and created a single notification system that served all three users in the kernel, thus cleaning things up within the main kernel, the main VFS layer, considerably. People like that kind of stuff, right?
Simplify the code, make it work better, that sort of thing. And then they rephrased the requirement, and they said, okay, what we really want to do is to enable these proprietary virus-scanning applications, which exist whether we like them or not. We want them to hook into file system operations without having to behave like rootkits and patch the system call table, which is what some of these things do. Right, it's really pretty evil, the way some of this stuff works, but that's what they had to do to get the functionality that they wanted. So we provide a way that they can do it officially, using a supported interface, and we don't have people abusing the system like that anymore. Between those two things and quite a lot of work, he was able to get this code merged.

So sell your code to the developers, right? Don't try to sell it to managers or even to customers; you have to convince the developers. This isn't always a good thing. Sometimes it can be hard to convince the development community of stuff that there really is a user base for. All right, but that's really just the way it works. That's what you have to do. They're the ones who are making these decisions. It's not their managers who are doing it, okay? Yeah, and things like user-space issues and so on, which is why fanotify still isn't actually available, because once we put an ABI in the kernel, we can't break it. So there's a whole lot of emphasis on getting that sort of thing right.

So we've seen a lot of examples. There's an awful lot more of them out there; I could have filled several slides like this if I'd worked at it. If any of these are particularly interesting, you can ask me in the questions afterward. I should have time for at least one of them. But the point is that there's a lot of them, right? Things go wrong fairly often; the potential for trouble is there. So, to sort of finish out by saying, well, okay, so why bother? This seems really hard. This seems like a pain. Why should we do this?
Let's go off and hack on, you know, I don't know, content management systems or something where the standards are a little bit lower. So why? Just a few platitudes here, starting with the fact that it's not as hard as it seems. Okay, all you really have to do is to pay a bit of attention to what you're doing, read some of the stuff that's been written about how the development process works, and listen to what people are telling you. People will try to help you get your code into the kernel. They really will, even if their help takes the form of: this stuff over here, it's really crap, throw it away, burn it, and so on. They are actually trying to help you, and if you listen to them, you'll get somewhere. Okay.

Um, it's fun, right? It really is fun. Once you get going with the system, it's something that you want to do. It's a club that not everybody can join. All right, you really do have to do a little bit more than looking good in Speedos. But with some effort you can get there and be part of something that is really pretty cool. If you're looking for work, you can get work that way pretty easily. I don't know how much of a concern that is for people like that.

But the key thing really is that this is how our community works, right? This is how you get the kernel to do what you want it to do. This is your vote. This is the way that you can drive things in the direction you want to go. If you don't do this, don't complain about where things go, right? This is how you do it, and it's really worth doing. So on that note, I'll leave you with inspirational words from a former vice president of the United States. No, and it looks like I have a few minutes for questions.

So I'll bring the microphone round. Plenty of hands. One, two. Yep. Fine.
We'll start and work around.

Thank you. Just reacting to what you said about reiser4 and the fact that it's very difficult to get a change in if it changes the visible behavior. And yet, actually, I think we've seen such changes over the history of the kernel, things like udev or NPTL, that were well accepted by the developers and users at the time. So isn't it simply a matter of the fact that maybe Hans Reiser was so antagonistic and so sort of unwilling to listen to the remarks, rather than the fact that his work was indeed changing some of the established semantics at the time?

Well, as I said, it was a function of a lot of things. But you have to look at how you change behavior. Udev did not actually change the way the system worked before. You didn't have to run it. You didn't have to use uevents. You could still have your /dev directory with 20,000 device files in it, and if you were happy with that, you could do that. Okay, if you're running reiser4, then you kind of buy into what comes with it, and some of what comes with it imposed requirements on the virtual file system layer and so on. So it was a little bit hard. I mean, you know, but that was just one factor of many; as you say, there were quite a few others. Yeah, yeah, if you break applications, things don't go well.

Just wait for the mic, if you would.

Jon, you've been talking about so many failures. I'd like you to elaborate on one sort of success story of a patch that's incredibly intrusive and at a very, very deep level, but really, apparently, works, and people like it, which is Nick Piggin's work on VFS scalability. What are the things that Nick did right, and what can others learn from him?
Okay. To describe those patches would take a while, because that's very complicated work that Nick did. But in short, Nick Piggin's VFS scalability work has improved the performance of a whole lot of core virtual file system operations on highly parallel, you know, multi-core sorts of environments. You got to a situation where, if you have a lot of processes trying to open files at the same time, they were contending with each other for various locks in the kernel and so on. If you're running, for example, a big web server, then that may be exactly the workload that you have, so people were running into this. So Nick fixed this, and he did it with some very intrusive and very tricky code, but he did several things right, starting by being a very good developer and doing the code right, listening to most requests for changes when things need to be changed, things like that. But the key thing that Nick did is he didn't change the behavior that you see. If you're a user of the system, all you see is that it goes faster, right? So in that sense it wasn't intrusive at all. If you run the VFS scalability stuff, unless it breaks, nothing changes except that a bottleneck has gone away. So that's a relatively easy example. There are plenty of others where things had to be pushed harder. There was one more.

Yeah, thanks, Mike. So, about the scheduler example and CFS: you said that what happens often is that coders want their code in the kernel rather than solving the problem. And isn't that an indication of the fact that the mechanism used to give credit for ideas is not up to par with the mechanism used to give credit for code? Or do you think it's just that programmers, as is normal, just want their code in?

Well, it's both. Okay. I mean, certainly it's normal to want your code in the kernel. When I put a patch out there, it's because I want it in the kernel, right?
But certainly there is more visible credit worldwide to having code in the kernel than to having come up with a useful idea. Okay, I mean, if you look at my own website, I post statistics on the number of patches merged and lines changed. I don't post statistics on useful ideas contributed, because I can't count those. Are those credits noted anywhere? And the answer is that people try to do it in the changelogs. Okay, so if you look there, sometimes in the code as well, but in particular the changelog is the place to do that, right? If you look, just the other day there was a patch posted to remove the big kernel lock from the kernel altogether, and he listed like 20 people who had done the key work in making that particular thing possible, even though it was Arnd Bergmann who actually put the patch in to do that. So it's there, but that's hard to find. You have to look for it. It's very hard to create the worldwide fame that you can get just by getting lots and lots of code in the kernel. That's just a sad fact of life. You know, people in the community perhaps understand where an idea comes from, but beyond that, it's hard.

Okay, we've got time for one more, if there is any one more. No? Good. Okay, in that case I have the gift. Have you heard the story of, you have my one, that's the tidal wave that washed the factory away, and they used these to paddle to get back to land, apparently. Thank you very much. That was a really inspiring talk. I hope everyone