Good morning, everybody. I'm Jon, I'm in charge of being wrong about the kernel on the internet, and we have an interesting panel of folks to talk about kernel development issues here. I'm going to start by having them introduce themselves. I have a whole long list of questions, but we do have the microphone, so if you can possibly make your way to it and you have a question to ask, I'm sure they would be more than happy to answer it. I would encourage you to do so. But let's just go ahead and start with Thomas: if the panelists could introduce themselves and say what you do and why you're working with the kernel.

So, what I'm doing: I'm maintaining the interrupt subsystem and the timer subsystem, I'm part of the x86 maintainers team, and I'm currently the hobbyist-state maintainer of the real-time extensions.

And your name is Thomas.

And my name is Thomas Gleixner. I have a strong affinity for Mission Impossible.

I'm Frédéric Weisbecker and I work for Red Hat, mostly on the nohz and dynticks features, so it's a subset of timers. So somehow I work for Thomas.

My name is Grant Likely. I am the maintainer of the device tree subsystem, and I'm also on the Linux Foundation Technical Advisory Board as the chair.

My name is Julia Lawall. I'm a researcher at INRIA in Paris. I work on tools to improve the quality of Linux code, in particular on Coccinelle, which is a tool that has been used for making a lot of patches in Linux.

My name is Boris and I don't know why I'm here. Somebody thought I could have something to say. Other than that, I do a little bit of RAS maintainership and a little bit of x86 and everything else that's fun in the kernel.

All right, thank you. Well, when I was asking people what I should be asking the panel, there was one topic that came up over and over again, so let's just go ahead and get this over with, and then we can get to the more fun stuff.
We've recently had a prominent developer out there saying that the kernel development community is sick. That was the word that was used: that we do not relate to each other well and that we are driving people out. So I would like to ask what your perspective is on this, and I would ask that we focus not on this particular developer, because she is not the only one to raise such issues, but on our community: what is the health of our development community, and what could we maybe be doing better? Let's start at the far end. Is there a problem?

I don't know. I mean, I never had a problem with the kernel community, actually. They can be harsh, but, well, people tend to self-select, so you tend to work better with some people and not so well with others. So I don't think we have a problem.

So are we selecting for the people who can actually stay in the environment that we've created, or do you just have to find your particular place?

I wouldn't say selecting. I mean, if people choose to stay with development, they stay in. This is my experience only; it's just my humble opinion. I never had a problem, or just a little one.

Julia, what do you think?

So, I don't work on any particular part of the kernel. I fix bugs all over the kernel, so I have interacted with a lot of people, maybe not in a very profound way, but at least in a sort of one-time way. Overall I've been very impressed by the supportiveness of people and the constructiveness of comments. Perhaps some people are more harsh about some things, but sometimes one does something that's foolish, and maybe harsh comments are deserved. I think if one is very motivated to work on the kernel, one has to just push forward somehow.

In my experience with the kernel, I think 99% of the time all the communication and all the conversations have been great: they've been respectful, they've been really productive, and good work has been done. But I don't think the problem is the 99% of the time.
I think the problem is that when things do bubble to the surface, those are the conversations that can end up setting the tone for the entire community. I am concerned about this. I think that it keeps some people out of the community who would otherwise be more involved. I can't speak for all kernel maintainers, but I can speak for myself, and from my perspective I don't want to work in a community that will tear down other people, where the anger bubbles to the surface too quickly, or where conversations start to degrade into attacks upon a person or abuse of a person instead of vigorous debate about the code and the technical things that we're working on. So from my perspective, I think it's really important for the health of the community not to just say, well, that's 1% and we'll just ignore that, but instead to be really clear about what my position is on the way that we interact. For me, I don't ever want to be in a conversation where I am tearing down or abusing another developer who is working here. So if that happens on my mailing list, I'm going to call you on it. I'll talk to you about it privately, but I'll call you on it, and I will hope that when I screw up, when I start to get emotional about something and cross the line, you'll do the same for me.
So if there is something that I think we could do better as a community, it is perhaps the development process, which I think is sometimes a bit heavy, not only for newcomers but also for regular contributors. At least that's how I feel myself; I can't speak for everyone. The life cycle of a patch is usually very, very long. From the time you write the patch, well, that is just 5% of the development process. Then you have to write the changelog, post to the mailing list, wait for reviews, and then iterate again. So sometimes, if you just have a one-line change to make, you think about it twice before investing in it, because it's always quite some time to invest, even for very small changes.

I know that I'm one of the persons who can get emotional on the mailing list. I'm not looking for an excuse for that. But on the other hand, if you're in a maintainer position and you get bombarded with patches, and the patch quality is lousy as hell, you go there and say very politely, "Hey, this is wrong for a technical reason; please change this, please take a different approach." The same person comes back within 24 hours with another patch series which is similarly wrong but slightly different, and you start over and make it a little bit more clear. I'm really trying hard to keep calm, but after you get the fifth version of shit, really, there is a point where I can't stay calm anymore, because it doesn't help. The only thing that helps at some point is to tell these people to go away and look for someone who is actually willing to help them do their job. I can't do that; I can't herd hundreds of random developers all over the world. That's just a question of time. I mean, I tried the clone thing; it always comes back with "too many instances."

Actually, I can understand what Thomas says. I mean, you sit there and review patches all the time, and you get the next version, and you say change it this way, change it that way, and then you get a
different change, a completely different one, and sometimes you just...

I'll also say there are a lot of positive examples where you can actually work very well with people. You explain to them where you're going, they come back after a week with intelligent questions about it and maybe a different solution to it, we discuss it, and then they go back and write a really great patch. That's the fun part of being a maintainer: you see that they actually thought about it, like really thought about it.

Right. Well, and I think that's important. One of the key factors of our community is that we're really good at attacking a problem and coming up with good solutions, and we're not okay with hacky solutions. I don't think the two things that are going on are mutually exclusive. We cannot give up the technical excellence that we demand, but I think at the same time we can be clear that we're not going to attack people in the process.

Well, what I do, for example, when I'm really frustrated: I go for a run.

We have a question from the audience here.

I have a two-part question about device trees. Can I go now? Part one: when are we going to get them out of the kernel tree? Part two: when are we going to see them perhaps used more prominently on other architectures, x86 or anything other than the couple that are currently using them? Thank you.

Right, so I guess I'll take that one. Okay, so the first part, device trees out of the kernel: we've gone back and forth on that a number of times. It would be really good to have a separate repository. It's really painful for some developers to do that for a lot of the embedded platforms, because now you've got two repositories that you need to build from. So we've made some moves in that direction: there is actually a tree on kernel.org that's been maintaining a mirror of just the device tree files. That's been kind of a staging first step. A big part
of the problem is that no one's working on it. You know, to have someone get a bee in their bonnet to actually get this done, and make it so we can move the device tree files out of the kernel piecewise, so that we've got some in the kernel tree for early development but are still able to go to an external repository to get them, that would be cool. As for seeing it on other architectures, there's going to be a session later today where we're going to talk about what the next steps are. One of the things that I think needs to be done: we do have a specification for device tree, and it's kind of old. It's been around for a few years, and there are a lot of new things that have been done. I think getting that spec restarted is one important part of that. As for other architectures like x86, there is a little bit of use on x86 for specialized use cases, like when you've got a PCI add-in board with, say, an FPGA whose configuration may change. But it really is a question of what people need. I am happy to merge the code. Device tree works, all the test cases will pass on x86, but I haven't seen patches from people who have got things they are burning to be able to do on x86.

We always hear that the kernel is moving at this incredible speed and we get more and more patches, but there are certainly parts of the kernel that don't get so much love. So what do you think are the parts of the kernel that we should put more work into?

The tty layer.

We have a volunteer!

Definitely not. I saw a lot of people go nuts on it, so I don't want to be part of that.

You know, the parts of the kernel that get work are the parts of the kernel that bother people, typically. If something is not working the way you want it to work, then eventually you get fed up and you go and fix it. So the parts that are unloved, often that's an indication that it's something that's not really being used all that much. Either that or
it's just something that nobody can quite be bothered to get around to fixing yet.

Something that was on my list: when are we going to fix the year 2038 problem?

I wish to do it. I mean, from the core timekeeping perspective we are in good shape now. We have converted all the core code over to use a 64-bit seconds representation, even on 32-bit machines. That has been merged in 3.17, I think. So yes, but there's still the hard work to do, to go through all the affected user space interfaces. Cleaning up the in-kernel uses is pretty much a no-brainer, but the hard part will be to fix up the user space interfaces, because we actually cannot change them without breaking the world and some more. The BSDs really did a great job on it, but they have it easy: they have the user space in the same tree, so they did a wholesale change, recompiled, and waited for what explodes. What we have to do is actually go and analyze each and every syscall interface. The syscalls are not the hard part; the hard part is the ioctls, and there's a zoo of them. So you have to go through and look for stuff which uses timespecs, timevals or time_t and create either new ioctls or new syscalls. The next step is going to be that user space has to support the new interfaces, and then applications have to take them up. So I'm not afraid of the kernel part of the work; I'm more afraid of the overall ecosystem change.

So, along the lines of the overall system and parts of the kernel needing love: we used to have a regressions maintainer, somebody who tracked the regressions in each kernel release and helped to make sure we fixed them. That person found other work to do some years ago, and we have nobody doing that. So there were certainly concerns that we would lose track of regressions and that perhaps our kernel quality would go down. What we've heard from Olaf, for example, is that this has not happened; what we've heard at the kernel summit as well
is that this has not happened; that, if anything, we're producing fewer regressions than we used to, both in terms of functionality and in terms of performance. I'm curious if you all agree that we're getting better, and if so, how is this happening? How are we getting better?

I think most of it is that we do have more tools, we do have more in-kernel debug infrastructure, and we do have more fully automated testing, which catches a lot of things before they start to explode in the face of users.

We've also gotten better at the process, at what we're doing. It's very, very clear: before a maintainer sends a tree to Linus, or a sub-maintainer to their maintainer, you make sure your kernel builds with allmodconfig on x86. These are things that I think a few years ago were more sloppy than they are now.

And we have the zero-day testing. I think that changed a lot, improved a lot. And stuff like what Julia is doing with her static analysis tools is actually finding a lot of interesting bugs in the kernel.

Yes. For people who are not familiar with it: every day a whole bunch of different compilations with many different options and many different configurations are run, and a number of tools that are included in the kernel are run over all of the patches that have been contributed. Then the developers who contributed any patch that causes a problem are informed immediately, and hopefully, since they were thinking about the code within the last 24 hours, they will be motivated to fix up the problem quickly. I think in general the response has been very good, and people fix up their code as they should.

Oh, and also, I tend to see bug reports come up now and then, but not so often. We could use somebody to, I don't know, track those maybe. And another way to get involved in the kernel is not just fixing things but trying to analyze a bug and trying to fix it. It's a much better way than cleaning up white space.

Yeah, we have enough white
space maintainers.

Okay. One of the more interesting changes that's gone into 3.18 is in the networking layer. It's a very simple change: it takes the form of a flag passed to the device drivers saying that there are more packets coming, so you don't have to actually kick the hardware yet to get it going. This has improved our transmit performance considerably, especially with small packets, which was a big problem for Linux for quite a while. A very simple change, and Jens Axboe kind of ironically pointed out that the block layer has had a very similar mechanism for years, where they have said there are more requests coming so you can do this. So my question is, and I want to start with Julia, since perhaps you put your fingers into more parts of the kernel than almost anybody else around: how are we communicating between the various subsystems of the kernel and moving good ideas from one part to another? Do you see bad ideas that show up and need to be fixed here, and then they show up again over there, that sort of thing? I mean, sometimes it seems like the kernel has a lot of different silos and areas that people don't understand across. Is that a problem, or do we actually do well at moving ideas across the kernel?

Yeah, so I'm not sure we do so well. An example that I have studied a bit is the devm functions, which allow managed memory. This is something that was introduced in 2007, and I made a graph for one kind of driver where it could be used. The uptake was extremely slow for a long time, and then eventually it took off. Then it slowly became available for more types of devices, and maybe as people have gotten more aware of it, the take-up has been quicker for some other kinds of devices. But there are still a lot of other opportunities for its use. In some sense it's not a big thing, it's just the management of different kinds of resources, but it's something that
people very often do wrong, and the use of these devm functions eliminates the possibility of doing things wrong. So it's something that in some sense is very important. So I've seen, in that case, the communication between different subsystems being rather slow.

I tend to use LWN when I don't know. You're doing a great job.

Thank you. Anybody else have any thoughts on that issue? Grant, how does ARM learn from the other architectures? Are the good ideas coming from other architectures or going to other architectures?

Painfully. Is that a useful answer? I think that ARM and x86 are actually two extremes of different ways to approach platforms, and there have been a lot of things that have worked quite well on x86, in the server and general-purpose desktop market, that we've had to learn the hard way on ARM. A good example is a single kernel image: instead of having to build a separate kernel for every single platform, we want to have a single kernel. We recognized that there was a problem, that we could not just continue to do what we were doing, and then it was a matter of going and looking for the solution. Okay, we know this isn't good, we know we need to have a single kernel image, and then we go and look at x86 or PowerPC when we're trying to solve the individual technical problems that are associated with that. So it's been something where only when the problem is recognized and becomes important or urgent do you actually go looking for these other things. Another example would be RCU. RCU has been in the kernel for years and years and years, and I'm just now looking at bringing RCU into the device tree infrastructure, because there's a patch set that's been worked on over the last couple of years by Pantelis to add dynamic
changes to the device tree. Well, we've got this kind of awful locking problem, and RCU solves that locking problem. I wouldn't have made the jump to do that, because there wasn't a burning need to do so until this happened.

Okay, that sort of leads into another question I have. RCU, the read-copy-update mechanism, is great for scalability and so on, but it definitely adds complexity to the code where it is introduced. In a lot of places we're adding a lot of complexity for what sometimes seem like crazy use cases, you know, people wanting nohz CPUs, for example, or things like that. Are we approaching a complexity cliff, or are we managing that?

So far, I think, due to the fact that we spent a lot of time in the last 10 years actually moving complex stuff out of the architectures and into the core code, we have a very small spot where we handle the complexity. Yes, we are heading toward the complexity cliff, but we are taking care not to jump down the cliff too fast.

I think we've done a good job as well of compartmentalizing the complexity. The block layer is complex, but it doesn't interact with CPU power management, which is also complex. Now, there are varying levels: some things are nice and contained, other things end up having hooks all through the kernel. I think we're doing okay, though.

And we're good at reverting stuff where we went down the wrong road. Once we stand just in front of the cliff, we usually turn back.

There's a moment of "oh, that's awful, let's not do that."

And at some point you have to go down to that point where we actually figure out it was the wrong approach. I mean, some of the approaches worked very well 10 years ago. CPU hotplug is one of those. I mean, it still works in some way.

Yeah, and that's one of the areas where I think we're actually quite conservative and quite careful when we start touching the complex parts. I mean, the CPU power
management is a good example of that. Power-aware scheduling has been an ongoing topic for the last number of years, and some people are frustrated that it's not getting into the kernel. But this is not an easy problem. It's not easy to come up with a solution that's going to work across all the different platforms that Linux runs on. So in a sense it kind of has to be painful, to make sure we're vigorously dealing with all the corner cases that we're going to have to deal with.

I think complexity is also on the table when you do review, for example. You always ask yourself: is the complexity worth the trouble? Is it going to bring us anything? And I've seen cases where we add the complexity, and a couple of days later we just remove it because it's unmanageable. Make it simple; it's better sometimes.

We have a question.

Hi. So, there was a keynote in San Jose by Tim Bird about running Linux on small devices, and there are a lot of differences of opinion on how small is too small to be considering running Linux. What would be some of your thoughts on where that barrier might be?

I would say the barrier is the amount of time someone's willing to work on getting the Linux kernel running on that device. My view on it is that the work to shrink the kernel is great, it's fantastic, I support it all the way. The problem is that there are very few people who either have the funding to work on that, or an interest, or a product that they actually need it for. Especially on the small devices, the smaller devices just keep getting cheaper, and it's easy to throw more resources at it. So there are lots of places where shrinking down the kernel and making it smaller would be valuable, but it hasn't gotten the attention, it hasn't gotten the priority, to actually work on it and make it happen. Now, as far as what is
actually too small, there is of course going to be a limit. With our whole code base, you're not going to run it on an ATtiny 8-bit microcontroller. But I don't know how to answer the question of what's too small, because if someone's successful at getting it running and the patches aren't awful, why wouldn't we pick it up?

You say that, Grant, but there's been some real pushback against some of the tinification patches that have been put out so far: either they bring an explosion of configuration options, or they would, say, put the network stack into a non-standards-compliant sort of form, good enough for the people who want to use it but not good enough for the networking maintainers. We do run into people who resist that kind of change, and it makes it harder to get it in. In general, one could extend the question to how far we should push things in the kernel to meet the requirements, the use cases, of people that we might regard as being crazy users: the people who want to have CPUs with no kernel involvement at all, so they can run their task on it and not even have to deal with the latency caused by an interrupt, or people who want to run thousands and thousands of processors, or people who want to run on some sort of little smart dust thing that you're going to spread all over the world. These are use cases that push the boundary. How far are we willing to go to support those?

I guess as long as it doesn't violate common sense, violate the functionality itself, or restrict further development, we will go the same route we did for the last 20 years, which was to accommodate the needs. But then, the example you brought up with networking, I mean, that's why I said it violates common sense. If you want to have the internet of things and then remove the firewall code and the iptables code from the networking layer... I mean, everybody's talking about security. I don't want to have an
insecure refrigerator in my house. I mean, I don't want to have an internet-connected refrigerator at all, but if people insist on having that, then we should actually tell them no. Getting rid of security features and just having the next annoyingly wrong and incomplete TCP/IP stack out there is not an option we actively support. We don't want to encourage people to do stupid things.

The counterpoint to that is that there is going to be a certain segment of devices that are going to require a smaller kernel, that are not going to be able to have all the features. So the choice then presented to them is either to get the features removed from Linux, or be able to turn them off, or to go to a different operating environment. And I'm not comfortable pushing people away just on that. What I think is more important is the comfort level of the maintainer on whether or not those patches are just going to cause a maintenance nightmare over the long term. I think the explosion of configuration options is probably the most valid objection to turning things off. If it's going to make the code base difficult to maintain, then those are the kind of horrible patches that I think we should be saying no to. But otherwise, I don't see the problem.

Well, I see a problem in encouraging people to do stupid things. I mean, people do stupid things anyway, but I as a maintainer don't want to encourage them and actively help them to shoot themselves. I'm feeling bad about that, really.

Okay, we have another question over here.

Hello. So I'd just like to lend my voice to the small kernel. I don't think the internet of things is going to be so much refrigerators being connected to the internet, but I think...

Yeah, they're talking about putting Linux into light bulbs, I know.

Well, I was thinking more of, like, all the biometrics, you know, my heart rate and steps and things like that. And I'm seeing lots of things
where having a secure network stack, a secure BLE stack, in a pretty small footprint would be exceptional. And we're seeing people kind of fork off and use FreeRTOS and other things because they can get these things there. But having Linux be able to fill that niche would mean just a wealth of opportunities, you know, probably even 10x or 100x what Android has afforded Linux these days. That's just what I've seen. Any comments?

I think there are really valid use cases here. I don't think it should just be... I think it's something that we should work towards.

Yeah, I mean, there's nothing wrong with getting Linux into your biometric device or your light bulb. I don't care. But we have to have the people who actively work on that and work together with the maintainers to not make it a nightmare. One of the patches I saw floating around was a hell of an #ifdef horror: over 10 source files with #ifdef constructs sprinkled in. And then I'm going to say as a maintainer: no way, go back, do it in a different way, figure out how to do it properly. We have a lot of mechanisms to do it right. But then we have to have a lot of people going all over the tree and figuring out how to solve that tinification problem in the various places, working with the various maintainers, without imposing any barriers on the development of Linux and on the overall evolution of the code. I mean, the solution cannot be, okay, we're sprinkling 10 tons of #ifdefs around the kernel and then running away screaming, "Yeah, we made it." No, that's not going to happen. If people come with proper patches, and they are not hurting anything, nobody will object to merging them. But if you apply the #ifdefs, and then you go and change something and break some
device, people go screaming, "You just broke my device," and then it becomes a real maintenance nightmare. So this is a big problem. If it's done cleanly, I don't think... it has to be well thought out. I mean, we have brought large and complex patches into the kernel without breaking the world, without imposing too much trouble on its evolution, and if the tinification stuff wants to achieve that goal, they should just do it. It's doable. It's not rocket science; it's just a hell of a lot of work, and you need people who are not scared to touch every other file in the kernel.

And the other thing with doing that work is that it's very fragile, because you're talking about a very small subset of devices that not a whole lot of people actually get around to testing. So without someone really keen on getting in there and making sure that the tiny configurations still work over the long term, it will end up bit-rotting; it will end up not being useful a year after the patches are merged.

To try to be a little bit more positive than what others have pointed out: when I look across the kernel code, I see a lot of old code coexisting with new code, and different small variations of things. It could be that a tinification perspective could motivate people to clean up some of that and improve the general quality of the code, its understandability, its maintainability, and so on.

So I think you're right: unification of copied code should be a large portion of the tinification project.

It's not just copied code, but small variations of things.

Yeah, we have mindlessly copied code all over the place.

Okay, I think that what we should do is move on to the next questions, so that we have time to answer them before we run out.

Okay, this is essentially just a comment from a different direction. At SUSE we have been working with the high-end machines, you know, 4K CPUs and whatever, trying to get that
stuff working with some vendors a while ago. I think when you're doing reasonably weird stuff, there's a certain amount of pain that you have to endure yourself. Which means there are certain things that we actually have tried to push into the kernel, and in other areas we got a firm no. That doesn't mean we have to turn away from Linux; it just means we have to carry those patches ourselves. I think that is also an option for the tinification: if you are a vendor of a small-form-factor device with whatever requirements on the small side of things, then rather than breaking the kernel for everybody, there's a certain, you know... it's just to actually keep some of the painful stuff on yourself.

With tinification that may really be true, just because everybody needs their kernel to be tiny in a different way, and they have their own thing that they can do without. So I think people will end up carrying some of that. Next.

Okay, I would like to ask a question. So the phones, basically pretty much all of them are running Linux now, but there doesn't seem to be a single phone capable of running mainline Linux. So is it going to change? Can we do something to help? It's a really big area.

Well, of course, now we're talking user space, because the kernel they're running is something very close to a mainline kernel. You can run mainline kernels on an awful lot of devices, with the usual caveat of binary-only drivers, which is a long-standing problem that we've been fighting for years. In terms of getting a mainline distribution running, people have done that on various devices.

I don't know a single phone that could run a mainline kernel. I'm talking kernel, actually. I don't know a single device that can do that. Like, I can run it on a serial port...

And there are a few devices that are mainlined. I can't remember them off the top of my head, but I don't know if they're, like, usable as a phone.

Not just... yeah, I can use it as a toy. I've got mainline running on my N1.
It's... I have a phone that's running mainline, and I'm making a call, I'm sending a text, but it's really mainline, it's not the vendor lock-in. Yeah, but this is really a user space problem, right? The kernel can do it. No, actually, no, that's not true. There's still a problem: there are a lot of patches that end up in the vendor trees for the kernel that don't make it into mainline. Pretty much every phone that's shipped has its own patches on top to make the hardware work. It's a whole lot better than it used to be, but it's still not there. And, I mean, I work for Linaro, so I see a lot of this across the industry. I don't know what the solution is, other than I know what little bits of the puzzle are, and one of those is continuing to go back to the SoC vendors and helping them and pushing them and encouraging them to mainline their code. So how are we going to fix the binary driver problem, in one minute? In one minute? Complain a lot. It's not something that we can compel, it's not something that we can force. The best thing that we can do is, I think, to be talking to these companies and continuing to say, you know, you're just making it hard for yourself; it's much better for the quality of the kernel, for the quality of the products that are coming out, if we can get open source drivers. And within the companies that have graphics IP or radio IP, it requires a culture shift within those companies. I mean, we can't do much about it as long as the phone manufacturers actually just accept the state of it. The only people who can put pressure on the chip manufacturers are actually the phone manufacturers, and if they don't care, there's nothing we can do about it. I mean, we can tell them that they are doing something stupid, but they don't want to listen. They didn't listen for the last 20 years; why should they start listening now? So no, it's not
something we can do much about. I mean, we can talk to people, but that's all we can do. Although to say "all we can do" seems kind of defeatist. It's actually quite... that's one of the things that we have at our disposal: we've got an awful lot of influence on the way that engineers are thinking, and especially, I think, engineers who are coming up through engineering school now are having far more exposure to open source and thinking in terms of free software than 10 years ago, than 20 years ago, and so on. So I think the culture is shifting, and that's a good thing. All right, well, looking forward to their contributions. Meanwhile, we are out of time. I think it's time for the break, so I would like to thank all of our panelists for joining us today.