 So, good afternoon. My name is Mark Zangier. I'm with ARM, as you can see somewhere. And I'd like to talk about the hurdles some of the contributors have when it comes to getting patches merged into the main Linux kernel. So, who am I? I've been playing with Linux for quite a while, since 93. And if you find that annoying, well, don't blame me, blame this guy who wrote the IS64 port a long time ago. Got my first patches merged in 96, in the form of the MD driver. I offloaded the maintenance chip a year later. So, again, if your disk arrays got corrupted, just don't tell me. I really don't want to know. I've been with ARM for over eight years now. And my job is mostly to bridge the architecture, some bits of hardware, and Linux. And when it comes to Linux, I look after KVM on ARM together with Christopher Dahl, who's here, and the IQ subsystem together with Thomas Gleisner, who's at the maintenance summit at the moment. In real life, I'm a base player. So, first, let's start with the disclaimer. This talk is not about a maintainer renting at contributors. It's really not that. First, it's a talk for everyone. It doesn't necessarily apply to only two first-time contributors. It's really for everyone. Who has already submitted a kernel patch in this audience? Raise your hand. Okay, quite a few. Who among these people got their patch accepted? Okay, a bit less, but it's not that bad. Okay. It's not that bad. It could have been a lot worse. So, there are things I'm going to talk about here that not all maintenance would necessarily agree with. And that's fine. We're not necessarily about having a massive consensus about how things are done, but it's about discussing them. And if you get flamed for anything in that talk, for doing anything that's explained that talk, send them my way, and we'll have that discussion. So, over the years, you would notice if you paid close attention that there's a bit of a disconnect sometimes between contributors and maintainers. And you have things like my patches are getting completely ignored. Yeah. That's annoying. I can see that. I've posted these patches four times, and yet, they're still not in. Okay. It's, something is wrong here. It shows that we're not following up on something, either the maintainer or the contributor. There's no blame to be put here, but just outlining that there's a potential issue. I've copied this from a mainline driver, you know, actual existing code. I've copied it into my brand new piece of code, and you're telling me now it's not right. So, how can it be in, which is good, and out at the same time? I only want to get this code merged. I don't have time to do this extra work. And you're starting to see where that's going. There's a bit more disconnect between two people talking to each other, maybe past each other. I'm giving you this code for free. That's an annoying one. And this one, which I've seen personally a number of times, I don't have time to really get what you're asking me. Just tell me what I should write. Okay. We're at odds here. So, it's a story about how we deal with the kernel code, the kernel source tree, and there's a number of characters in that story. So, we have the contributor, which submits a change to the external source, to the mainline in external source. I'm not talking about some random SOC-specific fork. We're talking about mainline. That's all I'm interested in. And they intend to get that change merged eventually. So, that thing could be a new feature, a bug fake, some cleanup. It doesn't really matter. The complexity of that change can be ranged from one extreme to another. And in the extreme case, the extreme complexity, the contributor is often the one who knows best about it. So, we have a reviewer. It can be distinct from the maintainer. Often, they're often another contributor themselves. And they have spreading the load so that maintainers actually can scale. And they're probably the least recognized character in that story. We heavily rely on reviewers. No code can get in without being reviewed. And then we have the maintainer, who are responsible for some piece of code in the kernel. And for that piece of code not to break, whether it's at runtime or compilation time, has to be secure, especially in this day and age, and both readable and understandable. That's, these are the criteria we're interested in. And ultimately, they are the ones who take responsibility for that code. If that breaks, well, they will be the ones having to go and fix it at a weekend. There's no rest for the wicked. And they're often the targets of hundreds of emails a day. That's quite daunting. But they have a few things in common. Their motivations are not that different. They all converge on a single point, which is the code itself. And they try to solve a very, very difficult problem or problems. It means there's individual responsibility. There's a deep personal investment. A maintainer, for example, changing job doesn't mean, oh, I'm dropping my maintenance. It quite often means, okay, I'm going to carry that maintenance on my own time, because it's important to me. And quite often, a contributor grows into a reviewer and a maintainer, a co-maintainer. There's a continuum here. You don't have necessarily really siloed roles. So what's the story? It's all about how we submit patches and get them merged. So you have this brilliant idea and you've posted a patch series. You get it reviewed. There's a lot of conversation going on. You respond to some comments and you do it again and again and again. It looks quite simple, but that's only on the surface. Because, by the way, what's this patch series thing we're talking about? Who do I send it to? How do I get it reviewed? Who's going to take that responsibility to review my code? And I'm receiving a lot of comments. I don't necessarily understand what they're about. So that's quite overwhelming. So let's try and dig into that a bit. So what's the patch series, really? It's an older set of patches. You know what a patch is, a small diff. You know, you do a change. That's small change. It's an older set of patches. So conceptually, your series is a single change. But we have limited brain power, I certainly do, very limited. So I can't grasp your 2000 line change. So you have to make it easy for me and break it into tiny individual changes in a nice order so that I can understand it. And make sure that nothing in the kernel breaks if you apply only half of that series. And what does it look like? So that's as formal as I could go. But that's really how it should be, how we want to see it as a reviewer or maintain a receiving on the receiving end of that, of your code, of your patch series. So each patch has a title. Easy enough. That's often the subject of your email. It also has a clear commit message. If you have the idea of sending a patch without a commit message, please don't. Think again. Put a commit message. Each patch is numbered x out of n. x is unique. You don't have the same patch twice. n is constant, and x is smaller than n or equal. There's a unique version number for your series. Don't post the same series twice with the same version number. It's utterly confusing. Think of who's receiving that series. And you have a cover letter, which is patch number zero. Usually, you don't send one if you only have one patch, but it doesn't hurt to have one. And you describe the goal of this series. You add a change log. Really, it's important if you version your series. When you reach version 32, really, I completely forgot what you did before. So you need to have that explain, okay, I've changed this this time around. I've replied to this comment. That's extremely useful for the maintainer. It contains a diff stat of your series, showing what files you've touched and in which proportion. And all the patches are in reply to the cover letter. So that's what it looks like. Now, why does it look like that? So these are requirements. Why do we want this? For example, an ordered set of patches. That's for the maintainer who receives your code, who reads it, to understand the progression in your design. You don't change things randomly. There's a thought process in your design. And also, it's needed for bisection. I need to be able to apply half of your series, bang in the middle, and test that, that can't break, that needs to compile, that needs to run. Logical changes. If you put two unrelated changes in the same patch, well, again, that makes it harder to understand. And it's all about trying to come to the smallest possible change that you can convey as an idea, as a concept. Patch numbering. Am I missing anything? Has my spam bot eaten any of your email? That's useful to know. I can go and fix my email and also I can make sure I review your series. It's in entirety. And also, it helps with some threading and ordering in email clients. Version numbering. Is it something new? Or something I've already reviewed? So all these are small details that helps the reviewer understanding the progression in your series. And the cover letter. We need to know what you've changed from one version to the other. So for example, make sure all the recipients receive at least the cover letter. And that's your chance to convey an idea and to have that conversation with the maintainers. Don't do a bad job at writing detailed cover letters. That's really something important. And that's the don't do it kind of thing. If you look at this series, that's actually, it's a copy paste of something that ended up in my email client. I've just anonymized it because there's no pointing fingers at people. When you see that, that's supposed to be part of single patch series. There's four version numbers. There's no sense of progression. You can't really extract any meaning from that. First, you miss, obviously, at least two-third of the series. There's no cover letter. But beyond that, what it really means is that the risk is that we're going to ignore that series, meaning we're going to miss out some potentially really important code. Good ideas, good fixes. I mean, we need people to engage with us and to contribute code, but that's a wasted effort. And that's a shame, really. We don't want wasted effort. And it's even more of a shame because we have the tools. We have the tools to help you do that, to send a proper patch series. And that's good. I'm pretty sure most of you are familiar with the tool. It's not that easy. I agree. But it's basically all you need. Yeah, questions here? Yes. You post a series. You respond to it. You agree to some changes. You make those changes. You repost the series with a new version number. You iterate over it. And you say, okay, that's the new version, plus one. So I'm getting to that. So Git is all you need for that. And please, do not ever send a patch series by hand. Do not copy paste a patch in your email client. Do not edit a patch by hand. Generate the patch with Git, send it with Git. That's how you do it. You configure Git as an email client. That's something like four lines in your Git config. You add those two variables, two cover, CC cover, which are going to take the two and CC lines in your cover letter and apply that to your whole series. And for each of your patch series, you first identify the recipients. Know who you're talking to. Maintainers, reviewers, other contributors, mailing lists. You format your patches with that line here. That doesn't work. That does work. So what does it mean? Git format patch, self-explanatory, minus some directory where you're going to stash your patches, a version number, that's v3, that's version 3. Generate a cover letter and the boundaries of your branch. And that's it. You edit your cover letter. You write this nice description we've talked about. And then you do a Git send email. First with the dry run. Check that it's all fine. Remove the dry run. Off you go. That's all there is to do. Same thing when you reply to email, when you interact on the list. Use plain text email, please. Try to avoid HTML. That breaks all kind of things, including filters. Reply in line. Don't top post. No silly disclaimer. Avoid attachments if you can. It's not always avoidable. CC people that actually matter. Don't CC the world. I mean, if you see 50 people on CC, on the receiving end of that email, yeah, there's so many people on CC, someone else will do the job. We'll reply to it. So keep your CC list short and to the point. And when you reply to someone, trim the email. Keep the important content. Don't have a one-liner at the end of a 300-line patch. That's just a waste of bandwidth. A quick one. Why email? Why not? I don't know. Facebook. There's all kind of good reason why email is important and email is our tool of choice so far. These are the requirements. I'm not going to go through that. But we need something that fits those requirements to move slowly, gradually to another system. Of course, email is sometimes badly implemented in some companies and we have issues with people can't send email easily to Desdes or kind of filtering. And we end up seeing a lot of corporate contributors using their personal email addresses. That's sad. But that's not a lot we can do about it. If somebody wants to tackle that idea for the next DLC, SMTP in a corporate environment, I'm happy to provide material. Review your bandwidth. As I said earlier, reviews get a lot of emails. Hundreds of emails a day. Address to them directly. So if you've posted a patch series two days ago and it looks all good, got some responses, you've made the changes required, it looks all good, you're ready to post it again. Take a deep breath. It's a bit early. Leave some time for other people to catch up. Posting once a week is good. Not more than that. That's the rate at which we release candidates. There's no reason why you need to be more proactive than that. And remember how long it took you to write those patches. It was painful. It was okay. Need to understand how this thing worked. For the reviewers, it's likely to be the same. And they need time to digest your patch series and understand what you're on about. So what's the process for a maintainer, for a reviewer? So is it something I have an interest in? Is it something I maintain? Yeah, okay, sure. Does the patch series overall make sense? Is it properly formatted? Do I understand the cover letter? Do I understand the point of that patch series? Is there any reported failure? Like the cabled robot trawling the mailing list, picking up patches? If you've broken ARM or x86 with that patch series, doesn't look too good. Is that a fix or a feature? Obviously, we're going to tend to prioritize fixes. That's more immediate attention to it. And each maintainer has their own pet subject. Basically, you have to know who you're talking to and just want to know one. But there's something that overall goes beyond that and that's trust. It's how we trust each other. It's how we recognize each other. So we tend to trust people who will go the extra mile, who will don't just look at their small corner of the Linux kernel source, but will try and expand and try to do something that benefits a larger part of the kernel than the users. So when we, as maintainer or as reviewers, ask contributors to do some extra work on a patch series, well, it's not really to know you. It has nothing to do with that. We don't take pleasure in asking you to do more work. But it's really to improve the overall quality of the kernel because that's what we really deeply care about. And eventually, it's about building trust between two parties. It's knowing that we will recognize your work and that you will do something that is good for the kernel as a whole. So really, it's not about, there's no difference. We're talking amongst equal. And the real thing is about making a better kernel. Because the overall thing is, that's how the whole Linux kernel works. It's about trust between Linux and some of the top level maintainers. It's about these top level maintainers and their sub maintainers, for lack of a better word. And co-maintenors of a single subsystem do trust each other. If you don't have that, you have nothing. And at the end, this trust is just as important as the code itself. So if you don't get contributions to a project, that project is effectively dead. It's frozen in time forever. And one of the maintenance roles is really to retain those contributors, those who will really help making it a better thing. And we always need new reviewers, new maintainers, co-maintenors, because those contributions are the ones that really benefits most of the users. And that requires everyone to really agree on how we work together. So when a maintainer or reviewer asks a contributor to do some additional work, either to provide a better infrastructure to refactor some code or to move bits of code that's been squealed away in some random driver into core code, it's worth stepping back a bit. Look at the bigger picture. See how it fits in the overall kernel. And if that request isn't clear, really ask for clarification, challenge it. There's no harm in doing that. And if you think that request is not justified, try and come up with an alternative proposal. Maintainers are not always right. Sometimes we need to have our eyes opened. And that's a good thing. In the end, what you want to become is the trusted maintainer of your own code. Embrace the concept. So, small digression here, what we call drive-by-patching. So, these are one of contributors. We'll see one patch, one patch series, and never to be seen again. It's the equivalent of what we call in the UK flight tipping. Here it is, be happy and disappearing. So we don't want to discourage that because we get a number of bug fixes from that. We also get series that just do not make much sense. So, we'd like to convince people to stick around and because in the end, we all started with one single patch. That single patch that got merged and that empowered us to do more. So, I'm not sure how we can improve that. It's difficult. So, we just need to, yeah, that's not just, it's hard. So, if you have suggestions, I'm all here. So, one of the best way to improve your knowledge is to become yourself a reviewer. And you don't really have to be an expert in one domain or another. You just have to be able to follow some code and be interested in it. I think the interest is key here. And if something seems unclear in whatever patch you're reviewing, ask questions. Don't be afraid to do so. If you don't understand it, it's likely that someone else doesn't either. If you spot a problem, yeah, please say so. And at the end of the day, if you're happy with the way the code looks, then if you can, post a review by patch saying, okay, I've reviewed that patch and to the best of my ability, I think this patch is correct and makes sense. And that's the important bit, to the best of my ability. It doesn't engage you any further. Nobody's going to come back at you saying, oh, you've approved of that patch and it was wrong. Well, as far as I can see it, that patch was correct. I didn't get the full picture, that's fine. And your input is always valuable. And that's how you build trust with someone. One thing that's important is, please be the first person to review your own patches. Read them. Quite often, you're going to be the first person who's going to see that decomposition of a much larger change. When you look at it, when you look at your code, you see, okay, I've changed this, this, this, this. And this is a big feature. But when you start looking at individual patches, individual patches, then you start spotting things that may not be quite right. And these are things reviewers and maintenance will pick upon. So do that exercise first with your own patches. You'll catch basic mistakes. And then put yourself in the maintenance shoes or the reviewers shoes. Does it make sense? Is it split in a coherent way, well documented? Have I addressed all the review comments? Go back to that email thread from a week ago. And okay, oh, may have missed that comment. Might as well address it now before sending it again. Collect the act buys and reviewed by tags. People have put a lot of effort in reviewing your code. Don't miss out on this opportunity. Add them to your patches. And if you've answered to that, yes, then it's great. Ship it. So that's where we are. Contributed to Linux can be hard. I mean, really hard. But it's also extremely rewarding. And we're all trying to work together in an ever-changing code base. I mean, it's like a ship moving all the time. And trying to understand each other's view is key. We can't talk past each other. We need to talk to each other and work together. It can be hard. I mean, building that trust is what we're after. And that makes things really, really, really easier. So we have tools. We have processes. They're not always friendly, but they're here for a good reason. And honestly, we're not trying to be difficult. That's what it's about. We're really trying to work together. And if you have any question, if you're in doubt about anything, just ask. I mean, nobody's going to shout at you. That's not what it's about. It's about working together. Thank you. If there's any question, we have microphones, gentlemen here. Hi. So this is about practices, best practices or something. So, for example, if you have a patch that was developed by somebody else, and then you want to submit it upstream, do you keep the sign of buy for everyone? You would keep the authorship of the original author. If you haven't changed the patch too much, I mean, you've rebased it or something quite similar. I mean, you haven't changed the logic of the patch. You would keep the authorship, you would keep the sign of buy, and you would add you on. And I'm guessing the author should be... The author should be the original author. Unless you've changed the shape of the patch in such a way that there's no real content of the original patch left. Okay. That's it. Cheers. Any other question? I didn't think I'd be that clear. Less a question, more a remark, as a maintainer myself. If you haven't got any reply to your patch, no, nothing from anyone on the mailing list, then feel free to ping us. Give at least a week, because sometimes people only review on certain days of the week, and you don't know which one it is. So at least allow for a full week. But don't wait for 10 weeks before you ask, because then it's pretty certain that it's just fell through the cracks, especially high volume mailing lists. It's all too easy to miss a patch or it's got buried under all the threads and all the messages. So between one and two weeks, if you haven't heard, send a gentle ping. People may be on vacation or sick or whatever. It also helps to bring that patch series on the top of the stack every so often. If it is a very big series, complex, then you might want to wait a bit longer because it takes time to review something like that, with a simple fix, just ping. Fixes definitely don't hesitate to ping after a week or so. If your kernel crashes, clearly we want to know about it and if we miss that, please let us know. Thank you. When you have a patch series and you've been resending it for, let's say, three release cycles, which happened to me and there is still no response. There even have been some acts from people not really involved with the subsystem, but the maintainers simply don't respond. What should you do? Should you escalate somehow? It's hard. It usually means that the maintainer fails to see the point of that series. That's difficult. I would say try to find someone who could mediate that one way or another. If the direct conversation doesn't work, try to... It doesn't work. There is no conversation happening because there is no response. It doesn't work. Then try to have a conversation with someone who could talk to that person more directly because apparently the direct email doesn't work in that case. Maybe it's also something where you can take the opportunity of a conference like this. Assuming the maintainer you want to talk to is around, it would be useful to have a face-to-face discussion. Sometimes it's also about, okay, am I doing the right thing? Is what I'm doing actually... One thing you could look at, for example, is how has a similar approach been taken before? If so, why did that fail? Maybe there's something around that. There's a sense of deja vu and the maintainer fails to be interested. Without knowing the specific, it's hard to give you an answer, but we can take that offline. In theory, a maintainer should at least acknowledge your existence, but maintainers sometimes you fall off the grid entirely, but if they're actually still in the mail list responding to other people, then you have an expectation they should get back to you. There are people who will help you. In the end, Andrew Morton takes that on as one of his ombudsman jobs. Often also in any given subsystem, there may be other senior developers who may take up your cause, but there should be somebody who can help you out. You shouldn't be left out in the cold like that. They should at least say no for some reason. Yeah, I'd like to add a comment to that as well. I think the key there is to involve the community, because we as a community want to get code merged, so you won't meet a wall in that case. Just try to involve as many people as you can from the community, and you will get some help. Anything else? Well, thank you very much.