Let's get started. Hope you guys are having a good first day of your summit. Cool. My name's Colleen. I work for SUSE, and this is going to be a beginning contributors talk about being an effective code contributor for OpenStack, and, I hope, for any open source project. So I've been involved in OpenStack for coming up on three years now. I used to be a core reviewer for the Puppet Modules team, and right now I'm a core reviewer for a sub-team within the infrastructure team and also for Keystone. And I wanted to give this talk because I sort of see the same problems over and over where someone comes in, you found something you want to fix, and you write up the change, and you make it through this whole ridiculous process of getting your account set up and the CLA signed and all that stuff, and you submit your change to Gerrit. And then this patch takes weeks or months to merge. And in some cases, for really major changes, that's expected. OpenStack's a giant machine, and big ideas take a long time for all these moving parts to come together. But I'm not really talking about those types of changes. I'm talking about simple bug fixes, documentation changes, basic features, things like that, taking just an unnecessarily long time to merge, or sometimes even becoming orphaned because the contributor got frustrated and impatient with the process. So I wanted to give this talk to give you, as new contributors, advice on what you can do to make this process a little less painful. Because I care about this project. I love OpenStack, and I want it to succeed. And I think a big part of that is about retaining new contributors. So I'm hoping this will help. So one of OpenStack's founding tenets is open code review. It's part of the open development part of those four opens. And we want excellent quality code. And the only realistic way to achieve that is to have multiple sets of human eyeballs actually looking at, thinking about, and processing this code.
So we allow and we encourage anyone and everyone to review code. It's not just a job for the cores. It's an everyone thing. And having that back-and-forth discussion on how to solve a problem is a critical part of code review. And that's something you should embrace and look forward to. But a side effect of this code review culture is we tend to come at it from a perspective of reject-first. We want to protect our projects from bad code. And so it sort of turns into a game of finding things wrong with this patch. And that's a really kind of negative outlook. So as an example, this is a patch from Keystone. It was supposed to make some readability enhancements to the docs. And it went through 12 patch sets before being merged. And in the grand scheme of things, 12 patch sets is not that many. There are a lot of changes that go through a whole lot more. But when you think about how small this change was, and that it wasn't even changing code and it wasn't making super fundamental crazy changes to the docs, 12 patch sets is a little much. And the author ended up either losing patience or had to refocus their priorities. And someone else finished up this patch. And the problem with changes like this one is that the scope and the problem aren't well defined. And the reviewers can't really agree on whether the change is solving a given problem. And so in this case, people kept finding small issues with it and asking for little fixes. And it just kept building up over and over. And so what I want you to come away with is the ability to sort of fast-forward through some of those initial patch sets, get all the reviewers on the same page, and then the real meaningful discussion can really happen. So when I was an intern at Puppet Labs, I had a mentor named Jeff McKeown. And he taught me three things that I took away from that internship. The first thing was how to use xargs. That was a really mysterious command to me at the time.
The second one was how to write proper bug reports. And the third thing was writing good commit messages. And this is one of the most important parts of this talk. This is something that I take a lot of pride in, being able to write good commit messages. So this is the big part of what I want to get through with this talk, because I can't really tell you how to write better code. At least, I can't really tell you that in 40 minutes. And that's what code review is great at. It is taking a chunk of code and having lots of eyes on it, figuring out how it can be better. But commit messages are really important for any software project, but especially for open source projects, in part because they help you get everyone reviewing the code on the same page. So here's what I learned from Jeff. The structure of a commit message is pretty basic. Some people might think it's sort of obvious, but I know from reviewing a lot of code that not everyone thinks this is obvious. So this is where we're going to start. The subject of the commit message is, maybe obviously, what changed. The body of the commit message starts out with describing the state of the world as it is today, without the change applied. So this is where you talk about the bug you noticed, the symptoms you saw, or you talk about the fact that the project is lacking this feature that you're trying to implement. And then you make the case for why this is a problem. And unless it's really trivial, it's not enough to just say that it's a bug and therefore it's a problem. You should actually make the case that this has affected you somehow or it could affect someone else. Or if it's a feature, you should make the case that your life is in misery because you don't have this feature implemented. And finally, you should explain how you fixed the problem. So describe how the change averted the bug; describe how this new parameter enabled this feature.
You don't need to necessarily restate line by line what is changing, as long as it's obvious from the code, but it's also not always obvious. And sometimes it deserves explanation, and that's where you would explain it. So my commit messages just almost always follow this structure. They have these exact phrases: "Without this patch...", "This is a problem because...", "This patch solves the problem by...". And it's not particularly creative writing, but it really gets the message across. So this is an example, a really old example from Keystone, of a very not good commit. It doesn't say what the bug was. It doesn't explain why the bug was a problem, and it definitely doesn't explain how the bug was fixed. And even if you looked at this change, if you looked at the diff, it's not obvious what the bug was and how it was fixed. So please don't write commits like this. This is a more recent example from Keystone of what I think is a great commit message. The subject says what changed. The body starts out by describing how things were without the change and why that's a problem. And the last part describes how the proposal fixes the problem. So I think this is a simple change, and I think this is a great structure to follow. So besides getting your code to merge faster, besides making it easier for reviewers to review, writing good commit messages is really important for software history, because a lot of the time we're looking to refactor or clean up old code, and we need to remember why something was added so we can determine if it's still important. And without good commit messages, important code could get removed accidentally, or on the other side, unimportant code might hang around for a while. And you might not just be providing context for other people in the project. You could be providing context for your future self. So this is an example of a change that was made a few years ago in the infrastructure config management repo.
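As a sketch of what that structure looks like in practice, here is a hypothetical commit message built from those exact phrases. The bug, the behavior, and the bug number are all invented for illustration; only the three-part structure is the point:

```
Fix token validation for expired trusts

Without this patch, validating a token scoped to an expired trust
raises an unhandled exception instead of returning 401 Unauthorized.
This is a problem because a single stale trust makes operators see
opaque 500 errors in the logs, and clients can't tell a bad token
from a server bug.

This patch solves the problem by catching the trust-expiration error
during validation and translating it into a 401 response, and it
adds a regression test covering the expired-trust path.

Closes-Bug: #1234567
```

Notice that the body stands on its own: a reviewer who has never seen the code can already tell what was broken, why it mattered, and what the fix does.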
And what I like about this commit message is it gives context for why the change was made: we wanted to add support for Fedora 18. And recently we were doing some spring cleaning of this repository and traced some code back to this commit, and we were able to delete a bunch of code because we know we're no longer supporting Fedora 18. And so if this commit hadn't provided context like this, that old code that we didn't need anymore might still be hanging around. On the other hand, in Keystone land, there was a succession of changes recently that happened because the original commit was not very well explained. So what happened here was we added this code and we set this parameter to default to true. And there is a comment, but it's not entirely clear what that means. And the commit message didn't explain it further. And then as part of another change later, we added this other comment, that this seems dangerous and we should see how we can change it, but this didn't add a whole lot more information. It didn't really help anything very much. And so finally, someone noticed that comment, that "this seems inherently dangerous" comment. And because they were lacking context for why it was set that way in the first place, they changed the default and ended up breaking things, breaking some RBAC work that was building on this assumption. And so this change, this most recent change, is in the process of being reverted, because luckily there was someone who does have context who caught the change and is working on it. But my point is that having good commit messages isn't just about getting your code merged quickly. It's that if you have any kind of long-term involvement with a project, you're going to appreciate good commit messages because of problems like this. So besides the general structure of a commit message, I have a couple more thoughts.
When we write code internally, even when we practice code review (and not everyone does), we get into bad habits of not explaining ourselves properly. We're working in a small team of people who all know what's in the sprint. Everyone knows what's been prioritized. We're all sort of headed in the same general direction. We know all of our teammates. We trust that they know the product at least as well as we do, and so we let them get away with poor explanations in commit messages. And it's bad practice internally, but it translates really badly to a global, multi-company project like OpenStack, because someone on the internet who has never met you doesn't trust that you know what you're doing, and they probably have completely different ideas of what's important. So in my mind, a commit message is not just about talking about the change and how it solves a problem. A commit message should be a fierce argument for why the previous state of the world was suboptimal and why your solution is the best solution. People on the internet don't know you, they don't know your credentials, they don't take for granted that you probably know what you're doing, and you have to convince them that you've researched this problem, you've tested it to the ends of the earth, and that you've considered and rejected all of the other options. And so you should explain your personal use case for the feature, or the environment where you noticed the bug, and even if your code isn't perfect, framing the message in that way helps reviewers understand where you're coming from, and from that perspective, they can start offering suggestions on how to improve the code. You should preemptively predict what questions reviewers might ask you. Why did you do it this way? Why did you make this change that seems unrelated? And explain that in your commit message too.
If you start out your proposal this way, with this kind of certainty and compelling evidence, your reviewers are gonna be carried away by your argument, and they're gonna be less likely to debate you on it. And so long commit messages, in my opinion, are fine, they're great, because explaining yourself upfront is key to level setting, and I've had pretty good luck with really verbose commit messages, because when everyone's on the same page, everything moves faster. So without getting specific, there's an example from keystoneauth, where there's a change up for review that would sort of drastically change some of the default behavior around how keystoneauth uses the requests library, and it's been sitting there for kind of a while because no one's really sure what to do with it. On the one hand, we can't find a concrete reason to reject it; the change feels kind of weird, but we're not entirely sure what to think about it. But on the other hand, the author hasn't made a compelling case for us to accept it either. So at this point, the change is just kind of hanging there, and it's probably gonna hang there for a while until the author really makes this compelling argument that we should accept it. And as a counterexample, I'll show you one of my commits that I'm sort of proud of. I know you can't see it at all. This commit message is a message that I wrote for a two-line change in the pyOpenSSL library. And I know it's ridiculously verbose, and I know you can't see it, but the gist is that it explains in detail what the bug was and how to reproduce it. And I also explained how it was negatively affecting users: this was causing a segfault in Python. And I also provided context for how I found it, which was that this was something that was happening with Glance, so reviewers understand that I'm not trying to do something crazy with the library; this is a standard use case. Then I explained why the bug was happening and how the change solves the problem.
And I also answered some questions that could have possibly been asked about the change. Does it affect other types of keys? Does it affect other sizes of keys? And I admit that the part of the summary where I was describing how to reproduce the bug, that could have gone in a bug report, and that would have condensed the commit message a little bit. But still, the summary of the bug and why it's important does belong in the commit message. And when I wrote this, I had never contributed to this library. I didn't know any of the maintainers, and I know these people are way, way smarter than me. So I needed to show that in this particular problem, I understood the problem and that I was convinced that this was the right solution to fix it. And so if you're structuring your commit message in this way, it's gonna help you as well. It's gonna get you thinking deeply about the problem. Is it really a problem? Is it gonna be a problem for someone else? Is it objectively a problem, or do you just sort of have a personal distaste for it? And if you're thinking about it in these terms, then your code is probably gonna be of higher quality, and you're gonna be better prepared for the ensuing discussion. And so I'm gonna finish up my rant on commit messages by leaving you with the link to the wiki page. That'll give you details on conventions for commit messages, specifically in OpenStack. And next I'm gonna jump into some other tips on how to make good code submissions. So, stepping away from commit messages, the next thing I wanna talk about is writing small, contained patches. You can write multiple related patches, but it's still important to break them up into logical units. And a major problem with how humans do code review is that huge changes either never get reviewed, because no one has time to spend hours reviewing and checking a massive change, or they get kind of a lazy plus one. Like, yep, that does look like some code. I think that's code.
Because your eyes sort of glaze over and you stop being able to discern what the change is about. So look out for signs that your change is too big. If your commit message has bullet points or uses lots of phrases that start with "also", that could be a candidate to split up into separate commits. And if you look at my changes, I'm sort of guilty of those signs. Here's a change that I made that has got five different logical things in the commit. And my excuse for this is that all of these things needed to happen in order to make the subject true. Beaker is a functional testing suite for Puppet, and in order to fix the functional tests, all of these things had to happen. So it's not entirely a hard and fast rule, but something like this could be a red flag for something you should be splitting up into multiple commits. And then next, there's kind of a checklist, when reviewers are looking at code, of things that you should be adding with your change. So you almost always need to write tests for your code. It's gonna be one of the first things reviewers ask for, and it's part of proving your code is correct. So your test should be exercising the problem. It needs to demonstrate that there was a problem, and it needs to demonstrate that your code, when applied, makes that test pass and fixes the problem. And something that kind of goes along with this is: please run your own code before submitting it. I've reviewed code many times where I go to test it out thinking I'll try to find some corner cases, I'll try to break your change, and I realize that it doesn't even deploy in the first place. And this happens sometimes with Puppet modules because they're kind of under-tested. But it's, I think, embarrassing to submit code with commit messages that convince me that you know what you're doing, only to have your code fall flat on its face. So please test your code, both with unit tests and by actually running it yourself. The next thing in that sort of checklist is documentation.
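To make the point about tests concrete, here is a minimal sketch of a regression test in the style described above. The function, the bug, and all the names are invented for illustration; they don't come from any OpenStack project:

```python
def normalize_name(name):
    """Normalize a user-supplied name for comparison."""
    # The fix: previously this crashed with an AttributeError when
    # name was None; now it returns an empty string instead.
    if name is None:
        return ''
    return name.strip().lower()


def test_normalize_name_handles_none():
    # Exercises the exact input that triggered the bug, so this test
    # fails without the fix and passes with it.
    assert normalize_name(None) == ''


def test_normalize_name_still_normalizes():
    # Guards the pre-existing behavior against regressions.
    assert normalize_name('  Admin ') == 'admin'
```

Run under pytest, the first test demonstrates the problem (revert the `if name is None` branch and it fails), and the second shows the fix doesn't change existing behavior.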
If you're adding anything, it needs to be documented. In the obvious cases, if you're adding a feature, you need to update the project documentation to explain to users how to use it. And that's also gonna help explain to reviewers how to test out your change. But it also applies to other things. If you're adding a public function to a class, that needs a docstring to describe to developers how to use it. Or if you're doing anything remotely clever or non-obvious in your code, that needs a code comment as well, and that's also part of documentation. And the last thing that reviewers are gonna ask for in that kind of checklist is release notes, only in some projects and only for some changes. But if you're adding a feature or fixing a bug that was more than just a trivial fix, anything that's really user-facing, it's gonna need a release note. The tool for OpenStack is called Reno, and you can look up how to use it. And that's something that you can do in your change before the reviewer asks for it. And so even if you've done all of this perfectly, you have a perfect commit message and you've added tests and you've added docs, it's still almost definitely not gonna get merged in the first patch set, and that's okay. But sometimes something happens where a patch sits for a while without getting any feedback at all. And so if you're not getting timely feedback on your change, my advice is to go talk to the project team, either on the mailing list or on IRC. Sometimes that means being willing to get on IRC and talk to people at a time that's inconvenient for your time zone. That's just something that has to happen. But sometimes a patch just gets buried behind other patches, just falls through the cracks, or it could be that your change is one of those complicated changes that people are sort of afraid to look at and review.
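As a sketch of what that documentation checklist looks like in code, here is what a public helper with a docstring, plus a comment on the one non-obvious part, might look like. The function itself is invented for illustration:

```python
def chunked(items, size):
    """Yield successive lists of at most ``size`` elements from ``items``.

    :param items: any iterable to split into batches
    :param size: maximum length of each batch; must be a positive integer
    :returns: a generator of lists
    :raises ValueError: if ``size`` is less than 1
    """
    if size < 1:
        raise ValueError('size must be a positive integer')
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    # The non-obvious detail worth a comment: emit the final, possibly
    # short, batch rather than silently dropping it.
    if batch:
        yield batch
```

The docstring tells developers how to call it without reading the body, and the inline comment explains the one decision a reviewer might otherwise question.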
So reaching out is a good way to get attention on your patch, to figure out why it fell through the cracks, and to show that you're engaged and willing to help solve the problem. And that kind of leads me to my last point about code contribution, which is: please follow through on your change; see it through to the end. You should be watchful of and responsive to CI test results. If your tests are failing, you should go figure out why, and be responsive to reviews. It's your responsibility to follow through on your own patch until some sort of decision is reached. Be humble and accepting of criticism. Sometimes a suggestion, just because it's written down, or sometimes because of cultural or language differences, can come off as harsh. And I'd say don't take those things personally; just be accepting, or respectfully defend what you're doing. Continuing to follow through and be responsive shows that you care about this change and that it's important to you, and that makes it important to other people too. So my last advice is to keep in mind that open development model. Anyone can review code, and that includes you. And the more you review other people's code, the better you're gonna understand what reviewers are looking for when you submit code. But I have a couple of requests. Don't plus-one code you don't fully understand. A plus one means: I understand and agree with the direction of this patch, and I think it should be merged. It's really not helpful to leave a plus one when you don't understand what's going on. And don't leave a minus one for nitpicks like spelling errors or questions. It's perfectly okay to leave comments and questions without giving any kind of vote. Even if Stackalytics doesn't count a plus zero, it's still useful to have that discussion with the author and to have those questions visible for other reviewers. So it's great to leave comments without voting.
And all of this translates mostly, I think, to other open source projects. So feel free to keep it in mind if you're working on things besides OpenStack. And thanks for listening today. I hope you can all go help make OpenStack better, and I hope this makes it a little easier. So thank you. And I think I have time for questions if anyone has any. And there are mics if you want. If you want to just shout, I'll repeat the question. Okay, so the question was: is it okay to add reviewers on a change, and also, how do you find the right person to review changes? It sort of depends on the project. A lot of the time, people add all of the cores on the project as reviewers, and that just ends up sort of flooding my inbox, because if I have 100 patches where someone has just added all the cores, that's not useful, and I can't discern the things that I actually do need to review, the ones I've been added to specifically. So I don't have a great answer for that. It kind of depends on the project. But if you are able to figure out the right person to ask for a review, you can add them as a reviewer and also ping them on IRC to actually bring it to their attention, because sometimes just adding them as a reviewer to the change is not really effective. Yeah, I mean, and git blame is a great way to figure out if someone specifically should look at the change. Also, spec authors. If you are making a change to some feature that came from a spec that's been implemented, you can talk to the spec author or the people who implemented the spec. Those are good people to add as reviewers too. You have a question? Yeah, I'd like to know about the wiki page about the commit messages. Is it guidelines for how to write commit messages, the way you explained? You gave a link to that wiki page. Oh yeah. Yeah. Yes. Yeah, that's, sorry, ask your question again. So I kind of missed the context while you were explaining that.
So I just wanted an elaboration on how this relates to the commit messages that you were explaining. Oh, sure, sorry. So this is all the guidelines for writing commit messages in OpenStack, and it's some of what I talked about today. It's not everything I talked about today, but it talks about some of the conventions, like line length and structuring it so your subject is in the imperative voice. Conventions like that. It doesn't quite go into the detail of what I'm asking for, which is making that strong argument. It doesn't really talk about that stuff. And another, maybe this is orthogonal, but I'm trying to get my code upstream and I was looking through some bugs to get introduced to how to contribute to OpenStack. And I came across some bugs which required bash shell scripting, not Python code. So when I write a patch which changes that, how can I write a test for that? Do I need to write one? It depends on the project. I think, are you talking about diskimage-builder or TripleO or those? Ironic. Ironic, okay. I don't know specifically how their testing works. There are likely examples of tests already in the repository, and so if they're testing their bash changes, there are probably examples already there. Because specifically with Ironic, because it deals with real hardware, sometimes writing a test is difficult. Yeah. Hence my question was, if you could, what are some examples? My best advice is to follow the examples that are already there, and also to ask in the OpenStack Ironic channel. Thank you. Sure. Anything else? You should post it to Gerrit and you should mark it with a Workflow minus one, which implies work in progress. But you should also ask on the mailing list or in the IRC channels for people to review it, because some people don't review things that have that label. But you should say: this is a proposal, I don't know how to fix it, but this is something that I could use some collaboration on. All right, thanks for coming today.