Hello, hi. My name is Asaf. I work for, well, you can guess; we have pretty subtle templates this year. I manage a group of engineers working on OpenStack networking: Neutron, Octavia, SFC, and OVN.

This all began with an argument, and I wanted to win the argument, so I spent months of my time working on things that are not consequential at all. But you're a captive audience; you're here, so you have to listen. The argument was with a colleague at Red Hat. He said, "I have engineers sending patches upstream, and nobody's giving them the time of day. The patches aren't getting merged. What are we to do about that?" That resonated with me, because we're all familiar with it; anybody working on OpenStack knows what that feels like.

It brought me back to when I was starting to work on OpenStack. I was the first developer on my team in the Israeli office working on Neutron, and my mentor was in San Francisco, so I never spoke with him. What happened was that my manager told me, "There's this bug, and you've got to fix it." So I said, okay. I was pretty new at Red Hat and pretty junior in the industry, so my entire world became fixing this one bug. I was obsessed with it; it was the most important thing in the world. I reproduced it, did root cause analysis, worked on it for days and days, and then did design work: should I fix it this way, or should I fix it that way? I found the best way to fix the bug and wrote this pristine, beautiful patch, and it was just marvelous. I had this vision of me sending the patch to the upstream community. I was kind of naive, new to open source, and I had this vision of a kind of ceremony. People use the verb "donate," so I donated the patch, and I expected all of these people to say, "Thank you, Asaf.
Wow, you fixed this important bug, and the work you did is so amazing." And then nobody gave a shit. I mean, nobody. I just sent the patch and... nothing. It was just sitting there. And this was for the Firewall-as-a-Service CLI, and even within the FWaaS CLI this was a niche thing. This bug was not important at all; it's the bug you give to a new developer just so they learn how it all works. It didn't matter at all. So that sucked, a lot, and I remember what it's like.

So I was arguing with this colleague, thinking, yeah, I know what that feels like, and I really wanted to win the argument, because I'm like that. And how do you win an argument? You present data, right? Data is really hard to argue against, especially if the data is even true. So I said, all right, I can do that. I have lots of spare time; just don't let my manager hear that, but I have all the time in the world to work on this sort of stuff. So I put together some Python code; the link is on GitHub and it's in the presentation, if you want to waste your time looking at ugly code. It pulls some Gerrit patches and some Stackalytics data and then graphs a bunch of it.

So here's an example of what you're looking at. This is the tripleo-heat-templates repo, and these are all of the patches that were ever merged to that repo; I can't see from here, but that's three-thousand-something patches. The x-axis is time, and the y-axis is how long it took to merge the last 60 patches: the mean time to merge, in days. Just as an example, and yeah, it's barely working, but if you look at, I don't know, February of 2015, the mean time to merge over the course of the last 60 patches was 15 days.
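That rolling metric can be sketched in a few lines of Python. This is my own sketch on synthetic data, not the code from the talk's repo, which pulls real patch timestamps from Gerrit:

```python
from datetime import datetime, timedelta

def rolling_mean_merge_days(merges, window=60):
    """For each merged patch, return (merge_date, mean_days): the mean
    time-to-merge over that patch and the preceding window-1 merges.
    `merges` is a list of (created, merged) datetimes ordered by merge date.
    """
    points, durations = [], []
    for created, merged in merges:
        durations.append((merged - created).total_seconds() / 86400.0)
        recent = durations[-window:]  # only the last `window` merges count
        points.append((merged, sum(recent) / len(recent)))
    return points

# Synthetic example: three patches merged after 2, 4, and 6 days.
base = datetime(2015, 2, 1)
merges = [
    (base, base + timedelta(days=2)),
    (base + timedelta(days=1), base + timedelta(days=5)),
    (base + timedelta(days=2), base + timedelta(days=8)),
]
points = rolling_mean_merge_days(merges)
print(points[-1][1])  # mean of 2, 4, 6 -> 4.0
```

Plotting the resulting points with the merge date on the x-axis gives exactly the kind of graph described here.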
And what's interesting here, if you look at this graph, is that it looks like a seismograph readout; it's going up and down like crazy. The y-axis is significant, because it doesn't go between 14 and 16; it goes between 5 and 30. The difference between 5 and 30, apart from being 25, is a lot. As a developer, if I get my stuff merged in five days, I'm happy; I can work. If it's 30 days, that changes everything, because then I have to work on multiple things at the same time while I'm waiting for my stuff to get merged. It's a different world to live in.

So I was asking the current PTL of TripleO, which is the gentleman sitting right here, Emilien, who is actually not human, he's actually a saint. He was giving me the time of day, and we were talking about what could explain this. For example, TripleO has historically had issues with CI stability, and that means there could be periods of a couple of weeks where you're not getting anything merged, and obviously that bumps the mean time.

So I was thinking, okay, this is really interesting stuff. What else can we learn? What are the different ways of looking at this data? What else can we do? And the graph is going to get even more horrible, trust me. I wanted to look at March of 2016, because it looks like the time to merge started going down around then. That was interesting, so I looked at the volume of incoming patches. The red graph is how many patches were created each day.
The y-axis here goes between one and six, and again that's significant: the variance is between one and six, meaning you're getting either one patch a day or six patches a day. Over the course of a week, that means the core reviewers of that project have to review either seven patches or 42, and that's a different order of magnitude; it means your behavior has to change.

So what's happening is that the incoming volume of patches is going up significantly, and the review time is going down significantly. What could possibly explain that? Again, I pestered Emilien, and we tried to explain it, and it comes down to governance changes. What kind of changes can you make in your community to explain this sort of thing?

I thought, okay, it would be neat if we could graph the number of cores over time. Maybe the number of cores went up, so they were able to review more patches faster. So how do you count cores? Well, Stackalytics is a really useful piece of software for people like me who like to spend their time looking at graphs, and it does historical data, so you can go back to a previous release and look at the patches and reviews for that release. Unfortunately, it shows you the current cores: if you go back to Icehouse, it shows you the cores from today. So how do you count, how do you visualize, the number of cores over time?

What I ended up doing is I got all of the patches that were ever merged, and for each patch I looked at the last patch set. I go through each person's +2s, -2s, and +As, the patch approvals, and I basically just remember that. Then, in the graph, each line is a person, and their tenure as a core starts at some point in time and ends at another.
Each line runs from the first patch that person ever approved or gave a +2 or -2, to the last patch they ever did that for. So you get these time periods, and then for each day you draw a vertical line; you basically cross the lines, and you can count the number of cores that were active in that period of time.

Then you get this obnoxious graph. It's close to art at this point, I think; it's very colorful. The blue and the red are exactly like before, and the new piece of information is the number of cores over time, which is in green, with its y-axis over at the far right, between 4 and 18. Obviously, again, that's a very big variance: four cores is a small project, 18 cores is a monster.

Well, anybody can look at this and say: wow, the number of cores shot up dramatically from March 2016, from around 11 cores to 18 cores. And they did something similar to the Neutron community, where they basically de-emphasized the idea of super-cores who are expected to know everything there is to know, in favor of cores who specialize in an area or a component, like os-net-config or other specific components of TripleO. The core is expected to stay within their area of expertise and merge, or +2, stuff they actually know something about. And it seemed to work, really significantly.

I thought this was very interesting, and I wondered: do these governance changes always work? I spent a lot of my time when I was working on Neutron thinking about the community: is the time to review, the time to merge, going up or down? How many cores do we have? How do we structure our community? Neutron especially went through pretty drastic changes.
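That counting procedure — first and last approval vote per reviewer, then a day-by-day sweep — can be sketched like this. The data here is invented; the real script extracts these votes from Gerrit:

```python
from datetime import date, timedelta

def active_cores_over_time(votes):
    """`votes`: iterable of (reviewer, day) pairs, one per +2/-2/+A vote
    on the final patch set of a merged patch. A reviewer counts as an
    active core from their first such vote to their last. Returns a list
    of (day, number_of_active_cores) points, one per calendar day."""
    spans = {}
    for reviewer, day in votes:
        first, last = spans.get(reviewer, (day, day))
        spans[reviewer] = (min(first, day), max(last, day))
    if not spans:
        return []
    start = min(s for s, _ in spans.values())
    end = max(e for _, e in spans.values())
    points, day = [], start
    while day <= end:
        # "Cross the lines": count spans covering this day.
        points.append((day, sum(s <= day <= e for s, e in spans.values())))
        day += timedelta(days=1)
    return points

votes = [
    ("alice", date(2016, 3, 1)),  # alice active Mar 1 .. Mar 5
    ("alice", date(2016, 3, 5)),
    ("bob",   date(2016, 3, 3)),  # bob active only on Mar 3
]
counts = dict(active_cores_over_time(votes))
print(counts[date(2016, 3, 3)])  # -> 2
```

Plotting the counts in green on a secondary y-axis reproduces the overlay described next.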
So, do governance changes always work? I looked at Nova. In April of 2014, Nova introduced, and I think it was most likely the first project to introduce, the specs process, and a bunch of specs started merging around that time. What did that do to review time? Intuitively, it means there's another barrier to entry before you can start pushing your patches and having them merge: you first have to go through this other process. So intuitively, this would increase the review time, right? It means it takes more time for your stuff to merge.

So did it increase or decrease? Seriously, I'm asking: did this increase or decrease the time to review? Let's take a look at this beauty of a graph. This is, again, the time to review and the number of patches. Interestingly, the number of patches is roughly going down; it went down to that valley, and roughly speaking it keeps going down. And the time to review, what happened there? Looking at it, it kind of stayed the same. Visually, you've got this line over here in the middle, and roughly speaking it stayed the same. How can we explain that? Again, counterintuitively, the number of cores in Nova is going down, yet the review time is roughly staying the same. How can we explain that? I'm not exactly sure, but I found it really interesting.

A similar thing happened in Neutron, similar to what happened in TripleO. Roughly from December of 2014 to mid-2015, we did two very dramatic things in the community: we split out load balancing, VPN, and Firewall-as-a-Service, as well as all of the Neutron drivers.
We basically said, okay, instead of the Neutron repository being everything networking in OpenStack, it's now just going to be the Neutron platform: the API, quotas, the policy stuff, and the reference implementation, so OVS, Linux Bridge, and SR-IOV, and nothing else. So we split out hundreds of thousands of lines of code, which, looking back, was a win-win: the vendors could work at their own pace, they weren't bottlenecked by Neutron cores, and Neutron cores didn't have to pretend to know what they're talking about when reviewing stuff that requires hardware they don't even have.

Would that increase or decrease review time? Here we're looking at just the Neutron repository; its scope went down drastically. Would this increase or decrease review time for the Neutron repository, when everything else around it has been split off? Intuitively, I thought this would decrease merge time significantly: you've got roughly the same number of people working on less stuff, with fewer incoming patches, so they should be able to merge Neutron patches faster.

Well, we can look at the number of patches first. The number of patches is kind of static, or even going down, which I guess makes sense, seeing as VMware and Juniper and Cisco and all of the people doing great work are doing it in separate repositories now. So the number of patches is going down, and the review time kind of didn't change; it's staying the same. To me this was really counterintuitive. It's also not what I felt in my rotten heart when I was experiencing it myself.
I thought it was much quicker to get my stuff merged, but it wasn't. What's even more interesting is that at exactly the same time, we were iterating very quickly on governance changes, and we introduced the lieutenants system, which is very similar to what TripleO was doing: we introduced the idea of mini-cores who are supposed to specialize. The number of cores shot up dramatically; it nearly doubled, from 13 to 22 (obviously it's been going down pretty dramatically ever since). But all of these new cores, who were responsible for fewer things, were not actually reviewing faster. This graph is weird to me, but there it is.

So, going back to the argument, and the reason I started doing all of this, which was to beat that colleague of mine into submission and show that I am a better person for having won the argument: I was looking at a number of different metrics. Merge time as a function of lines of code, for instance. Intuitively, and you don't really need graphs for this, bigger patches are slower to merge. That's true, with a small caveat. Looking at the Keystone repository: the x-axis is lines of code, so bigger patches are on the right, and each dot here is not a person but a patch. These are individual patches from just the last year, in case there were any significant changes over time, and the y-axis is how many days it took to merge that patch. Obviously, there is a significant cluster, a huge cluster of small patches that were very quick to merge. But what was interesting to me is that I expected a stronger correlation; I expected the graph to slope upward, so that bigger patches take longer to merge.
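One way to put a number on how weak that relationship is, and this is my own sketch rather than the talk's actual code, is the Pearson correlation between patch size and days-to-merge:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Perfectly linear data gives r = 1.0 (up to float rounding) ...
r_linear = pearson_r([1, 2, 3], [2, 4, 6])

# ... while made-up scatter like the plot described here hovers near 0.
lines_changed = [10, 40, 12, 300, 25, 800, 15]
days_to_merge = [1, 30, 90, 3, 2, 45, 120]
r_scatter = pearson_r(lines_changed, days_to_merge)
```

An r close to 0 on real Gerrit data would back up the "almost no correlation" observation below; an r close to 1 would mean bigger patches reliably merge slower.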
If there were a strong correlation, you would see a line of dots roughly going up, and it's not like that. This almost looks like it was generated by a random number generator; I would question the legitimacy of this graph, but I made it. Beyond this cluster, it looks like there is absolutely no correlation. And yes, I pruned it; this is the bottom 90%, the vast majority of patches. Just to verify, this is the same in every major project: I looked at a wide array of projects, and it looks pretty much the same. There are differences; for example, Cinder generally has larger patches than other projects, I don't know why. But there are definite differences.

What I did learn, and this was the most interesting thing for me, and again this is intuitive, but only if you've been in the community for a long time and know how it works, is the difference between new contributors, people who just pushed their first or second patch, and people who have been there for years, and everybody knows them and they know everybody, and they know who to ping when they need their stuff merged, and they know how it all works, how to write the commit message, all of the written and unwritten rules. Long-term contribution versus fly-by. If you graph the time to merge by resolved bugs, it's basically saying: the more bugs you solve, the less time it takes, on average, to merge your own stuff.
Here, each dot is not a patch; each dot represents a human being. A funny way to represent people, but it's basically saying that this person on the right resolved, you know, 45 bugs, and it took 22 days on average to merge their patches. We do the same thing by bugs resolved, bugs filed, emails, number of patches, and of course the big one, number of reviews: the more I review, the quicker it is to merge my own patches. We always knew that to be true, but it's just nice seeing it graphically.

What's interesting is that if you look at different communities, well, Neutron, for example, is characterized by specific practices in its community, but it doesn't really matter which project you look at; it works. If you look at projects like Kolla and SFC and Heat, these are projects of different sizes, small projects and huge projects, single-vendor and multi-vendor, and it always works. You can always see this pattern where there's a huge cluster of people who have not yet generated a lot of work, not yet a lot of patches or reviews or bugs or whatever, and those are the people in the red box. That's the dangerous box. You don't want to be in that box, because within it there's a nearly random distribution of review time. That means the first patch that you send, or the fifth patch that you send, can either take two days to merge, and then you're off doing the next thing, or 140 days, which means you quit. You stop working on OpenStack, because why would I want to send patches when it takes half a year to merge my stuff?

Well, the good news is that it gets better over time. Look at the people in the green box: these are people who get their stuff merged in 20 days or less, on average.
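Building those per-person dots is essentially a group-by over merged patches. A minimal sketch, with invented names and numbers:

```python
def merge_time_by_activity(patches, reviews_done):
    """`patches`: (author, days_to_merge) pairs, one per merged patch.
    `reviews_done`: author -> total reviews that person has performed.
    Returns one (reviews_done, mean_days_to_merge) point per author,
    i.e. one dot per human being."""
    sums = {}
    for author, days in patches:
        total, count = sums.get(author, (0.0, 0))
        sums[author] = (total + days, count + 1)
    return {a: (reviews_done.get(a, 0), total / count)
            for a, (total, count) in sums.items()}

patches = [("veteran", 2), ("veteran", 4), ("newcomer", 60)]
reviews_done = {"veteran": 500, "newcomer": 3}
dots = merge_time_by_activity(patches, reviews_done)
print(dots["veteran"])   # -> (500, 3.0)
print(dots["newcomer"])  # -> (3, 60.0)
```

Swapping `reviews_done` for bugs resolved, bugs filed, or emails sent gives each of the other plots mentioned above.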
People in that green box can actually get some work done. I looked at all of the people in that box by name, for different communities, and again this works pretty much globally: the time to merge for a person drops dramatically in the first year and then kind of stabilizes. It's intuitive to people who are in that green box. I know it to be true because I experienced it myself, and I saw all of my peers go through the same process. It absolutely sucks in the beginning; it's a horrible experience. You send your patch, it's the most important bug in the world, and nobody gives you the time of day; your stuff doesn't get any attention. But after some time, you learn the practices, and more importantly, you get to know the people, because you go to the summits if possible, and they get to know you, and they merge your stuff very, very quickly.

So the message is: it sucks, but it gets better over time, and there are ways to accelerate it. A couple of notes. First, there's the link to the code, if you happen to have spare time. Second, about accelerating that process: in the speaker notes (I will upload the slides to my website and, I'm assuming, also the summit website) there's a set of merge best practices written by Neutron community members like Kevin and Rosella, where they basically write down what worked for them. It's silly, but things like how to write the commit message matter a whole lot, because otherwise you can spend days ping-ponging on it, and showing up on IRC, and all of these best practices make a huge difference.

That's what I've got to say. Any questions?
Yeah, thank you. One of the best practices we often hear about in the Nova community is that new contributors should try to do their own reviewing, with a particular concentration on the patches that are submitted by cores. The best way to make yourself known to the cores is to review their patches, hopefully find issues, identify nits, and prove your technical chops that way. I'm wondering if that's something we could apply some statistical analysis to.

So first off, that's what I tell people when I mentor them: piss off cores. Basically. No, just get any emotional reaction from a core, and they'll remember you, for good or for bad. Yeah, it should be possible to graph that, absolutely. I can't say that I will do it, but it's obviously possible. Yeah.

Hey. What kind of things do you think the OpenStack community can do to onboard new developers more quickly? This is an ongoing topic we have at the TC, and we have been discussing some ways, you know, making some YouTube videos on how to do code review, how to learn more about being involved. What kind of things do you think we should focus on in the next months, maybe?

When I started, I remember there was not a lot of stuff. There were not a lot of blog posts three or four years ago, or videos; I didn't have any content. Now,
I think there's actually an abundance, too much content, so when I onboard people, I prepare a list of things to read in a specific order, just because there's too much stuff and people get lost. I would suspect there are, like, five different onboarding documents at this point, in different locations, so if we could have one, that'd be a good start.

And specifically about learning how to review: it took me years to learn how to review properly. When I reviewed at the start, it was a contest to see how many comments I could leave. You have a patch, and I'm trying to find all of the different problems with it, which is not helpful at all. That's not the mentality with which I think people should review. It took me years, but I slowly transitioned to a style where I imagine that the patch author is sitting next to me and we are just going through the patch together, trying to merge the best possible version of the patch. I think that mentality shift helps, because a lot of people try to find as many nits as possible to prove that they are the best reviewer in the world. So, I don't know, do we have a global onboarding document for OpenStack developers, across all of the different projects?

It's like a Git how-to.

Yeah. There are things you learn by experience, and you don't necessarily have to like it. The review best practices, I think, are written down, and I'm sure they can be improved with respect to the soft skills involved, because it's not a technical process. It's just two people talking, and that's much harder than, you know, fixing a bug.
I hope that's anywhere near an answer.

Yeah, do you think maybe the Foundation could have some videos, maybe on YouTube, where we ask the community leaders, PTLs, core reviewers, and reviewers, to share their best practices for how they review in individual projects, and share it through videos or something more formal? Do you think that would help? I'm asking you because we are both at the TC.

Yeah, I think that was one of the things we identified when we had our leadership summit between the Board, UC, and TC a couple of months ago. We were trying to figure out how to grow leadership within the community, which piggybacks on being able to bring people into the community in the first place and get them to the point that Asaf is talking about: that year in, where they're feeling comfortable with their ability to get changes merged, to review code well, and similar sorts of things that also allow them to be good leaders within the community. We did identify that we want the Foundation to, among other things, try to curate some of the video sessions, such as this one, but others too, that outline the sorts of behaviors that can make someone more effective as a contributor within the community and as a leader within the community. So we're doing some very targeted ones; Emilien is going to do a brief lightning session on code review tomorrow afternoon, I think.
I think But you know also just more generally sessions like these come up spontaneously At each conference that that we put together so trying to collect those and and sort of curate the the topics that they cover is Something we're gonna try to focus on Cool, there's also there's already you know material written that I mentioned earlier that I thought that was really spot-on by Kevin and Rosella and there's some links somewhere One is I guess by way of a concrete suggestion When we displayed that welcome new contributor message on people's first review, maybe we can give them a Really short and easy to read You know handful of links like you know, how and why do we review on these projects and You know, what are people actually going to be looking for in those contributions and I guess the other thing is I think we need to Kind of be more aware of the fact that a Lot of contributors that we are gonna have are not They're not people who are habituated to being in an open source environment they so they really have to start from square one in terms of You know, how does this work? What is this community thing like if I'm if I'm sitting at dollar big corp and I'm working on some downstream open stack distribution say and You know my manager is telling me okay. 
"you need to submit this upstream so we don't have to maintain this separate thing going forward," and I come into that, and then all of a sudden people are firing all of these questions at me, some of which are maybe not so helpful or polite or friendly, it's going to be a really overwhelming experience. And I feel like folks who are more invested in the community don't have that kind of perspective on what that experience is actually like.

I cheated and looked at your slides ahead of time, I think you had them on your blog, and I think this is a really interesting area. One other thing is that it would be really interesting to talk to people who did give up and ask them: why didn't you keep contributing? Obviously, they're not necessarily going to want to come back and answer those questions if they quit, but it would be interesting to see, like, was it just that you had that one bug and now you're done? Were you put off by the amount of time, or people's attitudes, or whatever? So, not exactly a question, just some stuff.

Yeah. I mean, being realistic, there are people who are paid to work on a project continuously for years, and they can afford to spend that time, and there are people who aren't, who are basically being told by their manager, go fix these two bugs that are important for our product or whatnot. It is very valuable that these bugs are fixed upstream, but these people just don't have the option of working full-time on the project. Unfortunately, they are in the red box.
They are. And the other thing is, a lot of the help that we can give them is through IRC or through other means that they might not even be aware they have access to, or that might be difficult for them to access. Something I notice is that a lot of people don't go on IRC.

Yeah, a lot of folks who are working on the downstream, let's-fix-stuff-for-the-customer side are not on IRC, and a lot of them are not even Linux desktop users. Gasp.

One of the things, in terms of just dealing with it periodically, because I'm not a hundred percent upstream: one of the challenges of IRC is that if I have to go back and scroll through and figure out what people have been saying over the last two days while I was working on something else, that's really tricky. The other thing I wanted to say was, if you have somebody who has one particular problem and they basically throw a patch over the wall and run away: I've worked in other communities, notably the Linux kernel, that are far more accepting of that, and somebody will take that patch, fix it up, and get it merged. Whereas it seems to me that in OpenStack, we're not very good at somebody else taking that and running with it; it just tends to sit there and languish.

Just to jump in, one dirty little secret about the welcome-new-contributor message that was mentioned: it was implemented not just to welcome new contributors, but also to give reviewers on those projects an indication that this is someone's first patch, and to be gentle and help usher and guide them. Because we get so many new patches from people, it's kind of hard to keep track: you know, this is a name
I've never seen before. So when you see that message show up in a change that's been submitted, you know this is someone we may need to treat with kid gloves initially, and help usher into the community a little more gently.

Hi, thanks for the statistics. Did you also try to correlate review time of the patches with outside events, like Christmas vacations, OpenStack Summits, and the various other time points per project? For example, there are milestones, and everybody tries to get their patch in before a milestone, which means there are many more patches before it than can be reviewed, and similar things. I would believe that this should account for at least half of the variance in all the data.

Yeah, I mean, we have the review time here for various projects. My experience is exactly that. I remember there was a summit I didn't go to, and I had nobody to work with; there was no reason to send a patch, because nobody would review it. Same for Christmas. So yeah, to me that's very intuitive. We could look at the Decembers here to try to see anything. There are just so many variables, so many reasons why this graph would go up and down.

Feature freeze?

Yeah, milestones, feature freeze, obviously. In Neutron, I remember working weekends before feature freeze, absolutely.

Do you have some raw data available on the GitHub repo? Could you post, for example, the start of a patch, when it got merged, and similar stuff?

Well, yeah. Certainly you can look at the code; all of the code to generate these graphs is on GitHub, and you can see what sort of Gerrit queries I did and how I aggregated the data, and then you can do anything with it. Okay.
Thanks very much.

What's that? That's what this is, essentially: a moving mean. Obviously, looking at the Nova graph, this is with smoothing; looking at it without smoothing would be nearly meaningless. But yeah, this is a mean of the last 60 patches, and we send that many patches a week, so Christmas, for example, is kind of averaged out.

I want to mention something that's been implicitly danced around, which is that there's no way of not starting in the red box. We're not talking about mentoring people so they can jump right into the green box; that's not going to happen, not ever. We're trying to basically increase the slope of their graph, so that they go from the red box to the green box a little bit quicker.

And maybe your next set of slides could be based on which training materials they consulted versus the slope of their graph?

That would be nearly impossible to come up with. But I could interview every single person. What's that? I could interview every single contributor, right?

But I mean, the point is, some of this stuff you just can't grok or glean by reading a blog or having somebody tell you. You have to get in there, into the mud and the muck and the mire, and just get the experience. And I'm not saying I'm in the green box; I'm still wading my way through the left-hand side of this graph. But I've been in it enough that I can see what's going on and what needs to happen. So I guess I just wanted to make that point: you can't avoid the red box.

Yeah, you can accelerate the process, but you have to go through it; it takes a non-zero amount of time, right?
To add on to that: the notion in positive psychology is that they don't want to look at the average; they want to look at the people who are the outliers, the ones who went fast from being in the red box to being in the green box. Is there a way for cores to identify those sorts of people and figure out what they did, in a way that can be replicated? I imagine that happens, and it may just be a very skilled person, or the mythical 10x, but it seems like potentially another way of getting at a little bit of what he's talking about.

Yeah, you know, I ran through these graphs for different colleagues of mine, and absolutely there are people who just started effective and stayed effective throughout their career. So another way to phrase that is: why don't we interview those people and see what their secret is? And I'm also over time, so thank you.