Ryan Davis. Thank you very much. I have 101 slides to do in 30 minutes, so I've got to go a little fast. And I realized that I'm not actually getting the butterflies; the same feeling you get from butterflies is also the same feeling I get from sleep deprivation, which I have in spades. So prepare for me to put my foot in my mouth. I need to start off: this is my first time in Madison. Gorgeous city. I'm really enjoying it. I'm really liking the evenings with the cicadas and the crickets and everything. Do we have any locals somewhere in the front? Okay, I'm gonna pick on you. I was told not to touch this word, because if I do I'm gonna get beat up by some farm boys. What's that? I'm sorry for spelling it wrong. Can you say the word, though? Monticello. Yes, it hurts. I can't criticize this word. I've been told not to, and I really don't like getting beat up by farm boys, so I'm not gonna. But I can pick on this one. Can you say it again? Monona. That's actually wrong. It's Monona, as in one na, as contrasted to a bi-na, like Sha Na Na or banana. Turns out there are other nas in the world. I didn't know this, but My Chemical Romance does a song with a dodeca-na, and the logical conclusion of that, and what I've been driving Ozzy crazy with this entire week, is the double dodeca-na. So now you know. So this is a tiny bait and switch. Aaron Patterson talked about how to get started contributing to open source at Steel City a few weeks ago, and that happened after I proposed this talk. He did a really good job of it, so I'm not going to talk about getting started in open source at all. He did a very good job of it, a better job than I can do. Eventually the video will be up on Confreaks. Is Julia in the audience? She did a wonderful job of making these awesome notes, and I found them online. So here's Aaron's talk. I'll wait. Okay. So what I'm going to be talking about is how to contribute to open source. Holy crap: communicate.
As a side note, I just need to say, so that everyone understands my pain, that Aaron sucks and you never, ever want to work with him. As an example, if we're going further than a few blocks to lunch, it's incredibly embarrassing to go out to lunch with him. We don't have an office at all; we work from a cafe, so it's embarrassing to work in public with him. All he talks about incessantly is his ugly cat, which, I need to point out, ran into a brick wall as a kitten and has a completely flat face. And then he incessantly, obsessively talks about his sausage, which has its own Twitter account, has computer monitoring, the works. It's really obnoxious, but I don't think this really conveys it. I'm going to show you a typical day between Aaron and me. So I say, hey Aaron, why are you meowing? He says, thinking about cats, a totally normal response for him. And I say, why are you meowing Journey's "Don't Stop Believin'"? And he says, well, I'm thinking about cats singing. And I'm like... So, that said, Aaron is brilliant, and working with him is fun and rewarding. And just so you further know my pain, I scoured through literally 59 pages of Flickr self-portraits for you guys. So I'm going to be talking about the later stages of open source contribution, typically the relationship between the developer and the contributor. So let's get into it and talk about what contributors actually are. In proprietary software we have the developer, which is most often a corporation. It's usually a black box. If you're lucky you have a phone number to call tech support. And then you have the consumer: you, the user. In open source it's just slightly different. The developer is oftentimes an individual or a group of individuals. Sometimes it's a company, but you'll almost always have direct access to the developers themselves. There's you, the user, again, and then there's what I consider to be a hybrid between those two things, which is the contributor.
And I'm going to be talking about the relationship between those two things. So what are contributions? I see it as bug reports, feature requests, documentation submissions or enhancements. Pretty much any form of feedback is a contribution. Even just a "this doesn't make sense to me" is information the developer can use to improve their product. So let's look at what contributions used to look like in ye olde open source. It started with smoke signals, and then we adopted TCP/IP in Morse code over telegraph. That typically looked like checking out a project in CVS, and later Subversion. You locally work your changes. You don't have write access to the repository, so you're doing everything locally, and it's transient; there's no persistence to this thing. And then you submit the diffs of your changes back to the developer, because you don't have write permissions to the original repository and you don't have a real cloning system. That's what makes this transient. Hopefully there was something like SourceForge or Trac or some sort of mechanism to cause the contribution to be persisted somewhere. When you have that type of thing, your diffs don't get lost in email. They typically don't get ignored or forgotten about. They're something you can keep referring to, and other people can look at them and see, oh, someone else has had this problem too; I'll add additional information or diagnostics. And as a result there's more information sharing. But there wasn't always SourceForge and there wasn't always Trac, and oftentimes diffs were just emailed to a mailing list and forgotten. We actually had this world in Ruby five-plus years ago. So most of the burden is on the user to get it right. They need to make sure that their diffs are right, that they apply cleanly, that they work right. Back in those days testing was not standardized. The burden was on the user to get it right. Nowadays it's a lot easier to be a contributor.
We typically use Git, whether we like it or not. We typically use GitHub, and we're dealing with pull requests instead of patch sets, and the difference is subtle. Basically, you have a system managing those patch sets instead of you managing those patch sets and managing the communication therein. And really, thanks to things like GitHub, you don't even need a local checkout anymore. You don't have to clone the project at all. You can clone, edit, and submit a diff in one or two clicks. So as a result, forked projects and pull requests have scaled up by orders of magnitude. And this is good overall, but it means that the project developers are massively outnumbered by the contributors now, because the contributors have been empowered to give that feedback as easily as possible. And this often looks like the fire hose. When you're dealing with a project like Rails or Ruby or RSpec, very popular projects, the fire hose is a little bigger. But the point is that most of the burden goes back to the developer. They're now having to manage all that communication coming in. They're having to triage it, having to schedule it and figure out what to do with it, when there's a lot more coming in than there used to be. Aaron really wanted me to bring up the question of when someone should try to make any sort of contribution, and I think it very clearly is early and often. As soon as you have a question about a product, as soon as you have the potential for a bug and you're not even sure about it, please communicate that to the developer so they can clarify it as quickly as possible. I think that things like Stack Exchange obscure communication with the developer. They make it harder for the developer to be in contact with the users that are having issues. And so the developer has the additional burden of having to hunt down questions on Stack Exchange and other similar systems, rather than having direct communication with the user.
So how do you handle a contribution? The developer has roughly three options, and there are trade-offs with each one. They can accept it as is: just push the merge button. If the Travis CI bot says that the patch applies and the tests run, it must be fine, right? So what can go wrong? Well, as you can see here, it's a minimum of effort, a tiny little sliver of it. Pushing the merge button takes no effort whatsoever, but the potential for complexity goes up the most. Twenty merge buttons in, and you potentially have a very bad mess. You can reject the contribution for whatever reason, and that takes a little bit more effort, because it typically warrants a response of some sort, and the potential for drama goes up tremendously. And in some cases, there's nothing you can do about it. Or you can do what I'm calling distillation: rewriting the original contribution so that it fits both the user's needs and the project's needs. This requires the most effort and the most communication, and it still has the potential for drama, because you're potentially editing someone's contribution to you, and they might have ego attached to that change, attached to that code. And obviously any addition to your project has the potential of increasing its complexity, but this way you've reduced that complexity as much as possible. So I think this is the ideal balance: spend more time up front so that the complexity of the project goes up the minimum amount over time. Wow, I'm flying through this. 66 slides in. So let's take a look at distillation. As an example, Minitest has a flexible but simple runner. And by simple, I mean it's just Ruby, and it is actually pretty readable code. But this is the typical stack trace required to run a single test, and you can see that a runner is instantiated and dispatched.
That, based on the type of runner you're dealing with, dispatches to run the tests, which picks up the suites, which runs a single suite, which instantiates a single test, runs the hooks, runs the test, and runs the cleanup. That's actually as simple as I could get it and still have it be flexible enough to allow for both running tests and running a benchmarking system with an n-by-m matrix, doing curve fitting on that, and letting you do performance regressions against your code in a normalized manner. And really, the system's flexible enough that we haven't fully realized its potential, because there just aren't that many plugins for Minitest that change the type of runner you're dealing with. There's a lot more stuff it could do. In particular, I've been working on it with Aaron. It's got hooks for various phases of a test run: there's the before-setup, the setup, the after-setup, and the same thing with teardown. I worked with Aaron to help generalize those hook systems so that it meets all of his needs for testing Rails with the Rails mocking system. But he has the benefit of working and sitting next to me on a daily basis, so we have the highest communication bandwidth possible, and most other developers don't. So there are other systems: Mocha or FlexMock, Turn, various CI integration systems; there's one for Travis, and I'm sure there are many others. The number of bugs filed from those development groups is currently at zero, with the minor exception that Mocha has been picking up the ball in the last month or so and getting that communication rolling. And they're doing what I'm calling copy-paste-munge, which is that they're taking whole methods from the runner, which I consider to be internals. They're not documented. They do not promise any particular type of internal architecture. And as you can see here, Mocha has essentially a monkey patch per version range of Minitest releases.
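That "monkey patch per version range" pattern looks roughly like this. This is a hypothetical condensation, not Mocha's actual code; the file names and version cutoffs here are made up purely to show the shape of the problem:

```ruby
# Hypothetical sketch of version-range dispatch: the integration picks
# a different monkey-patch file based on the framework's version string.
# Every new Minitest release risks needing another branch here.
def integration_file_for(minitest_version)
  case minitest_version
  when /\A1\./        then "mini_test/version_1x"
  when /\A2\.[0-2]\./ then "mini_test/version_20_to_22"
  when /\A2\.[3-9]\./ then "mini_test/version_23_up"
  else
    # An unknown release falls through, and the patch silently breaks --
    # which is exactly the fragility being described.
    "mini_test/unsupported"
  end
end

# require integration_file_for(MiniTest::Unit::VERSION)  # real usage
integration_file_for("2.5.1") # => "mini_test/version_23_up"
```

However the real integration was structured, the effect is what the talk describes: complexity grows as frameworks times releases.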
And as you may know, I'm a tad bit release-happy with my projects. You're welcome. The current maintainer of Mocha is doing a much better job of communicating; I need to say this. He's been talking to me a lot about what he can do, and he's working on a scheme to make it so that Mocha does not need to monkey patch Minitest anymore. That will improve in the near future, but it's not there yet. But what it means is that Mocha in particular has the problem that its complexity is proportional to the number of frameworks it's patching, and that's RSpec, Test::Unit, Minitest, and probably others, times the number of releases of those products. And that's geometric growth. Every time I change my internals, and my internals are mine, don't touch me there, they can break his code. And that did happen with the latest release of Minitest. That's why we're now communicating about this stuff. And I think that's untenable. Luckily, he's working on fixing that situation, but there are a lot of other systems that monkey patch Minitest that just aren't there. So I have a happy example; this is not just a bitch session. I had a co-worker come to me with a question, and the dialogue basically looked like this in IRC. He was trying to make something similar to Caldersphere's ci_reporter, which is basically a system that hooks into the test framework and then submits the data up to a CI so that you can see it company-wide. I'd never heard of it, so I decided to go look at it, and the integration was insane. It was crazy. It was 224 lines of monkey patch that I couldn't make sense of, and I couldn't figure out what version of Minitest they'd forked from, because it was that munged. So he was writing a similar system. It hooked into Minitest and made it so that test results were getting collated, then written into some sort of XML and pushed to some CI system somewhere.
And he submitted a pull request to me to make Minitest fit his needs better. He still had monkey patches, but he'd found the places where they were overlapping a little bit and wanted to make it so that his life was a little easier, as he should. Unfortunately, his patch set was the wrong way to go for Minitest. I rejected it, but I was able to talk to him through the issue tracker. And really, that's a huge difference already: it got the ball rolling. It got us communicating. And as a result, we were able to distill his needs. I basically said, let's back up a second. I understand what your patch set is, and I understand what you're trying to go for, but what do you really need out of this? And as a result, we were able to reach some clarity on that. One of the things about him being a newer developer is that he just didn't have the mindset to reduce the problem description to its pure essence. So I was able to say, okay, let's just stop for a second: what data points do you need, and when do you need them? And as a result, we wound up with a very small change to Minitest. It is literally an empty method. That's it. It takes five arguments: the suite, what method was run, how many assertions ran, the time it took to run, and whether or not there was an error, and if there was, what the error was. Nothing Minitest does with that is interesting, so it's an empty method. And then I wind up calling it in the run-suite method after every test is run, after the teardown methods have run. As a result, and I don't expect you to be able to read this, what I want you to do is look at the colors. And for any of you who might be red-green colorblind, this was pointed out to me about five minutes before my talk, I just want to point out that about 80% of this diff is red: those are removals. And about 10 to 20% is green: those are places where he was able to clean up his code. And as a result, instead of this tangled mess, he's got this.
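The shape of that change, and the way a reporter consumes it, fits in a few lines. This is a self-contained toy, not real Minitest: `Unit` is a stand-in for MiniTest::Unit, and `CIUnit` and its result collection are hypothetical names for the reporter described here; only the shape of the `record` hook, five arguments, empty body, called after each test, comes from the talk:

```ruby
# Stand-in for MiniTest::Unit, reduced to the one hook that matters.
class Unit
  # Deliberately empty: the framework does nothing interesting with
  # these values, so the method exists purely for subclasses to override.
  # Arguments: suite, test method name, assertion count, run time, and
  # the error (or nil if the test passed).
  def record(suite, method, assertions, time, error); end

  # Toy run loop: after each test (and its teardown) finishes, report it.
  def run_suite(suite, methods)
    methods.each { |m| record(suite, m, 1, 0.001, nil) }
  end
end

# A CI reporter only needs to subclass and override record.
class CIUnit < Unit
  def results
    @results ||= []
  end

  # Hypothetical: collect results to push to a CI server later.
  def record(suite, method, assertions, time, error)
    results << { suite: suite, method: method, passed: error.nil? }
    super # good citizens always call super
  end
end

runner = CIUnit.new
runner.run_suite("MathTest", %w[test_add test_sub])
runner.results.size # => 2
```

The design point is that the framework pays for one empty method, while all the CI-specific complexity lives in the subclass.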
He subclasses MiniTest::Unit with his own CI unit. He has one after-test hook, which goes and does the submission. And then he overrides the record method, where he pushes the result into his system and calls super, like a good citizen. Everyone should call super. And then he replaces the regular runner with his CI unit runner. And since all it's doing is subclassing, overriding one method, and adding one hook, it runs exactly the same and then does his extra stuff. So is there a conclusion to my talk? Kind of. I can't really boil this down any more. But communicate, communicate, communicate. I'm not going to jump on the stage because I don't trust it, and I'm not going to sweat that much for you because I don't love you that much. But please communicate with the developers. Thank you.