My name is James Blair. I work for the OpenStack Foundation, but more importantly I'm on the infrastructure team for the OpenStack project, which is an extremely open, cross-organizational team that works together to facilitate the operation of the OpenStack project as a development community. I'll talk a little bit more about what that means. These other people on the slide are not speaking today, but these slides, like everything we do, are part of a public Git repository. They're all shared; we all contribute to them. So their names are up here because they have contributed something to this and given some version of this talk elsewhere. They're also the other core members of the infrastructure team, and as you can see, there's some pretty good representation there: two of those people work for HP, two of them work for the OpenStack Foundation, and we hope to put some more people on the team from other organizations soon.

So this is not a talk about OpenStack, so I'll talk really quickly about OpenStack. In case you haven't heard of it, OpenStack is open-source software for building public or private clouds. Basically we're talking virtual compute, storage, networking, and identity management components that you can put together using APIs and build large, data-center-spanning applications out of them. The components of OpenStack are varied and numerous. Over here on the left, you can see our list of server projects. These are the major components of OpenStack: like I mentioned, compute and object storage, and then images, identity management, networking services, and so forth. I won't read them all. Most of them have client libraries that go along with them, so that you can interface with them either from the command line or at an application API level.

And this is the first challenge of development in the context of the OpenStack project. We keep calling it one project, but it's composed of many, many projects that exist not only in their own Git repos, but that each have their own development community around them. There's a group of people who are more focused on Nova than, say, Swift, and so forth. So a lot of our challenge as a whole project is to enable these individual projects to keep doing what they need to do, but then, at the end of the day, make sure that all these things work together, that we can put them together and build an actual cloud out of all the individual components, and to make sure that the development methodology is shared amongst all of these projects to reduce friction in collaboration between the different components. So it's kind of a case study in how to do large-scale development of a componentized system.

In addition to all of those Git repos for individual technical efforts, we have some cross-project, horizontal efforts, including documentation and infrastructure, which is what I'm talking about here today. We also have a project specifically to deal with reuse and abstraction of common ideas across the OpenStack projects themselves. It turns out that when you have 60 Git repos and they're all doing similar things, people start to copy code around. So we actually have an effort to make sure that that's minimized: there are folks who find that code and factor it out into new external libraries. That's the Oslo project.
Of course QA is a horizontal effort, especially in the realm of integration testing. There's a project with its own Git repo called Tempest that does integration testing with all of the OpenStack components, and it's good to have a team that's focused on making sure that all of the projects are participating in that effort. Release management is also a team, spearheaded by a single individual named Thierry, who is an amazing, extremely detail-oriented person. He makes sure that as we approach our release deadlines, all of the projects have hit their milestones and things like that. We have an internationalization team and a vulnerability management team, as you'd expect. So basically, to date, these are all of the horizontal efforts that we've found necessary to integrate all of those individual components together.

A bit about our release management. We borrowed a lot of things from Ubuntu when we started the project. We started with time-based releases every six months, and we decided we would release a complete version of the system, named after the year and release number, similar to Ubuntu. So we're working on the first release of 2014 now, which is 2014.1. We also have code names for our releases, named for the geographic locations where we have design summits, which, again, is another thing we borrowed from Ubuntu. Every six months, right after a release, we all get together and try to figure out what we're going to be doing for the next six months. We try to make the most of that by coming up with rough plans in advance for the kind of work we want to do, and then we all sit down in one place for several days, about four days, basically lock ourselves in dark rooms, and go into those plans in detail. By the end of that, we should know what we're going to be working on for the next six months. This is really important, since we have a lot of different folks from different organizations working on the project; there's a lot less of going down to the break room and chatting with somebody, or having weekly meetings, or that sort of thing. So these design summits are very important for syncing up as a project on what we're doing. We actually just released the Havana release of OpenStack, which is not named that because we had our design summit in Cuba; we had it in Oregon, and there's a Havana, Oregon, as it turns out. Possibly wishful thinking when we named it.

We also have milestone releases within that six-month window. Every few months or so, we checkpoint our progress and make sure that we're keeping up with what we decided at the previous design summit. And finally, after a release, of course, releases are not perfect. They're supposed to be perfect, but they're not. So we have stable branches that we fork off and maintain for a period of time with security and bug fix updates.

I got into this a little bit on the previous slide, but our contributors are rather varied. There's a pie chart, which may or may not be representative at this point; honestly, it's changing all the time. But we have folks from, as you can see, Red Hat, HP, IBM, Rackspace, several other companies, and independent folks working on the project. So it's really exciting having so many people from so many different organizations and backgrounds working on this thing.
It's also a bit of a challenge, because all of these people come to OpenStack development with their own expectations and their own goals. They might be focused on just the subsystem their company specializes in, while others might be more interested in the functioning of the system as a whole, or in whether you can take this system, bundle it up, and deploy it in a certain way. There's room for everybody, but how we coordinate that work and make sure that it's not just a mess is quite a challenge. Of course, the quantity and the quality of the contributions vary. We have old-school expert Python programmers working on this, and people who have never written a line of Python in their life join up and start submitting patches. We try to be a pretty welcoming community, so we also try to facilitate onboarding new people and gently bringing them into the system.

To do that, we focus a lot on consistency around our tooling. The infrastructure team in particular focuses on meta-development; it's an area that we're very interested in, so we concentrate that energy in this team. That way, all of the individual Git repositories, all of the individual projects, aren't spending their time figuring out, well, how do I bundle this up into a release? How do I build a framework around testing? Things like that. In the infrastructure team, we do that for the project as a whole and concentrate that development so that it's consistent for everybody, and so that it's easy for new developers not only to onboard onto OpenStack as a project, but to move between the different components, if they're working on the compute system and need to dive into the networking system because, it turns out, they interact. Making that sort of thing consistent is important.

So here's some of the stuff we run as part of our developer infrastructure. It turns out there's a lot, and I'm not going to talk about every one of these, but the high-level categories are things around code review and managing our Git repositories, and test and build automation; those go together because we use the same kind of automation for running tests as we do for building release artifacts and so forth. We also do a lot of work to minimize the disruption of the external internet on our development process. OpenStack is written in Python, by the way, so we download a lot of packages from PyPI as part of our building and testing, and we need to make sure that we can do that reliably. We get OS images for testing and things like that. So we have to cache all of these things, because it turns out that the internet is not terribly reliable when you're trying to do the same thing thousands and thousands of times a day consistently. Job logs and build artifacts are a really interesting thing that I'm going to get into later: when you run tests and builds at the scale that we do, what do you do with all of the logs? How can you make use of them? Of course, we build documentation and publish it, we build releases, and we run IRC bots. The OpenStack projects themselves use IRC extensively; it's one of the ways that we can communicate with developers from various organizations in real time. So while we don't have a water cooler, we do have IRC, and that turns out to be a very good substitute.
We have a channel for the folks who are focusing on infrastructure too, and it's actually one of the busiest channels, because we're not only building and maintaining the system but are also the de facto help desk for new developers. So we're always around to help people who are just starting with this system. All of the teams have weekly or bi-weekly meetings, again in IRC, so that they're accessible to everybody, and they're logged and recorded; you can go back and see the meetings. The project itself has a governance structure, a technical committee, that also meets weekly in IRC. That's where we make project-wide decisions. We run some blogs, because people like writing about this stuff, mailing lists, an Etherpad server, and a pastebin. All of these things are really helpful for the kind of collaborative development that we do. And finally, authentication for all of our tools, because we have a lot of them and we like you to be able to log in once and use all of them. We manage bugs and feature development work in blueprints in Launchpad.

The typical development environment for an OpenStack developer is obviously Python, as I mentioned. We run all of our tests on the long-term support releases, so we run on CentOS 6.2 (not 2.6) and Ubuntu Precise. That's kind of the lowest common denominator for what we're trying to support as a project. All of our projects are PEP 8 compliant. PEP 8 is the style guide for Python, which sounds like a minor thing, but when you have so many developers working on a project, having a style guide puts so many arguments to rest before they start. We don't have arguments about how many tabs or spaces you should have; we have a tool that enforces that automatically, so people don't have to deal with that sort of thing. I mentioned the common libraries earlier. And we use virtualenv quite extensively to pull in all of the Python dependencies when we run tests. We also use a tool called tox, which manages the creation of a virtualenv and then the installation of the software into it. That's made it so that we can be very consistent about how we bootstrap an environment for testing: our Jenkins servers run tox to run the unit test suite, and you can run the same thing on your workstation and get, ideally, the same results. I mentioned IRC earlier; we're on the Freenode network.

We have a cool tool called DevStack, which basically takes all of the OpenStack repositories, downloads them, and installs them. So if you're trying to develop on OpenStack and you're not using it in a production situation, or even if you're just like, hey, what's this OpenStack thing? I want to play around with it. I don't want to read anything. I just want to start breaking stuff to find out what it is. I don't know about you; that's how I learned. You can download this tool called DevStack from devstack.org. It's basically a shell script that gets all of the stuff, installs it, and leaves you with a functioning OpenStack system at the end. Don't run it on your primary workstation, because it'll destroy it; get a VM or a container of some sort and run it inside of that, and within a couple of minutes you'll have a working installation of the full OpenStack system with all the components. We use that same tool in our integration test suite, so we know that it always works, that it's always able to produce a working installation of OpenStack.
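Going back to tox for a second, here's roughly the sequence it automates for us. This is a hand-rolled sketch of the idea, not tox itself, and the test-runner invocation at the end is just a stand-in for whatever runner a project actually configures:

```python
# A hand-rolled sketch of what tox automates: build a fresh virtualenv,
# install the project and its dependencies into it, then run the tests
# inside that environment so everyone gets the same results.
import subprocess

def run_unit_tests(envdir=".venv-sketch"):
    subprocess.check_call(["virtualenv", envdir])
    pip = envdir + "/bin/pip"
    subprocess.check_call([pip, "install", "-r", "requirements.txt"])
    subprocess.check_call([pip, "install", "-e", "."])
    # Stand-in test runner; real projects configure their own in tox.ini.
    subprocess.check_call([envdir + "/bin/python", "-m", "unittest", "discover"])

if __name__ == "__main__":
    run_unit_tests()
```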
And we have a system called project gating. The basic idea is that we don't let anything merge to a code repository unless it passes tests. It's a simple idea, but a powerful one. It's great for developers, because it means you never start your day by checking out the code and then figuring out what somebody broke last night because they pushed it into the tree without running the tests. You always know that every commit to the repository works; at least, it always passes the unit tests. And it's a very egalitarian system. We don't have a BDFL; we don't have any single person who is in charge of all this stuff and gets to flout the rules and say, well, I didn't run tests on this, but I'm sure it works, so I'm going to push it in. That physically can't happen with our system. Everybody is subject to the same requirement that a change pass tests before it merges.

Everything that we do is automated, because we're lazy sysadmins. We don't like to do things manually, and whenever we end up having to do something manually, we botch it up. We're terrible at it. So here's a quick shot of the tool that we use to drive a lot of our automation. It's called Zuul, and it drives not only the project gating system that I talked about, but also things like building documentation and publishing it whenever a commit lands, building releases and pushing them to PyPI, and that sort of thing.

Here's the process flow for a developer working on a change. It's a little different from what you would consider the normal workflow if you're, say, used to GitHub. It turns out Git supports a lot of different workflows equally well, so this one's a little different from GitHub's, but it works out really well for the tools that we're using. You start, obviously, by cloning a copy of a repository. So you've got the Nova repository and you clone it down into your local environment. Then we ask developers to start a new topic branch; it's going to get awkward if you start trying to do your development on master, for reasons I'll explain in a minute. So you start a new topic branch, you write your code, you ideally run the unit test suite, but hey, you don't have to; we've got programs that'll do that for you. Then you commit to your local branch. Then you use a tool called git-review, which is something that we wrote, which takes the outstanding commits in your repository and pushes them up to Gerrit, which is our code review system. Once you get in the habit of just writing something, typing git review, then going and writing something else and typing git review again, it's really kind of addictive. It's a really low-friction way of writing code and pushing it up for peer review.

So anyway, once you run git review, it sends your change up to Gerrit, where our Jenkins system, driven by Zuul, will immediately start testing it automatically. The idea is that as soon as a new change comes in, developers and reviewers are going to want to know whether it passes tests or not, so it gets started on that right away. Eventually it comes back and leaves a comment in Gerrit saying, yes, this passed tests, or it failed, and so forth. Then reviewers come along. Our system is based heavily on code review.
The reviewers have, I guess, the most power in the system, in that they're the ones who decide what gets merged and what doesn't in OpenStack components. Anybody in the world is welcome to log into our Gerrit and start reviewing code. You can leave advisory votes of plus or minus one: I like this, I don't like this. You can leave comments about specific things that should be changed. We have what we call core review teams for every project, composed of senior developers who ideally know in great detail what's going on in the individual projects. They can come along and leave plus-two or minus-two votes, which are basically binding: a minus two means this can't merge, and a plus two means it can. And we require two plus-two votes from two different core developers before a change will merge.

Then, once core developers decide that a change is appropriate to be merged, they approve it. It goes back to Jenkins for one last round of testing, because this could have taken a while; things might have changed since the last time the change was tested. Moreover, things are probably going to change between the time somebody approves the change and the time Jenkins has finished running it. So we actually have quite a bit of work around making sure that the change is merged as tested in the repository, and I'll get to that in a minute. But anyway, once it does pass tests, it finally gets merged into master, and you've closed the loop, so to speak.

So Gerrit, as I mentioned, is a key part of our system. It's where developers and code reviewers spend most of their time interacting with the system. We try to make sure that every part of the system that works with a change, whether that's code review or automated testing or whatever else, interfaces through Gerrit, so that we're focusing all of the information about a change in one place for developers. Gerrit is a standalone code review system developed by Google for the Android open source project. It's been really great to work with because of its highly flexible integration points. You can add commit hooks, as you'd expect with Git. But it also has an event stream interface, where you SSH into the thing and it spits little JSON blobs of information at you whenever anything interesting happens, like when a new patch set is uploaded or somebody leaves a comment. And it has extensible review categories. Out of the box it comes with Verified and Code-Review, but you can add others; for example, you can say changes need to go through a licensing review check or something like that, and you can add all of these categories as your workflow demands.

So here are some screenshots of Gerrit. This is the top part of the page, where you can see general information about the change: the person who wrote it, the project, the branch, the topic, when it was updated, its status, et cetera. Here's the commit message. You start by reading the commit message and figuring out whether you can actually understand what this thing is talking about or not. Down here we have all of the people who've reviewed it and left comments on it. You can see that these people left plus-one reviews. The check mark here is Gerrit-speak for plus two, for some reason. So this is a plus two from this guy. And here's a plus two from Jenkins, meaning it passed the unit tests.
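Incidentally, that Jenkins vote shows up by way of the event stream interface I just mentioned: our automation SSHes into Gerrit, watches the JSON events go by, and reacts to them. A minimal consumer looks something like this; the host name is a placeholder, but `gerrit stream-events` over SSH on port 29418 is the real interface:

```python
# Minimal sketch of a Gerrit event-stream consumer: each line Gerrit sends
# over the SSH connection is a JSON blob describing one event.
import json
import subprocess

proc = subprocess.Popen(
    ["ssh", "-p", "29418", "review.example.org", "gerrit", "stream-events"],
    stdout=subprocess.PIPE)

for line in proc.stdout:
    event = json.loads(line)
    # React to new patch sets; other event types include comment-added,
    # change-merged, and so on.
    if event.get("type") == "patchset-created":
        change = event["change"]
        print("new patch set:", change["project"], "-", change["subject"])
```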
Now, remember how I said we require two plus-two reviews from core reviewers? This screenshot is apparently the exception to the rule. I'm sure they had a very good reason for approving this with only one. So anyway, Daniel decided that this was okay to merge, and so it got merged in.

Another thing Gerrit provides, obviously, is a diff view of the changes themselves. Most of us like side-by-side diffs, and that's what it shows by default. So you can see here they just changed this line down here, as such. It's highly configurable; if you like some other kind of diff, it'll show you that. It's got syntax highlighting for the major languages and that sort of thing. So it's a really good way to review code. You can also click on each of these lines and leave inline comments.

We have Gerrit integrated with Launchpad, which is our bug tracking tool. Any time you upload a change that mentions a bug in a certain way in the commit message, we'll hook it up to Launchpad, so that you can look at a bug and say, oh, somebody proposed a commit to address this bug. It links back to Gerrit automatically, and it helps keep the two systems in sync.

Here's another page from Gerrit. Gerrit has a lot of features for querying reviews and dealing with your workflow, and we have custom dashboards that try to help prioritize reviews in certain ways. There are various ways you can get Gerrit to show you a list of patches in various states. The one that's highlighted is a change that has passed the initial tests and has at least one positive code review from a core member, but it still hasn't been approved yet. That's a change where, if you're a core member, you might say, maybe I should look at this and see if it's ready to go in. These other changes haven't even had their tests run yet, and this one had its tests run and came back with a failure.

I mentioned git-review a little earlier. We wrote it to make interacting with Gerrit easier for new developers. Gerrit is very easy to interact with if you're a Git expert, because basically it's just a Git server with special magic refs, and if you push to those magic refs, it will create patch sets for you. If you're into hacking that kind of infrastructure, it's a dream, because it makes a lot of sense in terms of Git and there's a lot of cool things you can do with it. But that's not necessarily what you want to teach every new developer; you don't necessarily want to teach them how to git push HEAD to Gerrit's special refs by hand. So we wrote this tool that acts as a Git sub-command, and basically all you have to do is make your commit and then type git review. Here's the output from git-review showing how it created a new change and pushed it up; in fact, right there you can see the special magic ref that Gerrit uses. Basically, this means: create a new change for the master branch with the topic bug/91608. If you're using Gerrit, git-review doesn't require any of the other infrastructure we're talking about. It's a simple Python script, and it's actually packaged in Fedora and Ubuntu, so you can just install git-review from your distribution, or you can pip install git-review to get the latest version from PyPI. It also auto-configures itself; it's kind of a zero-configuration kind of thing.
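To make that magic ref concrete, here's roughly what git review boils down to. This is a sketch of the idea rather than the real tool (the real one also sets up the remote, installs a Change-Id commit hook, and more), and the remote name here is just an assumption:

```python
# Sketch of git-review's core trick: pushing HEAD to Gerrit's magic ref
# creates (or updates) a review instead of touching the branch directly.
import subprocess

def git_review(remote="gerrit", branch="master", topic=None):
    ref = "refs/for/%s" % branch
    if topic:
        ref += "/" + topic          # e.g. refs/for/master/bug/91608
    subprocess.check_call(["git", "push", remote, "HEAD:%s" % ref])

git_review(topic="bug/91608")
```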
If you have a special file in your repository that tells git-review which Gerrit server to talk to, it does all of its configuration automatically.

As you might have figured out, a lot of our infrastructure is around testing, because testing this project is complicated, and it's only getting harder as the project gets bigger. As a service for developers, we run lots of tests for them easily: just by pushing up a new commit, they start tests running. The main kinds of tests we run are unit tests, which are designed to run inside contained environments. You should be able to run them on your workstation; they shouldn't be going and doing any sudo commands behind your back; they should be quick and easy for a developer to run. Quick varies over time; I think some of our projects take 10 or 20 minutes to run unit tests at this point. Nonetheless, it's supposed to be an easy thing for developers to do. Integration tests are a little bit harder. Like I said, we use the DevStack tool for that, which basically trashes an entire virtual machine in order to do the tests. That's a little more involved for a developer to run if they don't already have it set up. So basically, we spin up virtual machines, install the stuff, and run it for them, so that developers don't have to.

Some specific challenges that we've had to deal with around testing: as I alluded to earlier, we want to test the effect of merging a change. We don't want to test that your change works against the state of the repository now; we want to test that when your change merges in the future, with everything else that might be merging around the same time, the repository is still going to work. That turns out to be a little bit of a challenge, and I'll talk about how we address it in a few minutes. Also, our infrastructure is entirely virtualized; we run it on OpenStack clouds. We're eating our own dog food. We actually have a couple of different OpenStack cloud providers that give us free accounts where we run all of these tests. A big value proposition of OpenStack is that it's supposed to be a provider-independent system; it's supposed to help avoid vendor lock-in. We're really putting that to the test by running our infrastructure transparently across at least two different cloud providers. We also have a large number of very similar projects, and we get really tired of typing the same thing over and over, so we try to build a system that is amenable to templating, standardization, and then repetition of that standardization. And finally, our own cloud providers give us slightly different versions of computing resources, and beyond that, there are a lot of different ways one might choose to run OpenStack, on a lot of different hardware configurations. We would like people to be able to test those as well and provide feedback.

One of the things that we do to address the question of testing the effect of the change, as opposed to its current state, is a script we wrote called gerrit-git-prep. Basically what it does is put the tree in the configuration that it's going to be in after this change lands, which isn't a terribly difficult thing to do, except that occasionally there might be a couple of other changes landing ahead of this one, and we need to make sure that they're in the tree as well. And in some situations, when we're doing integration testing, there are other projects that may have changes in process as well.
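In essence, the prep step looks something like the following. This is a simplified sketch of the idea, not the actual gerrit-git-prep script, and the ref names are illustrative:

```python
# Simplified sketch of the "test the future state" prep: start from the
# target branch, then merge each outstanding change in queue order, so the
# tree looks the way it will after everything ahead of us has landed.
import subprocess

def prep_future_state(branch, change_refs):
    def git(*args):
        subprocess.check_call(["git"] + list(args))

    git("fetch", "origin", branch)
    git("checkout", "-B", "future-state", "origin/" + branch)
    for ref in change_refs:
        # e.g. Gerrit change refs like "refs/changes/56/123456/3"
        git("fetch", "origin", ref)
        git("merge", "--no-edit", "FETCH_HEAD")

prep_future_state("master", ["refs/changes/56/123456/3"])
```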
We need to make sure that all of the projects represent their proposed future state for every change before we start testing it. We have a script called devstack-gate, which takes the gerrit-git-prep idea to the next level and puts all of those related projects in the state that they're supposed to be in. That way we know that when you're making a change to Nova, we're testing that change along with any changes that people might be making to Keystone or some other project at the same time.

When we started doing this, we ran into some problems. Our tests are slow; I think at the worst, our integration tests maxed out at about an hour and a half. Right now, they take 45 minutes, and that's after quite a bit of effort to get things running in parallel. As you can imagine, if you're trying to test every change before it lands, and it takes on the order of an hour to test a change, and you did this in the naive way, serially, you'd only be able to land 24 changes in a day, and we like to land a lot more than that some days. So we had to deal with that problem. Cloud API calls can fail; they fail a lot. It turns out that's one of the features of the cloud: it's eventually consistent, it might fail at any time, and you're kind of expected to deal with that failure. Netflix has famously popularized this idea of being prepared for failure at any time with their Chaos Monkey tool. When we started spinning up tens of thousands of machines every day to run these tests, we ran into these kinds of problems, and so we had to start building robust tools that could deal with failure at every conceivable opportunity. And external services are unreliable. I talked about this a little earlier: if we're running thousands of tests every day, we can't necessarily depend on downloading every dependency we need from PyPI or from the Ubuntu archive or something like that; frankly, it would just be rude to download quite so much all the time. So we came up with caching and mirroring solutions to help deal with that.

One of the things we do to keep the system moving smoothly is to spin up nodes for testing beforehand. We actually have a pool of nodes at any given time that are ready to have tests start running on them at any point. The whole process for that is: we spin up a new node in a cloud, just a base Ubuntu image; we get all of the packages and repositories that we need and cache them locally on that node; and we snapshot that to an image in the cloud. So basically we now have an image that represents a base system plus all of the things that we're likely to need. Then we spin up nodes from that image, and like I said, those are ready to run at any time. When a job comes along that needs one of those nodes, it starts running, and when it's done, we delete the node, because we don't really want to bother cleaning it up. We don't know whether the cleanup is going to work, and we don't know what kind of state the node is going to be left in at the end, especially if the job failed. So we just get rid of it, because it's cloud, right? There's more where that came from. Those are the main things that we've done to speed up that process.

Now that we've got all of those tests running as quickly as we can, we still want to be able to merge changes even faster. So we wrote a system called Zuul, which is a general-purpose trunk gating system.
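Before I get into Zuul itself, here's that node lifecycle in rough strokes. The `cloud` client here is hypothetical, just to show the shape of the process; the real tooling drives the OpenStack APIs and retries aggressively, since, as I said, cloud API calls can and do fail:

```python
# Node lifecycle sketch, using a hypothetical cloud client.

def build_test_image(cloud):
    node = cloud.boot(image="ubuntu-precise")     # plain base image
    node.run("cache_packages_and_git_repos.sh")   # pre-seed likely dependencies
    return cloud.snapshot(node)                   # base system + cached bits

def run_job(cloud, image, job):
    node = cloud.boot(image=image)   # in practice, taken from a ready pool
    try:
        node.run(job)
    finally:
        # Never clean up and reuse a node: its end state is unknown,
        # especially after a failed job. Delete it; the cloud has more.
        cloud.delete(node)
```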
Zuul's kind of mind-warping feature is that it does speculative execution of tests. It runs a lot of tests in parallel and assumes that they're all going to pass. If they all do pass, that's great, because all of those changes can merge. If they don't pass, it goes back, figures out which change failed, kicks that one out, and runs through again.

That's basically what I just said, but I have a quick little simulation for those who are more graphically inclined, like I am. Imagine you've got two projects here, Nova and Keystone, and you've got a commit representing the head of each of those projects. Then somebody comes along and starts approving changes to these projects: four changes, two of them to Nova, then one to Keystone, and then another one to Nova. These are both OpenStack components, and we know that they relate to each other. Keystone is identity management; in order to even authenticate to Nova to do something, you need to use Keystone. So these two projects are related, which means we need to be careful about how we're testing changes to them, because changes to one project can break the other.

What Zuul does is, as these changes are approved, it puts them in a virtual queue and starts running tests for all of them in parallel. What it finds here is that the first two changes work just fine, and the two after that fail their tests for some reason. Unfortunately, Zuul doesn't know whether change number four failed its tests because there's a fault in change number four, or because there's a fault in change number three, since, like I said, they're related. So at this point, Zuul notices that change number three has failed, shifts it out of its virtual queue, and tests number four again, based on the assumption that three isn't going to merge. So the future state of the repository at this point is changes one, two, and four. And this time, it actually did pass its tests; presumably the reason it failed before was that there was actually a fault in the Keystone change. Once the tests are done running, Zuul starts reporting these changes back to Gerrit. It says these two changes passed their tests, and everything ahead of them passed, so they get to merge. This one did not pass its tests, so it just gets reported back to Gerrit with negative feedback. And finally, number four gets to merge, because its final test was based only on the other changes that did pass.

Zuul has a very flexible configuration syntax. Out of the box, it has very little understanding of the workflow that we use it for; we've built all of these concepts out of very simple configuration primitives. We basically define pipelines for all of the different kinds of actions that we're going to be doing. The thing I just demonstrated is what we call the gate pipeline, where a change gets merged after testing, but we use the same machinery for other kinds of automation as well. The check pipeline is what I alluded to earlier, where we test changes as soon as they're uploaded. And the post and release pipelines are what you'd expect: the post pipeline runs post-merge jobs, like building the documentation and publishing it, and the release pipeline runs things like building the release tarball and possibly even uploading it to PyPI or something like that.
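Going back to the gate pipeline for a second, here's a toy model of that speculative queue, just to pin down the algorithm. It's nothing like the real Zuul code; `passes_tests` stands in for actually running jobs against a proposed repository state:

```python
# Toy model of speculative gating: each queued change is tested against the
# assumed-merged state of everything ahead of it. On a failure, the failing
# change is evicted and everything behind it is retested without it.
def process_gate_queue(queue, passes_tests):
    merged = []
    while queue:
        # Launch tests for every change "in parallel" (simulated here by a
        # list comprehension); change i is tested on merged + queue[:i+1].
        results = [passes_tests(merged + queue[:i + 1])
                   for i in range(len(queue))]
        if all(results):
            merged.extend(queue)
            break
        bad = results.index(False)
        merged.extend(queue[:bad])   # everything ahead of the failure merges
        queue = queue[bad + 1:]      # evict the failure; retest the rest
    return merged

# With the queue from the demo (two Nova changes, a faulty Keystone change,
# then another Nova change), this merges changes 1, 2, and 4 and rejects 3.
```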
We have a lot of jobs in our Jenkins, many hundreds of jobs. I've forgotten how many; it's somewhere between 500 and 1,000, or it might be more than 1,000 at this point. At any rate, we have so many jobs because we have so many projects, and every project has a set of very similar jobs. About the time we were getting close to 100 jobs in Jenkins, we said to ourselves, okay, this idea of logging into a web interface for a Java app and clicking around to create new jobs and change their configurations, that's going to get old really fast as we add new projects. So we started working on something called Jenkins Job Builder. Basically, we manage all of our jobs as YAML files in Git, so it's really easy to think about them, abstract them into templates, and then apply those templates to all of our different projects. It's also really easy to give access to anyone. Traditionally in Jenkins, you have to have some kind of administrator access to log in and change jobs; that just seemed kind of elitist to us. So instead, what we built is a system where anybody can clone a copy of our Git repo, change the YAML for these jobs, and propose that change up to Gerrit, and we'll code review and merge it. Once it's merged, it automatically gets deployed out to Jenkins.

So here's an example job template. This is a pep8 job; like I mentioned before, pep8 is the Python style checker. It's pretty simple. We basically say this job's name is going to be gate-something-pep8, and the 'something' gets filled in later with the name of the project, so it'll be gate-nova-pep8 or gate-swift-pep8, et cetera. It's got a couple of build steps in Jenkins: it runs the gerrit-git-prep script I mentioned earlier, and then it runs our standard pep8 script, which basically just runs tox -epep8. And when it's done, it publishes the console log somewhere. That's about all it takes to write a simple Jenkins job. Again, the system is pretty generic; it's not really tied to any of the rest of our infrastructure. So we actually have a lot of people contributing changes for the bits of Jenkins that they need to manage, and it's a quite featureful system.

And there's a question back there. [Question about how the Jenkins masters and slaves are arranged.] So at the moment, we have a few Jenkins masters that have all of the jobs, and each of those masters has a couple of hundred slaves attached to it. We could divide it up in several ways: we could divide by jobs, we could have a master per project, we could even have all masters and no slaves. But right now, based on hysterical raisins and the direction that we're going (wow, I have two minutes left), we have, like I said, a couple of masters with a bunch of slaves attached to them. We're kind of exploring the scaling limits of Jenkins; we found that after a couple of hundred slaves, Jenkins itself becomes untenable to run at that scale. That's actually why we have a couple of Jenkins masters at this point. Yes. And one of the reasons we're doing it that way at the moment is that our system for spinning up nodes for the DevStack jobs is really good at spinning up new nodes and attaching them to Jenkins masters. So it makes a certain amount of sense for the way that we're spinning up those nodes and attaching them, but nothing we're doing really assumes that kind of topology.
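To make the templating idea concrete, here's the heart of it reduced to a few lines of Python. This is a sketch of the concept, not Jenkins Job Builder's actual code or schema:

```python
# Define a job template once, then stamp it out for every project.
TEMPLATE = {
    "name": "gate-{name}-pep8",
    "builders": ["gerrit-git-prep", "tox -epep8"],
    "publishers": ["console-log"],
}

def expand(template, project):
    # Fill "{name}" placeholders anywhere a string value appears.
    def fill(value):
        if isinstance(value, str):
            return value.format(name=project)
        return [fill(v) for v in value]
    return {key: fill(value) for key, value in template.items()}

for project in ["nova", "swift", "keystone"]:
    print(expand(TEMPLATE, project)["name"])
# -> gate-nova-pep8, gate-swift-pep8, gate-keystone-pep8
```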
So once you've written a template, you can say, hey, there's this project called Nova, and it needs to run the Python jobs, it needs to run that pep8 job, it needs to run translation jobs, things like that. This is, on a small scale, basically exactly what our configuration looks like: for each of our projects, it says these are all the jobs that you need to run.

There's a really cool thing that we're doing now. Given that we're deploying something like a thousand OpenStack clouds every day for testing, they generate an enormous amount of logs, a couple of terabytes for every development cycle, and that's after we've compressed them and pruned them and things like that. This is kind of new territory, because most CI systems don't produce that kind of data as a byproduct. So we're exploring what we can do with it. We've now got the equivalent of a huge amount of production data just from running our tests. Can we mine that data and do things like automatically identify failures, either ones that we know about or possibly even ones that we haven't discovered yet? We actually have a lot of people looking into creating tools that deal with that kind of stuff, and it's pretty exciting. We basically use Elasticsearch and Logstash to drive that automation.

And we'll call that the end. These slides, as I mentioned, are available in a Git repository. They get published to this location, just like all the rest of the OpenStack docs, whenever you make a change to them. So, are there any questions?

[Question about adding new tests.] We're adding new tests all the time, and we're adding new jobs all the time too. We have a workflow around adding new jobs where we run them in an experimental stage, off in the corner, on request, and they don't do anything; they don't report anything back to developers. Then, as they become more stable, we run them silently, so that they get run more but still don't report back; we can go and look at the output and find out whether they're working or not. And finally, once we're satisfied with a new job, we start having it report back as well. Most of this stuff is all self-gating too: if you're adding a new test, that test is going to be run as part of its own merge proposal, so it's only going to land in the repository if it passes itself.

[Question: you said the system can be configured in so many ways; how many configurations do you actually test working together?] Right. Okay, so here's the slide that I skipped. Upstream, we have a couple of really popular configurations that we can test in our virtualized cloud environment. We test with MySQL and Postgres, and we test with RabbitMQ and sometimes ZeroMQ, but maybe not right now. At any rate, we pick a couple of popular things and try to make sure those work. And then we have this extensible system where anybody can run their own testing infrastructure and have it report back to our Gerrit. So if you're like, well, I need this to run on this particular, really expensive networking switch: you can set that up in your lab and have it report back in an advisory fashion, so developers can see those results.

[Question: why the Apache license?] So that was, I mean, OpenStack was basically started by Rackspace and NASA very early on, and then they got some other big companies along in the beginning. And honestly, I guess that was the license that those people were comfortable with.
There's a really interesting dynamic in our community, in that OpenStack is Apache licensed, which means, of course, that anybody can take it and stick it in proprietary products. But if you do that and you don't contribute back, the community doesn't really interact with you very much. There's very strong pressure, if you're actually doing any work on OpenStack, to contribute your changes upstream. We're trying to build that kind of collaborative environment. All right, thanks.