Hey, everybody. Thanks for coming, and thank you for coming after lunch. I'd appreciate it if you'd wait to take your afternoon nap until after this. We're going to be talking about continuous integration and OpenStack: whether it's possible, whether we should be doing it, and what's holding us back from doing it. I'm really going to come at it from both an OpenStack contributor's standpoint and as a Rackspace employee and deployer of our public cloud.

So first off, who am I and why should you listen to me? Probably you shouldn't. My name is Brian Lamar. I work on public cloud deployments at Rackspace, and I've been doing that for about the last year. What exactly that means is a little bit interesting. I've also been an OpenStack contributor since about February of 2011. I work in a lot of Ansible, Puppet, and Python, and I have a team of four or five people who all work on our public cloud deployments in those same technologies; Ansible, Puppet, and Python are our primary technologies.

So what is continuous integration? A lot of people might know the answer to this. I'm not an exact expert on continuous integration, but I'm going to try, and I want to explain how it relates to OpenStack. What we're not going to be talking about is the continuous delivery and continuous deployment side of things; a lot of times you hear CI/CD, continuous integration and continuous deployment, together. A lot of people contest all of these terms, everything from what "continuous" means, some people call it "continual", to what delivery and deployment mean. But we're not going to go into that right now. We're just going to focus on continuous integration and what that means.

To step back one more time, though: what are the benefits of continuous integration, and why are we looking at doing it? The first one is really that I want OpenStack to always work. I know that we have releases, we have milestones, and we have this whole release process. But my goal, and especially Rackspace's goal, is to be able to deploy OpenStack at any point in time, not just when the release comes, with all of those bug fixes included. To put it simply, OpenStack should always work. And if it always works, that allows for more frequent deployments.

So here's a big long list of topics and practices of continuous integration. I stole this blatantly from Martin Fowler's website. I'd recommend reading the link right there, or really anything Martin Fowler has to say, within reason. I'm going to go through these topics one by one really quickly, explain what OpenStack is doing right now that fits the process of continuous integration, and what it might be able to do to speed that along.

So the first subtitle there is maintaining a single source repository. This might not be exactly what it seems; basically, the point is: use source control. When Martin Fowler wrote this, that wasn't a given. I worked for a company about 10 years ago where we built software and didn't use source control; our deployments were literally copying files over and running a make on a Visual Studio box in production.
And so really, you should be using source control, and you should be using tools to minimize the number of branches you have to manage in it. This is covered by our whole setup with Zuul and Jenkins and review.openstack.org.

One of the things that bugs me, though, and I'll explain what I mean by this, is that constant interoperability between projects is an issue here. A while ago we broke Nova apart; Glance was originally kind of inside Nova, and then Glance became a separate project. We made the decision to have many different repositories in OpenStack, so OpenStack as a project is a bunch of sub-projects. And we end up not always having all of the tests run successfully against every master of every project.

I'll give you an example. Whenever someone creates a new Nova commit, we have to install dependencies from pip to run the unit tests. It doesn't really matter where they come from; in this case, it's pip. But we install a bunch of dependencies, and those are examples of the dependencies: things like pbr, and the last one is python-keystoneclient. So one OpenStack project, Nova, relies on another OpenStack project, python-keystoneclient. And the issue is that when we install that, we actually install the version that was cut who knows how long ago and put into a package repository somewhere. What this means is that the latest Nova doesn't actually work with the latest keystoneclient; the latest Nova works with the latest pip-packaged keystoneclient. And so it makes it really difficult to say: oh, I want to get a new feature into keystoneclient that Nova needs to use, but I don't know when keystoneclient is going to be released to pip. It's just something that I would like to fix. I have talked to some people, and I'm hoping to talk to the right person to get it worked out. But I think it's an upstream issue, and one I would be more than willing to help out with. So that dependency should really come from git, not from pip; something like the sketch below.
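Just to make that concrete, here's a minimal sketch of the difference in a requirements file. The version pin is made up for illustration; the git URL is the upstream repository, and pulling from master is exactly what I'm suggesting:

    # What Nova effectively gets today: whatever was last cut to PyPI.
    python-keystoneclient>=0.7.0      # hypothetical pin, for illustration

    # What continuous integration between projects would want:
    # the client installed straight from the project's master branch.
    -e git+https://github.com/openstack/python-keystoneclient.git#egg=python-keystoneclient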
So the next tenet of continuous integration is automating the build. This is basically done by DevStack, and it's great; it really works. We also have third-party voting bots, which allow other companies to vote on changes that are proposed upstream. In my opinion, we're very successfully automating the build of OpenStack.

Next, making your build self-testing. Basically, this means you should have tests, and we do have tests: we have unit tests, we have integration tests, we have the Tempest test suite. But we don't really have any hard requirements on actually including tests. This has been a big thing we've been debating over the years; some people want to say that test coverage should never go down when you include a change. It's something we've been debating back and forth, and I think the lack of a requirement hurts OpenStack's ability to do continuous integration. But it's also understandable that certain changes, subjectively and objectively, cannot have or should not have tests.

Then there's a big point that Martin Fowler makes about continuous integration: people should be committing to mainline every day. That basically can't be enforced, because I'm not your boss, probably. I can't make developers keep a change small and make sure that, before they go home that day, that commit is at least up for review. But smaller reviews are very easy to test, they're very easy to review, and they're less likely to include bugs. However, there's no real consensus on what a small patch is. What does that mean? It's really a subjective thing.

Next: every commit should build the mainline on an integration machine. We do this, with what I talked about before: review.openstack.org, and Gerrit, and Jenkins, and Zuul, and Nodepool, and all of those great things that the infra team has created.

Keeping the build fast: this is one we've struggled with. Project test suites are still slow, in my opinion. It shouldn't take 10 minutes to run unit tests; sometimes it takes half an hour. For us, when we first started with Neutron, Neutron was taking two hours and required over four gigs of memory to run its tests. That's not fast. But on the other hand, with such a large project, I'm really not convinced this actually matters. Reviews need to be up for at least eight hours, preferably a day, so that everybody gets a few hours in their time zone to review patches. We're never going to be a fast-moving project in the sense of typical in-house software development, so I'm not convinced build speed really matters here.

Next, test in a clone of the production environment. You really want your project to be tested as close to production as possible in your pre-production environments, so that when you do put your code into the production environment, you know it's going to run successfully and without issues. This includes integrating with things like your in-house billing system or your in-house authentication system or whatever. This is really handled by the third-party bots I alluded to earlier, in that we're actually getting really good at letting companies say: this is the deployment I have for OpenStack; on each upstream commit, I'm going to take that commit, do whatever I need with it, deploy it to my environment, run some of my tests or some of the upstream Tempest tests, and then vote on that change to say whether it worked or not. So it's really not up to the OpenStack infrastructure guys; they're allowing us to say, this is what my environment needs to look like, and they put the onus on us, which is where it belongs. A rough sketch of what one of those bots does follows.
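This is a minimal sketch, not anyone's real bot: it listens to Gerrit's SSH event stream and, for each new patchset, runs a site-specific deploy-and-test command and votes on the result. gerrit stream-events and gerrit review are real Gerrit commands; the account name, the Verified vote permission, and deploy_and_test.sh are assumptions, and a real bot would need queueing, retries, and cleanup around this.

    import json
    import subprocess

    # SSH connection details are placeholders for your CI account.
    GERRIT = ["ssh", "-p", "29418", "mybot@review.openstack.org", "gerrit"]

    def vote(change, patchset, ok):
        # Leave a +1/-1 Verified vote via the 'gerrit review' command.
        score = "+1" if ok else "-1"
        subprocess.check_call(GERRIT + ["review", "--verified", score,
                                        "%s,%s" % (change, patchset)])

    # 'gerrit stream-events' emits one JSON event per line, live.
    stream = subprocess.Popen(GERRIT + ["stream-events"],
                              stdout=subprocess.PIPE)
    for line in iter(stream.stdout.readline, b""):
        event = json.loads(line.decode())
        if event.get("type") != "patchset-created":
            continue
        ref = event["patchSet"]["ref"]      # e.g. refs/changes/NN/NNNNN/P
        # deploy_and_test.sh stands in for "deploy this ref to my
        # environment and run my tests"; its exit status decides the vote.
        ok = subprocess.call(["./deploy_and_test.sh", ref]) == 0
        vote(event["change"]["number"], event["patchSet"]["number"], ok)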
Make it easy for anyone to get the latest executable. This is somewhere I think OpenStack can really improve. "Executable" is kind of an interesting word to use, but when Martin Fowler describes a project, he says the code and the configuration should all be in the repository, so that a deploy is really just one command: you double-click something or run a simple command and say, deploy. We don't have any sort of configuration or deploy scripts in OpenStack, and there's no real executable in Python. And there aren't any upstream packages; they just don't exist, as far as I know. I could be completely wrong on this. Commits don't generate any debs anywhere or any tarballs anywhere that you can easily consume. Actually, that's not entirely true, because the clients, python-glanceclient, python-keystoneclient and the rest, are all on PyPI, because they're used in the builds of all the other projects. But there's no consistent release cadence; I don't know when keystoneclient gets updated or when glanceclient gets updated. It's not consistent. I'd like to either have that and have it be consistent, or not have it at all.

Next, everyone can see what's happening. It's really important in continuous integration that a developer knows what's happening with a change, that a product manager knows what's happening with a change. We have a great gate; I really like it. We have the ability to take third-party input. We have a lot of great things the infra team has provided for us. I was just in a session yesterday, I think, or the day before, where the infrastructure team described what a third-party bot is, what the requirements are for running one, and how they're making that really easy for people to have. One thing that is happening, though, is information overload on projects like Neutron that have, I don't know, 60 bots or whatever, all commenting on one review; then someone pushes a new patch set and they all comment again. I think we're working on that with new tooling that can really help the workflow and the visibility of where my change is and what I need to do to get it in.

The next one is automated deployment, and there is really no official method of deploying OpenStack. With continuous integration, it's important to deploy. But it blurs the line between continuous integration and continuous deployment or continuous delivery: what is a deployment? That's actually a great question. No idea. Not only have we never been able to define what a deployment is in the OpenStack community, it's actually difficult to even agree on what tools you would use. And beyond that, it wouldn't really be useful to companies, because the deployment that upstream agreed on wouldn't actually be the same deployment we might use. That's really why we've left automated deployment to the third-party deploy bots: you can deploy in your environment, with your own custom deploy, and then comment on the upstream commit to say whether or not that commit works for your environment.

So what is Rackspace doing now? We're doing daily pulls of upstream code. We pull it down and merge our custom code repository into the upstream code, using an open-source tool called applypatch, written in-house by Rick Harris. And it does work, but it means we're carrying a lot of custom code, a lot of custom patches, which I'll show you later. That might not be so good. With that code plus our custom patches, we build a custom virtual environment, run unit tests, build a tarball package, deploy that tarball, and run more tests. We do a promotion process from one of our continuous integration environments into our pre-production environment, run more tests, and then do staggered production deployments across all of our production regions. And that sounds simple. That's actually one of the flowcharts of how everything works; the build half of it boils down to something like the sketch below.
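A minimal sketch of that daily build, with plain git am standing in for applypatch, made-up paths, and a placeholder test command; the real pipeline has a lot more error handling and promotion logic around it:

    import datetime
    import glob
    import os
    import subprocess

    def run(cmd, cwd=None):
        print("+ " + " ".join(cmd))
        subprocess.check_call(cmd, cwd=cwd)

    # 1. Daily pull of upstream code.
    run(["git", "clone", "https://github.com/openstack/nova.git", "nova"])

    # 2. Merge our custom patches in ('git am' is a stand-in for
    #    applypatch); conflicts here are where the pain starts.
    patches = sorted(os.path.abspath(p)
                     for p in glob.glob("patches/nova/*.patch"))
    run(["git", "am"] + patches, cwd="nova")

    # 3. Build a self-contained virtualenv with the code plus dependencies.
    run(["virtualenv", "venv"], cwd="nova")
    run(["venv/bin/pip", "install", "."], cwd="nova")

    # 4. Run unit tests against exactly what we're about to ship
    #    (the runner here is a placeholder for the project's own).
    run(["venv/bin/python", "-m", "unittest", "discover", "nova/tests"],
        cwd="nova")

    # 5. Tar it up; this artifact is what gets promoted from CI to
    #    pre-production to staggered production deploys.
    stamp = datetime.date.today().isoformat()
    run(["tar", "czf", "nova-%s.tar.gz" % stamp, "nova/venv"])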
And then there's this whole process for what happens when the upstream code we merge conflicts with our patches, and we have to resolve it by hand. So overall, it's a really simple process. Really simple. No: it's the unnecessary complexity of carrying our own patches that makes things the most complicated.

It's taking us about 21 days to choose a release branch, meaning that once we bring OpenStack down that day, we start our 21-day process, and we start looking at which bugs our code introduced versus which ones upstream might have introduced and didn't catch, because we're using some weird configuration variable that they didn't test or that we didn't think to test. After that, it takes about 45 days for us to deploy everywhere before we start looking at the OpenStack code again. So by the time we actually start looking at the OpenStack code again, we're about eight weeks behind, and then we start the whole process over. And it compounds on itself: because we're eight weeks behind, it's not easy to really get into it again.

So here are our patches. We have about 80 patches; the green slice is Nova, where we have probably over 40 patches, and on the right is about 35,000 lines of code. It's a lot to maintain. And we have a couple of environments we deploy to, 40,000 things we deploy to. So it's a lot of code, it's overhead, and it's a lot of things we're deploying to. I really want to make this as simple as possible, because I'm going crazy.

I get asked this question a lot: what's taking so long? Why can't you just take all of these custom things you want included with OpenStack and make them into a package that gets deployed? I'd say at least 90% of my time is identifying and fixing issues with our code once it's been integrated with upstream code: either it doesn't merge cleanly before we deploy, or after we merge it, it introduces issues. I work with our Jenkins infrastructure, which has nothing to do with the upstream Jenkins infrastructure. Our packaging process is completely separate. Our deployment process is completely separate from upstream. And then there are tons of integration issues with, as I said, our billing systems and our auth systems. There's tons of stuff we have to worry about. It takes a long time, and we can improve, I promise.

So the question is how? The first three bullet points are actually the same: we need to stop carrying our own patches, because as a company, Rackspace really, really, really needs to work with the community and not by itself on its own patches. We just can't keep doing it. Besides getting rid of our patches, and I really want to stress that that is a big thing, I'd like to look at doing packaging upstream, so that each commit is deployable by anybody who wants to pull it at any point in time; OpenStack should actually be able to be downloaded and run successfully at any commit. The pieces for that mostly exist already, as the sketch below shows.
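Every OpenStack project's setup.py already looks roughly like this, and pbr derives a unique dev version from git for each commit, so building a consumable tarball per commit is in principle one command; the missing piece is publishing those artifacts somewhere. The exact version string in the comment is illustrative.

    # setup.py -- the shape every OpenStack project already has;
    # pbr pulls the version and changelog out of git metadata.
    import setuptools

    setuptools.setup(
        setup_requires=['pbr'],
        pbr=True,
    )

    # Running 'python setup.py sdist' at any commit then yields a
    # tarball with a per-commit dev version, something like
    # nova-2014.2.dev123.g1a2b3c4.tar.gz, ready to publish.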
I don't know how well this would go over, but I'd also like to explore different deployment methods upstream, maybe creating some sort of suggested or reference deploy, though I know that can be controversial.

So why do we have custom patches? They speed up development time for us, or we think they do. But there's a huge cost to people like me and my team, because developers will go out and write code and say, oh, it doesn't fit the OpenStack model, upstream isn't going to take it, so let's just put it in our own repository. At that point we have to say: if they're not going to take it, there's probably a reason. We need to get it in; it might take a little longer at first, but it can work.

Then there's "we have proprietary features." We don't really have proprietary features, and we shouldn't; if we're going to work with OpenStack, we really shouldn't. There is proprietary integration, things like your billing systems and your auth systems, but we can really allow for that with things like plugins and configuration options.

Real quickly, there's a nice billing system story here, which I'll try to tell briefly. People may not know that Nova emits notifications for usage, which billing systems consume so that we know how much to charge our customers. We had a date string in that notification, which our billing system had hardcoded, and then we changed it in upstream OpenStack. You wouldn't think a date string would be that difficult to change in the billing system once it started getting the new string, but we actually carried a custom patch for that for over a year, because they couldn't change it; billing systems are difficult, evidently. So I understand why people carry custom patches, but wouldn't the real answer be making sure OpenStack was flexible about what date string it sends? Then we could have a configuration option to say: this is the date format we want. There are different ways of doing things other than creating custom patches and creating overhead for other teams; here's a minimal sketch of what I mean.
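This is a sketch of flexibility through configuration rather than a custom patch. The option name is hypothetical, not a real Nova option; oslo.config is the real OpenStack configuration library.

    # A hypothetical notification timestamp option -- not a real Nova option.
    import datetime

    from oslo.config import cfg  # OpenStack's configuration library

    opts = [
        cfg.StrOpt('notification_timestamp_format',
                   default='%Y-%m-%dT%H:%M:%S.%f',
                   help='strftime() format for timestamps in usage '
                        'notifications, so a deployer whose billing '
                        'system hardcodes a format can keep it stable.'),
    ]

    CONF = cfg.CONF
    CONF.register_opts(opts)

    def usage_notification(instance_id):
        # Build a usage notification with a deployer-controlled timestamp.
        now = datetime.datetime.utcnow()
        return {
            'instance_id': instance_id,
            'timestamp': now.strftime(CONF.notification_timestamp_format),
        }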
And yes, developers, I've talked to a couple of people on the teams, and it is going to be more difficult to get your code into OpenStack than to take the easy route and put a custom patch in our custom proprietary repository. You know what? I don't care. I don't care.

So going forward, long story short: I want us to stop carrying patches. I want us to encourage pluggable code. I want to make sure we're commenting on Gerrit with the results of our deploys; I want Rackspace public cloud, as soon as someone makes a commit, to spin up as production-like an instance as possible and make a comment saying, that doesn't work with our public cloud, and here's why, and here's how to fix it, giving that feedback to the developer so that we can all work better and faster. For OpenStack, I want to talk about official OpenStack deployments, what we can do with packaging, and the general project interoperability I talked about before: making sure the latest version of Nova works with the latest version of all the clients, the latest version of Glance works with the latest version of Nova, and so on. I really want to strongly encourage smaller commits and make sure reviewers know that smaller commits are better, and they are, I promise; I will talk to you about that all you want. And definitely keep encouraging tests, because that's one of the things that really does make things better.

So, a couple of fun reads. There are two books: Continuous Delivery by Jez Humble, well, Jez Humble and David Farley. Jez is a great guy; I've met him, and I really do recommend reading that book if you have the chance. The Phoenix Project is another good one. It's not necessarily exactly about continuous integration, but it's really about the whole DevOps story of IT integration and collaboration. So that's really all I have. Any questions?

On the flowchart where you showed your process, there's a slide before that with bullets, and one of those bullets said, I think, "build custom virtual environments." Are you referring to Python virtual environments, or something else?

Yes. Yes, I am.

So are you actually building a separate virtual environment for basically every package?

Yeah. Every package we create is actually a tarball with a virtual environment in it, per OpenStack project. So Nova has a directory in there, Glance has a directory, Neutron has a directory, and so on, and they're each separate virtual environments, which we pip install all the dependencies into and then install the project into.

So then do you ultimately turn those into RPMs to actually deploy them?

We actually use BitTorrent, and we torrent those around to all of our nodes.

You torrent the tarballs?

We torrent the tarballs. Yep. Also included in that tarball are all of our Puppet manifests. We use Ansible to drive BitTorrent and pull the tarballs down to all of our nodes, and Ansible unpacks them. We have a symlink on all of our servers which points to the current version: change the symlink, run the Puppet manifests, which restart all of the services. That's really the entire deployment process: the tarballs, the symlinks, and Puppet. The symlink flip at the heart of it is sketched below.
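For the curious, the symlink flip is a tiny, old trick; a sketch, with made-up paths: build the new symlink next to the old one and rename it into place, so the "current" pointer changes atomically and services never see a half-deployed tree.

    import os

    def activate_release(release_dir, current_link='/opt/openstack/current'):
        # Atomically point the 'current' symlink at a new release.
        # os.rename() over an existing path is atomic on POSIX, so
        # there's no window where 'current' is missing or half-written.
        tmp_link = current_link + '.new'
        if os.path.lexists(tmp_link):
            os.remove(tmp_link)
        os.symlink(release_dir, tmp_link)
        os.rename(tmp_link, current_link)

    # e.g. after Ansible has unpacked the torrented tarball:
    # activate_release('/opt/openstack/releases/nova-20140515')
    # ...then run the Puppet manifests to restart services.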
So then for your regular integration tests, you're basically dynamically building out a full OpenStack install using that same deploy process?

Yeah. We have what we call Cloud on Cloud: a Nova installation on which we can spin up Nova API servers and Glance servers and Neutron servers and everything. We have Ansible scripts to create those in our undercloud, and then Ansible deploys all the code, configures all of the services, and basically builds everything from the ground up. The only difficult part is things like hypervisors, where we don't have a great way of reinstalling them every time for our continuous integration environment. So we make the best possible effort to clean those up and make them as clean and fresh as possible when we do start over and reconfigure for continuous integration; it's a best-effort kind of thing. Thanks.

Yeah?

One thing I want to add, I also work on the same team as Brian, is that this is the summit where those of us who are the big operator-deployers are finally at the point where we're all solving the same problem. We've already made some really amazing contacts with people at DreamHost, at HP, some at eBay as well, where we realized we're all solving that same problem, and even some smaller deployers who are just getting started; I met some great guys from iWeb today at lunch. So if you're interested in helping solve the problem and comparing notes and approaches, please do let us know and join that conversation. It's not going to happen overnight, and we'll probably still be talking about it at the K summit. But do know that we will solve it, and more people would be helpful.

Indeed. Any other questions?

One of the other questions that maybe somebody hasn't asked is: what's in the patches? We've talked about proprietary integration. We have a lot of networking patches in Nova, which is one of the reasons we were willing to take on the extreme experiment of deploying Neutron, to help alleviate some of that; a lot of that Neutron code is related to the network functionality. There isn't anything in the patches, except for maybe that usage hack that nobody really wants, that couldn't be out in the open. In general, it's the technical debt of how long we've been doing this and how long we've been trying it. This isn't proprietary features; this isn't stuff that can't be part of the community. It's technical debt that we've taken on and haven't been able to clean up. And now we're taking a stand: clean it up, and don't create more.

Yeah, it's actually a great point. I view these patches, and anybody who is creating patches, as basically creating technical debt for yourself. Don't. I really, really suggest you do not do that. It's probably not necessary; actually, I know it's not necessary. Keep it to your configurations, your Puppet facts, your custom plugins. Don't try to hack the OpenStack code without being part of the community. It's really not worth it.

So, you can wrap it up. Yeah, no, that's it. So thanks, guys, for coming.