Alright. Hello, good morning. Thank you for joining us to talk about leveraging CI/CD to improve OpenStack operations. My name is Maria Bracho. I'm Product Manager for OpenStack at Red Hat. I'm Dan Shepard. I'm the Product Manager for Rackspace Private Cloud, powered by Red Hat. Alright. Thank you for joining us right after keynotes. We're excited to be here, excited to have you here. We're two companies working together the open source way, alongside the developers and all of you who make OpenStack better. So we're going to talk about bringing OpenStack all the way from upstream to the customer in a continuous way.

So keeping track of OpenStack changes is a little bit like counting stars. There are plenty of developers here at OpenStack Summit, and plenty of developers committing right now. And I bet some of you are thinking, how can I keep up with all this change? And will my API still be there tomorrow? All good questions, and very difficult to keep track of. Testing OpenStack on your own is very hard, and it's probably a futile effort, because it's really hard to keep up with the pace of change. So it comes down to a joint way of developing OpenStack and testing OpenStack. And this is what we came here to talk about with you today: working with upstream OpenStack, with Red Hat OpenStack Platform, and with you, our customers, our partners, and the ecosystem at large.

So what we do in our development process is have our delivery team commit changes upstream, and then those changes go into version control. Once they're in, that triggers a build and unit testing of those changes. If they pass, then we move them through. If they fail, they go back to delivery. We fix them, we test them again, move them up, and once that all passes, it goes into Red Hat's QE environment. There, quality engineers test it across multiple deployments and deployment architectures. If it fails, it goes back to development, and the same thing happens: changes are made, we improve the changes and the patches that we submit. And once it passes QE, the idea is that we then distribute it out to our partners. And that is before we go and announce that this OpenStack version is GA.

And why are we doing this? We have partners that are eager and waiting to test every little change that we make to OpenStack before we announce a new release. The reason is they want to make sure that they know all the changes going in; they can test against their own environment and they can certify faster. For us, we want to make sure that we're testing OpenStack in the use cases that we care about the most, which is really our customers and our partners. Our QE team tests hundreds and hundreds of different architectures and different deployment scenarios, but we can't really cover the one scenario that each customer actually runs. And that's why we partner with folks like Rackspace, to be able to deliver pre-release deployments of OpenStack to them. And then once that is ready, we're ready to announce a release. We know that the release we announce has really been tested not just in-house, but also with our larger ecosystem. And this is what we're calling distributed continuous integration: our ability to deliver pre-release versions of OpenStack to our larger ecosystem of partners and customers and be able to receive feedback from them before we actually go and release.
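As an aside, the gated flow described here (commit, build and unit test, QE across deployment architectures, then pre-release distribution to partners) can be sketched as a simple stage pipeline. This is a minimal illustration only; the stage names and stub functions are hypothetical placeholders, not Red Hat's actual build tooling.

```python
# Illustrative sketch of a stage-gated promotion flow, assuming hypothetical
# stage functions. A change only advances when the previous gate passes.

def build_and_unit_test(change: str) -> bool:
    # Placeholder: trigger the build and unit tests for one change.
    return True

def qe_deployment_matrix(change: str) -> bool:
    # Placeholder: deploy the candidate across several architectures and test it.
    return True

def distribute_to_partners(change: str) -> bool:
    # Placeholder: publish the pre-release bits to partner DCI repositories.
    return True

def promote(change: str) -> bool:
    """Walk a change through each gate; send it back to delivery on the first failure."""
    for name, gate in [
        ("build + unit tests", build_and_unit_test),
        ("QE deployment matrix", qe_deployment_matrix),
        ("partner pre-release distribution", distribute_to_partners),
    ]:
        if not gate(change):
            print(f"{change}: failed at '{name}', returning to the delivery team")
            return False
    print(f"{change}: passed every gate, candidate for GA")
    return True

if __name__ == "__main__":
    promote("example-change")  # hypothetical change identifier
```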
What DCI is trying to do is really change the way that you interact with us, Red Hat, and with OpenStack in general. We have seen that happen in our relationships with our partners. It doesn't really change the work that we do, because we've been working together to build OpenStack anyway, but it really changes the way that our conversations go. Before, it was: did you deploy it? Did it work? Did it not work? How did you test it? Where did you see the failure? Can you help me report a bug? Now we elevate that discussion to: oh, here are the logs, this is my DCI deployment, and this is where it failed. So we really elevate the conversation to direct troubleshooting and not fact finding.

So like I said, the latest OpenStack builds get shipped over to our partner sites and our customer sites. They test in their own environments, which we have no control over. And it really speeds up the time for certification. So if you're a partner and you're trying to certify a specific plugin on Red Hat OpenStack Platform, it would behoove you to try DCI, because it allows you to see the changes faster. You're testing hundreds of times more than what you would originally test just to prepare for certification. The idea is that you're continuously testing and then you're continuously certifying your solution. It also simplifies the upgrade process because, as you know, upstream has a six-month release cycle, and from release to release a lot of things change. A lot of projects get added; APIs may or may not still be there. And you don't want to find out what happened after the release. So think about it: if today you're testing OpenStack, so Newton, as it was just released, and you don't test a new version of OpenStack until Ocata is released six months from now, all those changes that are being discussed right here at Summit, that have been discussed at mid-cycles and will be discussed at the next mid-cycle, you're going to have no idea how that looks in the code or how that looks on your infrastructure if you're not continuously testing it. It's a shock factor and a surprise factor to test after the fact. What we're trying to say is you should test a lot sooner, you should test continuously, and you should automate your tests. DCI also provides automated feedback to Red Hat on the tests that happened in our partners' environments, and that's awesome.

So here's what DCI looks like today. Let's just say that from the line, the left-hand side of the screen is the Red Hat environment, with upstream all the way to the left, and this side is the Rackspace environment. So like I mentioned before, we consume code from upstream, from the OpenStack Foundation. You'll notice I just changed it to the new logo, catching up right after the keynote. We port it from upstream and we create what we call our downstream repos. Whenever we think that we have a release or a pre-release of OpenStack kind of ready to go, we put it in a DCI repository. And then, after installing a DCI agent on the Red Hat side, on the Rackspace side, sorry, we've been working so much together that it's interchangeable at this point. After installing a DCI agent in the Rackspace environment, this agent picks up every new release and all these new changes, and then orchestrates an automated deployment of an OpenStack cloud using Red Hat OpenStack Platform director, including both the undercloud, the director node, and the overcloud with the infra nodes.
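Conceptually, the agent's job as described is a loop: pick up a new pre-release build from the DCI repository, drive a director-based deployment, run the tests, and report back. The sketch below is purely illustrative; the function names, polling interval, and result shapes are assumptions, not the real DCI agent's interface.

```python
# Minimal sketch of a DCI-agent-style loop, under the assumptions stated above.
import time

def fetch_new_build():
    # Placeholder: ask the DCI repository whether a newer pre-release exists.
    return None  # e.g. {"id": "pre-release-2016-10-25.1"}

def deploy_with_director(build) -> bool:
    # Placeholder: orchestrate the undercloud/overcloud deployment for this build.
    return True

def run_test_battery(build) -> dict:
    # Placeholder: Tempest, Rally, Browbeat, certification and custom tests.
    return {"passed": 1595, "failed": 5}

def report_back(build, results) -> None:
    # Placeholder: push logs and results back to the DCI repository/UI.
    print(build["id"], results)

def agent_loop(poll_seconds: int = 3600) -> None:
    while True:
        build = fetch_new_build()
        if build is not None:
            if deploy_with_director(build):
                report_back(build, run_test_battery(build))
            else:
                report_back(build, {"deployment": "failed"})
        time.sleep(poll_seconds)
```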
We also orchestrate that, and we run a battery of tests that can include things like Tempest, if you're familiar with that testing framework for OpenStack, and Rally as well, another OpenStack project. We run Browbeat, an excellent performance testing framework that folks from Red Hat also work on. And we run a battery of tests that form the Red Hat certification test suite, and any other partner- or customer-specific tests that you want to run can be run with DCI. After all these tests are done, the results get reported back into the DCI repository so that we can have access to the logs. DCI also presents them in a user interface where testers and developers can see what happened to this pre-release version of OpenStack that we just pushed out: what failed, when it failed, what those failures look like, why it failed or why it passed. And it allows you to test multiple different versions of OpenStack. So right now, well, I'm not going to spoil your part, you can tell us which ones you're running, but you can run multiple versions.

At the beginning we said keeping up with OpenStack is kind of like trying to count stars. It's hard, and the more you count, the more they show up, and it really depends on what kind of instrument you're using to count those stars. What we're trying to create here is not to remove that, because that complexity is going to be there, but to create something beautiful that we can do together. And with that, I'll leave it to Dan to tell you a little bit about Rackspace.

So that brings us to Rackspace Private Cloud powered by Red Hat, a product we launched back in February that leverages Red Hat OpenStack Platform: we take the Red Hat distribution, we wrap Fanatical Support around it, and then we deliver it to customers as a managed service. Anyone who's familiar with Rackspace knows that Fanatical Support is really what we're about: taking code from upstream, or from Red Hat's distribution, applying our practices around it, and giving customers something that is truly differentiated and production-ready.

So before DCI, we would build our cloud for our customers, we would test every version as it was released from Red Hat, and then we would run these 1,600 Tempest test cases manually. So someone had to log in and click the button. Well, someone had to build the environment, then log in, click the button, watch the run happen, get the results, figure out what failed, why it failed, and go through all of that to figure out if we needed to raise a request with Red Hat, or if it was something in our configuration, or where that problem really existed. Every time we got a new version from Red Hat, we had to re-kick the environment, leverage a bunch of scripts to get everything stood back up, and it took us literally weeks of testing from the time Red Hat would announce the GA of OpenStack Platform to when Rackspace could announce its GA. And once we got done with that, we had to go back and do the same set of testing all over again for upgrades. With DCI, we were able to take those 1,600 test cases and run them automatically. They run daily against a build that is automatically configured from DCI, and anytime there's a failure, it automatically creates a log file, raises the request, and sends it over to Red Hat so that their QA team can get involved and start checking it out.
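To make the reporting step concrete, here is a hedged sketch of aggregating results from several suites (Tempest, Rally, Browbeat, certification, partner-specific) into one summary that could be pushed back for the DCI UI. The runner callables and the payload shape are assumptions for illustration only.

```python
# Sketch: run each suite, keep going on failures, collect a per-suite summary.
from typing import Callable, Dict

def run_suites(suites: Dict[str, Callable[[], Dict]]) -> Dict:
    summary = {}
    for name, run in suites.items():
        try:
            summary[name] = run()            # e.g. {"passed": 120, "failed": 3, "log": "..."}
        except Exception as exc:             # a crashed runner is still a reportable result
            summary[name] = {"error": str(exc)}
    return summary

def fake_tempest() -> Dict:
    # Placeholder standing in for a real Tempest invocation.
    return {"passed": 1595, "failed": 5, "log": "tempest.log"}

if __name__ == "__main__":
    print(run_suites({"tempest": fake_tempest}))
```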
With that, we were able to go from about a month between a GA announcement from Red Hat and a GA announcement from Rackspace down to right about two weeks. In fact, with OSP 9, I think it was 12 days, so we were very much able to shorten our timeline just because we have access to an automated testing tool.

So what does testing look like at Rackspace? 1,600 Tempest tests, which you can see over here and what they look like as they run through. Our test cases right now are all upstream. We're currently developing some Rackspace-specific and customer-specific test cases, but everything that we're running right this minute is upstream test cases. Some of the things we're working on developing additional test cases around are really more for our monitoring agents and some of the custom use cases that our customers have for their very specific workloads, so that we can further validate and make sure that those key scenarios that customers are providing to us are going to work every time we get a new build from Red Hat.

So why do we do all that? It really comes down to: all clouds are unique. OpenStack means custom. We're all here because we all have some sort of private cloud that we've probably customized in more ways than we want to admit. We have hardware variations, we have networking changes, everyone has a different use case. So with all of that, you really end up with complicated deployments. Even at Rackspace, where we standardize, use reference architectures, and build based off of our own best practices, all of our customers end up being unique at some point. Whether it's in their data center, whether it's on some hardware stack that we're not running anywhere else for any other customer, everyone ends up with some unique thing. So having the ability to test that unique thing in an automated fashion is really important for enabling us to accelerate adoption for our customers and keep moving them to the latest and greatest OpenStack.

In addition to that, as Maria indicated, we have a lot of collaboration between Rackspace and Red Hat. Over the last 18 months, we've connected developers, we've connected QA resources, we've connected sales teams, really everyone, to sit down and say: how can we make this better? How do we do things better? Building on that, DCI comes out, and we start using Rackspace hardware in the Rackspace data center, but Red Hat's managing the deploys for us through the DCI tool; we've given them the hardware, they're running the deployment scripts. We come in with our test cases, those result in Bugzillas for Red Hat to go squash, and we don't even have to pick up the phone and call them, we don't have to create log files, it's just there. We can just track, hey, these bugs are moving forward, all well before a GA happens from Red Hat. So the end result really comes down to Rackspace testing, Red Hat patching, and then Rackspace verifying what has been patched, all before a customer ever even knew there was a chance of having a problem in the next release.

Sorry, my turn. In a nutshell, what we're trying to tell you is: testing OpenStack manually is a bad idea. Automating your deployment of OpenStack is a good idea. Automating your tests of OpenStack is a good idea, and not only is it good, it's going to save you a lot of time. Even though it's time-consuming to actually set it up, it's going to save you a lot of time in the future, and it's going to help you keep up with the pace of OpenStack.
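The "failure automatically raises a request" step Dan describes can be pictured as a small post-processing pass over the nightly results. The sketch below is illustrative only; file_issue is a hypothetical placeholder, not a real Bugzilla or support API call.

```python
# Sketch: file one tracking issue per newly failing test from a nightly run.

def file_issue(test_name: str, log_url: str) -> None:
    # Placeholder for whatever ticketing/bug-tracking integration is in use.
    print(f"raised issue for {test_name}, logs at {log_url}")

def raise_requests(results: dict, known_failures: set) -> set:
    """File an issue for each new failure; return the updated known-failure set."""
    new_failures = set(results["failed_tests"]) - known_failures
    for test in sorted(new_failures):
        file_issue(test, results["log_url"])
    return known_failures | new_failures

if __name__ == "__main__":
    nightly = {"failed_tests": ["tempest.api.compute.test_example"],  # hypothetical
               "log_url": "http://logs.example/run1"}
    raise_requests(nightly, known_failures=set())
```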
And as Dan mentioned, it's not only that we're collaborating with Dan and Rackspace to make OpenStack better; we also have a whole list of other partners that we're doing the same thing with. So who benefits? It's really OpenStack. Every time we test with a partner and we find issues before our GA release, we're able to make those changes sooner in the release cycle, and sometimes those changes actually get merged in the release cycle that we're testing in, which is fantastic for our customers, for our partners, but also for OpenStack in general.

I mean, we all talk about it quite a bit with software developers, right, anyone who's working directly with developers in your organization. Developers are failing fast, they're doing test-driven development, trying to test early and fail often. This is applying that same method and mentality to your infrastructure. So the option of seeing the next build of OpenStack, the daily build coming from Red Hat's distribution, gives you this idea of: hey, I'm going to see that, I'm going to address it before I get it in front of developers, before I get it downstream to my customers. You get to see it every day, you get the peace of mind of knowing whether things are failing or not failing, and hopefully you start off with a lot of failures early in the release, and by the end of the release, right before GA happens, nothing's left failing. Getting to that point earlier is very interesting for the engineers working on OpenStack and for your downstream customers. Yeah, and also the way that Red Hat works is obviously upstream first for us, and what we're trying to do is get to a release of a new upstream-based OpenStack distribution a lot faster. We want to test as much as we can, like I said, in the use cases that we care about the most, which is our customers and our partners, and we want to bring those customers and partners closer to upstream, to start working on the latest and greatest as soon as possible.

So we talked about distributed continuous integration; in a nutshell, we ship these packages to customers and partners before they're GA-ready. So what is the next step? It's enabling our customers and partners to build OpenStack the way OpenStack Infra builds OpenStack. Enter Software Factory: host your own OpenStack infra. If some of you are OpenStack contributors, you're familiar with how OpenStack Infra works, you've probably heard about Gerrit and Zuul and others, and you know that the gating process of building and testing OpenStack is what drives this wonderful code. So what we're saying is we're packaging an OpenStack Infra-like environment for your own data center, so that you can also build and test your own changes and patches. If you look at this graph, it's pretty similar to what I showed you before, where we send you this Red Hat pre-release version of OpenStack, but now we also give you the ability to build your own changes to OpenStack using Software Factory and then deliver those to DCI, so DCI can also run the deployment of that change that you made, and you can leverage the same tests that we're running right now.
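A minimal sketch of the gating idea, assuming a Gerrit/Zuul-style flow like the one Software Factory provides: a change is only merged when the DCI run triggered for it reports success. The functions here are hypothetical stand-ins, not the Software Factory or Gerrit APIs.

```python
# Sketch: merge a change only if its DCI gate run passed (all names hypothetical).

def dci_job_passed(change_id: str) -> bool:
    # Placeholder: query DCI for the result of the deployment + test run
    # that was triggered for this change.
    return True

def gate_and_merge(change_id: str, merge) -> bool:
    """Merge the change only when the DCI gate is green."""
    if dci_job_passed(change_id):
        merge(change_id)
        return True
    print(f"{change_id}: DCI gate failed, not merging")
    return False

if __name__ == "__main__":
    gate_and_merge("example-change-id", merge=lambda c: print(f"merged {c}"))
```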
And you can do that interchangeably, or intermittently: one run coming from OpenStack, another run bringing in your changes, maybe one of them running Red Hat OpenStack Platform version 8, or 9, or 10, and then testing your own changes on 8 or 9 or 10 if you so wish, and seeing how those behave. And this is the UI of Software Factory. If it looks a lot like Gerrit, that's because it is. It also leverages and integrates all these other tools that help you build your own patches or changes, to OpenStack or to your own distribution.

So then, what now? We want to enable the customer directly. We want to let customers handle their configuration as code: give the customer a Gerrit infrastructure so that they can build their code that way, test it out that way, and use DCI as a gate, making sure a change passes that CI gate before it gets committed and merged. And at the same time, from a DCI perspective, use DCI to provide continuous delivery of those changes, testing your workloads and leveraging the test cases that you were already testing with, whether they were your own tests or some that were provided by Red Hat.

So what's in it for me? What's in it for you? What's in it for all of us? How do customers and partners benefit? You can track and test all of your changes and validate them: it can be your configuration, it can be new code, it can be simulated workloads, whatever you're doing. And you can integrate with tools like Browbeat, which I mentioned before, to do even more rigorous testing and benchmarking of your changes, seeing how your OpenStack cloud behaves before the change, after the change, and again a certain amount of time after that change, and really just give you more data to work with and make decisions with.

And how does OpenStack in general benefit? For the OpenStack Platform 9 release, which was Mitaka-based, and for the OpenStack Platform 10 release, which is going to be Newton-based, we were giving feedback from customers and partners that we had never given before, mainly because that testing used to happen manually, that feedback happened a lot slower, and we just didn't have a way to automate all these changes or to test as many times as possible, because testing OpenStack manually takes time. I remember the first time that we had DCI up with Rackspace: in a weekend, we basically got more data and more test results than Rackspace had gotten in months. We went from executing maybe a test run every couple of weeks, just because of resourcing and the time needed from an engineer working on it, to getting two to three in a day. So all of a sudden this feedback cycle started happening, and we were giving poor people like Gonéri, who's sitting in the front row here, just piles of data that would have taken us weeks to come up with on the old system. And Gonéri's now trying to figure out, well, how am I supposed to parse this and find what's meaningful to change first? I mean, there's just so much data coming in. But that's his problem, not mine. Yeah, we've definitely generated a bunch of new problems, but it's all good; they're good problems to have.

Similarly to Rackspace, we also had other partners that were interested in DCI because their main focus was not producing a fanatically supported cloud, but actually operating their own cloud. And their feedback was, well, we need to be Red Hat certified.
And in order to go through the whole process for Red Hat certification, and all the tests that Red Hat certification requires, they had to spend a lot of time and a lot of resources. And by resources I don't only mean dedicating their data center to it: they had a team of over six engineers, including a project manager, to orchestrate when the tests would happen and when they would come to fruition. It took them almost three to four months to achieve certification. And we release every six months. So the cadence wasn't really aligned, and catching up was hard. The first time that we did this with that other partner, their feedback was, well, now what are these six engineers going to do? And our reply was: get them to automate some more tests and put those in. So all very, very good feedback. And of course our partners now say, okay, now that this is solved, here's this other problem. We have more data, we want to parse it, we want to make smarter decisions with it. We want these changes to make it into this release, because you already have the results, so go to the mid-cycle and push for this change. So all great things.

We were able to take our engineers and send them out to go talk with customers and start building test cases against the customer use cases that we didn't necessarily have full vision of before, because there just wasn't time. So now we have engineers talking with customers, trying to understand more of the use cases, so that we can pull them back into DCI, create more of those meaningful tests, and get more data back upstream for Gonéri to figure out how to handle.

So, is it open source? Of course it is. The source is over here; it will be provided with your slides, so take a picture and you can get to it. Both DCI and Software Factory are open source. Some of the developers are here, raise your hands. And that concludes our session. Do you have any questions?

So one of the questions that I've heard is: what about using DCI against upstream OpenStack, as opposed to the packaged Red Hat distribution? What we can do is definitely run DCI against upstream, or against packaged upstream, RDO, which is a packaged distribution of upstream, and do that at the beginning of the cycle. At the beginning of every cycle we don't necessarily have a release candidate ready, or at least not as often as at the end of the cycle, where we have many release candidates, sometimes multiple times per day, because we're committing a lot faster. At the beginning of the cycle there's a little bit of a lull, trying to figure out what changes we want to commit, what changes are coming. Maybe those aren't even passing testing, so it's not really meaningful to deliver those to customers and partners, but maybe they want to see, well, what's going on upstream. So we can start distributing RDO, which is really upstream packaged, and then we can start seeing how that behaves in that environment. The idea is that this is a test/dev scenario, a test scenario. So if the CI fails, it's okay; we just want to see where it fails so that we have a heads up that we need more testing, we need to look more into these changes, we need to look more into these APIs, et cetera. So yes, we can test against upstream. Any other questions? I can keep going. Yeah, yeah, yeah. Do you have a question? Yes, hi. So the question is: how does DCI help post-GA, and how does it make upgrades easier?
So post-GA, we don't really have a concept of static code. We may say this is release 9, but after that there might be bug fixes, there might be security patches. That release 9 or 10 version is not static, so you'd still want to continue testing against it, and DCI is helpful that way. Another way it's helpful: we have partners that develop applications on top of OpenStack. So you want to make sure that that OpenStack is there, running underneath, and then you test your applications on top of it and make sure that with each change that happens in OpenStack on a given release, bug patches and fixes, that application remains whole too.

And then, how does it help upgrades? Well, imagine this: once you have DCI running with OpenStack 9, and you have been continuously testing with all the changes that we're pushing between version 9 and 10, if your CI is passing, then you know that you're going to be that much closer to having a passing OpenStack 10. The idea is that we don't ship you a whole new version all at once. We don't just give you, here's OpenStack 9, here's Red Hat OpenStack 10; we're giving you hundreds of little incremental changes along the way so that we ease you into the upgrade.

And then from a Rackspace standpoint, we treat upgrades as two pieces. There's: does the next version of OpenStack meet all of the requirements for our consumers and for our operators, so that we can say yes, this is something that is production-ready and supportable? Answering that question is what we're leveraging DCI for. The next part of that question is: what happens during the upgrade process, and does that break anything? So right now we leverage DCI to shorten that cycle of, is this ready for our users? And then we come in and run a manual upgrade behind that to say, okay, let's validate that the upgrade process works. So we take our 1,600 test cases and run a smaller subset of those before we do the upgrade, then we do the upgrade, and then we run that smaller subset again. The idea there is: we ran 500 test cases and had 495 passes and five fails before; we ran it again at the end and had 495 passes and five fails, so things are the same. So it becomes this two-stage process. You'll see there are times that we announce, hey, GA is ready, all new deploys will be the next version, and then it will take another week or two before we'll announce the upgrade path. That's because we're going back and retesting, and maybe working in partnership with Red Hat to fix something that went wrong in the upgrade process. And we really rely on this feedback.

How do you manage... so the question is, with the growth of CI/CD test cases, you're creating a longer-running test scenario. The answer is you'd run fewer builds in a day, and if we found that, hey, we're not getting through all the testing we want to get through, we would potentially add a second CI/CD environment to split that workload. We're not there yet, but that's a good problem to get to: it means I have lots of testing happening and we have high confidence in what we're building. Right now it's early and we're still building our first set of customer use cases. I expect patterns to emerge there, and eventually I'll get to a point where the customer use cases won't necessarily require additional scripts, right? It'll be, oh, we'll just flag customer X on the script that we created for customer Y.
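The before/after upgrade check Dan describes amounts to a parity comparison over the smaller smoke subset. This is a minimal sketch under assumed result formats; it only illustrates the comparison, not Rackspace's actual tooling.

```python
# Sketch: run the smoke subset before and after an upgrade and check for parity
# (e.g. 495 passes and 5 known fails both times means no regression).

def compare_runs(before: dict, after: dict) -> bool:
    """Return True when the upgrade introduced no new failures."""
    new_failures = set(after["failed"]) - set(before["failed"])
    if new_failures:
        print("upgrade introduced new failures:", sorted(new_failures))
        return False
    print(f"parity held: {len(after['passed'])} passes, {len(after['failed'])} known fails")
    return True

if __name__ == "__main__":
    before = {"passed": [f"t{i}" for i in range(495)], "failed": ["a", "b", "c", "d", "e"]}
    after = {"passed": [f"t{i}" for i in range(495)], "failed": ["a", "b", "c", "d", "e"]}
    compare_runs(before, after)
```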
So I think you asked also about different kinds of customers and how you manage that growth. Once we had DCI up and running with Rackspace, one of the ideas that came actually from Laren at Rackspace was: well, now I have all these different customer deployments. Can I have a DCI agent running a specific deployment for a specific customer, where you only run the test cases that are relevant for that deployment? And then can I have that going? And our answer is: you can have as many DCI agents running as many different OpenStack architectures per customer as you want, and then limit your test cases to whatever's relevant there. The number of test cases is going to grow; I mean, OpenStack is growing, they're building more things, and those things have test cases associated with them. So that's a problem that's going to continue to be there. Did Laren pay you for that shout out? I mean, he's back managing a support team in San Antonio, so I thought maybe he paid you for that. No, he didn't. But we find that with Rackspace, and I can say Dell because we also announced with them, Dell is another partner that is doing this, they actually give us a lot of new requirements, because now they have seen this. It's like they have seen the light: oh, this is great. And it freed their minds from a lot of busy work, and it seems like their minds are just as busy, not with busy work, you know what I mean, but figuring out all the problems to solve and continuing to make OpenStack better.

Yeah, so our development team for DCI is actually right here, so you can talk with them in depth about that. But I can tell you that when I approach a partner or a customer and they ask me, I want to install DCI, what do I need, really what we tell them is we need this DCI jump box, and that can be a VM. So our footprint in your data center is as big as a VM. And then everything else, in terms of what the deployment looks like, can be as many nodes as you have in your OpenStack deployment, and it can be a mixture of virtual and bare metal. If you want to certify, for example, Cinder plug-ins, well, you're going to need the gear that goes along with that, but we don't tell you that there's a bare minimum. In fact, I have some partners certifying their software SDN plug-ins, and so their OpenStack deployment is 100% virtual; the whole thing is just VMs. For Rackspace, the most complicated part was figuring out how we were going to get the director built out inside of our existing deployment process for bare metal deploys. Once we worked through that, which was just figuring out how orchestration happens now and how we make sure we keep our tools in alignment, it was very easy after that. Yeah, and getting access to Rackspace hardware, we found out, was hard, because they want to keep their hardware up and running 99.9% of the time.

Any other questions? Hi, the question was how much time passes between an issue being found on the customer side and the patch or the fix being seen upstream. Well, isn't that the question that we all want to know? And the answer is: it depends on the patch, right? It depends on what the issue is and how we're tracking it. I can tell you, though, that with DCI the time it takes to actually report that issue upstream is a lot shorter, because we're finding it sooner.
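The per-customer idea mentioned above (one DCI agent per customer architecture, each limited to the relevant test cases) can be pictured with a small configuration model. The data shapes and names below are assumptions for illustration, not a real DCI configuration format.

```python
# Sketch: one agent definition per customer architecture, each with its own test list.
from dataclasses import dataclass, field
from typing import List

@dataclass
class CustomerAgent:
    customer: str
    topology: str                               # e.g. "bare metal, Ceph-backed storage"
    test_cases: List[str] = field(default_factory=list)

    def run(self) -> None:
        # Placeholder: deploy this customer's architecture and run only its tests.
        print(f"{self.customer}: deploying '{self.topology}', "
              f"running {len(self.test_cases)} relevant test cases")

if __name__ == "__main__":
    agents = [
        CustomerAgent("customer-x", "all-virtual, SDN plug-in certification",
                      ["tempest.api.network"]),
        CustomerAgent("customer-y", "bare metal, Cinder plug-in",
                      ["tempest.api.volume"]),
    ]
    for agent in agents:
        agent.run()
```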
And then, if you find it during the development cycle, say somebody commits a patch during, for example, right now, the Ocata cycle, I'm able to send that patch over to Dan at Rackspace and we see it failing. Those are logs that we can just push upstream and say, hey, this failed here. So think about that: that's before it actually makes it to the product. So that time, I can tell you, is faster. The time to fix it is another thing. If we're using, for example, RDO, I can't tell you that a patch that lands today will be in one of our puddles, one of our pre-release versions, within a week, because we do a lot of that testing ourselves. So it may be that we catch that bug before it goes to our customers, so it may not even get there, and in that case we report it upstream right away. And it may be... when the stars align. And it may very well be that somebody at Red Hat made that freaking patch that then broke things at the customer, which is what we find, right? Because we test the patches that we make on our own gear, as well as all the third-party CIs from upstream. We test it on our gear, but now we're honestly just extending the QE environment that we have to include our partners too. In the case of Rackspace that's some gear; in the case of Dell it's a lot of gear. And we're expanding it to other partners as well.

Two questions. How long does it take to run each build using DCI? And then the second question: does it run just a patch of the build, or does it run the whole build? Okay, so for the first question, well, it depends on what kind of OpenStack you're deploying, so it depends on what your environment looks like, right? Oh, for the 1,600 test cases, in your specific architecture, how long does it take to run a build? Once everything's built out, the run takes, I want to say, like 20 minutes. And then it takes a little bit longer for us to look through the logs, make sure that everything ran, and pull the files out. So maybe an hour total. The deployment, no; with the deployment in front of that, we basically get close to two runs a day right now, from build the environment, run all the test cases, collect all the logs, to build the next environment. It's going to vary based on the number of nodes that you're building, right? And we're building on physical nodes, so everything gets redeployed. And the hardware we're using is not bleeding-edge hardware, by intention, because we're trying to support a wide variety of use cases.

And then the second question was, do you run the entire build or do you just run the patches? So the question is, does it run just a patch, or does it run it from beginning to end? And the answer is, right now, we run it from beginning to end. That would be a good use case, to just run whatever changed, but because we do the orchestration with director, we just test the whole deployment and do it all over again. So, any last question? I think we're out of time. Thank you so much for coming, and I hope you have a great OpenStack Summit; walk a lot and enjoy the week. Thanks, everyone.