Is it working? Is it working? Yeah. Okay, everyone. Hello. Welcome to our next talk. Here I'd like to present Clint and Paul, who will tell you something about testing, let's say. Okay. Yep.

Hello. So obviously we are not Monty Taylor. Monty was on his way here but unfortunately had to turn around for personal reasons. So today it's myself, Paul, and I've asked Clint to join me on stage because he did a fantastic talk yesterday about a Zuul use case at IBM. Rather than presenting the way Monty does (if anybody's seen Monty present, it's very hand-wavy), we thought we'd turn this into more of a discussion, question and answer. So if you have questions, feel free to ask them as we go rather than waiting until the end, because we have lots of time, and it seems like a fair number of people here already know and understand Zuul and want a bit more of a deep dive into it.

We can skip over the "who am I" slide, because it relates to Monty. Monty is part of Red Hat, in the Office of Technology, and his primary mission is to work on all things related to Zuul and Ansible. Who are you, Paul? So my name is Paul. I also work at Red Hat. I'm part of the OpenStack Infrastructure team, which is the team that created Zuul and Nodepool and this testing strategy that we use in OpenStack infrastructure. And Clint? I work for IBM. If you saw my talk yesterday, thank you for coming. If not: I work on a team that has built an alternative Zuul for testing things using GitHub, so it's kind of a fork that's coming back into Zuul. And previous to that we built a downstream Zuul from OpenStack to deploy our own OpenStack; not a fork, just using Zuul as packaged. So between the two of us we have almost as much as is in Monty's head, and it'll come out a lot slower than it comes out of Monty, so hopefully you'll be able to absorb it.

So what are we going to talk about? We're going to talk about Zuul, and we're going to talk about Ansible. Just by show of hands: is this the first time you're hearing the word Zuul? Oh, fantastic. Who came in expecting a Netflix Zuul presentation? Okay, good. That's not this one. There are technically two Zuuls; they tried to take the name but we took it back. It is a Ghostbusters reference. Hey, I'm down; somebody torrent it. Yeah.

So what is Zuul? At the heart of it, Zuul is a gatekeeper, and that's ultimately where it gets its name, from the movie Ghostbusters. Zuul is a multi-cloud, scalable, elastic CI/CD engine. And I think one of the coolest things about Zuul is that we didn't set out to create it from day one; it grew organically over the course of OpenStack as a project, to solve real problems we were having with CI and CD and so on.

The biggest benefit of Zuul, which we both agree on, is that it has this concept of a future, speculative state of your source code, specifically related to Git, and we hope to dive a little more into how that actually works. But the idea is: we create a fresh VM on an OpenStack cloud and attach the job to that VM. In version 2.0 we were using Jenkins, so we would have Jenkins initiate the code on a remote slave that we attached to Jenkins.
The job would run; all the artifacts and data from that node would be archived off to our log servers and so on; and then we would destroy that Jenkins slave so we could put the capacity back into the pool of resources and start the whole process again.

The cool part is the code that we put on it. Normally, I mean, that's nothing new; we did that with Jenkins before Zuul. The cool part was when we started saying: okay, I'm changing this component of OpenStack in response to a change that's still in flight on this other part of OpenStack, which broke everybody in response to a PyPI library that came in, right? We were able to build a dependency chain, and Zuul actually pulls both in-flight changes, tests them together, and they land together in each repository. So you're able to build cross-repo dependencies, which is one of the most interesting features. You're building a speculative future across repositories and deep stacks of commits, and I like that because it encourages you to make small changes. Instead of "well, I've got to package these all together so I don't have to wait for my gate tests to run six times," no: you can have six commits, they all run in parallel, and they all land at once. That's what I like about it.

Yes, and to that point, it is definitely code that is pre-merged into your repos. The idea is that nothing broken gets into your master branches, because it's gone through this process, or pipeline, to validate that the tests are good, rather than looking at the code and saying "I think it's good, let's merge it," and then your build bots or build servers start screaming at you because you've broken everything.

So, some of the terminology; we'll dive a little deeper into Zuul. Actually, you know what, there's a really good slide. I'm going to skip ahead here, because I think this slide should have been closer to the beginning. So, in OpenStack Infra, it might be a little difficult to see, but this is Zuul right here in the middle, and you can see everything else that goes around it. With Zuul in the middle, it interfaces with Gerrit, which is our code review system, the interface that our developers use. We have Jenkins over here, and you can see we have eight of them, eight Jenkins masters.

Hey Paul, how many of you think this is a crazy, complicated way to run CI? All right. How many of you know how many developers OpenStack has at any one time? All right: over a thousand active developers at all times. Sometimes 2,300 distinct, I think, but there are generally a thousand developers developing through this system. About 250 commits a day land, pretty much; sometimes it's up around 400. So we kind of got ahead of the fact that this was built to scale CI for over a thousand active developers daily. So yes, it's crazy, but it actually works and it scales really well. Go on, Paul.

Right. From an infrastructure point of view, this is my day: managing all of this. But I think if you've ever contributed to OpenStack, your interface is only Gerrit; everything else is hidden from you except the results, the artifacts that get kicked out at the end, and there's a Logstash server or something over here.
So when we talk about Zuul, yes, we're talking about one application, but in reality this is the architecture of it, and like Clint said, it's crazy, but this is how we had to scale things out to allow us to run a massive number of changes at any given time.

To contrast, the one we built at IBM, which was downstream from this one, was basically testing our deployment code and our small local patches. We had one Zuul server, one Gerrit server, one Jenkins, and one Logstash. So it can scale down. We had 70 developers, but we were still landing all 250 upstream commits every day into our local repositories and deploying that, so it does elastically contract as well. Part of that is that because it organically grew to scale out as microservices, it actually does the right thing and scales back down pretty well. And we'll talk more about this when we get to Zuul v3, the next iteration, because I think we take it even further, with the idea of making it less complicated to operate as an ops person.

So, some of the terminology you're going to hear Clint and myself, or anybody who contributes to OpenStack, use. A periodic job: a periodic job is basically a job that runs on a cron timer. For example, we have a periodic job that runs every 24 hours to build wheels of the PyPI packages that we put on our wheel mirrors. It's something that we always want to run. We don't necessarily care about the job; well, I shouldn't say that, we care that the job runs, but it's something we always expect to pass. And it's very expensive. Yeah, correct, it is a very expensive process, because it's usually a longer-running task, maybe three or four hours.

A post job: these are kind of inverted on the slide, so I'll skip down to a check job. A check job runs when somebody submits a patch into our testing infrastructure. You propose a patch using code review, and these are the first set of jobs that have to pass before you actually approve. As an example, I will write a patch and throw it up into our code review system just for the purpose of having the tests run against it. I might throw another 20 revisions at it until I get everything running properly, and I'll flag it with something like a work-in-progress flag; there are ways to indicate that, hey, this is still experimental.

I have a question for the audience: any of you use Vagrant? Right. I used to use Vagrant a little bit. I found that working in OpenStack I use it less and less, because these check jobs are so good. They have enough coverage that basically as soon as I've got something I think works, I just hit git review, it throws it out into the CI system, and I'm kind of abusing all the cloud capacity OpenStack has. But when I work on a project that doesn't have that, and I have to go back to vagrant up and waiting and taxing my laptop, I'm very sad, because essentially this exposes thousands of cloud VMs of capacity at once. But it's the same idea: quick checks, just check my in-flight thing, is it working, give me good coverage. And it's more to inform the code reviewers: yeah, this code doesn't have obvious flaws, so just read it, make sure they didn't change the tests in a stupid way, and then you can evaluate the code on its merits. Exactly. That's check jobs. Yep. And this concept is not new if you do any sort of CI.
Just to speed up a little: a gate job represents that somebody has actually reviewed the patch and thinks it's good enough to be merged in Gerrit. Our approval process usually requires two core reviewers, and I think on average it takes about two weeks to go through this sign-off and approval. And then, nine times out of ten, the same set of tests runs again, to ensure that the testing state hasn't changed since the last time you uploaded the patch. Because if it takes two weeks for somebody to approve it, a lot, especially in OpenStack, has changed. So we rerun the tests on approval. If it's green, great, it gets merged in; if it fails for some arbitrary reason, it gets punted back to the developer and you simply start the process again. We'll show a little demo of that further on. And then we have post jobs, which are jobs that run after code is merged; a primary example is that we generate tarballs for every commit to a project and publish them to our tarball server. We'll show a rough sketch of how these pipelines are defined in a moment.

So I talked a little about the massive scale, but the way this happened, the way we got to that giant crazy graph: it started with a few servers in a rack at HP running some tests for OpenStack CI in very boring ways, just like: we're running Jenkins here, and we have a slave that builds a directory and then kexecs into it, and hopefully that works. That obviously didn't scale. So then it was: well, what if we pointed this at HP's OpenStack cloud that existed at the time? And then: well, now we need another Jenkins master, because Jenkins can only run a hundred at a time. And then: now we've got 400 active developers and we need three Jenkins masters. And now the gate takes so long we can't wait; there are four approved patches in flight, that's eight hours, because the tests are two hours long. And so Jim Blair, who wrote the original Zuul, came up with this idea: what if we built a speculative state where we stacked up all the approved commits and tested them all in parallel across a cloud? That was essentially Zuul one: just the pipeline for the gate. And then we realized this is actually useful for cross-repo dependencies, because we're already building up commits; we could do this across repos. So it organically grew with the scale of OpenStack; I think it was four years ago that it started. And at this point it's absolutely the linchpin that holds OpenStack together. I don't think development could proceed in the fashion that it does without it.

Right. And I think one of the great things is that from the beginning, the people on the infrastructure team, Jim and Monty, truly felt that everything had to be automated, everything had to be a robot, because we didn't want to depend on one person. If you're in a company that works nine to five and everybody's in one office, that's okay. But if you want to scale to the scale that OpenStack has, with contributors worldwide, you don't want somebody in Australia having to wait for somebody in North America to come online and be the one to push the button. The sun never sets on OpenStack. Exactly. I want to sleep, I'm sure Clint wants to sleep, I'm sure any developer wants to sleep. That's why we write robots to do all these arbitrary tasks.
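Here's that sketch, roughly and from memory, in the spirit of a Zuul v2 layout.yaml; the triggers, vote values, and cron time are illustrative, not copied from OpenStack's real configuration. Check uses the independent pipeline manager, while gate uses the dependent manager that builds the speculative queue shown in the simulation later:

```yaml
pipelines:
  - name: check              # runs when a new patchset is uploaded
    manager: IndependentPipelineManager
    trigger:
      gerrit:
        - event: patchset-created
    success:
      gerrit:
        verified: 1
    failure:
      gerrit:
        verified: -1

  - name: gate               # runs on approval; changes are queued and co-tested
    manager: DependentPipelineManager
    trigger:
      gerrit:
        - event: comment-added
          approval:
            - approved: 1
    success:
      gerrit:
        verified: 2
        submit: true         # Zuul, not a human, merges the change
    failure:
      gerrit:
        verified: -2

  - name: post               # runs after a change merges, e.g. tarball publishing
    manager: IndependentPipelineManager
    trigger:
      gerrit:
        - event: ref-updated
          ref: ^(?!refs/).*$

  - name: periodic           # cron-style, e.g. the daily wheel-mirror build
    manager: IndependentPipelineManager
    trigger:
      timer:
        - time: '0 6 * * *'
```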
So now we're getting into some of the scale we're talking about. We've coined a term, KJPH, kilojobs per hour, and we're actually at two kilojobs per hour. What does that mean? It means that in a given hour we're running 2,000 tests. And the cool thing is that because we launch a VM for every test, we're actually spinning up 2,000 VMs an hour across all of our cloud infrastructure. I can't remember if the next slide shows what our cloud infrastructure is, but we're up to about 12 regions of capacity across the world. And thank you if you donated: this is all donated hardware. Mind you, we are grateful that we don't have to pay for any of this; companies see the value in donating back to the project, and I don't think we would have grown as big as we did, and still are, without companies like these providing resources.

These VMs: we standardized on an 8-core VM with 8 GB of RAM and an 80 GB hard drive, so they're, you know, four-years-ago laptop-sized, because we wanted everything to be able to run on a laptop.

These stats are a little older, but yeah, something like 2,300 to 2,400 developers at any given time writing code. That's actually peak; it's generally around 900, peaking higher in the cycle. These are just the biggest numbers we have; why would we show you the smallest? We're up to about 1,500 Git repos that we manage. And jobs: if people don't know what JJB is, it's Jenkins Job Builder. Once we have Jenkins Job Builder create all of our jobs, we're pushing about 12k worth of job definitions, and so on. I will say Zuul v3 should reduce that; a lot of those are distinct just so that they can have a different name, because there's a quirk in the way Zuul creates shared queues: jobs need a different name or they'll get co-gated. I think v3 will take that down to about 3,000, which is still a lot of jobs, but part of that is just naming. And we merge about 10,000 changes per month.

Just as a comparison: the Ansible project has had about 14,000 pull requests in its lifetime and has merged about 8,100 of them, and there are close to 40,000 commits. So yeah, the scale is just massive. OpenStack makes another Ansible every four months.

So, multi-repo speculative execution. This is what Clint was referring to. Is it a good time for the demo? Yeah, I think what I'd like to show first is the demo of how all this works. Not a demo really, but a fantastic animation. A simulation. So that we don't have to wave our hands so much: watch as Zuul handles your commit.

Again, we have this concept of a pipeline, and if you've done CI or talked about CI, you hear the word pipeline more and more these days. The tiny text there, by the way: the blue dots are Nova and the yellow dots are Keystone. Nova and Keystone, if you're not familiar, are two different OpenStack projects that are co-gated, so they both have to pass for either one to land a commit. What we're showing here is a parallel co-gating pipeline in Zuul, where, exactly right, Nova and Keystone are interdependent on each other but are different repos. And above there you have four changes that have been approved: three to Nova, one to Keystone. The idea is that Nova can't land a commit that breaks Keystone, and Keystone can't land a commit that breaks Nova. Yeah, exactly right. What we're showing here is four patches coming online.
Oh, sorry, I've got to click here. So four patches have come online and they're being queued up into Zuul. The idea is that Nova one is in the speculative state, applied to master, the tip of Nova, and it runs its set of jobs independently. The second patch applies on top of patch one plus master and launches its set of tests, and it continues down the queue. Keystone does the same thing, queued up behind the previous patches. So what's actually happening is that as somebody approved all of these simultaneously, we go out to the cloud and run these speculative tests in parallel across all of our testing infrastructure, and because Zuul knows the future states based on this history, it can do some really cool things.

If you count the lines there, they're a little small, there are 15 parallel jobs running all at one time. The first eight have passed, and then one of the Keystone jobs has failed, which is why it's gone red, and then the Nova one has also failed because it's stacked on the bad commit in Keystone, so the rest of the queue has basically failed.

So we have a question. Just in queued order, unless there is an explicit dependency. The Nova changes are in the same repository, so if they were stacked on each other there's an imposed order, and if there were a cross-repo dependency expressed, there would also be an order, but otherwise it's just the order they were enqueued.

So the question is: Keystone three, I'm sorry, Nova four requires Keystone three; okay, sorry, Keystone three requires Nova four, but it was not expressed. If it was not expressed, it would fail, so yes, you're right, that could be the reason in an example like this. That's not what happened here, though, because four failed as well; if that were true, four would actually have passed, because it would have everything it needs. But if you saw green there, which can and does happen, what would happen is the Keystone commit would get kicked out; well, hold on, the Nova one; oh god, go ahead.

So, to answer the question a little more: one and two are good, so Zuul is going to merge those; three and four are bad. Four may have been cancelled, or it had job failures, but Zuul is smart enough not to kick four out, because it has a failure ahead of it. Instead it moves three out of the pipeline (let me click this again), kind of moves it out of the way, and then it rebases four on top of two and runs the tests again. As a developer this is invisible to you; you don't have to do anything, Zuul takes care of it. What Zuul is hoping is that when four runs again it will pass, because it truly isn't dependent on three. And meanwhile three continues to run; I don't know if it shows it here, but even though one job has failed, we run the whole suite of jobs, so that we present as much data as possible back to the developer. Correct.

So: number one is green, we passed all our jobs, so Zuul merges it into Nova. Two is great, it's green again and gets merged in. Three we know is bad; we kick it back to Gerrit and leave a message indicating the log failures and so on. Four is green, and it gets merged in. And the beauty of it is that nobody had to do anything: zero humans, it was all robots.

Question. Let me read the question back real quick. What you're asking is: number two, which passed its tests, may have actually broken the ones after it, right?
So we may have broken the API for Keystone. Yes, that's a possibility. It's not what we're trying to cover in this simulation, but it's absolutely a common thing in any distributed system: things pass all the tests, and then the next thing adds a test that the last commit broke. But you would hope; we didn't just run the Nova tests to get that green, we ran the OpenStack Tempest test suite, which tests all of OpenStack, so it tested Keystone too. Since Keystone's tests passed on two, one would hope we didn't break Keystone. And that's an important part of having a good cross-repository integration story: make sure you're running all the tests on all the things that are co-gated together. We have made that mistake before, where one project is running the tests and the other isn't, and one of them keeps breaking the other. That was actually a very early problem, and it was solved mostly by socializing the fact that you can't be co-gated without running all the tests. So this is something that you get with Zuul out of the box.

Okay, I think we can continue on here. Does everybody understand this idea of a speculative future that we just displayed? I want to make sure, because I think it's the coolest part of Zuul: the idea that Zuul, as a robot, creates a pipeline of changes that are in flight and lands them all in parallel. Yep.

In the back. So the question is: we have 1,400 or so repos; do we test the entire matrix of combinations together? Absolutely not. I can't do the arithmetic in my head, but that matrix would be insane. That's just the number of distinct repos; there are essentially testing islands in OpenStack. There's what a lot of people referred to early on as the integrated gate, which was basically the core of OpenStack; I think that was 16 repositories at its biggest, and I think it's actually been pared down a bit since. That's probably the biggest co-gating matrix, and it runs maybe 20 different jobs to test the different combinations of common configuration: like, let's make sure it works with PostgreSQL and with MySQL, and with this driver that's common over here. But it still doesn't test the entire matrix; that would just be too hard.

Yes, so the follow-up question is: can developers just arbitrarily state dependencies in their commit messages? They absolutely can, and an important thing is that they can state dependencies on things that are not co-gated with them, and that's fine. It means your project is taking responsibility to follow that one; the other project doesn't have to care that you depend on them. So it can be a one-way or a two-way dependency. That's actually more of a social problem than a technology problem. I was exactly going to say that: it's a social contract between projects.

I think the next slide gets into this, and it's going to address your question of how to express a dependency: we've created a Depends-On header. I'm going to kick to the next slide, but I just wanted to highlight that this is how you express it in your Git commit message: this is the Change-Id that's generated by Gerrit, and this is the Change-Id of another patch in Gerrit.
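As a sketch of what that commit-message footer looks like (the Change-Ids here are made up for illustration, not real Gerrit changes):

```
Fix the network call that Neutron is changing

This can't land until the Neutron fix below has merged;
Zuul will test the two changes together in the meantime.

Change-Id: I0123456789abcdef0123456789abcdef01234567
Depends-On: Ifedcba9876543210fedcba9876543210fedcba98
```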
So, an example: OpenStack has a library called Shade; for the purposes of our discussion, it's a wrapper around all of the OpenStack Python APIs, and we run tests against it for OpenStack. Now Neutron, which is our networking project, creates a breaking change for Shade. Neutron proposes a fix, and Shade then depends on that patch, because there isn't an explicit integration gate there. The social contract is that Shade sees value in testing against Neutron; Neutron doesn't want to depend on Shade's testing. So Shade expresses "we need that fix to fix us," Shade's tests all run as if Neutron had landed it, and Shade basically says to Neutron: this is good, but we can't merge this code until Neutron finishes its process. That is the way Zuul expresses itself, so that you don't put the egg before the chicken, or, what is the expression, the cart before the horse; maybe that's the better one. But the idea is that the ordering is up to each patch developer, not each project.

Just to expand on that: at IBM, when we brought this functionality to our developers, who had never had anything like it, it kind of blew their minds, because they were used to having to harass other developers to make releases of things so they could depend on them. And it was like: no, you can't make your release until it lands, but you can finish your development and still have your green passing tests, and when it lands, they both land and magic happens. It became this nice "I can get that change out of my head." It's actually the main reason we started the BonnyCI project I talked about yesterday: we want to bring that to everybody. And that's also why Zuul v3 is happening and why we're trying to bring it to Ansible and projects like that. I think this is going to help open source developers get things off their plate faster.

So, to follow up: isn't Shade supposed to prevent Neutron from breaking again? It goes back to the social contract. Shade cares about, well, maybe "cares" is the wrong word: Shade needs Neutron to function; Neutron doesn't need Shade to function. So it's a one-way dependency, and with something like this, if Neutron breaks Shade, Shade is willing to accept that risk and be the project that reacts to those breakages.

Again, none of this is specific to OpenStack: gate and check pipelines are just arbitrary names. In theory you could have a multi-stage pipeline, where when somebody uploads a patch, each stage grows the testing after the stage before it, and you scale your testing that way.

We also have about 50-plus vendors that are actually using Zuul, because of the way we do third-party testing. In the case of Cinder, for example, IBM might have a driver that they want to test in-house on IBM hardware, so they set something up behind their corporate firewall, because they don't want to expose that hardware to the public web. Every time a change set comes in, Clint's version of Zuul listens to that event stream, kicks off its testing, and then reports back upstream. So it truly gets us into not only distributed testing but distributed results reporting, I guess is the word. It's super powerful: now you don't need to cram all this hardware into one ops center or one team. Again, a social contract is there to allow IBM to post results, but if IBM's system stops working or goes rogue or something like that, then we politely ask them to turn it off, and if they don't, we cut them off. Things like this, right?
Yeah, and the key there is that there's a one-Gerrit-to-many-Zuuls relationship, which ends up being powerful even within organizations. While we were doing the IBM cloud thing, we had another organization spin up a third-party system and start testing their driver with our cloud, with their Zuul, the same one that would test upstream, which was a pretty cool thing to see.

I just got the ten-minute warning, so we might have to hop to it. Yep. So Wikimedia, for example, is somebody who uses Zuul; if I'm online, this is what their interface looks like. Just a quick point: on Friday they pushed almost a thousand jobs. It's awesome; they've really embraced Zuul and Nodepool and so on.

This is our representation; you can see we actually have things in check and gate right now. If I drill in, hopefully I can find one with a good IP, I don't know if I'm going to be online or not, but these are the tests that are actually being run. If you were to click on this link, it would take you over to Jenkins, if we were still running Jenkins, because we don't run it anymore; that's the next part we're going to talk about. All of these individual tests are spinning up VMs, and because we have something called Nodepool, that all happens automatically: no humans involved until something breaks. And these are our statistics down here. You can see, Wikimedia probably beat us; I think, no, maybe not, this is only over a 12-hour period, and right now it's also Saturday. Yeah, it's also Saturday. Yesterday, if you looked at Clint's presentation, the check or gate queue here was totally packed, with like 40 or 50 commits, and because Zuul was doing its thing, no humans were waiting; it's just a waiting game while all the tests run.

Okay, so let's skip over some of that. Basically, Zuul 2: what it got us is four years of running in production, and it's what most people run today. Triggers are Gerrit and periodic. Publishers, or reporters, report back to Gerrit, email, MySQL. Node provisioning is static and elastic: static means we depend on Jenkins to create the nodes, elastic means we use Nodepool. And our jobs are executed by Jenkins. Most of the third-party CIs, for instance, just have a couple of servers hooked up to whatever special hardware they're testing; they don't need a Nodepool with a giant cloud behind it. Yep. And that got us to this complicated design, which we showed at the beginning.

Zuul 2.5 is what we're running today: basically, we replaced Jenkins with Ansible. We still use JJB for our jobs, but on the fly we convert those jobs into Ansible playbooks. Anybody using Jenkins Job Builder, by the way? It's actually pretty awesome, right? And I think it gets lost that it actually came from the OpenStack Infra team, but it has become pretty popular. I love it. It's a way of expressing Jenkins jobs in YAML and converting them on the fly into XML and pushing them up into the web interface, so you can code review your jobs and never touch the web interface of Jenkins, because remember, we had eight interfaces, and that would be problematic.

So basically, this is what a configuration of a job looks like; this is an actual live job that ran this morning, and this would be the playbook that... oh, I got kicked off the Wi-Fi. Yeah. But anyway, it would express a playbook and so on and so forth.
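For those who haven't seen JJB, here is a minimal sketch of what a job definition looks like in it; the job name, node label, and shell step are invented for illustration, not one of OpenStack's real jobs:

```yaml
- job:
    name: gate-shade-python27
    node: ubuntu-xenial        # label of the worker this job should run on
    builders:
      - shell: |
          # the actual test step; JJB turns this YAML into Jenkins job XML
          tox -e py27
```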
Okay, why did we replace Jenkins, Paul? So, yeah, it wasn't for lack of trying. That big massive diagram was basically all there to support still using Jenkins. We started with Jenkins (actually I think Hudson was the original, then we went to Jenkins). Anybody else run Hudson first? Right on, yeah, represent. We, as the OpenStack Infra project, have maintained various plugins: a JClouds plugin, a Gerrit Trigger plugin, an SCP artifacts plugin; we wrote Jenkins Job Builder, which we just talked about, and a Gearman worker plugin, which Clark built. Five minutes? Oh man, we're running out of time here. Basically, we tried everything to get it going, and unfortunately we just couldn't push it past about a thousand. So there was an actual reason why we did this, but before we get to that, we just want to acknowledge that the world is a better place because of Jenkins and the CI it has brought. It's just unfortunate that at the scale we as a project were working at, it didn't fit our model. But respect.

Yeah, so, Clint. Basically what happened is that we ran into a security flaw that was pretty serious. OpenStack was running one of the biggest public Jenkins installations in the world, and we had to shut it down and turn it off; we couldn't expose the Jenkins web interface for a while. After stacking all of those workarounds on top, the Infra team looked at what Jenkins was actually doing for us, and at that point it was mostly SSHing into things and displaying text on the web. Everything else was being done by Zuul and Jenkins Job Builder and all the tools we had built around it. And so the idea was: what if we just took Jenkins out and replaced it with a few lines of Ansible? Because in fact we know a better remote executor, and that's basically Ansible.

I know time is short, but basically what we did was blow out this part here: these are the previous Jenkins masters, which we've turned into Zuul launchers, which at the heart of it are Ansible playbooks. Everything else up to the point of node generation and so on is still the same, except instead of saying "Jenkins, build me something," we say "ansible-playbook, run me this, at this SSH port." It actually still speaks the same Gearman protocol that the Jenkins plugin does. Yep. So it's pretending to be Jenkins, but it's actually just a box running Ansible. And that's ultimately what 2.5 gets us today.
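A minimal sketch of the kind of playbook such a launcher might run; the paths, script name, and log location here are assumptions for illustration, not the real generated playbooks:

```yaml
# Run the converted job on the freshly booted Nodepool VM over SSH,
# then pull the console log back for archiving.
- hosts: all
  tasks:
    - name: Run the job's shell builder on the test node
      shell: ./run-tests.sh                        # hypothetical test entry point
      args:
        chdir: /home/jenkins/workspace

    - name: Fetch the console log back to the launcher
      fetch:
        src: /home/jenkins/workspace/console.log   # assumed log location
        dest: /var/lib/zuul/logs/
```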
Right, and what we want is Zuul 3. I think we're going to run out of time on this one, but basically Zuul 3 is everything we would want if we were to sit down and design this properly, rather than letting it grow organically and so on. What we're doing is this: because 50 or so people are now running version 2 of Zuul, we thought, wouldn't it be fantastic if other projects could run this? So we're hoping projects outside of OpenStack are really going to utilize it. And because we as OpenStack Infra make a lot of specific assertions about how we do things, it may not translate to what you want to do. As an example, we love OpenStack clouds, we love running on OpenStack clouds; you may not have an OpenStack cloud, and you may want to do bare-metal testing, or containers, or that Kubernetes thing that people want to use. Yep, so the concept is that those are going to be considered first-class citizens. AWS, Google Cloud: you know, those are no longer bad terms.

I mean, just to try to wrap it in a bow: I think of Zuul version 2 as an internal project that has kind of gotten loose, and v3 is an attempt to rein it in and say, no, we're actually going to take care of this and make some guarantees about it. The Infra team is like, "we'll support v2 for these limited third-party use cases," but then, you know, Wikimedia runs off and runs a giant amount of stuff on it, and it's like: we don't really know how that's going to operate; we didn't design it for you, we designed it for us. V3 is actually designed with a proper public API, sort of thing, so it has some guarantees and the things you would need if you don't want to run at OpenStack scale. Question: is OpenStack an internal project that got loose? Perhaps. That's the boys at NASA.

We're getting the one-minute warning here, so that's it, with questions afterwards, right? Okay, we've got one minute left; does anybody have questions, or do you want us to keep talking and propose questions? When's v3? That's a really good question. The hope is for March. Yeah, we've been working on it for a while this year; the reality is probably more like June. I mean, it's getting close; March is probably when Infra will start to run it, but it's probably also going to be a little ugly and rough. There's a big sprint at the end of February, at the OpenStack Project Teams Gathering, where we're all going to be in one place and we can pair up and push it over the hill. Beyond that, I have a project that's waiting on it as well, so we want it done as soon as possible. It is a pretty steep learning curve, but if you want to join the development effort and you have some Python skills: #zuul. Yep, #zuul on Freenode.

So, I just wanted to highlight one of the big things that Zuul v3 addresses: we're good at CI'ing projects; we're not necessarily good at CI'ing Zuul. Configuration changes to Zuul today are really human-approved processes, which end up blocking projects. It's centralized: it's server-side and requires a root admin to say "yes, this looks good," or "I think it looks good, but we have no way of testing it." So the irony is that it's really hard to test the changes to your tests. It's really annoying. So everything is going to be in-repo now: Zuul starts very basic and learns its configuration as changes come in. As a proposed patch is uploaded, if it contains a change to this configuration file, Zuul goes out to the mergers, asks for everything again, rebuilds it all, and then, pre-commit, runs all of these things again, inheriting all the functionality of Depends-On, cross-repo dependencies, and so on. Basically, the concept is: test your tests before you land them. That's really what we're aiming for.

Now, anybody use Travis? Travis CI, right: it works more like that. You have a file in your repo that defines the matrix of tests you want to run for your project, and then you're only constrained by what the Zuul admin allows those tests to do. And then there's true first-class multi-node support, which we mentioned before: we could say, give us a controller that's Ubuntu, give me a database server that's CentOS, whatever; the possibilities are endless. Today it's "give me three nodes, and those three nodes are CentOS," which is kind of limiting.
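As a sketch of what that Travis-style, in-repo, multi-node configuration looks like in Zuul v3 terms; the job, playbook, nodeset, and label names here are invented for illustration:

```yaml
# .zuul.yaml, living in the project's own repo and code-reviewed
# (and speculatively tested) like any other change
- nodeset:
    name: controller-plus-db
    nodes:
      - name: controller
        label: ubuntu-xenial       # a Nodepool label for an Ubuntu node
      - name: database
        label: centos-7            # and a CentOS node, in the same job

- job:
    name: my-multinode-test
    parent: base
    run: playbooks/multinode-test.yaml   # the Ansible playbook that is the job
    nodeset: controller-plus-db

- project:
    check:
      jobs:
        - my-multinode-test
```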
And then, again, these are basically Nodepool changes, which we touched on: we want to add container support, Kubernetes support, bare-metal support, any other thing you want. It really is going to be a pluggable infrastructure; dropping in support for something would basically mean writing a driver for whatever your thing is at the end.

And yeah, this kind of answers the question: the status is, focus on OpenStack first. We're hoping in the next few months to get it into OpenStack, I think by March, so we can churn on this and fix everything, and then the masses come in. Ultimately the next step is to actually get the Ansible project on it, using the Git interface: instead of using Gerrit we would use GitHub pull requests and so on, and get all the fantastic support that Zuul has today. And hopefully BonnyCI will also be using it. Yep, that's my project. And that's it.

This is more information. I encourage everybody who's actually interested in Zuul v3 to read this specification, because it goes into more technical detail about some of these structures: what an in-tree configuration looks like, what we're trying to accomplish, that we're depending on ZooKeeper, all these good things. Basically one minute for questions.

Sorry, say that again? Nova... oh, so the question is, what are we going to replace Nova with, Zuul? I'm sure somebody's already done that in Go. So, hopefully everybody found this useful. Who's going to go and run a Zuul now, because it's awesome? You get a Zuul, and you get a Zuul, Zuuls for everybody! But yeah, hopefully you found it informative. Definitely talk to Clint, definitely talk to myself, or just jump into #openstack-infra or #zuul; lots of great people. Again, sorry we're not Monty. Yep, if you find Monty, pin him down, because he would love to talk about it. Thanks so much. Thank you.

Yeah, I guess the problem is keeping the graph reasonable. One change will be fine, but when the dependency chains get really deep, if I make a change that depends on a deep stack of others, Zuul has to dive down through commit messages and things like that to build it. That was the reason I was asking: maybe you can simplify the tree, just building the one-time dependencies recursively. Yeah, once you have to build that tree up in memory, it does become a graph problem, and good point: that is one area where co-located repositories are probably going to be required. I think we need both some automated way and the manual "okay, I know I'm depending on something," or "you're breaking someone downstream." It might be cool if there were a way to detect it, especially if you're just depending on something already released. There's another point, the layering of products: what if a change in the packaging around it breaks? Yeah, you'd pin those upper bounds and basically fail every test and see.