Hello everyone. Wow, a lot of people, that's great. Good afternoon, everyone. My name is Emilien and I work for Red Hat. I'm glad to be here with Matt Fisher from Time Warner Cable and Mike Dorman from GoDaddy. Today's talk is about how we are building clouds with the Puppet OpenStack modules. You have the link to the slides if you want to download them; the same link appears at the end of the deck in case you miss it. Mike Dorman works at GoDaddy deploying OpenStack, as does Matt Fisher at Time Warner Cable, and I'm the Project Technical Lead for the Puppet modules for the next cycle.

For today, I don't expect you to be Puppet experts, so I will introduce what the Puppet OpenStack project is in our community. Maybe you missed the news, but the Puppet OpenStack project is now an official project under the OpenStack big tent. It is a set of Puppet modules that provide a way to install OpenStack: for each OpenStack project we have a Puppet module, and we also have puppet-openstacklib, a common library that configures the common bits shared by all the OpenStack resources, like databases and so on. So we provide you a set of modules, but we do not compose the Puppet manifests for you. That means that if you want to use the Puppet OpenStack modules, you need to write your own composition layer; we will see in the next slides how you can do that. Also, one of the best practices in all the Puppet work we are doing (and the speakers will talk more about this in the use cases later) is to use Hiera for data-driven deployments.

So how can you use the modules today? First of all, the modules are currently hosted under StackForge, and we are in the process of moving the repositories to the OpenStack git server. Today you have two options for deploying with the Puppet modules. The first is to select an OpenStack installer that uses them. This is good because you don't have to know much about Puppet or about how the modules work. The bad thing is that you can sometimes lose flexibility when you want to deploy complex infrastructures, depending on how the installer consumes the Puppet modules. So the installer route is fine because you don't have to know much about Puppet, but you can lose flexibility for complex deployments. These two have very good examples later, and as far as I know they don't use installers right now. That's why there is another way to use the Puppet modules: writing your own custom Puppet composition. This is more difficult, because you need Puppet experts, like these guys; they know a lot about how to use the modules and how to write manifests. But it is good because, if you have a very big infrastructure that needs custom drivers, custom plugins, or custom software that you want to plug into OpenStack, you can write your own Puppet manifests and stay very flexible. (A minimal sketch of what a composition layer looks like follows below.)
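To make that concrete, here is a minimal, illustrative sketch of a composition-layer profile that wraps the upstream puppet-keystone class. The mycloud:: namespace and the parameter values are made up for the example; in a real deployment the values would come from Hiera.

    # Illustrative only: a tiny composition-layer profile wrapping the
    # upstream puppet-keystone class. The mycloud:: namespace is made up.
    class mycloud::profile::keystone (
      $admin_token,   # in practice, supplied via Hiera
      $db_password,
    ) {
      class { '::keystone':
        admin_token         => $admin_token,
        database_connection => "mysql://keystone:${db_password}@127.0.0.1/keystone",
      }
    }

The point is that the upstream module knows how to configure Keystone, while the profile is where you encode your own choices: database location, HA layout, and so on.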
If you wonder which installers are using the Puppet modules today, some of them use parts, and some of them use the whole set. We can name Packstack, Staypuft, RDO Director, Mirantis Fuel, and Spinal Stack (the former eNovance product), and maybe some others we don't know about. But these are good examples of products that are using the Puppet modules.

As for the custom Puppet composition: like I said before, it requires a good understanding of Puppet. If you come on IRC and ask, "hey, how can I use the puppet-nova module?", you will need a good knowledge of how to read Puppet manifests before trying to deploy the Nova service. You will also need to figure out whether you want a Puppet master or not, and whether you want to use Hiera to drive the data in Puppet or not, which is usually a good practice. The modules today have, I think, no hard limits, and you can contribute if you think something is missing. If you think we still need a new feature, at the end of this talk we will explain how the community works and how you can contribute; it's not that difficult, and we will go deeper into that topic later.

I would like to say thank you, because of these statistics about who is using Puppet in the OpenStack ecosystem today: we are reaching almost 70% if you count the other products that are using Puppet. I think this is proof that people are using Puppet to deploy OpenStack, and as far as I know this is today the reference way to deploy OpenStack in production. So thank you very much, and I will let Mike continue.

Hi. I just want to talk through the use case at GoDaddy, how we deploy OpenStack with Puppet. We have a pretty large, internal-only private cloud. We use it primarily as a platform to deploy other products and services at our company, and we also use it a lot for dev/test: all the development teams in the company use it for their dev/test purposes as well. Some of our big goals in going to OpenStack were to gain efficiencies and take advantage of it so that we can innovate faster within the company. Fundamentally, it's about making everything self-service: eliminating humans from the process, getting tickets out of the way, having everything automated. That's a huge thing. It has allowed us to do POCs a lot quicker and get new products to market a lot faster; we have really streamlined the whole server-provisioning process for our dev and product teams within the company. Another benefit that we've seen is hardware utilization.
Prior to OpenStack, a lot of our applications and products just ran on bare metal. As a dev team you'd have budget for your production environment, so you'd go buy five servers or ten servers or whatever, the data center would put them in for you, and there you go. And we saw a lot of cases where the hardware utilization on those, CPU- and RAM-wise, was super low; I think the average across our whole company on strictly bare metal was something like eight percent, something ridiculous. So being able to virtualize all this stuff, and wrap all these other tools around it to get to a cloud-based infrastructure, has really helped us there as well. But the huge thing for us is leveraging the power of the community: we've got something like 5,000 contributors to OpenStack now, and that's just not something we could do on our own internally. That has been a huge win as well. We have multiple data centers around the world and a few different OpenStack regions, so we're running these things fairly significantly across our infrastructure.

We were already a pretty big Puppet user within GoDaddy, and have been for a long time; it drives the config management for most of our products across the company. I heard a quote one time: "the best configuration management tool is the one that you're already using." Think about the amount of effort it would take to convert to Ansible or Chef or your favorite one. That's a lot of effort, and we already had a lot of knowledge and experience with Puppet in-house, so we just continued with that for the OpenStack deployment. We use the modules from the OpenStack Puppet project to manage each of the OpenStack services, and then, like Emilien mentioned, we built our own composition layer around all of this. We were going to need to do that anyway, so it was a good fit to just drop in the OpenStack Puppet modules for our use as well.

We use the role/profile model for our composition layer. Basically, every server in the catalog is assigned a role: we have an app-server role, a pod-server role (which is like a network server), and a compute-node role, and we differentiate between our API-cell and compute-cell apps. The role defines which profiles go onto that machine; the profiles would be things like nova-compute, glance-api, or neutron-server, and that's how we define what goes on the machines. (A small sketch of what role classes can look like follows below.) We use Hiera extensively; that's actually how we do the role assignment. There's a tool called hiera-eyaml which allows us to encrypt the credentials and other secret information that we have to store in the config, so that it's not just floating out there in clear text. In addition to deploying the OpenStack pieces, we use Puppet to configure everything else on the box as well, so it's kind of an end-to-end solution for us. We have this notion of a "world", which for us is what differentiates between the dev environment, test, stage, and then the different regions in production. That's kind of our top-level config categorization and how we define what goes on all the boxes.
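Here is a minimal sketch of the role/profile pattern as just described. A role is nothing more than a list of profiles; all the class names are made up for the example.

    # Illustrative role classes: a role is just a list of profiles.
    class mycloud::role::api_server {
      include ::mycloud::profile::base        # things every node gets
      include ::mycloud::profile::keystone
      include ::mycloud::profile::glance_api
      include ::mycloud::profile::nova_api
    }

    class mycloud::role::compute_node {
      include ::mycloud::profile::base
      include ::mycloud::profile::nova_compute
    }

Each machine is assigned exactly one role, and everything else follows from that.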
So that's kind of our top-level config cataract is Catered Categorization And how we define what goes on all the box another thing to notice we we use quite a few of the future parser features that which is now What the puppet 4.0 language is there's a bunch of little things in there that we took advantage of so we were always on a fairly Later version of puppet I'm not going to talk to this as a hyra structure if there's more questions on roles profiles on that stuff We can talk about it later or you can look at it Online if you wish So I mentioned we use a bunch of the open-stack puppet modules to manage the individual open-stack services So that's that list there then of course we've got a bunch of other modules that we pull in from the forge You know the standard lib one ha proxy Apache all kind of a lot of the standard ones that you would normally need anyway Then we've got a few that are just custom to us that deal with integrating with some of our internal systems That aren't really useful to anybody outside. So those are all just inside We have an internal enterprise github and we just do We pull in from the upstream repos kind of as an odd-needed basis and push them into our internal repo and That's what that's where we deploy from we deploy from our our internal github. I Mentioned that we are using a lot of the future parser stuff The the internal puppet master infrastructure that we have in the company was on on version That was too old for some of the things that we needed to do So that kind of drove a decision to go with masterless puppet So we We use the r10k tool to get all the modules on the right versions and all that kind of thing Locally on the machines and then we literally just do a puppet apply to apply the config on all the machines puppet Doesn't do our great great job at multi node orchestration type stuff. So we have some very basic Ansible playbooks that we use to kick off these masterless puppet runs and all the machines That's how we do our our deployments and manage them across all the machines We'd really like to get to this place where you know We have a full CICD pipeline where anytime somebody does a commit it kicks off all the tests and then when those pass it deploys the dev and all these other gates and automatic deployments and We've we're just not there yet, but it's definitely You know a place that we're kind of driving towards and we want to get to Some of the recommendations that I would have for best practices are just like don't fork anything like really do whatever you can to not do it When we first started this a year or so ago, we were in a really bad situation like we not only forked but The way we imported the modules from upstream is like we did a git clone and then we just like copied the files Over to our internal repo and then push that up. So we didn't even have common histories. I mean it was a Horrible mistake. It's very difficult to get out of that place Not only to get out of it Much less like bring in the newest stuff from upstream So do whatever you can to not do that if you absolutely have to like there's a feature that you need that you have to implement yourself Do it on a local branch and then you know go through the review process that Amelia You'll talk about later to bring it back upstream. 
Some of the recommendations I would make for best practices: don't fork anything; really, do whatever you can to avoid it. When we first started this a year or so ago, we were in a really bad situation: we not only forked, but the way we imported the modules from upstream was to do a git clone and then just copy the files over to our internal repo and push that up. So we didn't even have common histories. It was a horrible mistake. It's very difficult to get out of that place, much less bring in the newest stuff from upstream. So do whatever you can to not do that. If you absolutely have to, because there's a feature that you need that you have to implement yourself, do it on a local branch and then go through the review process (which Emilien will talk about later) to bring it back upstream. Not only can other people take advantage of it, but you get off of your local fork much more quickly. It's a much better place to be in if you can avoid those forks.

A couple of other things. Try to keep the composition module pretty simple. Those things have a tendency to get really complicated, with a lot of weird little dependencies and conditionals in them, because you think, "oh, I just need this one little exception for this guy over here." All that stuff adds up over time; it quickly becomes a mess, and it gets difficult to read and understand what it's doing. One way I've found to handle this is to make the profile classes as granular as you can. Imagine you have a single Nova profile class. Well, Nova really has a bunch of different components: there's nova-compute, nova-api, the conductor, the scheduler. Maybe on day one you have all of those on the same box, and when you say "this is a Nova box" it gets all that stuff. But you might get to a point later on where some of that gets split out, and you now need nova-compute just on the hypervisors and everything else on the API machines. Now you have a single monolithic profile class, and you're writing all these weird conditionals to get it to do the right thing in the right places. Whereas if you create one profile class for nova-api, one for nova-compute, one for nova-scheduler, you have them all split out to begin with, and it's very easy to distribute that stuff out later (see the sketch below).

Another mantra: you shouldn't have to run Puppet multiple times to get things into the final state they're supposed to be in. Anybody who has actually used Puppet for any amount of time realizes that's much easier said than done, but it's definitely something you should strive for. Don't deliberately make decisions in the manifests like, "well, we can just run Puppet twice to get this done." Really try to do it cleanly from the beginning, so that one run gets everything to the state it needs to be in without having to do it multiple times.
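A sketch of that granular split, using class names the upstream puppet-nova module provides; the profile names are made up, and required parameters (for example nova::api::admin_password) are assumed to be supplied through Hiera data binding.

    # Illustrative: one profile per Nova component instead of one
    # monolithic "nova" profile, so components can move between machine
    # types later without conditionals.
    class mycloud::profile::nova_api {
      include ::nova             # common Nova configuration
      include ::nova::api        # required params come from Hiera
      include ::nova::conductor  # could be split further if needed
      include ::nova::scheduler
    }

    class mycloud::profile::nova_compute {
      include ::nova
      include ::nova::compute
      include ::nova::compute::libvirt
    }

On day one both profiles might land on the same role; splitting them later is then just an edit to the role class, not a rewrite of the profiles.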
So I'll hand it over to Matt to talk about his use case.

OK, Matt Fisher again, from Time Warner Cable. You'll notice a lot of similarities between the things we do and the things Mike does; what's interesting is that we kind of came to them independently, so at least that proves to me, in my head, that it was a really smart thing to do. Mike's a really smart guy. We have a similar use case to Mike's: we run a multiple-data-center private cloud for internal use. We do things like website hosting, CI/CD, and video content; we even host a reverse-911 system on there. And we have the pretty typical reasons to run OpenStack that everyone else usually has: speed, self-service, and innovation. You used to wait weeks for a VM, then you'd have to wait more weeks for an IP address, and then more weeks for the security team to create your firewall access. OpenStack solves this for us; a team recently told us they got more done in three days with OpenStack than they had in the past six months, which was pretty cool. The other thing is that OpenStack is a great platform: it's easy to add on the services that you want. Right now we're working on adding Designate and LBaaS, and that's also just part of the innovation OpenStack provides with new services. And OpenStack is getting better every cycle, as everybody knows: faster, more reliable, easier to use.

OK, so everyone knows why OpenStack, but why Puppet? Our division had some previous experience with Puppet. Our team really did not, but in our division it was used and highly regarded, so we kind of started with that. Also, when we first set up our OpenStack environment we worked with Cisco, and Cisco showed us a tool called Puppet OpenStack Builder. We used that to stand up a lot of our proof-of-concept stuff, and then our composition layer kind of grew out of that. Basically, vendor lock-in is a no-no for us, so not wanting to use an OpenStack distro eliminates some of the tools that are available. Finally, the Puppet community in OpenStack is great. When I started this I actually hadn't used Puppet at all, and the community was really helpful, answering questions through email or IRC and taking patches. It was great.

OK, so our composition layer. It's similar to what Mike said: we use the roles-and-profiles pattern. This is what everybody should use, and they should go read Dan's book about it. As an example, we have a control role, and a control node consists of nova-api, neutron, cinder, MySQL, etc. We try to break those profiles down small, like Mike mentioned; we don't always do the best job of it, it's kind of on an as-needed basis. We did have exactly one Nova profile that we had to split into a compute profile and an API profile, so that's a perfect example of what Mike described. Regardless of the type of role, all nodes get a base profile, which gives you the basic things that you need: SSH keys, NTP servers, things like that. This way, when we're developing Puppet modules for new things like Swift, those nodes come in at least minimally available, without someone having to manually copy SSH keys over. The other thing the composition layer does (Emilien kind of alluded to this) is define your architecture. We run Keystone and Horizon on top of a global, cross-data-center Galera cluster, and Horizon actually runs on what we consider a Keystone node, so our Keystone role includes this global Galera profile and a Horizon profile. The Puppet OpenStack modules do not define that you run Horizon on top of Keystone; that's a decision we made, and the composition layer implements it for us. Another thing Mike mentioned: ours has gotten large and complex. We have over 1,200 commits and over 5,000 lines of Puppet code in our composition layer. But to be fair, we also use this module to manage our infrastructure, so all our CI/CD tooling and our VPN servers are profiles as well; it's not all OpenStack.

OK, so I've discussed using roles and profiles, and Mike showed you a Hiera diagram; we do things a slightly different way. The question is: when I install a box, how do I decide whether it's a Keystone node or a Ceph node or a Swift node? We actually do this through hostnames. To give you an example: yvr1keystone02. YVR is the airport code for Vancouver; we use a sort of geographic prefix, called a CLLI code if you're familiar with the telecom term. This hostname defines not only what type of node this is, but whether it's production or staging or dev. So how do we do that? I'll go through it real quick. The first piece is the site iteration. We had the idea when we started that you might have more than one cloud in a data center, so in this instance it's cloud 01, the first stand-up. The second piece, the middle, is the role. We basically just parse out the middle, and the node magically gets that role and gets all the profiles associated with it. The final piece is just there so you have unique hostnames; it doesn't really mean anything for the most part, although there are a couple of things we only do on node one or node three. You don't need to run MySQL backups on every single node; I think we only run them on node three. (A rough sketch of the parsing idea is below.)
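A simplified sketch of deriving the role from the hostname. The exact naming scheme and this regex are guesses for illustration, not the real implementation.

    # Illustrative: for a hostname like "yvr1keystone02", derive
    # site "yvr1", role "keystone", and node number "02", then include
    # the matching role class.
    if $::hostname =~ /^([a-z]+\d+)([a-z]+)(\d+)$/ {
      $site = $1   # e.g. 'yvr1': which cloud stand-up
      $role = $2   # e.g. 'keystone': picks the role class
      $node = $3   # e.g. '02': mostly just for uniqueness
      include "mycloud::role::${role}"
    } else {
      fail("hostname '${::hostname}' does not match the naming scheme")
    }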
To go back to the first piece: the site iteration does actually use Hiera. We have a Hiera file for YVR1 (Mike might call these "worlds"), and it will basically say that YVR1 is prod. Splitting that stuff out in Hiera is great: in prod you want to get PagerDuty alerts at three in the morning; in dev you do not want to get PagerDuty alerts at three in the morning. So having that split, and having the concept of prod, staging, and dev, is important to us.

OK, and in addition to the OpenStack modules, we use over 70 modules in total. We have tons of modules: SSH, VPN, we do Ubuntu repo mirroring (that's a couple of modules), things like that. We have a local GitHub and we also use some things from github.com. We have a very strong ("strong" may not be the right word) rule that we do not fork. We'll basically make a change and immediately do a pull request or submission upstream, and the minute it's merged we pull the fork back; if we can actually wait a week and just wait for the change to come in, we'll do that. We did get forked off in the past, with puppet-neutron: we ended up getting something like eight months behind, and getting back was miserable.

Also, we have four major custom modules that I want to talk about, and about five smaller ones that I'm not going to go into detail on. The first one is an Icinga module. That module is actually pretty complicated: the checks you get are based on roles, so a Keystone node has a certain set of checks and a compute node has a certain set of checks, which means this module actually has its own roles and profiles, and we use exported resources to manage the Icinga checks. We also have an HAProxy composition module that basically manages all the HAProxy config, and an internal infra module that handles some of the infrastructure pieces, like Jenkins jobs. Finally, we have a custom Keystone module. This is not to deploy Keystone; it's to use Keystone. We have a Hiera hash of a bunch of users, roles, and tenants, so that anytime we stand up an environment (dev, whatever) we get the predefined users we want, like the Icinga user. (A sketch of the idea follows below.) It also handles things like a token-expiration job: we're using UUID tokens, so we'll eventually be able to get rid of that when we switch to Fernet.
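A sketch of that "use Keystone from Hiera" idea, using the keystone_tenant and keystone_user resource types that ship with the upstream puppet-keystone module. The class name, Hiera keys, and data layout are assumptions for the example.

    # Illustrative: create predefined tenants and users from Hiera data.
    #
    # Example Hiera data (values would be encrypted with hiera-eyaml):
    #   mycloud::keystone::users:
    #     icinga:
    #       password: 'supersecret'
    #       tenant:   'services'
    class mycloud::keystone_users {
      $tenants = hiera_hash('mycloud::keystone::tenants', {})
      $users   = hiera_hash('mycloud::keystone::users', {})

      create_resources('keystone_tenant', $tenants)
      create_resources('keystone_user', $users)
    }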
So how do we do deployments? I'll just read through the slide, and you can see this is the outer layer going in: we manage deployments with Jenkins, we orchestrate them with Ansible, we deploy code with r10k, and Puppet actually does the real work. Starting with Jenkins: when we go do a deployment, you sign into Jenkins, press a button, and the deployment goes. We use Jenkins because it gives us access control and a shared console where we can see when something goes wrong; it actually posts updates to HipChat for us, and we have a history, so we can go back and see what happened in the deployment two weeks ago. With all that fancy stuff, Jenkins is really just running an Ansible job.

Ansible is our multi-node orchestration layer, and I learned pretty quickly why you might need one. We did a deploy (this was before production) back when we were using something called MCollective, and MCollective will run Puppet one node at a time. But I was way too impatient for that, so I told MCollective to just go run Puppet everywhere at the same time, because I was bored and tired of watching it. What I found out was that the code change that day had changed the MySQL and Galera config, and a Galera cluster, when you restart all the nodes at the same time, does not come back. So I spent the rest of that day reading up on Galera and getting the cluster back up and running, and that was kind of our lesson: we really needed to solve this problem. So we use Ansible to enforce an ordering. Ansible says: if you have nodes in a cluster, such as our Keystone nodes, you can only run them one at a time. In addition to that, Ansible will run a pre- and post-Puppet-run check to make sure the cluster and the node are healthy. For Galera it's a simple Galera check, the idea being that if we break the Galera cluster, we want to know on the first node; we don't want to find out when we're done with the deploy. We can recover from one node dying; we can't recover from everything. I mentioned r10k: we used to use librarian-puppet, but r10k is actually what checks out all the modules now. It's faster, it has better support, and we're pretty happy with it. So all these are little layers, but really Puppet is doing all the real work: Puppet is what's installing packages, creating users, setting up services, configuring everything. If you boil this down, Ansible is really just running Puppet agents and collecting logs.

OK, so my best practices; a lot of these are the same as Mike's. Use Hiera to separate config and code, and use the automatic data binding when you can (a small sketch follows below). You will think, "I only want this value set this way," so you put it in the manifest. You're going to find that the value is different in an east data center than in a west one; it's different in dev and staging; it's different on the next iteration of OpenStack than it was before. This is just a pretty basic Puppet practice that I recommend everyone adopt. And with hiera-eyaml: we have SSH keys, we have SSL certs, and those should be encrypted if you're going to check them into git.
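A minimal sketch of automatic data binding, assuming the puppetlabs-ntp module; the class name and values are illustrative.

    # Illustrative: the $ntp_servers parameter is automatically looked up
    # in Hiera under the key mycloud::profile::base::ntp_servers, so each
    # world (dev, staging, east/west prod) can override it in data,
    # not code.
    #
    # e.g. in an east-production Hiera file:
    #   mycloud::profile::base::ntp_servers:
    #     - 'ntp1.east.example.com'
    class mycloud::profile::base (
      $ntp_servers = ['0.pool.ntp.org', '1.pool.ntp.org'],  # fallback default
    ) {
      class { '::ntp':
        servers => $ntp_servers,
      }
    }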
Code reviews are something we're really big on. Everyone will talk about infrastructure as code, but not many people actually review it. We review everything: Jenkins jobs, Ansible, Puppet code, Hiera; everything goes through review and a pre- and post-merge test. (Someone asked how we review: we use Gerrit, and we run our own Gerrit server.) Have regular and frequent deployments. The longer you wait between deployments, the more problems you're going to have: everything gets bloated and impossible to test, and you have a huge mess of three weeks of code changes going in at once. Right now we typically do these deployments once a week; I think I'd like to get us doing them more often than once a week, so they're less scary. Finally, participate in the community. This is a really great community; I think it's easier and friendlier to get a patch in here than probably anywhere else, and participating has really allowed us to sort of set direction, have discussions, and make sure things are going the way we want.

All right, Mike again. A lot of this will be kind of a repeat, but Matt and I wanted to coalesce our best-practice recommendations. We've pretty much already covered this stuff, and that last bullet kind of goes without saying: you're not going to solve all of this on day one. You're going to start somewhere and then plan on iterating to make it better gradually; you're going to learn more about the best ways to do things for you, so just plan on iterating and making things better. But we wanted to talk through some of the challenges, and some other points to think about, in using Puppet to deploy OpenStack. One thing to think about is whether you want to track the master branch of the Puppet modules or the stable branches. It's the typical trade-off: master gives you newer features sooner, but it has the potential to break you; the stable branches are a little slower to get new features, but by definition they're going to be stable. So that's something to think about. We both have internal GitHub Enterprise systems, which are very handy for pulling stuff down from upstream and pushing it in; if that's something you've got, you may want to do that, or if you're comfortable pulling directly from github.com, that works too. It's another up-front decision to think about when getting started. And related to that: what's going to be your method for keeping up with the upstream module changes? I think Matt's team pulls down roughly weekly; we kind of just do it on an ad hoc, as-needed, manual basis.
So that's something to think about: if there's a lot of stuff from upstream that you want to stay in sync with, you might need to have some system in place for that. Another challenge that I think we've both seen is being able to find folks with the right skill sets. I think we all know that knowledge of and experience with OpenStack is pretty valuable to employers, as are people who know Puppet really well, and finding somebody in the middle of that Venn diagram is challenging. Not only finding them, but hiring them away from the five other places that would also like them to work there. So that's just another point to think about. Now our esteemed PTL will tell you a little bit more about how you can get involved in the community and help contribute.

Thanks, Mike. So I will finish the talk with how to push contributions to the community. I think we all already understand the benefits of using the upstream modules compared to forks or custom modules: you get the feedback from the other people who deploy OpenStack on their infrastructure, so you take advantage of the work of others. Also, and we are working hard on this, we have extensive gate testing right now. That means that if you push a patch to the modules, we will actually deploy OpenStack and try to validate that your patch doesn't break anything; for example, we are using TripleO jobs today to deploy with the OpenStack modules, which is a good use case in the public upstream. Getting involved in the community is also how you bring the features that you need for your enterprise upstream; and if you want to become an ATC, you can push a patch for that, too.

So how do we work together? We just use the OpenStack workflow, so I'm not going to spend much time on this. If you need to submit a patch, you just clone the Puppet module, create your local branch, run the tests, write the patch, and use Gerrit to push your patch into the OpenStack Gerrit. There is a page on the OpenStack wiki which explains how to contribute to OpenStack. Here is the kind of contribution we like: when you submit a patch, it's not just adding some Puppet code; it should have unit tests, functional tests, documentation, and a good commit message. If you found a bug, please file it on Launchpad and explain what you are doing in the commit message. Of course, we use the Puppet conventions for syntax and lint, and we try to respect the same conventions across all the modules. And of course we maintain backward compatibility: if you drop parameters or change the default value of something, it has to stay compatible; we don't break master for a new feature, so it's quite safe today.
We have a backward-compatibility policy that we make sure to keep. And of course, if you want to bring in a new feature that requires some discussion, you should create a blueprint and discuss it with us, usually on the mailing list or on IRC. The kinds of contributions we don't like so much: trying to solve packaging issues with Puppet; trying to create users and groups with Puppet; taking default values that are wrong in OpenStack and trying to change them in Puppet; not writing a commit message, so that we don't know what you are doing; and tests written the wrong way, though in that case we try to explain on Gerrit how we like them.

I have one minute to explain why we moved under the big tent. We truly think that moving under the big tent will help us get more contributors and more consistency with the other OpenStack projects, and we will also be able to use the OpenStack ecosystem, like the infrastructure CI. This is all about adoption today: we are trying to get more adoption from all the people who deploy OpenStack. So if you want to join us, feel free: we have an IRC channel for that, and we use the openstack-dev mailing list. Also, tomorrow, all day, you are welcome to join us for the collaboration day (we have a full room for Puppet), and on Wednesday morning, of course, we will be here for our session. So feel free to join us; you are welcome, and we will be pleased to talk about the Puppet modules. Thanks, everyone.