So, my name is Michael Chapman. I work for Aptira, an OpenStack services company based out of Sydney. This talk is a little bit experimental: I spent part of the last six months working on these kinds of problems around orchestration and config management, and how they come together using Puppet. It was inspired by particular issues that we had in the upstream Puppet modules.

So just to start off: Serf is a little agent that you can install on all of your nodes, and it uses a gossip protocol to communicate between them. It's not a clustered service like Galera or something like that, where the nodes are tightly coupled; it's using gossip, so it's more like a mesh. We can send out events and we can send out queries, and the agents automatically create events when they join or leave the cluster, or when they fail. We can then register an event handler that will do something when a node runs a query or receives an event. This graphic is off the Serf website: nodes send messages, and those are then forwarded on to other nodes, so the idea is that it's very difficult to actually shut down the cluster completely due to a node failure.

Puppet I'm hoping you're familiar with, because I don't have time to go through all of what Puppet does, but it's a declarative, graph-based config management tool.
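For concreteness: a Serf agent takes a small JSON config file, and handlers for the lifecycle events mentioned above are attached there. A sketch, with invented node name and script paths:

```json
{
  "node_name": "volume01",
  "event_handlers": [
    "member-join=/usr/local/bin/on-join.sh",
    "member-leave=/usr/local/bin/on-leave.sh",
    "member-failed=/usr/local/bin/on-failed.sh"
  ]
}
```

Each entry binds one event type to an external command; the later examples in this talk hang off the same `event_handlers` mechanism.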
Puppet has trouble dealing with cross-node things. It's very good at configuring single nodes, but once you want to order things between nodes it gets a little awkward, because it wasn't really designed for that from the start. It has a very strong community, and in my unbiased opinion as a maintainer, the OpenStack modules are very good.

At Aptira, as I said, we're a services company, so a lot of the time we deploy clouds for people. We run masterless Puppet. This is because I like the idea of having individual nodes be responsible for themselves, rather than having a central service that we then have to monitor and maintain. Lots of places run masterless Puppet; it's neat from a lot of perspectives, but particularly for cross-node dependencies you need to invent things in order to get it to work. We use Hiera; we don't have a custom ENC or anything like that, so all of our data is just sitting in YAML. We have a very large idempotent bootstrap script which we run straight after nodes come up. It's responsible for installing Puppet, bringing the data in, making sure the FQDN is correct, and making sure NTP is working, just for certificate signing purposes. And we do a just-enough-OS install from PXE to start off. That's pretty much everything, in roughly reverse order. Some of the stuff I'm talking about isn't quite done yet.
Sorry about that; it's still interesting. So if you're sitting in the audience thinking you'd really like to write a config management tool, or any DevOps tool in general, OpenStack is a really good test case for it, because it's kind of nasty to deploy. There are a lot of cross-node dependencies: for example, API nodes need databases, and API nodes need message queues. You've got things like Keystone, which is required by all of the other OpenStack services, and strange things start to happen if it goes away. You have dependency cycles at the role level, where Nova might require Neutron and Neutron might require Nova; most of those are fairly new, and they've come in because of notifications.

There are lots of stateful parts, so we generally can't just delete things and bring them straight up again clean. If we blow away our state database, that would be a problem; if we blow away virtual machines, customers get angry. We have security issues: on compute nodes, some people get kind of paranoid about people breaking out of hypervisors, so we can't just throw everything into Hiera and put all of our database passwords and that sort of thing in there. We have to break it out. The config files are absolutely enormous, to the point where templating is almost useless. I don't know how many of you have actually written templating for nova.conf, for example, but it gets pretty bad. And the upgrade procedure, as most of you probably know, is pretty nasty. I'm not really going to solve any of these problems today, but we'll try.

So cross-node dependencies is the first one. It's awkward in anything which is based around dealing with a single node, for pretty obvious reasons. Puppet apply on our Nova node depends on the DB; we need some way to encapsulate that, and to sit there and try to connect to it while the DB comes up. Exported resources would be one way that some people might do this.
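That "sit there and try to connect" wait is, at heart, just a retry loop. A minimal Python sketch; the function names and the TCP-probe stand-in for a real MySQL client check are my own illustration, not anything from the modules:

```python
import socket
import time

def wait_for(check, tries=60, delay=10):
    """Retry a readiness check until it passes or we run out of attempts."""
    for _ in range(tries):
        if check():
            return True
        time.sleep(delay)
    return False

def db_port_open(host, port=3306, timeout=2):
    """Cheap liveness probe: can we open a TCP connection to the DB port?
    A real check would run the mysql client with the full connection
    string, which also verifies the credentials are legit."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Before starting nova-api, block until the database answers:
#   if not wait_for(lambda: db_port_open("db.example.com")):
#       raise SystemExit("database never became ready")
```

The same shape (N tries, fixed sleep between them) is what Puppet's exec resource gives you for free, as described next.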
I kind of hate exported resources and I don't use them, because they bring centralization back to Puppet. If I'm going to run puppet apply, the whole point of that is removing the centralization; if I then use exported resources to solve this, I may as well just run a puppet master and do the whole thing.

Just a quick sidetrack: I've seen a bunch of people using Ansible to drive Puppet, and that's really exciting. I really like it; it does a lot of good stuff. It gets you cross-node orchestration at the role level, so you can say "bring up all the DB nodes, then bring up Nova, then do this, then do that", and that works brilliantly. For those of you who prefer the Chef style, that's really natural, where the dependencies are implicit. But in the Puppet style you will generally want to explicitly declare all of your dependencies: Nova's service has a dependency on the MySQL service, and we want to express that at the resource level, not at the role level. We don't really have a way of doing that if we're just wrapping up puppet apply in an Ansible playbook.

So the easiest, nastiest way is to just use an exec, and it works pretty well. Our command just runs the mysql client against a host and makes sure that it's alive. So before Nova API starts its service, it's going to check that the MySQL connection string it has is actually legit. If it sees that the DB is dead, it's not going to do anything, but it's going to retry 60 times, and between each try it's going to sleep for 10 seconds. It's quite neat that the exec resource actually kind of had this use case in mind, where we just sit there and wait for things to actually be ready before we start them.

This is also really useful even when we're not doing cross-node dependencies, even if the DB and Nova API are on the same node because you're running a converged control plane. When
you run `service mysql start`, that returns at some point, usually when the socket is ready or something like that, but it may not actually allow you to run commands against MySQL at that point. It's generally some time a little bit later that it will accept connections, and accept connections that actually do something useful. So when Nova checks against this local MySQL, it's also benefiting from that validation check.

The much more complicated case is where we can't do validation against a network resource. Network resources are easy: is the port there, can we do something to it? Okay, great. There's this really weird case that we had upstream in the Puppet modules where there was nothing really being exposed that we could look at to say "has this been done?". My attempt was to use Serf to do this. The exact case is: we want to create a volume type, which is an API command that belongs on the Cinder API node, and after that happens we have to restart all of the Cinder volume daemons. I'm not sure if this is still the case in Cinder, but it was certainly the case when we tried to do this maybe six or so months ago.

So what we can do is register an event handler in the Serf config, saying: I have an event, its name is volume-refresh, and the command it's going to run is `service cinder-volume restart`. Then if we just run `serf event volume-refresh`, that event will get sent out to the nodes, and any of them that have this event handler will run that command. So within our Puppet module it looks like this: whenever a Cinder volume type is created, we notify our exec, and it's going to refresh all of the volume servers in the cluster. Again, I'm using exec because I'm a horrible person.
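Serf invokes handlers as external commands, passing the event type in environment variables such as `SERF_EVENT` and `SERF_USER_EVENT`. The handler from the talk is a one-line shell command, but the same thing written as a Python handler script might look like this (the service name and script path are assumptions):

```python
import subprocess

def dispatch(env, run=subprocess.call):
    """Serf-style event handler: react only to the 'volume-refresh' user
    event by restarting cinder-volume. Restarts are idempotent, so seeing
    the event more than once is harmless; every other event is ignored."""
    if env.get("SERF_EVENT") == "user" and env.get("SERF_USER_EVENT") == "volume-refresh":
        return run(["service", "cinder-volume", "restart"])
    return 0

# The real handler script would end with:
#   import os, sys
#   sys.exit(dispatch(os.environ))
```

It would be registered in the agent config with something like `user:volume-refresh=/usr/local/bin/volume_refresh.py` under `event_handlers`, and fired cluster-wide with `serf event volume-refresh`.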
So that's pretty neat. It's kind of uninvasive, it didn't require a whole lot of work, and Serf itself is really easy to deploy, so there's not a whole lot of overhead. It works very well when the thing that we're doing is idempotent: restarting the volume service is an idempotent process, and it doesn't matter if we do it several times. It would matter if we did it a thousand times all day, because obviously the volume server wouldn't really be alive then, but it kind of fits into the Puppet ethos of restarting things whenever you need to. It also works when we have a dead node, because if the node is dead then the service is probably dead, and when it comes back up the service will be alive, and that counts as a restart.

But if whatever we were doing had a side effect, so it wasn't idempotent, then this starts to get a little bit more complicated, because if we had a dead node that didn't catch that event and do anything, now whatever side effect we're expecting is not there. You can kind of mitigate this by allowing events to be replayed when a node rejoins the cluster, but in the volume-restart case that wouldn't really make sense: if we made 10 volume types on our API nodes, that's going to send out 10 refresh events, and we probably don't want a node to come up and immediately refresh the volume service 10 times. It's also pretty nasty because if someone did take over one of our nodes, they can just say "hey, restart the volume service" constantly and DDoS our volume service. I'm going to get back to this later, because I don't really have a solution.

So Serf has two kinds of things. It has events, where we just push something out to the cluster and things happen, and we don't really pay that much attention to acknowledgements or things like that. And then we have a query, where we run on our node, we push our query out to the cluster, and then we wait to see if there's some response; after some specified
timeout, we say: okay, we're not getting any more responses. So we can do some kind of cool discovery things with this, which is neat as well. It's also an area that isn't really built into Puppet.

So we have an event handler, except instead of an event it's a query. We'll call it rabbitmq-node, and what it's going to do is run a status command and then grep through the output of rabbitmqctl status, which is in its own funky format, and get the short name of the node, which is what we use to do the clustering. The response is neatly formatted, so we know where the response came from and we know what the response was.

We can also restrict which nodes respond to our query using tags. You can set multiple tags for things, and those of you who have done a little bit of Puppet work and use roles and profiles can immediately see that this is a really nice match for roles and profiles. So we set profile names to match tags, and then we say that only a particular profile should respond to a particular set of queries.

You can also write a Hiera backend for this. We'll call our backend "serf", and then we can invent a syntax for it where we say we want to bind the parameter rabbit_hosts in the nova class; the query would be called rabbitmq-node; we add some tags to it; and we might want to say that this is a particular type, so that we can query to see what rabbit nodes there are, get a list, and concatenate those together.

In order to be sure that the thing we queried is actually what we expect it to be, we could look at our response and compare it against a list of role mappings. That might be our first attempt at not getting crappy data from things that are saying they are one thing but are actually another. If you recall back here, the response says where it's come from, and we could compare that against our role mappings pretty simply. The problem is that in Serf, if a node
leaves the cluster, another node could then take that name and pretend to be something else. So what we really need, essentially, is keys and certs in the traditional Puppet sense: something that's going to sign certs, send them out to nodes, and register them. I see this as one half of the useful bits that the puppet master does: one half is compiling catalogs and handling the code distribution aspect, and the other half is the authentication aspect. If you're running masterless, you can already handle the code distribution aspect using packages pretty easily, but the CA aspect is not handled at all. So what we want to do is go from the first situation to the second, where this node says: I'm a volume node, and I have a cert that says so; you're a compute node, so you're not allowed to tell me what to do, despite the fact that you claim to be an API node.

The other neat things that we can do are kind of obvious, but you can create more events to do things like setting up your nodes before you run Puppet. In our case, we want to install Puppet and Hiera, and we might make some RPMs that include all the Hiera data. We can use it as a kicker, which will just run puppet apply across the whole cluster. We could make events that apply specific profiles, which is a kind of neat idea. Think about the token services in OpenStack: the way memcache works is that we have a list of memcache servers, and when a lot of memcache servers go down, it increases the amount of time that it takes to find a token. So what we want to do as a result is remove memcache nodes from the list of used nodes in Nova quite quickly after they start having problems. By applying things per profile instead of applying the entire catalog, you can react to those changes and preempt performance problems. You can also handle the join and leave
events by running puppet apply, and do the same thing with failed events. The thing to note here is that for almost every single event, the response is just to run Puppet, because Puppet is checking that the state of the current node is correct. You don't want to do things like watching for a particular node leaving and then changing a config file directly, just because that would be faster than running Puppet; the idea is to get Puppet to run quickly enough that you can use it for its intended purpose.

One of the last things that I tried: someone asked me if we could set up a system where the database team is responsible for the database and all of the data that goes into Hiera for the database, plus the Puppet modules and all the deployment for it. So the DB team is responsible for their part, the message queue team is responsible for their part, and then somehow they all communicate together; but, for example, the message queue team wouldn't need to be putting commits onto the Hiera data git repo for the DB. That lets us control access, but we do need to define some sort of API so you can do this.

One of the ways that I tried out, with a spelling mistake, is to define an event handler called hiera-query, and all the handler is doing is capturing things from the payload. When you run a serf query or a serf event, you can give the query name and then any payload. So if I run `serf query hiera-query <some key>`, I'll get back the values for that key from all of the nodes with this event handler. That allows the DB nodes to have all of their Hiera data while no other node has it, so it allows the DB team to have total ownership over everything they keep in Hiera. You can also do the same thing with Facter, where you could ask: what are the facts on the DB nodes, or what is the value of this particular fact on a DB node?
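A minimal sketch of what that hiera-query handler could look like in Python, assuming Serf's convention of delivering the query payload on the handler's stdin and returning the handler's stdout as that node's response; the keys and values here are invented stand-ins for a real Hiera lookup:

```python
# Stand-in for this node's private Hiera data: only the DB nodes would
# actually carry these keys on disk (values here are fake).
LOCAL_HIERA = {
    "db::bind_address": "10.0.0.5",
    "db::max_connections": "500",
}

def handle_query(payload, data=LOCAL_HIERA):
    """Answer a 'hiera-query' Serf query: the payload is the key to look
    up, and the return value is what this node responds with. Nodes that
    don't hold the key answer with an empty string."""
    return data.get(payload.strip(), "")

# A real handler script would end with something like:
#   import sys
#   sys.stdout.write(handle_query(sys.stdin.read()))
```

The querying node then collects one answer per responding node, which is what lets the DB team keep the data local while still serving it to the rest of the cluster on demand.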
I haven't found a solid use for the Facter version yet. So what we end up with is a totally decentralized messaging system sitting underneath Puppet, which Puppet can use to fill in orchestration gaps. Security is definitely an issue, but one of the nice things about it is that it can't be killed. I think it's quite cool, but it's also not very finished. It kind of follows this ethos that I've seen promoted in some circles, which has a horrible name, "antifragile": can I move all of my state to the edges, and then allow the nodes to all query each other and maintain their own cluster status, rather than having central command and control systems? So that's all I have. I'm probably way under time, but I thought I'd leave plenty of time for people to tell me how wrong I am.

[Audience] Thanks, that was awesome. Talking about unfinished stuff at the end: along those lines, have you checked out Consul yet?

[Michael] Yes.

[Audience] Awesome. Because it starts to, I think, fill in some of the blanks that you've discovered and hacked around with this first iteration of this system, and it looks like they're going to do it in a pretty elegant way as well.

[Michael] So Consul is built by the same people, and it's built on top of this. Consul is the centralized service that you need to complement this; it's the next body of work to me. There's a little thing that you have at the bottom that allows the nodes to form a mesh, and there's a thing that you have at the top that allows you to verify the authenticity of the nodes within the mesh. I agree, it's pretty cool.
[Audience] And I like the fact that you built this system to work around something that is missing in OpenStack, basically. So maybe I'm a bit off topic, but did you think about the fact that there's no coordination service in OpenStack itself, and that some of that could be provided by OpenStack to the services that are going to do the configuration? Is it something you've considered, something like Consul or Serf as part of OpenStack, for coordination of services or nodes, or adding new nodes?

[Michael] It would be interesting to see OpenStack use that. Gossip is not a protocol that is consistent, and we rely on consistent data within OpenStack today, so it remains to be seen whether OpenStack would handle that not being the case. I guess one place for it to go would maybe be TripleO, since they're actually building tooling around OpenStack instead of just OpenStack itself; that might be a place for it to live.

[Audience] What I was wondering about with the security question: I know you run everything masterless everywhere, but wouldn't having a puppet master in this case take care of some of that? Because you've got the back-end certificate system...

[Michael] Yes.

[Audience] ...and authentication system, which would pretty much eliminate that problem.

[Michael] Yeah, it gets rid of it. Well, you'd then have to build some little tools on top of Serf to use the certs to actually check the results, but yes. Cool. That's all I have.