Alright, next up we have Gareth, who will talk about SaltStack and building a self-healing system. Thank you. Wow, you haven't even done anything and you get applause. That's really awesome. My reputation precedes me. Can you guys hear me okay? Yes? No? Yes? Thumbs up? Okay. Alright, so yeah, this is building a self-healing system with SaltStack. My name is Gareth. My Twitter handle can be found on the right-hand side of every slide, so if anyone feels like tweeting at me or about me during the talk, feel free. I'm a senior software engineer at SaltStack. I'm fortunate to get to spend my days writing open-source software; if you have the opportunity, I highly recommend it. If anyone's interested in talking about Salt, or automation in general, or anything, just find me outside after the talk, or I'm around. I'm also a former DevOps engineer, so if anyone wants to have a group therapy session about being on call, I'm happy to lead that. Also, we are hiring, so if anyone's interested in working for SaltStack, we do offer remote positions, so you can stay in lovely Europe; you don't have to move to the U.S. We'll just leave it at that. So yeah, building a self-healing system with SaltStack. First, some basic SaltStack terminology. How many people are using SaltStack today? Awesome. How many people would like to use SaltStack? Should be everyone. Okay, so SaltStack is written in Python. It began as a remote execution system, a way to execute commands across many systems. So it began similar in nature to tools like Fabric and Func, but it's not done over SSH: it uses ZeroMQ as a messaging bus, and it's encrypted and very fast. Because of the way it was designed, adding configuration management was easy, through a series of plugins we call state modules. It's also easily exposed for various purposes through the Salt API.
We have secure data storage in pillar, and reactors, which are small state files that wait for specific events and react in some manner, which we'll look at a bit later. Okay, so the Salt Minion is a daemon that runs commands on a server. Any server within an infrastructure that's controlled by Salt will be running a Salt Minion. And there's the Salt Master, which is the server that sends out commands to those Minions. So a typical Salt... jet lag is a hell of a drug, I swear. A typical Salt setup requires a single master and many Minions; the Minions communicate with the master, and the master communicates with the Minions. We can have a similar setup with multiple masters controlling many Minions. Minions can connect to masters at random, or to specific masters; it's up to you. We also have the option to run in what's called masterless mode: not requiring a master, you run all of your commands directly on the Minions, which accomplishes the same thing, though there are certain pieces you don't get access to, like events, reactors, things like that. So Salt is built on the Salt event system. It is the heart of SaltStack and what distinguishes Salt from similar projects. The Salt Master and the Salt Minion each have their own individual event systems. Here is a slide that shows a typical Salt setup with all the pieces involved. We have our Minions running on a variety of operating systems, in this case Windows, Linux, and AIX, as well as some network devices. The lightning bolts represent the transports: ZeroMQ and SSH, and we also support Tornado. The event bus is the line with the X through it, and then we have the various other pieces which are typically found on the master: your runners, your reactors, and then your Salt Master itself. So, event types. As I said, SaltStack is based on ZeroMQ and events, and there's a variety of events that SaltStack uses.
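To make the event-bus idea concrete, here is a minimal sketch in plain Python, with no Salt imports; all the names here are illustrative, not Salt's real implementation. Each side, master and minion, owns its own instance of a publish/subscribe bus, and events are a tag plus a data payload:

```python
# Toy event bus: in Salt, the master and each minion run their own.
# Illustrative only -- the real bus rides on ZeroMQ/TCP sockets.

class EventBus:
    def __init__(self):
        self.subscribers = []  # callables taking (tag, data)

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def fire(self, tag, data):
        # Deliver the event to every subscriber on this bus.
        for callback in self.subscribers:
            callback(tag, data)

master_bus = EventBus()
seen = []
master_bus.subscribe(lambda tag, data: seen.append((tag, data)))

# A minion start event arriving on the master's bus:
master_bus.fire("salt/minion/minion2/start", {"id": "minion2"})
```

The point of the separation is the same as in Salt: anything can fire onto the bus, and anything subscribed (a reactor, a logger) sees every event.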
So we have authentication events, which are fired when the Minion performs an authentication check with the master. There are also start events: every time a Minion starts, it fires an event onto the event bus saying, hey, I've started, which goes to the Salt Master. Key events happen when accepting and rejecting Minion keys; these typically happen as a result of actions undertaken by the salt-key command. And then there are job events, a variety of events here: when a job is sent out to the Minions from the master, when a Minion returns data from a job to the master, and each time a function inside a state run completes its execution. That last one is disabled by default and must be enabled with the state_events option. We also have runner events. Salt runners are commands that you would run on your Salt Master to orchestrate against Minions or to run commands on the master, and we'll look at one of those in a little bit. So runner events are associated with Salt runners, including events for when a runner begins, when it returns, and whether the runner is part of the orchestration system. There are also presence events, which indicate the presence of Minions connected to the master. Finally, there are cloud events for the various cloud tools that we have available. So we can listen for events using a variety of Salt tools. Using the salt CLI, we can use the runner commands: the state.event runner, passing pretty=True. This will show you all of the events that are running through the Salt event bus, and pretty=True makes it colorful and nice and pretty. We can also get events using the Salt API: running a curl against whatever URL you have configured on your Salt Master for the Salt API, passing it /events and the token that you've generated, will give you a streaming list of events. We can also do it directly from Python.
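The event families above can be told apart by their tag prefixes. The tag strings below follow Salt's documented namespaces, but the classify() helper itself is just an illustration, not part of Salt:

```python
# Classify a Salt event by its tag prefix.
# Tag namespaces follow Salt's conventions (salt/auth, salt/job/..., etc.);
# the classify() helper is illustrative, not part of Salt.

def classify(tag):
    if tag.startswith("salt/auth"):
        return "authentication"
    if tag.startswith("minion_start") or tag.endswith("/start"):
        return "start"
    if tag.startswith("salt/key"):
        return "key"
    if tag.startswith("salt/job/"):
        return "job"
    if tag.startswith("salt/run/"):
        return "runner"
    if tag.startswith("salt/presence/"):
        return "presence"
    if tag.startswith("salt/cloud/"):
        return "cloud"
    return "other"

print(classify("salt/job/20231106123456789012/ret/minion2"))  # job
print(classify("salt/auth"))                                  # authentication
```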
So we can query the event bus using just some simple Python code. Here we're importing some Salt-specific Python modules and then connecting to the event bus. And here's a typical event that you would see on the Salt event bus. In this case, this is a new-job event that's been generated. The tag identifies the event that was fired, and the data contains details about the event. In this case, our tag contains that long date-based string, the job ID, which we find both in the tag up top as well as in the jid field. We also see the minions that were targeted. In this case, we targeted all minions that the Salt Master knew about using the asterisk, so it's a target type of glob, and the only one that matched was minion2. Here's another event; in this case, this is the return event. This is the event that came back from minion2, and we can see that the command that was run was test.ping and the return value was True. Very simple. And notice that this also has the jid, which should be the same, and there's our minion2, the minion that we targeted. Here's an auth event. This happens, as I said, any time a minion authenticates against the Salt Master. We have a pending event here: minion2 is sending its public key along to the master to be signed. So we can also send events using the Salt commands. Using the salt-call command and event.fire, we can place an event on the Salt event bus. We pass it some data inside the single quotes there, which looks like a Python dictionary, and then we give it some sort of tag that identifies this particular event. We can also fire an event up to the master. Here's a slightly more involved example using the similar command event.send. We're giving it a custom tag, something like mytag/success, and the data that we want to send: success is true, and message, it works. And we can also send them from Python.
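Putting the two job events together: the jid embedded in the tag is what ties the "new" event to its "return" event. Here's a sketch with hand-built dictionaries shaped like the payloads described above; the minion name and the jid are made up:

```python
# Two hand-built events shaped like the ones on the slides.
# The jid (a long date-based string) appears in both tags, tying them together.

new_event = {
    "tag": "salt/job/20231106123456789012/new",
    "data": {
        "jid": "20231106123456789012",
        "tgt": "*",            # we targeted all minions with a glob
        "tgt_type": "glob",
        "minions": ["minion2"],
    },
}

ret_event = {
    "tag": "salt/job/20231106123456789012/ret/minion2",
    "data": {
        "jid": "20231106123456789012",
        "id": "minion2",
        "fun": "test.ping",
        "return": True,
    },
}

def jid_from_tag(tag):
    # salt/job/<jid>/... -> <jid>
    return tag.split("/")[2]

# Matching jids confirm the return belongs to the job we fired.
assert jid_from_tag(new_event["tag"]) == jid_from_tag(ret_event["tag"])
print(ret_event["data"]["return"])  # True
```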
So in this case, we define a custom Salt module with a function called do_something that uses the event.send module from Salt to send a custom event with our tag. So this is great, but now what? What would we do with this, sending and receiving events using Salt? Well, we have the reactor system. Salt's reactor system gives Salt the ability to trigger actions in response to an event. It's a simple interface: watch Salt's event bus for event tags that match a given pattern, and then run one or more commands in response. Excuse me. The system binds SLS files to event tags on the master, and these SLS files then define the reactions. This means that the reactor system has two parts. First, the reactor option needs to be set in the master configuration file; the reactor option allows event tags to be associated with SLS reaction files. Second, the reaction files use highstate data, the same as Salt's state system, to define the reactions to be executed. So, a couple of types of reactions that we can send. We have local reactions, which run remote execution functions on the targeted minion. We have runner reactions, which execute a runner command, which runs on the Salt Master. We have wheel reactions, which execute a wheel function on the master. And then there are caller reactions, which run remote execution functions on a masterless minion. So here's a typical example of a reactor configuration. All Salt configuration is done in basic YAML. We have a reactor config defined here, and we have some events. In this case, we have a salt/minion/*/start event, so this will match any minion that starts up. Any time a minion starts up, it will go ahead and run these reactor files, and it runs them sequentially: it'll run the salt.sls reactor file and then the monitor.sls file.
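The reactor's tag matching is glob-based, so the association in the master config can be sketched with Python's fnmatch. The SLS paths mirror the example just described; the dispatch helper itself is illustrative, not Salt's implementation:

```python
import fnmatch

# Mirror of the reactor config: event-tag globs -> lists of reaction SLS files.
reactor_config = {
    "salt/minion/*/start": ["/srv/reactor/salt.sls", "/srv/reactor/monitor.sls"],
    "salt/cloud/*/destroyed": ["/srv/reactor/destroy/*.sls"],
}

def match_reactions(tag):
    # Return every SLS file bound to a glob that matches this tag,
    # in config order (the reactor runs them sequentially).
    matched = []
    for pattern, sls_files in reactor_config.items():
        if fnmatch.fnmatch(tag, pattern):
            matched.extend(sls_files)
    return matched

print(match_reactions("salt/minion/minion2/start"))
```

So when minion2 fires its start event, both salt.sls and monitor.sls would run, in that order.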
We also have some cloud events: any time the master sees the salt/cloud/*/destroyed event on the Salt event bus, it will run all of the state files that are in /srv/reactor/destroy, all the SLS files there. And there's our custom tag that we defined a couple of slides ago. And then here's a reactor file. In this case, we want to do a state.apply, which is a highstate, and we want to run it on the minion that was part of the event that was sent, so that would be the id. Now, reactor files have access to only a minimal Jinja context; grains and pillar are not available. The salt object is available for calling remote execution or runner functions. So we can do some basic monitoring using this. Here we have a reactor config that is looking for the event monitor/restart/service, and when we see that, we want to run the reactor file restart_service in the reactor directory. So we fire an event on the master: salt-call event.send monitor/restart/service, with id set to the minion that we want to target, in this case webserver, and the service, apache2. Within that reactor file, we define a start and, using some Jinja data with the service name, call local.service.start against the target; this will cause the service on the targeted minion to go ahead and restart. Here's a slightly more involved version of the previous slide. In this case, we're passing some pillar data to give it the actual information that we want, the server name and the service name, and then within that state file we can reference those pillar values. And a little more involved still: as well as restarting the service, we also send a Slack message to whoever's on call, saying, hey, I went ahead and restarted this service for you. So the next thing we want to look at is beacons.
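The restart reaction boils down to: take the id and the service out of the event data and build a targeted command. Here's a plain-Python sketch of that mapping; the event payload mirrors the salt-call example above, while the build_reaction helper is illustrative, standing in for what the Jinja in the reactor SLS produces:

```python
# Event fired from the CLI with something like:
#   salt-call event.send monitor/restart/service '{"id": "webserver", "service": "apache2"}'
event = {
    "tag": "monitor/restart/service",
    "data": {"id": "webserver", "service": "apache2"},
}

def build_reaction(data):
    # What the Jinja in the reactor SLS effectively renders: a "local"
    # (remote-execution) reaction calling service.start on the event's minion.
    return {
        "function": "local.service.start",
        "target": data["id"],
        "args": [data["service"]],
    }

reaction = build_reaction(event["data"])
print(reaction["target"])  # webserver
```

The key detail is that the target and the arguments come entirely out of the event data, which is all the Jinja context a reactor file gets.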
So beacons let you use the Salt event system to monitor non-Salt processes. The beacon system allows the minion to hook into a variety of system processes and continually monitor them. When monitored activity occurs in a system process, an event is sent on the Salt event bus that can be used to trigger a reactor. As for the available beacons: we have ones for monitoring file system changes, as well as system load, service status, okay, I'll go fast, shell activity such as user logins, network and disk usage, and a new one that's available in our upcoming Neon release that can monitor certificates, so you can monitor your SSL certificates and regenerate them. So here's an example of a basic beacon config. Again, it's basic YAML: beacons, and we're monitoring a service, in this case apache2, and we want it to fire only on changes. So if it's monitoring and it sees the service running, and then it sees it stop, it will only fire the event in that case, and only if the apache2 pid file still exists in that directory. Here's the event that we would see when that beacon fires. It looks very similar to the events that we saw before. It has a tag, in this case salt/beacon/minion/service/apache2; the id where the event happened is minion; the service name that triggered the event is apache2; and apache2 running is false. So then, going back to our reactor, we can have the reactor react to this event and go ahead and restart that service. Building on the previous example that we had with the reactor file, we pull the service name out of the data that we get from the event, and then using some Jinja we generate a state run, passing in the data that we pulled from pillar, and run the reactor file. So it's very similar to before, except this time we're pulling the service name out of the event data, checking to see whether the service was running or not, and if it wasn't, we restart it and send that Slack message.
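The only-on-changes behavior of the service beacon can be sketched as a small polling loop. The status sequence here is faked for illustration; a real beacon actually checks the service and its pid file on an interval:

```python
# Toy "service beacon": fire an event only when the observed status changes.
def run_beacon(statuses, fire):
    last = None
    for running in statuses:
        if running != last:      # only-on-changes semantics
            fire("salt/beacon/minion/service/apache2",
                 {"id": "minion", "apache2": {"running": running}})
        last = running

events = []
run_beacon([True, True, False, False, True],
           lambda tag, data: events.append(data["apache2"]["running"]))
print(events)  # [True, False, True]
```

Five polls produce only three events: the repeated True and the repeated False are suppressed, so the reactor only hears about transitions.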
Do we have time for questions? One question, two questions? Who's got the slide thingy? Okay, I'm just going to assume yes. Does anyone have any questions? Two minutes. Any questions? Way in the back. I'm sorry, I couldn't hear you. So the question, I think, was: if restarting the service doesn't help, if restarting the service from Salt doesn't get it back online, how do you tell Salt not to keep doing it? So in that case, and I'd have to double-check on this, I believe there's a way to set a threshold of how many events it gets before it tries to restart, or how many times it tries to restart. So you could set that threshold and say: if it's not fixed, if it's not back up and running within three attempts, then just quit and stop. Any other questions? Nope. Okay, thank you.
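On that closing question, whatever Salt option implements the cap (as noted in the answer, that would need double-checking), the underlying logic is just a bounded retry counter. A hypothetical sketch:

```python
# Give up after a fixed number of failed restart attempts.
def attempt_restarts(restart, is_healthy, max_attempts=3):
    for attempt in range(1, max_attempts + 1):
        restart()
        if is_healthy():
            return attempt        # how many tries it took
    return None                   # still down: stop trying, alert a human

# Simulated service that only comes back up on the third restart.
state = {"tries": 0}
result = attempt_restarts(lambda: state.update(tries=state["tries"] + 1),
                          lambda: state["tries"] >= 3)
print(result)  # 3
```

Returning None instead of restarting forever is the point: once the threshold is hit, the self-healing loop stops and escalates to whoever is on call.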