So, is anyone expecting a ChatOps talk, like how do you use Slack to do ops? Because I'm actually the person that never wants to use Slack to run Slack. If Slack goes down, it wouldn't work. So if you're expecting ChatOps, this is the wrong talk, unfortunately. I just want to set that tone. We're not going to go, wow, Slack is super cool. We do dogfood it for a ton of read-only stuff, like notifications and things, but no one wants us to be, oh, sorry, we can't get Slack back up, because the tooling is Slack. Yeah, exactly. So we're kind of bummed, actually. It's kind of lame, because there are a lot of great ChatOps tools out there and it would be great, but you want that muscle memory: if things are hitting the fan, you want to know exactly how you're going to do things. Which is exactly what we're going to be talking about here.

I think people are still filing in, so feel free to find a seat or come closer. Hopefully this is going to be very participatory; we're going to have a lot of questions and we're going to work through a few things. My name is Joe Smith and I'm an Operations Engineer at Slack. I'm really excited to come back here. I was actually born and raised in Anaheim, and I helped organize SCALE a few years ago, so it's really great to come back and talk to everyone here. I spent some time working at Google and then a lot of years working at Twitter, where I was the first Mesos and Aurora SRE. I helped scale out the clusters in their data centers from about 100 machines to tens of thousands of boxes spread across the world, which is pretty cool. We learned a lot of things, and in my time there I got used to the very Twitter-specific stack, thinking about things in terms of services and millions of users.

Coming to Slack was actually a really refreshing change of pace for me. I was able to work with a much smaller team, and I could actually count the number of machines I was dealing with again, even though it was still relatively large. And I realized that a lot of the approaches we'd used to scale Twitter might benefit us in the future at Slack, but weren't necessarily appropriate for the scale we were at at that point. We had a bunch of tools written in POSIX shell, we had some Python scripts, but we didn't really have a consolidated code base for operational code. We basically had a lot of application code. The way an organization scales is that it starts to realize: hey, we have some ways that we provide value for our customers, and then we also have ways that we enable ourselves to move more quickly and add some velocity to our teams. That second part is what we're going to talk about.

So hopefully a lot of you identify with a few of these buckets. I know that different teams and different organizations describe SRE or DevOps or Operations Engineering in a few different ways, and to be honest, I don't care too much about what any of these mean. Me personally, I identify as a site reliability engineer: my primary focus is to keep the site up, no matter what it takes. My desire is always uptime, making sure that whatever services we provide for our users are up and available for all of them, no matter what part of the world they're in.
So if you feel like you are one of these engineers, or you'd like to be, then this is the perfect place for you. And if it turns out it isn't, that's totally fine too. I'm going to break this talk into three different pieces, and to be honest, I'd actually love to talk about any of them individually for several hours, so I hope some of you will come up to me afterwards. Unfortunately, we only have time for one talk this year.

So, who here would consider yourself an experienced operations engineer, or an experienced systems administrator, with three to five plus years, something like that? Okay, a lot of you, awesome. How many of you have written at least a few hundred lines of Python code? And who has published your own distribution to the Cheese Shop, PyPI? So it sounds like most people know how to write Python, you're fluent with that, and most people know about reliability. So maybe most of you are going to be interested in how you actually bundle all these scripts together and make them cohesive. Does that sound interesting? Okay, cool. We'll focus on that.

We're going to touch a little on some of the core tenets that I think are important when you're thinking about systems administration or reliability engineering. These are things most of us take for granted, but we'll do a little recap of what it is we're actually focusing on and why we want to put all of this time and effort into building resilient tools. The way I think we can structure this talk is around three pillars on which we should focus our work, the way we organize our teams, and the roadmaps we build for ourselves. The first is how we actually perform the work that we're doing: how do we go into production and make changes in a way that's safe and well understood? After that, we need some way of tracking what all of these things are. What are all the different servers? What are the different code bases? And when you get paged in the middle of the night, how do you know what these things are and how to fix them? And finally, we have something that maybe does lean a little more toward Slack, which is communicating with each other: how do you lay things out so that your teammates understand what's happening, what's expected, and what the future goals for the team are going to be?

Here I think we can split application code and operational code into two separate buckets. Typically you want application code to be touched by the vast majority of the engineers or developers at your company. What I consider operational code, or infrastructure, is primarily the pieces dealt with by the engineers in this room. This is the core foundational work that may not necessarily be seen by the users of your company's product. It's the sort of thing where nobody really knows what you work on, but it's really important, and if it goes down, everyone's going to know about it. So it's the type of thing where you want to be very careful when dealing with it and understand how you're actually operating on it.
Is everyone familiar with rolling out changes and rolling back changes? Is there anybody that's not familiar with rollback? A few people. Basically, I want to point out that any roll-forward procedure, any deployment, should have a good way of rolling back its changes. Just as you have a way to push out a new change or a new configuration in Chef, you should have some way of saying: hey, actually this was a really bad idea and it's causing huge issues for us, so we need to roll back this change safely. I'd recommend what's called the feature flag approach. You set an attribute on a node, maybe in Chef or in Puppet, and when you run your configuration management on that server, it applies some changes; if you reset that attribute back to its previous value, it rolls those changes back and the server comes back in its previous state. These are all things you can put inside your runbooks to make sure you understand how to deal with this.

Speaking of which: in theory, everyone is very good about always updating documentation, right? You've never made a change and forgotten to update the docs? Oh, I'll do that after lunch, or I'll update this and send it out later. Which is of course not true, but I think it's really important that we keep each other accountable, especially within our team. Every now and then, set aside maybe a documentation Friday, or a coffee hour on Thursday afternoon, where everybody gets together and says: hey, remember that thing you were working on for two months that you never documented, and it paged us last week and we didn't know what to do? It's very important that we don't focus only on the technological aspects but also on the people aspects. Documentation is one way we go from the code and the computers back to the humans and the way people actually work together.

Which leads into the last pillar, communication, which is the 100% human-oriented part of operations. I think HugOps is a good example of this movement. It's about working together and agreeing on the process for making sure that when I deploy something that depends on your system, we don't release something incompatible at the same time; that we have some way of defining our interfaces; and that we know how the different dependencies we have are changing and what they're doing to each other. This is also very important.

These are the ideals. Obviously we expect to know what we're doing, to be able to read up on anything we're working with in production, and to communicate easily with each other. But this is not always the case, especially as a company is growing, or as you're building new features or new pieces of infrastructure. Sometimes you may not be expecting to have to make a change; maybe something deviates from the established runbook and you need to make an additional modification to keep a system running. And, like I said earlier, we may not make the changes to the documentation, which isn't good, but it's totally possible. And even though we try to communicate, sometimes you may feel like you expressed yourself well, but your teammates don't quite pick up what's going on.
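To go back to the feature-flag idea for a moment, here is a minimal sketch of that pattern, with made-up attribute names. This is my illustration of the idea, not an actual Slack recipe:

```python
# Hypothetical feature-flag gate, as it might run inside a configuration
# management pass that hands us the node's attributes as a dict.
def render_config(node_attributes):
    config = {'worker_count': 4, 'engine': 'classic'}
    # The risky change hides behind one attribute. Setting it applies the
    # change on the next run; resetting it rolls the host back.
    if node_attributes.get('use_new_engine', False):
        config['worker_count'] = 32
        config['engine'] = 'tuned'
    return config
```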
And in general, the more people you have, the more difficult it is to communicate everything to everyone at the same time. These are hopefully good problems to have: you're comfortable with your teammates, you've built some rapport, you've been able to work constructively with each other on pull requests, and you realize that even though the processes worked for a while, when we only had a few services to deal with, now that we're growing we have a harder time keeping tabs on everything. Which is why I think it's important to move from being what I'd consider a more traditional systems administrator to actually writing some code and building the tools and automation that then empower you to work on a slightly different subset of problems, solving them in a different way. I think this is actually kind of an interesting animation, because it starts out as a very simple interface but then provides a lot more customization you can use later. That's a general guiding principle we can use for building tools: we want to give most of our teammates just, hey, hit it with the hammer and it should work, but if that doesn't, here are a few more options you can use to tweak things.

When we think about runbooks, essentially these are manual checklists. There's a set of steps, and most people are going to say: hey, I got this page, I'll click the link inside PagerDuty, it takes me to a Markdown page or whatever format I store my documentation in, I follow these steps, and supposedly things work just fine. If they're not up to date, that's when you have a problem, because then you're dealing with the edge cases. And maybe it's 2 a.m., so someone isn't going to go make that change to the runbook. The next day they sleep in because they're tired, and maybe it happens again. Now the entire team needs to pitch in, figure out what the issues are, and solve it again. This is difficult, but if you are using a tool and you do codify these fixes, you can automatically detect which procedures to follow depending on what changes happen in your environment and what the systems look like as you're modifying them.

I'm also going to push for this: as you grow out your team, if you start to find that you have maybe 100 runbooks, it turns out that just as humans are able to understand what needs to be done, you can also write it in a language that happens to be understood by a machine as well. Python can be very legible, and with lots of practice and collaboration, people can understand it almost as well as plain English; it can be almost as readable. But it's going to carry a lot more of the details, a lot more of the specific steps. You can't hand-wave as much and say, oh, we'll just log in and kick off a deploy. If this is the first time someone's looking at the runbook, they may not know what "kicking off a deploy" means. By using code, you're explicitly saving these steps, which is very nice.
And lastly, once you've gone through all that work and made sure you know what's happening, you can put it into a form that can be run. So this is what I consider the evolution of a team as it moves from simple steps, maybe written by someone who is primarily a developer doing some of the DevOps work, into a true operational workflow that can be handled and managed by a reliability engineering organization.

Typically what we've found is that it starts with one person who knows how to run things. Was there a question from the back? Yeah, of course. Is this any better? I'll try to speak up too. Oh, well, that is way louder. All right. So it starts off with one person who has all the knowledge in their head and is typically the one master of whatever system they're dealing with. This is great in the short term, because that person feels like they know what's going on and they're able to jump in and fix any problems. But eventually they're going to want to take a vacation, and they may not always be there to bail the entire company out of whatever problems happen. When this does become an issue, eventually they say: okay, fine, I've had enough, I'm going to write the dossier, or the codex, or the tome; they always come up with some cool name, I've found. And they put it all in this huge Google Doc or something like that, and it usually makes sense, but they're making a ton of assumptions and it isn't always clear what they mean in all cases. But at least then you have something, something you can work with and base the rest of your work on.

Now, when this happens, there's hopefully going to be one person who's that first spark. This will be the person that has maybe written some Python before. Most of you raised your hands earlier, so this is probably all of you I'm looking at in the audience. You're going to be the person that says: hey, I keep doing this over and over again. I'm going to take it upon myself this time, take maybe an extra few minutes, an extra few hours, maybe a few days, and in between the other work I'm going to focus on automating this small piece of the runbook. I think this is a really, really big step, and this is when you start moving from just systems administration to what I would call DevOps or SRE. It's a huge deal because it changes the type of work you're doing and the scale at which your team can operate. You no longer have this constant, linear scaling of team members with machines; you're able to write a tool whose impact grows with the number of servers you have.

Even though this step three is the big spark, unfortunately, once you have this tool, this is where I've found things get a little dicey. This is the trough of despair, like the one you see in that technology hype curve. Once you have this tool, you're going to have to spend a little more time than you'd expect performing maintenance on it. You're going to give it to someone. Someone's going to say: oh, how did you take care of that so quickly? Wow, that's great, you didn't have to SSH into that box? That's awesome. And you say: well, yes, actually, I have a tool I wrote.
And everyone's going to go: whoa, that's awesome. Cool, what is it? Give it to me, I'll run it. And you say: oh yeah, sure, I haven't really made sure it works, but you can try it. They're going to get excited, and it's going to work until it doesn't. There's going to be some edge case you didn't think of; maybe they set up their computer differently than you have. And it's going to turn out that you go from being the cool developer working on all this stuff to being tech support for a little while, making sure you're actually building a production-ready application. That's the difference, because all the tools that those of us in this room work on need to be mission critical. Like I said earlier, this is why we don't use Slack itself to run our tools: if Slack goes down, we need to make sure we understand, and can rely on, the tools we're using to get Slack back up and running as quickly as possible. Having that muscle memory and knowing how to fix things as quickly as possible is really important to us.

So yeah, this is that trough of despair, which we'll talk through a little later, and I think you'll be able to get through it. I do want to call out that if you feel like you're being the tech support person, that is normal, and it will pass, for sure. I hope I'll be able to convince you that it's worth it. After this, you have the tool working again. You have a little more time to write code. You're going to remove steps from the runbook and put them into your tool; again, we're going to see what this looks like. Eventually you come to what I think is the best part: the huge runbook that was 20 steps long and referred to other runbooks, with "oh, by the way, if this thing fails, do this", becomes one tool, and the runbook is one line that says: hey, run this thing and it'll figure it out.

And eventually you can ascend to the next level of reliability engineering, which is taking the knowledge you have from operating your service and feeding it back into whatever you're working on. If you have some web application running Apache, or you're building some Java services, or you're working with containers, you can take a lot of that understanding and put it into the orchestration of the system itself. That's when you're able to take this knowledge and move one step beyond. But that last point is a little outside the scope of this conversation.

So, one last run-through of the steps we're going to go through. Remember, between three and four is that trough of despair, which, if you hang tough and keep working at it, we're going to get through. We have really found, in the roughly one year I've been at Slack, that investing the time to write the tools we need to perform our jobs has been worth the effort. We absolutely understand the processes better. We're able to share and communicate things as we pass around code reviews.
We're able to find gaps and potential errors that we weren't handling previously with our old runbook process, and that we catch now because we're running tests and making sure things are dealt with appropriately. But I guess the most important thing is that we're just not working on the same old problems anymore. We have these tools, they're solving the really boring, repeated issues that used to page us all the time, and now we can focus on more interesting problems. That, I think, is the most important thing.

So we're going to walk through a sample of what this process actually looks like: how do we go from a runbook to some code, to a tool that does the work we need done? We'll start with a very simple, essentially LAMP-stack setup, and we'll assume we have a database that failed, and look at the process for replacing that database. Here I have my snazzy emoji architecture diagram. I do not recommend using emoji for architecture diagrams. It does not work very well, because people are like: okay, and then lizard talks to spider, and then, that didn't work, it doesn't make sense. We're assuming here that we have Route 53 fronting an Elastic Load Balancer in AWS, and users' requests are being routed to a web server, which is querying the database cluster. A very simple, hand-wavy, generic application.

So let's assume that one of those databases, which in this case is a floppy disk, fails. Because of course, as we grow and have more and more machines, failures happen. As the machine gets replaced, we need to change around some of the replication, so we're going to look at the runbook and follow some steps. Is anyone here a MySQL DBA or database administrator? Two people? Feel free to heckle; this is a heckle zone, so correct me if my steps are wrong. I suppose I'm more of an application engineer now, so I don't really do databases, which is why I thought, oh, this will be fun to learn.

On your follower hosts, you're going to stop replication: you stop the IO thread, and then you want to make sure all of the followers have caught up and are no longer processing any events from the relay log. Once that's happened, you're going to promote the new leader machine: you run STOP SLAVE on it, since it was trying to connect to the failed machine, and then you tell it, hey, now you're the leader, by running RESET MASTER on it. After that, you run STOP SLAVE on the follower hosts and tell them to point at the new leader machine, so you're running another SQL statement there. Then you start replication again on those two followers, and finally you go back to that web server configuration, make your modification, and publish it. Not too bad: you SSH to a few boxes, run a few commands, and make sure you don't typo anything. Which of course works most of the time, but as we grow and scale, we have more machines, you find yourself doing the same thing over and over, and eventually you slip up and copy-paste the wrong thing.
So perhaps one time, Joe was up kind of late in the evening, there may or may not have been beer involved, and I may have accidentally, and this is not a real scenario, I suppose, mispasted the host name in there. Maybe I pasted some test database into production, and it would have been terrible. Totally hypothetical, I promise. So this would be pretty bad, and we want to put some safeguards in there. We could go back into the runbook and add some bold, or some cool fonts, or emoji, or color. That's all great, but it may be something I'm going to ignore after a while, or stop paying attention to. This isn't something we can fix with English; it's something we need to put actual technology behind in order to solve correctly. I think this is where we can apply what we've just been talking about with tooling and figure out how we're going to make sure it's handled automatically in the future.

When you're looking at your work and the different processes you deal with, I highly recommend starting with something you understand very well. Something you think is super boring, that you have no problems doing, that you can pretty much do in your sleep. That way you don't have to simultaneously figure out what the process looks like and how to write the code that does it; you can just focus on writing the Python, or whatever programming language, to translate the logic you already have. That's the important part: pick something really boring. It's not going to be the most fun thing to write, but it's something you understand, it gets you going, and it helps you figure out what your process is going to look like.

In this case, we're going to talk about replacing the leader. I think we have two core operations here. The first thing we needed was some way of figuring out what machines we have: how do we find the things we want to run on? The other was how to connect to a host and run some commands on it. So this is the first one we wrote. It's very, very simple; this version has a little of the edge-case handling and error checking stripped out. We're using the Fabric Python module for this, and we've found that it abstracts away a lot of the complexity of dealing with SSH; for us it's worked out pretty well. We also have a way of running commands on multiple hosts across the cluster, and there's a link to a few of these core building blocks that I'll share later, but this here is going to help us out. The first step is wrapping a bit of the Fabric API itself: we set one attribute on Fabric's env object, telling it to use our SSH config, because we have some custom magic in there for bastion hosts, and then we tell it which hosts we want to operate on. And if you look at the very top, the second line down, where it says def execute_command: it's pretty reasonable, it sort of reads for itself, fortunately. We're just running a command on a host, and we're telling it what command to run.
After that, we're just executing it, down on line five or so, and then we're gathering the output and returning it back to whatever caller we have, so we can process that string if we need it. Can everyone see this? Okay. So we've got that, which is the first building block, and very helpful. After that, we need some service discovery, some host discovery mechanism. I'm cheating a little bit here with that very first line, the from host_management import search. We use Chef for managing our infrastructure, so this is actually using the PyChef API: we use our knife config to specify which Chef server to talk to and how to partition which nodes to get from there, and we just use PyChef's search function to get that information. You'd replace this with whatever is custom to your host management; it could be the AWS EC2 API or something like that.

Here we're defining a function called cluster. For MySQL, we cheated a little, because we assume we have a cluster name, cluster A or cluster B or cluster C, and if you look down at line 7, we're saying that every machine belonging to one MySQL database cluster follows that naming pattern. In this case, everything that has cluster C in the name belongs to one cluster, and everything that has cluster E in the name belongs to that cluster. This is simplified for the purposes of this talk, but you can imagine you have some way of tagging hosts with attributes or something like that. The second function here is called, not very excitingly, leader_and_followers. This is how we determine which host is the leader of our MySQL ensemble, and which ones are the followers that receive the relay log events. All we're doing is saying that the lowest sorted name is the leader, and we're assuming that when a machine goes down, it's removed from our host discovery. Another simplification, but for our purposes it works out okay. Any questions on these two so far? Awesome.

All right, so here we're able to combine a little of what we had before. At this point we're actually using that remote execution. To this function, called change_leader, we pass in some of the hosts we were able to search for ahead of time: we pass in the leader, and then a list of strings which are our follower host names, and we just execute a command on each follower host telling it to change its master to the new leader host. One point I would highly recommend: always put in something like that print statement whenever you make a modification to your infrastructure. Any time you go out and update a row in a database, or shut down a machine, or reboot something, print something to the console so that the users of your tool know what's happening and how far they've made it through the process, in case it exits or they lose their network connection, something like that.
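Pieced together from that description, the building blocks look roughly like this. It's a sketch assuming Fabric 1.x and PyChef; the function names follow the talk, but the search query and other details are illustrative, and the error handling is still stripped out:

```python
# Sketch of the building blocks described above (Fabric 1.x + PyChef).
from fabric.api import env, run, settings
from chef import autoconfigure, Search

env.use_ssh_config = True  # pick up the bastion-host magic in ~/.ssh/config
env.use_exceptions_for['network'] = True  # raise NetworkError, don't abort

def execute_command(host, command):
    """Run a single command on one host and return its output."""
    with settings(host_string=host):
        result = run(command)
    return str(result)

def cluster(name):
    """Every MySQL host whose node name contains the cluster name."""
    api = autoconfigure()  # reads your knife config
    return sorted(row['name'] for row in
                  Search('node', 'name:*{0}*'.format(name), api=api))

def leader_and_followers(name):
    """Lowest sorted name is the leader; a failed host is assumed to
    have already been removed from host discovery."""
    hosts = cluster(name)
    return hosts[0], hosts[1:]

def change_leader(leader, followers):
    """Point every follower at the new leader."""
    for follower in followers:
        # Always tell the operator when we mutate infrastructure.
        print('Pointing {0} at new leader {1}'.format(follower, leader))
        execute_command(
            follower,
            'mysql -e "CHANGE MASTER TO MASTER_HOST=\'{0}\'"'.format(leader))
```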
You can also put in some external logging: you can write out to syslog and ship that to Logstash, or if you have a metrics system like Graphite, you can take the opportunity to send some metrics there as well. So now we have a few of the building blocks, we have this function, and essentially we can build a very small command-line tool, which we'll go into a little later: promote_mysql_follower. You pass it a cluster name, it finds which hosts are in that cluster, and then it executes those commands, so you know exactly what will happen. So, great, this is awesome: it makes sure we don't need to go in there by hand anymore, it detects the right hosts for us, we only need to type the cluster name once, and this is pretty good.

But then we have an issue: we're not actually checking that the relay log is caught up on all the hosts before we move on. We're sort of following the runbook, but maybe we forget to run the SHOW PROCESSLIST step. So we're going to check that the relay log is caught up. This is another example where we can take what was in the runbook, which relied on humans making judgment calls and being very, very meticulous, and instead rely on computers doing what they do very well, which is running exactly what you tell them to. Here we stop the IO thread on all the machines in the cluster, which we find with our very friendly hosts-in-cluster function from before, and then we loop through those hosts continuously, making sure each one has caught up to its relay log. This is nice because we don't need to be paying attention and remembering to come back. Maybe there's a really huge lag in that relay log; we don't need to worry about saying, hey, I'm going to go to lunch for an hour, and then having to remember to come back, check that host, and finish up. The tool just continues onwards once it's done, which is nice. Questions on this?

So we can update the runbook, which is good. We still have a runbook, but we also have a tool. The first step is running our stop-relay-thread tool and passing it the cluster name; then we SSH to the new leader and run a few commands; then we run our other tool; and then we SSH to these other hosts and run the rest. Pretty good.

And here we are: our first trough of despair. We've shipped these commands, and our teammates have been using these tools, but unfortunately they got a stack trace. It looks like they hit an issue; in this case, Fabric raises this timeout. It turns out it was trying to connect to a machine it could not reach anymore; it didn't know how to connect to the cluster D4 machine, unfortunately. This is something where, every time I have not done this, I've regretted it: every time I've said, I don't need to check for any errors, this always succeeds, that error is just never going to get thrown. I don't think there has ever been a case where the error didn't eventually get thrown. So even though it's sort of annoying, and you'll see that the code we had before was pretty readable and pretty short, you are going to want to make sure that you not only handle the errors but also put timeouts on things.
For instance, even when we make an HTTP request, we find that some tools hang for quite a long time, so we need to put timeouts inside our HTTP requests so they're not just sitting there hanging. The server may not be responding, or maybe the connection isn't going through, so it's something you want to time out after, say, 30 seconds, and then retry.

Here's where you have to bear with me a little, because I know this looks like a bunch of small text, but really all we have done is these two things. If you look at that comment in the top third, we're adding this timeout counter starting at zero, and we're changing that while loop to also check how long we've been iterating, until we hit the timeout, which at the very top is 300 seconds, so five minutes. We're still doing the same thing, still running the command until it succeeds on every host, but now with a timeout. This basically gives us a way to realize there may be something wrong, and we can either catch it and deal with it in code, or send back a more helpful error message to whichever user is running this tool, so they're not just sitting there for hours going, hey, maybe there's something wrong, I don't know what's going on.

In the very middle of the code, you can see we're also handling the SSH connection timeout exception. Because we're retrying several times, for up to five minutes, we don't necessarily care that this problem happens once. We do want to surface the message to the user and warn them that there's potentially something they could open a new terminal window and look at, but it's not something they need to jump on immediately; maybe it's some transient issue that will pass. Down at the very bottom, we're also adding an else branch: if we did succeed and the relay log is caught up, great, remove that host from the list; otherwise we know we need to do a little more, and we move on. And then everyone's favorite, time.sleep. You can do some interesting things with where you put your sleep and how you execute this, but for the most part I'd say keep it simple for now; in the repo linked from this talk, you'll see a few other things you can do with it. And lastly, you see that we have an opportunity to return False, which I highly recommend, because it's a really valuable way of passing information back up. You don't need to just print all the time; sometimes you want to return values up to the higher function calling yours, so it can decide what it needs to do. It could do another retry: maybe whoever is calling this relay-log-caught-up check decides they actually want to wait 15 minutes, and calls it three times in a loop. Questions on this so far?
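Unrolled, that caught-up check looks roughly like this. This is my sketch of the slide, reusing the building blocks from earlier; Fabric's NetworkError stands in for the connection failure, and the SHOW SLAVE STATUS string match is simplified:

```python
# Sketch of the relay-log check after the timeout was added; reuses
# execute_command / leader_and_followers from the earlier building blocks.
import time
from fabric.exceptions import NetworkError

TIMEOUT_SECONDS = 300  # give up after five minutes

def relay_log_caught_up(cluster_name):
    """Return True once every follower has applied its whole relay log."""
    _, followers = leader_and_followers(cluster_name)
    waiting = list(followers)
    elapsed = 0
    while waiting and elapsed < TIMEOUT_SECONDS:
        for host in list(waiting):
            try:
                status = execute_command(host,
                                         'mysql -e "SHOW SLAVE STATUS\\G"')
            except NetworkError as exc:
                # Transient SSH trouble isn't fatal while we're retrying,
                # but warn the operator so they can go look if they want.
                print('Warning: could not reach {0}: {1}'.format(host, exc))
                continue
            if 'has read all relay log' in status:
                print('{0} is caught up'.format(host))
                waiting.remove(host)
        time.sleep(5)
        elapsed += 5
    # Returning False instead of printing-and-exiting lets the caller
    # decide what to do: retry with a longer timeout, for example.
    return not waiting
```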
Yeah, so the question is how you help different teams, or even different engineers on the same team, collaborate and realize that maybe they don't need to write all the code again. Maybe one of my teammates already wrote something which checks the relay log and didn't have this additional timeout issue I have here; maybe they've done it in a more elegant way, or maybe they've simply solved it, so I don't need to rewrite it. This is definitely something I used to really focus on, making sure we had as little duplication of code as possible, but I think it leans more on that third pillar, communication, and making sure teams realize the option is there. Sometimes people are going to want to customize things specifically to their needs, or realize: hey, I need to do something a little different from the more generic function that Joe wrote once. So it's a bit of a trade-off. You get a lot of benefit from standardizing on one core implementation for solving some problem, but at the same time you want to give people a little flexibility. I would lean toward saying: hey, we're going to put all of our code in the same GitHub organization, or maybe the same repository, or make sure we all post it on some list. As long as everyone in the organization agrees on how to keep track of what tools are available, that's a really good start. Does that help? Any other questions? Cool.

All right, we're going to run through this next part a little quickly. Basically, we had a little more time, things were working out okay, so we're going to add a few more things and knock out the last few steps. In this case, we're running the commands to actually promote a leader: we know which host name we want to make the new leader, and we're just going to execute MySQL statements there. Here we're handling the connection timeout and doing three retries. I'm not worrying about time.sleep or counting how many seconds this takes; it depends on which operation you're doing. I expect this one to take almost no time at all, so I'm going to see if I can just blow through and make it happen. If you're doing an operation that could last a few minutes, maybe you want to put in a sleep so you're not constantly hammering a server. Say you're performing a backup: maybe you try the backup, and if for some reason it fails, you sleep for ten minutes and check some metrics before trying again. Questions on this?

And again, I'm just running through this, but you do get a little bit of boilerplate at a certain point, when you just want to run a command. Some of my teammates actually wanted to update that initial building block with built-in retries, but the reason I pushed back is that sometimes we're running commands which aren't safe to run multiple times. You only want to run them once, and if that fails, you need to do something different. So I recommend doing retries a little higher up, in the code that you write, because you know exactly which commands are running and what you need to do in case one fails. Questions on that?
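The promotion step, sketched the same way. STOP SLAVE and RESET MASTER are safe to re-run, which is why the bounded retry with no sleep is reasonable here; as before, this is my reconstruction of the slide, not verbatim code:

```python
# Sketch of promoting the new leader with three quick retries.
from fabric.exceptions import NetworkError

def promote_leader(new_leader, attempts=3):
    """Stop replication on the new leader and make it the master."""
    for attempt in range(1, attempts + 1):
        try:
            print('Promoting {0} (attempt {1})'.format(new_leader, attempt))
            execute_command(new_leader, 'mysql -e "STOP SLAVE"')
            execute_command(new_leader, 'mysql -e "RESET MASTER"')
            return True
        except NetworkError as exc:
            print('Connection problem promoting {0}: {1}'.format(
                new_leader, exc))
    return False
```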
Sweet. So here we go; this is the big one. This is pretty awesome, because you now have a tool here that you can reuse whenever you need to. This is a Python file we call promote_mysql_follower. We import all the functions we've just written, all the code we had, we create a main function, and we use the built-in argparse module, which ships with Python, to do the command-line argument parsing. And if you read through the names of these functions: well, it still reads a little weird with all the underscores, but for the most part it explains what it's doing. It gets the leader and followers for our cluster, waits until the followers have caught up on their relay log, promotes the new leader and makes sure that one is now the master, changes the leader for all the followers to point at the new host, and finally resumes replication for the followers, which pick back up and apply the relay log events. For the most part, if you're running this tool and something fails, you'll get a stack trace and know what part failed, which handling or checking step it was in. Which is nice, because now the runbook, ah man, the wrapping is weird, but now the runbook is just: run the tool and give it a cluster name, and it handles it automatically. This is the goal we wanted.

The runbook is still helpful: if you keep your runbooks in Git, you can track changes and see who made a modification and when. But in this case you also get a lot of the benefits of continuous integration. You can write tests for these things, and you can make sure that the tool being run is exactly the process being used each time. It's not: oh yeah, that's the special database where we do this one-off thing and somebody forgot to update the doc. This is the tool that's used, and there's enough logic in there that someone would actually have to go in and make changes to the tool to handle that one-off, and maybe that will make them reconsider. Any questions on this?
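Tied together, the tool reads roughly like this. The module name mysql_tools and the resume_replication helper are placeholders of mine; everything else follows the steps just described:

```python
#!/usr/bin/env python
"""promote_mysql_follower: glue the earlier functions into one tool."""
import argparse
import sys

# Hypothetical module collecting the functions sketched earlier.
from mysql_tools import (leader_and_followers, relay_log_caught_up,
                         promote_leader, change_leader, resume_replication)

def main():
    parser = argparse.ArgumentParser(
        description='Promote a new MySQL leader after a failure.')
    parser.add_argument('cluster_name', help='e.g. cluster-c')
    args = parser.parse_args()

    leader, followers = leader_and_followers(args.cluster_name)
    if not relay_log_caught_up(args.cluster_name):
        sys.exit('Followers never caught up; aborting before promotion.')
    promote_leader(leader)
    change_leader(leader, followers)
    resume_replication(followers)  # START SLAVE on each follower

if __name__ == '__main__':
    main()
```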
So, we're also going to go pretty quickly through this. We have this tool and it's working great, but maybe something happened, or maybe in the middle of an operation we lost the database, so it turns out we also want to add new functionality to make sure we take backups before we make any of these changes. We could put that at the top of the runbook, or, if we're on AWS, we can take advantage of the Boto3 Python module, which provides an interface to S3 that we can use to check our backups, and later on to perform a backup. In this case, I'm just running some code to see if the backup is up to date, and if not, we return False. So rather than a runbook entry that says: log in to the console, click around a bit, find the file in S3, look at the metadata and check the timestamp, we have this little tool that does it automatically. This saves people a ton of time. You don't need to click on things, it's built into the same tool everyone's already using, and all we had to do was figure out the Boto3 module and how to interact with the Amazon S3 API. Kind of nice. Questions on this?

So here's the third part of the talk, and this one, I think, is a lot more complicated. There's a lot more institutional knowledge, and a lot of different configuration choices you can make depending on your organization. But I think this is when you move from "hey, we have a few Python scripts that we use" to "we're all a bunch of engineers and we can all collaborate and build tools that help each other do our jobs", which is awesome. One of the issues I have with Python is that I find distributing Python tools can be very, very difficult. Some people write very large monolithic Python code bases; some people publish a bunch of distributions, upload them to PyPI, and then depend on them; they try to use virtualenvs, they zip up virtualenvs, they do all kinds of crazy, terrible things, and it doesn't always work out. It's been a little difficult, but I think there are a few ways we can make this better.

Just to pin down some terminology: when I talk about a module, to me that's a .py file, one file containing some Python code. The PyPA, the Python Packaging Authority, considers a group of Python files versioned together to be a distribution, or a dist: all the files that you group together under the same version number. When you download Boto3 from the Cheese Shop, that's one distribution; when you download Fabric, that's one distribution as well; and the Slack ops code we use is managed as one distribution, a bunch of Python code all at once. A distribution can be shipped as source, what's called an sdist, which is all the human-readable Python files; or it can be shipped as what's called a bdist, a binary distribution, which is things like wheels. All that's special about those is that the code has basically been run through Python beforehand and compiled into a format that Python can read but humans really can't, which is a little optimization that makes it easier for Python to run. The issue, though, is that when we have multiple distributions at the same time, it's a little difficult to manage them all together and keep them grouped up.
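Going back to that backup check for a second: a minimal sketch of the kind of Boto3 call involved. The bucket name and key layout here are made up for illustration:

```python
# Sketch of an S3 backup-freshness check; bucket and key are illustrative.
from datetime import datetime, timedelta

import boto3
from dateutil.tz import tzutc  # botocore already depends on dateutil

def backup_is_fresh(cluster_name, max_age_hours=24):
    """Return False if the newest backup is older than we allow."""
    s3 = boto3.client('s3')
    head = s3.head_object(Bucket='example-db-backups',
                          Key='{0}/latest.sql.gz'.format(cluster_name))
    age = datetime.now(tzutc()) - head['LastModified']
    return age < timedelta(hours=max_age_hours)
```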
So what we need to figure out is some tool that's going to manage not just the code that we write, or just Fabric, or just Boto, but some way of grouping them all together. One thing we've found really helpful is a tool called PEX, which is short for Python EXecutable. There have been years of work put into PEX to make sure it's reliable and stable, and it follows the standard Python packaging conventions. It uses the zipapp functionality of the Python interpreter to bundle all of your code and dependencies into a zip file, and it adds a little bit of bootstrap code at the very front that the Python interpreter reads, so it knows how to unzip everything and execute the code you've written. This has been a really, really huge game changer for us, because we moved from having one script here and maybe installing everything across all the hosts we have in production, to just saying: hey, here's this one thing, and we can deal with it.

So, inside of our repository we've written our MySQL code and set things up like we'll see in a second. Once we have the PEX tool, we can just say: hey, pex, I want to output, dash O, a tools.pex in this current directory, from this distribution, which is the dot. It outputs a binary called tools.pex, and it's just a zip file. You can run unzip dash L on it and look at all the contents in there; you can extract files just like from any normal zip file. If you run it directly as it is, it puts you inside an interpreter, so it gives you the REPL, just the normal thing you get when you type python at a command line. And what's great is that you can import all the code you've written, you can import boto3, and you have access to all the tools you need right there. You don't need to worry about a virtualenv, you don't need to worry about explaining to people how to update things or pip install; you can just give them this, for the REPL alone.

And here's where I think we go from just a REPL to actually having a tool. With PEX, we essentially pass the name of the tool we've written with the dash C argument flag, and this gives us a tool which looks for that script we wrote initially, that promote_mysql_follower script. It knows: when I run this PEX file, dot slash promote_mysql_follower, I'm going to execute that Python code, but with all the dependencies and all the other code available inside this distribution. And this right here, exactly this, is how we write all of our tools. Question?
Yes: when I say "anywhere", we do compile separate PEXes for different Ubuntu distribution versions, depending on what you're running. If you have pure Python, you can build one PEX and it will run anywhere, on Mac or Linux, or Windows if you've made sure it works there. But if you have any native code that you're compiling, something like OpenSSL or the ZooKeeper bindings, then you're going to want to build on separate platforms. Hopefully that's not an issue for most people, but I'm happy to talk about packaging and building later.

Within your Python repository, we just have a few scripts which I think give you the framework for building this out. The first one is a script I call bootstrap_python, very boringly named, I suppose, which creates the virtual environment and pulls in all the dependencies you need to make sure you can write your code. After that, you configure your distribution with the very standard setup.py, which we'll talk about in a little bit. We give something to our continuous integration servers, like Jenkins; other people use CircleCI or Travis. That's the file your build system is going to use. I put most of our Python code in a directory called lib, and finally we have a little wrapper to help run tests and do some linting and things like that, but that's for a later talk.

So basically, we're going to start from the very base level, which is the setup.py. Is everyone familiar with this? If you're not familiar with setup.py: okay. We have just a few little things we have to let Python know. We have to give it a name; we have to keep a version, and keep track of which version we're publishing; and, probably most importantly, we tell it which dependencies we need. Are we using Boto? Are we using the Google Cloud API? Which client libraries are we dealing with: python-consul, or for ZooKeeper, Kazoo, something like that? Once we have this list, Python can pull in these dependencies, the packages we need. So it looks something like this, pretty simple, and I've stripped out a few other pieces, but essentially this is the general framework you need. Give Python the name of your distribution, the version, and, most interestingly, this entry_points dictionary down here. This is where you define all of the names of your tools. If anyone has any great ideas for names, I'd love to hear them; I haven't come up with anything good just yet. For every tool we create inside the bin directory of our tools repo, we specify a matching entry here in this dictionary, sorry, inside this list. So for each promote_mysql_follower, or upgrade_server, or reboot_box, or whatever, we have a line in here, and that's what tells PEX to execute that code. If anything in this is important, I'd say it's the entry_points, because that's how you're going to create all the tools. Questions on that? And lastly, the dependencies you're using: I'd highly recommend pinning the versions, specifying exactly which versions you need, because that prevents version changes or conflicts down the road. Questions on versioning?
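A stripped-down setup.py along those lines might look like this; the distribution name, pinned versions, and tool entries are all illustrative, not Slack's actual values:

```python
# Illustrative setup.py following the structure described above.
from setuptools import setup, find_packages

setup(
    name='ops-tools',
    version='1.0.0',
    # Code lives under lib/, as described above.
    packages=find_packages('lib'),
    package_dir={'': 'lib'},
    # Pin exact versions to avoid surprise upgrades and conflicts.
    install_requires=[
        'boto3==1.4.4',
        'Fabric==1.13.1',
        'PyChef==0.3.0',
    ],
    # One entry per tool in bin/; this is what PEX's -c flag looks up.
    entry_points={
        'console_scripts': [
            'promote_mysql_follower = ops_tools.promote_mysql_follower:main',
        ],
    },
)
```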
So this makes sure we have everything set up and can use our Python code as well as all the dependencies we need. The bootstrap_python script is primarily for developers, so that's all of you in this room; you're going to be very familiar with it. It's the first thing you execute when you first set this up. It makes sure you have your virtual environment, and when it runs python2.7 setup.py develop, it takes all of the code you've written inside your repository and makes it available, so you can actually test it and execute it as well, which is nice. There are a lot more helpers and other things in there, but those are the most important steps. Questions? Cool.

Then, to actually set this up and make it available, we just have a Jenkins job, like I said, which executes that build.sh, and inside of that it uploads to S3 and makes everything available. So here's the interesting part. Each time we build, we have this really small one-liner, which I've broken out a bit to make it more legible, that uses the pkg_resources module to look at every entry we've written inside that console_scripts section and generate the list of all the tools we expect to build inside our CI system. This is nice because every time there's a change, everything gets created and versioned for us. It's been pretty handy: if we have some low-level improvement, some fix we're adding to a low-level library, we can rebuild all of our tools and make sure they all get that benefit. The other part of this is actually building all of the PEX files, which does the work of making sure we have all the latest Python code and all the correct dependencies set up, put into those executable zip files. And we have a little helper script called s3_upload which puts them up in the bucket and makes them available under the build number created by Jenkins. This is actually POSIX shell, it's not even bash, I know. Yeah, you could write Python for this if you want. I don't think I would. You could. Best tool for the job. Oh dear. Any questions on this? Or heckling?

We actually deploy these tools in a very, very simple way; we don't worry about Docker containers or anything like that. Again, we use Chef for our configuration management, so we made a small helper resource, we call it slack_ops_tool, and all it does is say: hey, you want to download this tool at this version, we'll put it in and make it available in /usr/local/bin. You'd build the equivalent for your configuration management system. Here's the provider. I'm not going to go into too much of this, but the important part is that we depend on s3_file, which is another open-source cookbook that does a lot of the heavy lifting of going to S3, making sure it signs the request, does the HMAC something-something signing, I don't know how it does all of that, but it makes sure all of it happens under the hood for us, so it authenticates and pulls down the file correctly. We also keep different versions on disk. These aren't too big; actually, some of them are a few megs, which is kind of large for some Python, but for us disk is pretty cheap, and we like having everything self-contained, all the Python dependencies in one place, which is nice.
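Back to that pkg_resources one-liner for a second: unrolled, it's roughly this, with an illustrative distribution name:

```python
# Enumerate every console_scripts entry so CI knows which PEX files
# to build; 'ops-tools' is an illustrative distribution name.
import pkg_resources

def tool_names(dist_name='ops-tools'):
    dist = pkg_resources.get_distribution(dist_name)
    return sorted(dist.get_entry_map('console_scripts'))

# CI then does roughly: for each name, pex . -c NAME -o NAME.pex,
# and pushes the result to the bucket with the s3_upload helper.
```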
So we actually have a very, very simple way of deploying these tools; we don't worry about Docker containers or anything like that. Again, we're using Chef for our configuration management, so we made a small helper. We call it blackops_tool, and basically all it does is say: hey, you want to download this tool at this version, and we'll put it in /usr/local/bin and make it available. You're going to want to build the equivalent for your configuration management system. So here's the provider. I'm not going to go into too much of this, but the important part is that we're depending on s3_file, which is another open source module that does a lot of the heavy lifting of going to S3 and making sure it signs the request; it does the HMAC, something, something, something signing. I don't know exactly how it does all that, but it handles all of those things under the hood for us, so it authenticates and pulls down the file correctly. We also keep different versions on disk. These aren't too big, actually; some of them are a few megs, which is kind of large for just some Python, but for us disk is pretty cheap, so we like having this self-contained, with all the Python dependencies in one place.

All of that craziness and a little bit of extra setup just means that our teammates can write three lines in Chef. So when, you know, the MySQL storage operations engineer finally sits down and automates that runbook (it was kind of crazy), it's nice for them to be able to say: all I need to do at the very end is write these three lines of code, and it shows up on all the boxes. It's a good feeling. And similarly, if you have some daemon, maybe something using Flask, running under Monit or the like, we have an equivalent helper for that too. So again, all of that setup you only have to do once: the setup.py, the Jenkins job, all that stuff. Put in the time and the effort to do it once and get a good pipeline going, and then all of your teammates benefit from this workflow.

So this is really all they have to do: they create the Python script, they add that one-line entry in the setup.py, they go to Jenkins and click build (or you can do continuous integration), and then they put in that little entry to install it with Chef. Eventually you can extend this. You can add additional testing; you can publish to different environments if you have them; you can utilize Chef environments for this, so you can create separate folders within S3, pull down the latest version in some cases, run some integration tests, things like that. Just the bare minimum gets you a lot of value, but it also gives you a lot of flexibility for adding more safety checks and balances down the road, which is nice as well.

So overall, to sum up what we've gone through here: for the most part, we all do a lot of great work, and we really pay attention to honing our craft and being excellent at our work. We want to make sure we put that out there so our teammates know what's happening; then eventually we want to convert it into tools which do the work for us, so we don't need to repeat ourselves; and then eventually we shuffle that into self-healing systems. We only have a few minutes left, but are there any questions?
Yep, great question. Yeah, we have tons of internal libraries, and we keep them all in source control. We're looking into setting up, I think it's DevPI or something like that; yeah, we eventually want to have something like that sitting in front of PyPI.

Yep. So the question is: how do we work with team members that don't know Python? This is an excellent, excellent question. We had a lot of team members that didn't know Python when I first started, and this is actually why I pushed so hard to pick a process that you know well and understand well: that way you can focus on how to solve the issue and get your feet wet with Python, and not worry about solving a tough problem and learning Python at the same time. So I would suggest starting really small. Some of the teammates that give me the best code reviews today didn't know Python when we started, so it's been pretty cool to see, actually. Yeah, it's a great way to do this.

Yep. So: I really like the toolkit that you've built and how you ship it, but you ship it to the command line and run it on your machines. We do something a little bit different: we have a Docker repository that holds those toolkits, and we run them through Jenkins. What that gives us is a managed machine that runs them for us, so it doesn't go down, and there's also some auditing and logging. I was wondering if you were thinking about those issues too. And one more thing it gives us is some documentation, a simple UI where you can list all your things, and there are parameters you can enter in the Jenkins jobs. What do you think on those topics?

So the question is: how do you take this from the command line to some web application that you run, that has parameters, that has auditing, something like that. Sure. So we've taken a few command line tools and turned them into services, and we've been using pretty simple strategies for doing that. We have a few Flask applications which do relatively simple authentication, and because we've structured most of our core logic in this way, we're able to plug it into Flask inside the handler very easily: we're basically just reading the parameters and passing them in to run exactly the same logic. For us it's worked out really well. For instance, we have a service which is in the hot path for provisioning machines; it was originally a command line tool, and now we've turned it into a service using the same underlying functions and Python code. Does that answer your question?
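Here is a small sketch of the pattern that answer describes: one shared function behind both a console_scripts entry point and a Flask handler. Everything here (names, parameters, the provisioning stub) is hypothetical.

    # Sketch: the same core function backs the CLI tool and the Flask service.
    # All names and parameters here are hypothetical.
    import argparse

    from flask import Flask, jsonify, request

    app = Flask(__name__)


    def provision_machine(hostname, role):
        # Hypothetical stand-in for the shared library logic; in the pattern
        # described, this is the code the command line tool already calls.
        return {'hostname': hostname, 'role': role, 'status': 'provisioned'}


    @app.route('/provision', methods=['POST'])
    def provision_handler():
        # The web handler just reads parameters and calls the shared logic.
        result = provision_machine(request.form['hostname'],
                                   request.form['role'])
        return jsonify(result)


    def main():
        # The console_scripts entry point parses argv and calls the very
        # same function.
        parser = argparse.ArgumentParser()
        parser.add_argument('hostname')
        parser.add_argument('role')
        args = parser.parse_args()
        print(provision_machine(args.hostname, args.role))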
Thank you for the talk. It looks like you're deploying the tooling to all the servers, and then sometimes the tools run automatically, driven by monitoring. How can I reconcile that with the example you gave, which requires touching multiple nodes? If the tool is deployed on all the nodes and they all run it, how do you make sure they don't collide with each other?

So the question was: how do we target which nodes a tool is installed on, how do we control where tools run, and sort of the orchestration of the tools. We're actually specifying, within a cookbook recipe, which machines are going to have those tools installed. Some of them are service-wide: the web server monitoring daemon is going to run on every web server, and those don't require any orchestration or interaction with each other, so those are what I'd consider fleet-wide tools. But then there are some which are expected to be human-orchestrated, and those we install in sort of a centralized place (not quite a bastion box, but a place we can run tools from), and it's locked down so that only a few members of the operations team have access to run them. The tool that we walked through in particular does all of its orchestration within its own process, so that one I would only install in one place; I wouldn't install it across the fleet. But that's a good distinction. Question back there?

Oh, Python 2. Oh man, that's a good question. Yeah, let's talk Python 2 versus Python 3 later. One day, one day I will do it. I've been kind of lazy on this one; someday soon I'm going to update it.

How do you authorize these operations and make sure the tools have the access they need and no more? This is pretty complicated, but I think it's roughly as complicated as how a user has access to do this stuff: if my user account has access to SSH into a machine, run sudo, and execute commands, then me running these tools should also have that access, in theory. So for the most part we've been able to use the existing ACLs and IAM policies to restrict access to the things we need, and then we add or remove pieces as needed. But yeah, that's a great question, and there's tons of nuance to it. I think there's one more. Yep.

So, do you typically include your environment settings files in the PEX files? For example, a credentials file: a dev person has a dev credential, not a credential to run against production, so either they have to place it themselves or they only have their own dev credential, and they can't deploy their PEX file onto production. But if you don't include that file in the PEX, then you need to know the location where the file is placed, and only the infrastructure folks, the sysadmins, know that location. So if you change the location, you have to let the devs know, or put a convention in place before you build your PEX file. How do you solve those problems?

Sure. The question is: it sounds like there's a sysadmin team who manages credentials, but how do you make sure the PEX files have access to those? In very small cases, I would say you could hard-code some fake passwords, pseudo-passwords and pseudo-credentials, inside the Python code. But for the most part, I think the issue isn't really how your tool knows how to get access to those credentials, but how anybody gets access to those credentials at all.
So I think you want some established practice there. We actually use Vault, with consul-template, and on the hosts where these tools are going to run, we drop in the credentials that are specific to that host inside of a directory, and they're readable by particular Unix users. So when we create a new service, say it runs as, you know, the provision service user or something, it's going to have read access to those provision credentials, and that's controlled with Vault. But yeah, maybe we can talk after and get into more detail. Yep. Oh sure, sorry. Okay, last one, sorry. Great question. Yep.

So, I've written primarily Python; that's just what I knew, and I came in as the one that was going to be writing a bunch of tools. We have a few new teammates who are really strong Go developers, and they're kind of doing the same thing, but for Go, and I think that's going to be the future. But I think we're always going to use Python, because it's very approachable: good documentation, a lot of support, a lot of third-party libraries. It's really conducive for someone who's like, hey, I've written some scripts in the past, or done some Perl or something. In my experience it's pretty easy for someone to pick up Python; it's pretty easy to pick up Go too, but Python is what people reach for a little more quickly. But yeah, I'm looking forward to Go, because I think it solves a lot of this stuff and you don't need the extra apparatus. So, thanks for the time.

Okay, there we are. Okay, cool. Hello, and thank you all for joining us for this talk. der.hans will talk on etckeeper, which is version control for the /etc directory. der.hans is a technologist and entrepreneur, and currently the chairman of the Phoenix Linux Users Group. When you have questions, please raise your hand and I'll bring you the mic so that we can record the questions and answers. And that's it, thank you very much.

Okay, thank you. All right, thank you for coming to SCALE. I hope you're enjoying the conference; I know I certainly am. I'm going to talk today about etckeeper. What does etckeeper do? It puts /etc into version control. The author is Joey Hess, and basically, if Joey Hess wrote it, you really should look at it; he writes some fabulous software. He was a Debian maintainer for many years. Here's a list of some of the different pieces of software he's written, and I highly recommend pretty well anything he's done; if you need something in one of those categories, consider what he's written. So, the long and short of it: Joey Hess is awesome, etckeeper is awesome. All right, we're done; that's what you need to know. But I have a few more details if you're interested.

Okay, so first of all, what is system configuration? Why are we looking at /etc, and what is it? Most of us probably know, but /etc is a directory that (et cetera, there we go) holds system configuration for a Linux or Unix operating system. Changes in /etc can affect system behavior and performance. If you go through and, for instance, mess up the password file, it makes it difficult for people to log in. I've never, ever done that. If you wipe out the password file because your editor does really stupid things, it makes it really hard for people to log in. Yeah, never done that. So it's kind of important to keep an eye on what's going on in there, just for authentication and so forth.
But also, if you have Apache configs, or you're running some kind of service on your system, it's good to track what's going on over time. One of the other pieces is that, in general, we use plain text configuration files. This is awesome in many ways. One is that we don't have a registry: if you change the configuration for one thing and break how that configuration file is formatted, it doesn't cause every other service to stop working, and it's fabulous that we've got those separated. The other is that, by having plain text (and it might have some format; you've got INI files and different types of formats), we have lots of tools that know how to work with plain text. We have tools that know how to automatically read these files, how to automatically write and change them, and how to track what's happening to them. So this gives us many advantages for controlling our systems.

Now, etckeeper is part of a backup solution, not the whole of one. I'm going to talk about the advantages of etckeeper, but it doesn't do everything. It doesn't back up your file system; it's putting /etc into revision control, and you still need to back that up somewhere if you want disaster recovery. So there are other parts of a backup solution, whether you're talking about a machine or a service or a company, that you still need to put in place for disaster recovery. But etckeeper is a really critical portion of it, I find.

So what it does: it puts all your files in /etc into version control. Your password file, shadow, your Apache configs, nsswitch, all of those go into version control. And it does it safely, completely, and consistently; I'll talk about each of those three and why they're important. So first of all (not first of all, but the next thing): what is a VCS?
I keep saying we put it into version control; what is version control, and what does it do? It tracks changes to files. So I've got a file that I created today, and tomorrow I make a change to it and check it in, and I can use the VCS to see what those changes are. We're familiar with this because we use it for code all the time, but we can also use it for plain configuration files. For those of you that write articles, you can put your article into a VCS as well, so you can track changes over time, and when the editor gives you feedback or somebody points out a typo, you can fix it and publish the new version. So it works for anything that is plain text; or rather, it works well for anything that is plain text.

We can show the changes between versions; I just covered that. VCSs work great on text files. They're not so great on binary files; there are some problems with binary files. Joey Hess has a great tool called git-annex that helps alleviate some of the problems git has with binary files; go to a different presentation to find out more about that. He's actually taking funding for it right now, so you can fund work on git-annex and a graphical tool he's building for it.

And one specific example: you can retrieve a specific version. The example I give here is that you can pull the version of your networking config from March 4th, 2014 and find out what it was. This actually hit me recently: I moved, and not all of my systems got moved to the new network, and yet they kept working. Linux is awesome; even on the wrong network, stuff was still working. But I had also made some typos in a couple of them, and going back and being able to find the old versions of my configs, instead of trying to remember what I came up with at three o'clock in the morning one day, was really advantageous for me.
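As a sketch of that retrieve-an-old-version trick, assuming /etc is a git repository managed by etckeeper (the file path and date here are made up):

    # Pull a config as it existed on a given date. The repo lives in /etc
    # and is readable only by root, hence the sudo.
    cd /etc

    # Find the last commit on or before the date in question for the file...
    commit=$(sudo git log -1 --format=%H --until='2014-03-04' -- network/interfaces)

    # ...and print the file exactly as it was at that commit.
    sudo git show "$commit:network/interfaces"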
Now, safely. I mentioned a couple of different files. /etc/passwd: how critical is it that we protect the information in /etc/passwd? Not very, because /etc/passwd needs to be world-readable by everybody on the system; that's how you find out who the users are, what their home directories are, what shell they run, things like that. The secrets were moved, a long time ago, back in the 80s and 90s, into a file called /etc/shadow. That's where we have the encrypted copy of your password. Now, how secret do we need to keep that file? As secret as possible, right? That one's locked down: in order to read it you have to have root privileges, or use a tool that gives you root privileges for the moment that you're using it. And you need anything that holds a copy of that file to be locked down too, because if I can copy that backup to another machine where I have root, even though I don't have root on your machine, now I effectively have root, because I can go crack your passwords. So etckeeper makes sure the repository directory gets mode 700, so only the owner can read and write it, and that the owner and group are root. You've got that for the base repository before you start putting files in it: you've already locked down the directory everything is going into, and then you keep things safe underneath that. We'll cover those permissions again later in the presentation, so keep in mind: there is a quiz.

All right, I'm going to hit the correct button this time. There we go. Now let me take an aside on permissions and modes, and tell a little story about NetSaint. Those of you that have not been around quite as long as I have might be more familiar with NetSaint's current incarnation, called Nagios, but many years ago it was called NetSaint, and they changed the name for whatever reason. I was doing my own version of putting /etc into revision control; I was using RCS at the time, this was a long time ago, and my NetSaint stuff stopped working. It wouldn't start up, so NetSaint stopped monitoring to make sure services were working and things like that, which was kind of annoying. And the reason it gave for not working was something about buttered bread and elephants; I don't remember, the error messages were completely irrelevant and useless, which makes sense for monitoring: hey, things are broken, but I'm not going to tell you what. That was useful. What it turned out was that NetSaint was very insistent on certain permissions of files, and certain ownership and group ownership of files, and if you messed those up, NetSaint would presume that somebody had broken in, or that you were a complete idiot, and refuse to participate. I don't know if Nagios is the same nowadays; I use it, but I don't set it up, I don't fix those problems. And if it finds a file with the wrong permissions, it tells you: hey, by the way, if you'd fix the permissions on this file, I'd appreciate it.

So that caused me a problem, and normal VCS tools don't watch that well, which is a problem if you've checked something in and then need to check out an old version. That's what happened to me: I checked out an old version that ended up with the wrong ownership, and if NetSaint's files are suddenly owned by you instead of by root, the system should complain; that's not right. So we can't just plainly use a VCS and start putting /etc into revision control, as I found. We need to do some extra things as sysadmins: version control was usually written for developers who are creating code, and they're not deploying; we need to stay up in production. So permissions and ownership are important for us.

And completely: that's permissions, ownership, and empty directories. VCSs usually do not track these. If you've got a directory with no files in it, a VCS will just ignore that directory until you put some files in it. A couple of them have mechanisms for saying, hey, by the way, there's a folder inside this directory, but missing empty directories can cause real problems, depending on the service. So for system administration we need not just a VCS, but something that tracks these pieces so we can put them back correctly. And etckeeper does; it has some extra tooling to track these particular items and make sure that when something gets put back, these items get put back correctly too. Now, many VCSs nowadays will track whether a file is executable, but that's about all they keep of the permissions, and that bit is particularly useful: your init scripts that are executable need to stay executable. Everything else about mode and ownership needs something extra to restore it.

Now, consistently. This is the other part. If you do an upgrade, or you install a new package, or you even remove a package, etckeeper will see that there are changes and check them in for you. If you go in and start changing things by hand, etckeeper doesn't automatically notice that; you can throw a cron job in to take care of it if you want. But package changes are one of the big ones, especially a package change where the package maintainer says, hey, you really need to take this configuration change that you didn't want. Well, fine, make the change, and I'll just go revert that file and get my old version back.
I used to, when I did package updates in Debian, copy and paste the old version so I could put it back, or I would say no, don't make the change, and then go merge it in by hand later on. Well, now I can use my VCS to do it, which makes it much simpler. And I can also go check and see what changed with those installs; so if a package is misbehaving (in Debian this would never happen), but if a package is misbehaving and changing configuration for a different package, I can find out about it, shall we say.

So, who cares? I'm telling you these advantages, but if you're not a sysadmin, why do you care? And even if you are a sysadmin, you're like: I've got Puppet, I've got Chef, these things take care of this for me. Well, I find there are some specific needs. Like I said, you can see what changed recently. I get called, all right, I can go look and find out what happened. Did you change anything? I know whether or not something changed, because the VCS tells me: either there's a file on the system that's changed and hasn't been checked in, or I can look at the logs and see that a change has taken place recently.

I can revert changes. This includes when package management makes a change that I didn't want. Now, you still need to go fix the package, but if I've created a configuration that does exactly what I want it to do, I don't want an upgrade to remove all of that, and if it does, again, I can put it back pretty easily.

I can restore a file when I inadvertently remove something. I am lucky: in my career, and I have been doing this for 20 years now, I have only accidentally removed files a few times, so I've not had too terribly much trouble with this. However, when you do, it's kind of nice to be able to put it back. Especially this one, which I have done: being in the wrong environment. I worked at a company where we had 40 data centers, and we had www101, www102, in every single data center (Colin has also lived through that), and it was annoying, because that was all that was in the prompt; I couldn't tell which data center I was in. I fixed that by fixing the prompt, so I would have the region as well as the short host name. But you still have prod and testing, so if you remove something in the wrong place, you can fairly easily put it back; same if you just change something in the wrong place.

And when you have multiple people and multiple tools with their hands in there, you can find out what's going on. If somebody keeps making a change to fix something at 2 o'clock in the morning, but they don't fix it in your configuration management tool, so at 2:30 it rolls back to the old version, and they're like, the alerts are gone, I don't care, and they do that every day for a while... aside from the fact that they should just quit fixing it the wrong way, now you can see that they're doing that. And of course, you can investigate changes after a script kiddie gets in. If somebody gets in there and starts doing things, you can find out about it. Now, one thing it doesn't have, and I keep saying I need to write this myself, is an audit mechanism to go pull the logs and see whether people have been changing things. But that would be fairly easy to do, because you're just pulling logs off of a version control system.
All right, etckeeper setup. It's really easy: sudo etckeeper init, then sudo etckeeper vcs commit -m, give it some message, "initial checkin", then profit, right? One of the primary ways of profiting in this particular case is that you get to sleep more often, because you'll hopefully end up with fewer alerts. So now you've done the init, you've got the repo, and you can do normal repo things; but as I said, there is a little bit of magic going on as well.

So here's an example. I can create a file in /etc, I can add that file to my VCS (and I'm using git for my examples), so I add that file with git and then commit it. Here's also an example where I add the IP for my Nextcloud instance to /etc/hosts. Quick quiz here: I need to edit a file that only root has access to, but I'm not putting the sudo on the echo, I'm putting the sudo on the tee. Anybody want to cover why that is? Yes: the echo is not actually writing to the file; the tee is what's writing to the file, so that is what needs to run as root. If I just said sudo echo and then double-greater-than /etc/hosts, does that work? No, because the redirect happens before sudo ever runs; the shell performs the redirect, as the person running the command, before the command even executes. So you have to sudo the tee. And then I just threw the output to /dev/null so the example wouldn't take up four slides.

All right, so then I do a git checkin. I do a diff, so you can see I'm just using git to look at the changes; I do a checkin, and then I go, oops, there are four ones in that last octet, and that's not really allowed. So let's revert to get that change back out; I put in the proper value and check it back in again. I'm just using a revision control system. Another example: none of us has ever accidentally run a command, right? You start running a command, you change your mind to do something else, but you didn't quite get rid of the first command. So here's an example where I remove a file; well, then I can check it back out and put it back in place.

All right. etckeeper does support multiple version control systems, but Joey highly recommends using git, because that's what he uses, and that's what he does all, or almost all, of his testing on, though he has added pieces to work with the other systems. If you're using a distribution that defaults to something other than git for etckeeper (maybe somebody who would be saying, "Bazaar, because we made it"), he recommends you go change their configuration to use git. That's what I do: on one of the distributions I use a lot, etckeeper is the second thing I install. The first thing I install is git; then I install etckeeper, make sure it's using git, go through and do my init, and then I also add openssh-server, because why is that not in their default? So you can use the others, but we do recommend just going with git; it makes sense.

Now, as I say, /etc becomes a normal repo, and you can just use your version control system's tools and commands to do stuff. But there is a little etckeeper convenience wrapper he's provided. This is nice if you happen to be running different version control systems for some reason, and in a few cases it's a little bit shorter, less typing.
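Pulling the examples from this section together, here is a condensed sketch of that workflow; the host name and addresses are made up.

    # One-time setup: create the repo and make the initial commit.
    sudo etckeeper init
    sudo etckeeper vcs commit -m 'initial checkin'

    # Edit a root-owned file. sudo goes on tee (the process doing the
    # writing), not on echo: a plain >> redirect would be performed by
    # your unprivileged shell before sudo ever ran.
    echo '192.168.1.40  nextcloud' | sudo tee -a /etc/hosts >/dev/null

    sudo git -C /etc diff                       # see what changed
    sudo git -C /etc add hosts
    sudo git -C /etc commit -m 'add nextcloud'

    # Oops, bad value: put the file back the way the previous commit had it.
    sudo git -C /etc checkout HEAD^ -- hosts

    # Accidentally deleted a tracked file? Check it back out.
    sudo git -C /etc checkout -- hosts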
And considering we left the vowels out of most of our commands for 40 years, we like shorter typing, so go that way. Here are some examples. We have a commit where I'm using the convenience wrapper; commit is actually a command from etckeeper, and I can give it a message and it does the commit for me. The one below that is an example of using the version control system, the git command, to do the same thing. In the second set I'm using vcs, which is the convenience wrapper, and saying: use your VCS and give me a diff; and below it, again, is the git version. Now, you'll notice that I'm using dash capital C and then /etc. That's because I'm giving examples you can run from anywhere: if I were already in the /etc directory when I ran the sudo, I wouldn't need to tell git which directory is the home of that repo. Likewise, I can do etckeeper vcs status, which uses whatever your VCS is to run status, or I can run it on my own; and with etckeeper, if I'm trying to get the status of a subdirectory, I can pass through the -C like I did for git and say, go look at this subdirectory. I can do the same thing with git.

Now, if you want to copy the repo somewhere else (and I pulled this directly out of the documentation, so no need to type it all down), the big thing I wanted to point out is that I'm going to another server and saying, make a directory and do stuff, but one of the first things done after making the directory is changing the mode on it, locking down access to that directory. Because once you do the remote add and the push for the backup, once that backup lands in that directory, if the directory is world-readable, other people will be able to read it. So you make sure only the owner has access to that directory. In this particular case I'm doing it as me, not as root, which also means, as I realized, that my git push is going to fail, because I don't have access to push the repo. If you were running as root, you would have root on both sides and so forth; but I never allow root to SSH into a box, and I hope everybody agrees with that. We can have a discussion later on; I will bring a big stick.
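Here is a sketch of those two pieces side by side: the wrapper versus plain git, and the remote copy with the directory locked down before the first push. The backup host, path, and branch name are hypothetical.

    # Convenience wrapper and the plain-git equivalent.
    sudo etckeeper commit 'updated hosts file'
    sudo git -C /etc commit -m 'updated hosts file'

    sudo etckeeper vcs diff
    sudo git -C /etc diff

    # Copy the repo to another machine. The chmod 700 happens before the
    # first push so a readable copy of /etc never exists on the far side.
    ssh backuphost 'mkdir etc-backup.git &&
                    chmod 700 etc-backup.git &&
                    git init --bare etc-backup.git'
    sudo git -C /etc remote add backup backuphost:etc-backup.git
    sudo git -C /etc push backup master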
Ignore. Now, I said that etckeeper backs up everything in /etc; that's not quite right, and we'll get into it a little bit. It uses your normal ignore mechanism. So if you have something in /etc you don't want checked in (you keep your music collection in /etc for some really insane reason), you don't want to check that into git, so you can say, ignore my music collection, or whatever. It uses whatever your normal VCS uses; again, I'm using git as my example. There are also some files that just don't make sense to check into revision control. That includes ephemeral files, ephemeral being things that change constantly or go away. /etc/mtab, I think, is nowadays a symlink into /proc anyway, because it changes all the time; that's not real content, so let's not put it into revision control and just cause extra I/O. It's not much I/O, but you're also adding a whole bunch of noise to your repo. And then there's cache data: /etc/ld.so.cache is a cache of where to find libraries, and every time you install packages that use libraries, it gets updated, et cetera, et cetera. It's something that gets regenerated; you don't need to back it up, and you don't need to check it into version control. Although, depending on where you are, if you're having problems with particular libraries, you might want to say, go ahead and do that, so I can track it over time and see if there are some common versions and so forth.

So, there are other tools that look at /etc for us and do things. One is configuration management: Puppet, Chef, Ansible, a bunch of tiny shell scripts. Another is package management, the tools we use to install packages. And then you can also use file system snapshots; this came up in a comment (I ended up writing an opensource.com article on etckeeper as well, and it came up in one of the comments), so I added it here.

Let's talk about configuration management. This is a very high-level view of part of what it does; I'm not trying to teach you everything about configuration management. Go talk to the Chef folks, who are here, and I'm betting Puppet is here; go talk to those guys. It sets files, or parts of files, to a specific state. So I can say: here is my Apache configuration, and my configuration management tool pushes out my Apache configuration, restarts Apache, and there we go. Or it might even just manage part of a file: I can say, add this user, and it doesn't track all of /etc/passwd, it only tracks the user (or users) that you told it about. So, for instance, it probably doesn't track the root user, or extra root users that show up after some script kiddie gets onto your system. It's not tracking everything or checking everything in. Use the configuration management system to set the state, and etckeeper to log the changes. And I only believe in using a configuration management tool, or an orchestration tool, if you're looking at the bigger picture (I don't know what we'll call the next kind of tool; it'll be the world music tool, whatever): your configuration management keeps all your systems the same everywhere, but then use etckeeper to track the changes, and then, like I said, use audit tools to make sure you're getting the changes you want and that you don't have extra things going on, whether that's a human in the system at 2 o'clock in the morning, or a script kiddie that gets in, or some tool that's just doing the wrong thing. As I said, don't use Pico; it had a bug where it would wipe out a file, which is a really bad thing. If you have files disappearing in /etc, that kind of matters, depending on which files they are.

Package management (with extra a's, apparently, on the slide) sets files to an initial state. You install Apache, you get the default configuration for Apache. Now, the package management system might ask you some questions: hey, do you have a website, do you have a domain, do you want me to only listen on localhost? So package management might do some initial configuration for you, which is kind of cool, but again, it's not tracking it over time; the package management system doesn't look at that configuration file again until you install a new version of the package.

It also might provide tools for automated configuration changes, and this is something I love about Debian. Years and years ago, the guy that created Debian Junior asked on the debian-devel list (initially we had lots and lots of opinions, because that's debian-devel), he said: I'm creating Debian Junior, and I need to modify how base packages behave so that my kids can do stuff. He needed the limits on how often you can change your password to be different from the default.
He wanted that so his three-year-old couldn't change the password every two hours and then ask for resets, but he also wanted the older kids, the teenagers, to start participating in system administration, without being able to change the password for the three-year-old every three hours just to mess with the three-year-old. So he had some different defaults he wanted, and his Debian Junior packages were not allowed to change a core package's files in another package, because that's just forbidden inside Debian. And the resulting conversation was: if you need to fork us in order to do that, we're broken. That was where we started getting all these .d directories, where you can drop in your own additions. Basically, what it comes down to now is that when his package comes in and says, I want to change how the passwd and shadow behavior is done, it goes and talks to the package that owns that (I don't know which one it is off the top of my head) and says, hey, by the way, can I make this change? And that package provides tools so those changes can be made, and it can allow or disallow them. And in Apache, you just throw a file in conf.d and poof, you've got your own configuration for your tool. So NetSaint nowadays, or let's say Nagios: when you want the web interface, it just drops a nagios.conf under /etc/apache2/conf.d, and when you restart Apache, now you've got your Nagios interface in there as well, which is really nice. It's made package management way easier.
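As a sketch of that conf.d pattern (paths and directives here are illustrative; the exact file a package ships varies by distribution):

    # A package drops its own drop-in file instead of editing the main
    # Apache config. Contents shown as comments; these are illustrative.
    cat /etc/apache2/conf.d/nagios.conf
    #   Alias /nagios /usr/share/nagios/htdocs
    #   <Directory /usr/share/nagios/htdocs>
    #       Require all granted
    #   </Directory>

    # Restart the web server and the new interface is live, with no edits
    # to the package-owned main configuration file.
    sudo service apache2 restart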
But again, it's not tracking everything. The Apache package doesn't notice that there's a new configuration file, because the Apache package's scripts ran when you did the install and it doesn't do anything again until you do another install. So you've got those changes taking place, and package management, again, doesn't track anything. So we've got two tools that don't track everything that happens inside of /etc: use those to do the things they do, and then use etckeeper to log the changes.

Now, file system snapshots. Okay, snapshots track everything that changed, right? They're doing bit-for-bit comparisons against the old snapshots of the file system. So this does track everything that changed in /etc, but it tracks everything that changed on the whole file system: something changed in /var, something changed in /etc, something changed in every other place on that file system. And we don't keep every snapshot. You've got snapshotting, copy-on-write file systems, where every change can give you a snapshot, but if you snapshot every single change on the file system, your logs alone will run you out of disk space in a couple of days. So you can't keep every single change that comes in; with snapshots you keep, you know, a couple of days' worth, you've got your regular backups of stuff, and you start tossing the old ones away. So if you really want to see how things have changed over time (and whether that matters depends on you; it matters to me, but not necessarily to everybody else), snapshots aren't providing the full tool that I want. They're great, and they work together with this; they're awesome for the backup portion. But I want to track what's going on: if I want to see what my Apache configuration file looked like this week versus last week, I can use my VCS to do a diff between those two versions. If I use snapshots, I need to restore the snapshot, bring both files together, and do a bunch of extra stuff. I could create a tool for that, but why, when I already have one? I can just use git, and there we go. So again: snapshots are great for backups, but I still think we want etckeeper for logging the changes that take place in /etc.

Now, I mentioned earlier "completely"... "consistently," I mean. Package management hooks: etckeeper has hooks into your package management system. Joey was a Debian developer primarily, so for some reason I'm betting he's got better support for .deb and Debian packages than RPM, but etckeeper does work with RPM-based systems as well. I don't know about others, because I just haven't needed to care, but from what other people have posted, it works with other package management systems; go read the documentation to find out if yours is covered, and if it doesn't work with the one you like, please submit patches and add support for it. So you get automatic check-ins before and after package management changes. I didn't realize this until a few months ago: I noticed one day that it was checking files in before a package install took place. What etckeeper does is hook in when you install a new package: it looks at /etc, finds any changes that have been made, and if there are changes, checks them in automatically for you; then the package install happens, and then it checks in the changes after that. But that means that if there was a change to my configuration files that I hadn't noticed, it has now quietly been committed for me. So if you want to audit, especially from a security perspective, go look for those types of events, and maybe add your own hook that says, hey, email everybody in infosec when you find that passwd changed in a way it shouldn't have, things like that.

And VCS hooks. This is an example; I basically just catted out one of the files from etckeeper, the one for git. It's doing a pre-commit check: find out if there's anything outstanding, and if there is, check it in. So the point of this whole thing... okay, I didn't get the setup right, did I? Sorry, I was using a line from rear delt, because rear delt is awesome. Anyway, the point of this thing is: I hate uncommitted configuration changes. I want to know something changed, I want people to have taken ownership of those changes, and I want tracking for what happened inside my production environment.
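To make the package-manager integration concrete, here is a sketch of the shape of that hook on an APT-based system. etckeeper ships its own apt.conf drop-in; the exact path and contents vary by version, so treat this as illustrative.

    # etckeeper hooks APT by dropping a config snippet that runs it around
    # every dpkg invocation. Contents shown as comments; the real file may
    # differ in detail.
    cat /etc/apt/apt.conf.d/05etckeeper
    #   DPkg::Pre-Invoke  { "etckeeper pre-install  || true"; };
    #   DPkg::Post-Invoke { "etckeeper post-install || true"; };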
So: etckeeper keeps your system configuration inside a VCS. You get your choice of VCS, but you should use git. It is part of resiliency, backup, and disaster recovery architectures; it is not a complete breakfast unto itself, right? You need to add other stuff, and there are lots of other tools for making sure you have the other pieces, but I think it's an essential part of all three of those types of architectures. And of course, don't forget the other parts: you throw it into git on that system and you have backups? No, no. You need to copy that data off to another system and do other things. Just because you've got a VCS, so you can check out old versions, does not give you disaster recovery; if some magnetic comet comes down and wipes out your data center and all the data in it, and that was the only copy you had, you have no copies. Some resources: I'm pointing to the etckeeper home page and to my opensource.com article (basically they saw my presentation and asked me to write an article), and these are ways you can get hold of me, if you want to get hold of me. And I will stop doing that.

Do we have any questions? Yes sir. Can you repeat the question? So, if you have unsafe symlinks in /etc that point at files outside of /etc: don't do that, that's the simple solution. Years ago I was working at a place and they said, hey, if you run Mozilla as root, it core dumps, and I said: good, you shouldn't do that. But if you do, it depends on how you set up the revision control system. Just because you have it in revision control doesn't mean that, on that particular instance, you know which version the symlink target was at; you could probably look at timestamps and other things to figure it out, but that becomes a pain. I haven't particularly done that, but I'm pretty certain you can tell your revision control system: by the way, follow the symlinks for me instead of just recording the symlinks themselves.

Right, she's bringing you the mic. Is the save, the commit to git, in real time, or is there a quiescent period? Let's say I'm running Puppet and I'm going to make 10 changes, and I really want those 10 changes to land in git as one commit, not 10 commits. How does it handle the in-process time, the scheduling? So, it depends on how you do the commit. I like precise, small commits, so if I'm changing a project and I change three files, I might actually check those three files in one at a time: git commit this file, git commit that file; I don't do the whole tree. But if you just tell git to commit the tree, it checks in all the changes at that one time, and I'm pretty certain etckeeper defaults to the latter, checking in the entire tree. If you're using the package management system, it happens immediately after the package gets installed; that's one of the hooks it has. If you're using Puppet or Chef, it depends on how you're kicking it off, and I don't actually kick off etckeeper check-ins from my configuration management system right now. I don't use configuration management at home right now (I used to, but I don't anymore), and at work I have bigger things to fix before I can get there, so I don't know. It depends on how you would do it: what you should do is set up a hook for etckeeper that you can trigger, and then it would check in all the changes in /etc together; I just can't tell you how off the top of my head. etckeeper doesn't have a daemon running, watching for changes in the file system; it has certain events that trigger things. Like I say, with the package management system, etckeeper puts in a hook, and the package manager, APT, says, oh, by the way, go do this as cleanup. That's what you would need to do with your Puppet agent. You can't always do things in order, because Puppet (Luke is awesome, but crazy), so you'd have to figure out something to make it work that way.

Any other questions? Yes sir. I believe so; the question was whether or not I can show the previous slide. I changed slides... can you repeat the entire presentation, because you missed it? Yes sir, there's a piece of paper here that says yes. Hopefully. And I will also be posting this online, so you can see the slides, or just go read my article, which has more words. All right, any other questions?
We will upload both the presentation and the video at the end of SCALE, maybe within the next week; when they post the videos, they try to capture both the crazy person hopping around and the actual screen. Any other questions? No? Well then, we've got a couple of seconds, and I do want to say (I was going to do this beforehand, while we were doing tech support): I'm wearing my t-shirt to advertise the Software Freedom Conservancy. It is a great organization; Karen is our keynote speaker tomorrow. They've created a nonprofit to support projects, like Node.js, but they're also doing work to defend the GPL and to get companies and individuals to honor the license agreement. I think it's a really important organization, so I would encourage you to look at them and consider supporting them, with donations either to them or to their member projects. Enjoy the rest of SCALE, and thank you for coming.

Check, check, can you all hear me okay? I'm thinking Pacemaker is pretty universal, so I'm going to talk about the configurations; but no, it's not Red Hat-specific, this is just a clustering session that I gave for Red Hat Summit. Two more minutes and then we'll get started.

Hello, and thank you for being here at the last session of today at SCALE. We're going to attend a presentation by Thomas Cameron from Red Hat on high availability clustering with Corosync and Pacemaker, where Thomas is going to demonstrate the setup and configuration of a high availability application cluster with a shared file system. Thank you all.

Thank you. Thank you very much. Hey, look at that, we got it down. Hello everybody, my name is Thomas Cameron. I work in the infrastructure business group at Red Hat; it's our engineering group that's responsible for Red Hat Enterprise Linux and all the layered products on top of it. As you can see from the alphabet soup after my name, I am a technologist: I'm a Red Hat Certified Architect, Red Hat Certified Data Center Specialist, Red Hat Certified Virtualization Administrator, Red Hat Certified Security Specialist, and Red Hat Certified Examiner. In years past I was a Microsoft guy, but I got better. I was an MCSE, and before that (I'm going to date myself here) I was a Novell Certified Network Engineer. That shows you that, yes indeed, I no longer wear a beard, because it's too damn gray; I've been doing this for a long time. My contact information is pretty easy: I've been at Red Hat for a long time now, and I am thomas@redhat.com, so I'm fairly easy to get in touch with. I'd love it if you followed me on Twitter; I'm Thomas D Cameron. You can also follow my business page, Red Hat Thomas, on Facebook. And this slide deck is available at the URL down at the bottom, on people.redhat.com, so you can find it out there; I also just gave them the PDF version, so it'll be on the SCALE website as well.

So we're here to talk about clustering; but you can ask 10 different folks in IT and probably get 12 opinions on what clustering is, so I'm going to nail down what we're talking about. We'll talk about HA clustering and computational clustering, dive down into what we're covering today, and I'm going to go through the steps for installing and configuring clustering and shared file systems. I'm using Red Hat Enterprise Linux for this because I work for Red Hat, but it's pretty much the same elsewhere; the package names may be slightly different, but the concepts and the technology are the same whether you're using Red Hat Enterprise Linux or Debian or Ubuntu or whatever.
So don't freak out that it's a Red Hat slide deck; I'm a Red Hat dude, okay? I've got a Red Hat tattoo; I'll show you afterwards if you're interested. So that's what we're going to do: we're going to go through installation and configuration of HA clustering, we're going to go through installation and configuration of a clustered file system as well, and then we're going to get into the fun part, which is testing it: crashing software and rebooting machines and watching what the cluster does. So it's kind of cool stuff. We are going to do, in the next hour, what is normally a four-day class, so buckle up.

All right. When we're talking about clusters in general (and there are a ton more definitions than this), you've got HA clusters, high availability clusters, and you've got computational clusters. HA clusters: multiple nodes serving the same workload, where the primary design goal is that if one node goes down, another node picks up. It's real straightforward. You can use shared storage in this environment for things like a clustered database, clustered web applications, clustered file servers, and so on. And that's kind of what that looks like: you've generally got multiple networks, one for management, one that's presented out to your customers (whether they're internal to your company or out on the internet), and then oftentimes you'll have a network on the back end that connects to your shared storage.

Computational clusters, though, are usually big clusters working on the same, or at least similar, datasets. The design goal here is much more about: I'm going to throw huge amounts of data at it and let it crunch information, or render images, or something like that, and send it all back. They usually use local storage for the actual work, although in many cases there will be NFS servers so a node can grab chunks of data, pull them down, munch on them, and push them back. So, things like Monte Carlo simulations for financial services, or oil field reservoir simulation. I'm from Texas (you may have noticed the twang), and I've covered oil and gas for many, many years with Red Hat; I've spent more time in those big computational cluster rooms than I ever intended to when I got into IT. I also got to work with AMD and with Texas Instruments: they actually do simulations of chips on these big computational clusters, and they'll test every gate and every circuit on a chip before they ever put it on silicon. So it's kind of cool, but it looks very different, right?
You've got all these machines out there that are all the same; they may connect to shared NFS storage, but there's no HA, there's no failover. I'm going to throw a ton of work at it and get it to work. So what we're covering today, as I said, is HA clustering, and I'm going to go over a very basic cluster. This is something that you can do at home if you have three or four desktop-class machines; I've intentionally made this very simple, so if you feel the urge, you can do it at your place as well. We would never use this design in production, because we've got single points of failure all over the place, but it's a good way to learn what's going on: a single ethernet network, iSCSI storage, and three nodes. We've got one machine that is the iSCSI target, and then we've got three machines that are the members of the cluster, which are the iSCSI initiators; they are the clients of the iSCSI server.

For installation, I'm going to blow through this, because again, you probably don't care that much about Red Hat Enterprise Linux specifically, but if you're on CentOS this is relevant. I did a super basic installation: this is just a regular old kickstart file, and the main thing to look at is that under packages, man, I did nothing; it's the base installation, super, super simple. It didn't need a whole bunch of anything complicated. And again, because we're doing this in a lab environment: you can go set up the rules for the firewall ports that need to be open (it's in the documentation, both with the upstream cluster project and at Red Hat), but for lab purposes, just shut off the firewall. I'm talking too fast; see, us Texas boys, we don't like to talk fast, but I know we've got a lot to cover. So just shut off the firewall for this environment. I just took some screenshots to show disabling the firewall and stopping it, and make sure your iptables rules are flushed so all the nodes can communicate with each other.

You also want to make sure that you have time synchronized across all your nodes. You can get some really wonky results if the systems think they have different times, because they talk to each other, and one says, hey, I think this has been down 5 seconds, and the other one's like, no man, it's been down 10 minutes, because they're off from each other. So make sure your time is synchronized: timedatectl, turn on the chronyd service, or NTP, whichever time server floats your boat; turn it on and you should be in good shape. And again, that's kind of an eye chart, but I just showed the output of syncing the time across all four of my systems and making sure it's the same on all four.

Now, these nodes, in order to communicate with each other about what's going on in the cluster, need to be able to SSH into each other. So you generate an SSH key using ssh-keygen and then propagate it to all the other nodes, and make sure you do it from each one of the nodes to each of the other nodes. This is what that looks like: ssh-keygen, and you can just hit enter, enter, enter, so you don't get prompted for a passphrase, and I showed that I did this on all four of the nodes. Then you use ssh-copy-id to push the key from each node to each of the other nodes, including itself. I know it sounds weird, but you want to make sure that all the nodes can SSH into each other. I used the ultra-sophisticated systems management tool of a for loop: for i in host1, host2, host3, do ssh-copy-id, boom, boom, boom, and it goes across.
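A sketch of that key distribution, with hypothetical host names; in this lab setup the loop is repeated on every node so each one can reach all the others:

    # Generate a key once on this node (hit enter through the prompts for
    # a passphrase-less key in this lab setup).
    ssh-keygen -t rsa

    # Push the key to every node in the cluster, including this one.
    # Repeat the whole sequence on each of the other nodes.
    for host in node1 node2 node3 node4; do
        ssh-copy-id "$host"
    done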
There's what the key looks like on all the different nodes, and I checked to make sure it's like that on all of them; everything's good. We've now done it one time: lather, rinse, and repeat for all the other nodes in the environment. It's like a giant circle, but you want to make sure every node can reach every other node without any passwords, and again, this is just screenshots of what that looks like. By the time you're done, once you've copied those keys across to all the other machines, your authorized_keys files are going to be huge on each node. That's cool; you don't ever have to do anything with them, and they're just text, but just note that it will look like, holy crap, what have I done? There's a method to the madness. Make sure that you can log into all the machines without being prompted to accept the key; I did, again, a sophisticated for loop and made sure I could log into all the machines. Replicate that across all the nodes in the cluster, and life is good.

Oh, one other thing; this is kind of an oops-type moment. Make sure that you SSH with the long host name as well as the short one, because if you don't have the SSH keys set up, or if the system doesn't show up in your known_hosts file (home directory, .ssh/known_hosts), the clustering software will try to log in and it will get prompted: do you want to accept this SSH key? So don't do that. I have no idea how I know that... I was talking to the lead engineer on the software project: your stuff's broken, it's not working. He says, well, show me the debug. I send him the debug, and he's like, what do you do for Red Hat again?

So now that we've got the machines talking to each other, we're changing gears. We're going to have one machine that holds our shared storage; remember, I showed you in that diagram, it's the one at the top of the cabinet, and I made it look like a really big cabinet, but it's like four NUCs on my desk. What we're going to do is use the Linux IO target. Has anybody used the old iSCSI target daemon in, like, RHEL 6 or something of that era? Okay; really different. Forget everything you know; nothing at all translates across. It's kind of crazy, but the cool thing is the new Linux IO stuff is kernel-based, it's really fairly straightforward, and it's got a nice menu-driven shell to set stuff up. Once you get the hang of it, I think it's actually a lot easier than the old-style stuff. It's based on a SCSI engine that implements the semantics of SCSI as described by the SAM, the SCSI Architecture Model, blah blah blah blah; this is marketing stuff, but it is a lot cooler in the newer versions of the kernel, the 3.x series. And there's a cheat sheet, and this isn't a Red Hat cheat sheet; this is the one that's at the linux-iscsi.org website. You can go look at the cheat sheet, and I did when I generated this; it's like, step one: type this. So it's pretty good.

This is a fairly small disk. I used fdisk to create the partition that I'm going to share out to the other nodes in the cluster. You guys know how to use fdisk, so I'm going to pop through this relatively quickly, but the main thing is: I took the existing disk, which only had the three partitions (boot, root, and swap), and I added a new partition, and made sure it was primary. If it's the same disk you're booting off of, you'll have to reboot, or you can run partprobe,
Then install the targetcli command, whether you use yum or apt or whatever. The thing I want to point out about targetcli is that it actually drags in a whole bunch of other stuff; it's not just a single command, it brings in a bunch of libraries and a bunch of Python so it can render the menu in your terminal. Make sure the target service is enabled so the iSCSI server will start up; whatever OS you use, just make sure the service starts at boot time. In this case: systemctl enable target. That's target, not targetd.

targetcli can be used interactively, where you just run targetcli and start creating things, or you can pass it command-line arguments and it'll do it all for you. So you run targetcli, and when it first comes up, nothing is defined; the server doesn't know anything about any iSCSI target. The first thing we do is tell the target server, "hey, what device are we going to make available?" You go into the backstores/block configuration menu and type create, the name of the LUN (that's freeform, you can call it whatever you want), and then the path to the block device: /dev/sda5 for instance, or whatever; mine is LUN0 on /dev/vdb1, the virtual drive, partition 1. So you go into backstores/block and say create LUN0 /dev/vdb1; that's what the syntax looks like, and it says okay. When I do an ls from the root, the display has changed, and now we see this 20GB block device available to us.

Now we need to identify this device: we give it an IQN, an iSCSI Qualified Name, a supposedly universally unique name. You can specify it, or just let the command auto-generate it for you. The syntax is: go into the /iscsi menu and say create, with this big long horrific name; the cool thing is it's actually a compliant name per the iSCSI standards. So you say /iscsi create, and all it's done is say "okay, I'm going to have this device." It shows up under there, and you see there are no LUNs associated with it yet; it's just bound to the machine's default portal, its network address. Now that you've got this iSCSI Qualified Name, you have to associate it with that block device. The way we do that is to go into /iscsi/<your IQN>/tpg1/luns and say create /backstores/block/<whatever your LUN is>. You run that create command for LUN0, and now when we list the LUNs, we can see that LUN0 is associated with that block device. (Yes sir? Oh, you're just stretching; thought you were cheering.) Yeah, it's really cool, because if you suck as badly as I do at typing, it's so nice to be able to just hit that Tab key.

Now, again, just for demo purposes (don't do this in production), we're going to turn off a bunch of the security stuff, just so we don't have to set up an access control list for every one of the nodes in here. If you do this in production, obviously I don't expect you to take a 45-minute class and go "hey, let's do this in production"; you'll go and read the documentation on how to define these attributes properly.
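Since I mentioned targetcli takes command-line arguments, here's the whole server-side sequence done non-interactively, as a sketch. The IQN and device are made-up lab examples, and the attribute settings are the lab-only "turn off security" ones we're about to discuss:

```
# Define the backstore, create a target, and map the LUN from the shell
targetcli /backstores/block create LUN0 /dev/vdb1
targetcli /iscsi create iqn.2015-06.com.redhat.tc:summit
targetcli /iscsi/iqn.2015-06.com.redhat.tc:summit/tpg1/luns \
    create /backstores/block/LUN0

# LAB ONLY: no authentication, no write protection, dynamic ACLs
targetcli /iscsi/iqn.2015-06.com.redhat.tc:summit/tpg1 set attribute \
    authentication=0 demo_mode_write_protect=0 generate_node_acls=1

targetcli ls          # sanity-check the tree
targetcli saveconfig  # write the config out
```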
What we're going to do is turn off authentication, turn off write protection, and create ACLs dynamically, which just means when a new machine connects, let it work. You run these commands, boom boom boom boom, and now you've turned off security. Again: lab only, don't do it in production.

Now we can look at the top-level view: cd to / and do an ls within the UI, and we've got everything set up. We've got the block device defined, we've got the LUN associated with the block device, which is associated with the network-facing name, the iSCSI Qualified Name, and we've turned off security so we can reach the block device (upon which we will put the shared file system) over the network. You can run the saveconfig command and it'll write the configuration out, or you can just exit the utility and it'll save it for you on the way out. Either way it dumps the config to a text file, and you can take a peek at it; it's not super friendly, but it's also not awful. It's page after page of the default iSCSI settings for that block device and those LUNs, unique identifiers, and stuff like that. The cool thing is, just like everything else cool in Linux, your config file is just a text file; in this case, a JSON file.

Okay, so now we've set up the iSCSI target, which we'd commonly refer to as the server. Next, on each one of the cluster nodes, we need to install and configure the software so it can connect to that shared storage. So: yum install iscsi-initiator-utils (again, this assumes a Red Hat distribution). It installs, and you'll notice, as with almost every RPM out there, the service doesn't start when you install it; you check it and see it's dead, but configured to be on by default. And I think it's kind of cool to figure out where things go on the file system: you'll notice that /var/lib/iscsi, when you first install the software, is empty. There's no information about any targets or anything like that.

Now you need to tell the iSCSI client on your machine to go query the target and ask "what have you got?" You run iscsiadm --mode discoverydb --type sendtargets ("tell me what's available") against --portal whatever your portal is, your host, with --discover. Then you'll see that the iSCSI process is running and /var/lib/iscsi has content. What that looks like: I do the installation, I look in /var/lib/iscsi and there's nothing there; I run my iscsiadm --mode discoverydb command, and, hopefully, assuming your networking is correct, it comes back and gives you the iSCSI Qualified Name, the IQN. You'll notice the iSCSI service is running, and what's even cooler, all the information it gathered from the iSCSI target now lives in directories and files under /var/lib/iscsi.

Normally the kernel only sees block devices for locally installed hard drives, right? If you put a new hard drive in, you'll see /dev/sdb, for instance. So far we've only said "show me what you've got"; now we have to log into that iSCSI target so we can actually access the drive.
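Here's the client side as one sketch, covering both the discovery we just did and the login we're about to do; 192.168.122.10 stands in for your target's address, and the IQN is whatever discovery returned:

```
yum install -y iscsi-initiator-utils

# Ask the target what it has; this populates /var/lib/iscsi
iscsiadm --mode discoverydb --type sendtargets \
    --portal 192.168.122.10 --discover

# Log in so the LUN shows up as a local block device
iscsiadm --mode node --targetname iqn.2015-06.com.redhat.tc:summit \
    --portal 192.168.122.10 --login

cat /proc/partitions    # the new "disk" should be listed now
```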
So you run iscsiadm --mode node, with --targetname (your iSCSI Qualified Name) and --portal (the portal's address), and then --login, and now we see a new block device. What that looks like: I cat /proc/partitions, and you'll notice I've got three partitions, and that's it. I log in using that command, and look what showed up: I've got a new hard drive. At least, my system thinks it's a new hard drive; it's actually a block device on another machine, but that's the cool thing about iSCSI: it says "yeah, it's a hard drive." Partition it, format it, put a file system on it, knock yourself out. Lather, rinse, and repeat for each initiator: everything we just did, we do again on all of the nodes in the cluster. And at this point we're going to take a break from the storage part, because we need to install some more software and make some configurations.

Now let's talk about Corosync and Pacemaker, because that's really the meat of what we're doing here. Low-level infrastructure has to have information about who the members are, who's supposed to do what, what the status of the various nodes in the environment is, whether the cluster is healthy, and whether we have enough nodes that the cluster can actually vote to kick nodes out of the cluster, and so on. That's Corosync; Corosync is responsible for all that.

Pacemaker, though, is the brains of the operation. Pacemaker runs up high on the stack, and it does things like process events, any messages that come in from any of the clients, and figure out what to reconfigure the cluster to look like. Things like quorum: it determines whether we have quorum, and if so, whether we can vote. It pays attention to failures, to systems taken down for maintenance, things like that. Then Pacemaker computes what it thinks the ideal state of the cluster should be and plots a path to achieve it after any of these events. That may include moving resources: if a system fails, Pacemaker is the one that goes "oh hey, that system that failed isn't responsive anymore, and I kicked it out of the cluster; that's the one that was running the Apache web server, so I need to move it." Pacemaker is responsible for moving services, and by moving I mean stopping them on one node and starting them on another. It can move resources, stop nodes, even force them offline using remote power switches or the like.

When you put the two of them together, they support a lot of open source file systems. The cool thing is that the use of a fairly new (actually, it's not so new anymore) standard around open source clustering means you can use distributed lock management with a whole bunch of different file systems. If you're going to be doing things with GFS2 or OCFS2 or whatever, there are a number of supported file systems that can be managed by that distributed lock manager. And you definitely have to have a distributed lock manager, because you don't want two processes, one on each server, writing to the same block on a disk; bad things will happen. I hope you have good backups. Not the voice of experience at all.

Pacemaker is composed of four components: the CIB, the Cluster Information Base; the Cluster Resource Management Daemon (CRMd); the Policy Engine (PEngine); and stonithd, which I'll talk about in just a second. Here's how they all work together to keep the cluster healthy.
They pass information about state and make decisions about what the desired state should be, things like that. The CIB uses an XML representation of both the cluster's configuration and its current state, and pushes that out to the other nodes in the cluster. The contents of the CIB are automatically kept in sync and are used by the PEngine to compute the ideal state; those instructions are fed to the Designated Controller (DC). Pacemaker centralizes all the decision-making by electing one of the CRMd instances as the boss. If that node fails, the boss role just moves to another node: they vote again and figure out who's supposed to be in charge. The DC carries out the PEngine's instructions by passing them to either the local resource management daemon or the CRMd on one of the other nodes, which figures out what actions to take. The peer nodes report the results of any operations back to the DC, and based on whether it got the expected results, it will either execute additional actions, wait if it sees a process that's legitimately in a wait state, or abort, eject one of the nodes from the cluster, and recalculate. So it's pretty smart software.

In some cases it may be necessary to remove a node from the cluster, and you can do that a number of different ways. For that, we use the STONITH daemon. STONITH is my favorite acronym in all of information technology: it stands for Shoot The Other Node In The Head. That is the best acronym in the world. "Sorry, I had a failure today and had to shoot the other node in the head." Being a Texas guy, that just rolls off the tongue. It's awesome. The STONITH daemon can use a whole bunch of different resources: a remote management switch, like a WTI power strip or an APC power strip; out-of-band management systems, like iLOs from HP, DRACs from Dell, or RSAs from IBM; or even things like fibre switches, where we can just turn off a port to say "no more I/O from you, because I don't hear from you." The thing we absolutely, positively want to avoid, as I said earlier, is two nodes each thinking "oh, I'm in charge, I'm going to write to disk," because they can clobber each other and take out your entire file system. So STONITH is really, really important. If a machine stops responding to the cluster, we definitely want to shoot that node in the head.

All right. Pacemaker makes no assumptions about your environment, which allows it to support pretty much any redundancy level you can imagine: active/passive, which is what I'll show you, N+1, N+M, N-to-1, and N-to-N. I'm only going to cover an active/passive setup, although I will show you some active/active features in a little while. But you can get as complex and crazy as you want, and it's pretty cool: you can have tons of services running across tons of systems, you can group them together, you can set dependencies, all kinds of cool stuff. It's very impressive software. We were really happy to include it in Red Hat Enterprise Linux, because in RHEL 6 and older we used older technology: luci and ricci, the Conga management tooling. That was our own stuff; it came from an acquisition we made, and it was pretty good, but the community came up with something better. We saw how outstanding it was, so we said we'll start contributing and we'll start using it. Again, there's a link to the Pacemaker documentation if you want to go read more about the architecture than probably anybody in the world wants to.
Now we need to install the clustering software. Before, all we did was install the software so we could see block devices; now we install the clustering software so we can actually set up this HA cluster. I'm not going to spend a lot of time on this. If you're using Red Hat Enterprise Linux, make sure you have the High Availability repository set up; that's something you get access to by buying a subscription. The packages you need are the lvm2-cluster software, the corosync package, the pacemaker package, the pcs package, and the fence agents; I'll talk about fencing in just a little while.

lvm2-cluster is the add-on to LVM that gives you logical volumes that are cluster-aware, so you can mount and unmount them from multiple nodes without, again, clobbering them by mounting from two nodes at once. Corosync and Pacemaker we've already talked about. pcs, and the pcs daemon, is the administration tool. It can be used from the command line (I love using it from the command line because it's super fast), but it also provides pcsd, a web server that lets you manage your cluster from a web UI. I like to joke that this is so easy even a Windows admin can do it. We'll use the web UI for this presentation, because putting a bunch of commands up on the screen is not terribly entertaining, but be aware that everything we do in the web UI can absolutely be done from the command line, and you can script the entire build and teardown of your cluster. It's really cool. And fence-agents provides the fence agents for all of the supported fence devices; I'll talk more about what fence devices are in a few minutes.

This is what it looks like: I did the yum install for all of those, it pulled in all kinds of dependencies, but that's what gets installed. Again, make sure you do this on all the nodes; every system has to have the clustering software installed. Then, as I mentioned earlier, by default with Red Hat, installing a package doesn't turn the service on, so you have to enable it: systemctl enable pcsd.service, then systemctl start pcsd.service, on all three of your cluster nodes. Remember, the fourth machine is the iSCSI target; we're not going to do anything with it until much later on.

Now, the way the systems authenticate to each other... scratch that, I'm sorry: they identify to each other as root, but the way you log in is with a user account called hacluster. So what you want to do is set a password for that account, the Linux hacluster account. You can use the passwd command; I did echo, whatever the password is, piped to passwd --stdin hacluster, and set that on all of the nodes, all three in the same command, so the hacluster account has the same password across all of them. Now that we've got everything installed, the pcsd service running (remember, that's the web interface for configuring the cluster), and the password set for the administrative user, we can go ahead and start building the cluster. You do have to do a couple of command-line things before you can log into the web UI: you use the pcs command to do authorization, saying we're going to set these nodes up as members of this cluster.
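Just to recap the setup to this point as commands: on a RHEL 7-ish system with the HA repo enabled, it looks roughly like this on every node (the password is obviously a placeholder):

```
yum install -y lvm2-cluster corosync pacemaker pcs fence-agents-all

# pcsd is the management daemon behind both the CLI and the web UI
systemctl enable pcsd.service
systemctl start pcsd.service

# Same hacluster password on every node (example password only)
echo "redhat123" | passwd --stdin hacluster
```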
So: pcs cluster auth node1 node2 node3, whatever your nodes are, and that information gets stored in /var/lib/pcsd. This is what that looks like; it's fairly straightforward. I know it's kind of an eye chart, I'm sorry, I'm used to slightly larger displays. You do pcs cluster auth node1 node2 node3, and you get feedback that says "yep, okay, it's good." And we find files down in the /var/lib/pcsd directory, including the authentication tokens and things like that; that's where that information gets stored.

Now that we've told pcs who's going to be playing, we have to actually set those systems up. We run pcs cluster setup, give it a name (whatever you want to call the cluster, it's up to you), and then node1, node2, node3, and so on, all the nodes you have. It will generate the corosync.conf file under /etc/corosync. This is what that looks like: you'll notice there's no corosync.conf file in /etc/corosync yet; we do pcs cluster setup, give it a name (I called it "summit," because this was for Red Hat Summit), and the nodes: hideo, malcolm, and lady3jane.tc.redhat.com. You can see it actually shuts down and restarts the pacemaker services, writes the configuration files out, and you get feedback from each node saying "yes, I got the configuration file." Then it restarts everything, and now we've got the corosync.conf file. We do what we need to do on one node and it pushes it across to all of them. You've heard me say "do this on each node" up to now; at this point, with pcsd up and running on all the nodes, you can work from any one of them, and this is awesome: there's no single point of management anymore. The corosync.conf that gets generated is super simple: three nodes, not doing anything yet, no services defined, no resources defined, nothing like that.

Now we enable the cluster: pcs cluster enable --all. That only marks which services we're going to start on each of the nodes; you run it once, you don't have to run it on the others. Now you... wait a second, did I skip two? Nope, okay, cool. You can take a look at the Corosync status and see that it's actually running; the nodes have started Corosync, so they can transfer information back and forth, but we still haven't started any services. This is on hideo, and it seems we're just fine. I can do the same thing with Pacemaker and it's running... I'm sorry, scratch that: it's stopped. It's stopped right now on all the nodes until we do the next step. Sorry, I got ahead of myself.

So, now that we've authenticated, or authorized, the nodes and pushed our configs out to them, we start the cluster: pcs cluster start --all. You'll see it come back and say it's starting the cluster on malcolm, on lady3jane, and on hideo, the three hosts I have in this environment. It takes a few seconds for the nodes to sync up; they have to communicate with each other, using all those processes we talked about earlier.
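The command-line version of everything I just clicked through, using this lab's node names, plus the health checks we'll look at in a second:

```
# Run from any one node; pcs pushes everything to the others
pcs cluster auth hideo malcolm lady3jane -u hacluster
pcs cluster setup --name summit hideo malcolm lady3jane
pcs cluster enable --all    # start cluster services at boot
pcs cluster start --all     # actually start corosync and pacemaker now

# Watch it settle; it can look scary for the first few seconds
pcs status
corosync-cmapctl | grep members    # who corosync thinks the members are
crm_verify --live-check            # validate the running configuration
```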
The first time I did this, I panicked a little, because I looked at the status and it wasn't clean. What? I ran the command a few times, going "oh crap, I broke something" or "I didn't do something right," but after a few... not even minutes, after a few seconds, it changes over, and you see that yep, we're in good shape, everything's fine: the nodes are online, all the nodes are happy, and all the processes have started.

After that, on the Corosync side, the config tool will tell you what the ring status, the cluster status, is. You can use corosync-cmapctl and look at the members, and sure enough, there are all the members of the cluster; it gives you the IP addresses. I can do crm_verify, and it comes back and tells me that the cluster resource manager's configuration is sane and it knows there are resources available.

One thing I do want to point out: be careful. The cluster is started, but we don't have any fencing devices yet; we don't have the ability to shoot another node in the head. If you do not configure your fencing and you try to start your cluster up, none of the services will start, because it assumes you've done something silly, that you've missed a step, and it does not want to start shared services without the ability to shoot a node in the head. That, too: voice of experience. I didn't have a fence device when I first did this on my little PCs, and I was going "why can't I get my cluster to start?" But that's why. It's actually a safety feature.

All right, now we can log into the web UI. You connect via HTTPS, and that's important: make sure you connect via HTTPS, on port 2224. As I mentioned earlier, this is a phenomenal thing for a cluster: all of the nodes are running pcsd, all of them are listening on 2224, and there was much rejoicing. If you used previous versions of the clustering software, there was one node that was your management node, and it was a single point of failure; it was kind of a pain. So, pretty standard process, you've all done this before: you go to a website with a self-signed certificate, Firefox gripes at you, you say it's okay. You log in as that hacluster user account that you use to manage the machines. So log in as hacluster.
Can you all see that okay? Your eyes are redder than mine. Gray beard, bad eyes. All right: in this case I log into one of the machines, and because the pcsd software, the cluster management software, is just saying "hey, I'm here to manage clusters, but I don't know about any clusters yet," what I need to do is tell the pcsd interface about the cluster. I go to the cluster management tab and say "add an existing cluster," give it any one of the nodes in the cluster, that hostname, and click Add. It goes out, and it takes a couple of seconds because it has to fetch cluster status and such, and it looks healthy. You can click on that and look at the health of the Corosync process, the Pacemaker process, and so on; all of the nodes should have pacemaker, corosync, and pcsd running. When you click on the cluster, you get an interface with all your nodes over here; click on one and make sure it's healthy, and, lather, rinse, repeat: click the second one, make sure it's healthy, click the third one. All right, my cluster is healthy, it's up and running. I don't have any resources or services defined yet, but at least I know my systems are talking to each other and can see whether the others are responsive.

From that main page there's also a Cluster Properties tab where you can affect cluster-wide settings. This is where you can do things like set delays and define which STONITH configuration is in use. And that's something I want to point out: there's a checkbox, let me see if I can put the mouse over it, there's my mouse, there it is: a "stonith enabled" checkbox right there. If you don't have a fence device but you still want to play around with this, just uncheck "stonith enabled." You can bring up services; you can't do some of the fancy clustered stuff, but at least you can bring up web services and shared IPs and play with the software without it saying "no, you can't cluster, because we don't have a STONITH device." The other thing I really love about this interface is the context-sensitive help: hover the mouse over a setting, and it pops up and tells you what the default is and what the option means. I don't know about "intuitive," because I don't think HA clustering is what you'd call intuitive, but it's pretty helpful.

So, to configure fencing: as I mentioned, if a node stops responding, the cluster will attempt to remove it from the cluster, because we don't want multiple machines writing to the same data store. If a machine stops responding, the other nodes vote on whether that machine is up, and if they decide it's not, there are a lot of ways they can act on that: managed power strips, fibre channel fencing, IPMI out-of-band management, even SCSI reservations (if your SAN devices support SCSI reservations, they can knock a node out). Or you can do virtual machine fencing, which is really cool: spin up a bunch of VMs on your machine, run the fence_xvm daemon on the host, and it's actually smart enough to go kill virtual machines if they become unresponsive. So: you go to the Fence Devices tab, click Add, and add the device.
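For what it's worth, that "stonith enabled" checkbox maps to a cluster property you can flip from the shell; and again, this is for playing around only:

```
# LAB ONLY: let resources start without a working fence device
pcs property set stonith-enabled=false

# Check it (and the rest of the cluster-wide properties) with:
pcs property list --all | grep stonith
```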
What I chose, when I went to the Fence Devices tab and scrolled down: in my lab at my home office I actually have an 8-port WTI power strip. It's a managed power strip. They're super cheap; you can get them on eBay for like 80 to 100 dollars. They're really old and awfully insecure (they use telnet, so don't ever put them anywhere internet-facing), but if you want to play around with this kind of stuff, you can usually find a WTI power strip on eBay fairly cheap. We identify it, give it a friendly name (in my case, summit-wti), and then enter the IP address or hostname of the device; your system should be able to resolve it and figure it out.

Right now you'll notice it's inactive, because I need to set a couple of additional settings. For almost any fencing device, especially a power-management fencing device, you need to tell it which power port or ports go to which machine. In my case I've got 8 ports, so I go to the pcmk_host_map and say lady3jane is on power port 5, hideo is on power port 4, malcolm is on power port 6: if any of those machines become unresponsive, go and kill them. That's how you do that mapping. Unfortunately the WTI switch (at least, I haven't figured out how) doesn't seem to support multiple ports per host, like "he's on port 1 and port 2," the way some servers have multiple power supplies; I'm still trying to work that out. It's probably something really simple, I don't know. You also set up the credentials to log in to the power strip: username and password, except with WTI you just log in with a password, there's no username (like I said, not very secure). You fill in that password, and the fence agent understands that it's going to telnet in, log in with that password, and use port X for server X, port Y for server Y, and so on.

Now, I like to set some additional options. I prefer to have a power-cycle event wait for a few seconds: if a node does get shut off, I want it to kill power, wait for about 5 seconds, and then bring power back up. I don't remember which motherboard vendor it was where, if you power-cycled it too quickly, it would reset the BIOS to factory defaults. School of hard knocks; that was kind of embarrassing. So: if you have to kill something, turn it off, leave it off for a few seconds, then turn it back on. I also set the delay to about 5 seconds, and that will vary in your environment. The delay is: okay, the machine is unresponsive; how long should I let it stay unresponsive before I shoot the other node in the head? Some applications can bog a machine down for 30 to 45 seconds. I ran a horribly, horribly overloaded Oracle database server one time where the machine would go unresponsive for 90, even 120 seconds at a time. It was still running, the load average was like 90, and if you just left it alone and didn't poke it, it would finally chunk through whatever it was doing and become responsive again. So be aware of the pauses your particular application load can cause before you set the delay; don't set a 5-second delay if you've got a workload that can stall the box for 15 or 20 seconds at a time. Again, you can set those values in the web UI: the power-cycle wait and the delay. A few seconds after you save your changes, that fence device should go green.
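The CLI version of that fence device, as best I can sketch it: fence_wti is the real agent name, but the address, password, and port map here are from my lab, so substitute your own.

```
pcs stonith create summit-wti fence_wti \
    ipaddr=192.168.122.50 passwd=wti-password \
    pcmk_host_map="hideo:4;lady3jane:5;malcolm:6" \
    power_wait=5 delay=5

# Test it -- this really does power-cycle the node:
stonith_admin --reboot lady3jane
```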
What that green means is the clustering software has actually gone and telnetted into your device (or SSH'd in, or logged into the iLO or the DRAC or the RSA or whatever you've got) and confirmed it can talk to it. I tried to set this up on my laptop to do a live demo, but I don't have a WTI power switch on my laptop, so... no, not gonna do that. And if you try to configure a fence device you don't actually have, it will block the rest of your cluster, so be aware of that.

You can test the fence device and host mapping with the stonith_admin command. What I did was run stonith_admin with the reboot action and a hostname, and you can do that from any one of the nodes: "stonith_admin, hey, let's test node 3." You can actually watch the log files on the other nodes: you'll see "hey, I got a command," it logs into whatever the fence device is, turns off that power port, and then the rest of the nodes go "hey, this other node just dropped out of the cluster." You can watch the logs and see the DLM ejection. That's what that looks like; I'm not going to read a log file to you, you all know how to read log files, but it's just kind of cool that you can go kill a machine and watch it bounce out of the cluster.

The next thing we need to do is configure resources. Now that the nodes can fence each other, we're going to create a service comprised of two resources: an IP address, and the web server that runs on that IP address; we're going to choose Apache. To configure the floating IP address, go into Cluster Resources and click Add. You're going to add a resource from the ocf:heartbeat class and provider, and if you've ever used this clustering software before: don't use the old IPaddr type, use IPaddr2. The old IPaddr was for classic UNIX-style IP handling; IPaddr2 is specifically for Linux. You give it a name, give it an IP address, and assign it. It's pretty straightforward: you define what it's going to be, give it the IP address, and save it. After a few seconds it comes up, it turns green, and if you go look at the machine it's running on, you can do ip addr show, and there's the address we assigned as our floating IP. It comes up on a machine and you're in good shape; if you do pcs resource show, you'll see it's there.

Now we've got the IP address floating back and forth, so let's install the web server that will run on that IP address. Install httpd, and I also recommend you install wget; I'll show you why in a second. Make sure you install it across all the nodes. And make sure that whatever service you're going to make highly available does not start at boot time. You don't want it to start when the system boots; you only want it to start when the cluster starts it, right? That makes sense: we don't want Apache running on all three nodes, and then the cluster tries to migrate Apache and gets a collision. So make sure it's turned off. Just for testing purposes, I put the hostname in the index.html file on each one of the nodes; again, I ran that across all three. And here's why I said you want wget: you can also create a monitoring page, a health status page, by creating a status.conf file, so the machine can check itself.
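Sketching that same setup from the shell, with a placeholder address; the status.conf contents are the standard Apache server-status stanza, written here in Apache 2.4 style:

```
# Floating IP as a cluster resource
pcs resource create summit-ip ocf:heartbeat:IPaddr2 \
    ip=192.168.122.100 cidr_netmask=24

# On every node: install Apache, keep it OFF at boot, tag the page
yum install -y httpd wget
systemctl disable httpd
hostname > /var/www/html/index.html

# Health-check page so the cluster can probe more than just a PID
cat > /etc/httpd/conf.d/status.conf <<'EOF'
<Location /server-status>
    SetHandler server-status
    Require local
</Location>
EOF
```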
That check asks "hey, is the web server listening?" against 127.0.0.1, to make sure the web server is up. So we create that file.

Now, to configure the Apache service, we go into Resources and click Add. The type, again, is from the ocf:heartbeat class, for httpd, or Apache, I'm sorry. You give it a little bit of information: a friendly name, in my case summit-apache. You can define other things if you want, but you just create the resource, and hey, we've got green lights across the board: the IP address is up and the web server is running. Just for grins I ran pcs status, and I see the IP address running and the web server running. Anyone see what's wrong with that picture? Where's the IP address? Hideo. Where's the web server? Oops: on Malcolm. I brought up my resources, and they both started successfully, but on different machines. Okay, that's not going to do us a whole lot of good. What I did was look at the IP address and see that it's up, but when I go to the hostname it's running on, sure enough, the web server is up on Malcolm. That doesn't do us a lot of good.

What we need is resource ordering and resource colocation preferences. Resource ordering just means: before this resource comes up, this other resource has to have come up. In our case, we go into the summit-ip resource, click down under Resource Ordering, and say the summit-apache resource comes up after this IP address does. We're saying the IP comes up first, and only then can Apache come up. Okay, but that doesn't solve the problem of which host they'll be on. So we go to Colocation Preferences; this time we go to the summit-apache resource and associate it with the summit-ip resource. I go into the Apache resource, scroll down, say "for summit-ip, we want these to be together," and click Add. Essentially we're saying: bring up the IP address, then bring up Apache, and it's got to be on the same node. The cool thing is, as soon as I save that, it may go red for just a second, but then it turns green again, and now pcs status shows a much better result: the IP address on Hideo, and the web server on Hideo. Life is good. If I go to that IP address now, I no longer get "not found"; I get the page saying it's running on Hideo.

You can also do Apache monitoring. Remember that little server-status page we created? You can go into the advanced resource configuration for Apache and set the status URL to http://localhost/server-status, and it will check that page to make sure the Apache web server is responding. The cool thing about this is that it doesn't just do "ps ax | grep httpd, yep, I see the process." It actually asks the application to do something: serve up a web page. I like that, because it means we're doing more than a silly little process check.

To test this, I killed the httpd processes. I do ps ax | grep httpd and you can see I've got plenty of them running; I do a pkill -9 httpd and they're dead, all gone. In the log file you see a bunch of events: the CRM, the cluster resource manager, goes "oh hey, something's not responding, the service isn't running," and after no more than about 60 seconds, all of a sudden, it's back up: it automatically restarts the service.
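From the command line, that resource plus the two constraints we just set up look roughly like this:

```
pcs resource create summit-apache ocf:heartbeat:apache \
    configfile=/etc/httpd/conf/httpd.conf \
    statusurl="http://localhost/server-status"

# IP first, then Apache, and always together on the same node
pcs constraint order summit-ip then summit-apache
pcs constraint colocation add summit-apache with summit-ip INFINITY
```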
You can also set thresholds that say "try that three times, and if it keeps failing, start it on another machine instead." And you can set host affinity: I go to the summit-ip resource, click on Resource Location Preferences, and put in a host and a priority. The higher the priority number, the more likely the service is to run on that host. When I add one node, in this case Hideo, and click Add, you see the service move over to the Hideo server, and sure enough, when I look at the web page, it's Hideo's index.html. Then when I add lady3jane with a higher priority, after a few seconds it moves those services over to lady3jane, and the web page shows lady3jane's index.html. Ditto for the third machine, Malcolm. If you want to force a service to a host, give it a priority of INFINITY and it will force it over. That's actually a great way to do things like move services off a system and then shut it down for maintenance.

To offline a host, you just go to the node and click Stop, and that machine goes all red; there's a stop button right down there. Everything shuts down, life is good, and pcs status shows that one of the machines is offline. To online it, do the reverse: click Online, or Start, and it comes back up. To reboot, go to the node and choose Restart, and let's be real clear: that fires off an init 6, a reboot command. This is not fencing the machine; it's a graceful shutdown.

To enable distributed lock management, we set up a new resource: ocf:pacemaker, with a resource type of controld. And I want to show you something kind of cool. I clicked Add and created the new resource, but the thing you want to make sure you do... hang on just a second, did I skip over that? Huh, there's a slide missing, guys, I'm sorry. Anyway: when you create the resource, remember there are some checkboxes down there; make sure the Clone checkbox is selected. Clone just means "run it on all the nodes," that's all. When you do that and click OK, it starts the distributed lock manager on all of the nodes.

The next thing is a clustered logical volume management resource. Again, go into Resources, click Add, and this time choose the clustered LVM service, and again, make sure that Clone checkbox is checked, because we want the clvmd service running on all the nodes; we don't want to try to access a logical volume concurrently from two machines without it. It'll go green again, you'll see the services running, and you can check that the clvmd process is running on each node; life is good there. Now we need to make sure LVM itself understands that this is clustered: run lvmconf --enable-cluster, which changes the locking type from 1 to 3 in the lvm.conf file. I'm going to skip over this warning because I'm running late; you can also just use Perl to make the edit if you want, and you may want to turn off lvmetad as well.
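Here are the shell equivalents for the affinity trick and the two cloned services, sketched with this lab's names; the agent names (controld and clvm) are the usual ones on RHEL 7-era clusters, so check pcs resource list on yours:

```
# Prefer (or, with INFINITY, force) a resource onto a node
pcs constraint location summit-ip prefers hideo=INFINITY

# Distributed lock manager and clustered LVM, cloned onto every node
pcs resource create dlm ocf:pacemaker:controld clone
pcs resource create clvmd ocf:heartbeat:clvm clone

# Flip lvm.conf locking_type from 1 to 3
lvmconf --enable-cluster
```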
Now, finally, back to the shared storage. Use your favorite partitioning tool (I used fdisk, you can use whatever you want) and create the partition, then log out of and back into the iSCSI target so it re-probes those partitions; sure enough, once you log back in, there's that sdb1, available. Then it's standard stuff: pvcreate, and that's what that looks like. Then vgcreate, and don't forget to do vgcreate --clustered y, so we flag that volume group as cluster-aware, and point it at the block device; that's what the output looks like. Then lvcreate to create the logical volume. To look at them, you can run pvs, vgs, and lvs; that forces a scan from each of the nodes, and you can see those volumes on all of them.

Install the gfs2 software for the clustered file system, and then run mkfs.gfs2 with -j 3. That's the number of journals: if you think you're going to expand your cluster, you can do -j 5, it doesn't matter; each journal takes 128 MB on disk, so extras are fine. Then -t with the cluster name ("summit") and the name of the file system (gfs0), and then the block device you're making it on. You'll get a warning asking "are you sure you want to do this?" and you say yes.

Then add that shared storage as a resource: a new ocf:heartbeat resource of type Filesystem, and the information you give it is actually relatively straightforward. You mark it as cloned, because, unlike the clustered LVM daemon, we do want this one mounted concurrently on all the machines at the same time; so make sure you check Clone. You give it the resource ID name, the device path (/dev/vg-whatever), where to mount it (in this case /var/www/html), and the file system type, gfs2. It turns green, we do mount | grep gfs2 on all the machines, and you can see it is indeed mounted on all of them concurrently. pcs status shows all those resources up and running. You can create an index.html file that shows it's GFS, set the location priority of the IP address to INFINITY on each one of the nodes in turn, and you'll see it bounce from node to node to node; test it by refreshing the screen and you'll see gfs, gfs, gfs.

And then finally, and I know we're just a second over, give me two more minutes, guys, I apologize: this is my favorite part. Having jumped through all these hoops, we get to go abuse our systems. Make sure you sync your file systems first: type sync a few times, then echo s to /proc/sysrq-trigger, and then echo c, crash, to /proc/sysrq-trigger. You do this, your terminal hangs, your console looks like this, it spits out all kinds of ugly stuff, and you'll see error messages in the syslogs of the other machines saying "hey, I see this other machine is non-responsive." You'll see it go through logging into the switch and turning off the power port, and after a few minutes you'll see that node come back. That's what that looks like. So we've gone through all of the steps for all the different things that are required; as I said, this is usually four days' worth of material, with labs and things like that. Now go forth and cluster.
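End to end, the storage side plus that abuse test look roughly like this; volume and device names are this lab's placeholders, and that echo c really will crash the box, so only do it on something you intend to fence:

```
pvcreate /dev/sdb1
vgcreate --clustered y vg_cluster /dev/sdb1
lvcreate -n gfs0 -l 100%FREE vg_cluster
mkfs.gfs2 -j 3 -t summit:gfs0 /dev/vg_cluster/gfs0

# Cloned Filesystem resource so it mounts everywhere at once
pcs resource create summit-fs ocf:heartbeat:Filesystem \
    device=/dev/vg_cluster/gfs0 directory=/var/www/html \
    fstype=gfs2 clone

# The fun part: sync, then deliberately crash this node
sync; sync; sync
echo s > /proc/sysrq-trigger    # emergency sync via magic SysRq
echo c > /proc/sysrq-trigger    # kernel crash; watch the others fence it
```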
I appreciate you staying late; I tried to talk as fast as I could. I don't know if we'll have feedback on this, but that was a ton of information in a very short amount of time. Did that make sense, the steps involved? Anybody got any questions? I know I'm the last thing between you and beer-thirty. Yes sir? Yep: people.redhat.com/tcameron, and like the lady who introduced me said, these slides are in PDF; you'll have them on the SCALE website too. Thank you for bearing with me while I ran over. Do I do anything on compute clusters? No, I haven't done anything with compute clusters; that's a whole other ball of wax I never really got into. All right, get out of here, go drink. Thank you very much. Can you go one back? Yeah.