Thank you, everybody. Hello, hello. Is everybody still awake? I know this is after lunch, so if you feel like a nap is coming along and you need to rest your head, I'll allow you to put your head down for a few minutes. But first and foremost, thank you for coming out. We appreciate it, and hopefully you get the best out of the lab and the workshop here today.

One disclaimer, and I'm going to be the bad guy who says it because I don't want Chris, my partner here, to take the brunt of it: we have limited resources. I appreciate all the people who came out to see the workshop, but because resources are limited, not everyone may be able to participate. The good news is that if you want to participate from home and follow through the lab, I'm going to make the resources available for a week after the summit. So if you don't get to participate today, do not feel bad. I'm going to give you my card and direct resources you can consume from home; you can step through the lab and tell me how it went. The limited-resources part is on me. It's very difficult to spin up enough clouds for, I don't know, 150 people in the room, so I'm just being totally upfront and honest about it. I just wanted to put that out there. But cool, let's do it.

Starting out, I want to first introduce my co-worker here, Chris. We're partnering up on this lab, and we'll talk a little bit about Chris on the next slide. To give you some information about me: I work for Rackspace, if you haven't realized that from the Racker shirt I have on. I'm a cloud solution architect and I work in the private cloud space. Some things about me: I've been in IT for probably a little bit too long, I love motorcycles, and I am an actual DJ, so if you need a DJ for your weddings or your parties, please call me up. I'm always about living
life now; I believe that tomorrow is never promised to you, so I'm all about that. These are just some of the companies I've worked for. Follow me on Twitter, or send me hate notes on Twitter; you're welcome to do that as well. My blog is there too if you want some other OpenStack information; I put some stuff out there every once in a while.

Hey guys, my name is Chris Woodard. I've been at Rackspace for ten years, RPC for four. I'm here to show that anybody can do this job, even somebody with a Louisiana school education. This is also my first time presenting; Walter was kind enough to let me join him up here on stage and stumble along. So, without further ado.

No, no, I need the support, and he's a great guy. He probably has a lot more experience than me, because he actually gets to touch OpenStack every day. I just get to say "OpenStack" about 200 times a day, but he actually gets the chance to touch OpenStack. So listen to him, not to me, is basically what I'm saying.

So, really quick and simple ground rules. We're all adults, so I know I don't have to say this stuff, but I'm just going to say it. If you need to answer a phone call, please step outside; try not to take the call in here. Another requirement is that you must ask questions. If you have a question, please raise your hand and ask it. Don't keep it to yourself, and don't tweet it to me later with bad words; ask the question now so I can try to help you. Take any side conversations outside, mainly because I like hearing only myself talk, not everybody else. And as I said before, because we have limited resources, I'm going to ask you to group together. I know this is a weird thing for techies, because we don't like to sit too close to each other, but I'm going to need you to work in groups of three or more so everyone gets a chance to actually touch the keyboard and do things,
but I'm going to need you to figure out how to group up so that everybody can get the most out of it.

The materials for the workshop are at this link here. Google URL shorteners are case sensitive, so you must type it exactly like that. It's going to be on another slide later, so don't worry about trying to grab it now, but that's where all the resources are. Oh, it's moving too fast. Why is it skipping past that? There we go.

All right, so again, groups of three or more. Look around and stake out who you want to group up with now. Each group will be given a student ID and some instructions for connecting to your OpenStack cloud. Usually I give that as a handout, but I made a mistake and forgot to print it, because I'm just not on my game this week. The information on how to connect to your OpenStack cloud is in the GitHub repository at the same link we showed before. I'm going to go through and assign team numbers, and from there you'll go into the Git repository and pick up your information.

What you're getting as a team is an OpenStack Liberty release cloud and access to another server, a StackStorm server. StackStorm is an open source piece of software that does auto-remediation. We're going to talk more about StackStorm later, but that's what you're getting: a fully functioning OpenStack Liberty cloud as well as access to a StackStorm server to do your automated recovery. The last thing is that we're also going to be leveraging Nagios as our monitoring platform to monitor each of your clouds, so that you get a shortcut to figuring out what's wrong with your cloud, because Chris is going to share some information with you soon about your cloud that you're going to be curious about.

So without further ado, I'm going to turn it over to my partner in crime, Chris, to get you
guys started.

All right, services auto-recovery. Whoops. Okay, looks like I'm suffering from the same thing. Yeah.

All right. So, services are going to fail. That is part of any OpenStack environment, or any infrastructure; this is normal. What we're looking at here is the OpenStack core services. There is a large amount of overlap there: when one service fails, it can cause a chain reaction and multiple service failures. Recovering a service can be as simple as restarting that service, or there could be a lot of troubleshooting involved to track down what is actually going on. As for ways to monitor these services: depending on your monitoring platform, you can identify what is actually failing and then troubleshoot from there. Or, you know, let Rackspace do it. Lots of prayers, things like that.

So, automated remediation: "the ability to identify and verify symptom failures, with a focus on taking actions in response to an event in an automated fashion." That's what we're trying to do here. You don't want to get that call at three o'clock in the morning; you just want to see that the issue was resolved, and then ask what cleanup there is going to be afterwards.

Just as a side note, this definition is something I created, so don't try to steal it and put it on Wikipedia. This is my definition, so if you steal it, you've got to give me some money for it. This is our brave new world.

Okay, so as we were saying earlier, everyone should group up. I think we have 30 environments, so hopefully groups of three; I think there are 20-something rows. Basically, you'll log into this environment, and a service will already be broken. You're going to log in, determine what's broken, and get that service back up and running. From there, we're going to start talking about
stack storm and doing the automated recovery process Okay, so this will be the link to the get So what I'm going to do is I'm going to come out and start assigning some team numbers since I again I forgot to print it out I don't these guys don't count they're my coworkers they don't count so you guys are going to be team one okay so who's my three over here I need you gotta find one more you gotta find one more dude who wants to join these two gentlemen Somebody has to Alright there we go you guys a team two You guys have three at least but there's only two people I can count that's two You gotta have three or more three or more Yeah, you got all four you guys can work together. I'm fine with that right you guys okay with that It doesn't matter You can let them do all the work it's fine you guys you Okay, well you do all the work perfect all right you guys will be team three okay all right You got three people over here who's interested Come on All right team four keep you remember your numbers it's important all right three at least okay team five You guys want to participate all right anybody willing to join these guys They need one more I thought you guys came to do the labs okay team six I mean it makes me feel better than everybody's not jumping at it so I mean I'm fine with that I think Oh now you're playing with me You guys right here Yeah you guys will be team seven you three will be team seven Team eight You guys all together They're team nine Yeah team nine You guys gonna work together All right you team ten okay You gonna work together ladies you need one more Okay your team eleven we need one more for team twelve here All right team twelve you got it perfect You guys gonna work together All right team thirteen Anybody You need one more body I'm not gonna tell you until you tell me You guys are gonna do four or just get one more You guys all want to work together Okay All right team fourteen Okay All right Team fifteen Team fifteen You have three you 
guys wanna work together All right team sixteen You want one more for you Who wants to work with these guys Working with these guys Come on I would work with this guy I would trust this guy Okay cool I'm not asking you to give him your car keys I just want you to work with him for a lab So team seventeen And these guys So these guys are group eighteen Eighteen Eighteen team eighteen Remember your numbers it's important all right You guys wanna work together Huh You don't want to work together You take this that no All right Team nineteen All right Twenty You guys wanna You're gonna find a third For this guy right here We need a third For twenty one Team twenty one Everybody wants to be twenty one again Come on Or if you're twenty one now Then You'll really want to be twenty one soon All right we need one more For twenty one right here There we go I appreciate that Brave new world All right You guys are gonna work together You have four That's perfect twenty You are our team twenty two Twenty two is your number Are you guys working together All in group Or three All right we need You need one more to make two teams then If that's what you're going for Okay Cool What was your number Twenty three Thank you Team twenty three All right You guys want to work together You guys all All right twenty four Team twenty four Remember your numbers Let's do this right Twenty five Twenty five You're gonna have to lift and shift If you want to work with those guys All right You're gonna work together You guys Twenty six Team twenty six Twenty seven Twenty eight You three Twenty eight Team twenty nine Now I'm gonna be tough And I got a lot of people in this room Actually I did pretty good But did be enough You gotta give me credit for that at least You gotta give me at least credit There's a room full of people Man she's tough What's your name Will you work Okay we'll figure that out later All right So I only have one last cloud for team thirty I'm gonna let you guys arm wrestle it 
out. You think you could take them? You think? All right. Can I offer these guys team thirty? You're going to judge me for it. All right. I won't include you unless you want to be included with them. Okay.

So, everyone else here: you will get dedicated resources on my cloud account for a week to do this lab on your own. When it's over, come talk to me; I've got you. All right, remember your numbers.

If you go to this link up here, you'll see there's a folder called lab assignments. Each file in it is named slide one, two, three, four, five, six, seven, eight, nine, ten, and so on. You can see where I'm going with this: if you're team one, you look at slide one; if you're team ten, you look at slide ten; if you're team thirty, you look at slide thirty. That's how you're going to get your information. Let me know you're getting there, and that you're able to pull it up. Are you guys getting connected? Everybody getting connected? You get the concept of where I'm going with this. If you use someone else's information, you will be fighting with them for the lab, so I wouldn't encourage you to do that. Make sure you use the slide that corresponds to your team number. Sorry, guys, I appreciate that.

Raise your hand if you're having problems getting to the environment. Okay, your hand went up immediately; I like that. You're connected? Yeah. No, it should be. It's been pointing up. Uh... try... Yeah, you know what, do you have like a... Nope. WordPad, yes, please; open it with WordPad. Can you guys get to it?
It's very hard for you to read it, but yes, you should be able to SSH to that using an SSH client. Try just opening it in WordPad. Yeah, much better. So you should be able to SSH to that IP address, dash L... So you use the IP address that you... okay. Try to bring something up in the browser. Well, I guess with PuTTY you have to use PuTTY. I mean, there's Ubuntu too.

What I want you to do is, once you get connected to your environment, start with the instructions in step one, which is in the root of that GitHub repository. Step one basically just has you log into your cloud, click around, and execute a few commands. You're not doing anything too fancy in step one; I want you to get familiar and get connected to the cloud first.

I'm not sure why you're not getting out there. So try to bring up another URL. Try to SSH to that IP address using student01 and that password. Yeah, so those are the Horizon credentials; those credentials work, but to SSH you need to use the other credentials. They're not the same. I didn't want to leave that wide open with that admin account and those passwords; it just felt wrong. Beautiful.

All right, I've seen one person connect, so I know it works. So good night, thank you everybody for coming. No, I can't leave yet. Try to bring up that browser. Are you guys better now? Awesome. It makes me so happy. The URL shortener, yeah. Pop open a new tab and just go to... Raise your hands if you're having complications. All right, look at that; I love it when they go up fast like that. I'm going to squeeze by you guys. How are you doing? Are you getting a 404 connecting to the...? Okay, what's going on? Admin... I'm able to do it. Okay, so expand that window a little bit more so you can see the rest of it. It's telling you there that the Horizon credentials are admin and that password, but to log into the server you need to use student01.

Sorry, let me put some clarification out there: there are two sets of credentials to use. How'd you get to the StackStorm server?
No, no, no. The OpenStack server; that's the one you want to connect to. Okay, perfect. Yeah, you can't get to the other one, sorry. You'll need that IP for something else, not to log into.

So there are two sets of credentials on that slide. There's student01, which logs you into your OpenStack server, and there's another credential, admin, which gives you Horizon access. Two different sets of credentials, and you've got to use them for their different purposes. If you're trying to get to Horizon, you use admin and its password. If you're trying to get in over SSH, you need to use student01 and the password presented there on the handout. Yes, the Nagios credentials will be in the next step, so you'll get to that.

Right now I just want you to follow step one. Connect to your OpenStack cloud. Connect to Horizon, click around, see if things are working. Connect over SSH, execute a few commands, see if it's working. I'm just trying to build you up to being familiar with the environment. I've learned that if you start the lab off by jumping right into it, it's usually not successful, so I want to get you built up first. Nope, nope, you can log in. Log in, do what you want; just don't break your cloud yet. You can break it later, when I'm done.

Okay, so once you've mastered step one, feel free to move on to step two. Step two is where, as Chris explained, every last one of you is running a cloud that's broken. I did that on purpose; it wasn't by mistake. Everybody's cloud here has a service that is broken, and not broken in the sense of "I messed up the config file and you've got to figure it out on your own," but broken in the sense that a process or service may or may not be running. So just go on my hint there. The top of step two points you to a Nagios dashboard. If you're not familiar with Nagios, it's a monitoring platform that happens to be open source too, which I think is the greatest thing
Ever created Nagios is actually monitoring Each and every one of you guys clouds right now So if you log into the monitoring dashboard Of Nagios You might actually get a little tip As to what service May not be working So follow the steps of step two You got step one master And give it a go Alright Now step two The whole process of step two Is that you identify a service That's broken And try to fix it And again Keep it simple stupid right Keep it simple If the service is not running Or an API is not available What's the first thing you go to look at Is a service running right So go check that out Let me know if I'm talking too much Maybe I did it wrong Is this the Nagios box You guys are still having a tough time here You okay? The second box that was listed on that slide Maybe it is I have a sign that's I know we're not 36 It was whatever this one is Yeah Nagios server Okay Yes it is So it's Capital O Lowercase P Three Alright You want me to give it a go Yeah you trying to get to the Nagios server No No you're going to the wrong server You need to go open stack the first one I guarantee you'll be better than So it has Raise your hand if you've gotten to the Nagios dashboard You might have to check Oh no you're just raising your hand to ask a question Okay So you might just catch it from Nagios Well actually you were just in the stack from one I don't know how you got in there But you did great You hacked into it Yeah so with this guy do slash Make sure there goes up Nagios three That actually he might have Yeah You want to make sure Your guys is going to be Your credentials are in step two Oh so these are each different I don't think that's actually Each one Yeah go look at step two and instructions So everybody's got a different alert I don't know if you'd be able to Or not everyone has a different alert But I think you cancel So go back Go back Go back one more Now step two You guys got it And it'll actually give you the credentials there Those are the 
credentials for Nagios Alright so raise your hand if you got into the Nagios dashboard Alright What did you find there? No What's not running What was not running for you From the Nagios dashboard Well this is going to be It has an Rc and you guys can Jump from there to the other Jump out of the utility containers And nothing's wrong See nobody's following the instructions Why are you guys Yes there are This whole set of instructions Watch I'm going to show them to you Okay there's a bunch of windows Popping up like that It's called Linux Look at this Step two There's actually a set of instructions in there No you're not You should have got access to Open that cloud by now already Alright so who got to Nagios What did they find Huh Okay interesting You got a problem with me choosing the same Username and password for everybody Yes whatever your student Yes whatever your team number is You go to that number in Nagios You want to do the lab? Alright Was there another question I'm sorry You're good Excellent alright Hopefully you guys are fixing those services Oh my goodness I'm so excited I see this guy he has like The command line up he's typing stuff I saw him start a service already I like it I like it Whoa you got a lot of red What'd you do? 
Oh yeah, you highlighted all the failed ones. I like that; you just went right at it.

All right, I want you to work up to the end of step two. If you haven't found them yet, there is a set of instructions at the root of that GitHub repository. They say step one, two, three, four; you should be doing step two. Yeah, you just put that in the browser, and then that's the username and password. You don't have to connect to the CLI.

No, so what I'm asking you to do is... what did I ask you to do? Yeah, what I wanted you to do, and I probably wasn't very clear, is connect to the OpenStack server and execute these commands on the OpenStack server, not on the Nagios server. This was a hint, just to give you an idea of how to find your failed service. But you know what, I can definitely do that a little better next time. I'm sorry. You need to be connected to your OpenStack server to troubleshoot your OpenStack services. To be able to fix OpenStack one day... but we haven't gotten there yet.

Is everybody almost done and getting through step two? No, that's it. Yes, sir. Yes. Yes, maybe. Who knows. Yes. So what happened is... did you execute this command? Execute that LXC command for me. Okay, you're in a utility container. You have to type exit to get out of that container. Yes, I want you to type... If you make that screen a little bit wider... there you go. Now you have to SSH to the container that relates to the service that failed. Right, there you go.

One thing, if you're not familiar with OpenStack-Ansible: these clouds are running OpenStack-Ansible, which is another way of deploying OpenStack, using Ansible. The services are deployed in containers. And no, not Docker containers; that's the copy of the original containers. They're running in LXC containers. So to get to each of the individual services, you have to connect to the individual containers that are running those services. Okay? Oh, look at
the container names. Well, we've got Glance containers. Oh, check it out. Connect to it and see what happens. Yes, I see the Glance registry. Yeah, but all right, how many services does Glance run? Take a guess. Registry and API. Yes, yes. You only see the registry running on that server, right? With Glance API... come on now. I'm going to start running OpenStack tests shortly. Ma'am, have you taken the certification? Because if you raise your hand to say you took the certification, I'm going to lean on you. No? We're going to get you there, don't worry.

Oh yes. Well, I just did it. Yes. Okay, yes. So what I want you to do... are you on the OpenStack server? Yeah, first you've got to connect to the OpenStack server, then you execute the LXC command. It'll give you a list of containers, and then you connect to the container. Yes, you have to SSH to it. When you execute the command and pull up the containers, you're going to find that containers have more than one IP address associated with them. I don't want you to use the 10-dot address; I want you to use the 172-dot address. That's why I gave you that as a placeholder, because each container has two addresses, if not more. You could see it that way, yes.

No, so the containers are running on the bare-metal OS. And if you don't know this, Rackspace helped create OpenStack-Ansible. The reason we run the services inside containers is so that you can do in-place upgrades, and it also gives you the flexibility to grow out your services' footprint. For example, if you have a control plane and you see that your Nova API is being hammered all the time, you can spin up more containers on another machine, or even on that same machine, and run more instances of your Nova API. And when you do your upgrades, I can drop in a whole new set of containers, make the database update, turn off your old containers, and now
you're running a new version of OpenStack. That's the reason for that approach.

What's the difference? Well, compute is different from your control plane, which runs all your core services. Compute nodes are no big deal; the control plane is the big deal. You should. Yes.

All right, is everybody almost done with step two? You logged into the server, then you went into a container. You've got to exit out of it. If you're in a utility container right now, you've got to exit out of there; just type exit. Now you can actually pull up the list of containers. That's fine.

So the container we logged into was running all the OpenStack services, like all the controller services, but when we listed the hypervisor list, it actually gave the name of the host where the container was running?

Yes, it's an all-in-one, that's why. You're running all the services, including your Nova compute, in one instance. The hypervisor list is only going to show itself because it's all-in-one; it's not a distributed model. Right. Yes, you can look at it that way. Nova compute, which is effectively the hypervisor, is not running in a container; it's running on the base OS. But all the other services are running in containers, so Nova, Neutron, Glance, all on the same server. You can do it, but you would never do it for production; it's meant for test environments only. You would never do it in real life. Yeah, correct. Yes, correct, it's not TripleO. No, all of those containers are running on the bare metal. Pretty much, but better.

You should be done. Has everybody fixed their service? Raise your hand if you fixed your service. All right. No, I'm going to give you a few more minutes, and then we're going to progress. Yes, no worries. Who raised their hand? Was it you? Yes, what's going on?
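Going back to the container step: each container in the listing carries more than one IP address, and the instruction was to use the 172-dot address rather than the 10-dot one. A minimal sketch of that selection follows. The container names and addresses are made-up placeholders, not values from the lab, and the helper name `pick_container_address` is hypothetical.

```python
import ipaddress


def pick_container_address(addresses):
    """Return the first IPv4 address whose first octet is 172, or None."""
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        # packed is the raw bytes of the address; byte 0 is the first octet.
        if ip.version == 4 and ip.packed[0] == 172:
            return addr
    return None


# Hypothetical listing: real names and addresses come from your own cloud.
containers = {
    "example_glance_container": ["10.0.3.15", "172.29.236.20"],
    "example_heat_container": ["10.0.3.42", "172.29.238.7"],
}

for name, addrs in containers.items():
    print(name, pick_container_address(addrs))
```

The point of the sketch is only the filtering rule; in the lab you read the addresses off the container list by eye and SSH to the 172-dot one.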
I can't see the process running, but when you do service status, I see... You've got to deal with tricky Ubuntu. Man, you've got to figure out how to make it work. So I'll give you a hint: "service", space, the service name, space, "status". That will tell you whether or not the service is running. It doesn't look like it's running now, does it? So you can figure it out. Yeah, there you go.

Yes. Well, okay, there is a small lag on Nagios. Nagios checks every five minutes. I could have turned that up a little, but I didn't want to kill that server. So even if you recover a service and it doesn't turn green right away, you've got to give it a few minutes. You can look at the next scheduled time for that check, and it'll tell you when the next check is going to run. All right, just give it a few minutes; like I said, it runs a poll every five minutes. No, it's not 90 seconds. It updates the screen every 90 seconds, but it doesn't run a poll every 90 seconds.

All right, I'm going to see how you guys did. How about that, I can check for myself. I'm expecting that when I go to Nagios here, I'm going to see all greens. All right, who's team two? I'm going to start calling people out. All right, I like that. Team three, you guys okay? You good? You found your service? Well, it'll be fixed even if you keep going, because StackStorm is going to take care of it for you, believe it or not. Teams seven and eight, you guys okay? Well, I see there are still some folks trailing. Oh, I've got a lot of greens here.

So if you're still working on step two, it's okay. I'm going to move forward, and we can always circle back. I'm not going to walk out as soon as I'm done; I'm going to give you time if you need to ask questions or we need to do something else. If I could just get your attention for a few minutes here, I want to share some information with you about StackStorm. Has anybody here ever heard of StackStorm before?
Okay, that's good, so you get to learn something today. StackStorm is an open source project, and its focus is on integration and automation. I want to make sure I say it right: it's a platform for integration and automation across services and tools. It ties together your existing infrastructure and application environment so that you can more easily automate that environment, with a particular focus on taking actions in response to events.

The reason this tool is really helpful and cool is that services are going to fail, and there are probably always going to be people around to restart them. But what if a service fails at three o'clock in the morning? You get a Nagios alert, that's great. The NOC calls you and tells you it failed, that's great too. Now you've got to climb out of bed, find your RSA key, VPN in, connect to the cloud, and hopefully recover the service. Maybe an hour later you get back to bed. Maybe. But with something like StackStorm, the NOC never gets a chance to call you, because StackStorm will restart that failed service for you.

The thing I love about StackStorm is that it's customizable. You don't just accept it as it is and go; you can customize rules in StackStorm to do specific things for your cloud, as well as for anything else in your application stack, because it's not just for cloud. Some of the common things people use it for are facilitated troubleshooting, automated remediation, and continuous deployment. I've never tried that last one, but it'd be pretty interesting to use a tool like this for it.

So this is how it works. This diagram carries a lot of information, so don't get wrapped around the handlebars with it, but it shows the different components and how you can have different integrations, such as OpenStack, Nagios, and obviously Amazon Web Services. But we're not going to talk about them, because
this is an OpenStack summit, not an Amazon summit. You can have different tools that integrate into StackStorm. An input comes from any of those tools, goes into the StackStorm world, which does what it does (it's too complicated to describe right now), and then there's an output. The key factor is that a tool can provide an input.

So let's say, for example, a service fails, and a notification that a service failed goes out from Nagios. That's the example we have here: you found your cloud, and Nagios sent an alert. But guess what? You can configure Nagios with an event handler to send that alert to StackStorm. StackStorm will evaluate the alert and its message and ask: do I have a rule or trigger, based on the message I just received, that tells me to do something? If it finds a trigger that matches, it then asks: do I have a rule that matches this failure? If the answer is no, it does nothing. If it's yes, it does something: it will actually create a trigger, which goes off and runs an action that goes back and does something to the integration it received the message from.

If you're following where I'm going with this, StackStorm has an integration pack for Nagios, and Nagios can send global event handler messages to almost anything. So for this lab, I configured Nagios, the same Nagios you were logging into, with an event handler: whenever any event happens, send a message to a StackStorm server. That StackStorm server then listens continuously through its APIs for that message.

The requirements on the Nagios side were simple. I had to add and enable an event handler in Nagios, which took literally a minute. You have to enable access to the StackStorm APIs, so that your Nagios server can talk to them, which takes another two seconds. And then
create a custom Nagios command to be able to send across those custom parameters. And what I'm showing you down here is the miscellaneous config file for Nagios. It's really simple, as you can see. You name the command, and then you give it a URL... not a URL, but you give it a command that it will execute. So this is the StackStorm Python file that comes with the StackStorm integration for Nagios, and you just basically pass it a whole bunch of Nagios variables. Nothing major there, they're all pretty straightforward. So if you configure Nagios to be able to send this alert, every time something happens it executes this StackStorm command, and that in turn sends a message to StackStorm.
So we're going to start step 3 now, right? So get back in there and go to the GitHub repository. We're going to go to the file named step 3, and I want you guys to start working through that. What step 3 is going to do is introduce you to StackStorm. It's going to show you how to create a rule in StackStorm, and we're also going to see it actually work in the cloud. So start step 3, and raise your hand if you have any questions. Okay. Oh, I see a hand up already. We're coming. Sure. What's going on?
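To make the flow described above a little more concrete, here is a minimal Python sketch of how an incoming event could be checked against a rule and turned into an action. It is an illustration only, not StackStorm's code or its rule schema: the `Rule` class, the event fields (`host`, `service`, `state_type`), the rule name, and the action string are all hypothetical, chosen to mirror the talk's description that an event arrives, every condition has to match, and only then does an action fire.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    """Hypothetical rule: a unique name, regex criteria, and an action template."""
    name: str
    criteria: dict  # event field name -> regular expression it must match
    action: str     # template filled in from the event fields


def evaluate(rule: Rule, event: dict) -> Optional[str]:
    """Return the action to run if every criterion matches, otherwise None."""
    for field_name, pattern in rule.criteria.items():
        value = str(event.get(field_name, ""))
        if re.search(pattern, value) is None:
            return None  # one failed condition means the rule does nothing
    return rule.action.format(**event)


# Assumed example values; the address is from a documentation-only range.
rule = Rule(
    name="team07_restart_heat",
    criteria={"service": r"^heat", "state_type": r"^HARD$"},
    action="restart {service} on {host}",
)
event = {"host": "203.0.113.10", "service": "heat-api", "state_type": "HARD"}

print(evaluate(rule, event))
```

If the same event arrives with a `state_type` that does not match, `evaluate` returns `None`, which corresponds to the "if it's no, it does nothing" branch.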
Okay, so it was never... it was stopped. So here, actually, let me turn this.
You know what, honestly, I don't think it has to be just Nagios. The reality is, as long as you can configure your tool to send out an alert to StackStorm, and you have some control to format how the message appears, then you can integrate it. That's really the prerequisite. So it has a list of handlers that it knows and has integrated with. You can also write custom ones. And then on the other side, it has a list of actions that you can take, and it's the actions that are going to help you recover these services. Because there are different actions, whether it's local or you actually have to go out to restart a service like this. We had to do something a little bit different, because you're running services in containers, so I couldn't just tell it to restart a service. Instead, what we're doing in this example is we're going to use Ansible. Since this OpenStack is built with Ansible, you can actually execute a remote Ansible command to do things in your cloud. So when you open up the rule and look at it, you'll see the command being executed there.
They're running two... yeah, no, they're running two different machines. Yes, absolutely. StackStorm, everybody loves it. Yes, that's fine.
Did you figure out what service failed? No, I didn't. You need to go to Nagios. You want StackStorm; you've got to close that browser. Okay, so go back to your command window. Right. So, okay, your heat container, is that guy? What's the IP address? 172. So you need to SSH to that IP. It's actually root. But then that may not work for you. Yeah, just do root. Ubuntu is all about root, it's not admin. You wrote 238. Sorry, I'm one of those guys: attention to detail. Right, so now in here, if you issue a service ... space status, right, you see it's not running, right? So, if you do... no, I think you figured it out. What's the... yeah, it doesn't matter. Right. So, did it run? Yeah, okay. How did you know the service name was sdftti?
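For readers following along later, the "remote Ansible command" mentioned above is an ad hoc invocation of the `ansible` command-line tool. The Python sketch below only builds and prints such a command line; it does not run it. The host pattern `heat_api_container` and the service name `heat-api` are assumptions for illustration, and the actual command in the workshop's rule file may look different.

```python
import shlex


def build_restart_command(host_pattern: str, service_name: str) -> list:
    """Build an ad hoc Ansible command that restarts one service.

    Uses the generic form: ansible <pattern> -m <module> -a <module args>.
    """
    return [
        "ansible",
        host_pattern,
        "-m", "service",
        "-a", f"name={service_name} state=restarted",
    ]


# Hypothetical inventory pattern and service name.
cmd = build_restart_command("heat_api_container", "heat-api")
print(shlex.join(cmd))
```

Building the command as a list rather than one string keeps the module arguments together as a single argument if the list is later handed to something like `subprocess.run`.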
Well, it's actually there, but I just happened to memorize it. So, one thing you realize over time is you start to memorize the names of the services. It's just one of those things. So it's kind of weird, because there is no "view all services"; there's no real command to do that successfully. You can do ps -ef and then pipe it and do a grep for that service. That's how you can find the service name. But there really is no "show me everything running".
See, that's the thing about being an OpenStack operator. Yeah, you need to know. You really do need to get familiar with what services are running in your cloud, because there is no OpenStack command to tell you whether or not a service is running. Let's say, for example, Keystone had failed. If Keystone is down, you can't get to any of the other OpenStack services through the API or the CLI, because Keystone is down. So this is what I tell everybody: you have to embrace the command line, and you have to have solid knowledge of where your services are running. You really have to have that groundwork.
I know I just said a lot, but what I'm basically saying is that you have to become a Linux admin at that point, not an OpenStack admin. And again, in this case, you had to be in a certain container for that command to work, number one. And number two, if it was Keystone, that command wouldn't do anything. So I don't want you to get comfortable relying on OpenStack CLI commands. You should be comfortable as a Linux admin, knowing how to trace down your services.
How we doing? We creating those rules?
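As a small illustration of the `ps -ef` plus `grep` approach just described, the Python sketch below filters process listings for a keyword. The sample listing is made up for the example (it is not output from the workshop environment), and `running_processes` assumes a Linux host where `ps -ef` is available.

```python
import subprocess


def find_processes(ps_output: str, keyword: str) -> list:
    """Return the `ps -ef` lines that mention keyword, ignoring case."""
    needle = keyword.lower()
    lines = ps_output.splitlines()
    # The first line of `ps -ef` output is the column header, so skip it.
    return [line for line in lines[1:] if needle in line.lower()]


def running_processes(keyword: str) -> list:
    """Run `ps -ef` on the local host and filter it, like `ps -ef | grep`."""
    result = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    )
    return find_processes(result.stdout, keyword)


# Made-up sample listing, used so the example does not depend on the host.
sample = """UID   PID  PPID  C STIME TTY   TIME CMD
root    1     0  0 09:00 ?   00:00:01 /sbin/init
heat  412     1  0 09:01 ?   00:00:07 /usr/bin/python /usr/bin/heat-api
heat  431     1  0 09:01 ?   00:00:05 /usr/bin/python /usr/bin/heat-engine
"""

for line in find_processes(sample, "heat"):
    print(line)
```

On a real host, `running_processes("heat")` would do the same filtering against live output, which is one way to recover a service name you have not memorized.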
I want to see my StackStorm rules populating. Okay, so pay close attention. You must name your rule based off of your team number. Please don't name it someone else's team number. And I need you to put in the IP address of the OpenStack server that your team is part of, not somebody else's. It's very important.
Yes. I showed the one command. The only thing you have to add to Nagios is that command in your miscellaneous config file, and then you tell it... it's called an event handler. Yeah, if you Google "Nagios event handler", it'll tell you exactly what you need to do.
Which one? The file itself can have the same name. It just doesn't matter. The name of the file doesn't matter. It's more about the name you give the rule in the file. The rule name in the file, for these three files, should it be the same? No. Well, you can make it whatever you want, but it has to be changed. It has to be changed to something. Yeah.
You know what? It's been around for a little bit, not too long. And it's not OpenStack related. But to me, it was like a third party tool that could solve a lot of problems. So yeah, it's not. But hold on one second.
So you need to put in the IP address of the OpenStack server. No, they're not. So if you're having a problem executing that command, I found this personally to be a problem: if you copy and paste out of the instructions, the little single ticks are weird. No matter how I did it, and no matter how I tried to show it, when you copy and paste, those ticks become weird. You need to go back in once you paste that line and change the single quotes to be real single quotes. I don't know why it keeps doing that, but it did it for me too. So the single quotes get weird. You've got to fix those up in your commands.
That's because someone gave it the same name. That's why. Yeah, you've got to pick a different name. So if you open up the file, just change it, as long as the name of the rule is different. That right there. I don't know who stole yours, but someone did. So if you find that someone
else created a rule with your student number or name, or the exact one that you did, just pick another name for the rule. It doesn't matter what the rule is named, but it needs to have a... Apparently they are, yes, I'm sorry. Yes, please.
So what I'm trying to do is demonstrate how it works. If you pull down this repository, it'll show you some examples of how to create StackStorm rules. I basically want you to look at the rule, and then I need you to make a change. I need you to set the name of the rule to something custom, because everyone is connected to the same StackStorm server, so it has to be unique. And then I need you to go down here and put the IP address of your OpenStack server in those quotes. And just be conscious that you're going to have to go in and fix it, because when you copy and paste this, it's going to give you that weird single quote. You've got to blame GitHub for that one. I don't know what GitHub is doing there.
You'll see when you go to look at the rules exactly how it's doing it. It uses regular expressions. StackStorm uses regular expressions, so you can put in a regular expression, and you can give it multiple conditions. Once it meets all those conditions, that's when it will do the action.
Well, no, it's not. It's a separate instance. And StackStorm, funny fact, StackStorm uses an OpenStack service called Mistral. I always say it wrong. It's supposed to be like event processing. StackStorm actually uses an OpenStack service to do it behind the scenes. Totally separate thing, but it uses the same code. No, it's already running on the StackStorm server. Yes, StackStorm uses that same code from OpenStack. It's not part of your OpenStack cloud, but it uses the same code. Yes: M-I-S-T-R-A-L. So, yes, there you go. That's it.
Yes. You could, but the problem is it may not be as flexible. So you can create an Ansible script and you can try to feed in variables, but it may not be as flexible. It may not, who knows. StackStorm has other capabilities. I'm not
going to explain those, but it does have other capabilities. It has integration with many different things. And to me, it's meant to do that. It's meant to listen for events and do something. Ansible is not created to do that. Ansible is created to run a command based on the signal. Yes. Well, yes, I'm actually calling Ansible to do something on that OpenStack server. It's actually an ad hoc command that we're calling. Yes, absolutely. You know Ansible really well. Make sure you come get my book. I'm giving away OpenStack Administration with Ansible. We're giving them away at the Cantina, across the street, at Rackspace. Rackspace, yes.
Whenever you find a duplicate problem, just go back into that rule and give it another name, whether it be Yabba Dabba Do or I'm Number One. Somebody probably used the same name to create a rule, so don't get hung up on that. The biggest thing is making sure your rule has the IP address of your OpenStack server. That's the biggest key. The name of it doesn't matter. Hey, absolutely. Absolutely. There you go. The biggest key is that the name of your rule does not matter. What matters is that you don't change anything else in there, and you make sure the IP address is pointing to your OpenStack server. Those are the only things that matter.
Yes. Yes. Yes, please. Not the StackStorm server. You wouldn't make a StackStorm rule point to itself. We want to make sure the rule is going to point to your OpenStack server, so that it can connect to it and fix your services for you.
.yml, .yaml. That's a simple markup language. It's the same language you use to write Ansible, and it's used to write many other things. Yes, sir. Non... Yes, I love the way swimsuits in the winter. Yeah, I realized that. Well, yeah. Thank you for that. Yes.
Yeah, everyone is connecting. Please, if you're connected to the StackStorm server as student01, unless you are team 1, please log out and log back in as your team number. Student, whatever team number you are. Please. Thank you, is that it?
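One way to deal with the copy-and-paste quote problem raised earlier in this step is to normalize the pasted text before running it. The Python sketch below is a small helper written for this transcript, not part of the lab materials: it swaps typographic quotes for plain ASCII ones. The pasted command in the example is a made-up placeholder.

```python
# Map typographic ("curly") quotes to their plain ASCII equivalents.
QUOTE_MAP = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def straighten_quotes(command: str) -> str:
    """Return the command with curly quotes replaced by straight quotes."""
    return command.translate(QUOTE_MAP)


# Placeholder command with the kind of quotes a rich-text copy can produce.
pasted = "echo \u2018hello from team 7\u2019"
print(straighten_quotes(pasted))
```

The same substitution can of course be done by hand in the terminal, which is what the room was asked to do.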
No, the thing is that it says sudo su. Not if they git clone before. That's about the point. Thank you. It's probably too late now. No matter how many times you step through a lab, there's always going to be something. There's always going to be something. But I appreciate that. You probably could have run that. No, thank you for that. I will fix that for the future. I'm assuming you know how to get around that, since you are a really smart man. So they're just editing the file. Everybody is. Yeah, overwriting each other. I think they get the gist of it, though. Hopefully somebody is successful.
Yeah, I found a... it's a small bug. Okay. The sudo su was putting everybody into root, and everybody is git cloning, and everybody is overwriting each other's file. And then everybody else logs on with the same student ID too. Yeah, everybody is in student 1. Hopefully folks are figuring it out. If you're having any trouble, I probably know what you're encountering. I apologize in advance.
Yes. Think about it this way. If you go to start a service and it keeps failing, that means there are some other residual services, depending on whether it's MySQL or whatever it is. If you use the Nova scheduler, maybe nova-compute is not running. I didn't want to do that. I didn't want to go there, because I think the message might get lost. But basically what you have to do is create more complex rules in StackStorm. I have a few I can send to you.
No. It doesn't restart the server, it restarts the service. Believe it or not, it is a lot of service failures. It's not kind of striking, unless somebody messes with your config; then that's a whole other story.
Nagios detects your failure; StackStorm fixes your failure. You can't, that's why you need Nagios. Nagios detects the failure, StackStorm fixes the failure. That's what Nagios does. It checks at whatever interval you set it for. It will go and check. It will go and poll it over and over again. Does it provide a standard API for it, or something?
Think about it this way. If you have all those servers on the same network, or on networks that can communicate with each other, that's all that's required. It doesn't have to be public facing. Inside your data center you can set up multiple networks. As long as those networks can talk to each other, Nagios can talk to your OpenStack cloud and check it. Once it detects a problem, it can then talk to StackStorm. Any external APIs or anything? It's all internal networking.
So what I use for Nagios is SNMP. I understand what you mean. The standard I use for Nagios: I use SNMP. You never heard of SNMP before? SNMP. Through SNMP I can ping a server, pass it an SNMP community string, and tell it to do something. That's fine. SNMP. I use SNMP.
I know you use SNMP to test whether the server is healthy. It is not. It generally even triggers the StackStorm to take some action. From there you have a policy file, or you change it and tell the StackStorm... Exactly. So possibly to reach these stories to another server? Yes. So, you're cool?
What standard does the policy file follow? So Nagios has a few alert levels. It goes warning, then critical. Critical soft, and it does that twice before it goes critical hard. Once it goes critical hard, then it sends a notification, generally. And that's when it sends a message to StackStorm to say, I've now gone critical hard, and StackStorm can then take action. But it's not until it goes critical hard that it takes action. Yes, based on the rule. It's YAML. YAML. That's the markup language. That's the standard format.
You guys making it okay? So anybody make it to step 4 yet? Anybody move on to step 4? Anybody see their servers recover on its own?
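The soft and hard alert behavior described above can be modeled in a few lines. The Python sketch below is a simplified illustration, not Nagios's implementation: it assumes a single service, a maximum of three check attempts (so two soft failures come before the hard one, as in the talk), and it ignores details such as separate warning handling and retry intervals. The function names are made up for this example.

```python
def classify_checks(results, max_attempts=3):
    """Label each check result as SOFT or HARD in a simplified model.

    A failing result stays SOFT until it has been seen max_attempts
    times in a row, at which point it becomes HARD.
    """
    labels = []
    failures = 0
    for status in results:
        if status == "OK":
            failures = 0
            labels.append((status, "HARD"))
        else:
            failures += 1
            state_type = "HARD" if failures >= max_attempts else "SOFT"
            labels.append((status, state_type))
    return labels


def first_hard_failures(results, max_attempts=3):
    """Return the indexes where a failure first turns HARD.

    In the flow from the talk, this is the point where the message
    would go out and an automated action could be taken.
    """
    hits = []
    previously_hard_failure = False
    for index, (status, state_type) in enumerate(
        classify_checks(results, max_attempts)
    ):
        hard_failure = status != "OK" and state_type == "HARD"
        if hard_failure and not previously_hard_failure:
            hits.append(index)
        previously_hard_failure = hard_failure
    return hits


checks = ["OK", "CRITICAL", "CRITICAL", "CRITICAL", "CRITICAL", "OK"]
print(classify_checks(checks))
print(first_hard_failures(checks))
```

Waiting for the hard state before acting is a trade-off: it avoids restarting a service over a single missed check, at the cost of a delay of a few polling intervals.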
Alright. At least I know it works for somebody. While I have you guys, before I forget: there are free book signings today at the Rackspace Cantina. If you don't know where the Cantina is, or what the Cantina is, you'll find out in the next slide. We're giving away four different books: the OpenStack Cookbook, the Neutron networking book, and the new book on troubleshooting OpenStack. Yeah, you actually do. Sorry, you'll get to leave in a few minutes. The new book, brand new book on troubleshooting OpenStack, just came out, so you can't tell me you have it already. And then, last but not least, they'll be giving out my book, which is OpenStack Administration with Ansible.
Quite honestly, if you are done and you feel good about yourself, you can proceed over to the Cantina, which is on 2nd and Trinity. We have drinks and Wi-Fi and couches and free books. Free books. So head on over to the Cantina. Free books; everybody loves a free book. It is at 2nd and Trinity. So literally, if you walk back down 2nd, where you came from, it's right across the street. It's a restaurant across the street from the convention center. If you walk back down 2nd, it's on the right hand side, right before you walk into the doors of the convention center. 2nd floor? No, we have the whole restaurant, so you won't miss us. Huge sign for Rackspace out front.
Please. Just started working on Ansible? Yeah. My book signing starts at 4:30, and you can talk to me as much as you like. I'll be over there. I'm from Juniper. Excellent. I think I was supposed to do something with you guys, somebody, once upon a time. Anyway. Okay. No, that's fair.
Before you leave, did you have a good time? Can I have it one more time? You have a good time? Thank you, guys. There's another workshop on Wednesday if you want to heckle me. Please look it up. Yes.