Okay, I'm here to talk about Sensu. Sensu is a monitoring solution in a similar vein to Nagios, if you know that tool. And I'd like to suggest today that it's a viable option for monitoring your OpenStack clouds; it's actually what we're using.

So, the state of OpenStack monitoring. Right now it's a grab bag. There really isn't any one monitoring solution in OpenStack itself, so from an implementer's standpoint you really have to roll your own. Right now Nagios and its forks seem to be the dominant players in filling out that monitoring story. And there's also Ceilometer. From what I can tell, that was designed to do measurements: it'll measure bandwidth, it'll measure system information. Traditionally there hasn't been a monitoring piece in it, but it seems like one could fit in there. Actually, from what I've read, with Havana they are bringing in some monitoring: there's a tool called Synaps, which works like AWS CloudWatch, that's being brought into Ceilometer. I haven't gotten to look at the details yet, but it's interesting, and I think we should look out for it. Until then, we kind of have to roll our own.

Some things about Ceilometer, though: right now it seems to require MongoDB, and it's really bulky. The thing is designed to capture a lot of metrics. I pulled this from a Mirantis blog post: they say an average production system is going to generate 240 GB of data per month. That's a lot of data, I think. So even if you were to go with Ceilometer, you might want something a little leaner as well; in this case, Sensu or Nagios.

So what is Sensu? It's a monitoring framework, a lot like Nagios. But in addition to that, it's a metrics bus, so it gives you a lot more functionality in terms of querying your infrastructure and sending that information pretty much wherever you want.
I'll talk more about metrics later, and specifically how Sensu can help with OpenStack, but first the backstory. I worked at Sonian until about a year and a half ago, and that's where Sensu was written, by a guy there named Sean Porter. At the time we were using Nagios. Sonian is a cloud company; they do email archiving. With their solution they have, in a sense, a stack per customer, so we had 20 different stacks distributed on AWS. For each of those stacks we had a Nagios server that, of course, gave us alerts when stuff fell over. And it was really challenging. When I came onto the DevOps team, the primary sore spot was monitoring.

One of the big problems was that we had elastic infrastructure, so whenever we added an instance we had to restart Nagios. And not only did we have to restart Nagios in that stack; we had a hub-and-spoke model for all our Nagios monitoring, so we also had to restart the centralized Nagios server. That was a pain in the butt. We saw lots of false positives, and the staff was pretty bent out of shape about it. No one likes to get a page late at night, especially if it's wrong, and we were seeing a lot of that with Nagios. We were also seeing strange scheduling issues. Now, I think we could have tackled the Nagios issues we faced, because a lot of people use it with great success. But we just couldn't get over the fact that we didn't like Nagios, so we were emotionally blocked and weren't making any progress on it. That's why we decided to write our own tool (well, mostly Sean did), and that's where Sensu comes in. With our product we relied heavily on RabbitMQ; we loved Rabbit there. It was the tool that allowed us to be loosely coupled, and it did great. So when Sean was thinking about a monitoring solution, Rabbit came immediately to mind.
Some of the goals with Sensu: we really wanted automatic client registration; we didn't want to have to restart our service when we added infrastructure. We wanted it to be configuration-management aware. The way Sensu is written, it makes certain choices that make it easy to manage. For example, it has a conf.d directory for all its checks and the other objects you need to work with, so it's really easy to write out a separate file per configuration aspect; it's not one monolithic config file. We wanted something hackable; it was hard for us to get into Nagios and know what was going on. We wanted something very simple, where you could see what's happening right away. And we wanted something with a really good API. Sensu accomplishes all of that.

The Sensu stack: I talked about Rabbit, so that's the message bus. It uses Ruby, specifically EventMachine, so it never blocks on I/O. And it uses Redis for the persistence layer. It follows Unix principles in its construction, it's very modular, and it's MIT licensed.

At a high level it's a lot like Nagios, but it actually has an agent, whereas Nagios typically doesn't. So you have an agent running on all of your clients, you have a server, and then you have the basic alerting primitives: a check, which makes an assertion against your infrastructure ("is this the way I want it to be?"), and a handler, which takes the result of that assertion and does something with it, like sending you an email or making a call to the PagerDuty API.

The workflow is largely built around Rabbit. The Sensu server is the brain: it has a list of checks, and as it rolls along it publishes checks onto the Rabbit exchange. On each client you just specify what it should subscribe to, so when a check goes onto the exchange, it fans out to the queues of all the subscribed clients.
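To make the subscription side concrete: a client declares its subscriptions in its JSON config under conf.d. This is a minimal sketch; the client name, address, and subscription names are made up for illustration.

```json
{
  "client": {
    "name": "compute-01",
    "address": "10.0.0.5",
    "subscriptions": ["openstack-compute", "base"]
  }
}
```

Any check published to one of those subscriptions will fan out to this client's queue.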
The clients pick up the check, execute it, take the output, and put it back on a results queue, which the Sensu server is listening on. The server takes the result and hands it to the handlers specified in the check.

Checks basically copy the Nagios/NRPE plugin model: Sensu just shells out. When you run a check, you get both an exit code (zero, one, two, or three) and a status string, exactly what the Nagios plugins produce. That's very handy: if you have an existing investment in Nagios plugins, you don't lose it by moving to Sensu.

So here's how you specify a check. That directory at the top is the conf.d directory, and that's where you drop checks in; restart the Sensu client and the check will be there. Sensu uses JSON for all its config, which is really nice, and it uses JSON pretty much wherever it can: it uses JSON to wrap metrics, and it uses JSON for messaging. In this example, the check is a disk check. In the command line it's calling the check_disk plugin, saying warn me once the disk is 85% full and throw a critical if it's 95% full. This particular check gets sent to all the subscribers, it has the email and PagerDuty handlers, and it runs every 60 seconds. I should also say that with checks, you can either go the centrally managed route or have the checks run standalone on the clients themselves.

Next, the Sensu plugin class. This is a convenience wrapper: if you want to write your own checks, you can just inherit from this class, and it gives you some convenience methods. You can easily specify options on your check, so if you want certain flags when you call your plugin, you can define those parameters. It actually uses Opscode's mixlib-cli for that.
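A check definition along the lines of the one described above might look like this; the handler names are examples, and note that the stock Nagios check_disk plugin takes its thresholds as percent free, so "warn at 85% full, crit at 95% full" becomes -w 15% -c 5%.

```json
{
  "checks": {
    "check_disk": {
      "command": "check_disk -w 15% -c 5%",
      "subscribers": ["all"],
      "handlers": ["email", "pagerduty"],
      "interval": 60
    }
  }
}
```

Drop that file into conf.d, restart, and the server starts publishing the check every 60 seconds.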
If you've ever written any knife plugins, it's exactly what happens there, so it's very easy to use. Then you just define the methods it's going to look for: define a run method and do your logic in there. If you call ok, it's going to exit zero along with a string; if you call critical, it's going to exit two, also with a string.

Okay, demo. I wrote a wrapper cookbook for Sensu, so it makes it really easy to provision Sensu. I have a virtual machine running Sensu; in the top screen I'm tailing the Sensu client log. This VM is running both the Sensu server and the Sensu client, but I'm just watching the client log, so it's basically testing against itself in this situation. And I have a check to make sure that the ntpd process is running. Right now you can see it rolling by: check OK, found one process with command ntpd. Let's see, I want to show you the dashboard. Here's what the Sensu dashboard looks like: right now, all clear, nothing's going wrong. Okay, so I stopped the ntpd service. You can see the check has now gone critical, and if we pop over to our dashboard, we've got a crit. At this point an event has occurred, so this is where your handlers take over: whatever you've specified for the event kicks off, whether that's an email, a call to PagerDuty, really whatever. The sky's the limit. Oh, and I actually published this to GitHub, so if you want to play with Sensu, you can just pull it down, do a vagrant up, and that'll get you going.

So, handlers and metrics. This is the piece that I think is particularly interesting to OpenStack operators. Sensu makes it really easy to write your own checks, and this is where it acts as a metrics bus as well: you can write a check that finds out some information about your system and sends it through Sensu.
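The run/ok/critical convention above is easy to sketch. A real Sensu check would inherit from Sensu::Plugin::Check::CLI in the sensu-plugin gem, whose ok/warning/critical helpers print the message and exit with the matching code; this standalone sketch just returns the status pair so the decision logic is visible, with hypothetical thresholds matching the talk's example (warn at 85% full, critical at 95%).

```ruby
# Nagios/Sensu plugin exit-code convention:
# 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
EXIT_CODES = { ok: 0, warning: 1, critical: 2, unknown: 3 }.freeze

# Pure decision logic for a hypothetical disk-usage check: given a
# percentage used and thresholds, return [status, status string].
def check_disk_usage(pct_used, warn: 85, crit: 95)
  if pct_used >= crit
    [:critical, "CheckDisk CRITICAL: disk #{pct_used}% full"]
  elsif pct_used >= warn
    [:warning, "CheckDisk WARNING: disk #{pct_used}% full"]
  else
    [:ok, "CheckDisk OK: disk #{pct_used}% full"]
  end
end

status, message = check_disk_usage(91)
puts message                 # the one-line status string, Nagios style
# exit EXIT_CODES[status]    # a real plugin would exit here
```

In the gem's version you would write the same logic inside a run method and call critical(message) instead of returning, which both prints and exits 2.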
And so for us, that means OpenStack information. If you want to know the number of tenants, or the resource usage per tenant, you can grab that information and send it through Sensu.

On to the different handlers. There's the pipe handler, which is basically what I just showed you: you get an alert, you go off to PagerDuty and make an API call. There's a metric handler, which doesn't care about the exit status of what's happening; it just says, hey, I'm going to get a bunch of data and do something with it. And then there's the AMQP handler, which is basically a router: whatever came in on the queue, it fires off to a different queue, just passing it along. That can be handy if you have another consumer at the end of your workflow. One thing you can put at the end of your workflow is Graphite. And there are quite a few community metrics checks out there if you're interested in monitoring your OpenStack.

Here are some things I thought could be interesting in terms of gathering metrics: logins to Horizon, the number of API calls, CPUs per tenant. We've actually just started getting into this space: we wrote one metrics check to deliver tenant resource usage, and we piped that into Graphite.

Some other nice things about Sensu: it uses Omnibus packaging, so even though it's Ruby, it installs discretely; the Ruby is embedded, so it will never conflict with any system Rubies you have installed. It's got really nice logging (it uses JSON there too), and it's got a really active community.

That's it. Any questions?
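As a footnote on the metric-handler path above: a metric-style check doesn't assert a state, it just prints measurements on stdout, typically in Graphite's plaintext format ("dotted.metric.path value unix_timestamp", one per line), for a metric handler to forward. A rough sketch; the tenant names and numbers are made up purely for illustration.

```ruby
# Format a hash of samples as Graphite plaintext-protocol lines.
def graphite_lines(prefix, samples, timestamp)
  samples.map { |name, value| "#{prefix}.#{name} #{value} #{timestamp}" }
end

# Hypothetical per-tenant resource usage, as a tenant-metrics check
# might gather from the OpenStack APIs.
usage = { "tenant_a.vcpus" => 12, "tenant_b.vcpus" => 4 }
puts graphite_lines("openstack.usage", usage, Time.now.to_i)
# A metric check conventionally always exits 0; the metric handler
# ignores the exit status and only consumes the output.
```

A Graphite handler on the Sensu server would then relay each line to Carbon.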