Okay, hello again, and welcome to my talk about RHQ, or Jopr; system management, that is. My name is Heiko Rupp, I work for Red Hat, and you see my contact data here. When I was asked about giving the talk, I submitted my proposal as "System management with RHQ", and I got an answer back: "Ah, you are talking about Jopr, yeah, that's cool, let's do that." And now I come here with RHQ and JBoss Operations Network. What it's actually about is that we have these open source management projects, RHQ and Jopr. And then we, Red Hat or JBoss, have a commercial tool called JBoss Operations Network, or JBoss ON, which is built from those two other worlds. I will show some history slides later on. In the past we had the RHQ project as the foundation, and on top of that we had Jopr, which were the JBoss management bits. And on top of both, JBoss ON was built. Since last September we are in the lucky position that we only have RHQ anymore. So while you will still see the term Jopr out there, it is being phased out. RHQ is the open source project, and RHQ lives on rhq-project.org. That's the main wiki and documentation site of the whole project. You don't have to write that down if you cannot keep up. I will post my slides at least to my blog, and this gets syndicated to the JBoss blog stream. I guess the organizers will have them linked somewhere as well. So it's rhq-project.org. We are using some other infrastructure for a few services, like Fedora Hosted for the git repository with all of our source code, and also the Bugzilla from Red Hat for our issues. We did have in the past two separate JIRA instances, one at rhq-project.org and another one at jboss.org/jopr. But it was always a bit of a pain for people to decide, if they wanted to report a bug or a feature, which issue tracker that should go into. So that's also gone now. The Jopr JIRA at JBoss is still there, but it should no longer allow you to post new issues.
So everything is in that one area. Here in the git repo you will also find the source of all the bits of RHQ, and you don't have to go to two different repos as in the past. It's now completely integrated. But before I continue with those boring slides, let's go off for some demo and have a look at the application itself. It's about monitoring and management of systems, of servers. From where we come from, it's mostly targeted at JBoss services and servers like the JBoss Application Server, JBoss Cache, Hibernate, and Tomcat or embedded Tomcat, but it's not exclusive. I will show you later a list of available plugins, which isn't even complete, and it's simple to write a new plugin for a new resource to manage. So this one here is the start page, the dashboard. You see information about what you have in inventory. There's a thing called the platform; that's basically a machine that services run on. So one Linux machine is one platform, another Linux machine is a second platform, and so on. This is currently my laptop here. It's one platform with 10 servers and 289 services. I'll show you what that means later, so just take it as given at the moment. Whether you model your stuff as a server or as a service is sort of arbitrary anyway. Then there is grouping: we don't have any groups defined. And we're currently getting 107 metrics, which is measurement values, per minute from the measurement subsystem. Up here we have an auto-discovery portlet. When the system finds new resources, it will show them to you up there, and then you can decide to take them into your inventory or decide you don't want them. So for example, if you just fire up a new JBoss application server for testing, then when the discovery run comes in, it will be discovered and show up in this portlet, like that guy. And when you say, no, that's not my production one, I just wanted to test something and I don't want to monitor it, you don't take it into inventory but you can ignore it.
Then you have some recently added resources, so you see a history of what got added to the system, and you can define some favorites. You see here the recent alerts, which alerts got fired. I will talk about alerts in more detail later on. Here you can see operations on resources that were triggered recently or that are scheduled to be triggered. So it's possible, for example, to tell the system: reboot my JBoss AS server every night, because I know it has a memory leak, or my application has a memory leak, or the server itself, and you can just reboot it like that. And here we also have a list of things that are currently unavailable, so it's sort of a list of things to look after. So now let's have a look at one platform, for example; that's my laptop, and you see a bunch of icons. When I go to the server list, it's even more impressive. These icons stand for the subsystems within the system. So you have this little icon here that looks like a monitor with a zigzag curve. That's the monitoring subsystem. Then you have this little notepad. Yes, thanks. That's the inventory. I will show you the subsystems again in a moment. The flag is for alerts, this play button is for operations. This thing which looks like an Explorer icon, this mini Explorer icon, is the content subsystem. This one I don't recall at the moment. You can also see here the wrench, which is about configuring the resource. You can see that not every resource has all of the icons. That depends on the plugin that you write and how much you want to support in the plugin. Sometimes it makes no sense to configure a resource. For a platform, for a Linux machine itself (not for a service on it, like the hosts file, but for the machine itself), it very often makes no sense to configure it. Or if it's a hardware chip that you are just monitoring and that has no mechanism to upload values into it, you don't have a configure icon. So let's go back to my platform.
I think many of these unavailable icons, the red icons, are due to the new network address that I just got from the DHCP system, because the system expects the same host to be in the same place. Normally, in a production environment, you would have fixed IP addresses, so you would not see those effects. So this is the summary page. Each resource has a summary page, a sort of dashboard for that resource, where you see recent measurement values, the last one that got measured, and a little graph of how it developed in the past. You see a list of alerts and how severe they were, so medium, low or high priority. Then some out-of-bounds metrics; I'm going to show that in a minute. Then configuration updates; you see nothing happened here. And package uploads and other events. So let's go here to the used swap space. For each metric you can have graphs; you have the small graphs and then the large ones like this. And the system is calculating bounds within which those metrics are normally measured, the default values, sort of. It takes the last n days of data and computes the bounds. When the value goes above or below the bounds, it will trigger an out-of-bounds condition, sort of. This is shown because very often you know that the active thread count for your application is between 70 and 80, and if it goes up to 120, you know it's in an error condition. It's even possible to alert on a value being out of bounds, or out of bounds by a certain percentage. So this is the default monitor tab. For the monitoring subsystem, you state in the plugin that you want to monitor your values, then you implement a little bit of code that hands the values back, and these get automatically graphed in here. This is relatively generic. We have algorithms to compute the right value range, so you don't have to write that yourself; it comes by default with the system. We can also have tables.
That's for when you say, well, I'd rather know the exact minimum and maximum values than look at the graph. And now we have something in addition to the pure numeric values. These are the so-called traits. Traits are always treated as string values, and when the system collects one, it compares it with the previous trait value. If the two are the same, no new record is stored in the database; the database only stores an updated version when the trait changes. So for example, when an operating system update comes in, from 10.6.2 to 10.6.3, this trait value will change and you will see when it changed. About the database size for monitoring data: we are compressing the data anyway, so we have one-hour aggregates, six-hour aggregates and one-day aggregates, and we can keep those by default for up to a year. So your database won't explode. We purge the raw data and keep the compressed data, so you still get a very good overview of what happened in the past without the huge amount of data. Okay, then we have a little sub-tab here about availability, so you see when your resource went up and down, how many failures there were, how long it was up, and things like that. That's also per resource. And the last one is the tab about the schedules. On this tab you specify which of the metrics that the plugin offers for this specific resource type, for this Mac OS platform for example, you want to measure, and how often, in which interval. If you have 10 platforms, you can either set a default that gets applied when you import a new platform, or do that for a complete group of resources. You don't have to go into each one of them, but you can still override on a per-resource level. Okay, then the inventory tab. That's the overview of what the name of your resource is and which direct children it has. That's probably not of that much interest for the platform.
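To make the trait handling described above a bit more concrete, here is a minimal sketch in Java of storing a value only when it changes. It is not taken from the RHQ code base: the class `TraitHistory`, its `report` method and the trait name `os.version` are hypothetical names chosen for illustration, and an in-memory map stands in for the database.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: these names are illustrative and not RHQ classes.
public class TraitHistory {

    /** One stored record: when the trait changed and what it changed to. */
    public static final class Record {
        public final long timestamp;
        public final String value;

        Record(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Stand-in for the database table: trait name -> stored records, oldest first.
    private final Map<String, List<Record>> history = new HashMap<>();

    /**
     * Reports a freshly collected trait value.
     * Returns true if a new record was stored, false if the value was unchanged.
     */
    public boolean report(String traitName, long timestamp, String value) {
        List<Record> records = history.computeIfAbsent(traitName, k -> new ArrayList<>());
        if (!records.isEmpty()) {
            Record last = records.get(records.size() - 1);
            if (Objects.equals(last.value, value)) {
                return false; // same as the previous value, so nothing is stored
            }
        }
        records.add(new Record(timestamp, value));
        return true;
    }

    /** Number of records stored for the given trait. */
    public int recordCount(String traitName) {
        return history.getOrDefault(traitName, Collections.emptyList()).size();
    }

    public static void main(String[] args) {
        TraitHistory traits = new TraitHistory();
        traits.report("os.version", 1000L, "10.6.2");
        traits.report("os.version", 2000L, "10.6.2"); // unchanged, not stored
        traits.report("os.version", 3000L, "10.6.3"); // changed, stored
        System.out.println(traits.recordCount("os.version")); // prints 2
    }
}
```

Three collections lead to only two stored records here, which is the effect described in the talk: the stored history shows when the value changed rather than every collection.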
One thing that's more interesting is down here: you can manually add resources that don't get discovered automatically. In my case, on my machine, that's the Postgres database server, because the plugin expects some default values which just don't hold on my machine. But I can still go in and say, okay, add a Postgres server, and then on the next page you specify the connection properties, and when you did the right thing it will just be taken into inventory, and all its child resources, which are the databases on the database server plus all the tables, also get pulled into inventory. Alerts I'm going to show later, so for the moment operations will be the last tab. Again, per resource type there are various operations that you can execute, so for example view the process list for a Unix machine. The next step is to specify when it should run. Either you execute it immediately; this then triggers an action on the actual resource, a remote resource. Here it's all on the same box, of course, but if I had two boxes it would really reach out to the other box and execute the operation on that one. Or you can specify a deferred time and a recurrence: how often do you want to run it, and when should it end. So it's really like in cron; you can just say, okay, do that every hour or every day or whatever. Then you click on Schedule. Let's do that just once. I click on Schedule, you have the completed operations, and you can go in, click on it, and here you see the results. It's probably in a different format than what you would expect if you know the output of ps
minus whatever, but it's still useful, and it's just one example of what you can do. When I go back here to the summary tab, you also see it listed now as a recent operation. And then there is this timeline in here, where you can see in a more graphical view what happened with your system. So the system was down here, that's the red bar, the light red bar. Here we did have an operation with the green check mark, which means it was successful. And when you get events in, or alerts, they would show up in here as well. That's actually the SIMILE Timeline project from MIT. It's very cool; if you want timelines on your display, it's really helpful. It's written in JavaScript, so it's sometimes a bit hairy when you supply the wrong data, or data not exactly in the format that it expects, especially when you have those strange things like daylight saving time in your system, or time zones which are not US ones. But when you work around that, it's really cool. Here on the left is the tree that we are currently working on. You can again browse through your resources here. It's only this one platform that you see; when you want to go to another platform, you have at the moment to go to, for example, the Resources menu, which is new in the current version that I have here, the developer version, and then just select servers or other platforms. When you go to a resource in here, like that one for example, you can also right-click. Now, this one doesn't support right-click; let me find a different one. Okay, so we are currently working on this tree and revamping it, and it looks like the right-click support is currently disabled. It should work in the next version that gets out there. One other thing that we have: you have seen that we have this summary page per resource and we have the dashboard, but often that's not enough. If you want to know about out-of-bounds resources all over your system, or which one is the worst outlier, it's a bit hard to click on each resource and
to compare them by hand. So what we also have here are so-called system views, where you have, for various things like configuration changes, metrics and operations, a view of that subsystem over the whole system. This one here is about suspect metrics, which are these outliers. They are sorted by the out-of-range factor, and the one which is the worst outlier is right on top, so you know what to concentrate on. And here, for operations, you see all the operations that happened during some time, who submitted them, and you can filter by resource. It's only this one, so no big deal at the moment. This is the first of the two history slides that I was talking about. We actually started in 2006 with the current line of development, and in early 2008 we released RHQ 1.0, which was the framework and most of the bits, but not the Jopr part, which was closed source at that time. Then, at the end of 2008, we released RHQ 1.1 and the JBoss ON 2.1 product, JBoss Operations Network, and after that also the Jopr bits. So this one plus that one sort of gave this one, and then we had these two lines of development. One of the results that came out of that was also Embedded Jopr, which many of you have probably seen as the embedded console in Application Server 5. Then we had RHQ 1.3, and in September last year we were finally able to put those Jopr 2.3 bits into RHQ 1.4, so it's one unified source line. We released a community release of 1.4, build 01, and with this build we had some issues with people who wanted to upgrade from previous versions of Jopr to that version, because our plugin system considered the 2.3 version plugins to be newer than the 1.4 version plugins. So we decided, first because we are putting a different emphasis and more emphasis on the whole development, and also just for compatibility of the version numbers, that the next version of RHQ will be RHQ 3. And last week we
released the second community version of it, RHQ 3.0 build 2, which has the plugins, and this is now the mainline development. For RHQ 3 we don't have a release date yet, but we plan on giving out community releases every six weeks to two months, and it's expected that from RHQ 3 you will also get the next version of JBoss ON, which will probably be JBoss ON 3 as well. So after this history review, I'll give a quick architecture overview. The central thing that we have here is the RHQ server, or a cluster thereof. When you want to do load balancing or failover, you can have one or more servers in your data center. It's even possible, if you have two data centers, to have one each or two each, and then have so-called affinity groups: when an agent (I'll come to that in a minute) is talking to one server in one data center and this server goes down, it will try to talk to the other server in the same data center before switching over to the other data center, which is normally not wanted. On the server we have the access for the administrator. We have the GUI, the user interface that I have just shown. We have a command line interface; that's also written in Java, and it uses JavaScript as its language, so you can write expressions in JavaScript with complete control structures. It even has auto-completion for some terms, for resource names and other stuff, so it's quite comfortable and quite powerful. And then we have experimental support for WebDAV, so you can mount your resource tree as a WebDAV directory in your Explorer or Finder or whatever. The server also hosts the database connection. The supported databases for us are Postgres and Oracle. There is some support for the H2 database; that's an embedded one, basically the successor of what was called Hypersonic in the past. This is mainly used for demoing or testing purposes, it's not for production. And then there is some experimental support for SQL Server, or for one version of SQL Server. And the other big
component that we have here, and that you need, is the agent. This dot-dashed line shows a platform. On each platform where you want to monitor and manage stuff, you need an agent, and this agent gets all the agent plugins, and only these agent plugins talk to your managed resources. So in a firewall scenario you only have to open the communication between agent and server, but never between the server and a target managed resource. Yes, please? I was talking a lot about resources. Our definition of a resource is everything that can be managed or monitored. I've written a plugin that monitors a thermal chip; it's just a thermal sensor, about the size of a transistor, that sits on the 1-Wire bus. So here it's only monitoring and not managing. But a resource could also be any process, or a thread count, or a memory setting, or anything like that. Then each resource has a corresponding resource type. A resource type is, for example, Linux or Mac OS; it's JBoss AS, it's Tomcat, it's an individual data source, it's a database table, things like that. And the last thing which we have is the resource category. This sort of shows the place in the resource hierarchy. You have seen the platform before, which is the machine. Then the next level is the server, which could be for example a JBoss application server, or it could be a Tomcat. This whole thing is sort of recursive: a server can host a server. JBoss AS has this embedded Tomcat server, so that's recursive. And then usually below a server you have the services, which is a data source in JBoss AS, or a web application in Tomcat. There is no hard rule on when to use a server or when to use a service. Usually, when you have a subsystem or a complete process, you model it as a server, and when it's individual parts of the subsystem, you model them as services. But it's really up to you where it fits best. Okay, that's this, and this is again showing the hierarchy, who hosts what. There are even platform services that are not hanging on the
server but directly on the platform, like network interfaces. Okay, then we have those subsystems. I have shown them already in the UI, so I'm not going to go through them again at length. It depends on what your plugin defines which of the subsystems are available for a specific resource type. Usually you want to do monitoring. Availability is technically also its own subsystem, but it sort of belongs to monitoring. You always have some inventory, but many resources just don't have any connection properties, nothing, so you don't need to implement anything in there. Now I'm going to talk about the extensibility of the whole thing. We now have perspectives, and more generally you can write server-side plugins that run in the server and extend the server functionality. One kind of those is the perspectives, which give you a new UI look or let you add to the UI, and then we have the alert plugins, which are alert senders. And then you can write, for the agents, those agent plugins. I'll go through all of those now. Perspectives are like UI plugins. So it's possible to write a complete new UI screen that's hosted on a tab, like the Monitor and Inventory tabs that you have seen, just next to them in a new tab. This can be written in Java, just a normal WAR file, or it is, or will be, possible to even host a complete application that's written in PHP, but that gets some information when it's called, and that can also link back into the main UI or the core UI. We have various extension points, so it's possible to change entries in the top-level menu, you can add new tabs, and you can write new pages or exchange existing pages. Then the agent plugins. That's basically the functionality, those guys that talk to the managed resource, that work with the managed resource. Only those talk to the resource; that's important, again. So it's basically metadata plus some Java code. The metadata defines the capabilities of your plugin, so when you say I want to monitor those 23
values, only those 23 values will be shown in the GUI, in the Schedules tab or the Monitoring tab, no matter how much data the plugin could provide otherwise. The metadata also wires the Java classes together. The plugin can auto-discover resources, so it's possible that when a new resource of that type comes up, it gets automatically added to the inventory. And there's a generator available that can help you with writing a plugin. There is a big list of plugins available right now, many of those at the RHQ project in the git repo, but also from third parties. And I've written a script plugin that can even defer the measurement taking to some scripts in JRuby or JavaScript. Many administrators don't like writing Java code, but they know scripting languages like Ruby well enough, so even that's possible. Okay, a plugin is these three boxes. You have the plugin descriptor that you need to write; that's a bit of XML. You have to write a discovery class; there you only have to implement one method, which in the shortest version just sort of returns its input. And then the component class, which defines all the facets for all the subsystems that you want to implement. And yeah, so this is a quick run of the plugin generator. It's a standard Java archive. It asks a few questions here: do you want a platform, server or service as the root of your resource tree from that plugin, where should it live in the package hierarchy, where should it live in the file system, what are the class names, and then which subsystems do you want to support: do you want to support events, do you want to support monitoring, etc., etc. And when all of that is done, it will write out a POM file from the information, and also a skeleton plugin descriptor with a few to-do tags in there, and also skeleton Java classes that are fully stubbed out for all the subsystems that you want to support. So you can directly compile it and deploy it. It won't do much, of course, but it's already a ready-made plugin. I have created a video, which I hope will be online soon, about how to
do that. There are also a lot of online articles out now on how to write a plugin, so there should be a lot of resources, and of course there's the source code of the existing plugins. That's just a list of available agent plugins. I don't want to go through it in detail, and this list is not exhaustive, so I think the source is the best place to look; it's the source of knowledge. But you see there is big support for JBoss software: JBoss AS 4 and AS 5 support, JBoss Cache. We have an SNMP trap receiver, so other devices like routers can send traps to the system, which can then be incorporated into the event subsystem. We have a generic JMX server here, which allows you to monitor any Java 5 resource. I've written a blog post in the past on how to monitor your Eclipse IDE with the help of this JMX plugin, so even that's possible. And then outside of our domain there is Infinispan, the successor of JBoss Cache; monitoring for Mobicents, for various parts of Mobicents; support for JBoss ESB; the previous speaker just told me that they have some support for their project as well; and I'm sure there are more plugins out there that I don't know about. So after the agent plugins and the managed and monitored resources, I'm coming back to the server-side plugins, to how to extend your server. Those server-side plugins live in the server, as the name says, and they basically have access to all server-side methods. So it's possible for you to write a plugin that, for example, does reports on all your resources in inventory. You could write a report about how many Windows 2003 servers you have in inventory, or things like that. We will hopefully have a real reporting engine sometime in the future, but you could use it for those purposes. Again, it's some metadata plus Java code, and there are some different kinds of server-side plugins. All of them have different descriptors that extend a base server plugin XML format. One of the most used, from my side, is the alert sender, which I'm coming to. Another kind of
plugins are content sources. We have, for example, in JBoss ON a JBoss patch feed, where customers can get updates to the JBoss AS servers, and this is also a server-side plugin that contacts the customer support portal for a feed with the changes, and the customer can then say in JBoss ON, please apply those changes to my application servers. That's also supported by the server-side plugins. The alert plugins are a specialized form of server-side plugins. Basically, what you have to do is again write a little bit of XML and then implement this one method, alertSender.send, and the argument that you get is the alert that just got fired, and then you have to react to it. I will show an example on the next slide. The nice thing is that we have preferences in the UI for the whole plugin. So if you have, let's say, an IRC sender plugin, what is set as a preference is the server that you want to contact and perhaps some credentials on how to contact it, and then alert-specific would be, for example, the channel this should go to. I will show that in the UI in a few minutes. And these just get injected; you don't have to do any work to get at those values, they are just there for free. For the most part, the UI is just driven by the data: you have a configuration system, and this just renders the input fields for all of that. This is powerful enough in many, many cases, but sometimes it's not enough, for example when you want to do extensive searches or list pick boxes. Then it's still possible to write a custom UI: you write an XHTML Facelet plus a backend in Java, package all of that together, and then you have a custom UI. Also here we have a script language alert sender, where you can define the implementing method in JRuby or in Ruby and have the alert delivered by Ruby instead of having to write Java code. So it's again appealing, I guess, to administrators. This is a list of available alert sender plugins. Subjects and roles: subjects are just the users in the RHQ server, and roles are things like JBoss operators or
system operators or business people or something like that, and these will just send emails to those. Then we have Mobicents support, so the plugin talks to a Mobicents server, which then initiates a voice call to you and reads the error message on the phone. I did have in the past a version where you were even able to use touch tones to answer back to the server, for example to reboot a resource. This is currently not available in this alert sender plugin version; I still need to look for a good idea on how to integrate it, so you are free to help. Short messages go via developergarden.com; they have a REST interface where you can talk REST to an endpoint to send short messages. Then microblog; that's Twitter or StatusNet feeds. That's interesting, I guess, if you have a StatusNet server within your enterprise: people can just have their Twitter client talk to that StatusNet server and get the updates from there as well. Email is just email sending to anyone. We can then send SNMP traps, and also talk to IRC, so you have a bot on IRC that's reporting alerts. And again the script language. One thing that we did have in the older version, but which is currently gone in the alert plugin version, is the operation sender, with which you can trigger an operation, like the one we have seen, on an arbitrary resource in your inventory. So it will be possible, if you find out that the data source is running at full connections the whole time even though you only expect five open connections at a time, to for example reboot the whole application server, if you think that's a good idea to fix this problem. Okay, that's a quick alert plugin example. So that's the wiring. You write this alert plugin descriptor. It's a bit shortened up there because of all these XML namespace headers that you have to put in, which I don't want to show you. Basically you have to define the name, and you can define a package; that's just the Java package where your code lives, and then you don't have to specify it
here on the plugin class later on. Here in this block, server plugin configuration, you specify the preferences; that's global for all instances of this alert sender. And then down here, alert configuration: these are the properties that are specific to one instantiation of the sender. I will show in a minute what's meant here. We need to provide a short name for this sender, which gets shown when you want to define a new alert notification: you select from a list of senders, and that's the short name that you will see. The Java code for the whole thing is here: again this package, public class URL sender, and we have this method send, where you just write the Java code. I'm getting my data from the preferences, so the preferences value is there for you. You say getSimpleValue; hostname is the one we defined here in the metadata, and this second argument is the default value. If the operator or the user did not enter anything, you just specify a default value. The same here. And you have the alert parameters next to that, so the data is just there; you have only this one method call to get it. Here I'm opening my HTTP URL connection to that host and port, I'm getting an output stream and writing my message; that's simplified, of course. And at the end, when you're done, you need to return a sender result object. The sender result object expects that you specify whether it was a success or a failure, and a message. And then there's a third state, not shown on this slide, which is deferred email. It means this plugin just computed a list of email addresses that should be sent to, but it's not able to determine whether sending will be successful later, because that's only done after all the email addresses have been collected. So this plugin can't know it, and luckily it does not have to implement the sending, because that's just available for free. We're close to the end, but it's okay. You can see here in the UI that part. I have just my alert notifications, I have this Mobicents sender; I
can show that outside later on if anyone wants to see it. Here, that's the alert configuration that was down here in the XML snippet, so that's just the default rendering. And here you add your value. For this roles plugin we have this custom UI. It's not as nice as the provided one, but this is a little bit more complex, with those pick lists where you can move stuff from left to right, and it disappears from the left list and goes into the right one. So this is a custom one; you can just write it as you want. And then the last thing: system configuration, plugins. So here is a list of agent and server-side plugins. Let's just go into one, the Mobicents one, and configure it. So here, this box, that's the global configuration section that was up in the XML snippet, again rendered for you. Some URLs for you: rhq-project.org, which has pointers to all of the others, of course. Then my blog; I'm often writing about RHQ, how to extend it and how to do things, so if you are interested, it's something that you might want to follow. And then we have on jboss.org this RSS aggregator, so all of our feeds, including mine about RHQ, will be fed into that one. So thanks for listening, and I will take questions. But before that, one more thing: the Summer of Code is coming, and we are planning to participate in that. If you are interested, please contact me.
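As a rough companion to the alert sender walkthrough in the talk, the following Java sketch shows the general shape: global preferences and per-alert parameters read with default values, a single send method, and a result object that carries a state and a message. It is a self-contained illustration only. All types here (`SenderResult`, `ResultState`, `Alert`, `Transport`, `UrlSender`), the helper `simpleValue`, and the `prefix` parameter are hypothetical stand-ins, not the RHQ classes from the slide, and the HTTP connection is replaced by a pluggable transport so the example runs without a network.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these types are the real RHQ plugin API.
public class UrlSenderSketch {

    /** The three outcomes mentioned in the talk. */
    public enum ResultState { SUCCESS, FAILURE, DEFERRED_EMAIL }

    public static final class SenderResult {
        public final ResultState state;
        public final String message;

        public SenderResult(ResultState state, String message) {
            this.state = state;
            this.message = message;
        }
    }

    /** Minimal stand-in for the alert that just fired. */
    public static final class Alert {
        public final String resourceName;
        public final String text;

        public Alert(String resourceName, String text) {
            this.resourceName = resourceName;
            this.text = text;
        }
    }

    /** Stand-in for opening a connection to host and port (assumption for testability). */
    public interface Transport {
        OutputStream open(String host, int port) throws IOException;
    }

    public static final class UrlSender {
        private final Map<String, String> preferences;     // global, per plugin
        private final Map<String, String> alertParameters; // per alert notification
        private final Transport transport;

        public UrlSender(Map<String, String> preferences,
                         Map<String, String> alertParameters,
                         Transport transport) {
            this.preferences = preferences;
            this.alertParameters = alertParameters;
            this.transport = transport;
        }

        /** Returns the configured value, or the default if nothing was entered. */
        static String simpleValue(Map<String, String> config, String name, String defaultValue) {
            String value = config.get(name);
            return (value == null || value.isEmpty()) ? defaultValue : value;
        }

        public SenderResult send(Alert alert) {
            try {
                String host = simpleValue(preferences, "hostname", "localhost");
                int port = Integer.parseInt(simpleValue(preferences, "port", "8080"));
                String prefix = simpleValue(alertParameters, "prefix", "ALERT");
                String body = prefix + ": " + alert.resourceName + " - " + alert.text;
                try (OutputStream out = transport.open(host, port)) {
                    out.write(body.getBytes(StandardCharsets.UTF_8));
                }
                return new SenderResult(ResultState.SUCCESS, "Sent to " + host + ":" + port);
            } catch (IOException | NumberFormatException e) {
                return new SenderResult(ResultState.FAILURE, "Could not send: " + e.getMessage());
            }
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        Map<String, String> prefs = new HashMap<>();
        prefs.put("hostname", "monitor.example.test");
        UrlSender sender = new UrlSender(prefs, new HashMap<>(), (host, port) -> sink);
        SenderResult result = sender.send(new Alert("datasource", "too many connections"));
        System.out.println(result.state + ": " + result.message);
        System.out.println(new String(sink.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

The transport is injected only to keep the sketch runnable offline; a sender like the one on the slide would open the HTTP connection itself and, as described in the talk, would receive its preferences from the server rather than from hand-built maps.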