So good afternoon everyone. Our next speaker for this afternoon is Florian Forster, who will be presenting collectd in dynamic environments.

All right. Hi everybody. I've been introduced — that is a first, to be honest. This should eventually, hopefully, go to the next slide. Oh boy. Sorry about that. All right. So my name is Florian Forster. People tend to call me octo, because Florian is a common name in my age group in Germany. I've been doing open source work for a while. I started with some Perl hacking, I guess, in 2001. In 2005 I started the collectd project, and I've been hacking on stuff ever since — mostly infrastructure and backend stuff, so I'm not a GUI person. I have a Google+ profile, if that's interesting for you, and I'm on Twitter as well.

Here we go. I'm going to talk about dynamic environments. I phrased it "dynamic environments" because I didn't want to abuse buzzword-bingo terms, but it's the cloud, whatever that means to you. For me it means that we have virtual machines — virtual Linux machines, or JVMs in this particular case — that are scheduled somewhere, and some form of job management system that will start a new machine here, or tear this one down, or start a new job over there, and you have this dynamic, ever-changing environment of jobs and machines that you kind of have to handle. So that's what I meant by this.

The agenda for my talk today: first I'm going to talk about collectd and why collectd is a good choice for this environment. I'm going to cover what collectd can offer to people running the cloud itself, rather than a service inside the cloud, and the different ways in which you can use collectd's network plugin to do communication between instances. The second third will be about aggregation, which is an interesting factor and a necessity in these kinds of environments. One part is the aggregation plugin that is included with collectd, and then I'm going to say a few words about Riemann and Bosun.
And I'll end with storage solutions, from the old but proven RRDtool to more modern approaches, and that'll be all for today. I don't do talks twice, so I hope I get the timing right — I hope I run out of neither slides nor time.

So why collectd? collectd gathers metrics, and then eventually you can do stuff like this. The graphing part is not actually something that collectd does. collectd collects the metrics, it can send the metrics around, and eventually it hands the metrics off to a storage system. It doesn't really do storage itself — it can write CSV files, but I don't really know anybody who's doing that at a bigger scale, and that's it. So in other words, it's doing one thing and it tries to do it well, and since we get a lot of users, and a lot of pull requests, and a lot of people seem to know about it, I kind of have the impression that we're on a good track here. There's a link to collectd's homepage in case you don't use Google, there's a Google+ community linked there, and there's also a Twitter account that every once in a while announces new versions and so on and so forth.

You might want to use collectd because, for one, it's a free and open source project. There are hundreds of people contributing to it, literally. We started out using the GPL and eventually found that to be cumbersome to integrate with some other parts of open source software, so we are slowly but surely moving to the MIT license — it's just very tedious work contacting all the contributors. collectd is platform independent: it runs on FreeBSD, Solaris, and every operating system with an X in the name. It does not run on Windows. There's commercial software that you can use that integrates well with collectd, but collectd itself only runs on Unix — but on all Unixes that you can come up with, I guess. collectd uses a very modular design comprising many plugins; a bit about that in a minute.
You'll read that it's an agent-based design; that's only half the truth. The same core daemon is running on all instances, and depending on the plugins you load — depending on your configuration — it will perform one role or the other. So it's not inherently an agent-based design. If you only have one machine, there's absolutely no need to run a collectd server, which we'll see in a bit — you can just store data locally. Or, if you don't like sending metrics around, you could in theory store the metrics on every single machine itself. Again, it's a possibility, but nobody is seriously doing this.

It is extensible with plugins, and I'm going to say a few words about plugins on the next slide. If you want to extend collectd yourself to collect your own metrics, you have a wide variety of options. The plugins that ship with collectd are written in C for the most part. There are language bindings for Perl, Python, and Java, so if you like those languages better, you can use one of them. The plugins that implement that support actually pull in the Perl or Python interpreter or the JVM, so your code gets executed more or less natively and should be decently fast. If that's too much investment, or too brittle, or whatever for you, you can also use the exec plugin, which executes arbitrary scripts, binaries, what have you. These scripts or binaries need to implement a very, very simple ASCII line-based protocol: write it to standard out, and collectd picks it up and does stuff with it. It will take you literally a minute or so to get started and set up, so it's really simple to get to results fast. It also scales fairly okay, but many people kind of cringe at the idea.

So plugins come in different sorts — different plugins perform different things — and most plugins you can put into one category quite easily, but then there's a long tail of plugins that do something interesting and weird and don't fit any bill, really.
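To give an idea of how simple that exec-plugin protocol is, here's a minimal sketch of such a script in Python. The hostname, plugin, and metric names ("example.com", "myapp", "queue_length") are made up for illustration; the line follows the general shape of collectd's plain-text PUTVAL protocol.

```python
#!/usr/bin/env python3
# Minimal sketch of a collectd exec-plugin script. All identifiers in the
# example call below are placeholders, not real infrastructure.

def putval_line(host, plugin, type_, type_instance, value, interval=10, when="N"):
    """Format one metric in the PUTVAL line protocol the exec plugin reads.

    "N" as the timestamp means "now"; a Unix epoch timestamp also works.
    """
    identifier = "%s/%s/%s-%s" % (host, plugin, type_, type_instance)
    return 'PUTVAL "%s" interval=%d %s:%s' % (identifier, interval, when, value)

if __name__ == "__main__":
    # A real script would print such a line every interval seconds in an
    # endless loop; collectd reads them from the script's standard out.
    print(putval_line("example.com", "exec-myapp", "gauge", "queue_length", 42))
```

A loop around that single print, plus flushing stdout after each line, is essentially all an exec-plugin script needs.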
The biggest category is read plugins. Read plugins get metrics from somewhere. That somewhere is often an operating system, and the metrics will be something like a queue length, or the number of packets, or the amount of memory used by an application — the sort of stuff operating systems know about. Another set of metrics comes from applications: the Varnish cache can tell you how many hits and how many misses it had, and you can calculate a hit-to-miss ratio and so on and so forth. The Apache web server will tell you about the workers it has forked but not used yet, or that are currently handling a request. MySQL can tell you something about caches and the number of transactions and so on and so forth. So a lot of infrastructure software will tell you some metric or other, and it might be interesting for you or not. The nice thing about the approach of having plugins is that you load what you need. If you're not running MySQL, you're not loading the MySQL plugin, and collectd will have absolutely nothing to do with MySQL. The same goes for the other services out there.

And then there are, again, other metrics that you can read. It ranges from Xeon Phi, which is hardware by Intel for high performance computing — essentially you have something like a Cell CPU on a PCI Express card, and it can do teraflops of computations or something. And interestingly, on this PCI Express card they run, I believe, their own Linux and then communicate with the outside world, so you can query information from this embedded Linux on the PCI Express card. I didn't know that existed until somebody sent in a patch. You can query your network equipment via SNMP — unfortunately, that's still a thing. 1-Wire is a protocol to reach hardware sensors in a very easy and lightweight way.
A company I worked with a couple of years back aimed at reading the temperatures of their racks in the data center via 1-Wire, but eventually gave up because 1-Wire didn't scale to the size they wanted, and they went to Modbus. So I got paid for implementing Modbus support for collectd, and you can query Modbus with it. And the list goes on and on and on. There's a lot of stuff in there and a lot of interesting things, so chances are really good that the metrics you're interested in are already supported.

On the other end — that was getting the metrics — you want to do something with the metrics and write them somewhere. There are fifteen-something plugins that can do something like this; the most commonly used ones I've put on this slide. There's a graphite plugin that can send data to Graphite. RRDtool I already mentioned. RRDtool itself doesn't do any networking; it's a library and a command line tool. But there's an RRD caching daemon, rrdcached, that adds some networking infrastructure around RRDtool. The rrdtool plugin of collectd does essentially the same caching that rrdcached is doing, but it can also talk directly to rrdcached and just send the values over there, and you're good to go. collectd can talk to Riemann, which I'm going to talk about in the third part. MongoDB was like the new kid on the block for a couple of years, and nobody talks about it today, so I don't know where that is currently. And one of the most generic plugins is a plugin that just encodes your metrics in JSON or another format and sends them via a POST request somewhere. This is used by several startups that provide, essentially, metric storage and visualization on demand, as a service.

So with this, what can collectd do for people running the cloud — the cloud as an infrastructure? There's a plugin that is currently called libvirt — I'm going to rename it eventually, but no matter — which was contributed by Red Hat. I've seen a Red Hat person earlier. Thank you, thank you.
The Emerging Technologies team did this, so just tell them I said hi if you run into them. What libvirt — a library that Red Hat wrote and open sourced — can do is provide a unified interface to talk to hypervisors, and you can query metrics from the hypervisor. Which metrics exactly depends on the hypervisor, unfortunately, but on Xen you get what is listed here. CPU usage — unfortunately it's only used or unused, with no more detailed CPU states, but that is due to the way hypervisors work. Memory — again, basically the balloon memory size, that is, how much physical memory is assigned to a guest, and not necessarily how much memory the guest is using. Swap, disk IOPS and bytes, and, last but not least, the network throughput — that sort of stuff. So very basic metrics for a virtual machine. But the good thing here is that you get all of these metrics more or less for free, as in: you don't have to instrument the guest at all. It could be anything running in that guest, and you'd still get these kinds of metrics out of it. You don't have to make your customers load some library or run some daemon or anything. Your customers run some job on your cloud, and then you get very basic information that you can provide to your customers on a website, or look at yourself, or whatever.

The configuration is simple. There are a couple more options you can use if you're interested, but this is almost the smallest possible config. You tell it which hypervisor to connect to — "xen://" is apparently the default for connecting to your local xend — and in this case it's actually limited to one guest: I'm only interested in this one guest. If you leave out that line, it will get the information for all guests currently running. And then you get something along the lines of this. Again, collectd doesn't do the graphs; it just provides the data. But that's something that you might end up having eventually.
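A minimal configuration along those lines might look like the following sketch; the option names are from the libvirt plugin as I understand it, and the domain name is a made-up placeholder.

```
LoadPlugin libvirt
<Plugin libvirt>
  # Which hypervisor to talk to; "xen://" means the local xend.
  Connection "xen://"
  # Optional: only collect metrics for this one guest.
  # Leave this line out to collect all currently running guests.
  Domain "customer-guest-01"
</Plugin>
```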
The "virtual interface" here tells you that this is Xen again — and again without instrumenting the guests. So that was virtual machines, as in virtual Linux servers running on some hypervisor. On the other end of the cloud spectrum, you might have Java virtual machines that are scheduled to run somewhere, and you want to get metrics from the JVM. There's a plugin that can help you out with this. The GenericJMX plugin, as it's called, connects to a running JVM via the Java Management Extensions, JMX, and it can query all the MBeans that the JVM is providing via JMX. MBeans are essentially Java's word for a metric source. The configuration is a bit more verbose, but at the end of the day you will get — again, essentially for free, without instrumenting the code at all — memory information, garbage collection information, and some thread numbers, at least rough numbers, from the JVM. So if you have, say, a third-party binary that you want to run, that might also be an interesting option.

This configuration is severely stripped down; it won't work exactly the way it is printed here — there are a couple of lines missing. But I wanted to demonstrate that you have to load the plugin and then define these MBeans over here. Can you see this? So this is a definition, a mapping from the MBeans that the JVM provides to the data format that collectd expects. And then you have a Connection block down here that essentially says which JVM to connect to. What's missing here is the information about which metrics you're interested in. You can look at the collectd wiki and find a longer list of predefined MBeans that people have already collected for Tomcat, Catalina, and some Java servers and so on and so forth, and you can just copy and paste those into your configuration. And then you get something like this. So this is Java memory.
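For reference, here is a stripped-down sketch of what such a configuration looks like — one MBean-to-data-set mapping plus a Connection block. The class path, service URL, and port are placeholders, and a working config needs the value-mapping details filled in from the collectd wiki.

```
LoadPlugin java
<Plugin java>
  JVMArg "-Djava.class.path=/usr/share/collectd/java/collectd-api.jar:/usr/share/collectd/java/generic-jmx.jar"
  LoadPlugin "org.collectd.java.GenericJMX"
  <Plugin "GenericJMX">
    # Mapping from an MBean the JVM provides to collectd's data format.
    <MBean "memory">
      ObjectName "java.lang:type=Memory"
      <Value>
        Type "memory"
        Attribute "HeapMemoryUsage"
        # ... further value-mapping options go here ...
      </Value>
    </MBean>
    # Which JVM to connect to, and which MBean definitions to collect.
    <Connection>
      ServiceURL "service:jmx:rmi:///jndi/rmi://localhost:17264/jmxrmi"
      Collect "memory"
    </Connection>
  </Plugin>
</Plugin>
```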
You see the usage and the committed memory slowly building up, and eventually the garbage collection is run, and all of the objects in this memory segment are either deleted because they're not used anymore — garbage collected — or moved to a different memory segment, called the young generation, I think. So that's something you can get out of a JVM fairly easily. And this is what collectd can do for people running other people's VMs and code.

Next, I want to talk about how bigger collectd setups can be hooked up in various ways to build a bigger metric collection network, so to speak — how to do metric collection in a bigger environment, and one that might be changing. Okay, sorry. The network plugin is a plugin that provides ways to send metrics somewhere else and also to receive metrics from the network. Which of the two it does depends, again, on the configuration. It's using a very efficient binary protocol that needs — it kind of depends on the names of the metrics — something between 50 and 100 bytes per metric. So 100,000 metrics per second would amount to, let's say, roughly 10 megabytes per second coming in. It's essentially hard or impossible to saturate your network link just with metrics; you're going to run into a whole different problem much sooner, when you're trying to store these metrics to disk at some point. The protocol is using UDP. It's strictly unidirectional, so the clients will send a UDP packet to the server and the server will never reply back. The idea behind this was that we really wanted to use multicast at first, and the unicast implementation was added as an afterthought. So that means you can use multicast if you want to. Most people don't tend to want to, but if you're playing with the cool kids: multicast.
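The back-of-the-envelope bandwidth figure can be checked quickly; the 100-bytes-per-metric value used here is the upper end of the range just mentioned.

```python
# Rough check of the bandwidth claim: at about 100 bytes per metric on
# the wire, 100,000 metrics per second are roughly 10 megabytes per second.
bytes_per_metric = 100          # upper end of the 50-100 byte range
metrics_per_second = 100_000
mb_per_second = bytes_per_metric * metrics_per_second / 1_000_000
print(mb_per_second)  # 10.0
```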
There's a bit of crypto available, so you can sign metrics before they go over an untrusted network, so people don't inject garbage into your metrics system, or you can encrypt the metrics if you want to hide whatever metrics you're gathering. Especially when data centers are linked via untrusted networks, that tends to be an interesting option.

I said that the configuration defines the role, because the configuration defines the behavior of the plugin. So if the plugin is configured to send metrics, you would probably call that instance a client, and if you configure the plugin to receive metrics, you would probably call that particular instance a server — even though the same binary is running on both of them. You can also send and receive at the same time, building proxies, which makes it possible to forward metrics somewhere else.

So again, on the right you have a very, very simple — the simplest possible — configuration for the plugin. You tell the network plugin that the server you should send data to is example.com, and that's it; the network plugin will take everything that it gets and just send it to that server, or whatever it resolves to. Conversely, on the receiving side, you configure it with a Listen line. The double colon, "::", is the IPv6 any address, so it will listen on every interface, accept all the packets on the collectd port, and submit those metrics to collectd — and if collectd is set up to write metrics somewhere, all the metrics it receives will eventually end up in the storage system. And last but not least, one extra line is necessary for the forwarding option: you listen on one end, you send on the other, and then you have to explicitly tell collectd that yes, I would actually like to forward the packets I received off the network.
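In config terms, the client and server sides just sketched look roughly like this (example.com stands in for your own server):

```
# Client: send all locally gathered metrics to the server.
LoadPlugin network
<Plugin network>
  Server "example.com"
</Plugin>

# Server: listen on the IPv6 any address, default port, and hand
# received metrics on to collectd's write plugins.
LoadPlugin network
<Plugin network>
  Listen "::"
</Plugin>
```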
If Forward is false — the inverse of this — collectd will only send out metrics that it gathered itself, and not the metrics it received from the network, making it possible for two server endpoints to send their own metrics, their own local information, to each other.

So with these different ways of configuring the network plugin, you can build several types of networks, so to speak. The simplest is to have all your machines send metrics to one server: you and you and you all go to a.example.com, and everything's done. The one config up here — listen on the any address — would be the config for the server, and the line below it — Server "a.example.com" — would be the configuration for the three clients, and you would get roughly this setup. If you have two servers for fault tolerance reasons, you simply add a second Server line, and you get something that looks a bit like this: every client is sending out its metrics twice, once to the A server, once to the B server, and the servers again share their configuration and listen on any interface. If you want to set up multicast, all you have to do is drop in a multicast address: the plugin will automatically detect that this is a multicast address and send the correct IGMP join packets, and if your network does the right thing, you're good to go. And you can do something that — let's call it multi-tier for now. You have your data center here, all set up with a single server instance, and you have two more data centers over there, and you want a global view of all the data centers, with all their metrics collected in one place. So you can have a global collectd that aggregates all the metrics in one place — in this case from two data centers — and you can store them there, or look at them there, or whatever it is you need to be doing.
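The two-server and proxy setups differ only by a line or two; as a sketch:

```
# Client with two servers for fault tolerance: every metric goes out twice.
<Plugin network>
  Server "a.example.com"
  Server "b.example.com"
</Plugin>

# Proxy: receive on one side, send upstream on the other. Without
# "Forward true", only locally gathered metrics would be sent on.
<Plugin network>
  Listen "::"
  Server "global.example.com"
  Forward true
</Plugin>
```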
Or, in the proxy case, you can bridge between technologies: you can have a data center that uses multicast to get the metrics to the local servers, and then the local servers use unicast to send the metrics upstream. Or you can have IPv6 within the data centers but not over the external link, and the proxy again allows you to bridge this easily.

With this I'm coming to aggregation. Aggregation is important in dynamic environments because you're usually not caring about a single instance or a single server. You tend to use the cloud in a way that, if one job fails, if one server fails, you have enough running somewhere else to take the load, and eventually that server will come back or get rescheduled somewhere else — you tend not to care about a single server. So setting up checks the way that, for example, Nagios does it — where you have a config file saying these are all the checks that need to run on this server, and if they fail, then boom — tends to be tricky. Aggregates are often more useful for alerting. Let's say you don't really care if one or two of your servers send out an HTTP 500 error once, but if your entire fleet sends out, I don't know, 2% 500 errors across all the requests they handle, that might indicate a serious problem in your code, and you might want to beat your developers into fixing it. So the global view, the aggregated view of your system, is more interesting than focusing on a single machine or a single job.

The other two are kind of made up, I guess. Metric storage is often I/O bound, or you might not have the capacity to store all the metrics in some other respect, in which case it might be useful to store just a global aggregate of the information for historic reference, and not so much all the unaggregated data.
Last but not least, dashboards should not overload the user or the sysadmin with information, so you need to aggregate the data to show it nicely in a dashboard, and if you can do this, or parts of this, beforehand, then you're better off.

The way the aggregation plugin in collectd works is that on one side it subscribes to metrics, and on the other side it spits out the aggregates in regular intervals. It was added roughly two years ago, so it's in version 5.2; unfortunately Debian Wheezy still ships an older version. This would be a schematic view of collectd: on the left-hand side you have the input plugins that read several metrics, on the right-hand side you have the storage, whatever that might be, and the aggregation plugin kind of closes a loop here — it gets metrics on the one side, aggregates them, and then acts as an input plugin again. There's a safeguard here that does not allow aggregates of aggregates within the same collectd instance, in order not to have ever-increasing feedback in here that essentially only measures the running speed.

Since we're running short on time, let's skip this. The limitation of the aggregation plugin is that it has only online algorithms, so it doesn't keep a variable amount of state per aggregate; it only has quite simple things such as minimum, maximum, average, sum, number of values. That also means there's no median and no percentiles, and a couple of other things are missing. If you want or need something like that, I recommend — okay, I thought another slide was coming up. This slide is supposed to show you that aggregation can happen at the client, it can happen at the local level, and it can also happen at the global level. So you can pre-aggregate stuff at the client if you don't want to expose detailed information about every single CPU, but only the global "I am using 90 percent of all my CPUs" information.
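The per-CPU example just mentioned might look like this in 5.2-era configuration; the GroupBy and Calculate option names follow the aggregation plugin's documentation as I recall it.

```
LoadPlugin aggregation
<Plugin aggregation>
  <Aggregation>
    # Subscribe to the per-core CPU metrics of each host ...
    Plugin "cpu"
    Type "cpu"
    GroupBy "Host"
    GroupBy "TypeInstance"
    # ... and emit one average and one sum per host and CPU state.
    CalculateAverage true
    CalculateSum true
  </Aggregation>
</Plugin>
```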
So if the limitations of the aggregation plugin make it unsuitable for your use case, you can use Riemann. Riemann is an event stream processor that can do a crazy amount of aggregation stuff and also emit additional events, and you can then use these created events to base your alerting on, for example. For instance, you can have an exponentially weighted moving average over a given fixed time window — that is something very similar to what the load average on Unix is doing, I think — and it also integrates with many storage options. So collectd, Riemann, and then some storage in the back make a pretty good team. And, very new and just recently released by Stack Exchange, there's Bosun. Bosun also has an expression language that can do all sorts of aggregations, and I believe they essentially hand the data on to OpenTSDB — so whatever OpenTSDB can do, I guess Bosun can do — and they also have a website here.

So with this, really quickly now, the last part: storage. Storage, again, can happen at all levels. You can have the global collectd store its metrics and have all the storage in one place, or you can go the entirely different way and have all the clients store their data individually, as I initially mentioned — not that that's super useful — or you can do something in between and store data at the local level, at this intermediate level here. You can also do several other things, like store some of the information at the local level and only selected, important information at the global level; there's some filtering language built into collectd to do this.

RRDtool, as I mentioned, is a command line tool, or a library, that stores data locally. There's a separate daemon available, rrdcached, that does some caching, and collectd can either do the caching itself and then write to local disk, or it can talk to a remote rrdcached via its protocol.
Graphite is getting a lot of good press these days. It's similar to RRDtool; it's written in Python, I think, and has a nice web interface that allows you to easily drill down into the data you collected whenever a problem crops up. collectd has had a write_graphite plugin for a while that does some intelligent buffering, and you can send huge amounts of metrics to Graphite fast. OpenTSDB is not that new; I think it's based on Hadoop and HBase, so you get a horizontally scalable time series database at the cost of getting all that Java stack on your machines. So if you're already using Hadoop or HBase, that might be a very interesting option; if not — well, I've not actually tried it out myself, it might be easier than I thought. There's a write_tsdb plugin in collectd that can send data there, and one of the earlier slides said that Bosun uses the OpenTSDB protocol, so — I haven't tried it — I think collectd should be able to send data there with this plugin as well. InfluxDB only came to my attention fairly recently. It is based on LevelDB, which I think is a key-value store library, so it stores data locally and exposes a network interface. The interesting bit is that it natively understands the collectd wire protocol, the binary protocol, and it also understands the Graphite wire protocol, so you actually have a choice of which protocol to use here. It provides an SQL-like query language, and for the last two years they have claimed that their clustering support is experimental; I have not yet verified how experimental it is. And last but not least, there's something called Vaultaire that I just learned about the other day, which unfortunately does not have collectd support yet — but there's a talk tomorrow, so go there, get briefed on Vaultaire, and write a collectd plugin. I'll be awaiting your pull request anytime.

So with this — do you have any questions?

So when an agent is collecting this information — so, when an agent is collecting
the information — how do I word this — which way does it go? Does the framework tell the agent, okay, I need a piece of this information, or does the plugin send the information back into the framework?

It depends. The vast majority of plugins is triggered at regular intervals by the core of the daemon, then goes off and does its thing, and submits values back to the daemon, and the daemon then forwards them to other plugins and so on. But you can also do this asynchronously: the network plugin, for example, listens on its port, and whenever a metric comes in, it will call the same internal dispatch function without being prompted to do anything.

As a follow-up: what's the granularity of the timing — how accurate is the time? The default setting is 10 seconds, 10-second intervals. At some point — I don't know which version that was exactly — we made a change to overcome the one-second finest granularity, so right now the finest granularity is 2 to the negative 30 seconds. So you can go pretty low, and you can poll really fast, at the expense of massive amounts of CPU at some point, of course. But if you want to do 10-millisecond intervals, that's possible.

You mentioned that you cannot do aggregations like p99 right now; are there any efforts to fix that? Sorry, say that again? You mentioned that you cannot do aggregations like, I don't know, p99, or percentiles, medians — are there any efforts to fix that? There's absolutely no technical reason not to have this; it's purely a matter of time. I unfortunately don't have the spare time right now to do much, and therefore this is on the wish list, but nobody has implemented it yet. As a general rule, we try to implement stuff that people actually want. So before we set off and implement all the aggregation functions that you find in some statistics book somewhere, we wait for people to request them, because the idea is to keep it as simple as possible while providing as much
functionality as people reasonably use. But in the case of percentiles and medians and so on, that was actually requested, and it is actually something we would really like to have; it's just that nobody is working on it right now.

There's a question: I've been playing with it for a bit, but I was wondering what your thoughts are around where the alerting should be going, to actually trigger alarms and that sort of stuff. So, there's some very simple thresholding code and events built into collectd that can do the most simple things. If it's really just "if this value goes above here, then send me an email", that's possible with what collectd brings with itself. For more serious alerting, there is a bridge to get at the values that collectd has collected from Nagios, so you can use Nagios to do your alerting — which seems to be what people who have Nagios set up anyway tend to do. Riemann can do these sorts of checks very well and integrates, for example, with PagerDuty, so you can get away with just collectd to Riemann and then send it off to PagerDuty to get some alerts. I believe those are the most common options; I'm not aware of a big elephant in the room otherwise.

When I was using collectd a year or two ago — I hope this isn't outdated — I noticed there were about four or five common dashboards for the RRD files, and they're all sort of 80% solutions. Are you (a) leaving it as is, (b) trying to get a common one, or (c) saying just use something from Graphite — how do you see that evolving? So, collectd does not do graphs, so my answer would probably have to be: don't care. I don't want to bundle any one of these dashboards with collectd and say that's our dashboard now. I kind of like that there is an ecosystem; I would just like the ecosystem to be more on the 95%-done scale than on the 80%-done scale. That said, I recently ran into — what's it called — Facette? I can look it up and then tweet it or
something — it seems really nice and really well thought out. And then, of course, there is in theory the option to give money to, let's say, Librato or Stackdriver and use whatever dashboarding they have, which is supported and well done and hosted and everything. So I guess, longer term, I would probably vote for keeping it as it is, as in: we're not having the collectd dashboard, but people are welcome to contribute to whatever open source dashboard they like best and make it even better. But I'll look up that one I have in mind — it's looking really promising, it looks awesome.

I understand this is probably not the normal use case for collectd, but is there anything in the architecture that would prevent me from running, say, two instances on one specific node? No, not at all — no binding of ports or anything like that. Actually, some of the plugins require privileges, and there's currently no very good story about dropping privileges and only keeping them for specific plugins. So what people sometimes do is run one instance of collectd with root privileges, for instance for pinging and doing the network requests, and then a second instance for doing all the other checks where no privileges are required. So no, there's no single lock that you have to hold or anything; that's totally possible, and people do it, so it's not as uncommon as you might think.

We have time for one more question. Yeah, I just wanted to ask if you can go into the roadmap a little bit, and whether you've got things like auto-discovery of which modules should be loaded, or caching if the network's down, that sort of thing planned. Yes — so, ever since my son was born early last year, I've hardly had any spare time to spend on this, and I'm really sorry. I am literally thousands of emails behind on the mailing list, and there are, I don't know, like 200 pull requests waiting in the GitHub repository, and I'm totally swamped for the little time that I can
spare. So I'm very sorry to everybody who was blocked or waiting on me — really sorry. I try to recruit more people onto the team, and Pierre-Yves and Fabien and whoever I forgot have been a huge help; they've been doing a lot of pull requests recently and they've been awesome. But as far as the roadmap goes, there's currently no really distinct roadmap of things that we do. Whatever patch is easiest to pull in gets pulled in first. So if you write a patch that is really well written and well documented and has all the build system changes it requires with it, and essentially all that's left is for someone from the committers list to take a look at it and say, oh, this is awesome — then it's probably going to be merged within a day or two, and you'll be hearing from me, because I'm giving out free t-shirts to people who contribute to collectd. But if it's a huge patch that does something really interesting but requires a lot of work on our side to verify that it's reasonable, and, I don't know, has weird coding-style things and trailing whitespace everywhere and no man-page entry, it's probably going to take a long while.

There are a couple of things I would like to do. For instance, the crypto I briefly pointed at in the networking code is vulnerable to a known-plaintext attack, and I would like to fix that and probably play with OpenSSL a bit — that kind of itches me, and it is probably going to be fixed in the not-too-distant future. I have another plugin that allows you to communicate between instances via TCP and SSL, authenticated, rather than UDP and essentially hand-rolled encryption; that's another thing I really want to get to, and it has been on my desk for years. So those are the things I would focus on, but my main focus really is on looking at pull requests, and whatever is the easiest to pull, I just pull right now. Does that answer your question halfway?

Right — so, thank you very much for attending my talk.
I'll be around for the rest of the day. I'll be wearing collectd t-shirts, and otherwise — well, you have seen me now, huh? So chat me up; I'd love to talk about your use case, your feature requests, and the awesome metrics you want to have. Yeah, thanks.