Hello everybody, and thank you all for coming to FOSDEM. Today we'll have a presentation about storing metrics at scale with Gnocchi. Thank you.

Thanks. Hi everyone. So today I'm going to talk to you about Gnocchi, which is a time series database. First I'd like to introduce myself briefly: I'm Julien. I've worked at Red Hat for a bit more than a couple of years now. I mainly work on OpenStack, which is written in Python, as you might know. OpenStack is a cloud computing platform, for those who don't know: you're supposed to run a lot of virtual machines on a lot of computers. I've done a lot of free and open source stuff for 15 years now, a lot of Python these last years. So today I'd like to talk about what Gnocchi is and why we created it, because I don't like to create projects for nothing; how it works and what it does, which is in Python, so it may be interesting for you; and how to use it, because you can use it from Python too, so it may be interesting. So that's it.

So it's a time series database. I imagine most of you know what a time series is, but it's basically a set of points composed of two things, a timestamp and a value, from which you can create nifty graphs and nifty charts. The first part of a point is the timestamp, nothing very surprising, and the other part is whatever you want to measure: the temperature in a room, or the CPU usage of a computer in our case. In the context of the OpenStack project, which is a scalable cloud platform, at least in theory, we needed something to store metrics about a lot of virtual machines, volumes, networks, everything you can imagine in an infrastructure. So we needed something that was scalable, which is not something you can find easily: a lot of software out there is not designed to run on a lot of computers. So that's something we wanted to find. We also wanted something that was easy to use.
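To make the definition concrete, here is a minimal illustration in plain Python (nothing Gnocchi-specific, just the data shape the talk describes): a time series is timestamped values you can aggregate and chart.

```python
from datetime import datetime

# A time series: each point is (timestamp, measured value),
# e.g. the CPU usage of one machine sampled every minute.
cpu_usage = [
    (datetime(2017, 2, 4, 12, 0), 21.5),
    (datetime(2017, 2, 4, 12, 1), 63.0),
    (datetime(2017, 2, 4, 12, 2), 48.5),
]

values = [v for _, v in cpu_usage]
peak = max(values)                   # highest usage seen: 63.0
average = sum(values) / len(values)  # mean usage over the window
```

A time series database like Gnocchi stores millions of such points and serves aggregates like the peak and average efficiently.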
Everything in OpenStack is controlled through an API, so we wanted a time series database that was programmable, something you could write programs against to query it, retrieve statistics, things like that. And something very easy to operate. What does that mean? OpenStack is mainly in Python and uses pretty simple technologies that we have known for a long time, like SQL or messaging queues. There's nothing big like Hadoop or Cassandra or things based on Java, for example, and nobody wants to install that, in our world at least. So what were the blockers for us? We took a look at existing solutions, such as Graphite, which is deployed a lot, but it's not scalable. It is in Python, and I wrote a nice blog post a couple of years ago about how not to write Python; you can take a look at the code of Graphite. It's not modular either. InfluxDB was the new cool kid in town. It's written in Go, but it didn't work for us: part of it went proprietary last year, so it's not really fully open source anymore, and it doesn't scale. OpenTSDB requires Hadoop: you need to set up Hadoop. There have been new things in the last couple of years like KairosDB, which I think uses Cassandra, which is not an option for us either, because nobody wants to set up Cassandra or Hadoop or anything like that. So we had to create something from scratch, which we called Gnocchi. It's part of the OpenStack project, but it's really not tied to OpenStack in terms of dependencies; it works standalone by default, like a regular database. We designed it to be easy to install, because you can use pip to install it, which makes a very big difference for a lot of our users. It's written in Python. It's free software. We also have something that you don't see very often: a strict documentation policy, where developers must provide documentation before anything is merged, which works pretty well.
We have pretty good documentation, and obviously it was designed to be distributed and resilient to failure, which is standard in cloud systems. And it has a lot of awesome features. I'm not going to explain all of these features today because it would take way too long, but it does a lot of things. It's integrated with standard tools such as collectd, Nagios, statsd, Grafana, those kinds of things. And it offers a REST API that you can work against, which I will show you just after.

So what do you need to know to use Gnocchi, and how does it work? There are a few data model concepts that you need to know; you will see them being used in the API. The metric, the orange one on the slide, is pretty obvious: these are the time series where you store your measures in Gnocchi. This is pretty easy to understand. In Gnocchi, everything is pre-aggregated. When you send measures to Gnocchi, it computes things like the minimum, the maximum, the average, all of this in advance. So when you want to retrieve them, it's very fast, which is something that we needed, because waiting minutes for the computation each time you want to draw a chart on the screen is not very useful. How you control this aggregation is defined by the archive policies. In an archive policy, you say: I want to compute the average, the minimum, the maximum, all these kinds of things, or the percentiles; I want to keep the data for a year, for example; and I want to do this aggregation every five minutes, every hour, every day, or whatever. You configure all of this in archive policies. All of this data model can be manipulated using the REST API. Now, with all these metrics where you store your data, it's not really obvious how to find them again.
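As a hedged illustration of what such a policy looks like: Gnocchi's REST API accepts archive policy definitions as JSON (the endpoint is `POST /v1/archive_policy`; the policy name below is made up, and the exact granularity/timespan spellings can vary between versions). This one keeps 5-minute aggregates for a day and hourly aggregates for a year:

```json
{
  "name": "server-policy",
  "aggregation_methods": ["mean", "min", "max", "95pct"],
  "definition": [
    {"granularity": "5m", "timespan": "1 day"},
    {"granularity": "1h", "timespan": "365 days"}
  ]
}
```

Every metric created under this policy gets those aggregates pre-computed at both granularities.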
When you monitor a lot of things, let's say a few hundred or a few thousand virtual machines, you have a lot of time series to store in the database, and you want a way to organize them, to find them again. So what we do is create resources. Resources are exactly like classes in Python: you have resource types where you declare a type of resource, for example an instance, a virtual machine, and you declare attributes, like the host it's running on, the flavor, these kinds of things. And you can instantiate them as resources and link metrics to them. For example, you can attach a metric for CPU or memory usage, or if your resource is an application, you can have a metric that is the number of visitors. All of this is stored in an index. The index is in charge of keeping the relationships that I drew on the slide before in a storage system.

So we have three storage systems in Gnocchi. This is the architecture. It's not really simplified; it's actually how it works, and it's pretty simple. You have the users talking to the API, which is a REST API, nothing very complicated. The measures a user sends are stored in a temporary storage, which we call the incoming measure storage. It's usually the same backend as the metric storage, but it can be different. And you have another process, which is called metricd. You just have to follow the red arrows here. The metricd process is in charge of getting these new measures out of the incoming storage, computing the new aggregates, the new average, the new minimum, the new maximum, for each of the granularities you configured in the archive policies, and storing them back into the metric storage. The metric storage is the long-term storage. And the relationships are stored in the index, which is the last piece of storage in Gnocchi. So we have three different storages, which are three different technologies. Why? Because it's all different data.
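As a rough, stdlib-only sketch (this is not Gnocchi's actual code, and the storage containers here are stand-ins), the metricd step just described boils down to: drain raw measures from the incoming store, bucket them by granularity, compute the configured aggregates, and write the results to long-term storage.

```python
from collections import defaultdict

def process_new_measures(incoming, metric_storage, granularity=300,
                         aggregates=("mean", "min", "max")):
    """Drain (unix_timestamp, value) pairs and store pre-computed aggregates.

    `incoming` plays the role of the incoming measure storage;
    `metric_storage` is a dict mapping (bucket_start, aggregate) -> value.
    """
    buckets = defaultdict(list)
    while incoming:                        # drain the temporary storage
        ts, value = incoming.pop()
        buckets[ts - ts % granularity].append(value)

    for start, vals in buckets.items():    # aggregate each time bucket
        for agg in aggregates:
            if agg == "mean":
                result = sum(vals) / len(vals)
            elif agg == "min":
                result = min(vals)
            elif agg == "max":
                result = max(vals)
            metric_storage[(start, agg)] = result
    return metric_storage

storage = {}
# Two measures fall in the 0-299s bucket, one in the 300-599s bucket.
process_new_measures([(0, 10.0), (60, 20.0), (310, 5.0)], storage)
```

Because the aggregates are computed once at ingestion time, reading them back later is a simple key lookup rather than a scan over raw points.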
Having everything stored in SQL, for example, wouldn't work: storing millions of timestamped points in an SQL table is not really an option, and it isn't scalable. Gnocchi itself, both the API and metricd, the daemon doing the computation, is scalable: you can run as many of them as you want and need. And they're all coordinated with something that we call a coordinator, which I'll talk about after; it's also something we built in Python to help us. As for the index, it's currently a driver-based system. If you saw the talk before mine, which was about decoupling, we do a lot of that in OpenStack in general, and we did that in Gnocchi: everything is driver-based. So we have a driver interface for the index storage, and currently the only driver is one based on SQLAlchemy. We leverage a lot of features in PostgreSQL, so we recommend that one, but it also works with MySQL. The storage for measures is either plain files, if you don't need any scalability, or Ceph, which is our best storage driver because Ceph offers the best performance for this storage. We also have Swift, which is an object store equivalent to Amazon S3, and we know how to use S3 too. They're less efficient, but they work well and are very, very scalable: you can store hundreds of terabytes on that kind of storage.

We have two magic libraries that we use in Gnocchi. The first one is Carbonara. It's not released as a standalone library; it's embedded in Gnocchi for now because its API changes a lot. It's based on pandas and NumPy. So we started with pandas a couple of years ago, which is very good because it makes it very easy to write any kind of statistics, any kind of computation you want on your time series, but it's very slow. It's very, very slow. When metricd, the daemon responsible for doing the computation, does this kind of computation over and over again, using pandas is quite slow.
So we're slowly rewriting the parts of pandas that we use in Carbonara using NumPy directly, which is way faster. But pandas was a very good start: it did a lot for us and gave us something working from the beginning. There's another library that we created a couple of years ago, and it's Tooz. Tooz is again an abstraction layer — we do a lot of that — over all kinds of backends: memcached, ZooKeeper, SQL, etcd. It provides two things. First, distributed lock management. It does not provide the distributed lock itself, but it can leverage memcached or ZooKeeper or etcd, or whatever — even MySQL works — to give you a DLM, a distributed lock. When we created Gnocchi, for the coordinator part that I showed earlier, we needed a distributed lock just to say: I'm working on this metric, and I need to lock this metric for a while during the computation. We had the option to depend on something like ZooKeeper, which is pretty resilient and scalable. But if you tell users and operators that your software depends on ZooKeeper, they're not going to install your software, because they don't want to run ZooKeeper. A few operators would be happy to run ZooKeeper because they already have a cluster using it, but most of the time they don't want to use it. And the same goes for developers: if you tell them, you want to test Gnocchi, well, you need ZooKeeper on your laptop — they're not going to be happy. So we have this abstraction layer, which allows us to configure the backend dynamically in the software. If ZooKeeper is available, which is pretty solid for production use, you can use it; but if you don't have ZooKeeper, you can just spawn memcached and it will work the same way. You can actually build a poor man's distributed lock on top of memcached.
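A sketch of what that looks like with Tooz's coordination API (the backend URL, member ID, and lock name below are placeholders for this example, and a memcached instance would have to be running at that address; swapping the URL for `zookeeper://...` or `etcd3://...` is the whole point of the abstraction):

```python
from tooz import coordination

# Backend URL and member id are assumptions for this sketch; any
# Tooz backend works the same way through this interface.
coordinator = coordination.get_coordinator(
    "memcached://127.0.0.1:11211", b"metricd-worker-1")
coordinator.start()

# Hypothetical lock name: one lock per metric being aggregated.
lock = coordinator.get_lock(b"gnocchi-metric-lock")
with lock:
    # Only one worker across the whole cluster holds this lock at a
    # time, e.g. while aggregating new measures for one metric.
    pass

coordinator.stop()
```

This is how a Gnocchi deployment can start on memcached and move to ZooKeeper later by changing one configuration line.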
Tooz also provides another set of primitives: group membership. That's pretty helpful for managing a group of processes in Python. If you have processes spread across a lot of nodes doing computation, or basically jobs you want to distribute across a lot of processes on a lot of nodes, ZooKeeper can do that, and Tooz makes it work with a lot of other drivers too, like memcached. It doesn't work with every backend, but it makes it pretty easy to have groups of nodes doing the same thing and to distribute work among them.

So now, how do you use it? It's pretty easy. You can use pip to install the server and the client. We use the extras system, so between brackets you say which drivers you want to use, and that installs the right dependencies — we don't install all the dependencies by default. So `gnocchi[file,postgresql]` is going to install the file driver dependencies and the PostgreSQL dependencies, plus the client. Then you just have to edit the configuration file, where you only need to change a couple of lines, because everything works by default. The upgrade command creates the initial data in the SQL database, and then you have only two daemons to run — the REST API and the metricd daemon — and you can run any number of workers for these daemons on any number of nodes. Once you've done that, you can use the gnocchi command-line tool. This is pretty small, sorry, but in the output, the first command is `gnocchi archive-policy list`, which just lists the archive policies. There are three by default, with more or less retention — that may not be what you need — and on the right you can see the aggregation methods that are used, like standard deviation, number of points, 95th percentile, minimum, maximum, etc.
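In shell, the installation steps just described look roughly like this (the extras and daemon names follow what the talk describes, but exact spellings can differ between Gnocchi versions, so treat this as a sketch):

```shell
# Install the server with the file storage and PostgreSQL index
# drivers, plus the command-line client.
pip install 'gnocchi[file,postgresql]' gnocchiclient

# Initialize the index (SQL schema) and the storage.
gnocchi-upgrade

# Run the two daemons; each can be scaled to many workers and nodes.
gnocchi-api &
gnocchi-metricd &

# List the default archive policies.
gnocchi archive-policy list
```

Everything past the configuration file edit is those two daemons; scaling out is just running more of them.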
The second command is `gnocchi metric create`, which is used to create a new metric. Once you do that, you have a new metric whose ID is right there: it's a UUID created for you by Gnocchi, and you can use this UUID to send a bunch of measures. That's what the next command does: it adds measures, each a timestamp and a value, to the UUID from just before. Then you can request the measures back, and you'll see that by default it returns the average: the average has been computed for the whole day here, for the whole hour, and for the two five-minute slots that we sent. Same thing for the minimum: the smallest value we sent shows up as the minimum for the day, and 42 is the minimum for this five-minute interval here — it's this point here, so that works. It also computes percentiles if you want; if you do network stuff, that's useful — network people love their 95th percentile.

As I said, all these metrics can be pretty hard to organize, so we have our resource system. What we do is create a resource type. There's a basic resource type called generic, which you can use; it's like the object class in Python, just a plain base object. Here I create a resource type with two attributes: a name, which is a string, and a host, which is a string too. Then I create a resource with its name set, its host set to the compute node it runs on, and two metrics, CPU and memory. The type is server, which is the type I created just before. And I generate the UUID here myself, because you can pick the ID you want for your resources: the UUIDs of metrics are created directly by Gnocchi, but for resources you can supply your own ID, which is useful if your objects already have IDs in an outside system. It returns the object you have created. Then you can obviously update the resource. What happens a lot, for example, in a cloud
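The CLI session just walked through looks roughly like this (`$METRIC_UUID` stands for the UUID printed by the create command; the `timestamp@value` measure format and flag spellings follow the gnocchiclient CLI, but may vary by version):

```shell
# Create a metric; Gnocchi prints a UUID for it.
gnocchi metric create --archive-policy-name low

# Send measures (timestamp@value) to that metric.
gnocchi measures add \
    -m 2017-02-04T13:00:00@42 \
    -m 2017-02-04T13:05:00@44 \
    $METRIC_UUID

# Read back the pre-computed aggregates: mean by default, min on demand.
gnocchi measures show $METRIC_UUID
gnocchi measures show --aggregation min $METRIC_UUID
```

Because the aggregates were computed at ingestion time, the `show` commands return immediately at every configured granularity.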
computing system is that when you create an instance, you have to move it at some point to another host. That's what happened in this example: I changed the host attribute from compute1 to compute2, so the server has moved to another machine, and the attribute here is updated. I can then retrieve the history of the resource, so I have a whole timeline of its history stored here — in this case I asked for the JSON format. The first entry says the host was compute1, with revision start and end fields indicating during which part of the timeline this version of the object existed; then it changed to compute2 at this time here. So I can retrieve the whole history of a resource object, which is pretty handy. Since my object has metrics attached to it, here a CPU metric, I can directly send measures to the CPU metric of this resource, and in the same way I can show the values back, which are shown here.

Like I said, there are a lot of features. Obviously we have a database with all of this information in it, and you can do any kind of lookup. Here I'm doing a simple search based on the host — all the servers that run on compute2 — and I get my server here. You can do any kind of query on the history of the resources too, and you can search on the metric values as well: if you want to retrieve the list of servers that used too much CPU, you can just send a query to Gnocchi to retrieve it. And it's not tied to infrastructure only: in our OpenStack context we talk a lot about instances, volumes and networks, but it's applicable to any kind of resource. If you have an application, a website with visitors or anything, you can use this whole system with your own metrics.

So, how to use it in Python? It's pretty simple. You just have to use the SDK, which is gnocchiclient, and its Python API. First you create a client, specifying the URL of the API, and then you can
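For reference, Gnocchi's search endpoint takes such queries as JSON (the endpoint is `POST /v1/search/resource/<type>`; the resource type and attribute values below mirror the example in the talk and are otherwise placeholders). The "servers on compute2" lookup would be:

```json
{"=": {"host": "compute2"}}
```

and queries can be combined with boolean operators, for example "servers on compute2 created after a date":

```json
{"and": [
  {"=": {"host": "compute2"}},
  {">": {"started_at": "2017-01-01"}}
]}
```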
create a metric, add measures, create a resource, get a resource, all with a simple call. Here it creates a generic resource, which exists by default, so it doesn't have any extra attributes, just an ID — I use a string here — and a metric, the number of visitors. Then I send the number of visitors for these different timestamps, and I can retrieve them here, and it returns the list of measures: I just sent a couple of measures here, and I retrieve them here. And the equivalent gnocchiclient command line is just here.

It also has Grafana support. I'm not sure you can see it, but this is Grafana: we have a plugin for Grafana, which is pretty easy to install from the Grafana repository. You can plug Gnocchi in as a data source, and in just a few clicks build a dashboard showing any kind of metric that you have in your infrastructure or application.

So that's it. We have pretty good documentation on gnocchi.xyz, and we have an IRC channel if you want to hang out and ask any questions. And that's it, thanks.
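The Python flow described above looks roughly like this with gnocchiclient (the endpoint, credentials, resource ID, and archive policy name are placeholders, and exact keyword handling can differ between client versions, so this is a sketch rather than the talk's exact code):

```python
from gnocchiclient import auth
from gnocchiclient.v1 import client

# Placeholder endpoint and user for a local Gnocchi with basic auth.
auth_plugin = auth.GnocchiBasicPlugin(user="admin",
                                      endpoint="http://localhost:8041")
gnocchi = client.Client(session_options={"auth": auth_plugin})

# Create a generic resource with one metric attached to it.
gnocchi.resource.create(
    "generic",
    {"id": "my-website",
     "metrics": {"visitors": {"archive_policy_name": "low"}}})

# Send measures, then read the pre-computed aggregates back.
gnocchi.metric.add_measures(
    "visitors",
    [{"timestamp": "2017-02-04T13:00:00", "value": 42},
     {"timestamp": "2017-02-04T13:05:00", "value": 57}],
    resource_id="my-website")
print(gnocchi.metric.get_measures("visitors", resource_id="my-website"))
```

Addressing the metric by name plus `resource_id`, rather than by UUID, is what makes the resource system convenient from application code.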