Okay, I think we are right on schedule. So, good morning everyone. This is the third year in a row that we have the honor of presenting here at the OpenStack Summit, so I hope you enjoy it. My name is Alejandro Comisario and I'm the CTO at Nubeliu.com, a Latin American company that makes OpenStack happen for Latin American companies. Today I'm going to talk about how to get metrics not only from OpenStack, not only from KVM, but also from the big cloud players: VMware, AWS, and Azure. In the best-case scenario, you can do it with exactly the same tools you use today to get performance metrics, utilization, and other information from an OpenStack cloud: Ceilometer and Gnocchi.

So let's go through the agenda. First, a quick run-through of what Gnocchi is. How many of you don't know what Gnocchi is? Perfect, so most of you already know, which helps me calibrate how much to say about it. Second: we already have metrics in OpenStack. They work great, most of all, as you'll see, if you use Gnocchi as the final backend for those metrics. But what about the rest of your metrics, once you are experienced enough to know what to do with the information you already have from OpenStack? We identified three more big players, VMware, AWS, and Azure, that we can get information from. Most of our customers use these four cloud technologies at the same time, as a hybrid cloud, so once they really understand what's going on with OpenStack, they want to know what's going on with the rest of the technologies they're using. That's the next chapter. After that, I'll try to show you what we built. Unfortunately, I'm not going to do it live, because our environment has been shut down, but I recorded a video, so I think we'll be good.

So let's talk about Gnocchi. Gnocchi is a time series database. As of a couple of days ago it's no longer an OpenStack project; there on the slide is the link to that news, it happened just last week. We consider it the Ceilometer collector's best friend: if you're using MongoDB, seriously consider moving your metrics to this backend, which is the one that actually works. It's a multi-tenant time series database: every metric you put into Gnocchi is owned by a specific project, and in the quite near future it will also belong to a specific domain. It's really high performance. It was built to work really fast, basically because all the information you put into Gnocchi is pre-aggregated. If you want to know, for example, what's going on with instance uptime every five minutes, the measures you push into Gnocchi are pre-calculated for you, and you can view them at five minutes, one week, three months, or one year, and all that information already exists, computed by a part of Gnocchi that we'll visit in a moment. It uses two different backends: one for the actual storage, where the measures live, and one for the index, the backend that ties together the moving parts of Gnocchi, which are the resources, the archive policies, and the metrics themselves. And it supports lots of different storage backends.
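To make the pre-aggregation idea concrete, here is a minimal sketch of creating an archive policy through the Gnocchi REST API with Python's requests library. The endpoint, token, and policy name are placeholders, not values from the talk:

```python
import requests

GNOCCHI = "http://gnocchi.example.com:8041"      # assumed endpoint
HEADERS = {"X-Auth-Token": "<keystone-token>",   # placeholder credential
           "Content-Type": "application/json"}

# An archive policy tells metricd which granularities to keep, for how
# long, and which aggregates (mean/min/max...) to pre-compute.
policy = {
    "name": "demo-policy",
    "aggregation_methods": ["mean", "min", "max"],
    "definition": [
        {"granularity": 300,   "timespan": "7 days"},    # 5-minute points for a week
        {"granularity": 3600,  "timespan": "90 days"},   # hourly points for 3 months
        {"granularity": 86400, "timespan": "365 days"},  # daily points for a year
    ],
}

resp = requests.post(f"{GNOCCHI}/v1/archive_policy",
                     headers=HEADERS, json=policy)
resp.raise_for_status()
print("created archive policy:", resp.json()["name"])
```

Any metric attached to a policy like this gets all three granularities computed in the background, which is why reads stay fast regardless of the time range you ask for.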
You have lots of different backend options, depending on what setup you already have in OpenStack. And if you don't want to use it with OpenStack at all, because it can actually work standalone, you can run instances on AWS and push metrics directly there, using pretty much the technology you already have; we'll visit that in just a moment. And the great thing is that it's not only a time series database, it's time series as a service, sitting behind a REST API, which is amazing because it means you can easily reuse most of the tools you already have on whatever technology you're using among the ones I named previously.

So let's do a quick run-through of what Gnocchi's architecture looks like and which backends you can use, and here we're always talking about the measures and the aggregation, right? The preferred backend, the fastest and most efficient one for storing the data, is Ceph. The default backend is file: if you want to use file and you're deploying Gnocchi highly available and highly scalable, you can share the directory, for example over NFS, across all the Gnocchi nodes, and the file backend works just fine. If you don't have Ceph, or you're working on a public cloud like Amazon, since version 3.1 you can use Amazon S3 as the storage backend, which works pretty well; it needs a little bit of a workaround, but it works really well. If not, you have other options, like OpenStack Swift if you already have an object store: for example, if you're on full block storage with a technology that's not Ceph, so Ceph is not an option, but you have a strong object storage setup, you can use Swift. Or you can use Redis, if you're skillful enough or trust a Redis cluster to hold all the information; that option is great too.

So, what happens? The clients, which are us, interact directly and only with the Gnocchi API. The Gnocchi API writes to the first backend, the incoming measure storage, a temporary store where the measures generated in real time land first. Then we have a background process called gnocchi-metricd, which is responsible for taking that temporary information and doing the pre-aggregation based on the archive policies. An archive policy in Gnocchi is how you define what to compute for any metric: I want a value every five minutes kept for one week, hourly values for three months, daily values for one year, and at each granularity I want the minimum, the maximum, and the average of that same value pre-calculated. The one responsible for all that processing is metricd, a process that you scale the same way you scale any OpenStack service, and it also runs amazingly well in containers. The aggregates produced by gnocchi-metricd land in the final metric storage, which can be different from the temporary incoming storage, and will be one of the options I named previously, the ones on the left. And then we have the indexer. The indexer is, most of the time, a relational database. The preferred one would be PostgreSQL, but it works amazingly well on MySQL; we run it on MySQL and it works great.
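For reference, this is roughly what those choices look like in Gnocchi's configuration file; a minimal sketch of /etc/gnocchi/gnocchi.conf assuming a Ceph aggregate store and a MySQL indexer, with placeholder hosts, pool names, and credentials:

```ini
[indexer]
# Relational database holding resources, resource types, metrics and
# archive policies (PostgreSQL preferred, MySQL works fine).
url = mysql+pymysql://gnocchi:secret@db.example.com/gnocchi

[storage]
# Final aggregate storage. "ceph" is the preferred driver; "file",
# "swift", "s3" and "redis" are the alternatives discussed above.
driver = ceph
ceph_pool = gnocchi
ceph_username = gnocchi
ceph_conffile = /etc/ceph/ceph.conf

[metricd]
# Background workers doing the pre-aggregation; scale this number up
# as the metric volume grows.
workers = 4
```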
And as I said, the indexer is responsible for saying: okay, I have a resource type, instance; I have resources of that type, which are all the instances I have running, not only on my OpenStack cloud but possibly on a different technology; those resources have metrics, and those metrics belong to archive policies. Getting all of that stored and related is the indexer service's job.

So, why Gnocchi? It's fast, it's really fast. For those of you used to waiting 10, 20 minutes or more for a query from the Ceilometer API with MongoDB as the backend, I'm talking about exactly the same query in less than 10 or 20 seconds. I mean, it's really fast, most of all if you use it with the preferred backend, which is Ceph. And that ties directly into the second point: if today you want Ceilometer in OpenStack, you need to go with Gnocchi as the preferred final storage backend. It scales for real. Scaling for real means the API is efficient enough to serve thousands of metrics per second from a medium to fairly large cloud setup, and if you need more APIs, you can spawn them as easily as you spawn any other API; as I said, it works amazingly well in containers. Exactly the same goes for the metricd processes, which are the ones that actually use CPU: it's as easy as spawning as many as you wish until you get the throughput and timing on metric processing that you need. There is no hard storage dependency. As I showed you previously, you have lots of storage options. So again, if you're running on AWS and you don't want to deploy a Ceph cluster on virtual machines, or maybe you do for cost reasons, you can; if you're already using S3, you can just point at S3; if you want to use EFS, for example, to share a file system among lots of Gnocchi instances, you can do that too. The options are enough to adapt to whatever scenario you might have. And it will not leave you hanging. Not leaving you hanging means something really weird would have to happen on the storage backend before you start feeling the API degrade as the number of metrics grows; that really doesn't happen. We have lots of metrics in Gnocchi, where we push all of our customers' virtual machine instances, and not only the virtual machines but also the hypervisors, and it works really amazingly well.

For those of you who don't know how things are done in OpenStack: we mainly have two entities we're going to get metrics from. We have the compute nodes, which are the hypervisors, whose metrics the Ceilometer central agent is responsible for collecting. And then we have the virtual machines: the Ceilometer compute agent plugin running on the compute nodes is responsible for connecting, for example on KVM, with libvirt, and getting the metrics for every virtual machine. And both for the hypervisor and the virtual machines, they push the information they collect onto the OpenStack notification bus, which most of the time is a RabbitMQ cluster dedicated to this task; you don't want to use exactly the same RabbitMQ cluster that you're using for notifications and RPC on the rest of the cloud.
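To make the speed point concrete, reading pre-aggregated measures is a single GET against the metric; a sketch with placeholder endpoint, token, and metric ID:

```python
import requests

GNOCCHI = "http://gnocchi.example.com:8041"          # assumed endpoint
HEADERS = {"X-Auth-Token": "<keystone-token>"}       # placeholder credential
METRIC_ID = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"   # hypothetical metric id

# Because metricd already pre-aggregated the data, this returns the
# stored 5-minute means directly; nothing is computed at query time.
resp = requests.get(
    f"{GNOCCHI}/v1/metric/{METRIC_ID}/measures",
    headers=HEADERS,
    params={"granularity": 300, "aggregation": "mean"},
)
resp.raise_for_status()
for timestamp, granularity, value in resp.json():
    print(timestamp, value)
```

The response is just the stored aggregates, so query cost doesn't grow with the raw sample count, which is why the MongoDB-minutes become Gnocchi-seconds.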
Then, after we've got all that information, we have the Ceilometer notification agent, which is responsible for running what are called transformers. A transformer could be just reading the metric and pushing it back to a temporary queue on RabbitMQ to be read by the Ceilometer collector, or it could be, for example, calculating in flight the average of a specific value based on the previous samples. That is the job of the notification agent, which scales as well as any other entity we have here in the presentation. Then we have the Ceilometer collector, which pushes directly to Gnocchi, and then we have the whole Gnocchi infrastructure, which is pretty much what I talked about previously. And then I drew our Rocket platform, which is our way of showing our customers the metrics, everything that lands in Gnocchi; I'll try to show you a bit of what we're doing with those metrics. Then again the indexer, responsible for putting together the relationships between resources and metrics, which most of the time will be, I don't know, a MariaDB Galera cluster, could be a PostgreSQL cluster, or, if you're running on AWS, it runs pretty well on the RDS service. And metricd, responsible for the pre-aggregations that let Gnocchi answer any query really fast. And that's it.

So, yes, it works. It works amazingly well, because everything was built to work with OpenStack. But our customers said: okay, what you're showing me out of Gnocchi about the utilization of my cloud is amazing, but still, what can I do? I'm running AWS, I'm running Azure, I'm running VMware, which is not connected with OpenStack, not managed by OpenStack. What can I do with that information? I don't want four or five tabs open in my browser to explore it. So we thought: we know Ceilometer, we know it works really well, we know the architecture, we've seen it behave amazingly with two VMs and with thousands of VMs. Why not use that knowledge to get the metrics from all the players with a single architecture, a single way of doing things? That's exactly what we did, and we started to dig into how Ceilometer works, how you can plug into it, for example, an Azure compute poller, and define a specific namespace to get metrics from a provider that is not OpenStack.

What we did with VMware, what we're looking at here, is pretty much exactly how it's done today with an OpenStack-managed vCenter. You have the Ceilometer polling agent, in the compute namespace, connected to the vCenter API; but on the OpenStack-managed side it makes lots of requests to Keystone and then calls Nova to ask which instances actually exist on this vCenter deployment. We made some safe assumptions about how to do this without OpenStack in the back: which projects those vCenter instances, which have nothing to do with OpenStack, should belong to; which domain they are part of; and what to do about flavors, since there is no flavor information, so we work from the combination of memory, CPU, et cetera. All of that is configurable through the Ceilometer config file.
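Going back to the transformers for a moment, this is roughly what one looks like in Ceilometer's pipeline.yaml; the sketch below is the classic rate-of-change transformer that derives a cpu_util percentage from the cumulative cpu counter (the interval and the source/sink names are illustrative):

```yaml
sources:
    - name: cpu_source
      interval: 600            # poll every 10 minutes
      meters:
          - "cpu"
      sinks:
          - cpu_sink
sinks:
    - name: cpu_sink
      transformers:
          # Convert the cumulative nanoseconds-of-cpu counter into a
          # gauge expressing percentage utilization per vCPU.
          - name: "rate_of_change"
            parameters:
                target:
                    name: "cpu_util"
                    unit: "%"
                    type: "gauge"
                    scale: "100.0 / (10**9 * (resource_metadata.cpu_number or 1))"
      publishers:
          - notifier://
```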
So you can place those configurations there and launch the Ceilometer agent as easily as you launch it against anything that is actually running OpenStack, and it will start getting information about every virtual machine managed by that vCenter, using the same infrastructure and the same architecture you already have in place.

Something a little different happens with our version for Azure and Amazon Web Services, basically because in both cases you have credentials, credentials to log in and call the API. So what we did is place Barbican into the picture, the official key management and secret management API for OpenStack. For every customer we have, we put their secrets into Barbican, inside a project that we assign to that customer; a single project might hold, say, ten credentials, five for Azure and five for AWS, for example. Then we developed the Hybrid Puller, a Ceilometer plugin that is intelligent enough to go through Barbican and say: okay, I have all these customers, all these projects; I'm going to read all the credentials I have in Barbican and use them to start calling the APIs and getting the information from Azure and Amazon Web Services. And we made sure that this same Hybrid Puller pushes the metrics with the same names and the same format as is done for KVM. So if, for example, you are already exploring the information you have in Gnocchi, you make exactly the same query, just with one specific piece of metadata that exists in Gnocchi, the provider; you need that metadata to know which virtual machine belongs to which technology or provider, right? The provider is a specific attribute on resources that lets you tell which technology each one is running on. And then, again, that gets pushed to RabbitMQ, read by the Ceilometer notification agent where the transformers run, pushed back to RabbitMQ, and the Ceilometer collector is responsible for calling the Gnocchi API as fast and as hard as the volume of metrics requires.

So what else can you do with all that? You can do showback and chargeback with exactly the same information. We gave a presentation at the 2016 Austin Summit, I'll leave you the link, about how to do showback and chargeback using another project called CloudKitty, which is officially in the big tent; with it you can do lots of things, not only seeing what's going on in the cloud, but charging for what's going on in the cloud too. And you can do capacity planning. We also gave a presentation at the Barcelona Summit, which was amazing, where we showed how to try to foresee the future: okay, this is what's going on with my application today in terms of performance; what will happen a year from now if I don't change anything? What's going to happen with the CPU, with the storage, with lots of things? And you can run simulation scenarios: what happens if I change the flavor? What happens if I double the requests entering my application? You can get exciting information from that too. And you can do cloud monitoring and analytics, which is pretty much what we are trying to do right now.
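Returning to the Hybrid Puller for a second, here is a simplified, hypothetical sketch of its Barbican side, reading a stored cloud credential through python-barbicanclient with a Keystone session. Everything here (URLs, names, the load_credentials helper) is illustrative, not the actual plugin code:

```python
# Sketch of the Hybrid Puller idea: read per-customer cloud credentials
# from Barbican, then poll the external provider and publish samples
# under the same metric names Ceilometer uses for KVM guests.
from keystoneauth1 import session
from keystoneauth1.identity import v3
from barbicanclient import client as barbican_client

auth = v3.Password(auth_url="http://keystone.example.com:5000/v3",  # placeholder
                   username="hybrid-puller", password="secret",
                   project_name="metrics",
                   user_domain_id="default", project_domain_id="default")
sess = session.Session(auth=auth)
barbican = barbican_client.Client(session=sess)

def load_credentials(secret_ref):
    """Fetch one cloud credential stored as a Barbican secret payload."""
    return barbican.secrets.get(secret_ref).payload

# With the decoded credential, the puller would authenticate against
# AWS or Azure, list the instances, and emit measures tagged with a
# "provider" resource attribute so Gnocchi queries can tell the
# technologies apart.
```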
We are trying to get from Gnocchi the information that helps us know what's going on with the instances, not only on OpenStack but also on the rest of the providers. So, again, I was going to do a live demo, but the developers insisted on doing a migration right in the middle of my presentation, so they asked me to use a video; sorry for that. I'll start and stop it as well as I can to show you how this works.

This is the dashboard. It's a Horizon dashboard that we've modified with a couple of plugins. The first layer shows all the providers that we have. This is a Gnocchi call: okay, give me all the resources ordered by provider. So we can quickly see what's going on with a specific customer in a specific domain; in your case it would just be a domain; what's going on with all the resources carrying a specific piece of metadata, which is the provider. We can build this tree, organize the information, and show our customers what's going on with everything they have, no matter where the cloud is. You can see all the instances on every provider, and if we're talking about OpenStack, I can see not only the instances but also where they're running, by hypervisor. Then, if I click on a hypervisor, I can query Gnocchi and see what's going on with the resources running on that specific hypervisor, and I can draw graphs like this one, showing all the instances running on that hypervisor: not only which instances, but also what's going on with them. We have a color-coded way of showing things: gray means that for a resource that actually exists, we have no metrics at all, or no metrics within the last hour, so the instance is essentially doing nothing. The rest of the colors go from green to red, against predefined values we assume to be safe for CPU usage, memory, networking, and storage, so you know whether the instance is actually doing something, which could be great or could be a problem depending on the instance, and it's all configurable. It's also important for us to show the customer the optimal, wasted, and idle instances, so they can quickly see the state of a compute node or a specific provider. That is information a user can get plenty of use from, and in the same way you can see which instances are powered off, which are overcommitted, and which are idle; idle meaning they're doing essentially nothing despite the flavor they're using, right? Perfect.

And if you click on a specific instance, we call Gnocchi and say: okay, give me all the metrics I have for this specific instance, and we show you the most important ones: the CPU, the memory, the networking, and the storage usage. We define a health marker, which is pretty much a way of saying this instance is stressed, if you want to put it that way, and we also show thresholds defined on these specific metrics. This is not the newest version.
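The Gnocchi query behind that per-provider tree is a resource search. Here's a sketch with requests; note that the "provider" attribute is the custom extension to the instance resource type described in this talk, not a stock Gnocchi attribute:

```python
import requests

GNOCCHI = "http://gnocchi.example.com:8041"     # assumed endpoint
HEADERS = {"X-Auth-Token": "<keystone-token>",  # placeholder credential
           "Content-Type": "application/json"}

# Find every instance resource tagged with a given provider. The
# "provider" attribute is the talk's custom resource-type extension.
query = {"=": {"provider": "vmware"}}
resp = requests.post(f"{GNOCCHI}/v1/search/resource/instance",
                     headers=HEADERS, json=query)
resp.raise_for_status()
for res in resp.json():
    print(res["id"], res.get("display_name"))
```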
The newest version will show a graph for every metric, with a threshold defined per metric, because 80% CPU utilization might be a critical threshold that is not the same as, for example, the right threshold for memory or network utilization. But we show the user the predefined thresholds for all the metrics, a way for them to know what's going on with that specific instance, along with a few performance tips: you can change the flavor, or you can resize the instance up or down.

Then we have a quick way of showing the user the real-time metrics, which they can filter: for example, I want only the OpenStack provider instances. We call Gnocchi and say, give me all the instance resources whose provider is OpenStack, give me all the resources where the host metadata ends with SM03, and we call the API. Then you can filter down to just those instances and take a really quick look at their metrics, at their performance. We show the main ones, the ones we think matter to the user, like memory and CPU utilization, so you know what's going on with performance, and you can change granularities, right? And this is, again, exactly the same Gnocchi call: I don't want one-hour granularity, I want it by day, by week, monthly, or maybe yearly. The user can go full screen, or zoom in on a specific metric and watch how the rest of the metrics update with it, because you might be interested in knowing: okay, this happened to my instance's CPU a day ago, so what happened to the rest of the resources when the CPU went up? How does a specific metric figure into the overall performance of my application? We have exactly the same view for the hypervisor, because, again, it's a resource, just a different resource type from the instances, so you can look at it the same way. And again, this is really important: everything here comes from Gnocchi. We don't put anything weird in the middle to show this information.

Then we have a more practical and analytical way of showing things, which is the dashboards. A dashboard is a graphical representation of specific metrics: what's going on with memory usage, with CPU utilization, with disk utilization. And since Gnocchi allows it, you can build graphs like this one and start interacting: what's going on with my OpenStack provider, what's going on with my AWS instances, what's the CPU utilization of OpenStack? I want to see just OpenStack; I want to see the instances behind this average CPU utilization and how they break down; or I want to take AWS out of the picture. This is something you can make pretty much fully interactive once you get the information directly from Gnocchi. You can see what's going on with the disk utilization of, again, a specific provider.
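The host filter and the granularity switch are, again, plain Gnocchi calls; a sketch combining both, with placeholder endpoint, token, and host suffix:

```python
import requests

GNOCCHI = "http://gnocchi.example.com:8041"     # assumed endpoint
HEADERS = {"X-Auth-Token": "<keystone-token>",  # placeholder credential
           "Content-Type": "application/json"}

# Combine attribute filters the way the dashboard does: OpenStack
# provider (custom attribute, as before) AND a compute host whose name
# ends in "SM03".
query = {"and": [
    {"=": {"provider": "openstack"}},
    {"like": {"host": "%SM03"}},
]}
resp = requests.post(f"{GNOCCHI}/v1/search/resource/instance",
                     headers=HEADERS, json=query)
resp.raise_for_status()

# Re-fetch any of their metrics at a coarser granularity for the
# daily/weekly/monthly views: same call, different parameter.
for res in resp.json():
    cpu_metric = res["metrics"].get("cpu_util")  # metric id, if present
    if cpu_metric:
        measures = requests.get(
            f"{GNOCCHI}/v1/metric/{cpu_metric}/measures",
            headers=HEADERS, params={"granularity": 86400},  # 1 day
        ).json()
        print(res["id"], measures[-3:])
```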
I want to know what was going on with VMware while something specific happened to my application's CPU on OpenStack, because maybe traffic got a little, I don't know, violent on another provider, so you can take a quick look at that too. We show information we think will be important to the user, like network bytes out and in, or, for example, vCPUs attached per OpenStack flavor; that one isn't hugely important, but it shows the power of the information you can get from the Gnocchi API once all the metrics are pushed into the platform. And again, these graphs are interactive; you can do pretty much whatever you want. If you want to graph something for one specific provider, you can; you have lots of tools for that, so that works amazingly well too.

And not only that: if you're also doing showback and chargeback, you can use this information to show the cost of the instances alongside the specific performance metrics you have in your cloud. In exactly the same way, you can do a quick run-through of the current cost of your specific platform. This is CloudKitty, with a little modification in how it interacts with the different metadata Gnocchi has, so I can quickly see the cost of OpenStack, divided by project, and how the total cost of that project is divided among all the instances I have; and then we have the different cloud providers, which is information you can get from exactly the same Hybrid Puller I showed in the architecture earlier. So you can see pretty much the cost of lots of different parts of the cloud technologies you're using. In this case, we're showing S3 divided by bucket, and the price related to that.

So the idea of this video was to show you what you can do with this information, which we believe is really cool; that's the main reason we did it. And what we want to do, from next week after the summit, is open source both agents, so that if you already have OpenStack and you say, okay, I'm getting metrics from the rest of the players I've named, I want to reuse what I'm already doing with Ceilometer; or, at least, I think the architecture is cool, so I want to deploy Gnocchi, which is really easy, and then use these agents to see how the metrics look on the rest of my cloud providers. That will happen. We need to run through the code a little first so you don't see messy things, which honestly is how it looks right now. So next week we will be able to push this, and maybe, in the best-case scenario, if there is a use case and the community feels it might fit into OpenStack, we could have another repository on the official OpenStack GitHub.

So, okay, I think we have a couple of minutes to run through some questions about what we saw. Please, I would love to hear what you're thinking, what you think is right and what you think is wrong. We have some minutes, so if you want, this guy coming around with the microphone would be amazing. Hi, good morning. Good morning. So, you mentioned you're about to open source the agent part. Yeah.
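Those per-provider CPU graphs map naturally onto Gnocchi's cross-metric aggregation endpoint, which searches resources and aggregates one of their metrics server-side in a single call; a sketch under the same custom "provider" attribute assumption:

```python
import requests

GNOCCHI = "http://gnocchi.example.com:8041"     # assumed endpoint
HEADERS = {"X-Auth-Token": "<keystone-token>",  # placeholder credential
           "Content-Type": "application/json"}

def provider_cpu(provider, granularity=3600):
    """Mean cpu_util across every instance of one provider, one call."""
    # Gnocchi finds the resources matching the query, then aggregates
    # their cpu_util metrics server-side. needed_overlap=0 tolerates
    # instances whose series don't all cover the same time window.
    resp = requests.post(
        f"{GNOCCHI}/v1/aggregation/resource/instance/metric/cpu_util",
        headers=HEADERS,
        params={"aggregation": "mean", "granularity": granularity,
                "needed_overlap": 0},
        json={"=": {"provider": provider}},
    )
    resp.raise_for_status()
    return resp.json()  # list of [timestamp, granularity, value]

for prov in ("openstack", "aws", "azure", "vmware"):
    series = provider_cpu(prov)
    print(prov, series[-1] if series else "no data")
```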
Maybe I missed it, but what's the Horizon plugin part you were showing there, and what's the status of that project? The Horizon plugin, that's a good question. The Horizon plugin is something we're still validating with a customer. We're figuring out how to integrate things into Horizon, which is not a trivial thing to do. It's something we're developing, and the status is: I can't say yet. I can't say because we're testing it with our customers, and while we want to push it to the community, we don't know yet whether it fits something the community might be interested in, basically because we explore and present the information in a way that we're not sure is the way most users want to explore it. So what we thought is: maybe as a first step, open source the agents, and watch how users employ them and how they think the information should be gathered and shown for a concrete use case. Then, if that matches what we already have, we can think very seriously about open sourcing the plugin too. Thank you.

More questions? Yes, good morning. Good morning. Could you tell me how big your deployment is? How many metrics do you collect? Yes. Let me talk about all the clouds we collect metrics from, not only OpenStack, because, in case it wasn't clear, for us a customer is an OpenStack domain. The infrastructure we use is purely based on OpenStack, and if you're using AWS and want to use this dashboard, to us you are a domain, right? So we have lots of domains, which are our customers, and lots of projects inside each domain, one per specific cloud provider. Today, in total, the finest granularity we push is 10 minutes, which for us is enough to show the customer what's going on with performance and, from there, to use that information for weekly and monthly showback and chargeback, et cetera. Today our platform is pushing information for about 2,000 instances, and that has a really low footprint on our resources. We're handling all that traffic, 10-minute measures from 2,000 virtual machines, with three Gnocchi API containers running on virtual machines and only four gnocchi-metricd workers doing all the processing as fast as we need to show the customers all their metrics. So you won't need a big infrastructure, because Gnocchi works really well and is really efficient. And we already have almost one year of data stored there, and it's not taking more than 30 or 40 gigabytes, which shows how efficiently Gnocchi actually handles the information. I don't know if I answered your question. Sure, thank you. Thank you. Any other questions? Perfect. We're two minutes early, and I don't want to waste your time, but anyway, thank you very much. I hope you enjoyed it.