Well, good afternoon, everyone. Just to contribute a little bit to all the jet lag and all the sleepiness, around 5 p.m. already, we're going to talk about time series prediction algorithms and interesting stuff. So this talk is about capacity planning using just plain modules that you can get from OpenStack, like Ceilometer. And we'll be talking a little bit about how we can use that data to actually apply some rating to your metering, so you can do showback, chargeback, and stuff like that. So let's move on.

OK, let's go a little bit through the agenda. We're going to talk a little bit about the metering chaos that is generally created in the enterprise world, when they realize they should have been measuring all their clouds since a couple of years and months ago. They just realize, too late, that they don't have the information they need to actually make wise choices about what hardware they should buy, or whether their flavor strategy is OK for what they're doing; and maybe in some public clouds that happens too. So we're going to talk a little bit about that. After that, we're going through the mechanism of rating all the metering using CloudKitty, which is actually an official OpenStack project right now. I'm going to put myself out there and try a live demo with a platform, to show how you can use those tools to do some advanced monitoring, and do showback and chargeback. And after that, we're going to talk about capacity planning, capacity management, and how you can use a capacity planning API that we're going to put upstream and open source, so you can use it and collaborate on it, to actually do capacity planning and try to predict the future in some manner. And after that, I'm going through a capacity planning module demo, OK?

So, metering chaos: some common mistakes that we as cloud operators, or we as cloud implementers, make when we deploy a cloud or create a cloud inside an enterprise. You would be surprised by the mistakes that we make. So I'm going to talk a little bit from my experience with clients, and my experience when I worked at Mercado Libre building the cloud: what were the common mistakes that we made back there? Generally, there are a lot of reasons for a company or an enterprise to try to adopt the cloud, but a couple of reasons are common to all the cases. For example, upper management is pressuring to have a cloud strategy; sometimes they have a reason, sometimes they don't and just want a cloud, just because. But mainly, you're trying to do something with the cloud in the enterprise, depending on your workload. Probably in your company, and that was our case at Mercado Libre and the case in so many clients that we see, you try to get features into production, you try to reduce the time to market, you're trying to deploy applications faster. Maybe the infrastructure team is getting slower at delivering that infrastructure, so you need a cloud strategy that is as dynamic as possible, so teams can get all the infrastructure that they need. The other reason is that that team needs, for example, a way to control the infrastructure seamlessly, in a way that is centralized, automated, and all the stuff that you already know because you're here. So there are a couple of things that actually happen in private clouds and public clouds when we talk about metering and rating and showback and chargeback.
Sometimes, in a lot of private clouds, to your surprise, they start with a fixed-price model. They start, for example, with a model of packs of CPU and RAM at a fixed price. They publish that to the client, and maybe that model fits them at the beginning. So the billing model and the chargeback model are based on a model that is quota-based. And public clouds, private clouds, sorry, probably start with no metering at all. They just deploy the cloud because they need it, like we did, and they start measuring with some basic tools. They have legacy tools to know if a server is getting saturated on IOPS or throughput or whatever, but they actually don't know if the cloud is being used like it's supposed to be. They don't know if the cloud is used in an optimized manner.

So what are the consequences? Your infrastructure team is always running over budget. They're always asking for servers, because the developers, the guys that consume the infrastructure, always need more resources. And you really don't know if that is happening because the need is real, or because you have a bad flavor strategy, you're buying the wrong hardware, you're not doing things right, because you actually don't know what's going on in your cloud, because you don't have any metrics. That happens a lot. Another thing that actually goes on, and it's pretty common in private cloud infrastructures that don't have advanced measuring: you probably have a lot of servers with instances from 1992, from a developer that launched an environment and left it sitting there doing nothing, consuming resources that you cannot allocate to a project that actually needs them. That's a huge, huge problem in private clouds; we hit it every day and it's a big concern. That's one of the reasons you have to measure a lot. Another consequence is that you're probably losing clients, because your pricing, your quota-based model of charging them, really doesn't fit them, so they move to a pay-per-use model because they need a more dynamic way of spinning things up and getting charged for it. And the worst of them all: you're probably, like we were, losing money and you don't know about it. So you know nothing; you don't get to keep calm like Jon here. You have to worry about it, because you're going to need some way to measure things, some way to defeat the White Walkers you have in your infrastructure.

So let's go over the default, or suggested, infrastructure architecture that we deploy at our clients and internally to actually get the measures that we need to do monitoring, showback, and chargeback. You guys are probably familiar with the Ceilometer architecture. It's composed mainly of collectors. The collector is actually responsible for pushing the notifications, and the metrics that the collector receives over the RPC notification bus are gathered by the notification, the compute, and the central agents of Ceilometer. The compute agent mainly runs on the hypervisor and takes care of asking, for example, libvirt about what's actually going on with the instance: how many IOPS, what's the traffic, how is the CPU consumption. You have the central agent, which is responsible for collecting all the metrics at the hypervisor level, mainly using SNMP drivers. And you have the notification agent, which is responsible for all the transformations that you need to do, for example, to create custom metrics, or to apply math operations to a metric to match, for example, your collection time.
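As an aside, here is a toy sketch of what such a transformation can look like, assuming the classic rate-of-change conversion from a cumulative CPU-time counter into a utilization percentage. This is illustrative Python, not Ceilometer code; the function and sample values are made up:

```python
from datetime import datetime

def rate_of_change(prev_ns, prev_ts, cur_ns, cur_ts, vcpus):
    """Turn two cumulative CPU-time samples (in nanoseconds) into a
    utilization percentage over the sampling interval."""
    elapsed_ns = (cur_ts - prev_ts).total_seconds() * 1e9
    used_ns = cur_ns - prev_ns
    return 100.0 * used_ns / (elapsed_ns * vcpus)

# Two samples one minute apart: 30s of CPU time used on a 2-vCPU instance.
t0 = datetime(2017, 5, 10, 12, 0, 0)
t1 = datetime(2017, 5, 10, 12, 1, 0)
print(rate_of_change(0, t0, 30 * 10**9, t1, vcpus=2))  # -> 25.0
```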
The Ceilometer collector then pushes all those metrics into the Gnocchi API. For those that don't know Gnocchi, we're going to talk a little bit more about it in a couple of minutes. Gnocchi has two components. One is the indexer, which keeps all the consistency about what resources are created: the resource IDs, the metric IDs, and all the metadata. And then you have the metricd component, which is actually responsible for processing all those metrics, using the rules that you create, to perform aggregations and to answer whatever query you make to the Gnocchi API. Generally, what we do is deploy a Ceph cluster to actually hold all the metrics, because the raw measures and the processed measures actually reside there. It's the most developed storage backend for Gnocchi right now, and the best performer at the moment, so that's what we use. And of course, in the database you store the IDs of all the metrics and all the metadata.

OK, for those that don't actually know about Gnocchi: Gnocchi is a multi-tenant time series database as a service, or resources as a service. It basically does what, for example, InfluxDB does right now as a standalone product. It's kind of a time series database as a service, with a pretty well developed API and a really tight integration with Ceilometer, and it's actually performing pretty great right now. It's responsible, of course, for performing all the metric transformations and queries to get all the data that you need when you're asking for specific metrics from the cloud. So why Gnocchi? Because MongoDB didn't prove to scale a lot. Candidly: a couple of months ago, we ran a Ceilometer query against a MongoDB backend; by now, it still doesn't have the result. So, MongoDB: not quite my tempo. That's why Gnocchi came into the picture to replace it as the main, de facto backend for Telemetry.

So what can we do with all the measures that we have right now? We can use them for whatever you can think of. We can monitor our hypervisors and instances, and since Gnocchi has an open API and can work standalone without Keystone, you can integrate with Gnocchi to push any metric that you want. For example, if you want to push IoT measures into Gnocchi to do analytics on them, you can do it. If you have custom software or custom hardware in your infrastructure that you want to push metrics from into Gnocchi, so you can later query them and have a central point to consult information, you can do it. You can create baselines and thresholds for alerting purposes, because this is integrated with Aodh, so you can launch alerts based on different policies you create that compare against your metrics, which is pretty useful for dynamic, load-balanced environments. You can also associate cost to the metrics, which I'm going to talk about a little more deeply in a bit, so you can transform your static billing model into a more AWS-like pay-per-use model. And why not use it to predict the future, for capacity planning and management purposes?
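To make the "push any metric" idea concrete, here's a minimal sketch against the Gnocchi v1 REST API, assuming a standalone (Keystone-less) endpoint. The URL, metric name, and omitted auth are assumptions for illustration:

```python
import datetime
import requests

GNOCCHI = "http://gnocchi.example.com:8041"        # hypothetical endpoint
HEADERS = {"Content-Type": "application/json"}     # auth omitted for brevity

# Create a standalone metric for a custom (e.g., IoT) measurement.
resp = requests.post(f"{GNOCCHI}/v1/metric", headers=HEADERS,
                     json={"name": "office.temperature",
                           "archive_policy_name": "low"})
metric_id = resp.json()["id"]

# Push a measure into it; metricd aggregates it per the archive policy.
requests.post(f"{GNOCCHI}/v1/metric/{metric_id}/measures", headers=HEADERS,
              json=[{"timestamp": datetime.datetime.utcnow().isoformat(),
                     "value": 23.5}])

# Later, read back aggregated measures as [timestamp, granularity, value].
print(requests.get(f"{GNOCCHI}/v1/metric/{metric_id}/measures",
                   headers=HEADERS).json())
```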
So now that I have all the metrics and all that information, how do I charge for them? How do I associate costs with those metrics? That's where CloudKitty comes in: it actually works as a rating module for OpenStack. This is kind of a logical view of how it works. All the data that Ceilometer collects, with all its transformations, gets pushed into Gnocchi like I told you before, and CloudKitty can consume all those measures from Gnocchi, and you can create rating rules. You can create different cost associations based on your own patterns, whatever fits your business model, to charge for it. After that, it gets stored; right now it gets stored on the native storage, which is a feature that we committed upstream. Before that feature, everything was ending up in MySQL databases, and the performance seemed pretty bad when scaled. So we developed the native storage feature to store all the rating results on Ceph, like Gnocchi does with all its measures, to get a little more performance. And you have the database that holds all your configuration and rating rules for your business.

So let's get into the live demo. I'm going to show you how you can actually use all this data to show information to your users, or to create billing reports or something like that. We have a pre-deployed cloud here, and I'm going in with my really secure password: there it is, "password". OK, I'm going to talk a little bit about this particular screen before we go through the billing section. This is the default CloudKitty screen that is integrated into Horizon. Here you can define all your rating rules, or your billing rules, and as you can see, you can apply different criteria to each metric. For example, we have here a cpu_delta metric that has a flat rule associated for applying a cost to it. (I was bound to end up kind of blind after working in IT all these years.) It's going to be around 10 cents for each CPU nanosecond that you use; it's just an example. And you have the ability to create threshold rules. For example, if you have a particular client to whom you want to apply some discount, or whose costs you want to manage separately, or to whom you want to apply a custom rule, you can use threshold rules, and you can combine them with metadata rule sets in order to differentiate, using any value of the metadata that comes with every metric, because every metric on Gnocchi has a lot of metadata, and you can actually define metadata dynamically per metric. So later, on your billing platform, you can use that metadata to apply a different cost, or another workflow, for that particular metric. For example, if you have an availability zone that has dedicated hardware, or another speed, or better storage or whatever, you can use the metadata that is pushed with the metric into Gnocchi to handle a price differentiation later.

With all this data you can then create, for example, some reports, like you can see here. There are a couple of demo reports right now. I don't know if there's cost associated with them, but as you can see, you can create different reports to show which tenant is doing what on which hypervisor. You can define different levels of grouping; in this case we're grouping per instance and per project, and you can do it on a flavor basis. You can create different reports based on the metadata that every metric has.
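As a toy illustration of the flat and threshold concepts from the demo above (this is not CloudKitty code; the function, names, and numbers are made up), the rating logic boils down to something like:

```python
def rate(qty, flat_rate, threshold=None, rate_above=None):
    """Flat-rate a usage quantity, switching to a different rate
    (e.g., a bulk discount) once usage crosses a threshold."""
    if threshold is None or qty <= threshold:
        return qty * flat_rate
    return threshold * flat_rate + (qty - threshold) * rate_above

# 10,000 CPU units at $0.10, with a discounted $0.07 rate past 5,000 units.
print(rate(10_000, 0.10, threshold=5_000, rate_above=0.07))  # -> 850.0
```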
And with all this information, you can actually create, I don't know if I hit it, yeah, as you can see here, you can create charts. Why would you create charts? Because you have all this information, and maybe you want to show it beyond the report. Maybe you want to import that report later into your actual billing platform; for example, if you have a separate billing platform and you want to import all the data, you can do it, and the report is going to fit you well, or you can do it via the API, just retrieve it and do a quick integration. But if you want a user that logs into a public cloud to have a quicker, friendlier way of viewing all this billing information, where his money is going, you can create charts tied to those particular metrics.

And with the charts, I first want to show a little bit of what a chart's structure looks like. OK, this is all the information that you have on Gnocchi. This is just a random interface that we built, just to show an example of how you can work within Horizon to get all the data from Gnocchi. You can do all the groupings, to group the chart basically the same way you group the report. You can do regular-expression filtering, and of course you can set up the chart type: it can be a pie, a line, a comparison, or many other charts. With all the charts that you create, maybe later you want to create a default dashboard for a final user, or a dashboard for, say, the board of directors, who want to look at it and see what's going on with the cloud, where all the money is going. And after you create a dashboard, and I'm going to show an example dashboard here, this is a dashboard built mainly using Angular and Highcharts, with some fancy JavaScript things. Here you can interact with all the graphics to alter your granularity level. So if you grouped your chart like you did with your reports, you can go here and drill into the charts down to the deeper metric level.

Those are things you can do if you have all the information. But another thing that you can do, which is really, really useful mainly for private clouds, and I don't know if public clouds would apply that model, is advanced monitoring. Gnocchi lets you configure different granularities in order to provide the information that you need, and if you configure a granularity that is pretty small, for example with a sampling of one minute, you can actually kind of do real-time metrics, near-real-time monitoring, OK? So if I go here, for example, and this is the hypervisor chart, I'm going to show the instances one, it's nicer. So I can filter here; I have a couple of instances, and I really don't see what I'm selecting. I'm trying to use the OpenStack provider. And yeah, I'm going to filter it by day. OK, there should be some information there... oh, OK, that's right, I want to select by month. OK, great. So besides doing showback and chargeback, you can use all your Gnocchi metrics to do some advanced monitoring. You can deliver all these panels to your users, and since you can push any metric that you want, you can, if you want, give the final users the ability to push custom metrics and create custom dashboards.
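As a sketch of the granularity configuration just mentioned, an archive policy in the Gnocchi v1 REST API could look like this, assuming the same hypothetical standalone endpoint as before; the policy name and retention values are made up:

```python
import requests

GNOCCHI = "http://gnocchi.example.com:8041"   # hypothetical endpoint

# A fine one-minute granularity for near-real-time dashboards, plus a
# coarser one-hour roll-up kept longer for billing and trend charts.
policy = {
    "name": "near-realtime",
    "definition": [
        {"granularity": "1m", "timespan": "2d"},
        {"granularity": "1h", "timespan": "90d"},
    ],
    "aggregation_methods": ["mean", "max"],
}
requests.post(f"{GNOCCHI}/v1/archive_policy", json=policy)
```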
This is just an example of how we use it. Here on the cloud overview screen, we have two clouds integrated: one is an OpenStack cloud, the other is an integrated VMware cloud. And what we do here, using the capacity planning API that I'm going to show you later, and all the information that we have in Gnocchi, is create kind of a heat map of which instances are not running well, based on thresholds that we define in Aodh. We can build this whole graphic, so we can quickly see which VM is maybe hitting CPU limits or not getting enough RAM. And you can create performance charts, which use the same framework that draws all the charts I showed you before. So that's the little live demo I wanted to show.

So let's move on with the capacity part; we have another demo later of the capacity planning API itself. To me, this is a real candidate for predicting the future: if you have enough information from your cloud, it's really doable. You can project into the future how the capacity is going to be consumed, or when your hardware is going to be tapped out. Or maybe you can realize that you have to change your flavor strategy, because you're buying, I don't know, hardware with a lot of RAM that your users are actually not using. So you can change your flavor strategy based on information that you didn't have before. That's basically the Wikipedia definition of capacity planning and capacity management, and that's what it's about: handling all your resources in a way that they are right-sized, so you can be cost-effective.

So, what about the capacity planning API that we developed? Basically, as I told you before, it uses all the telemetry infrastructure that you already have in OpenStack; you don't need black boxes or anything else to use it. It's fully integrated with Keystone, and fully working with the v3 API, so you can configure domains and use it, maybe, for reselling in a public cloud environment. It's capable of doing estimations based on all the metrics, so you can know when your hardware is going to get tapped out, or when you're going to need to buy new hardware, so you can have a better hardware planning strategy. Another great thing that you can do, which I want to talk about a little more deeply in a couple of minutes, is what-if simulation scenarios. You can apply variables to your metrics to know how a given variable is going to affect your infrastructure. For example, if you know that you're an e-commerce company and you're going to go through a Black Friday or whatever, then based on metrics that you have collected from previous years, you can apply a simulation model to the current data that you have in Gnocchi, so you can know whether you're going to need further hardware resources, or need to adjust something in your cloud. Simulations, of course, since this uses Gnocchi, can be applied in a per-metric manner, as the sketch below shows. You can apply a simulation to one metric in particular: if you have to apply a what-if simulation model to the cpu_delta metric, for example, you can do it, and you can do it to the memory usage metric, in order to be very granular. Right now, as we speak, we have a really, really minimalistic front end that we built for Horizon, which I'm going to show you in a couple of minutes. We need a lot of work there, so since we're open sourcing it, all help is welcome. And in the future it will integrate with Aodh, so when a user defines a capacity threshold, it can work tightly with Aodh to send you alerts.
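To make the per-metric what-if idea concrete, here's a minimal sketch, under the assumption that a scenario is just a pair of multipliers applied to projected demand and to installed capacity. All names and numbers here are illustrative, not the API's actual model:

```python
# Toy what-if scenario: scale projected demand and capacity independently
# and find the first point where demand would exceed capacity.
usage = [62, 64, 67, 71, 74, 78]      # e.g., weekly RAM usage, in GB
capacity = 100.0                      # installed capacity, in GB

def first_saturation(usage, capacity, demand_x=1.0, capacity_x=1.0):
    new_capacity = capacity * capacity_x
    for week, value in enumerate(usage):
        if value * demand_x >= new_capacity:
            return week, new_capacity
    return None, new_capacity

# Black Friday scenario: demand doubles while capacity grows only 1.2x.
print(first_saturation(usage, capacity, demand_x=2.0, capacity_x=1.2))
# -> (0, 120.0): with 2x demand, even week 0 (124 GB) exceeds 120 GB.
```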
So, this is the capacity planning API architecture. Basically, as you can see, it's pretty similar to what I showed you for CloudKitty. All the metrics collected by Ceilometer are stored in Gnocchi. After that, the capacity planning API has its processing daemon, which is the one responsible for applying all the simulations to the metrics; that gets done by retrieving the metrics from the Gnocchi API, applying all the variables that you want, and projecting over the time span that you want. You have the capacity planning database, where the configs of all the simulations and scenarios are stored. And in between, since simulations are consulted very often, when you have just created a simulation and didn't change any parameter, the simulations get cached in Redis, so you don't have to wait for Gnocchi to respond every time, if the simulation is the same every time. In the future, we plan to give the user the ability to store a fixed, or flat, simulation and put it in the Ceph backend, so we can work more like Gnocchi does right now.

So, what did we use to build the API? These are a couple of the libraries we use: we use NumPy and pandas to interact with the data, and we use statsmodels on that data to apply the algorithms. We are testing two algorithms right now, and the algorithm you use to run the simulations and do all the scenario testing is up to you; it's pretty configurable via the configuration file. So if you want to introduce your custom prediction algorithm, it just has to be a time series prediction algorithm that can interact with the Gnocchi data. If you want to contribute a custom algorithm that you guys maybe have, that would be great too. We're testing both algorithms to see which of them generates the better, more accurate output, to leave as the default algorithm shipped with the API. As I said before, this whole model is implemented with statsmodels.

So, I'm going to show you a recorded demo, since our environment is at a pretty early stage, and I want to go through it and explain a little bit. OK. So here you have the screen that we have right now in Horizon, where you can create your scenarios. These scenarios are going to be simulation scenarios, which can have what-if models applied to them, or they can be just flat simulations, with a standard prediction based on, for example, how exponentially a metric grows, OK? So here you have, for example, an example scenario that we created with the CPU metric. In this case, it's applying a what-if simulation model. It's very basic: we are simulating that on that specific date, I can't actually see the number, we're going to be demanding 10 times more CPU resources and increasing the capacity of the cloud 10 times, to see how that metric moves, and whether we'll fulfill the requirements that maybe we're going to have by that time. And if you don't specify a capacity and demand value, it's just going to project the metric based on historical data. So, here are the simulation scenarios that we have. I'm going to edit one, so I can show you in more detail what a scenario actually is.
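Since the talk names NumPy, pandas, and statsmodels but doesn't show code, here's a hedged sketch of what a configurable time series predictor could look like: fit a trend model on history pulled from Gnocchi, forecast forward, and apply a what-if demand factor. The data is synthetic, and Holt-Winters is just one plausible choice of algorithm, not necessarily the API's default:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Synthetic daily CPU usage with an upward trend, standing in for the
# measures the processing daemon would retrieve from the Gnocchi API.
idx = pd.date_range("2017-01-01", periods=120, freq="D")
history = pd.Series(40 + 0.3 * np.arange(120) + np.random.randn(120),
                    index=idx)

# "Pluggable" algorithm choice, as a configuration file could drive it.
ALGORITHMS = {
    "holt_winters": lambda s: ExponentialSmoothing(s, trend="add").fit(),
}

def project(history, days, algorithm="holt_winters", demand_x=1.0):
    """Fit the configured model and forecast, scaled by a what-if factor."""
    fitted = ALGORITHMS[algorithm](history)
    return fitted.forecast(days) * demand_x

# What-if: demand is 10x over the next 30 days.
print(project(history, days=30, demand_x=10.0).head())
```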
So, here in the scenario, we have a memory scenario, which grows the capacity and the demand by a percentage, and now we're going to run the simulation to see how all the metrics we have are going to be affected, using the algorithm. We have one simulation there, so I'm going to go ahead and open it. And there, if you can see, and I'm going to pause there: you have the simulation name, of course, and a small description, and you have, for example, the whole time span that this simulation is going to show, so from where to where you're going to retrieve the data. The gray area that you see there is what's going on right now, and after that, what is maybe going to happen if you introduce these values into your metrics, or if we change the infrastructure, for example if you add more CPU, or if you will require more CPU, which is graphed right down there. As you can see there, you can use all the metadata that is stored on the Gnocchi metric to do some filtering, in order to graph the simulations back there. So, for example, right there it's very even: if you demand 10 times more capacity and you also add 10 times more capacity, it's going to stay really, really, really flat. And back there, if you look at the memory, I forgot to say: the red line is the threshold that we defined for the capacity. So you're seeing how the green line, which is the actual usage, is going to reach our threshold in the future if we keep moving that way, with this consumption and with the hardware that we have. OK, I guess there's not much more in the video. Like I said, it's just a basic Horizon panel, just to show the current features of the API.

And that's pretty much all I've got for the presentation right now. So, if you have any questions, we'll be pleased to answer them. Yeah, actually, for the capacity, I can show you that right now; I have just a light working environment of showback, chargeback, and billing, but I can get back to you on that. Yeah, yeah, yeah. The question was that many people maybe are using Monasca to collect the metrics right now. I think that's totally possible: since we're open sourcing it, and since the platform is modular, we can actually maybe collaborate to code a Monasca API driver backend, so it can work with it, yeah. OK, so, OK, thank you all. Thank you for coming.