Hello everyone, very nice to see you all here. I apologize for my voice, but what happens in Hong Kong stays in Hong Kong. When I proposed this presentation, my intent was to make sure we had the opportunity to talk about something very interesting that happened in the last release cycle, which is the collaboration of two projects. But as much as I would have liked to be, I was not very instrumental in the actual development of this, so I have asked Eoghan and Julien to do the presentation, because they were the ones instrumental in making this collaboration happen.

I am Nick Barcet. I work for eNovance. I've been involved in the OpenStack project for quite a while, though not as much as I would like anymore, because of conflicts, let's say, but the company, eNovance, is still very much involved, as you certainly know. Julien works with us; he is one of our lead developers and is now the PTL of Ceilometer. He's been working on OpenStack since the beginning and is a long-time Python developer. And Eoghan, who works for Red Hat, is a principal software engineer over there and has been involved in OpenStack for the past two years. Is that correct? Anyway, let's get to the meat of this presentation.

At the beginning we had two projects that had only one thing in common: they had been incubated and integrated at the same time in OpenStack, so one could say they are twin projects. But one was taking care of measuring what's happening in OpenStack, and the other was taking care of orchestration. One is a template-driven orchestration mechanism, which is meant to automate deployments. The other is something that lies underneath, collecting information so that you can eventually bill, eventually do analysis of what's happening in your cloud; you can do quite a bit of stuff with Ceilometer. So one was on top and one was on the bottom, and on the surface these two projects really don't have much in common.

When you look at Ceilometer, what we built is a collection engine, a transformation engine, and a publishing engine, storing information in various databases and aggregating it. We try to collect from all OpenStack components, with a growing number of available meters. This is what OpenStack measurement should be; this is what Ceilometer is about. And Eoghan came up with a great idea that I'll let him explain. Please, Eoghan, join me on the stage.

Cool, thanks Nick. Yep, basically Ceilometer provides this workflow whereby we acquire data and bring it through a sequence of transformations. We then publish the data via various conduits: our primary conduit is AMQP, but we also have configurable mechanisms so that the data can, for example in a loss-tolerant situation, be published over UDP instead. We then store the data in a variety of pluggable storage drivers: for MongoDB, for SQLAlchemy (so you can use MySQL or PostgreSQL), for HBase, for DB2. And we provide an API service with an aggregation view over these data, so you can go in and say: over a particular time window, give me this particular data, sliced and diced in various different ways. So that's what Ceilometer is all about. It seems like a very low-level thing compared to Heat, and I'll describe the basic workflow of Heat next. But keep in mind this method: acquiring metering data, pulling it through the pipeline, storing it, and then making it available for aggregation via an API.
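To make that workflow concrete, here is a minimal sketch of what a pipeline definition of that era could look like; the meter, scale expression, and UDP endpoint are illustrative, and the exact schema has varied between releases:

```yaml
---
# Sketch of a Ceilometer pipeline: which meters to consume, how often,
# what transformations to apply, and where to publish the results.
- name: cpu_pipeline
  interval: 60              # poll every 60 seconds
  meters:
      - "cpu"
  transformers:
      # Turn the cumulative CPU time (nanoseconds) reported by the
      # hypervisor into a gauge-style utilisation percentage.
      - name: "rate_of_change"
        parameters:
            target:
                name: "cpu_util"
                unit: "%"
                type: "gauge"
                scale: "100.0 / (10**9 * (resource_metadata.cpu_number or 1))"
  publishers:
      - rpc://                 # primary conduit: AMQP
      - udp://10.0.0.2:4952    # loss-tolerant alternative: fire-and-forget UDP
```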
It turns out that's exactly what Heat needed in order to drive auto-scaling, as we shall see. But let's just step back a second, and for people who aren't so familiar with Heat, I'll describe the very basics. I actually deliberately made the font there so small that even the most eagle-eyed among you are going to struggle; I see a few people squinting. Don't worry about the details; the point is the conceptual understanding of what a template is about. Heat is a declarative mechanism for standing up your application stacks, describing the resources that you need and how those resources are interrelated, and doing all of that via a template.

The types of things you describe in the template are, first, what resources you need: I need so many instances, volumes, network elements like floating IPs, load balancers, all that sort of goodness. Then you describe how all of those resources are interdependent. For example, you might have an instance and a volume of block storage, and in order for the volume to be made available for use within the instance, it's got to be attached to the instance with a particular device path, and so on.

Then you also, in general, need some form of customization, some way of saying: for the current orchestration workflow, I want these particular values to be used. Those values might be things that are just inherently variable, like the size of a volume (today I want it to be one gigabyte, tomorrow I want it to be five gigabytes) or the flavor of the instances you create (today I might want to spin up tiny instances, tomorrow extra-large ones). Those kinds of things can be parameterized. You also have situations where you've got sensitive information that you're going to be using during the orchestration workflow, things like passwords, keys, pre-signed URLs, that type of thing, and those are all injected into the template via parameters, because these templates end up on GitHub and you don't want to encode sensitive information in them. And lastly, you've got some kind of outputs from the orchestration workflow, things that you can't predict in advance, like, say, when you're allocating a floating IP, what the actual IP address is. So that's basically what you're talking about in a template: what you want in terms of resources, how they all mesh together, and how you want to specialize this individual workflow execution.
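As a rough editorial sketch (not the actual slide content), a stripped-down template in the AWS-style JSON form that Heat accepts might look like this; the image name, flavor, and device path are illustrative values:

```json
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Sketch: one instance, one parameterized volume, attached together",
  "Parameters": {
    "VolumeSize": {"Type": "Number", "Default": "1"}
  },
  "Resources": {
    "MyInstance": {
      "Type": "AWS::EC2::Instance",
      "Properties": {"ImageId": "fedora-20.x86_64", "InstanceType": "m1.small"}
    },
    "MyVolume": {
      "Type": "AWS::EC2::Volume",
      "Properties": {"Size": {"Ref": "VolumeSize"}, "AvailabilityZone": "nova"}
    },
    "MyAttachment": {
      "Type": "AWS::EC2::VolumeAttachment",
      "Properties": {
        "InstanceId": {"Ref": "MyInstance"},
        "VolumeId": {"Ref": "MyVolume"},
        "Device": "/dev/vdb"
      }
    }
  },
  "Outputs": {
    "InstanceIp": {"Value": {"Fn::GetAtt": ["MyInstance", "PublicIp"]}}
  }
}
```

The attachment resource is what expresses the interdependency: Heat knows it can't create the attachment until both the instance and the volume exist.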
So once you've got all that crafted up in the template, the next step is for the template to be consumed by the Heat engine, and the engine does all of the goodness that you'd expect. It verifies the template for correctness, it builds a directed acyclic graph describing all of the resources that are required and how they depend on each other, and then it walks the graph in a particular way, firing off tasks to create each of the resources, doing things in parallel when that's possible, but also taking care to impose the correct ordering constraints. Then, when it finishes, after interacting with the public APIs of the various OpenStack services (it talks to Nova to spin up instances, talks to Cinder to create block storage volumes, talks to Neutron to allocate floating IPs, all through the public APIs), you end up with your complex application stack stood up exactly as you need it. You've got instances and volumes, the instances are placed behind the load balancer, keys and user metadata are injected in the correct way. That's basically what Heat gives you: a declarative way of doing things that you could do manually, but you want them to be repeatable and customizable, and you want to capture them in a well-defined way. That's what Heat templates provide.

So that's all good, but sizing these things can be hard: deciding how many instances you need to meet the load, because predicting the actual level of load, and the patterns of variability in that load, can be hard. Sometimes these patterns are very predictable: you've got your well-understood seasonal variations in demand that a retail website might have, lots of people buying stuff on Cyber... whenever it is, Cyber Monday, Black Friday, those kinds of days. You get weekly spikes, and you get daily spikes where at certain times of the day there's a lot more activity. But beyond those very predictable changes in load, there are quite unpredictable and unexpected spikes or troughs in demand, and you want to be able to react to these variances in your demand level in an automated fashion that doesn't require any human intervention. That's what auto-scaling is all about: automating the scale-out and the scale-back of the instances within your application stack to react to dynamic load conditions.

When the Heat folks first approached auto-scaling, it was the early days for Ceilometer and the early days for Heat, and they needed something that would work, something rudimentary, and this is what they did; this is the version 1.0, if you like. It's a very simple idea. Within each instance that Heat spins up there's a little Python script called cfn-push-stats, generally installed via cron to run in a scheduled fashion, every minute or whatever it is. It uses the psutil package to find out things like memory utilization and CPU utilization, those kinds of basic stats for the current instance, and then it reports them up to what I call CloudWatch Lite, a really rudimentary, cut-down version of the CloudWatch API. Heat persists these data and runs periodic jobs to check whether the sequence of metric data points reported by this Python script within each instance has crossed some threshold that's been defined, and if so, it goes and scales out the stack, creating more instances. The rate of scale-out, and the parameters around sanity checking to ensure you don't thrash, are all nicely definable. So basically you go from a small stack to a bigger stack and back to a smaller stack as load conditions change over time. It's a very simple concept. Now, don't get me wrong, this stuff all works.
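For flavor, the version 1.0 plumbing amounted to a cron entry along these lines inside each guest; the path and flags follow the heat-cfntools examples of the era, so treat this as a sketch:

```
# /etc/cron.d/cfn-push-stats -- report guest stats once a minute.
# --cpu-util samples the local CPU via psutil; --watch names the
# CloudWatch-Lite alarm the data points should feed.
* * * * * root /opt/aws/bin/cfn-push-stats --cpu-util --watch CPUAlarmHigh
```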
Version 1.0 worked, and it was solid, but it imposed a number of undesirable requirements on Heat that the Heat folks were fairly eager to free themselves from. Let me describe a few of the perceived disadvantages of this rudimentary internal implementation of statistics gathering and alarm evaluation within Heat itself.

First off, if you return to this picture, you'll notice that we've got a requirement on the actual image that's used to boot up the instance: we require that this particular little Python script is available, and that the cron job has been installed to run it periodically. Obviously, anything that makes the images Heat can use to boot up instances less generic is bad; we want to keep these things as generic as possible. Then we've got the call up to CloudWatch Lite. That's done using the Boto client for CloudWatch, and it requires that a key pair is injected into the instance so that the Python script has the correct level of privilege to invoke that API. That's generally something you want to avoid, leaking privilege to an ingest agent, so again, that's making people nervous.

The next thing is the actual periodic evaluation of the metric data points, checking whether they've crossed the threshold. That's done within the Heat engine, and there's a lot of interest in scaling out the Heat engine horizontally. A periodic task within one of these Heat engines militates against that kind of scale-out, because then you have to have coordination between the scaled-out Heat engines, so that this one knows it has to take care of some subset of the alarms while this other one takes care of a disjoint subset, with all of the alarms covered somehow, even though the population of Heat engine replicas is changing as you scale out and back again. So that's another reason they didn't like it.

And the last undesirable thing was that Heat was storing the actual metric data itself in its local database. Metric data, by its very nature, can be quite high volume, and the Heat database is not designed for that; it's not MongoDB or anything like that, it's intended as a more traditional OpenStack-style database with relatively small data volumes. Storing a lot of metric data there, and having to take care of expiry and all of those things, just wasn't an area the Heat folks wanted to be involved in; it's not one of their core concerns, and they were very eager for it to be taken off their plate.

Now, Ceilometer to the rescue. It turns out that this non-core concern for Heat is exactly core to what Ceilometer is all about, and as it happens, we were already doing a bunch of the stuff that Heat needed. We were already gathering most of the relevant stats, and we were gathering them in a way that's much more convenient. Instead of doing it from the inside out, with a script that runs within the guest and reports up to a CloudWatch-style API, we were doing it from the outside in, with an agent that runs on the Nova compute node and talks to the local hypervisor, to the libvirt daemon for example, and extracts this information. So you've got nothing in the guest: no need for a special script to be installed, and no need for the key pair to be injected.
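As an illustration of that outside-in approach, here is a minimal sketch using the libvirt Python bindings directly; Ceilometer's actual compute agent wraps this behind pluggable inspector classes, and the UUID below is a placeholder:

```python
# Sketch: poll a guest's CPU stats from the compute node, without
# touching the guest itself -- the "outside-in" approach.
import libvirt

INSTANCE_UUID = "11111111-2222-3333-4444-555555555555"  # hypothetical

conn = libvirt.openReadOnly("qemu:///system")
dom = conn.lookupByUUIDString(INSTANCE_UUID)

# dom.info() returns (state, maxMem KiB, memory KiB, nrVirtCpu, cpuTime ns)
state, max_mem_kib, mem_kib, vcpus, cpu_time_ns = dom.info()

print("vCPUs: %d, cumulative CPU time: %d ns" % (vcpus, cpu_time_ns))
# Successive cpu_time_ns samples are what a cumulative-to-rate
# transformation turns into a utilisation percentage.
```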
We also had an API service, and that API service exposes aggregation functionality over statistics, so that you can basically say: over a particular time window, give me the average for this particular statistic, sliced and diced in this particular way. So a lot of our work was already done in terms of rebasing the auto-scaling functionality onto Ceilometer, and that's always nice, when there's a good start already in place.

So what did we need to do to actually make this work; what new elements did we need to add? Well, we needed to define a new API exposing the alarm lifecycle, so your basic CRUD operations, and also alarm history, so that you have an audit trail and can see state transitions over time. We wrote new services to evaluate alarms; that's the equivalent of the periodic task that was previously done within the Heat engine, but we do it in a way that's horizontally scalable: you can have a single alarm evaluator service, or you can have a set of partitioned alarm evaluators that divide the work amongst themselves, and we have a coordination protocol for that. And then we have another service that handles the feedback loop back to Heat: a notifier service that calls out to a pre-signed webhook. It's basically just a POST on a URL, with a little fragment of JSON saying: yeah, we had a state transition, we've gone from the OK state to the alarm state because so many data points were above the threshold, and here are the most recent data points. So, all very simple. The beauty of it, I guess, is that most of the hard work was already done: the data acquisition, and the massaging of this data into samples that can easily be aggregated over, were already in place as a natural part of the Ceilometer workflow.

So how does it all hang together? Well, let's return to that template file, even smaller now, so you definitely can't read it. Basically, a few extra elements need to be added to the template in order to enable auto-scaling. First off, you need alarm definitions, and these alarms basically bound what we consider to be the busyness or idleness of our application stack. Usually you define it around something like CPU utilization; you'd be talking about something like: if the average CPU utilization across my current set of instances is more than 75 percent, I consider them to be running hot, and in that case I want to spin up new instances to share the load. You also have a kind of low-watermark alarm that captures a thought something like: if the CPU utilization, averaged again across my auto-scaling group, is less than, say, 20 percent, I consider the application stack to be overscaled; it's running idle, I could do with fewer instances, so I want to scale it back.
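Both watermarks come down to the same kind of aggregation query against that statistics API. Here is a rough sketch of the query shape, assuming a Havana-era v2 endpoint and a valid Keystone token (both placeholders here):

```python
# Sketch: average cpu_util since a given timestamp, in 60-second
# buckets -- the same shape of query an alarm evaluator performs.
import requests

CEILOMETER = "http://ceilometer.example.com:8777"  # hypothetical endpoint
HEADERS = {"X-Auth-Token": "<keystone-token>"}     # placeholder credential

params = {
    "q.field": "timestamp",
    "q.op": "gt",
    "q.value": "2013-11-08T12:00:00",  # start of the evaluation window
    "period": 60,                      # aggregate into 60-second buckets
}
resp = requests.get(CEILOMETER + "/v2/meters/cpu_util/statistics",
                    params=params, headers=HEADERS)

for bucket in resp.json():
    # Each bucket carries min/max/avg/sum/count for one period.
    print("%s avg=%.2f" % (bucket["period_start"], bucket["avg"]))
```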
Then you have to have some conception of the membership of an auto-scaling group, and that's done via user metadata. As you're probably aware, when you spin up an instance in Nova, you can associate user metadata with it, and we've got a namespacing convention that allows Heat to tag all of the instances in a particular auto-scaling group in a way that's recognizable to Ceilometer. Then we need actions that provide the feedback conduit, so that Ceilometer can feed this triggering information back into Heat, and these take the form of pre-signed URLs. And then we need some policies that control the rate of scale-up: do we want to scale up in jumps of a single instance, or do we want the size of the current pool of instances to be increased or decreased by 20 percent? It can be an incremental delta, it can be a percentage delta, or it can in fact be an exact number: you can say, if the high-CPU alarm fires, jump straight up to ten instances; if the low-CPU alarm fires, jump straight back down to two instances. So it's quite flexible.

There's also some sanity checking built in, in the concept of a cooldown period, which is a number of seconds for which the alarm state has to persist before any scaling actions actually occur. That's to protect your resource allocation against thrashing if the actual load is running very close to the alarm threshold: you don't want to just creep over the threshold, scale up, drop back down again, scale down, creep up again, scale up, so that you're constantly thrashing. And when you do scale back, it's the oldest instance that's deleted, so you'd be constantly adding and subtracting. A cooldown period allows you to smooth out that jitter if you're in the unfortunate situation that your actual load sits very close to your threshold.

So here's a piece of an example; I didn't want to put up too much of this template language in the presentation, because it's really the concept that's important here. Heat actually supports several different versions of its DSL: this is in the JSON AWS style, but there's also a YAML Heat-specific form, and the two are completely equivalent; converting between them is trivial. Here I've got the definition of an alarm that's bounding a busyness condition. The idea is that when this alarm fires, we consider ourselves to be underscaled, and we want to go up in terms of the number of instances that are allocated. So we give it a name; the type is OS::Metering::Alarm, which basically says this is a Ceilometer alarm as opposed to a native Heat alarm; and then the type of information we need to specify starts with the meter name.
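Reconstructed from the walkthrough that follows (not a verbatim copy of the slide), such an alarm resource looks roughly like this in the AWS-style JSON form; the policy and group names are hypothetical:

```json
"CPUAlarmHigh": {
  "Type": "OS::Metering::Alarm",
  "Properties": {
    "description": "Scale up if average cpu_util > 75% for 5 minutes",
    "meter_name": "cpu_util",
    "statistic": "avg",
    "period": "60",
    "evaluation_periods": "5",
    "threshold": "75",
    "comparison_operator": "gt",
    "alarm_actions": [{"Fn::GetAtt": ["MyScaleUpPolicy", "AlarmUrl"]}],
    "matching_metadata": {"metadata.user_metadata.groupname": {"Ref": "MyServerGroup"}}
  }
}
```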
So in this case the meter name is cpu_util, that's just CPU utilization; then a threshold; and a time window over the recent past over which to evaluate, in this case five evaluation periods, each 60 seconds in length, so five minutes in effect. Then the statistic to apply, which could be min, max, sample count, sum, or average; average is what we're mostly interested in when it comes to auto-scaling. A comparison operator: greater than, less than, equal to, and so on; in this case it's greater than. Then you specify an action to take when the alarm fires, and in this case it's a scale-up policy; that policy would describe things like the adjustment step size, the cooldown period, and so on. And lastly there's the matching metadata, which is kind of a strange one: that's the tag, basically, that Heat uses to represent membership of the auto-scaling group. All of the instances that are spun up as part of this group will have that user metadata set on them.

Okay, so how does it all hang together, and what's different about the mechanism that's used? Well, first off, when the Heat engine spins up the stack, there's no need for this cfn-push-stats business, and there's no need for the key pair to be injected; all of that goes away. All we need to do is ensure that some user metadata is set on the instance. Then the Heat engine goes and creates the alarms that bound the high watermark and low watermark. It does that via the public Ceilometer RESTful API, which allows you to control an alarm's lifecycle and is served by the Ceilometer API service. Ceilometer also has, on each of the Nova compute nodes, a compute agent that talks to the local hypervisor and extracts information equivalent to what the cfn-push-stats script used to produce within the instance itself. We also have our alarm evaluator, which can be scaled in various ways: you can have a single one, or multiple instances of the service dividing the load amongst themselves. But in any case, the basic work pattern is very simple: talk to the API service, get the statistics over the configured time window, and figure out whether you've crossed the threshold or not. In the case where the threshold has been crossed, the webhook is called out to on that pre-signed URL, and Heat goes: aha, lovely, time for me to scale out the stack. The extra instances that are created have the same user metadata set on them, and that's what represents the group membership. So you get the scale-up effect as before.
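The callback itself is just an HTTP POST of a small JSON document to the pre-signed URL. Field names varied slightly across releases, so take this as an illustrative sketch rather than the exact wire format:

```json
{
  "alarm_id": "f9b9a7e5-...",
  "previous": "ok",
  "current": "alarm",
  "reason": "Transition to alarm due to 5 samples outside threshold, most recent: 96.27"
}
```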
Drilling down a bit into the Ceilometer box, we see a number of interactions between the actual components that make up the Ceilometer pipeline, and these interactions are totally central and core to Ceilometer's mission, whereas when this same kind of interaction was happening within Heat, it was something not really core to their mission and not really what they wanted to be involved in. The mechanism we use to evaluate alarms is this kind of arm's-length idea. You've got your alarm evaluator service, which calls out to the API service; first off, it needs to grab the alarm definitions, those rules that we saw on an earlier slide: all of the information describing the threshold, the duration of the evaluation window, the statistic, the comparison operator. All of that is made available to the alarm evaluator via the API service.

Once the alarm rules for the set of alarms currently assigned to a particular evaluator instance are retrieved, the next thing it needs to do is go to the statistics API for each of the alarms, get the aggregated statistics over the period configured in the alarm rule, and then determine whether the threshold has been crossed or not. When the threshold has been crossed, the alarm evaluator emits an RPC message over AMQP to another service, which I didn't clutter the diagram with: the alarm notifier. That's responsible for doing the actual notification, calling out on the pre-signed URL, in effect constructing the little fragment of JSON that I spoke about earlier, which Heat receives, describing why the alarm actually fired. It calls out to the auto-scaling API, and the auto-scaling API then takes care of applying things like the cooldown period.

Now, in order to simplify life for Heat, we actually continually notify Heat on every evaluation cycle as long as the alarm state persists, and that simplifies their implementation of the cooldown period. We provide the initial "hey, we've just gone into alarm," and then a minute later, "hey, we're still in alarm," and a minute later, "yep, still in alarm." Eventually, after enough of these repeated notifications have been received, the cooldown period expires and the actual auto-scaling logic kicks in. So we've made life as easy as possible for Heat. The old implementation still exists; I suppose it's deprecated at this stage, and eventually we'll be able to remove it completely, having taken a lot of non-relevant stuff out of the mix as far as Heat is concerned. Okay, so I'm going to hand you over to Julien.

It sounds simple when Eoghan explains it, but it's kind of a huge piece of work that we did during only one cycle, and this is something I want to emphasize. It involves two projects, and it adds something that wasn't there already; it's not as simple as it might look at first sight. We did something in one cycle that brought two different projects to work together to improve the whole stack, so it's kind of a huge achievement, I guess. Well, I was just the PTL of Ceilometer for this cycle and I mostly had to follow along, so I didn't write any of this, and I don't know it as well as Eoghan does. But this is exactly what he explained to us, as Ceilometer developers, six months ago, and this is why we were able to do it so fast. When we decided to help the Heat folks and bring them this feature they needed, we were lucky enough to have at least two developers who work on both Heat and Ceilometer.
So that eases things a lot. They came to us; I mean, Eoghan came and booked a few sessions during the summit in Portland and explained exactly this to us: this is what we need, and this is how we're going to build it. So we just had to listen. We trusted him in terms of code and architecture and things like that, and all of us validated it. It was then very easy during the development process to see the patches going through; reviewing the code was mostly just nitpicking about small stuff. We didn't have to ask ourselves: is this a good architecture, or is this something we really want? We knew we wanted it, and we knew how it was supposed to be done from the beginning, because we had good communication between both projects and a very good shared understanding of things. So I think it comes down to communication between the OpenStack projects and within our own programs, and I'm pretty proud of what we did in only one cycle.

If I may add, what's really interesting here is that the end result, alarming, which serves Heat, can also be seen as just alarming: you don't need to have some kind of CloudWatch to do alarming. Another thing that happened, which we didn't put in the slides, is that eNovance had a customer (you probably know them by now, they're called Cloudwatt) that had this need to do alarming, just alarming, not for Heat yet, but for other purposes. These two needs, the Heat need and our customer's need, suddenly matched, and we were able to spread the work across two separate teams that jointly worked on delivering this. That's because in OpenStack, in general, I think we build things in a generic way: we knew it would be used mainly by Heat, but we didn't build it only for Heat. So, like Nick said, you can use it standalone, and that's great.

So now that we know it works, we can talk about what we're going to do next. I think this cycle it's not going to be as big as what we did, because we have something pretty basic but working, and we're going to add a few things to it. Like adding more metrics, but we keep doing that in Ceilometer anyway, so that's not particularly tied to the alarming part. Heat also doesn't yet support the combination alarms we implemented late in the cycle; I guess there wasn't enough time to add support for them. We also use a pretty simple statistical analysis, and we'd like to improve it, for example by excluding data points that are of low quality or out of the trend. We're going to work with other projects on this; we had a session a few days ago about it, and I think it's going to be another good inter-project experiment, because we're going to implement things they want us to do, and they'll leverage this work in the same way. We also discussed time constraints for alarms; I don't know if that will land any time soon, it was discussed in a previous cycle, but it's something that comes up sometimes, so it might be useful. And we're currently using a pretty simple security scheme for the webhooks; we'd like to improve it, possibly to also use the EC2 signature for this.

So if you've got questions, I think we're ready to answer. There's one question, or two questions, over there. Yeah, maybe I'll take the one about the different resources.
So basically, auto-scaling in Heat is very instance-centric; it's all about scaling out instances. But one of the things the Heat folks want to do in the next development cycle is to make the auto-scaling aspect a bit more general-purpose, so that it's decoupled slightly from your use of Heat templates and you can effectively use it to scale up any type of resource, if I understand correctly. So you could add extra volumes, not just extra instances; you could maybe add extra things that are running database-as-a-service endpoints, or whatever it is. I'm not fully cognizant of exactly what they're planning to do, but there's definitely an appetite within the Heat project to make this mechanism a bit more generic and a bit less dependent on your use of templates. There might actually still be a template underneath the hood, but it wouldn't be visible to the user; it would be a kind of default template that's generated, which just wraps a single resource.

There was another question? Yes. So, the API was extended to support writing samples from anywhere into Ceilometer, so you can now inject values from wherever you want. And the way it's done, those values are constrained to the tenant environment, so that you can't leak into the billable information, which would be horrible.
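To make that sample-injection answer concrete, here is a rough sketch of posting a user-defined sample to the v2 API; the endpoint, meter name, and resource ID are all made up for illustration:

```python
# Sketch: inject a custom sample into Ceilometer via the v2 API.
# The sample lands scoped to the caller's tenant, as described above.
import json
import requests

CEILOMETER = "http://ceilometer.example.com:8777"  # hypothetical endpoint
HEADERS = {"X-Auth-Token": "<keystone-token>",     # placeholder credential
           "Content-Type": "application/json"}

samples = [{
    "counter_name": "my.application.queue_depth",  # made-up meter name
    "counter_type": "gauge",
    "counter_unit": "jobs",
    "counter_volume": 42,
    "resource_id": "my-worker-01",
}]

resp = requests.post(CEILOMETER + "/v2/meters/my.application.queue_depth",
                     data=json.dumps(samples), headers=HEADERS)
print(resp.status_code)
```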
And just to extend on that, there are other cases as well. The idea of having a compute agent sitting on the compute node talking to the local libvirt daemon is a little bit libvirt-specific in a sense, even though in the code it's actually quite generic and extensible, so that we can have multiple different virt inspectors doing similar things for different hypervisors; it's just that it was done initially for libvirt. But then you've got the case that Julien mentioned, where you've got bare metal involved: a bare-metal host that's managed by Ironic as if it were an instance. In that case you've got no direct equivalent of a local hypervisor, there's no libvirt involved, so I think the agent's interaction will necessarily be remote, if my understanding is correct. In fact, I think we are trying, in the end, to remove the compute-node agent, right? We would love for Nova to be automatically generating the data without us having to put an agent everywhere. I mean, if Nova were to emit these notifications with the information we need, at a correct and predictable cadence, then that would obviously make life a lot easier for Ceilometer.

Yes? So, let me give you an example of where that work might be made a bit simpler. Say, for example, you wanted to alarm on the rate of disk IOPS. Currently, what Ceilometer meters is the cumulative number of bytes, or disk write requests, or read requests, or whatever. That's not really suitable for alarming against a static threshold, because you've got a monotonically increasing cumulative value. But we have a generic transformer that allows you, very easily, in a declarative way, just by writing some YAML in the pipeline configuration, to say: okay, this value is currently being collected as a cumulative value, and that's not really what I want; or I do want it, but additionally I want to be able to work out the rate per second, and alarm on that. Without writing any new code, you can configure that into the system, so that in addition to the cumulative value, a rate per second is also metered, and then you can alarm on that value. So that's a special case where we are already collecting data that's related to what you require; we're just not collecting it in a form that's suitable for alarming on. But if it was something totally new, a metric that we're not currently gathering, well, then you can't alarm on it, because we just don't have the data. Either you would have to contribute an agent to gather it, or extend an existing agent, or the community would have to do it; but either way, that data would have to somehow be collected by Ceilometer into the metering store and made available via the API before you'd be able to alarm on it.

So, I'm sorry, I'm going to have to interrupt you, because I think we've run out of time; we've got maybe 28 seconds left, and I would like not to cut into the next session. Maybe we can take additional questions outside of the room in a moment. Thank you very much.