Good afternoon, everybody. I know it's almost the end of the summit, so you're probably busy planning to go home, so we'll take half an hour only. This talk is about the App Autoscaler. How many of you have heard of the App Autoscaler project? Okay, good. App Autoscaler is a Cloud Foundry incubator project, a collaboration between SAP, IBM and Fujitsu; Fujitsu has since exited, so it is being managed by SAP and IBM right now, with a couple of developers from the SAP side as well as the IBM side. I'm Tanmay Pal from SAP; I work as a senior developer at SAP's Bangalore location, and we have Rohit, also from SAP Bangalore. I'll give you a slightly advanced topic on the Autoscaler: "bring your own metrics", the custom metrics facility of the Autoscaler. So what is App Autoscaler? App Autoscaler is a Cloud Foundry extension that provides the capability to scale your application in two different ways: dynamic scaling based on resource load, or schedule-based scaling. Dynamic scaling means that if you have pushed your application and its memory consumption, response time or throughput goes high, you can scale up your application instances, and if it goes below some threshold you can scale it down to reduce cost. There are also use cases where you want to scale up your application at a specific time or on a recurrent schedule. Suppose you have a payment-related application, a payroll service: you expect maximum load at the end of the month to generate the payslips, so at those times you can scale up your application, and for the rest of the month you can scale down to the minimum instances. Those cases are covered by schedule-based scaling, and in schedule-based scaling we have two modes: either at a specific
time, once, say I want to scale up on the 13th of October at eight o'clock, one time; or on a recurring schedule, say every month on the morning of the 29th. That is also possible using schedule-based scaling. Now suppose you do not have the App Autoscaler project in your deployment at all. How do you manage your application? You push your application, you keep monitoring it, and if you think scaling up or down is required, you manually do a `cf scale` to the required number of instances, and then keep monitoring again. It looks very simple, but it is not, because you really don't know when your application's resource usage will be very high or very low. Maybe in the middle of the night your application's memory consumption becomes very high, you didn't check, and your application crashes; or maybe you didn't check your application for a long time, three or four instances are running that aren't required, and that incurs more cost. So it is difficult to manually monitor all the time and perform manual scaling operations. If you use the App Autoscaler, what do you have to do? You just create an App Autoscaler service instance and bind your application to it with a policy. A policy is a set of rules where you specify when you want to scale up or scale down. You can specify, say, that you want to scale up when your web application's response time goes above a certain amount, or when the memory usage of the application is above 80 percent, and scale down when it is less than 35 percent. All those things you can specify in the policy; you just pass the policy during the service binding, and then you relax, you don't need to do anything. The App Autoscaler will keep monitoring your application's load and what is the current resource
consumption, and based on that it will scale up and scale down. And if you think it might hit you heavily on cost because it could just keep scaling up and up, it's not like that: you can specify the maximum number of instances. Suppose you specify a minimum of 2 and a maximum of 10; even if your application load keeps increasing, it will not scale beyond 10 instances. Now, what different metrics does App Autoscaler support as of now? We support the basic standard container metrics. Memory usage in MB: you can say, I have given one or two GB of memory to my application, and if it consumes 1.5 GB I want to scale up. But there are situations where an absolute value in MB doesn't work well; rather than an absolute value, you want: if it is above 80 percent, scale up, or below 30 percent, scale down. For that you can use memory usage in percentage. Similarly, response time: if you want your web application's response time to stay at, say, five seconds or less, you can specify that with the response time metric, in milliseconds. Also throughput: throughput is how many requests per second your application can handle, and above some threshold you may want to scale up, or below it scale down. Now everything seems okay here, but the problem is that it is not possible to actually measure the resource load of every application with only these four metrics. It's not at all possible, because we have a wide variety of applications using different resources, and we are only considering memory, response time and throughput, which is not fair enough. I can give you some examples. Suppose you have a Java application: heap memory, garbage collection statistics or JVM statistics are very important for deciding when to scale up or scale down, because those are the important performance
metrics for a Java application. But if you are using a message queue, say RabbitMQ, then the number of messages already in the queue, or in the ready state, messages sent but not yet acknowledged, is a very crucial parameter. If you are using an in-memory store, say Redis or something like that, then the latency, hit rate or evicted keys of your in-memory store are what you would specify. For a Node.js application, JVM or heap memory is not the ideal parameter to consider for resource load; the event loop statistics are what matter for Node.js. Similarly, we can have any number of such parameters, like HTTP error rate, or maybe you want to scale up based on errors in the log: there are situations where your running application emits certain types of log entries, which means something is wrong with the application, and you want to scale it up. That is also possible, or maybe the HTTP status code. The problem with these metrics is that, yes, we need to take these parameters into account, but they are not readily available in the container. So we needed some way to consider these performance metrics when scaling applications up or down. How did we achieve this? We introduced a completely new thing called custom metrics: rather than memory, response time or throughput, you specify your own metric, give it to the Autoscaler, and the Autoscaler will handle the scaling operation. This is the architecture diagram, stepwise. Once you have a running application in Cloud Foundry, you create a service instance of App Autoscaler and bind your application to it. Once the binding happens successfully, you get credentials containing a username, a password and a URL; what the URL is for, I'll explain later. Then, once you get
those credentials, it becomes your responsibility to send those metrics to the App Autoscaler through a simple HTTP REST API, because the container doesn't have this custom metrics information. Once you send them to the App Autoscaler, we have a MetricsForwarder component which converts them to Loggregator metrics: there is a built-in Metron agent inside the MetricsForwarder which forwards your metrics to Loggregator. Once the metrics are with Loggregator, everything is the same: whether it's a container metric, a standard metric or a custom metric, it's all the same. We have a MetricsCollector component which keeps reading your metrics, standard or custom, and does some aggregation at the instance level, because you might have multiple instances. Once the MetricsCollector has collected and aggregated them, the Autoscaler decides: it goes back to your policy, checks what thresholds or rules you defined, and based on that it asks the scaling engine to perform the scaling operation, and the scaling engine triggers the scaling of the instances. Now, I said that during the service binding you will get some credentials. They would look something like this: once you bind your application to the Autoscaler service instance, you get a credential with the username, password and URL. Ultimately you have to make a REST API call, so that URL is the base URL for the REST API endpoints, and it is authenticated, so you use the username and password for authentication. And what does the REST API look like? It is a simple POST API: `/v1/apps/{app-guid}/metrics`, where the app GUID is known to you; you just append this path to the URL you got from the service binding. And this is the request body. In the request
body, you have to specify the instance index, because you might have multiple instances of a single application running, so you provide the instance index so that we can aggregate those metrics at the application level. At the same time, you can send an array of metrics, as many as you want, each with three things: name, value and unit. The name is the name of the metric, and it should exactly match the name of the metric you specified in the policy, otherwise there will be a mismatch. Then there is the value; as of now we support gauge metrics, so you can specify any gauge value. And of course the unit: you specify what the unit of your metric is. This is the REST API structure; you just keep calling it at a certain frequency from your application, and then the Autoscaler will take care of scaling your application up or down based on the data, the metrics, you provide. Maybe a small demo will help you understand better, so we'll have a very small demo. I'll hand over to Rohit to give you a small demo. Hi, good afternoon. I'll start the demo. I have already logged in to one of our dev environments. Is it visible, or should I increase the font? I think this should be good enough. Yeah, so I've logged into one of our dev environments, as you can see. Let's quickly check the marketplace. It's quite difficult to read; I'll decrease the size a little. Yeah, so you can see there are three plans. I'll just use one of those plans: I'm using the standard plan and creating a service instance, autoscaler-standard, and the service instance is created. I have another service instance, a RabbitMQ, which I'll use for the demo, and I have already pushed an application over here. So let's bind the Autoscaler to this application and start the application. The application is started. It's a simple application; it has a UI which shows the RabbitMQ queue length, and you can do simple REST API
calls and it increases or decreases the queue length shown in the UI. Let's make it zero; we'll consume everything. Now we have bound the application; let's see the application environment. Here you can see the credentials it got after binding to the Autoscaler: the password, the URL and the username, used with basic auth. This application sends the job queue length, the message queue length, to the Autoscaler as a custom metric, using the endpoint you saw, and it keeps sending at five-second intervals. So the Autoscaler gets your metric, and your metric is being monitored by the Autoscaler. Let's apply a simple policy to it. This is not a standard cf command; it's an Autoscaler CLI plugin command, attach-autoscaling-policy, after attaching the autoscaling API. So I have attached a policy; let's see what the policy looks like. As you can see, the policy has a minimum of one instance and a maximum of four, and the metric type is jobqueue, which is not one of the standard types, so it's treated as a custom metric. The threshold is 100: if the metric is above 100 it should scale out, and if it is less than 40 it should scale in. Let's increase the queue size; I'll start monitoring the application. Currently it has one instance running. Maybe we can go back to the policy, and meanwhile I'll explain the policy a bit more. If you attended the earlier session you might already know this: there's something called the breach duration in seconds, which is the time during which your aggregated metric has to be above the threshold before scaling kicks in, and there's the adjustment parameter, which you can see here as plus one: if your average metric is above the threshold during that time, it scales by one instance. The rest are the standard operators. There's also one more thing, the cooldown: once your application scales up or down, your aggregated metric takes some time to get normalized, and for that reason the Autoscaler lets you provide
this and have your application wait some time, so that metric normalization is done before scaling again if it is needed. As we can see on the left-hand side, it has already scaled: we have two instances running over here. Let's quickly check the application history. This is also one of the CLI plugin commands, autoscaling-history, and it shows you the history of all the scaling done by the Autoscaler. As you can see here, it increased by one instance because the job queue was greater than 100 messages for 60 seconds. I'll hand it back to Tanmay. So, in the demo we have seen here, what this sample application was doing is that it was bound to an Autoscaler service instance and internally, every five seconds, it kept sending the job queue length to the Autoscaler. The Autoscaler had a policy with a jobqueue rule whose threshold said: if it is above 100, scale up. Through the app we set the queue size to 150, which is above 100, and it scaled up. It's that simple, and the reading of the binding credentials and the sending through the REST API is coded into the demo application. Any questions? [Audience question] No, it's up to you; you can define any metric you want. We have a standard set of metrics: memory usage, memory percentage, response time and throughput. Apart from these four, if you specify any metric in the policy and send that metric through this REST API, it will be taken care of. [Audience question] Yes, if you use one of those existing standard metric names, it will show an error during the binding itself, saying it is a standard metric, don't use this name. Anything else? All good? Okay, thank you. Thank you very much.
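The policy from the demo can be written out concretely. A sketch of what such a policy JSON might look like, assuming the field names used by the open-source app-autoscaler project (`instance_min_count`, `scaling_rules`, `breach_duration_secs`, and so on); the jobqueue metric name, the 1-to-4 instance range, and the 100/40 thresholds come directly from the demo:

```json
{
  "instance_min_count": 1,
  "instance_max_count": 4,
  "scaling_rules": [
    {
      "metric_type": "jobqueue",
      "breach_duration_secs": 60,
      "threshold": 100,
      "operator": ">",
      "cool_down_secs": 60,
      "adjustment": "+1"
    },
    {
      "metric_type": "jobqueue",
      "breach_duration_secs": 60,
      "threshold": 40,
      "operator": "<",
      "cool_down_secs": 60,
      "adjustment": "-1"
    }
  ]
}
```

Because `jobqueue` is not one of the four standard metric names (memoryused, memoryutil, responsetime, throughput), the Autoscaler treats it as a custom metric and waits for the application to report its values.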
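The metric-reporting loop the demo app runs every five seconds can be sketched in a few lines of Python. This is a minimal illustration, not the demo's actual code: the base URL, credentials, app GUID and metric name below are placeholder assumptions, while the path (`/v1/apps/{app-guid}/metrics`), basic auth, and the request-body fields (`instance_index` plus an array of name/value/unit gauge metrics) follow the endpoint described in the talk:

```python
import base64
import json
import urllib.request

def build_metrics_payload(instance_index, name, value, unit):
    """Build the request body for the custom-metrics endpoint:
    the reporting instance's index plus an array of gauge metrics,
    each with a name (matching the policy), a value, and a unit."""
    return {
        "instance_index": instance_index,
        "metrics": [{"name": name, "value": value, "unit": unit}],
    }

def submit_custom_metric(base_url, app_guid, username, password, payload):
    """POST the payload to {base_url}/v1/apps/{app_guid}/metrics
    using HTTP basic auth from the binding credentials."""
    url = f"{base_url}/v1/apps/{app_guid}/metrics"
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Basic {token}"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

# Example payload, as the demo app would send every five seconds
# (app GUID and credentials would come from VCAP_SERVICES):
payload = build_metrics_payload(0, "jobqueue", 150, "messages")
print(json.dumps(payload))
```

In the demo application, the equivalent of `submit_custom_metric` runs on a five-second timer, reading the queue length from RabbitMQ and reporting it; the Autoscaler then aggregates the values per instance index and evaluates them against the policy thresholds.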