Good afternoon and thank you, thank you for joining the session. This is Ying, from IBM. I worked on the Bluemix Auto-Scaling service for several years, and now I am a contributor to the Cloud Foundry App Autoscaler project as well. Today I will cover two aspects. The first is from the end user's point of view: how to understand autoscaling and how to use the autoscaling service for your application. The second is from the service provider's point of view: how to deploy the autoscaler, how to add the autoscaler to your CF deployment, and how to enable the service.

First of all, autoscaling is used to help manage your application capacity. With or without autoscaling, your application operation will be a little different. But in both situations, as a prerequisite, you need to understand the performance benchmark of your application. You need to know your workload type, you need to understand your peak hours, and you need to know when the workload goes beyond your expectation and will introduce performance problems. Without autoscaling, you need to do the monitoring yourself, maybe with some monitoring tools, and you need to set some alerts. Once an alert is triggered, you run the cf scale command manually, and then you keep monitoring. With autoscaling, you just create a service instance, create a scaling policy, bind the service to your application, and attach the policy. Then the autoscaler will do everything for you.
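To make that contrast concrete, the manual path looks roughly like this with the standard cf CLI; the application name and instance counts below are just placeholders.

```bash
# Without autoscaling: watch the app yourself and scale it by hand.
# "my-app" is a placeholder application name.
cf app my-app          # check the current instance count and basic stats
cf scale my-app -i 5   # scale out to 5 instances when your alert fires
cf scale my-app -i 2   # scale back in once the peak is over

# With autoscaling, these manual steps are replaced by binding the
# autoscaler service with a scaling policy (shown later in the session).
```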
Here, I want to mention that when we talk about scaling, we mean scaling in or scaling out; that is, adding or removing application instances. We don't touch scaling up, which means adding more memory quota to your application. In CF, if you want to change the memory quota of your application, you need to shut down the current container and create a new one, and that introduces some downtime, so the autoscaler won't touch that area.

So maybe you are already familiar with create-service and bind-service, but you don't know how to define the scaling policy. That is what I want to talk about. We have two types of autoscaling. The traditional one is dynamic scaling. Here I show a simulated workload, the throughput over 12 hours. In this chart you can see the workload is fluctuating, it goes up and down, but in general the average value increases at the beginning and then decreases toward the end, near 4 p.m. According to your understanding of the application, you define two performance benchmarks: one is the upper threshold and the other is the lower threshold. The green area is fine. The yellow area is acceptable; it's normal, so we consider it fine as well. The red area is dangerous, and we want to add more instances to serve that workload better, and to reduce the instance count when it becomes green again. But as you noticed, the workload is fluctuating. Do you expect the autoscaler to be sensitive to every spike? I think no. If we were sensitive to each spike, your instance count would fluctuate as well. So how does the autoscaler handle that? In fact, we smooth the metric values with a stats window: we collect all the metric values during the window and take an average. If the average value breaches the threshold, we count that as a breach. Then we move the window a little bit, take another average, and see whether it continues to breach. If the breach lasts long enough, we decide it's time to trigger a scale out: you need to add more instances to your application.

And this is the result. After autoscaling, this is the same workload per instance. In the middle area, you can see the workload per instance is almost half of the previous chart, and in the area near 4 p.m. the numbers are quite similar to the previous chart. Why? Because in this area we scaled out to more instances, and then we scaled in again. Currently, we support four different metric types: memory used, memory utilization, throughput, and response time. Here I just want to mention that utilization is the percentage of your memory used.

Besides dynamic scaling, we also support schedule scaling. You can define different schedules; a schedule is a time duration. You can define a recurring schedule by day of week or day of month, you can define specific-date schedules, and you can define multiple schedules. If the schedules conflict, for example if you define that every weekend you need more capacity, but you also define that on Christmas Eve you need even higher capacity, then the specific-date schedule overrides the recurring schedule, since it is the more specific specification. Schedule scaling also does not conflict with dynamic scaling: while a schedule is in effect, dynamic scaling is still free to change your application capacity dynamically between the schedule's minimum and maximum.

Here I have a chart. When you define a schedule, you define a minimum instance count and a maximum instance count. These two values are quite useful; they prevent unlimited scaling, otherwise your instance count might keep growing. And there is a special one, the initial minimum instance count. Let me take an example. Say before the schedule takes effect you only have 30 instances, and you define the initial minimum instance count as 40. Then when the schedule takes effect, the scheduler will change your instance count from 30 to 40 directly. But if, before the schedule takes effect, you already have 50, and 50 is greater than 40, then the schedule won't change it. It just ensures that at the beginning of the schedule you have enough capacity to handle the increased workload.

So, here is our policy definition. It is a JSON file; I just show it as a table here to make it easier to understand. In the JSON definition, first of all you define the minimum instance count, the maximum instance count, and the scaling rules; these work as the default. Then you can optionally define the schedules: you define the time zone, and you define the recurring schedules and the specific-date schedules, as in the table here.
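To give a feel for the shape of that file, here is a minimal policy.json sketch based on my understanding of the open-source app-autoscaler policy schema; the field names (instance_min_count, scaling_rules, breach_duration_secs, and so on), the time zone, and all the numbers are illustrative assumptions, so check the project documentation for the authoritative schema.

```bash
# A minimal policy.json sketch: dynamic-scaling defaults plus one
# recurring weekend schedule. Field names and values are assumptions;
# verify them against the app-autoscaler documentation.
cat > policy.json <<'EOF'
{
  "instance_min_count": 1,
  "instance_max_count": 5,
  "scaling_rules": [
    {
      "metric_type": "throughput",
      "stat_window_secs": 300,
      "breach_duration_secs": 600,
      "threshold": 15000,
      "operator": ">",
      "cool_down_secs": 600,
      "adjustment": "+1"
    }
  ],
  "schedules": {
    "timezone": "Asia/Shanghai",
    "recurring_schedule": [
      {
        "days_of_week": [6, 7],
        "start_time": "08:00",
        "end_time": "20:00",
        "instance_min_count": 10,
        "instance_max_count": 40,
        "initial_min_instance_count": 20
      }
    ]
  }
}
EOF
```

The single scaling_rules entry mirrors the "second row" example translated next: a five-minute stats window, a ten-minute breach duration, a throughput threshold of 15,000, a +1 adjustment, and a cooldown.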
For the scaling rules, you can see there are some numbers and operators here. I can translate one into natural language so it is easy to understand. Take the second row: it means you focus on throughput as the metric type, you take the average throughput over every five minutes, and you check whether it is greater than 15,000. If it is greater than 15,000, that is a breach. If the breach continues for 10 minutes, it's time to add an instance, to add one instance. And after the scale out is done, you have a cooldown period. That is a waiting time that gives your system a chance to stabilize; no further scale-out action will happen during those 10 minutes of cooldown.

Okay, once we have a policy, we can attach it. There are several ways to do that. First, you create a service and then bind the service with a parameter: you can pass the policy file as the binding parameter. Another way is through our public API: you can do a curl command to attach your policy with a PUT. Here I just want to mention that the authorization header must be your CF user token. After you log in, you can get your user token with the cf oauth-token command; use it as the authorization header when you call our public API. We have other operations available on the public API as well: retrieve policy, update policy, delete policy, retrieve metrics, and retrieve scaling histories. Still, it is not that convenient to use the raw APIs, so coming soon we will deliver a command line. The autoscaler command line will work as a cf CLI plugin, and all the operations for policies, metrics, and histories will be supported there.

Here I list the steps to implement autoscaling. I don't need to go through everything here, but I just want to mention that, as a prerequisite, you need to understand your application before enabling autoscaling. You need to know the workload type, you need to know the benchmark, you need to know how to define the upper and lower thresholds, and you need to know how to define the schedules, if they are applicable.

Okay. Then, this is a deep dive into the autoscaler. First of all, how to deploy the autoscaler. Currently we have a BOSH release; you can download it from GitHub and do a BOSH deploy. Both BOSH v1 and BOSH v2 manifests are supported. You can also deploy it alongside your CF deployment; if you are working on BOSH-Lite, that approach is very quick. After you deploy it, you can enable the autoscaling service. I have a short demo that includes all these steps, so wait a minute. I recorded it on my laptop, so it uses the BOSH-Lite deployment. You need to create the release, upload the release, and then do the BOSH deploy; here, the bosh command means the BOSH v2 command line. In this deployment, the API server and the service broker will be updated. It does take time, so I sped up the recording so that it won't take too long for all of you. But anyway, it is quite easy: you only need three steps to enable it. Now the BOSH deployment is finished, and we can create the service broker. Here we register the autoscaler service broker endpoint on the router, so you can access it through its bosh-lite.com route. Then you enable service access. Now, here it is. Then the service binding: first you create a service; I already have an application here, and I already have my policy defined, so I just bind the service to my application. Then I can retrieve my application's policy from the API server; the endpoint is the autoscaler route on bosh-lite.com, and here is the policy, as we just talked about.
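For reference, those three enabling steps look roughly like this with the BOSH v2 CLI and the cf CLI. The environment alias, deployment name, manifest path, broker credentials, and broker route below are all placeholders; use whatever your app-autoscaler release and BOSH-Lite setup actually define.

```bash
# Step 1: deploy the autoscaler BOSH release (run inside the release repo).
# Environment alias, deployment name, and manifest path are placeholders.
bosh create-release
bosh upload-release
bosh -e lite -d app-autoscaler deploy path/to/autoscaler-deployment.yml

# Step 2: register the autoscaler service broker with Cloud Foundry.
# Broker credentials and route are placeholders for your deployment.
cf create-service-broker autoscaler BROKER_USER BROKER_PASSWORD \
  https://your-autoscaler-broker.bosh-lite.com

# Step 3: make the service plans visible in the marketplace.
cf enable-service-access autoscaler
```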
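The binding step from the demo is then just the usual service workflow, with the policy file passed as a binding parameter; the service offering, plan, instance, and app names below are placeholders.

```bash
# Create an autoscaler service instance. The offering and plan names
# depend on how the broker was registered; these are placeholders.
cf create-service autoscaler autoscaler-free-plan my-autoscaler

# Bind it to the app and attach the scaling policy in one step by
# passing policy.json (from earlier) as the binding parameter.
cf bind-service my-app my-autoscaler -c policy.json
```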
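And the public API calls mentioned above look roughly like this with curl: a PUT to attach or update the policy, and GETs to read the policy and history back, authorized with the token from cf oauth-token. The host name and the /v1/apps/... paths are my assumption of the API server layout; check the published API documentation for the exact endpoints.

```bash
# Look up the application GUID and grab a CF user token.
APP_GUID=$(cf app my-app --guid)
TOKEN=$(cf oauth-token)   # prints "bearer <token>", used as-is below

# Attach (or update) the policy via the public API.
# The endpoint path is an assumption; confirm it in the API docs.
curl -X PUT "https://your-autoscaler-api.bosh-lite.com/v1/apps/${APP_GUID}/policy" \
  -H "Authorization: ${TOKEN}" \
  -H "Content-Type: application/json" \
  -d @policy.json

# Retrieve the policy and the scaling history the same way.
curl "https://your-autoscaler-api.bosh-lite.com/v1/apps/${APP_GUID}/policy" \
  -H "Authorization: ${TOKEN}"
curl "https://your-autoscaler-api.bosh-lite.com/v1/apps/${APP_GUID}/scaling_histories" \
  -H "Authorization: ${TOKEN}"
```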
Okay, going back to the slides. I want to go roughly through the architecture diagram of the open-source autoscaler. We have two entry points. The first is to create and bind the service from the CLI, through the autoscaler's service broker API. The other is using the API server for the policy, metric, and history operations. Once the application is bound and the policy is defined, the metrics collector starts to fetch the container metrics from Loggregator: it fetches the memory usage, and it fetches all the HTTP request counts and response times. We then aggregate this raw data into scaling metrics; the aggregation does the averaging. If you still remember that stats window, this is where we take the average. The event generator checks the average value against the threshold and tries to generate a scaling event. The event is then sent to the scaling engine. The engine checks whether the event can be handled right now; there are situations where the event will be ignored. For example, if you want to scale out but the maximum instance count has been reached, the scaling engine will not serve the event, since you already have the maximum number of instances. And if you are within the cooldown time, the scaling engine will ignore the event as well, since, as you defined, you need to wait a while before another scaling action. We also have a scheduler; the scheduler triggers scaling events to the scaling engine too. All this data is saved in a Postgres database, and we plan to add support for more databases in the future, but currently it is Postgres.

And the autoscaler has been deployed on the SAP Cloud Platform since July this year. My name is Pradeut. I am responsible, apart from everything else, for making sure the autoscaler runs fine on the SAP Cloud Platform. A lot of people have questions about whether it is production ready; we have been doing development on this for almost a year and a half now, and the answer to that is yes, it is production ready. We have had this deployed on the SAP Cloud Platform since June this year. It is currently available as a beta service, which means no SLAs yet, but we have ramp-up customers, internal customers, who have been using it for a variety of use cases, and we have very positive feedback on that. For the multi-cloud strategy that we have at SAP, we have this service deployed on AWS, Azure, and OpenStack, and GCP is coming up in the future, which means we will have it available for customers basically everywhere. Performance tests: another question that we get all the time is what scale we can support. We are currently running tests to make sure that, as a beta service, it is able to support at least 2,000 to 3,000 apps. The tests are still in progress, and the results are positive. Once these tests are done, we will probably put a final release up on GitHub so that people or platform operators can actually download and use it. So that is going pretty well. And with that final release, we will also go to GA on the platform, on the SAP Cloud Platform. Yeah, that's a short update from my side.

VMware platform? No, I don't have any updates on that yet. Anything else? Maybe we discuss that on the next slide. Do you have a slide for that? Okay.
Sorry, I didn't put in a slide for the future, for the coming-soon features, but I can do a short summary here. First of all, just as I mentioned, in the future we will deliver the CLI support. We will also publish our performance benchmark and do some scalability improvements. We will consider adding the CPU metric back into our supported metric types. And, as was mentioned, we will consider supporting custom metrics; that means your application can define its own metric and report it, and the autoscaler will then take its scaling actions according to your custom metric. We will also consider changing the deployment approach: we may deliver a way to deploy the autoscaler itself as Cloud Foundry apps, so it is not limited to the BOSH release. Yeah, that is our future plan.

Just to add to that, regarding the question about custom metrics: the session we had before this one, about sidecars, is the concept we are going to leverage to pull metrics out of those containers, so that services like the autoscaler, and others that need those metrics, can build their use cases on top. So there is kind of a dependency on that. Any other questions? Yeah, sure.

At the moment, you have quite a lot... Sorry, there is a very important person recording your magical voice, so try that. Yeah. At the moment, you have quite a lot of moving parts to the autoscaler, and I was wondering if there are plans to reduce that, or to keep lots of moving parts in the long term. Kill the microservices? To give more detail: there are currently lots of moving parts, and that gives us quite a lot of flexibility in terms of being able to have multiple different metrics collectors and so on. I think over time we would like to reduce the number of those parts. Part of the custom metrics work we were talking about is that it would be nice if everything just came from Loggregator, for example; then you wouldn't need that flexibility, everything would just be reading from Loggregator. So over time we would hope to do that. But at the moment, I guess, if you like microservices, we've got a few of them. You heard it first: less micro, less plural. One big service.

Hi, during the keynote one of our team members actually complained that the autoscaler does not respect open connections. So, was that you? Yeah. It's Diego's fault. It's Diego's fault; we have a no-blame culture. So the problem is a valid problem, right? When you autoscale stuff in the platform today, scaling down means you get fewer instances and the instances get stopped, and at the moment that doesn't coordinate with in-flight requests. That's actually not really an autoscaling issue; it's a generic problem that we don't have coordination with the router as we're shutting down instances. If you went to the Diego update earlier, you would have seen that one of the big items Diego is now working on is zero-downtime deploys and blue-green deploys. It's really those features in the platform that will stop autoscaling having this edge case. It is a valid problem with autoscaling right now, but the solution is going to be in Diego. I mean, if your app has long-running requests as a feature, you run the risk of being upset, because it's a cloud. The current behavior is that you get a TERM signal and a few seconds to shut down, so normally, if your process is well-behaved, that should be okay.
But yeah, if you've got long-running in-flight requests, then currently that's just a general thing that we don't handle well. If you pick another ecosystem: the Heroku platform restarts every container every 24 hours regardless of what you think you might be doing with it, in part to stop the idea that containers have a long life you should be able to trust. Just to add to that again: if your application is doing a lot of background jobs, or it is probably not RESTful... I mean, even if you don't use the autoscaler but just do a cf scale down yourself, you're going to face the same problem. It's really not an autoscaler issue. Thank you.