OK, I think we're going to get started. So thanks, everybody, for coming out today. We're going to talk a little bit about Senlin, which is maybe one of the OpenStack components that you haven't heard very much about yet. We'll hopefully solve that problem for you today and show you what's going on with Senlin. My name is Mark Voelker; I'm an OpenStack architect at VMware. I'm going to let my co-presenters introduce themselves.

I'm Qiming Teng, from IBM, a researcher working on the Senlin project.

Hello, everyone. This is Xinhui from VMware. I'm a development engineer, and I'm working with Mark Voelker and Qiming.

So, without further ado, just to give you a feel for what we're going to talk about today: we'll start off talking a little bit about requirements for resource pool management, then we'll talk about Senlin in particular, and then we'll do a little demo for you later on.

So let's talk about managing resource pools. When we think about clustering in OpenStack today, there are actually several services that do some sort of clustering in some form or another, and one of the things we'll see about Senlin is that it consolidates a lot of that work and puts things together. When you think about cluster management, a few things come to mind. One is simple manageability: you've got to understand which bits of your infrastructure are part of which clusters, so, very simply, cluster membership. You think about elasticity: in any cloud-native application that people are writing greenfield these days, one of the things you have going for you in the cloud is an expandable model where you can bring up new infrastructure on demand very quickly and add it to your cluster. Load balancing: a very common pattern in applications today is several, often stateless, services sitting behind a load balancer, either actively distributing work among them or, in other cases, in an active-failover scenario. The more cloud-native applications these days are moving to a model where the stateless pieces of the infrastructure are horizontally expandable, so load balancing is an important part of the general requirements. We also look at flexibility. Applications out there today follow a lot of different patterns and have a lot of different ways they may want to deal with clustering. We talked just a second ago about active-active versus active-passive; there are a lot of other patterns as well, especially when you look at changing the dynamics of a cluster in response to changing conditions in the environment. When a lot of new load gets thrown at your application because it's an e-commerce platform running Black Friday sales, maybe you need to expand horizontally to a lot more nodes. In other cases, the more economical thing to do may be not to expand the number of nodes but to change the size of those nodes, or to move them to different zones where they get better IO, or other dynamics can come into play. And if you're really optimizing the application, those things end up turning into dollar signs at some point as well, in many cases: spinning up lots of new instances may be relatively expensive compared to, say, moving them to a different zone. So we want some flexibility.
We also think about flexibility in terms of the things that determine the dynamics of the cluster. Maybe the best indicator of when I need to change the cluster, say expand it out horizontally, is not necessarily how pegged the CPUs are. Maybe it's something else; maybe something inside the application itself is the trigger to expand or contract the cluster. And those use cases can be very different between applications. So that's something we want as well: a lot of flexibility in an engine for cluster management. And last, we want to think about extensibility. We all know there are lots of different types of resources that we deal with in OpenStack today, and that set is ever expanding, especially with the Big Tent; there are all kinds of new projects coming into play and all kinds of new dynamics out there. Up on the keynote stage this morning, we heard a little bit about some of the container bits that are becoming very prevalent in the OpenStack world. So we need something that's going to be extensible as well.

So let's think a little bit about what we have today for cluster management in OpenStack. Primarily, the way people think about that today is with Heat. Heat is essentially an orchestration service, and inherent in that it has an auto-scaling capability, so we do have some auto-scaling and cluster management capabilities in Heat today. If you're familiar with Heat, it's basically an auto-scaling group that you create in your Heat templates, plus a scaling policy and an alarm that triggers the auto-scaling actions (sketched below). The Heat auto-scaling model was more or less taken from the AWS model, and it actually doesn't have the full capabilities that AWS has today, so it covers a fairly limited set of use cases. At the end of the day, it has proven to be a minimally viable product that suits a whole lot of the use cases out there, so it's a great start, and when you look at the things it can talk to today, it's a good minimal set for auto-scaling. However, Heat's mission really isn't to care about auto-scaling; it's a piece of the puzzle, but what Heat really wants to do is orchestrate things, and auto-scaling just happens to be a component of that. When we look out across the rest of OpenStack, there are other components in a similar situation: they need some sort of cluster manageability, but it isn't core to their mission. So when we started looking at the OpenStack ecosystem, one thing that became apparent was that maybe we need to think about cluster management as a first-class service that everything else can tie into. And so the Senlin folks here have been doing a lot of talking with the Heat folks about taking the auto-scaling capabilities that are in Heat today and offloading them to a new service, so that in the future the set of capabilities offered there can be expanded without putting too much load on the Heat folks themselves. Okay.
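For readers who haven't seen it, the Heat auto-scaling model described above looks roughly like the following. This is a minimal sketch, not the speakers' template: the nested member template `web_server.yaml`, the thresholds, and the resource names are all placeholders, and a real template would typically add metadata matching and a scale-in counterpart.

```yaml
heat_template_version: 2015-10-15

resources:
  asg:
    type: OS::Heat::AutoScalingGroup
    properties:
      min_size: 2
      max_size: 10
      resource:
        # hypothetical nested template describing one member of the group
        type: web_server.yaml

  scale_out_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      auto_scaling_group_id: {get_resource: asg}
      adjustment_type: change_in_capacity
      scaling_adjustment: 1
      cooldown: 60

  cpu_alarm_high:
    type: OS::Ceilometer::Alarm
    properties:
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      comparison_operator: gt
      threshold: 80
      # when the alarm fires, signal the scaling policy's webhook URL
      alarm_actions:
        - {get_attr: [scale_out_policy, alarm_url]}
```

The basic case fits this model well; the richer requirements discussed next (deletion criteria, cross-AZ placement, application-level triggers) are harder to express in it.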
That also opens a lot of interesting doors for us, because if we make auto-scaling and cluster management first-class things in their own component, with that as their core mission, there's a lot of room to expand what we offer. Maybe we go beyond the set of things we have in Heat auto-scaling today. Maybe we go to a superset of what's available in AWS today, because in many cases we're deploying OpenStack in private cloud environments where very opinionated choices have been made about the underpinning infrastructure, which may have capabilities we could expose that would actually be advantageous for folks there. Okay. So, lots to think about here.

In the course of doing all this, listening to the end-user community always comes into play as well. As we've talked to customers out in the field and to people operating OpenStack today, some requirements have come up that we thought we'd talk through, in terms of what they would really like to see for cluster management and auto-scaling in the future. The first one is cross-availability-zone placement. You can obviously see how applications living in a cloud-native world want to span different failure domains while still being managed by a central service that knows about their health. Very similarly, we may have cross-region placement needs, where applications running on OpenStack clouds need to run on more than one continent, or on opposite sides of the country, or, again just for resiliency, in different geographies so that an earthquake doesn't take down the whole application, for example. Anti-affinity placement is pretty self-explanatory and should be familiar to most folks. Choosing a specific node to delete when scaling in: a lot of times when we talk about auto-scaling we think only about the expanding use case, where demand goes up. Equally important, especially once you attach the dollar signs, is scaling back down when the demand goes away. Some applications out there will make very opinionated choices about which nodes they kill, and if you think about that in the context of being very aware of the underpinning infrastructure, it makes a lot of sense: maybe when I scaled up my application, the new nodes I brought online went into, say, a zone with more expensive, faster IO because IO is my bottleneck, or onto better hardware, or into a more expensive availability zone. So maybe it's those nodes I want to get rid of first when I scale back down, because I don't need that extra capacity and I certainly don't want to pay for it anymore. Triggering auto-scaling with application-level metrics: a lot of the auto-scaling that happens today, both in OpenStack and elsewhere, looks at infrastructure-level metrics, so CPU, RAM utilization, or IOPS on the nodes that are running. But in some cases it's the application itself that needs to signal whether it's healthy or not, and there may be very specific app-level things involved. That goes back to the concept of flexibility: if we're going to talk about application-level metrics, then we need a pretty generic processing engine to be able to handle them.
Manual scaling: we talk about auto-scaling, but we also talk about manual scaling, because even though the application is healthy right now, I may know that next week I'm going to start a big sale on my e-commerce site and I want to scale out ahead of time to be ready for it. So, the ability to do manual scaling. Automatic node recovery: we've talked about expanding and shrinking a cluster; what we may also want is some level of health monitoring within the cluster, so we know when a node goes from a healthy state to an unhealthy state, or maybe disappears completely, and then we automatically bring it back online. Again, something you saw a little bit of this morning on the keynote stage. Migrating nodes from a standby cluster for rapid provisioning: when we think about how quickly we can respond to auto-scaling events in the real world, we all know it takes a little while for a server to come online; even containers still take a little bit of time. One of the traditional ways application developers have dealt with that is to keep a warm pool of resources that you can immediately transfer into an active cluster to add capacity very quickly, and then backfill the pool. In fact, the OpenStack Infra team does exactly this in the real world for managing OpenStack CI today with nodepool. And also soft scaling as well. Okay.

Okay, thanks Mark for the quick introduction. Next I will give you a quick overview of the Senlin project. We didn't start everything from scratch; the whole idea was offloaded from Heat. Before we started this project, we had several rounds of discussion with the Heat core team about whether this was the right thing to do, whether we should do auto-scaling again inside Heat or start a new service. After several rounds of discussion, we decided that starting a new project was the right thing to do, because it would give us quicker delivery: if we did everything inside Heat, yes, we would get some strict code review, but it might take forever for the whole thing to land. By starting something new, it actually took us only about a year to get the whole thing built, and now Senlin is an official project in OpenStack. When we started, it was a brand-new project, so we had to build everything, every single line of code, ourselves. Sometimes we steal, sometimes we borrow, when the license permits. The first thing we thought about was what to do as a first step, because we have to attack the problem step by step. The first step is some kind of group management: you create individual nodes, you group things together, and you can manage the membership of those groups manually. Then you provide some primitives so that you can scale that group manually, and easily. Then you try to introduce some intelligence into the service, to make the scaling operation a little bit smarter, a little bit automatic. That's the philosophy we have. Speaking of resource pool management, we looked around OpenStack to see if there was anything we could leverage, and no, there is no existing grouping or collection service there. OK, so we found a missing piece in OpenStack; why don't we just build a collection service? With that collection service, we can do auto-scaling.
With the redundancy provided by a resource pool, we can also provide high availability; all high-availability solutions are based on resource pools, on resource redundancy, so that's a natural fit. And we can make the resource pool load balanced. But sometimes you don't need all of these advanced features; you may need just one or two of them. You may want an auto-scaled cluster but no load balancing, or a load-balanced cluster that you don't want auto-scaled. All these features are orthogonal; they are independent of each other. So in this kind of service, auto-scaling and auto-healing are just usage scenarios; the basis is still a collection service. And speaking of the objects we can manage, it can be anything: a Nova server, a Heat stack, a Cinder volume, a floating IP pool, whatever. We don't care. We want to build a foundational service that allows you to manage resource pools. That's the whole idea.

This picture shows the high-level architecture of the Senlin project. We have a client talking to the Senlin API in a RESTful way, and the Senlin API talks to the backend Senlin engine via RPC. You can actually deploy more than one Senlin engine, so scalability won't be a bottleneck. The engine is architected to manage different kinds of resource types. We use an abstraction called a profile: a profile is basically the driver that tells the Senlin engine how to create, how to delete, and how to update an object, and that object can be any of the things I just mentioned. To make the engine a little bit smarter, we also introduced policies. The word "policy" has been abused in many ways, and we are abusing it again: a policy is any set of rules you want to enforce, to be checked when you perform some operation on a cluster. That's the whole idea.

This page shows some examples of the profiles and policies we provide. Eventually, we hope Senlin can help you manage clusters of physical machines, virtual machines, Heat stacks, and even containers. That's a very ambitious goal, and you can do all of these things via profile plugins. Speaking of policies: today, in our 1.0 release, we already have a placement policy that lets you specify cross-region and cross-availability-zone placement; a deletion policy that lets you specify the criteria to enforce when removing nodes from a cluster; and a scaling policy, which is very similar to the Heat implementation and the Amazon specification. We also have some preliminary support for cluster health, but it is still under development; hopefully by the end of the Newton cycle we can come up with a whole story for maintaining the health of a cluster. For load balancing, we have supported LBaaS v2 from the very beginning; we don't support LBaaS v1. A batching policy is something we still need to figure out, because deploying a large cluster may involve a lot of invocations of the backend service. For example, if you are creating a cluster of 1,000 Nova servers, you don't want to send out 1,000 VM-create requests to the Nova API; that's a DoS attack. Those kinds of things we have to control. OK, I'm going to go a little bit quicker. This next diagram is a little bit complicated,
but it shows the whole architecture design of the Senlin server. The green boxes are the Senlin API and the Senlin engine; those are the core components of the Senlin server. All the other components are modeled as plugins. So when you want to extend the Senlin engine to create and manage resource pools of a different type, you can simply write your own profile implementation. Today we have built-in profiles for Nova servers and Heat stacks, and since we support Heat stacks, we can basically support any resource type that Heat already supports; we don't want to reinvent the wheel. We have had some requests to manage web application clusters; that's doable, but it's not yet on our agenda, for example. When the Senlin engine talks back to OpenStack, we depend on the OpenStack SDK. That is our only dependency on OpenStack: we don't depend on the Nova client, Heat client, Cinder client, or Neutron client. The OpenStack SDK is the single dependency we have today, and you can replace it with a dummy driver; if you use a fake driver, you can test the whole service very quickly. That's another advantage you get from the driver model. The upper-left corner shows a box named the receiver. That's how you trigger Senlin cluster operations from an external event system, an alarm, or whatever monitoring service you prefer. It can be a Ceilometer alarm, it can be Monasca, or it can be a non-OpenStack solution such as Nagios or Zabbix. The only thing we need to expose is a webhook URL, and from your monitoring service you can trigger whatever operation you want on a cluster. That's the design.

This page shows the operations we already support for the different objects. For clusters, you can create, delete, and update; add nodes and delete nodes; scale out, scale in, and resize; attach a policy and detach a policy. When a policy is attached, you can dynamically enable and disable it. Those are all the flexibilities we provide. The following slides give you a quick look and feel for the operations you can do from the Senlin command line. We have already implemented an OpenStack Client plugin, so even though a command reads `senlin profile-create`, you can also use `openstack cluster profile create` as an alternative, because across the community there is a movement to deprecate the per-project client command-line interfaces, and we are following that direction. You can operate on profiles, clusters, nodes, and cluster membership; you can manage your policies as objects; and you can manage the cluster-policy associations. You can also do some other useful things from the command line. I forgot to paste a screenshot here, but we actually launched a senlin-dashboard project last year; it's a Horizon plugin. So for most operations today we provide the command-line interface, and we also have a web interface, plus the API, for using the service.

This page shows one of the commands we support: cluster resize. You can see we are trying to make this command very flexible and very powerful. You can resize your cluster in many ways, by percentage, or by specifying the new exact capacity of the cluster. And if there are size constraints you want to respect when resizing, you can tell Senlin whether you want a strict resize operation or a best-effort one.
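To make the profile abstraction just described a little more concrete before moving on: a Senlin profile is driven by a small YAML spec. A minimal sketch for an `os.nova.server` profile might look like the following; the property names are from memory and the values (flavor, image, network, key name) are placeholders, so treat the details as assumptions rather than an exact schema.

```yaml
# Minimal Senlin profile spec for a Nova server (illustrative values only)
type: os.nova.server
version: 1.0
properties:
  name: web-node
  flavor: m1.small
  image: cirros-0.3.4
  key_name: demo-key          # hypothetical keypair name
  networks:
    - network: private
  security_groups:
    - default
  metadata:
    role: web
```

You would register a spec like this as a profile (for example with `senlin profile-create` or `openstack cluster profile create`), and every node created from that profile inherits these settings.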
Sometimes, as Mark just mentioned, if you want to manually resize or scale out your cluster before a weekend, you may want to raise the minimum size constraint of the cluster at the same time; otherwise, if you have auto-scaling policies in place, the cluster size may quickly drop back down. That's something we are trying to provide with these commands. So with that, I'm turning it over to my friend Xinhui to give a quick demo of how we achieve auto-scaling plus high availability plus load balancing in a very simple way, which you can deploy in virtually five minutes or so.

OK, thank you, Qiming. In the rest of the time, I will show a simple demo, an example of how to use Senlin to create a cluster that is elastic, resilient, and load balanced. Why did we choose this example? Because we want to provide a real reference for industry practice. Nowadays, most of the available auto-scaling samples are based on infrastructure-level metrics; that's very simple, such as CPU utilization or something like that. But in real practice we would not do it that way, because we want to use business-representative metrics to trigger the auto-scaling. That's one reason we chose this example. Another thing we want to highlight is that the auto-healing functions are actually very important to auto-scaling too, because if the auto-scaling manager cannot know exactly how many active nodes exist in the pool, the auto-scaling may be totally wrong, or at least not trustworthy. So we will show the auto-healing functions as well. As was mentioned, the Senlin resources have been merged into Heat in Mitaka, so we will show this example as a single Heat template that implements all the functions I just mentioned.

Here you can see the architecture of our demonstration. We use a profile to define the node parameters and configuration needed to create a cluster; underneath, the Nova driver is called to create the compute instances based on that profile. Then we attach policies, such as a load-balancing policy, a scaling policy, and a health-management policy, to the cluster. Once the load-balancing policy is attached, Senlin will automatically create the load balancer, the load balancer pool, and the health monitor in the back end. With the attachment of the scaling policy, we create two receivers to receive the scale-out and scale-in alarms and execute the corresponding actions. For those alarms, we base them on a business metric derived from the load balancer throughput, which is representative of the transaction load, and we use that to trigger the auto-scaling. For the health-management part, we will show how to detect failed nodes and recover them automatically. There are two ways: the first is polling, where the underlying health manager daemon polls all the members and does the recovery automatically; the other way, shown here just as an example, is to create a receiver that receives a recovery alarm, where the alarm is triggered by a status change of the load balancer pool members as reported by the health monitor. So that's the architecture.
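As a rough illustration of the architecture just described, the profile and cluster pieces of such a single Heat template might look like the fragment below. This is only a sketch: the `OS::Senlin::*` property names are recalled from memory and the profile type string and values are placeholders, so they may not match the exact schema used in the demo.

```yaml
heat_template_version: 2016-04-08

resources:
  profile:
    type: OS::Senlin::Profile
    properties:
      type: os.nova.server-1.0      # profile type name assumed
      properties:
        flavor: m1.small            # placeholder values
        image: cirros-0.3.4
        networks:
          - network: private

  cluster:
    type: OS::Senlin::Cluster
    properties:
      profile: {get_resource: profile}
      min_size: 2
      desired_capacity: 2
      max_size: 5
      # The load-balancing, scaling, and health policies described next are
      # defined as OS::Senlin::Policy resources and attached to this cluster;
      # the scaling trigger chain is sketched a little further below.
```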
In the following slides, I will walk through the resources to show how this single Heat template builds the cluster. This one is the profile. Here you can see we define all the properties needed to create the nodes of the cluster: the flavor, the image, the network, the security group, those kinds of things. Our cluster resource refers to this definition, and here we also define the minimum size, that is, at least how many nodes you want, and which profile to use to create the cluster. The next slide is the load-balancing policy. Here you can define the load balancer pool, the VIP, and the health monitor settings; with this attachment, Senlin will create all of these resources automatically. And here is the scaling policy. I want to draw attention to the event property: a Senlin policy is a set of logical rules enforced or executed before an action, and the event property defines which action this policy should govern. For the type of adjustment, we use change in capacity, which means we change the number of nodes in the target cluster when an alarm is triggered. We define a scale-out policy similarly, again using change in capacity; you can also choose percentage as another type of adjustment. That's our health policy. Three elements are very important here. The first is the detection type; here I'm showing node status polling, which means we use the health manager daemon to poll all the nodes, find out when a failure happens, and do the recovery automatically. The interval is the period between two detections. And the recovery action list: we provide different ways of recovering, so you can recreate the failed nodes, rebuild them, or use other, more fault-tolerant approaches. On the receiver side, here is an example of how to define the receiver for scale-in. We use the receiver to encapsulate the action to take when the external event or alarm happens: the action is the cluster scale-in, and the type is webhook, which is how it gets triggered. This will be invoked by the scale-in alarm. As I mentioned, we use the average rate of the load balancer's incoming bytes, instead of an infrastructure-level metric, to trigger the scale-out and scale-in alarms. And here we have the threshold; the number comes from a capacity analysis we did on the target cluster, where we use 17% of the peak throughput value as the threshold. The alarm actions here correspond to the receiver I just defined. The scale-out side works in a similar way.

So now we can show the demo very quickly. OK. First, you can see we use the Nova CLI to create the SSH key pair. Oh, it's too fast, sorry. Then we use the template I just introduced to create a Heat stack, and after that we can list the stack. We can then use the Senlin CLI commands to list the nodes and see how many were created. Because we defined the minimum size of the cluster as 2, we have two nodes here, and those two nodes correspond to the two members of the load balancer.
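Before the auto-scaling part of the demo, it may help to see the trigger chain just described (scaling policy, receiver, alarm) sketched together as a continuation of the earlier template skeleton. This is a hedged reconstruction, not the actual demo template: the policy binding syntax, the Ceilometer meter name, and the threshold value are all assumptions.

```yaml
  scale_in_policy:
    type: OS::Senlin::Policy
    properties:
      type: senlin.policy.scaling-1.0          # type string assumed
      properties:
        event: CLUSTER_SCALE_IN
        adjustment:
          type: CHANGE_IN_CAPACITY
          number: 1
      bindings:                                # attachment property assumed
        - cluster: {get_resource: cluster}

  scale_in_receiver:
    type: OS::Senlin::Receiver
    properties:
      cluster: {get_resource: cluster}
      action: CLUSTER_SCALE_IN
      type: webhook

  scale_in_alarm:
    type: OS::Ceilometer::Alarm
    properties:
      # business-level metric: LB incoming bytes rate (meter name assumed)
      meter_name: network.services.lb.incoming.bytes.rate
      statistic: avg
      period: 60
      evaluation_periods: 1
      comparison_operator: lt
      threshold: 12000                         # placeholder; from capacity analysis
      alarm_actions:
        # fire the receiver's webhook URL (attribute path assumed)
        - {get_attr: [scale_in_receiver, channel, alarm_url]}
```

The scale-out side would mirror this with `CLUSTER_SCALE_OUT`, a positive adjustment, and a `gt` comparison on the same metric.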
Then we show the auto-scaling scenario. Here is a quick recap of some of the policies I just mentioned. I want to highlight the deletion policy: we use it to guide which candidate is chosen when you scale in. Here we use a criterion like youngest-first, which is different from Heat's default behavior, because in many customer cases the oldest node is actually not the best choice to delete; it may be the one running most stably, and it may hold logs and important data. So here we choose youngest-first for deletion. Then there are the receivers for scale-out and scale-in, and the corresponding alarms that trigger the actions. Before that, we list the nodes again to see the status before the scale-out. Then we put some pressure on the cluster: we use a stress workload to simulate transaction-based traffic against the load balancer. After a while, we list again, and you can see a new node has been added to the cluster, and accordingly it has also been added to the load balancer pool as a new member. Then we stop the stress for several cycles to see the scale-in scenario; you can see the youngest node has been deleted, and the first two members of the pool are left untouched.

Then we show the auto-scaling scenario, sorry, auto-healing. That's a quick recall of the health policy here: we use node status polling, and the recovery action is recreate. Here we list all the nodes again to see the original status of the two active nodes. Then we use nova stop to simulate the failure of the two nodes. After one cycle, Senlin has detected the failure and changed the node status to ERROR. In the next cycle, it recovers the failed nodes; that takes some time. Then, in the third cycle, you can see the Senlin nodes have been recovered to ACTIVE again by recreating two new nodes, and the two new nodes are added to the load balancer pool to replace the failed members. That's everything we wanted to show in the demonstration. Maybe now we can do some Q&A with the three of us.

Yeah, there's a couple of microphones here if you don't mind going to them. Any questions? Suggestions?

Right. So you talked about applying clustering policy. If you apply a clustering policy to an existing set of containers, and they're all on the same server, and your policy says to move them to another server, will it possibly move some of them to another physical host?

We don't yet support container clusters, but it's on our agenda for this cycle. For VMs, we have several different options you can specify for recovery: you can do reboot, rebuild, evacuate, and finally recreate.

OK. And then you mentioned load balancing several times. What load balancer are you all using here?

LBaaS v2. We use that interface, but it can be configured in many ways, I think. Yeah, so basically it's the Neutron LBaaS service; it has a pluggable layer underneath, and you can use whatever load balancer you want under that.

Right, and so do you have a recommendation?

Come see me later. OK. Here in the demo, I use HAProxy.

Yeah, I know, everybody keeps telling me HAProxy, and then I look silly in front of other people talking about that instead of the F5. So anyhow, Java applications. Maybe some people in here know Java applications. Java containers are very unreliable.
So are you all looking at working with Java containers as something you would manage and try to help run as a cluster?

That's an interesting topic; we haven't looked into that yet.

OK, but I think you mentioned you were looking at application clustering.

We have gotten some requirements for that, yes.

Yeah, right, because I don't want to run WebLogic, and I need something for JBoss, right? So, cool.

OK, keep watching. Any more questions? OK, thank you. Thank you. Thank you, everyone.