Before I start my presentation today, I would like to do a quick survey. How many of you here today have used, or are still using, Deployment for releases? Please raise your hands. And how many of you are no longer using Deployment? So we still have people who haven't raised their hands in either round of the survey. I will be very quick in the first part and save more time for the demo.

The predecessor of the Kubernetes Deployment is the ReplicationController, which completed two important tasks: replication and rolling update. With those two tasks, Kubernetes solves the problems of availability and consistency. When we move to Deployment, the Deployment and ReplicaSet form a two-layer structure, which gives us the capability of version control: we can roll back, and after a release we can return to a historical version. That is the current status of Deployment.

So when we evaluate Deployment for cloud native releases, what are the challenges? The first challenge is the deployment strategy. As an R&D team, I think we all have one thing in common: having the application deployed does not mean the application is available, especially during the deployment process. If we want to check whether it is available, what do we do? First, we check whether monitoring shows any abnormality; second, we go back to the main link and verify whether the major features behave normally; and third, we check whether the changes have taken effect. Only after this verification can we be sure the service is available. But when we use Deployment for a rolling update, we don't get time for that verification. There is a pause mechanism in Deployment, but by the time we actually trigger the pause, it is quite likely the rolling update has already continued. So we cannot have accurate process control during deployment. The second challenge is the pod scheduling strategy.
We find that under one Deployment, the pod scheduling strategy is exactly the same for every pod. Why is that a problem? Consider this scenario. Imagine we have a cluster distributed across two data centers, and I want to release four pods. Since the scheduling strategy is identical, it is possible that all four of my pods get scheduled onto a single server. Usually that is fine, and traffic will all be routed to that server. But in case of disaster, for example there is road construction outside the data center and the power supply is cut, that server goes down. With the Kubernetes recovery mechanism, the pods will be shifted to other servers, but the recovery time is out of our control. That is why we say that for high availability we need backups within one city or across multiple cities. And if a disaster hits our machine room, we might not have enough resources left, so whether my application can be recovered in another data center is a very big question mark. This uniform scheduling strategy does not allow us to achieve disaster tolerance.

The third challenge is the IP address restriction. We have some users with a very complete operations and maintenance system, but it is based on VM deployment. What is the key feature of deploying on VMs? On physical resources such as ECS, my IP does not change with application iterations. The application process can restart again and again, but my VM and my IP stay the same. So in the classical scenario, IP is taken as the identifier, and this is widely adopted: monitoring, link analysis, gateways, and security whitelist policies all take IP as the standard. If the workload is relocated from the VM model to the container model, the IP address is expected to stay the same. Through this process, we found that Deployment could not satisfy the needs of our customers.
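The spreading problem described above is what vanilla Kubernetes usually addresses with pod anti-affinity. As a rough sketch of the limitation, a plain Deployment can only declare one such policy for all of its pods (the label key `topology.example.com/datacenter` and app name are illustrative assumptions):

```yaml
# Soft anti-affinity: prefer spreading this app's pods across nodes
# labeled with different data centers. One policy applies uniformly
# to all replicas; per-data-center counts still cannot be guaranteed.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 4
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: my-app
              topologyKey: topology.example.com/datacenter  # illustrative node label
      containers:
      - name: app
        image: my-app:v1
```

Because the preference is best-effort and identical for every pod, a disaster in one data center can still leave the application without enough replicas elsewhere, which is the gap the talk goes on to address.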
So we developed our own CR, which we call CafeDeployment. CafeDeployment supports in-place upgrade: in most release scenarios, we can keep the pod stable and the IP unchanged. We have also considered high availability in financial scenarios, and it supports active-active or multi-active replication. Our pods also have a graceful shutdown feature. The official Kubernetes has graceful shutdown capability too, but once concurrency accumulates to a certain amount, the pod cannot achieve a fully graceful shutdown. And CafeDeployment supports a safe and flexible deployment strategy: there is a beta release plus rolling update in batches, so users have accurate control of the deployment process and enough time for verification. These are the three features of CafeDeployment.

Next is the structure of CafeDeployment. Like the Kubernetes Deployment, it has three layers. Below CafeDeployment sits the InPlaceSet, which is essentially a ReplicaSet that supports in-place upgrade, and it maintains the number of pods; CafeDeployment does not interact with pods directly. The InPlaceSet is mainly responsible for the in-place upgrade. These are the in-place upgrade configurations we can support: if changes happen only in these areas, we can keep the pod and its IP stable.

CafeDeployment also natively supports active-active or multi-active replication. In the spec, we have a clear definition of the application's topology. On the left side, you can see the output: the cluster covers two data centers, and each data center has nodes. When a node is created, it is labeled with its data center, and this label is used during scheduling so that pods can be placed to satisfy certain requirements. According to the topology in the spec, CafeDeployment creates one InPlaceSet per data center, and each InPlaceSet is unique to its data center.
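Based on the description above, a CafeDeployment manifest might look roughly like this. The API group and field names below are illustrative assumptions, not the exact CRD schema:

```yaml
# Hypothetical CafeDeployment sketch; apiVersion and field names are
# assumptions for illustration only.
apiVersion: apps.example.com/v1alpha1
kind: CafeDeployment
metadata:
  name: my-app
spec:
  replicas: 10
  topology:
    values:              # one InPlaceSet is created per listed data center
    - dc-a               # must match the data-center label placed on nodes
    - dc-b
  template:              # ordinary pod template, as in a Deployment
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: app
        image: my-app:v1
```

The key difference from a plain Deployment is the explicit topology section: the controller takes the global view and fans replicas out to a per-data-center InPlaceSet instead of one undifferentiated pod pool.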
The InPlaceSet is responsible for maintaining the life cycle of the pods. CafeDeployment takes the global view and decides how many replicas are needed in each data center. So CafeDeployment provides both the in-place upgrade and the disaster-tolerance features. Next, I want to invite Hao Tian to share the deployment strategy of CafeDeployment with you. Thank you.

Next, I would like to present the features of CafeDeployment. Number one is the deployment strategy. I will give you a simple example. There is a CafeDeployment, and there are two data centers connected to it, with InPlaceSet A and InPlaceSet B. At the bottom left is part of the spec of the CafeDeployment. We have 10 replicas, divided equally between the two data centers by default, so InPlaceSet A has five pods and InPlaceSet B also has five pods. We also set the Beta upgrade type, so the 10 pods are divided into three groups. The first is the beta group, and the other two groups are the standard batches. Let me explain the beta group: we take one random pod from each data center and release the new version there. If the customer says there is no problem, we can move on to the standard batches; the client needs to click and confirm.

When we first create a CafeDeployment, the pod count grows from zero. Suppose the deployment has 100 replicas. If we create all 100 pods directly and one pod has a problem with its image, it will cause a waste of resources because the whole deployment fails. So pod creation is also done by batch. We first create the pods in the beta group, one in each data center. If the customer confirms there is no problem, we create the following batches. Each batch has four pods, two in each data center, so each InPlaceSet gets two pods. The customer confirms again, and then the last batch of four pods is created, for 10 pods in total. During pod upgrade, we follow the same deployment strategy.
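The grouping just described (10 replicas split into one beta group of 2 plus two batches of 4) could be expressed roughly as below. `batchSize` and the Beta upgrade type are mentioned again in the demo; the other field names are assumptions:

```yaml
# Hypothetical strategy section of a CafeDeployment spec.
spec:
  replicas: 10
  strategy:
    upgradeType: Beta          # first release one pod per data center as a beta group
    batchSize: 4               # then 4 pods per standard batch (2 per data center)
    needConfirm: true          # pause for manual confirmation between groups (name assumed)
# Resulting release plan for 10 replicas across 2 data centers:
#   beta group: 2 pods (1 per DC)  -> confirm
#   batch 1:    4 pods (2 per DC)  -> confirm
#   batch 2:    4 pods (2 per DC)  -> done
```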
First comes the beta group. The customer confirms there is no problem, and we move to the batch upgrade. Every batch is four pods; the customer confirms again, and then we run the second batch. During the deployment process, the CafeDeployment controller does not operate on the pods directly: pod creation and upgrade are done by the InPlaceSet controller, and the CafeDeployment controller only abstracts the strategy. During the deployment process, if the new pod version has a problem, the customer can abort and undo the upgrade.

During deployment, we might also find that InPlaceSet A does not have enough resources, so one pod cannot be created. For this we have a rescheduling mechanism: the pod will be automatically rescheduled to the other InPlaceSet. Customers can turn this off in the configuration if they don't need it. If a customer wants four pods in one data center and all the others deployed in the other InPlaceSet, this can be specified explicitly, for example DC A gets 4 pods and DC B gets 60% of the pods.

CafeDeployment does not work on the pods directly because there is an interface, the pod set control interface, which it uses to operate the underlying workload, such as ReplicaSet and so on. InPlaceSet realizes grouping and high-availability scheduling across different data centers. But some customers already use, or plan to use, native workloads like ReplicaSet or StatefulSet, where grouped scheduling across different data centers cannot be realized. With this interface, the native workloads can be reinforced as well, and we did not change the code of the native controllers. So ReplicaSet or StatefulSet keep their previous behavior; customers can keep what they already have, with no intrusion into the original code. We have already connected ReplicaSet, and we will also connect StatefulSet. This is what it looks like after connecting ReplicaSet.
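The explicit distribution mentioned above (4 pods in DC A, 60% in DC B) might be declared along these lines; the field names, including `autoReschedule`, are assumptions:

```yaml
# Hypothetical per-data-center replica distribution.
spec:
  replicas: 10
  topology:
    autoReschedule: true     # allow moving a pod to another cell when one cell lacks resources
    values:
    - name: dc-a
      replicas: 4            # fixed count in this data center
    - name: dc-b
      replicas: 60%          # percentage of the total replica count
```

Turning the rescheduling flag off would correspond to the customers mentioned above who want a strict placement and no automatic migration.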
It is the same as with InPlaceSet: in different data centers we have separate ReplicaSets. To control the pod version during deployment, we create a new ReplicaSet, and the new ReplicaSet maintains the pods of the new version. The CafeDeployment strategy and logic can also be applied to ReplicaSet: we create the beta pods, the customer confirms, we move to the second batch of pods, the customer confirms again, and the last batch of pods is created.

When we use InPlaceSet for deployment, we might encounter some challenges. For example, the shutdown may not be graceful and there might be failed requests, because removing the pod's IP from routing is event-triggered after the fact: the pod becomes not ready, the endpoint controller is then informed, it removes the IP from the endpoint list, which affects the routing tables, and only then is routing shut down. If customers handle the termination signals correctly, they can achieve a graceful shutdown, but it is not guaranteed every time. So we chose to implement graceful shutdown at the InPlaceSet controller layer, following the principle of shutting down routing first, and only then gracefully shutting down the pod.

We use the pod readiness gate for this. We can change the state of the readiness gate to indicate whether the pod is ready or not. Before we upgrade a pod, we set its readiness gate to false, so the pod becomes not ready in advance. The endpoint controller notices this state and moves the IP to the not-ready list, and the other routers, like kube-proxy, watch that it is not ready and also remove it from their lists. But if we use a ClusterIP service to expose the pod, the InPlaceSet controller cannot tell whether the traffic routing is really offline. So here we have a wait-three-seconds logic: we wait for three seconds and assume the in-flight requests will be processed within that time.
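The mechanism just described is the standard Kubernetes pod readiness gate. A sketch of how it would appear on a pod follows; the custom condition type name is an assumption, and the status section is shown only for illustration, since in practice it is written by the controller, not the user:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app-0
spec:
  readinessGates:
  - conditionType: "example.com/traffic-ready"   # custom condition, name illustrative
  containers:
  - name: app
    image: my-app:v1
status:
  conditions:
  - type: "example.com/traffic-ready"
    status: "False"    # flipped to False by the controller before the in-place upgrade
```

While the custom condition is False, the pod is considered not ready regardless of its container probes, so the endpoint controller and kube-proxy remove it from routing before the upgrade touches the container.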
And then we start the upgrade. If the upgrade is successful, we set the readiness gate back to true, and the routing information is regenerated. Of course, three seconds cannot fully guarantee that all traffic to the pod has been processed. For a truly graceful shutdown, you need the involvement of the load balancer. We have an SLB controller that works together with this mechanism. When a pod is attached to the SLB load balancer, the controller puts a finalizer on the pod, which is used to guarantee the pod will not be changed prematurely. When we truly upgrade the pod, we first set the readiness gate to false, the routing rules are withdrawn, and there is a true traffic shutdown. Once the traffic is drained, the SLB controller removes the finalizer, and the InPlaceSet controller then knows the traffic has truly been removed, so there is no waiting of three seconds. That is how we can guarantee the upgrade is truly synchronized with traffic removal.

Now I would like to show you the demo. I will mainly cover creating a CafeDeployment backed by the InPlaceSet workload, and show you how we create pods in batches, upgrade, and roll back. If there is more time, I would also like to show how we can switch to ReplicaSet as the underlying workload. It seems not so easy to operate, because my laptop does not show me the full screen of this code, so allow me to scroll back and forth. Now I am starting from a clean environment with no CafeDeployment created. The current CafeDeployment configuration has the readiness gates turned on at the top. The strategy sets batchSize to 4, so each batch will be four pods, and the upgrade type is Beta. With that, we will have time to wait and confirm between steps. Over here at the bottom right, I will watch how the pods are created. Okay, you can see I just got a new pod, and on this side you can see two pods have already been provisioned.
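The finalizer handshake with the SLB controller described above can be sketched as below; the finalizer name is an assumption:

```yaml
# Hypothetical pod metadata while the pod is attached to the SLB.
apiVersion: v1
kind: Pod
metadata:
  name: my-app-0
  finalizers:
  - example.com/slb-traffic-drained   # added by the SLB controller when the pod joins the LB;
                                      # removed only after traffic is fully drained
```

As long as the finalizer is present, the workload controller treats the pod as still receiving traffic and holds off on the upgrade; its removal replaces the fixed three-second wait with an explicit signal.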
In this window, I will keep watching the state of the CafeDeployment. Over here, you can see the progress: it is waiting for confirmation. That means the CafeDeployment is ready and waiting for confirmation, and here I confirm the current beta release of the CafeDeployment. We provide this through an annotation carrying a confirmation mark: if it says false, that means it is not confirmed. If the user decides the two beta pods are okay, I can change that to true. Then on the right-hand side, you can see we started to release the first batch, which is four pods. On the upper right, you can see the current progress: it is again waiting for confirmation, because we have finished the first batch of the release. Over here, I confirm again, and you can see this is the last batch we need, pods number 7 to 10. On the upper right, the progress goes from executing to completed. We have all 10 pods ready, and the progress is completed.

On the left-hand side, you can see all the pods. We use node affinity to guarantee they are deployed in two different machine rooms: we give the nodes labels, and as you noticed, that is the foundation. Here you can see the node affinity of all the pods: five of them have been assigned to Room A, and the others to Room B.

Here I'd like to show you how we upgrade. The pod has an environment variable, and I can change this variable from version 1 to version 2. Let me show you the upgrade. Over here are all the pods; you can see they have fixed IPs. You can see the pods are already in the progress of upgrading, and the current state is still waiting for confirmation. Then we can show the variables of all the pods. Because this is a beta release, the first batch will be two pods, and you can see two have already been upgraded. While it is still waiting for confirmation, we can confirm this beta release group.
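The confirmation flow shown in the demo is driven by an annotation on the CafeDeployment; the annotation key here is an assumption:

```yaml
# Hypothetical confirmation annotation on the CafeDeployment object.
metadata:
  annotations:
    example.com/upgrade-confirmed: "false"   # controller sets it to false when a group finishes;
                                             # the user flips it to "true" to release the next batch
```

Flipping the value to true is the click-and-confirm step; as mentioned in the Q&A later, resetting it to false between batches is handled by the CafeDeployment controller itself.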
Now you can see the first four pods. The pod IPs stay fixed, because the bottom layer is InPlaceSet, so this is what we call in-place upgrading. Over here you can see six pods have already changed their status to v2, version 2. Now suppose I hit a problem and I want to go back, to give up on the current release. In the same way, you change the annotation: I set it to abort, so the controller knows that I want to abort. On the right-hand side, you can see it is in the abort status, and the pods are currently rolling back; now it is finished. You can see all six pods we just upgraded to v2 have rolled back to v1. Because of the time limit, I don't think I have time for switching to the ReplicaSet workload. With that, I would like to welcome your questions.

Any questions? It seems the function is a little bit similar to cluster federation, with a Deployment spanning multiple clusters. Other than controlling its upgrade versions, have you considered deploying across different Kubernetes clusters, or do you only have different rooms within one big Kubernetes cluster? Well, right now we have one cluster whose nodes span different rooms, or different cells, and within this region they interact.

I've got a question about the confirmation. I can understand that you confirm by changing the annotation from false to true, but after you finish this batch, how is it changed from true back to false again? Is it done by the CafeDeployment controller, or does the InPlaceSet need to report it? Changing the annotation back to false is part of the logic control of the release, so it lives in the CafeDeployment and is controlled by the CafeDeployment controller. So the CafeDeployment controller watches the status, and once it changes to waiting, the annotation is reset to false, right?
No, no, no, it is not the bottom layer; it is the top layer that is waiting. The controllers communicate through the interface, so CafeDeployment does not care whether the underlying workload is InPlaceSet or ReplicaSet; it only looks at its partition. ReplicaSet is a little bit different because a new one is created during upgrade. For a beta release, the partition will be two pods, and once two pods are ready, that is reported to the CafeDeployment controller. I would also like to say that InPlaceSet itself does not judge which phase the release is in. It just reports to CafeDeployment how many of the pods are ready and how many are not. Then CafeDeployment, based on the report of the InPlaceSet, makes the decision whether the state is waiting for confirmation, executing, or whatever.

Another question: if you change a CafeDeployment, will the configuration change also propagate down to the InPlaceSet? Just like a plain Deployment, where you update the Deployment and it upgrades the ReplicaSet, right? Yes, yes.

Okay, it is already time. We cannot have more Q&A, but if you are interested, you are welcome to come to us offline; I will remain here. And one last thing: is it okay for me to take a picture with you? It's a good gift for us, a selfie of all of us. One, two, three. Thank you.