First, thank you for coming to this small session. Today we are talking about Alibaba internal best practices. The story itself is simple: we will tell you how, over the past two years, Alibaba Group moved from its in-house container platform to a cloud-native architecture based on Kubernetes, and how we addressed the issues of large-scale migration.

First, some background on Alibaba. You may already know that Alibaba started deploying containers a long time ago, as early as 2011; we call this the container phase. There were no Docker technologies available at that time, so our developers built something based on LXC, and many companies, Baidu among them, built similar stacks to address the same problem. In that container phase the container model was essentially a lightweight virtual machine: each instance held only one container, with many processes inside it.

After 2018, Li Xiang and other Kubernetes members joined the team, and we began promoting a Kubernetes-based system. That was the starting point of Alibaba's cloud-native progress. In 2019 we started an initiative we call cloud native: Alibaba uses Alibaba Cloud to deploy Alibaba. That is literally our slogan for the cloud-native program. We also removed the single-container model; we now support multiple containers in one pod.

After all this reform, our cloud technology is based on Kubernetes. This chart shows the mainstream open-source projects we build on: Kubernetes, Helm, the Operator Framework, containerd, CSI, all these things are part of the stack. Our monitoring system is quite standard as well, and we use secure containers to provide very strong isolation. Our clusters actually run on ACK; we use Kubernetes to deploy Kubernetes. This model is quite standard in the industry: I deploy a meta Kubernetes cluster, and from there deploy the multiple clusters facing users.

This next chart illustrates our position as the infrastructure team: we serve two clients. The first is Alibaba Group; the second is the technical teams under Alibaba Cloud. So we face a wide portfolio of clients, including the PaaS systems, O&M systems, and DevOps tools inside the Group, as well as the existing deployment and monitoring systems; they all need to be docked to Kubernetes. Alibaba also has a lot of online businesses, Taobao, Tmall, et cetera, and Alibaba's AI and offline batch businesses should all be docked with Kubernetes too.

We have about 10 large clusters. That means over 10,000 nodes per cluster and over 1 million containers in total. Some companies maintain many small clusters of a few thousand nodes each, but we are moving in the other direction. In Ali we have a complex structure and cannot use one system for everything, but we do have some principles. Our kube-apiserver is not modified: we use the upstream apiserver, because maintaining a fork is complex; if you want to rebase, it becomes very difficult to maintain the server by yourself. The apiserver also provides the standard entry point, which is the very soul of Kubernetes, so we don't want to interfere in that regard. But as a big company, our scheduling policies must be very diversified, and the upstream scheduler cannot fully cater to our needs. We have also optimized its performance.
At the very beginning, one Ali engineer observed that when choosing nodes, the scheduler only needs to take a hundred samples, even if there are 10,000 nodes available, and then score and place among those 100. This is now a standard practice: use sampling strategies to reduce the number of nodes evaluated. Later on we added this feature, together with caching of node equivalence, so the scheduler changes which nodes it considers instead of evaluating all of them. It's based on one idea: in the past, sampling all 10,000 nodes was required, but now not anymore. Upstream Kubernetes later exposed a similar knob, which I'll show in a moment. So that's the scheduler. The scheduler is developed in-house; it can process batch as well as online businesses, so our colocation of mixed workloads is based on this scheduler.

In terms of controller manager, we use the upstream one, but above it we have in-house controllers written in Ali. One thing we emphasize is in-place upgrade: we don't want the container to go somewhere else after an upgrade, because the IP address and topology would change, and that is deadly for our business. We also run a colocated multi-tenancy pattern, which is the result of complex computation: every time I update my business or redistribute resources, the scheduler faces tremendous pressure, so we want to maintain foreseeable, reasonable stability, and every step must be precisely controlled by our operators. I will talk about it later; we have separated it out as an independent project. For the controllers, they build on the standard controller mechanism, but we need to write controllers suited to our own usage, and they need to deal with our requirements on in-place upgrade and grayscale release as well.

For the runtime, we do not use Docker, because Docker is noticeably slower for us; it adds a lot of extra layers that are meaningless to us. Before, we used our Pouch container engine, and we have kept improving it. As for the cloud, we use ACK's CSI and CNI plugins as our cloud provider integration.

Also, some of the big promotions need elasticity, and our own resource pool is physical: it is fixed and not easy to scale, so we need to be able to borrow some resources from others. So we connected to an elastic resource pool, which we call ECI, the Elastic Container Instance; it's a place where we can borrow resources to fit these scenarios. Before, we did not have such a system. We also have the special bare-metal instances just mentioned.

And you can see there are a lot of tenants in our systems. For this we built what we call a virtual cluster. In these shared resource pools, we mixed in some plugins so that Kubernetes is able to expose an apiserver to every tenant. Each tenant has its own API objects, but these are virtual objects: there are no dedicated nodes, because all the business pods are placed into the shared clusters. So all tenants share the same pools, and each of them has an independent apiserver. This is the virtual cluster; this is what we are working on, and in the future it will be open source.
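Here is that upstream knob. Our in-house scheduler is not the upstream one, but as a rough sketch of the same sampling idea, a kube-scheduler configuration like the following (the API version depends on your Kubernetes release, and the kubeconfig path is an assumption) caps how many feasible nodes are scored per pod:

```yaml
# Minimal kube-scheduler configuration sketch: only score a sample of nodes.
# On a 10,000-node cluster, percentageOfNodesToScore: 1 means the scheduler
# stops searching once it has found roughly 100 feasible nodes to score.
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /etc/kubernetes/scheduler.conf   # path is an assumption
percentageOfNodesToScore: 1
```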
This is our current resource model now, and I can introduce some of the technologies in it. Starting from 2018, we worked on the container side. So what is the rich container? Some call it the system container. It means that inside this container, apart from the business process, there are many other things: SSHD, log collection, monitoring, caches, VIP configuration, all kinds of things, including boot scripts and so on. You don't even know who put those things in there. Someone would first set up a rich container, put SSHD inside, and then start their scripts, because at that time that made it easier to connect with the business side.

Why did we want to eliminate the rich container? Because it cultivated a bad mode of operations and maintenance. You can see that developers would just randomly write files inside these containers, with no concept of read-only layers or anything; they write here, write there, and call it done. To be fair, we don't think the rich container is inherently wrong: if you are able to containerize your operations properly, a rich container is OK. But in our conditions, the rich container caused real problems for our operations model. So starting from 2018 we decided to eliminate the rich container, and now we have made a lot of progress. All of our containers nowadays are lightweight containers, grouped into pods with several containers each, because upstream we have done some standardization work for our users.

In Alibaba's cloud-native setup, one pod includes these containers. First is the business container. Then we have sidecars, for example a hot-upgrade sidecar: I have an application and I want to upgrade it, so a sidecar is loaded into the pod. Before, you might have needed to copy things into the rich container by hand; now you don't, because we have the sidecar. The sidecar does exactly one thing: you want to upgrade, so the input of the sidecar is the upgrade content, and it copies that content over, because the containers share a volume inside the pod; they can share logs the same way. So within this pod we are able to enable upgrading as well. Next, we also have an operations helper container that helps us fix small problems. So a pod includes several kinds of containers, and all of those rich-container procedures have become inputs inside the pod.

But there were some problems we could not solve this way, and one is monitoring. Monitoring relied on reading files and values inside the container, so we had to find another way to address that. Also, the original rich container worked like a VM: if you want to convert it into a pod, you first need to solve the boot sequence. For that we use init containers, plus lifecycle hooks: some containers must start earlier than others, so the later ones need certain prerequisite conditions to be set before they boot.
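Here is a minimal sketch of the upgrade-sidecar pattern described above, assuming plain upstream Kubernetes; all the image names and paths are hypothetical. The sidecar drops the new release content into a volume that both containers share:

```yaml
# Sketch: business container plus an upgrade sidecar sharing an emptyDir volume.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-upgrade-sidecar
spec:
  containers:
  - name: business
    image: registry.example.com/my-app:1.0        # hypothetical business image
    volumeMounts:
    - name: shared-content
      mountPath: /home/admin/app                  # the app reads content here
  - name: upgrade-sidecar
    image: registry.example.com/app-content:1.1   # hypothetical; carries the new release
    command: ["sh", "-c", "cp -r /content/* /shared/ && sleep infinity"]
    volumeMounts:
    - name: shared-content
      mountPath: /shared
  volumes:
  - name: shared-content
    emptyDir: {}
```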
So the business side should first define the conditions, the preStart and postStart behavior of these different containers, to decide which ones start first. To give an example: all the lifecycle hooks should be defined up front so the pod is able to run properly. Because we are trying to evolve toward cloud native, this is a very good path for us, and we have utilized a lot of Kubernetes machinery, like postStart and the other container lifecycle hooks, to control it. Of course, you also need to define in what way you consider a container to have started or not.

After we eliminated the rich container, we found that it changed the automation story. Why? With the rich container, which ran systemd inside, you could not know the relationship between the container's lifecycle and the application's lifecycle; there was no coupling between them. Now the container's lifecycle is aligned with the app's lifecycle, so the system is much better able to manage it. The images become small, so you can build them very quickly, and everybody is able to focus on their own function: if you do monitoring, you just need to focus on the monitoring work. Later, we will have a sidecar operator to inject the sidecars into the containers for you; you don't have to do it yourself. A webhook we introduced can automatically put the sidecar in, including upgrading and deleting it, all in an automated way, and everybody focuses on their own function. So we found a lot of benefits after eliminating the rich containers, and we are able to build more on top of them. First is resource utilization. Second, for data whose lifecycle matches the container's lifecycle, application management can manage it, and we can put the data into a persistent volume; I don't have to find another place to write it.

On upgrades, we try to keep pace with the community. We first built the system on an early release, then upgraded to 1.12, and now we are on 1.14, so you can see we do not lag behind the community releases, because Kubernetes lets you control upgrades with small, grayscale changes.

Next is cloud-native application management. We have a whole Kubernetes stack and we can do releases on it, but first we need to define the different kinds of management, the relationships between applications, and how to deal with multiple clusters and the interactions between them. Here are the application and its operations: first infrastructure operations, then application operations. We need to communicate with different teams, and they need to follow the Kubernetes concepts in their development process. We also need to explore what the definition of an application should be. Right now we use Helm to define applications; however, some resources are not expressible this way. For example, how can we define the relationships between resources in Helm? That kind of resource is not easy to define in the chart documents.
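Returning to the startup ordering discussed a moment ago, here is a minimal sketch using standard Kubernetes init containers and a postStart lifecycle hook; the images, commands, and paths are hypothetical:

```yaml
# Sketch: enforce boot order with an init container, then run a post-start step.
apiVersion: v1
kind: Pod
metadata:
  name: ordered-startup
spec:
  # Init containers run to completion, in order, before the app containers start.
  initContainers:
  - name: prepare-config
    image: busybox
    command: ["sh", "-c", "echo 'key=value' > /shared/app.conf"]  # hypothetical setup
    volumeMounts:
    - name: shared
      mountPath: /shared
  containers:
  - name: app
    image: registry.example.com/my-app:1.0   # hypothetical image
    lifecycle:
      postStart:
        exec:
          # Runs right after the container starts; hypothetical registration step.
          command: ["sh", "-c", "/home/admin/bin/register.sh"]
    volumeMounts:
    - name: shared
      mountPath: /shared
  volumes:
  - name: shared
    emptyDir: {}
```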
So we are still working on this and communicating in the community about whether we can build an application definition that faces the cloud, that is based on clouds; hopefully we can address this problem later. As for management, we are still doing pilots and experiments, and we have our own thinking. First, we reuse the definitions from Helm, so we need an app hub. But the official app hub is not very usable in China, and there are other obstacles, like gcr.io being unreachable. So we did some localization based on that hub: we put a mirror on Alibaba Cloud and provide it as a free service. You can use this hub; it automatically synchronizes with the official one, and the tooling is completely open on GitHub, so you can use that as well.

As for Kubernetes tooling, it is mainly about the YAML documents: how do we lay out the documents, and how do we configure the parameters? We use Helm to change some of the definitions and parameters, we use Kubernetes-native tools for some of the changes, and we need a center, a single source of truth, to do the decoupling. With the old Jenkins style, you drive everything from beginning to end and push it into Kubernetes yourself. In Kubernetes, instead, I have an application whose documents live in Git, and in the cluster there are many components that automatically modify the desired state of the application. If a component finds that I lack resources and need to scale the instance count, then its operation is to modify the replicas, say change the replica count to 10, and scale upwards. And no matter where you write your documents, the change needs to be written back there. If there is a black box between CI and CD, for example a format that I do not allow you to store openly, then my HPA cannot change the documents. I have the YAML documents, and they will be modified by developers, but they may also be modified by Kubernetes itself. If you need Kubernetes to change them, they should be in an open protocol that Kubernetes can understand, so it can operate on this process. So I think CI and the controllers are able to communicate this way: starting from the source code, building the package, going through the pipeline, generating the resource documents, and putting all those generated documents into Git. Since everything is in Git, we get GitOps; we don't think we need to go even further than that. After we put it into Git, we manage the process from there.

Next, I'll talk about how our workloads deal with our app operations. Maybe you're very hungry, so I'll keep it short and quick. Speaking of Ali, the first thing coming to mind is the November 11th shopping day. Currently, we have nearly a million containers on the cloud, distributed in over 100 server rooms, with hundreds of thousands of applications. Moving this quantity of infrastructure to the cloud, we face stability and other risk factors. We are also assisting the PaaS platforms above us to migrate to the cloud, instead of the PaaS remaining unchanged; otherwise it cannot cater to the needs. So let me talk about the November 11th story. You know what our upper-layer applications have done for November 11th.
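Before diving into November 11th, here is the replica-scaling example from a moment ago written as a standard HorizontalPodAutoscaler; the target name is hypothetical, and the API version depends on your Kubernetes release:

```yaml
# Sketch: a controller that edits the replica count through the open YAML protocol.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app          # hypothetical Deployment name
  minReplicas: 2
  maxReplicas: 10         # the controller may raise replicas up to 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
```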
A few more things to talk about; others will share the detailed disclosures. What does our infrastructure, our big Kubernetes, prepare for November 11th? On November 11th the traffic is n times the daily average, even over 100 times, so we need far more resources than on a daily basis. Before November 11th, the first thing is to build up the site with scaled-up resources: on one hand to improve the resource utilization rate, and on the other to stabilize November 11th itself.

Suppose we were to scale our pods out by another 10,000. Should we rely on just changing the replica counts? That is risky, because the scheduler itself does not know the applications or their scaling plans; it can only check the current status quo and whether something can be placed or not. If we were to scale up to hundreds of thousands of pods relying on this individual, one-by-one scheduling strategy, the outcome definitely cannot cater to our needs: some resources will overlap, and some placement rules will not be met. So when building up the site, we use an offline scheduler to understand the status quo of the resources and then matchmake the hundreds of thousands of pods against them, to plan the placement centrally. After that, through batch placement, based on the feedback, we enforce the planned resource distribution.

Internally, we use two CRDs to deliver this function. One is called the batch allocation plan; the other is called batch adoption. The batch allocation plan, as its name says, creates buffer pods bound to candidate nodes. That is to say, in this flexible resource pool, if we were to create hundreds of thousands of pods directly, we cannot guarantee that the first distribution is in line with our expectation. The batch allocation plan allows creating buffer pods until the resource distribution reaches a reasonable level; when it meets our expectation, we enforce that placement. So in this way, the overall resource distribution meets our expectation.

If you are familiar with Kubernetes, the default release strategy is to recreate pods. The problem is that after all our efforts, the hundreds of thousands of pods finally meet our expectation, and then one application does a release and all its pods migrate away; the layout no longer meets our expectation. Before November 11th, Ali goes through several full-circuit load tests after prediction, so even if a layout passes verification during the testing phase, upon release we could no longer guarantee the actual outcome matches what was verified. The release strategies provided upstream, no matter which one we use, involve pod migration, and pod migration breaks our expectation of pod stability.

So we developed what is called in-place update: when releasing, we upgrade the container on the very same spot instead of recreating the pod. Upper-layer users don't have to care about this, so I'll only give a brief introduction. Within the Advanced StatefulSet, if we set the partition, the advanced controller will only operate on the corresponding containers that you intend to modify. Kubernetes identifies the containers whose hash value has changed, then uses the new image to recreate only those containers. In the process, the rest of the containers in the pod keep operating.
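Going back to the site-building step for a moment: the batch allocation plan CRD is internal and has not been published, so the following is a purely hypothetical illustration of what such an object could look like, just to make the mechanism concrete. None of these fields come from a released API:

```yaml
# Hypothetical illustration only: declare a batch, let the offline scheduler
# bind buffer pods to candidate nodes, and enforce once the layout looks good.
apiVersion: apps.example.com/v1alpha1        # hypothetical group/version
kind: BatchAllocationPlan
metadata:
  name: double11-buildup
spec:
  podTemplateRef:
    name: my-app-template                    # hypothetical pod template reference
  replicas: 100000                           # buffer pods to pre-place
  candidateNodeSelector:
    pool: elastic                            # hypothetical node pool label
status:
  plannedPods: 100000
  satisfied: true                            # layout meets expectation; safe to enforce
```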
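And for the in-place update just described, here is a minimal sketch using the Advanced StatefulSet API from OpenKruise, the open-source project mentioned later in this talk; the image and names are hypothetical. The maxUnavailable field here comes up again at the end of the talk:

```yaml
# Sketch: an in-place rolling update with OpenKruise's Advanced StatefulSet.
apiVersion: apps.kruise.io/v1alpha1
kind: StatefulSet
metadata:
  name: my-app
spec:
  replicas: 500
  serviceName: my-app
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: business
        image: registry.example.com/my-app:2.0   # hypothetical new version
  podManagementPolicy: Parallel
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      podUpdatePolicy: InPlaceIfPossible   # restart the container in place, keep the pod
      partition: 450                       # only pods with ordinal >= 450 are updated
      maxUnavailable: 50                   # update up to 50 pods in parallel
```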
That means they are unaffected. And the other way around: if we need to upgrade a sidecar container, it will not affect the application container. This is what the default model cannot support. In a default template, if there are multiple containers in the pod and your component container wants to bump its version or image, then because upstream uses a pod-recreation strategy, it causes the deletion of the old pod and the creation of a new one, and that does not meet our expectation. We want the upgrade of a component not to affect the business running online, because it affects stability.

For comparison, to put it simply, against the default model, our in-place rolling update can better cater to our needs, for example by ensuring the determinism of the system. We know that Docker containers have image layers; if a container is upgraded on the very same node, some of the layers can continue to be used, so it improves efficiency and lowers the buffer resource demand. And there is another issue: Kubernetes' default recreate strategy is difficult for Ali to execute. For one, it's because of the uncertainty, and also because of Ali's sheer volume. Every day our online apps release 6,000, even 10,000 times, and for some apps, as was just mentioned, there are over 10,000 pods under a single one. If we were to recreate pods in order to upgrade them, there would be huge pressure on the scheduler: every release would redistribute tens of thousands of pods, and for the supply chain and the service registry that causes tremendous pressure. Through in-place update, we can update the pods in place, in time, without rescheduling and without re-registering.

Now I'd like to say a few things about another CRD we built. It's called the SidecarSet. Let's first take a look at the problems of the traditional model. You define a Deployment or StatefulSet whose template includes all your expected containers: your business container, your sidecar containers, your O&M components, your agents, and other authorized components, all defined in one block. For small-scale operation, that's fine. But at large scale, the first thing we encounter is that the application owners, and Ali has over 100,000 applications, don't know which sidecars are required, and they don't want to pay attention to such O&M details. The business operators only care about the stability and upgrades of their own system; they don't care about the sidecar components.

The second issue is that once the sidecars are defined in the workload, how do you upgrade them? We know Ali's workloads have hundreds of thousands of pods, so when a component owner is expected to go modify hundreds of thousands of workload definitions, it is painstaking. Also, because the application and the components are defined within the same block, when the upgrades of the business container and the components conflict with each other, it also affects pod stability. For example, while an application is being released, the maintainer of a component identifies a problem and modifies the sidecar version in the same workload; these two actions conflict with each other. To cater to this problem, we developed a lightweight tool. It does a very simple thing: it separates the upgrades of the business container and the other components.
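For illustration, the traditional model just criticized looks roughly like this (all names hypothetical): every team's concern is baked into one Deployment template, so any sidecar change means editing, and recreating, the whole workload:

```yaml
# Sketch of the traditional model: all containers defined in one block.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: business                                 # owned by the application team
        image: registry.example.com/my-app:1.0
      - name: log-agent                                # owned by the logging team
        image: registry.example.com/log-agent:3.2
      - name: monitor-agent                            # owned by the monitoring team
        image: registry.example.com/monitor-agent:5.7
      # Bumping any one of these images re-renders the pod template and
      # triggers a rolling recreation of every pod in the Deployment.
```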
So with the SidecarSet, the application maintainers only need to define and maintain the image of the application itself; application developers do not have to care about the sidecars. All the sidecar components are defined within SidecarSets, and we have dedicated sidecar maintenance teams handling them. Every time a pod is created, for example when scaling up, the SidecarSet has an admission webhook that intercepts the action and checks whether the pod's labels are in line with the selection criteria. If yes, it injects the needed sidecars into the corresponding pod.

For example, at Ali we have nearly one million pods. How do we manage the sidecars in them? It's very simple: we just define one SidecarSet, and we don't have to put the sidecar definition into all those pods online; the SidecarSet injects the containers we need into all the matching pods. In this way it successfully separates the sidecars from the business. First, for the users, the developers: they do not have to pay attention to the sidecar containers or their management. Second, the owner of a sidecar can easily use the SidecarSet to manage all of that sidecar's containers online and do the upgrades, without affecting the business that is running.

So here is the map I've introduced. Once we define the SidecarSet CRD, the operator's behavior is like this: it injects the sidecars into all the pods that fit the selected conditions, and the upgrades and deletion of the sidecars also go through the SidecarSet's own mechanism. For example, when an owner wants to roll a new version, they just change the sidecar's image in the SidecarSet, and it uses its upgrade strategy to upgrade all the sidecar instances online; the owner does not have to touch the other resources. So it is a much easier way to decouple.

We know that the SidecarSet and Istio's injection have some similarity, but there are differences. Istio injects based on selection conditions into a specific location; the SidecarSet provides a richer set of functions. For example, there are config maps for the logs, and components need to read some data from them, so the sidecars need to share data with the business containers. Here the SidecarSet also acts as a data-volume mounting tool, able to share volumes with the containers. And in this way, we are able to gradually upgrade the sidecar containers online without the business people being aware; the owners and users of the business containers do not have to be aware of these upgrades at all. We also have the in-place upgrade function.

Maybe these two functions don't feel like much to you, but we have contributed them to an open-source project, OpenKruise, and you can visit it. All of its functions have been proven internally in Alibaba and are based on real internal demand scenarios. We are not going to deliver 100% of all those functions at once; we will gradually push the generic functions into the community so that we can build them together. Let me introduce the OpenKruise functions. Kruise is the core of the OpenKruise project, and it provides some basic, daily controllers.
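Before listing them, here is a minimal SidecarSet sketch of the injection mechanism described above, under OpenKruise's apps.kruise.io/v1alpha1 API; the image and labels are hypothetical. One object injects, and later upgrades, a sidecar across every matching pod:

```yaml
# Sketch: one SidecarSet manages a sidecar for all matching pods in the cluster.
apiVersion: apps.kruise.io/v1alpha1
kind: SidecarSet
metadata:
  name: log-sidecar
spec:
  selector:
    matchLabels:
      app-type: web          # inject into every pod carrying this label
  containers:
  - name: log-collector
    image: registry.example.com/log-collector:1.0   # hypothetical sidecar image
    volumeMounts:
    - name: log-volume
      mountPath: /var/log/app                       # shared with the business container
  volumes:
  - name: log-volume
    emptyDir: {}
```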
One is the BroadcastJob, and one is the SidecarSet. Maybe you haven't heard about the BroadcastJob. It is similar to a combination of two original primitives: like a Job, once it runs to completion it finishes; but like a DaemonSet, it dispatches a pod to every node across the cluster.

OK, let me briefly talk about the Advanced StatefulSet. We provide two functions. One is maxUnavailable. With the default StatefulSet, even if the podManagementPolicy is Parallel, when it is releasing it still releases pods one by one. Say we have 500 pods: even though you use the partition to split them into 10 batches, the 50 pods in each batch are still released one by one. With maxUnavailable, however, the pods can be released in parallel: for example, across the 10 batches, the 50 pods in each batch are released together, so the update can be up to 50 times faster. The other is in-place update; you are already familiar with it. Next is the SidecarSet: we inject the sidecar into the pods that have been selected by the conditions, and combined with the selectors, those sidecars run in pods across all the nodes in the clusters.

We're going to stop here for today's talk. You can contact me on WeChat, and welcome, all of you.