This is Tom, and I work at Alibaba as a system architect. Today I will show what we are doing at Alibaba on our private infrastructure cloud. This is today's agenda. First, I will introduce the background and the milestones of this project. Then I will show our architecture, which is based on OpenStack. After that, I will talk about customization, that is, how we built our infrastructure cloud on top of OpenStack.

First, I want to contrast traditional operations with the private cloud. Traditional operations are mostly manual. Engineers may write scripts or tools to help automate things, so there end up being many loose subsystems and tools, such as DNS tools, load balancer tools, and so on. And it is the engineer who decides how to use the resources, for example which IP to use or which physical server a virtual machine should run on. This takes a lot of time, and it is easy to make mistakes. The private cloud is different. It is also automated, but it is more intelligent: the cloud maintains a unified resource pool and decides by itself how to use the resources. Our private infrastructure cloud focuses on two sides: one is compute and the other is network.

Why do we do this? It all comes from the problems we have met. Over the past years, the number of physical servers and network devices has grown rapidly, and we now maintain hardware at a very large scale. These are all basic resources, and we should make better use of them; it would be out of control if they were still managed manually. We have also met large-scale page-view peaks. Alibaba holds a promotion sale on November 11 every year on Taobao and Tmall, and our systems and network are heavily impacted during those peak hours. We really need an infrastructure that can provide an elastic service, so that we can extend compute capacity and fulfill network demands in minutes.

These are our milestones. We started to design and implement the system in the first quarter of 2011. In the second quarter of 2011, we released the first version and put it into use, and it scaled to more than 1,000 virtual machines. In the fourth quarter of 2011, we added integration with the load balancers, so the system can scale out quickly. In 2012, three branches appeared in this project. One branch aimed to make use of the Hadoop resources, which we call offline resources. These resources are used at night, and their peak hours are at night too, but during the day they are free. On the other hand, our online services run during the day. We wanted to use the offline resources to run online services, so we could avoid the peak-hour impact and make better use of these offline resources. This branch succeeded on November 11, 2012: we managed to stagger the peak hours between the online services and the offline computing. Another branch was to build an infrastructure-as-a-service platform that can manage machines at very large scale. By the end of 2012, our platform was managing more than 10,000 virtual machines. The third branch is for the network. It started this year, 2013. We want to build a platform that provides the network as a service, and I will introduce it at the end.

This is our architecture, which includes compute and network, just like OpenStack. The compute subsystem in the picture provides the virtual machine as a service, and the network subsystem provides network connectivity as a service. Compute gets its network connectivity from the network subsystem, and the identity subsystem provides unified authentication: all the other subsystems connect to it and get their authorization from it. Engineers use dashboards to operate the system.

This is a flow chart that shows how a request is handled. The cloud API is the main interface behind the dashboards, which engineers use to control the system. When a request comes in, the cloud API accepts it, checks its authorization, and puts it onto the message queue. First, the scheduler takes the request and decides how to schedule it; the core resource allocation algorithm is implemented in this component. When the scheduler has done its job, it puts the request onto the message queue again. The manager component then takes the request from the message queue, reads the relevant allocation information from the database, calls the driver, and runs the execution sequence: first it creates the block device and allocates a local disk via LVM, then it configures the network to get connectivity, and at last it updates the database and launches the virtual machine.
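To make that flow easier to follow, here is a rough sketch of how such a pipeline could look in code. It is only an illustration: the names, the example values, and the in-process queue are made up for this talk; the real system uses a proper message queue and separate services.

```python
# Toy sketch of the request flow: dashboard -> cloud API -> queue -> scheduler
# -> queue -> manager/driver. All names and values are invented for illustration.
import queue

bus = queue.Queue()   # stands in for the real message queue
db = {}               # stands in for the central database

def cloud_api(request, token):
    """Accept a dashboard request, check authorization, put it on the queue."""
    if token != "valid-token":            # unified authentication via the identity subsystem
        raise PermissionError("unauthorized")
    bus.put(("schedule", request))

def scheduler(request):
    """The core resource allocation algorithm decides the placement."""
    request["host"] = "server-042"        # chosen physical server (made-up value)
    request["vlan"] = 103                 # VLAN assigned to this application (made-up value)
    bus.put(("build", request))

def manager(request):
    """Read allocation info, then run the driver's execution sequence."""
    db[request["name"]] = dict(request, state="building")
    print("lvcreate: allocate %d GB local disk" % request["disk_gb"])   # 1. block device via LVM
    print("join instance to VLAN %d" % request["vlan"])                 # 2. network connectivity
    db[request["name"]]["state"] = "running"                            # 3. update the database
    print("launch %s on %s" % (request["name"], request["host"]))       # 4. boot the VM

cloud_api({"name": "web-01", "disk_gb": 20}, "valid-token")
while not bus.empty():
    stage, req = bus.get()
    scheduler(req) if stage == "schedule" else manager(req)
```

In the real system these stages are separate components connected by the message queue, so each of them can scale independently.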
Our original design did not separate the network from this system. From this year, we started a new branch to do that, and the network subsystem now runs as an independent component and provides an interface for the other systems.

That was the general architecture of the whole system. Next, I will introduce the system design in more detail. First is our compute subsystem. It provides the virtual machine as a service, just like Nova in OpenStack. It is the core subsystem and solves problems for operations engineers and administrators. We have deployed it at a large scale, about 10,000 virtual machines now.

This is the architecture of our compute subsystem. It is based on OpenStack, and we have done a lot of work to make it fit our environment. You can see that the cloud controller is the core component. It provides a REST API to the dashboard, to the platform-as-a-service layer, and to other systems. Virtual machine management, image management, and the scheduler are also implemented in the cloud controller. Another important thing is the resource pool. The resource pool maintains all the resource information, and all the modules depend on this pool. A zone controller runs in each zone. The zone is a logical concept, and each IDC can have one zone controller or several. Each physical server runs an agent called the node controller, which manages the resources and executes commands on that physical server.

Next, I will introduce the problems we have met and how we customized OpenStack. The first important problem is the compute driver of OpenStack. The normal compute driver lets every compute node access the same database. This may cause problems: what happens when there are thousands of compute nodes? The single database would suffer a lot, and we would have to do a lot of optimization on it. Let's see what we did to resolve the problem. In this picture, we separate the original compute driver of Nova into two parts. One is called the zone controller, and the other is called the node controller. The zone controller can access the database, while the node controller cannot. The zone controller reads the database, gets the detailed information, and generates execution sequences. The node controller is just an agent on each physical server: it receives commands from the zone controller and executes them. Because it cannot access the database, it is much simpler, just an executor.

OK, let's see how we do image management. This is an important difference between the public cloud and the private cloud. A public cloud may have lots of images for end users, because each end user may want his own image, so a public cloud must spend more effort on storage for a large number of images; it needs a heavy system, just like Glance in OpenStack. On the other hand, a private cloud has only a few images, because the operating systems inside the company are standardized. A large storage system is not so necessary there, and we just use the file system. But the private cloud must spend more effort on reducing the transfer time, because efficiency is very important. Let's see what we did in this area. We added a layer of caching to improve efficiency. We keep the main metadata in the core database, and we deploy caches in each zone and on each physical server. In this case, if we want to run an instance, the node can get the image from its own local cache, so there is no transfer time. Because there are only a few images, we can do this.
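As a rough illustration of what that layered cache looks like from a node's point of view, here is a small sketch. The paths and helper names are invented for this example; only the idea of falling back from the node cache to the zone cache to the central store comes from our design.

```python
# Sketch of the layered image cache lookup: node cache -> zone cache -> central store.
# Paths and names are invented for illustration.
import os
import shutil

NODE_CACHE = "/var/cache/images"        # on the physical server itself
ZONE_CACHE = "/mnt/zone-cache/images"   # shared inside one zone
CENTRAL    = "/mnt/central/images"      # authoritative copy; metadata lives in the core DB

def fetch_image(image_id):
    """Return a local path for the image, warming the node cache on the way."""
    node_path = os.path.join(NODE_CACHE, image_id)
    if os.path.exists(node_path):
        return node_path                 # cache hit: no transfer time at all
    zone_path = os.path.join(ZONE_CACHE, image_id)
    source = zone_path if os.path.exists(zone_path) else os.path.join(CENTRAL, image_id)
    os.makedirs(NODE_CACHE, exist_ok=True)
    shutil.copy(source, node_path)       # warm the node cache for next time
    return node_path
```

Since there are only a handful of standard images, the node cache is almost always warm, so launching an instance pays no image transfer time.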
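And to make the zone controller / node controller split from a moment ago a bit more concrete, here is a sketch of the idea: only the zone controller touches the database and builds the execution sequence, and the node controller on each physical server just executes it. Again, the names and data are invented for this illustration.

```python
# Sketch of the zone controller / node controller split. Only the zone controller
# reads the database; the node controller is a thin executor with no DB access.
DATABASE = {"web-01": {"host": "server-042", "disk_gb": 20, "vlan": 103}}

class ZoneController:
    """One per zone: reads allocation details and builds an execution sequence."""
    def build_sequence(self, instance_id):
        info = DATABASE[instance_id]              # database access happens only here
        return info["host"], [
            ("create_disk", {"size_gb": info["disk_gb"]}),
            ("join_vlan",   {"vlan": info["vlan"]}),
            ("launch_vm",   {"name": instance_id}),
        ]

class NodeController:
    """One per physical server: receives commands and executes them, nothing more."""
    def execute(self, sequence):
        for command, args in sequence:
            print("executing %s %s" % (command, args))   # e.g. lvcreate, bridge setup, libvirt

host, sequence = ZoneController().build_sequence("web-01")
NodeController().execute(sequence)   # in reality this is sent to the agent on that host
```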
The scheduler driver is the core component of our system. It decides how to allocate resources, and we have designed an algorithm to do this. The design of this algorithm has several constraints. It should achieve high resource usage, and it must also be stable and fault tolerant. All our applications run on this system, so when a network device goes down, it must not affect the applications. That means we must consider how the machines of an application are distributed.

Besides this, we also have specific network demands. At Alibaba we have lots of applications, and each application wants its own VLAN, so we need dynamic VLANs to resolve this problem. And after a machine is created, we have to configure the network environment, such as ACLs, DNS, load balancers, and so on. Because we could not find a convenient way to do this automatically, we had to do it manually.

How did we resolve these problems before? We use a VLAN trunk to carry a lot of VLANs, and we configure the reserved VLANs on the network devices beforehand. The scheduler calculates the VLAN that the application should use, and when the instance launches, the driver does the local network configuration to join it to the calculated VLAN. This method is quite static, and as a result it wastes a lot of IP addresses: an application may use only a few IPs in its own VLAN, and the remaining IPs cannot be used by anyone else. We really need a dynamic and programmable network.

So from 2013, we started to build our network platform. The platform aims to manage the network's own resources and to provide the network as a service. It also provides a programmable API for the systems that need network connectivity. This subsystem is under construction, and so far we have only done the architecture.

This is the conceptual architecture of our network platform. We want to build it with SDN. A practical SDN platform is our target, not only OpenFlow, because of the issues left over by history: most of our network devices are commercial devices, from vendors such as Cisco or Huawei, so the platform we build must include them. The protocols it supports include SNMP and NetConf; we can do data collection with SNMP and configuration with NetConf. The OpenFlow protocol separates a network device into two parts: one is the controller, and the other is data forwarding. We can do a lot of elastic things by writing controllers with OpenFlow, and our platform will take it as an important part.
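Going back to the scheduler for a moment, here is a toy illustration of the kind of trade-off it has to make: filter out hosts without enough capacity, spread one application's instances across different switches for fault tolerance, and then prefer the placement that keeps resource usage high. This is a simplified sketch, not our real algorithm, and all the data in it is made up.

```python
# Toy placement step: filter by capacity, spread an application across switches
# (fault tolerance), then pick the most tightly packed host (high resource usage).
hosts = [
    {"name": "server-01", "switch": "sw-a", "free_cpu": 8,  "free_gb": 64},
    {"name": "server-02", "switch": "sw-a", "free_cpu": 2,  "free_gb": 16},
    {"name": "server-03", "switch": "sw-b", "free_cpu": 16, "free_gb": 128},
]
# Switches where each application's existing instances already run (for anti-affinity).
app_placement = {"trade-app": {"sw-a"}}

def schedule(app, cpu, gb):
    used_switches = app_placement.get(app, set())
    # Filter: enough capacity, and avoid switches this application already uses.
    candidates = [h for h in hosts
                  if h["free_cpu"] >= cpu and h["free_gb"] >= gb
                  and h["switch"] not in used_switches]
    if not candidates:
        # Fall back: keep the capacity constraints, relax the strict spreading.
        candidates = [h for h in hosts if h["free_cpu"] >= cpu and h["free_gb"] >= gb]
    # Weigh: prefer the host that ends up most tightly packed after placement.
    best = min(candidates, key=lambda h: h["free_cpu"] - cpu)
    best["free_cpu"] -= cpu
    best["free_gb"] -= gb
    app_placement.setdefault(app, set()).add(best["switch"])
    return best["name"]

print(schedule("trade-app", cpu=4, gb=32))   # picks server-03, behind the other switch
```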
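And here is a hypothetical sketch of what the platform's driver layer could look like: one call on the platform API, dispatched to conventional devices through NetConf or SNMP drivers and to OpenFlow devices through a controller driver. All the class and method names are invented for illustration; no real vendor or controller API is shown.

```python
# Hypothetical sketch of the network platform's driver layer. One call on the
# platform API is dispatched to conventional devices (NetConf for configuration,
# SNMP for data collection) or to OpenFlow devices. All names are invented.

class NetconfDriver:
    """Configuration of conventional devices (e.g. Cisco, Huawei) over NetConf."""
    def create_vlan(self, device, vlan_id):
        print("netconf: edit config on %s, add VLAN %d" % (device, vlan_id))

class OpenFlowDriver:
    """Programmable devices: a controller installs forwarding rules directly."""
    def create_vlan(self, device, vlan_id):
        print("openflow: push flow entries for VLAN %d on %s" % (vlan_id, device))

class SnmpDriver:
    """Monitoring and data collection over SNMP."""
    def collect(self, device):
        print("snmp: poll counters on %s" % device)

class NetworkPlatform:
    """The network-as-a-service API used by the compute system and other callers."""
    def __init__(self, config_drivers, monitor_driver):
        self.config_drivers = config_drivers      # device name -> configuration driver
        self.monitor_driver = monitor_driver

    def provide_vlan(self, devices, vlan_id):
        for device in devices:
            self.config_drivers[device].create_vlan(device, vlan_id)

    def collect_stats(self, device):
        self.monitor_driver.collect(device)

platform = NetworkPlatform({"core-sw-1": NetconfDriver(), "tor-sw-7": OpenFlowDriver()},
                           SnmpDriver())
platform.provide_vlan(["core-sw-1", "tor-sw-7"], vlan_id=103)
platform.collect_stats("core-sw-1")
```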
This is the technical architecture of our network platform. It is also a layered structure. The cloud controller is the core component, and it provides a northbound API to the upper-layer systems. The zone controller is responsible for controlling the devices in its own zone, which include conventional devices, OpenFlow devices, load balancers, and others.

At last, I will explain why we want to build this network platform with SDN. We think SDN can unify the network resources, and we can upgrade seamlessly. The important thing is that we can dynamically schedule network resources and improve efficiency, so with SDN we can get an elastic network. It is all on the way, and we are trying our best to do this.

That's all. Thank you.