Hi, everyone. It's my great honor to be here to tell you a little bit about Alibaba Cloud and some of the technologies behind it. So let's get started. First, I want to share a set of numbers to give you a sense of Alibaba Cloud. Alibaba Cloud was founded on September 10, 2009, which is exactly the 10th anniversary of Alibaba Group's founding. Two years later, on July 28, 2011, we rolled out our first product: the Elastic Compute Service, which is a virtual machine service. Since then, we've seen tremendous growth. In the last eight quarters, we have seen triple-digit year-over-year growth consecutively. That makes us the number one cloud company in China. In our most recent quarterly report, we reported 870,000 paying customers, and the total revenue for fiscal year 2017 was 6.6 billion RMB, almost 1 billion US dollars. The technology powering Alibaba Cloud is called Apsara, or in Chinese, Feitian. Let me briefly walk through the journey of Apsara. Development of Apsara started in February 2009. If you recall my previous slide, you'll notice that development of Apsara actually precedes the founding of Alibaba Cloud. The reason is that when we started building Apsara, what we wanted was a unified technology foundation for all of Alibaba Group's business units. At that time, the business was growing very fast across different business groups, and technology-wise we had commercial software and open-source systems running in silos. We wanted to unify everything onto a single foundation for all of Alibaba Group's business groups. After about a year and a half of development, in August 2010, Apsara became the technology foundation, the cloud infrastructure, within Alibaba Group. With it, we supported four major applications: web search, web mail, image storage, and a micro-loan and payment service, which is now part of Ant Financial. The third milestone I want to talk about is in August 2013, what we internally call the 5K project. On August 15, 2013, the first 5,000-node Apsara cluster went into production. The reason 5K is important is that when we set out to build Apsara, we wanted a large-scale, general-purpose computing platform. So what do we mean by large scale? In 2009, the largest cluster we knew of in the world was about 5,000 nodes, probably at Google. So to put a concrete definition on "large scale," we said we had to reach at least a 5,000-node cluster. We thought we could do it by the end of 2010, but it turned out to be much harder than we thought; it took us three more years. In 2013, we actually did it. I'll talk a bit more about that later. At the next milestone, we wanted to prove not only that we could build at that scale, but also that the system could outperform other systems. So in 2015, we participated in the Sort Benchmark competition, and our Apsara system broke the world record by sorting 100 terabytes of data in 377 seconds. The next item is not really a milestone: starting in 2011, we have held annual developer conferences, so the 2016 event was our sixth. You can see some numbers there: 40,000 developers attended the conference, and over 7 million people watched the live broadcast online. So what exactly does Apsara do?
Apsara, in a nutshell, manages a planet-scale infrastructure. So what does that look like? First of all, our data centers are organized into regions, and between these regions we have a fast transfer network. We also have a backbone network that connects our data centers with the telecom carriers; that's how users on outside networks reach our data centers. And finally, we have an edge network, essentially point-of-presence sites across the globe, which accelerates access to the services deployed in our data centers. Today, we have data center regions across the globe. In mainland China, we have six regions: three in China North, two in China East, and one in China South. We also have 11 overseas regions; I'm not going to enumerate them here. And finally, for the edge network, we have over 600 point-of-presence nodes and over 20 terabits per second of bandwidth capacity.

This slide shows the architecture of Apsara. At the bottom are our data centers, availability zones, and regions. On top of that, the four red boxes are the common building blocks of distributed systems: distributed coordination, security management, log collection, and distributed monitoring and diagnostics. On top of those, there are two big pieces. One is what we call Pangu, which is unified storage management. The other is called Fuxi, which is distributed resource management. These two pieces pool all the resources in our data centers into a unified resource pool and unified storage. On the side, we have a component called Tianji, which does infrastructure management and service management. Notice that this is the only component aligned vertically; that's because it connects the applications running on top of what we call the Apsara core with the underlying infrastructure. What it does is service deployment, expansion, and self-healing, in the sense that when a service instance has problems, it automatically shuts it down, isolates it, and starts a new instance. On top of that, we have accounting, authentication, authorization, metering, and billing, the services commonly needed to support a public cloud. The blue boxes are the public cloud services we provide. They include basic services like compute, storage, databases, and networking, plus a collection of services to help developers build applications, including middleware, service orchestration, and service computing. On top of that, we have advanced services, including data intelligence, such as BI and AI/machine learning, and also security products. The orange pieces deal with connectivity, such as data transfer, database synchronization, the content delivery network, and Express Connect, which is the hybrid-cloud link between on-premises infrastructure and our public cloud. And finally, there is what you can consider the App Store for the cloud, which we call the Marketplace.
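To make that self-healing idea concrete, here is a minimal sketch of the kind of control loop Tianji runs. This is illustrative only; the callback names (probe, isolate, launch_replacement) are hypothetical, not Tianji's actual interfaces.

    # Illustrative self-healing loop in the spirit of Tianji; all names are
    # hypothetical, not Tianji's real API.
    import time

    FAILURE_THRESHOLD = 3      # consecutive failed probes before we act
    PROBE_PERIOD_SEC = 5

    def heal_loop(instances, probe, isolate, launch_replacement):
        """instances: mutable list of running service instances.
        probe(inst) -> bool; isolate(inst); launch_replacement() -> new inst."""
        failures = {id(inst): 0 for inst in instances}
        while True:
            for inst in list(instances):
                if probe(inst):                  # health check passed
                    failures[id(inst)] = 0
                    continue
                failures[id(inst)] += 1
                if failures[id(inst)] >= FAILURE_THRESHOLD:
                    isolate(inst)                # fence the bad instance first
                    instances.remove(inst)
                    replacement = launch_replacement()  # then restore capacity
                    instances.append(replacement)
                    failures[id(replacement)] = 0
            time.sleep(PROBE_PERIOD_SEC)

The ordering matters: isolate first, so a misbehaving instance cannot keep serving traffic while its replacement comes up.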
We like to think of Apsara as a hyperscale cloud operating system. The reason we call it an operating system is that there is a very clear analogy between Apsara and a PC operating system. If you look at the layers: the bottom layer corresponds to the PC hardware. On top of that, the red pieces correspond to the kernel. Above that, all modern operating systems are multi-user, which is why every operating system has account management. The blue boxes correspond to system services and bundled apps. The orange pieces correspond to the input/output subsystem of an operating system. And finally, most modern operating systems have an App Store.

Now I want to talk briefly about some of the design highlights of Apsara. First, as I said, we wanted to build a general-purpose computing platform for our internal applications, and we believe it is very important to run a mixture of latency-sensitive applications and batch applications. So the first key highlight is that Apsara is a uniform computing platform that runs latency-sensitive and batch applications side by side on one platform. Second, I already mentioned that in 2013 we reached the 5K scale, but we didn't stop there: our largest Apsara clusters now exceed 10,000 servers, with hundreds of petabytes of raw storage and hundreds of thousands of CPU cores. Third, given that scale, the design has zero single points of failure throughout and achieves three 9s of availability. Fourth, all data stored in Apsara is triple-replicated by default, achieving ten 9s of data reliability. Fifth, to manage such a large-scale infrastructure, all the monitoring, diagnostics, and deployment are fully distributed. And lastly, as you may have noticed in the architecture diagram, security is embedded deep in the core of Apsara, and we employ what we call a minimum TCB (trusted computing base) to ensure the security of the whole system.
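To put those ten 9s in perspective, here is a back-of-the-envelope calculation, assuming, purely for illustration, independent replica failures and timely re-replication. If p is the probability that a given replica is lost and not rebuilt within the repair window, data loss requires losing all three replicas:

    P(\text{loss}) \approx p^{3}, \qquad p \approx 10^{-4} \;\Rightarrow\; P(\text{loss}) \approx 10^{-12}, \qquad 1 - P(\text{loss}) > 99.99999999\%

In practice, correlated failures, say a whole rack or switch going down, dominate the math, which is why replica placement across failure domains matters as much as the replica count.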
So like I said previously, I want to expand a little more on Apsara 5K. It is important because not many commercial or open-source offerings in this world can actually do 5K; we became one of them. It is also important because with 5K, for the first time, the scale of an Apsara cluster exceeded that of a Hadoop cluster. With that, we decided to unify all the data processing workloads onto one platform. That was our internal Apollo-style project, which we call Dengyue ("moon landing"). Now the data processing workloads of all of Alibaba's core businesses run on Apsara. And on July 1, 2014, we made MaxCompute generally available; that's the distributed data processing platform built on top of Apsara. It was the first time any company offered such large-scale data processing capability commercially to the world. And the impact of the availability of such a service actually exceeded even our imagination.

I want to give you one example. Prior to the public release of MaxCompute, we conducted an experiment, the Tianchi challenge, which is similar in spirit to the Netflix Prize. We gave away a slice of Alibaba's internal data, anonymized with all sensitive fields removed, and let people build prediction algorithms; in this case, predicting whether a user would click on a certain ad. As a matter of fact, over 7,000 teams from all over the globe registered for the competition, and 351 groups were from outside China. That is more than 10x the size of similar contests held at academic conferences such as KDD. Even more amazingly, we invited the top 10 teams to participate in our Singles' Day event in 2014. We sliced the live online traffic into 11 pieces; ten of those pieces each went to one of the participants, and we reserved one piece for our in-house algorithm. We set up a million-yuan challenge: if the click-through rate of any participant exceeded our in-house algorithm by 15%, we would give away one million yuan. It turns out one group actually won: their algorithm beat our in-house algorithm by 16%. So as you can see, the availability of such a data processing platform, letting people iterate on their algorithms very easily, helps data scientists around the globe realize their ideas much faster.

I'm not going to go through this slide; it just shows the comprehensive portfolio of our cloud products. I'm almost running out of time, but let me borrow one minute to talk a little about virtualization in Alibaba Cloud. By virtualization, I really mean virtualization in general, because the cloud is always multi-tenant and the runtime has to be isolated. I want to cover it in three aspects: first, resource isolation; second, what we traditionally know as virtualization, which I'll call server virtualization; and third, container technology.

Let me digress a little and talk about Linux at Alibaba. Currently, all our physical servers run a variant of a Linux distribution, including Fedora and CentOS. We also released our customized kernel in 2011, which we call Ali Kernel. And since 2010, we have had almost 300 patches accepted into the Linux kernel, which makes us the number one Chinese cloud company in terms of contributions.

Why is mixing latency-sensitive workloads and batch workloads so important? This chart is borrowed from Google's book, The Datacenter as a Computer. You can see that, typically, clusters running latency-sensitive workloads have much lower CPU utilization than those running batch workloads. So the basic idea is: why can't we mix them together to improve overall utilization? The key challenge is that even as utilization increases, we don't want to sacrifice the performance of the latency-sensitive workloads. We've done a bunch of things here; I'm running out of time, so I won't go through all the details. Essentially, we implemented resource isolation in every dimension: CPU, networking, and IO.

Let me show some of the results. This slide shows the effectiveness of CPU isolation. The vertical line is the triggering event, when a batch workload is launched on a cluster that previously ran only latency-sensitive workloads. Comparing before and after, CPU utilization increased from 35% to over 65%, an increase of more than 30 percentage points. Meanwhile, the degradation in both latency and throughput for the latency-sensitive workload is less than 5%. That definitely shows the effectiveness of our CPU isolation.
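Our isolation stack lives in Ali Kernel, but the general mechanism can be illustrated with stock Linux cgroups. Here is a minimal sketch, assuming a cgroup-v1 CPU controller mounted at the usual path and root privileges; the group names and numbers are made up for illustration:

    # Illustrative only: stock cgroup-v1 knobs, not Ali Kernel's actual
    # mechanism. Must run as root.
    import os

    CG = "/sys/fs/cgroup/cpu"  # assumed cgroup-v1 cpu controller mount point

    def write(path, value):
        with open(path, "w") as f:
            f.write(str(value))

    def make_group(name, shares, quota_us=None, period_us=100000):
        path = os.path.join(CG, name)
        os.makedirs(path, exist_ok=True)
        write(os.path.join(path, "cpu.shares"), shares)       # relative weight
        if quota_us is not None:                              # hard cap per period
            write(os.path.join(path, "cpu.cfs_period_us"), period_us)
            write(os.path.join(path, "cpu.cfs_quota_us"), quota_us)
        return path

    def add_pid(group_path, pid):
        write(os.path.join(group_path, "tasks"), pid)         # move a task in

    # Latency-sensitive tasks get a high weight; batch gets a low weight plus
    # a cap, so batch soaks up idle cycles but yields under contention.
    ls_group = make_group("latency_sensitive", shares=8192)
    batch_group = make_group("batch", shares=64, quota_us=600000)  # <= 6 cores

The shape is the same as in the chart: the batch group fills the idle capacity, while the scheduler weight keeps the latency-sensitive group first in line whenever there is contention.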
Next, we look at network isolation. We have three bars here. The green bars show the baseline request latency of the latency-sensitive workloads. The blue bars show the request latency when we simply mix in the batch workload without any network isolation. And the yellow bars show the effect when network isolation is enabled. As you can see, without network isolation, the average request-processing latency increased 8-fold, and even worse, the tail latency increased 20-fold. With network isolation enabled, the degradation goes down to 10% to 20%. That's the effectiveness of network isolation.

Now we look at IO throttling. We allow applications to throttle their IOs on a per-file or per-directory basis. Here we set up a single process with some files unthrottled and some files throttled; the throttle threshold is 250K IOPS. You can clearly see that we can bound the IOs for the throttled files.

Next, I want to talk a little bit about server virtualization, which is essentially the bread and butter of cloud computing. As a matter of fact, our first product, the Elastic Compute Service, is essentially server virtualization. Under the hood, we transitioned from a Xen-based hypervisor to a KVM-based hypervisor in 2014. And we deepened our engagement with the Linux Foundation in 2017 by upgrading to a gold membership. One of the key things we did with KVM was to support hot upgrades. The problem is that a hypervisor upgrade typically impacts service availability. A case in point: in 2014, there was a vulnerability in Xen that would allow a malicious user to gain access to the physical machine, or even affect the guest OSes on the same server. Linode, a hosting company, ended up having to restart all their server instances to patch it. What we did, and it was a very painstaking engineering effort, was to make all the components related to the hypervisor hot-upgradeable, including the KVM kernel module and QEMU. I won't go through the details here. Today, every VM goes through some kind of upgrade event once every one to two months, and during that event there are typically only tens of milliseconds of service interruption. The user doesn't see a thing; the service just gets paused for maybe 20 to 30 milliseconds.

Finally, I want to talk about containers. Alibaba Cloud embraced container technology very early. We formed a strategic partnership with Docker in 2016, and just this month we announced Docker Hub in China. We also joined CNCF as a gold member two months ago. And as far as I know, we are the only cloud provider in China that supports both Docker Swarm and Kubernetes. Here are some bits of what we have done on the container front. First, we did a native integration of Docker Swarm with our cloud infrastructure, which means comprehensive support for storage, networking, and logging; we are working to make the same integration available for Kubernetes. We have also made a lot of scalability enhancements; currently, a single Docker container cluster can exceed 30,000 VM nodes. And finally, we believe we are hosting the world's largest container-based application, Alibaba's e-commerce platform. On Singles' Day of 2016, hundreds of thousands of Docker containers were deployed across multiple regions, and the peak throughput was 170,000 requests per second.
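For a flavor of what driving a Swarm cluster looks like programmatically, here's a minimal sketch using the Docker SDK for Python. The image, service name, and replica count are made up, and this targets stock Docker Swarm rather than our integrated offering:

    # Minimal sketch: deploy a replicated service to a Docker Swarm manager.
    # Assumes `pip install docker` and a reachable Swarm; all names here are
    # illustrative, not Alibaba Cloud Container Service specifics.
    import docker
    from docker.types import ServiceMode

    client = docker.from_env()  # connect to the local Docker daemon

    # Swarm spreads replicas across nodes and restarts failed containers,
    # the same self-healing idea Tianji applies at the infrastructure level.
    service = client.services.create(
        image="nginx:alpine",
        name="storefront-demo",
        mode=ServiceMode("replicated", replicas=4),
    )
    print("deployed service:", service.name)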
Finally, I want to conclude my talk with some ideas about future directions. The first is lightweight virtualization for containers; given the growing weight we are putting on container technology, this is critical for us. The second is that hardware is becoming much faster, with things like NVMe-based storage and 25 Gbps networking, so file-system and networking optimizations for such ultra-fast hardware will be a very interesting direction to explore. And third, security enhancements for heterogeneous hardware: FPGAs, GPUs, and other customized ASICs. With that, thank you.