Next, we have Yonggang Hu, who is a distinguished engineer from IBM. He's also a member of the Spectrum Computing team. So please join me in welcoming Yonggang Hu.

Hi, can you hear me? Could you play my slides? Good morning. So glad to be here, in this beautiful city, Hangzhou. I just hosted the E20 Summit. Thanks, Linux Foundation. Thanks to Ben and the Mesos community for having this event here and having all of us here so that we can have some fun. I'm also very honored to be here representing our team at IBM, and we're going to share with you some of our thinking, some of our products, and also some of our beliefs. The team has been doing some fantastic projects.

Today, I'm going to talk about embracing modern workloads without losing it. Without losing "it", not just IT: we're talking about without losing control. We're talking about running modern workloads such as Kubernetes and Spark on Mesos, efficiently, economically, and also beautifully. We love Mesos; otherwise we wouldn't be here. We share the same beliefs as the Mesos community.

Here, we're talking about two-level scheduling: workload management and resource management, layered. We believe two-level scheduling will give you the performance, the scalability, and the application assurance. We learned that from many years of experience building some of the largest HPC environments, where we're talking about millions of cores: running genomic analysis, designing the next generation of cell phone chips, or building the fastest Formula One racing car. We also built some of the largest grids for financial services. There, we're talking about tens of thousands of cores, or even hundreds of thousands of cores, along with thousands of GPUs.

Last week, there was an election in the US: Donald Trump versus Hillary Clinton. Of course, I won't go into the political and ugly part.
So, on election night, at one moment, the Dow futures dropped 800 points. At that moment, I bet all the grids of all our customers were trying to reprice the entire financial market. Here, we're talking about a market of 1.2 quadrillion US dollars. That is 10 times bigger than the global GDP, 100 times bigger than the Chinese economy. This is how important the workloads running on top of the grids we built are. We also learned this from some of the cutting-edge Spark cloud services, where we have to support millions of users. Without two-level scheduling, that wouldn't be possible.

We also believe that resources are always limited, while demand is always unlimited. We had a bank customer around 10 years ago. They ran a 1,500-core cluster at almost 100% utilization. Over the past 10 years, their cluster capacity has increased by 100 times: right now, they're running at 150,000 cores. But it's still not enough, still cannot keep up with demand. The reason is that there are more applications, applications become bigger, there are more users, and there is more data to process. But they have a constraint, which is their budget. They have a cost constraint; they want to reduce cost. That's why they want Mesos: to cut cost and also improve utilization.

The point is, you're always going to have some limit. It could be budget, it could be data center capacity, it could be a quota from your cloud provider. In the case of Formula One racing, we have a customer called Red Bull Racing. It used to be number one. They are limited by the Formula One regulations. That means at any moment, they cannot use more than 30 teraflops of compute power to do aerodynamics design and simulation, because for the racing, they want to level the playing field. So, you have a limit, and you also have a lot of demand. Then you need to run your workload efficiently and economically, whether in a cluster, in a grid, or in the cloud. How do you do that?
Two-level scheduling, workload management and resource management working in conjunction: that is the key. Think about it. You need to be able to prioritize your work. You need to dynamically reallocate in real time. You need to be able to do preemption. You need to be able to run many workloads in a shared environment so that you don't have silos and utilization stays high. So, that's why two-level scheduling is key. And we feel that we have some experience in this area. That's why we like to contribute, and we are contributing to the Mesos community.

I want to share that we have four committers, and we have become number two in the Mesos community, after Mesosphere of course. What I want to say is that we are very grateful. We really appreciate the community, and Ben and the team, for all the support and guidance. Without your help, without your guidance, we wouldn't have achieved what we have done. Thank you. We are using Mesos inside IBM as well, for some of the IBM cloud services, for example the Spark service, and in some IBM products.

But we are doing more. Here, we talk about how to support this ecosystem, how to help expand the Mesos ecosystem. There, we do a couple of things. One, we want to bring more and more applications; and secondly, we want to run these workloads efficiently and beautifully.

And today, I'm so glad to share some news. We created a new component called the Spark Session Scheduler, and this new component is going to be available on Mesos. This component is going to enable Spark to run on top of Mesos beautifully. As you know, we also contribute to the Kubernetes integration with Mesos, and those contributions are recognized by both the Kubernetes community and the Mesos community. We will also be adding more capability in that area; my colleagues are going to share in the breakout sessions. And we also bring some of our own services.
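As an editorial aside, the two-level idea described above can be reduced to a few lines of code: one resource manager owns the cluster and makes resource offers, while each workload manager decides, using its own knowledge of its jobs, how much of an offer to accept. This is a minimal toy model under stated assumptions; the class and method names are illustrative and are not the Mesos API.

```python
# Level 1: the resource manager owns the cluster and carves out offers.
# Level 2: each workload manager knows its own demand and takes only what it needs.
# Toy model only, not the Mesos API.

class ResourceManager:
    def __init__(self, total_cpus):
        self.free_cpus = total_cpus

    def offer(self, cpus):
        """Grant up to `cpus` from the free pool (level 1)."""
        granted = min(cpus, self.free_cpus)
        self.free_cpus -= granted
        return granted


class WorkloadManager:
    """Level 2: holds job knowledge (demand) the resource manager never sees."""
    def __init__(self, name, demand_cpus):
        self.name = name
        self.demand = demand_cpus
        self.allocated = 0

    def consider(self, rm, offer_size):
        # Accept only as much of the offer as outstanding demand requires.
        granted = rm.offer(min(offer_size, self.demand - self.allocated))
        self.allocated += granted
        return granted


rm = ResourceManager(total_cpus=16)
spark = WorkloadManager("spark", demand_cpus=12)
batch = WorkloadManager("batch", demand_cpus=10)

# Offers alternate between frameworks; because each takes only what it needs,
# the cluster is shared rather than siloed.
for wm in (spark, batch, spark, batch):
    wm.consider(rm, offer_size=4)

print(spark.allocated, batch.allocated, rm.free_cpus)
```

With a 16-core pool and four rounds of 4-core offers, both frameworks end up with 8 cores and the pool is fully used, which is the utilization argument the talk is making.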
We're going to put LSF and MPI on top of Mesos so that we can enable more workloads, to support HPC.

Let's get into a little bit of detail on the Spark side. Spark definitely can run on Mesos; we all know that. But in some scenarios, Spark running on Mesos is not as good as Spark running standalone. Consider the scenario of running multiple Spark jobs on Mesos concurrently. In this chart, the X axis is the total duration and the Y axis is the job run time, and every single dot is a job's finishing time. In some worst cases, a job takes even thousands of seconds. And here is the result with the Spark Session Scheduler, the new component we have on Mesos. What you can see here is that it not only reduces the total run time but also makes the workload consistent, meaning you get consistent run times: your workload is predictable and deterministic in production. This exactly demonstrates the power of workload scheduling working in conjunction with resource management, working perfectly together. And we believe that with this component, we're going to make Mesos the best open platform for Spark.

Let's look at a little bit more. How about interactive workloads? Here are Spark SQL and notebooks, and you want to run them in your cluster or maybe a cloud environment. People may think, oh, it's easy: go to Amazon or whatever public cloud, get some VMs, and you can run everything. But it's not so easy if you want to run it efficiently. If you go get VMs, you need to know how many VMs you need and at what time. But SQL is a high-level language for Spark: you have no idea how many tasks will be generated at each stage, how many jobs there will be, what the CPU and memory requirements are, when those tasks need to run, or with what SLA. It gets even worse if you have many users in the system; it becomes more complex. It would be very inefficient to have a siloed environment, say 10 VMs, for every single user.
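For reference, stock open-source Spark already exposes baseline elasticity knobs when running on Mesos. A hedged sketch of a `spark-defaults.conf` enabling dynamic executor allocation might look like the following; the master URL is a placeholder, and this is only standard Spark configuration, not the IBM Session Scheduler component itself (dynamic allocation on Mesos also requires the external shuffle service running on each agent).

```properties
# Placeholder Mesos master URL; replace with your own.
spark.master                          mesos://mesos-master.example.com:5050
# Let Spark grow and shrink its executor count with demand.
spark.dynamicAllocation.enabled       true
spark.shuffle.service.enabled         true
spark.dynamicAllocation.minExecutors  1
spark.dynamicAllocation.maxExecutors  50
# Per-executor sizing and an overall cap on the application's share.
spark.executor.cores                  2
spark.executor.memory                 4g
spark.cores.max                       100
```

This gives per-application elasticity; the point of the Session Scheduler, as described in the talk, is to go further and coordinate many such applications and users against one shared pool.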
With that siloed setup you may get 1% utilization, because over the entire workflow you may run one task for 1 or 2 seconds, and after that you look at the result while the entire cluster sits idle. What you need is a shared cluster and also an efficient scheduler, and the Session Scheduler is exactly the workload manager for this. It can aggregate all the demand, working directly with the Spark framework to understand the demand at every single stage, and then do fine-grained scheduling, dynamically allocating resources according to priority, and working perfectly with the resource manager, Mesos, to give you the performance you need, the SLA you need, and also the high utilization that satisfies your IT people. There is another talk on this, so I don't want to spend time on the detailed diagram; welcome to join the IBM session.

The Spark Session Scheduler is in production. We use it in the IBM Spark cloud service, where we support many, many thousands of users, and we also use it in our products. We already have quite a few customers in production. It is production-proven and now available on Mesos.

To demonstrate that, I would like to have a little bit of fun. This is a demo, and hopefully it will work; usually I don't have good luck with live demos, but let's find out. I have a cell phone with me. I'm going to use WeChat, and in WeChat I can upload a picture. It then goes through a few microservices, like Kafka and an HTTP server, and then goes into the Spark layer, where we have several Spark applications managed by the Spark Session Scheduler. The Session Scheduler is going to dynamically allocate resources on demand for those Spark applications, and those Spark applications are going to run in the cloud. The deep learning work will run on a GPU cluster, and it will do the processing and generate a new picture for you.
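The fine-grained, priority-driven allocation just described can be sketched as a toy: many sessions share one pool, and the workload manager re-divides the pool on each scheduling tick based on live demand and priority. This is purely illustrative; the function and names below are assumptions for the sketch, not the actual IBM scheduler or any Mesos API.

```python
# Toy sketch: divide a shared core pool among sessions by priority, then demand.
# Illustrative only; not the real Spark Session Scheduler.

def allocate(pool_cores, sessions):
    """sessions: list of (name, priority, demand) tuples.
    Higher-priority sessions are served first; ties keep list order.
    Returns a {name: granted_cores} plan that never exceeds the pool."""
    plan = {}
    for name, _priority, demand in sorted(sessions, key=lambda s: -s[1]):
        grant = min(demand, pool_cores)
        plan[name] = grant
        pool_cores -= grant
    return plan

# One scheduling tick: alice's high-priority session is fully served;
# the lower-priority sessions get whatever remains, in order.
tick = allocate(100, [("alice", 10, 60), ("bob", 5, 80), ("carol", 5, 30)])
print(tick)
```

On the next tick the demands will differ (interactive users burst and go idle), so the plan is recomputed, which is what keeps a shared cluster busy instead of holding VMs per user.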
Let's see whether I have luck today. So, this is my cell phone, this is WeChat; hopefully the network is OK. This is the name of the demo: Fantastic Art Maker. Here is the instruction; it's very simple. Let's upload a picture. How about doing a selfie with my Apple device? OK, smile. OK, use the photo. The next step is to select the style. Let me see, just pick something nice. OK, it's uploaded.

Let's try a few more. Let's have another picture; maybe I'll use this one. And let's have another style. We should be able to see something from the back end. Let's see. OK, so right now I have two applications allocating CPU slots and doing the deep neural network work using GPUs. We can see there are two GPUs currently being used, at almost 99%.

Let's try one more. Maybe, because I already shared this channel, some of the Mesos developers can submit some work to try it out. So, I'm going to try another picture. This time, let me go for the beautiful West Lake. Nice. OK, you can see it actually keeps increasing: there are more and more GPUs being used, more CPUs being allocated dynamically, and right now I have 5 GPUs in use. For sure, some people are submitting more work. That's fun. OK, you can see the resources being allocated dynamically to different applications. This is why this kind of scheduling is important.

OK, some results came back. Let's see. Can you see that? This is deep learning using GPUs: the default looks like this, and now it looks nice and beautiful, finished in 91 seconds using a Tesla K80 device. I think we have more people sending workloads right now; already around 10 GPUs are being used. This is all real time; it's amazing. OK, how about we look at another picture? You know, I think it looks nicer than the original picture, so I'm even more happy with this. All those GPUs are fully used. Oh, in real time, another picture came back. Isn't that beautiful?
This is the picture before, and this is the one after. I think everybody can become an artist. OK, let's switch back to the presentation; later I'm going to publish the WeChat channel, and everybody can have a try.

So, we talked about the Spark Session Scheduler, we talked about Kubernetes, we talked about LSF and MPI for traditional workloads, and we have those major components available. But for an enterprise, those components are not enough. You need to think about security for enterprise use, things like AD/LDAP and single sign-on. You need to think about an image registry, think about the network, think about authentication and authorization, things like that. There are a lot of open source tools available, and they are very valuable, but the question is whether you want to build it yourself. You definitely can use all the different tools and projects to build it from the ground up, but for some customers you don't have to, because time to value matters. That's why we came up with a bundled package: we bring together the open source components, the IBM integration, and the IBM value-add to provide one experience for enterprise customers.

And another piece of good news to share: it is available now. We have the Community Edition, and you can download it and use it for free. That's pretty much all from me. We have quite a number of talks from my colleagues; they have done some amazing work they would love to share with you. Hopefully you can join some of the sessions and try this Fantastic Art Maker. Let us know if you encounter a bug or anything like that; we'd love to hear your feedback. Thank you.