I'd like to thank everyone for joining us today. Welcome to today's CNCF webinar, MindSpore and the Cloud Native Ecosystem. I'm Christian Jens, Cloud Consultant at Level 25 and a CNCF Ambassador. I'll be moderating today's webinar, and I'd like to welcome our presenters, Howard Huang and Yidong Liu, both Community Managers and Open Source Engineers at Huawei. Please bear with me if I get a pronunciation slightly wrong, and correct me later. A few housekeeping items before we get started. During the webinar, you are not able to talk as an attendee. There is a Q&A button at the bottom of your screen, right below the presentation. Please feel free to drop your questions in there and we will get to as many as we can at the end. This is an official webinar of the CNCF and as such is subject to the CNCF Code of Conduct. Please do not add anything to the chat or questions that would be in violation of that Code of Conduct. Basically, please be respectful of all your fellow participants and presenters. Please also note that the recording and slides will be posted later today to the CNCF webinar page at cncf.io/webinars. With that, I'll hand it over to Howard and Yidong to kick off today's presentation. Thank you very much, and good morning or good afternoon, everyone. Thank you very much for attending today's webinar. This is our first time doing a CNCF webinar; hopefully we're doing it the right way. So today's topic is MindSpore, a newly open-sourced deep learning framework, and how we adopted it in the cloud native ecosystem. I'm Howard Huang from Huawei; I'm an open source manager. And my colleague here is Yidong Liu. He's the author of the MS Operator, the MindSpore Kubeflow operator, and he will give a deep dive on it later in the talk. Okay, so basically the talk will have three parts.
I'll be giving a high-level introduction of MindSpore, then Yidong will walk you through how we deploy MindSpore on Kubernetes with Kubeflow, and then there will be a short demo. Okay. MindSpore is a new open source deep learning framework. Think TensorFlow, PyTorch, MXNet; MindSpore is a new addition to the slew of open source deep learning frameworks. We open-sourced it last Saturday, so it's fresh out of the oven. MindSpore is designed for developers and users to easily use in mobile, edge and cloud scenarios. Hopefully we can provide a very friendly design for developers, and also efficient execution for data scientists. MindSpore is highly optimized for Huawei's Ascend processors, but we also support general hardware like CPUs and GPUs. You can visit our official website; we provide both Chinese and English versions. The main repo is hosted on Gitee, sort of a Chinese version of GitHub, but we also provide a mirror on GitHub for those who are more familiar with it, and PRs and issues are open on GitHub, so you can very easily join the discussion. Okay, this is the overview of the architecture of MindSpore. Very similar to most of the mainstream deep learning frameworks, MindSpore has a front-end written in Python, so data scientists can write machine learning and deep learning models in Python really quickly and easily. Then we have a C++ back-end implementation of several key features. We also have another module called Graph Engine. Graph Engine is sort of the back-end engine for MindSpore; it provides many of the low-level optimizations, pipeline parallelism and on-device execution. For example, you can offload an entire graph through Graph Engine onto an Ascend AI processor, so you can get the maximum performance out of it. And then we have several back-end runtimes targeting different types of hardware.
So we have, for general computing, CPU and GPU; Ascend 310, which is mostly used for edge computing; Ascend 910 for large-scale cloud computing; and for mobile, Android and iOS. So, several key features that MindSpore brings to the world. The first one is automatic differentiation. Automatic differentiation is not a new thing per se, but MindSpore offers source-code-based automatic differentiation. For those of you familiar with compilation technologies, source-to-source compilation optimization is very useful for scenarios where you want to exploit the maximum performance of certain hardware. For example, there's another open source project called TornadoVM from the University of Manchester's Beehive Lab. TornadoVM also provides source-to-source compilation optimization, but for Java to be run on top of heterogeneous computing hardware; that is to say, Java source code is compiled to OpenCL C code. For MindSpore, it mostly goes from the front-end expression, compiled down to the source form the back end understands best. Another great feature is that when you're writing a model in MindSpore, you can add just one line to switch between a static graph and a dynamic graph. Static graph versus dynamic graph is kind of a forever-ongoing struggle in the deep learning community: for production, people usually prefer static graphs, but for debugging and development, people usually prefer dynamic graphs. MindSpore provides data scientists both ways; just add a one-liner and you can switch between the two modes. Another thing MindSpore brings is auto parallel. Typically in deep learning we have data parallelism and model parallelism. That means when you run distributed training, you can have either the data distributed across the cluster, or the model distributed across the cluster.
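As a side note on the source-code-based automatic differentiation mentioned above, the core idea can be shown with a toy sketch. This is purely illustrative, not MindSpore's actual implementation or IR: the "program" is a small expression tree, and differentiation is a transformation that produces a new expression tree, rather than tracing runtime values.

```python
# Toy source-to-source differentiation: expressions are data, and the
# derivative is another expression, produced by transforming the source.
# Expressions are tuples: ('const', c), ('var', name),
# ('add', a, b), ('mul', a, b).

def diff(expr, var):
    """Return the derivative of `expr` with respect to `var`."""
    kind = expr[0]
    if kind == 'const':
        return ('const', 0)
    if kind == 'var':
        return ('const', 1) if expr[1] == var else ('const', 0)
    if kind == 'add':
        return ('add', diff(expr[1], var), diff(expr[2], var))
    if kind == 'mul':  # product rule: (ab)' = a'b + ab'
        a, b = expr[1], expr[2]
        return ('add', ('mul', diff(a, var), b), ('mul', a, diff(b, var)))
    raise ValueError(kind)

def evaluate(expr, env):
    """Evaluate an expression tree given variable bindings in `env`."""
    kind = expr[0]
    if kind == 'const':
        return expr[1]
    if kind == 'var':
        return env[expr[1]]
    if kind == 'add':
        return evaluate(expr[1], env) + evaluate(expr[2], env)
    if kind == 'mul':
        return evaluate(expr[1], env) * evaluate(expr[2], env)
    raise ValueError(kind)

# f(x) = x*x + 3x, so f'(x) = 2x + 3
f = ('add', ('mul', ('var', 'x'), ('var', 'x')),
            ('mul', ('const', 3), ('var', 'x')))
df = diff(f, 'x')
print(evaluate(df, {'x': 5}))  # 13
```

The point is that `diff` runs entirely on the program's structure, before any data flows through it, which is what lets a source-to-source framework optimize the derivative code for a particular hardware target.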
Sometimes you can have hybrid parallelism to take advantage of both data and model parallelism. MindSpore supports both types of parallelism, and similar to the static/dynamic graph switch, it's also a one-line change: you can take even serial code and change it to parallel execution, taking advantage of Huawei's Ascend AI processors. So it's pretty amazing. MindSpore has its own IR defined because, as I mentioned, we rely heavily on compilation optimization. We also have two other modules open-sourced alongside the core deep learning framework. One we call MindInsight, which is our visualization tool. The other module, MindArmour, is an adversarial attack evaluation tool for models, so you can test the security of a model with it. Okay, technology aside, we also embraced an open governance model that we learned from CNCF and Kubernetes. For example, we have a Technical Steering Committee set up with 40 members from various universities, companies, startups and institutions, actually across the globe, from China, Europe, the UK and the US. We want to make sure the community has a truly open and global technical governing body. Similar to Kubernetes, we also have SIGs and working groups set up. SIGs are mostly in charge of system-specific feature development; for example, we have front-end expression, compiler, executor, model zoo and so forth. Working groups handle topics that cut across SIGs; for example, we have documentation and infrastructure. We welcome the establishment of further SIGs and working groups if there's a need, for example a research working group or a security working group. The establishment of SIGs and working groups will be approved by the TSC, and everything will be done according to our charter.
With this governance structure, we want to guarantee an open development procedure. We also have community partners that are not necessarily involved in MindSpore community governance per se, but collaborate in open source. For example, DGL, which is really good at graph neural networks, and an open source project from LF AI called Milvus, which is a great project providing vector processing, so we can basically build an index-based search engine. Okay, without further ado, I will hand over to Yidong to talk about how we deploy MindSpore on Kubernetes and how we build MindSpore with the cloud native ecosystem. Yidong, are you there? Yes, can you hear me? Yeah. Okay, great. Hello, I am Yidong Liu from Huawei, and I will introduce something about MindSpore and cloud native on Kubernetes here. If we take a look at other deep learning frameworks, including TensorFlow, PyTorch and MXNet, these frameworks benefit from implementing TFJob, PyTorchJob or MXJob, custom resource definitions, or CRDs, and using these CRDs to create and manage deep learning jobs in a Kubernetes cluster, mainly for distributed training. As Howard mentioned, MindSpore has some highlighted technical features, including automatic differentiation and auto parallel. So if MindSpore can also leverage the resource allocation and management capabilities of Kubernetes, distributed training becomes much easier and more controllable to achieve in a container environment. Plus, monitoring the job also becomes visible through an operator. So the MS Operator is something we wanted to achieve in a short time. Here you can see MindSpore plus the MindSpore operator; the MS Operator is within the MindSpore scope, but right now, since MindSpore is very young, it's only four days old, we have only finished a proof of concept of training a simple MNIST model using CPU in Kubernetes.
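For readers less familiar with the CRD mechanism just mentioned: an operator like TFJob or PyTorchJob registers its job type with the Kubernetes API server through a CustomResourceDefinition, and then a custom controller watches for instances of that kind. A rough sketch might look like the following; the group, version and names here are illustrative guesses, not the actual TFJob or MS Operator manifests (the `v1beta1` API matches the Kubernetes 1.14 used in the demo later):

```yaml
# Hypothetical CRD registering an MSJob kind with the API server.
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: msjobs.kubeflow.org   # must be <plural>.<group>
spec:
  group: kubeflow.org         # illustrative group
  version: v1alpha1
  scope: Namespaced
  names:
    kind: MSJob
    singular: msjob
    plural: msjobs
```

Once this is applied, `kubectl get msjobs` works like any built-in resource, and the controller's job is to reconcile each MSJob into the pods and services the training run needs.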
Hopefully we can see distributed training on multiple back ends, including CPU, GPU and the Huawei Ascend processor, in the near future for more demos. Next slide. I want to talk about MindSpore and the Kubeflow ecosystem here. Kubeflow just announced its major 1.0 release recently, with the graduation of a set of core applications, including Kubeflow's UI, the Jupyter notebook controller and web app, TFJob, PyTorchJob, kfctl and so on. So Kubeflow is, in our eyes, a very mature community to cooperate with, using our powers together with MindSpore to push both of us forward. The MindSpore community is also driving to collaborate with Kubeflow, as well as making the MS Operator more complete and well organized, with dependencies and packages always up to date. All these components will make it easy for machine learning engineers and data scientists to use cloud assets, both public and on-premise, for machine learning workloads. MindSpore is looking forward to enabling our developers to use Jupyter, which is one of our tasks, to develop models. In the future, developers can use Kubeflow tools like Fairing, Kubeflow's Python SDK, to build containers and create Kubernetes resources to train their MindSpore models. Once training is completed, we can also use KFServing to create and deploy a server for inference, so that we can complete the lifecycle of machine learning. Another thing I want to talk about is distributed training. Distributed training is another field that MindSpore will be focusing on. There are two major distributed training strategies nowadays: one based on parameter servers, as in TensorFlow, and the other based on collective communication primitives, such as all-reduce. The MPI Operator is already implemented and used in the Kubeflow community.
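To make the all-reduce strategy just mentioned concrete, here is a naive sketch of its semantics in plain Python; this is illustrative only, not MindSpore or MPI Operator code. Every worker contributes a gradient vector and every worker receives the element-wise sum. Production implementations such as MPI or NCCL use ring or tree algorithms to avoid a central bottleneck, but the result each worker sees is the same.

```python
# Naive all-reduce: gather all workers' gradient vectors, sum them
# element-wise, and give every worker a copy of the total.

def allreduce_sum(worker_grads):
    """Return, for every worker, the element-wise sum of all workers' vectors."""
    total = [sum(vals) for vals in zip(*worker_grads)]
    return [list(total) for _ in worker_grads]

grads = [
    [1, 2],  # gradient shard computed on worker 0
    [3, 4],  # worker 1
    [5, 6],  # worker 2
]
reduced = allreduce_sum(grads)
print(reduced)  # [[9, 12], [9, 12], [9, 12]]
```

In synchronous data-parallel training, each worker would divide this sum by the worker count to get the averaged gradient and apply the same update, keeping all model replicas identical.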
The MPI Operator is one of the core components of Kubeflow, and it makes it easy to run synchronized, all-reduce-style distributed training on Kubernetes. The MPI Operator provides a CRD for defining a training job on a single CPU or GPU, multiple CPUs or GPUs, or even multiple nodes. It also implements a custom controller to manage the CRDs, create dependent resources and reconcile the desired state. So if MindSpore works together with the MPI Operator, with MindSpore's multiple back ends, including the high-performance Huawei Ascend chips, I think it is possible that MindSpore will bring distributed training to a whole new level. All right, next slide. This is the MS Operator workflow I imagine for the future, a high-level set of tasks needed to run a MindSpore job on Kubeflow. First, we write or reuse the Python training code. Then we build the YAML file based on the CRD definition of MSJob, describing the training job: the container image, the program or training file we wrote in step one, and the parameter settings. Then we find or build a Docker container image containing all the code and dependencies. And last, we just send the job YAML file to the cluster for execution with a kubectl command. Out of the box, Kubernetes doesn't understand how distributed MindSpore works; Kubernetes only needs help understanding where the daemons are running and how they talk with one another. So we can see the general flow of how the different parts of Kubeflow work together to get MindSpore containers working on Kubernetes and coordinating with each other. Okay, next slide. Actually, there are some fun facts about installation issues. As we mentioned, MindSpore has been open source for just four days, so it's super young.
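The workflow steps just described could translate into a manifest along these lines. This is only a sketch: the MSJob fields shown are modeled on Kubeflow's TFJob and are my assumptions rather than a finalized CRD schema; the image tag matches the one used in the demo, and the script path is hypothetical.

```yaml
# Hypothetical MSJob manifest; field layout borrowed from TFJob-style CRDs.
apiVersion: kubeflow.org/v1alpha1   # assumed group/version for the MSJob CRD
kind: MSJob
metadata:
  name: msjob-mnist
spec:
  replicaSpecs:
    - replicas: 1
      template:
        spec:
          containers:
            - name: mindspore
              image: mindspore/mindspore-cpu:0.1.0-alpha
              command: ["python", "/opt/demo/lenet.py"]  # hypothetical path
          restartPolicy: OnFailure
```

Submitting it would then just be `kubectl apply -f msjob-mnist.yaml`, after which the controller creates the pods and reconciles them toward completion.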
And most of the issues we encountered in our open source community are about installing or building, because many developers want to build from source, but many fail. Sometimes it's a compilation error; sometimes their environment is not suitable; sometimes they want to install on a Mac, but right now MindSpore cannot be built directly on a Mac. But we have some alternative solutions. We prepared MindSpore Docker images for users, both a CPU version and a GPU version. It turned out that this is a great solution to these installation issues. Here, as you can see on the right, one of our developers said it was more comfortable installing from Docker than building from source. This is a translation: "No pain installing, no pain running the demo. The getting-started experience is fantastic. I strongly recommend everyone install MindSpore via Docker." So that's the power of Docker and Kubernetes. Okay, next one. In this demo, I recorded a video of training LeNet with the MNIST dataset using MindSpore CPU on a single node in a Kubernetes cluster. Howard, can you go to YouTube? Okay. The network conditions are not very stable, so we didn't stream it live; the demo is pre-recorded and on YouTube, and you can check it out anytime you like. I'll just play it. All right, this is my virtual machine. We're doing some version checks: Docker, kubectl and the cluster info. All right, we're using Kubernetes version 1.14.0, and we cd to our source root. These are all the source code files, and we cd to the examples we'll run today. We store our training data in the MNIST folder, and we check our YAML file; it's pretty simple. The image we use is mindspore-cpu:0.1.0-alpha, which is the first version of our release, and we will run lenet.py as our training script. Okay, we can check our script's source code.
And this is a very simple Python file, defining a LeNet model and setting some parameters like the learning rate and epochs; it should be within 200 lines. So this is a MindSpore-written model, right? Yes. For our demo, we set the number of epochs to one, and only one, just for demonstration. Let's run it now. I think we can start creating our training job. All right, the pod is created. We can get the status; now it's ContainerCreating. Just kidding, it won't take 2,000 years; it should be within about one minute. Okay, we're finished; we used four minutes to finish our training. Let's get the logs. All right, scroll to the start. Then we can check our accuracy: the accuracy is more than 96%. So that's it; that's how to train a MindSpore LeNet on a Kubernetes cluster. All right, that's the demo, thank you. Okay, so as we mentioned, this is a new open source project, and we definitely want every developer who's interested in deep learning development to participate. There are a lot of ways to participate in the community. You can check out the code; as I mentioned, our main development happens on Gitee. Gitee also has English support; it's very nice and easy. But if you still prefer GitHub, you can use our mirror repo and submit PRs or issues there; we'll have someone pick up the PR once it's reviewed and move it over to Gitee for landing. As Yidong just mentioned, you can try experimenting with MindSpore using Docker; this is probably by far the most convenient way we've seen. We actually prepared the CUDA Docker containers with the help of another developer from the community, who helped answer the issue and provided the build instructions, so we have two versions of the CUDA container. And in April, for the China region, we'll open up Huawei Cloud's Kubernetes service with an Ascend back-end cluster, so that will be the Ascend back end you can experiment with. For discussion, we are on Slack.
Sorry for the long link, but you can join our discussion on Slack, or, if you have any questions or other things you want to discuss with the community, we strongly advise you to register for and subscribe to our mailing list. Yeah, that's it. Just head over to our website and check it out. We have FAQs, tutorials and API documentation. Hopefully it can provide many of the answers you're looking for. Thank you very much. Any questions? Great. Thanks Howard and Yidong for the great presentation. And yeah, time for questions. If you have questions, please do drop them into the Q&A section right in Zoom. And we have one, which is: can you please paste the link to all of these things into the chat? All the links? Yeah, maybe; I don't know if it's about the mailing list. Samuel, if you could be a little bit more specific and put in what link you're looking for. The question is just "please paste link". Okay. So the slides will be made available, right? So, okay, yeah. Maybe to answer the question: the slides will be made available, so you can just look the links up online. I can do it now; it will also be in Slack in a minute. Yeah, you can join the conversation in the Q&A section. Any other questions? Anything else? Okay, sounds like that's it. So yeah, thanks again for the great presentation, Howard and Yidong. And that's all we have for today. Thanks everyone for joining us. Again, the webinar recording and slides will be online later today. We're looking forward to seeing you at a future CNCF webinar. Have a great day. Thank you so much. Thank you very much. And everyone stay safe and stay healthy. Yeah, thank you everyone.