Hello everyone, my name is Xinyuan Wang. I'm a software engineer from Huawei. I'm a full-time open-source contributor and have worked on open source for more than seven years. I mainly worked in the OpenInfra community before and was a maintainer on the OpenStack Keystone project. Now I mainly work on AI-related open-source projects. I'm very happy to speak at Open Source Summit Europe, and I hope this topic can help you. Thank you. Today, our session is divided into three parts. First, we will give a brief introduction to Spark and deep learning on Spark. Then Xinyuan will give an overview and the technical details of how to support Ascend in ONNX Runtime. Finally, after those two parts are complete, we will walk through a whole workflow to show how to run an ONNX AI inference job on Apache Spark. Okay, let's start. We will first give a brief introduction to Spark. What is Spark? Maybe you already have a preliminary understanding of Apache Spark. Look at the picture on the left first: with a few very simple lines of code, we can use the Spark DataFrame API to complete big data processing. Behind that simplicity, Spark converts the same code into a Spark program that can run in a distributed environment. This environment contains many workers on physical nodes, containers, or anything else. When the user submits a Spark job, the Spark driver requests multiple workers from the resource management component. As we know, this resource management component can be YARN or Kubernetes; they are very popular resource managers. Then Spark helps us partition the data, splitting the big data into many small chunks and sending them to each worker. Each executor handles a series of small tasks that do something like reducing the data. The executor is where the real work happens.
Data is processed on the executors, including data conversion, transformations, aggregations, joins, and other operations. When the data is processed, the result can be written to distributed storage or returned to the user for further use. Precisely because of this simple API and its powerful functions, Spark is the most important component in the field of big data processing. As we all know, Python is a very important language for AI, especially where the data ecosystem and the AI ecosystem meet. In the process of training an AI model, we need to do a large amount of data processing, such as feature engineering and data cleaning. Spark therefore provides PySpark, which plays a very important role in connecting data and AI. For example, in the left picture, you can see Spark core as the base, which is written in Scala. PySpark is an interface for Apache Spark in Python. It not only allows you to write Spark applications using the Python API, but also provides a PySpark shell for interactively analyzing your data in a distributed environment. PySpark supports most Spark features such as Spark SQL, DataFrame, Streaming, MLlib, and Spark Core. In the right part, you can see that you can easily install PySpark using pip, and then just type pyspark to start a PySpark shell and use a very simple API to read and analyze a JSON file. This is a picture of the data flow in PySpark. PySpark is built on top of the Spark Java API; the data is processed in Python, and cached and shuffled in the JVM. In the left picture you can see that Py4J is a very important bridge between Spark and PySpark. It's a bridge between Java and Python, so you need to pickle and unpickle the data objects passed between Python and Java. The most important thing is how to transfer the data between Python and Spark. After this introduction to PySpark, I will walk through the release history of Spark and how it has improved the integration of data and AI.
Spark has had lots of improvements to help data and AI integration in past releases. The first, SPARK-24374, is named barrier execution mode. One of its motivations is to support deep learning frameworks on Spark. The core of this SPIP is task-level gang scheduling, or all-or-nothing scheduling. That means the tasks depend on each other: when one of the tasks fails, all tasks are retried together. Note that this is task-level scheduling, not resource-management scheduling. For example, Horovod is an open-source framework for deep learning at scale; it uses MPI to implement distributed deep learning for various DL frameworks. After this feature was supported, frameworks like Horovod could run on Spark, which helps with distributed training. The second one is accelerator-aware scheduling. The core feature of this SPIP is discovering GPU hardware and integrating it into Spark; you can easily integrate your specific GPU hardware, and the user can define the discovery script. This SPIP also provides a way to request GPUs in an application and helps with task scheduling in GPU environments. It is a very good SPIP for GPU integration: it helps discover the GPUs, and it helps applications run in environments with GPUs. The last one is data exchange. As I mentioned before, data exchange is very important between Spark and PySpark, and it's also very important to speed up data exchange between PySpark and other frameworks such as deep learning frameworks. This is the Arrow integration. Apache Arrow is a very important project for in-memory columnar data, helping with data exchange between processes or components. These improvements arrived around the Spark 3 releases. So we can see that the history of Spark has many improvements for data and AI integration.
So what else can we do to help the integration of Spark and AI frameworks? Okay, let's see this page. It shows the simplest inference workflow of a data and AI pipeline. In general, the user uses Spark as the data processing platform and ONNX Runtime as the inference platform. That's just an example; ONNX Runtime is one of several DL inference platforms. Spark's DataFrame interface is very friendly to data engineers. A data engineer can easily load data and complete feature engineering like ETL; they are very professional at data processing. As the de facto standard big data platform, Spark helps users easily and conveniently use a simple interface to distribute huge data to each executor, just like I introduced before. Especially in recent years, Spark has provided the pandas-on-Spark API since version 3.2, which is very convenient for users of the pandas API. You can easily migrate your pandas application to pandas on Spark: with almost no code changes, you can move it from a single machine to a distributed environment. At present, as we know, the various AI frameworks such as TensorFlow, PyTorch, and Horovod have their own implementations. Developers who use an AI framework are often experts in things like parameter tuning of that framework and its internal principles. So there are two roles here: data engineers and ML operations engineers, and there are some gaps and friction points between them, because data engineers often don't understand the frameworks very deeply; they don't know the internals of the various frameworks. Likewise, AI engineers don't know much about data engineering or the integration work.
Therefore, the Spark community has also initiated a discussion on a SPIP, hoping to hide the complex process of each framework behind a simple API, making the integration of Spark and AI inference smoother. The goal of this SPIP is to simplify the deployment of DL models for Spark inference and also to enable integration with third-party DL frameworks. Naturally, the target users are data engineers who need to deploy DL on Spark, or developers who want to deploy DL models on Spark. You can search for SPARK-38648 to learn more. But I want to note that this SPIP is still under discussion; it may not be a complete integration, just some simple APIs in PySpark to make the collaboration between data engineers and AI engineers smoother. Next, I'd like to introduce the basic concept of ONNX. ONNX is an open-source project managed under LF AI & Data, an umbrella foundation of the Linux Foundation that supports open-source innovation in artificial intelligence and data. The mission of LF AI & Data is to build and support an open artificial intelligence and data community, and to drive open-source innovation in the AI and data domains by enabling collaboration and the creation of new opportunities for all the members of the community. ONNX is an open format built to represent machine learning models. ONNX defines a common set of operators, the building blocks of machine learning and deep learning models, and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers. Using ONNX, you can develop in your preferred framework without worrying about downstream inferencing implications; ONNX lets you pair your preferred framework with your chosen inference engine.
What's more, ONNX makes it easier to access hardware optimizations and to use ONNX-compatible runtimes and libraries designed to maximize performance across hardware. ONNX Runtime is a cross-platform machine learning model accelerator with a flexible interface to integrate hardware-specific libraries, and it is an open-source project from Microsoft. Let's look at the flow on the right side of the slide. It shows how users work with ONNX and ONNX Runtime. First, users fetch pre-trained models in any format; for example, they could be PyTorch models, TensorFlow models, and so on. Then users convert the models to ONNX format using the conversion tools provided by the ONNX community or by the AI frameworks themselves. Next, users can load the ONNX-format model into ONNX Runtime, and the model can run on any hardware that ONNX Runtime supports. We can see that many AI frameworks and much hardware now support ONNX and ONNX Runtime. Next, I'd like to introduce the basic concept of Ascend. Ascend is the name of the NPU AI processors from Huawei, and around it Huawei has built the Ascend AI ecosystem. Let's take a look at the graph here. The red rectangles represent different Ascend processor models: Ascend 310 only supports AI inference, and both 710 and 910 support training and inference. Based on the processors, Huawei built a series of AI-related hardware, shown as the blue rectangles in the picture; they are called Atlas. Here I'd like to say more about the Atlas 300. It's a kind of PCIe card, used widely in data- and AI-processing servers, and our development and test work in this session is based on it as well. Then, on top of the hardware, the Ascend ecosystem provides a software layer called CANN; it's the yellow rectangles in the picture. CANN provides APIs to help developers quickly build AI applications and services on the Ascend platform. Frankly speaking, it's similar to CUDA in the NVIDIA ecosystem. Then, at last, on top of CANN is the user layer.
Any AI-related apps, frameworks, and so on can use Ascend easily through CANN. On the software side, CANN is the main point that both developers and AI frameworks should know, so let's focus on CANN. This is the CANN technical stack in the Ascend ecosystem; the picture here shows the newest version, CANN 5.0. As you can see, there are multiple layers in CANN: the service layer, the compilation layer, the execution layer, and the base layer. The service layer contains the operator library, the optimization engine, and the framework adapters. The operator library, named AOL, is a neural network library that also contains a computer vision library, a BLAS library, and more. AOE, the Ascend Optimization Engine, mainly speeds up the end-to-end execution of models through operator auto-tuning, subgraph tuning, and other techniques. The framework adapters mainly adapt the AI frameworks to Ascend. The compilation layer mainly contains ATC, the Ascend Tensor Compiler. This graph compiler is the control center of graph building and execution, and handles management of the graph runtime environment, the graph execution engine, operator packages, subgraph tuning, and graph operations. TBE here is the Tensor Boost Engine; it enables auto scheduling and operator building. In the next layer, the execution layer, ACL is the main point. ACL executes models and operators, schedules tasks, and manages compute units. It includes the runtime, the graph executor, digital vision pre-processing (DVPP), AI pre-processing (AIPP), and the Huawei Collective Communication Library (HCCL). The last layer is the base layer; I think there's not much more to say because its contents are basic: the base OS, SVM, and others. After this Ascend introduction, let's go back to ONNX.
Currently, if a user wants to run an ONNX model on Ascend hardware, they should first use the model conversion tool provided by CANN to convert the model from ONNX to the Ascend format. The flow is a little complex, the converted model may lose some precision, and the performance may be poor; in some cases, the model may not work correctly at all. To solve this problem, a better way is to find a way for ONNX models to work on Ascend directly. ONNX Runtime has a mechanism to support different hardware, called execution providers. So in ONNX Runtime, we'd like to add CANN as a new execution provider. Once it is done, users can run ONNX models on Ascend hardware via ONNX Runtime. Of course, we'll add the related CI system upstream as well, to make sure the CANN execution provider always works. The timeline below is our roadmap. First, we'll push the basic code upstream; the end-to-end flow will be done in it, and the ResNet model should work correctly on the CANN execution provider. At the end of this year, we'll finish all the ONNX operator support and make sure all the models in the ONNX model zoo work well on Ascend. Next year, we'll focus on optimization work, such as performance improvements and so on. Now the patch for CANN support is done; the link is shown here. You can now run ResNet and VGG models correctly on CANN using ONNX Runtime directly. If you are interested, please leave us a message and we can discuss more in the future; maybe we can temporarily provide Ascend resources for you to test. Okay, that's all for my share. Next, Yikun will share the full stack of how Spark, ONNX, and CANN can work together. This is a simple example showing how a developer can complete DL inference on Spark. We can see on the left that the only things users need to do are import the corresponding framework extension and specify the model URL.
Then all the complexity, including the data processing, data conversion, and framework initialization, is hidden in the model user-defined function. This model UDF would be supported or implemented by the AI framework provider: each framework just needs to provide the corresponding implementation according to its actual situation, whether that's PyTorch or TensorFlow. Then see the right picture: Spark will execute the ONNX inference in the Spark executors and gather the final result for you. In this way, users can easily complete ONNX inference on big data, with Spark splitting the data and running AI inference on different hardware in the Spark executors. At the same time, since ONNX Runtime is used as the framework for model inference in the executors, all hardware supported by ONNX Runtime can be utilized. Okay, thanks for joining us and thanks for watching. That's all of our sharing. Thank you.