Hello, Ni Hao, and greetings in whatever other languages I can manage. Hello to everybody, and I appreciate your time and attendance. My name is Yongnian Le and I am the architect of GCC for the openEuler compiler. My colleague is Mingchuan Wu, a committer for the GCC compiler in openEuler. Both of us come from the openEuler community, and today we will share with you our compiler plugin framework for customized compilation and development. Our session contains four parts: first the introduction, second how it works, then several case studies, and finally our future plan. Through this presentation you can understand who we are, what we do, and where we are going. By the way, I have given our framework the name Pin, which is short for "plug-in framework for compilers". Although we started this work from the GCC compiler, the framework can actually serve multiple compilers, including GCC and LLVM.

At the beginning, let me introduce our organization a little bit. openEuler is an open-source operating system for digital infrastructure. As our chairman Mr. Hu introduced in the first day's keynote, the openEuler operating system is a project of the OpenAtom Foundation. It belongs to the same organization as OpenHarmony, but its mission is a little different: openEuler tries to be a base for digital infrastructure. Like many other communities, the openEuler community has multiple special interest groups, including the Compiler SIG. There are many teams inside this group, covering the compilers used by openEuler, including the JDK, GCC, LLVM, and other compiler technologies as well. The members come from different countries and represent different companies that are interested in compilers for the community. So you can see we are organized in an international manner, and the work is done in a transparent, open-source way.
Then we'll go to the GCC compiler. As I said, there are many teams inside the Compiler SIG working on different compilers. Like any other interest group we have debates and we have collaborations: some people advocate LLVM, and some promote the value of GCC. GCC is the foundational compiler for openEuler, the default compiler used to generate the binaries and images for the whole community. As for the mission: compared with the upstream GCC compiler, our compiler tries to provide ecosystem compatibility, ease of use, and performance leadership in specific scenarios, based on the openEuler open-source community. Our team comes from different companies, and I am the architect for the whole team. The work has run for about three years, starting from 2019, and so far we have made several contributions. For example, we improved SPEC CPU 2000 int performance by about 15% over upstream GCC. We provide one-click feedback-directed optimization that is very easy for users to enable. We collaborated with hardware teams to enable new chip silicon for users. And last but not least is the plugin framework, which we will share in today's session.

So first, why did we need this? The idea comes from a real problem. As I remember, a gentleman from Dynatrace shared how a project gets initiated: the first step is to ask questions. So at the beginning we asked ourselves: many companies have recently been working on ARM processors. The architecture is very popular; not only NVIDIA but also Ampere and Fujitsu have created processor products based on the ARM architecture.
At the same time, different cloud service providers offer ARM-based solutions in their cloud environments, which can reduce cost for customers by about 20%. So the ARM processor is very popular, and the ARM architecture is evolving quickly; for example, SIMD has grown from 128-bit NEON to the scalable vectors of SVE. But we don't want users to rewrite their code again and again. For example, if they already have code written with NEON, we would like our compiler to generate SVE instructions directly. That would release the power of the latest CPU architecture without creating a burden for the developer. This is the purpose of our work, but in the real world it creates some difficulties. For example, we don't want to change the whole architecture and discuss with GCC upstream for a long time before the infrastructure or the whole feature is turned on. At the same time, users may have their own specific version of GCC, and I don't expect them to change to a new compiler just for one feature. So we wanted a way to help them without creating extra burden for them. Also, as I said, in our community we have two compiler infrastructures for C/C++, namely GCC and LLVM, and we hope our work does not need to be repeated from one project to the other infrastructure.

So this is the motivation, and it naturally leads to compiler plugins. A compiler plugin is actually not a new concept; many developers choose plugins for their own purposes. A plugin lets the user do specific work on top of the compiler's capabilities without changing the compiler as a whole. It is a way to shorten the development life cycle and reduce development effort.
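To make the NEON-to-SVE idea concrete, here is a toy Python lookup table. The NEON and SVE intrinsic names are real ACLE names, but the translation approach shown is purely illustrative: the actual work happens on compiler IR, not on source text, and SVE calls additionally take a governing predicate, so a real translator must rewrite arguments as well, not just names.

```python
# A few real NEON intrinsics and their closest SVE ACLE counterparts.
# (This name map is illustrative only; a real NEON-to-SVE pass works
# on compiler IR and must also supply SVE's governing predicate.)
NEON_TO_SVE = {
    "vaddq_f32": "svadd_f32_x",
    "vsubq_f32": "svsub_f32_x",
    "vmulq_f32": "svmul_f32_x",
}

def sve_equivalent(neon_intrinsic: str) -> str:
    """Look up the SVE counterpart, or report that no mapping exists yet."""
    return NEON_TO_SVE.get(neon_intrinsic, "<no mapping>")

print(sve_equivalent("vaddq_f32"))   # svadd_f32_x
print(sve_equivalent("vld1q_f32"))   # <no mapping>
```

The "no mapping" fallback matters in practice: coverage is built up incrementally, and code using untranslated intrinsics must keep compiling as NEON.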
So this is the reason compiler plugins are quite popular recently. Many vendors and tool companies choose plugins for their own purposes, and many well-known tools are built on compiler plugins, no matter which compiler infrastructure they choose. Plugins work pretty well, but there are still some challenges. The first one is repeated building. Take kernel development: some people work on plugins that harden the kernel to enhance its security capabilities, but GCC has multiple versions, so the user needs to build a separate plugin for each GCC version to avoid incompatibility issues. Besides compatibility, there are also logging needs and integrity verification. All of these common capabilities have to be re-created for each plugin again and again. We don't want that effort to be invested multiple times.

At the same time, as I said, we have both LLVM and GCC, and we don't want plugin history to repeat itself either. For example, randstruct is a very useful plugin that randomizes structure field order for the user; it was introduced for GCC in 2017, and two years later the same code was rewritten again for LLVM. We hope to have an infrastructure that allows the user to write a plugin just once and run it on multiple compiler infrastructures, which reduces the effort for the user. That is where the idea of our framework comes from: we call it Pin, a plugin framework for compilers.
First, we resolve the repeated building: the user only needs to focus on the plugin logic itself, and does not need to care much about common facilities like logging, compatibility, and integrity verification. Those features are necessary for every plugin, but the effort to build them should not be spent again and again. At the same time, since we have two compiler infrastructures, GCC and LLVM, a plugin only needs to be written once to run on both. This is the first challenge our framework attacks.

Second, we abstract the common parts of the plugin logic into an IR and an API, to standardize the interface between the plugin logic and the compiler. In this way, we only need to provide a different client for each infrastructure, and the user does not need to rewrite their plugin logic again and again. That is how we resolve the second challenge.

Third, we provide and enhance the common facilities. So far we think users need integrity verification, compatibility support, and logging capability: the three things a plugin user finds most necessary. In the future, developers might request more; for example, they might need debugging capability as well. We think we can improve this step by step, through collaboration upstream, and as there are more users of this framework, we can add more capabilities in a step-by-step manner. But you might be wondering how we achieve this. So here I will introduce our committer, Mingchuan Wu, to share the details. Thank you.

My name is Mingchuan Wu, and I am a compiler committer in the openEuler community.
So it's my pleasure to introduce the implementation details of Pin. Let's first take a look at the overall design. From Yongnian's introduction, we know that many developers of compiler plugins hope to shorten building and testing time as much as possible, and they are not willing to go into the implementation details of every compiler, such as GCC and LLVM. To this end, our plugin framework decouples developers from the intermediate representation of any specific compiler. The plugin's dynamic library and verification files are provided to the plugin users; a plugin only needs to be developed once, avoiding repeated development for multiple compilers. Users run the plugin as a plugin server that communicates with the client of each compiler, enabling one plugin on multiple compilers, and both the server and the client of the plugin framework provide common capabilities, such as integrity verification and logging. This helps plugin users quickly use the excellent capabilities of the compiler.

Now let's look at the overall architecture of the work. The major feature of Pin is the decoupled design of the plugin server and the compiler client. This design allows the plugin logic to run as an independent process through the plugin server, decoupled from any specific compiler; developers only need to develop once against the plugin server, and plugin users can choose a specific compiler client according to their own needs. The server and the client communicate across processes through gRPC to transmit IR data. Taking a closer look: the plugin server provides plugin developers with an IR that is decoupled from the compiler, and provides plugin APIs for quick development.
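The server/client decoupling described above can be sketched in a few lines of Python. All class and function names here are hypothetical, invented for illustration; the point is only that the plugin logic is written once against a neutral interface, and thin per-compiler clients adapt it.

```python
from abc import ABC, abstractmethod

class CompilerClient(ABC):
    """Hypothetical neutral interface: plugin logic sees only this API,
    never a compiler-specific IR such as GIMPLE or LLVM IR."""
    @abstractmethod
    def list_functions(self):
        ...

class GccClient(CompilerClient):
    def list_functions(self):
        # In the real framework this would query GCC via its plugin
        # system; here we return mock data.
        return ["main", "helper"]

class LlvmClient(CompilerClient):
    def list_functions(self):
        # Likewise, this would query LLVM in the real framework.
        return ["main", "helper"]

def count_functions(client: CompilerClient) -> int:
    """Plugin logic written once against the neutral API."""
    return len(client.list_functions())

# The same plugin logic runs unchanged on both clients.
print(count_functions(GccClient()))
print(count_functions(LlvmClient()))
```

In the real framework the two sides are separate processes talking over gRPC, but the write-once property comes from exactly this kind of interface boundary.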
The framework also provides many common capabilities on the server side, such as multiple levels of logging, and a running monitor to help keep the execution of server and client synchronized. A communication module handles this communication, and makes it efficient by serializing and deserializing the IR data. We have also made separate compiler clients for GCC and LLVM. These clients can be loaded through the plugin system of each compiler, to quickly enable plugin capability without rebuilding the whole compiler. In the client, in addition to the same common capabilities as the server, Pin also provides a compatibility check that verifies the compiler version, and SHA-256 integrity verification. We also provide event management, which helps developers accurately map plugin registration points onto the compilation pipeline. Of course, the key capability of the client is the translation and conversion of IR: mapping the IR and API of the plugin onto the actual compiler, and realizing the portability of plugins across specific compilers.

As everyone will appreciate, we found that the major difficulty of our work is how to translate between the IR used by the developer and the IR of each compiler. In fact, once you develop against the IR of a specific compiler, it is very difficult to port to other compilers, because the IR formats and structures of each compiler are very different; the migration cost is very high, and portability is very poor. Therefore, we developed an IR based on MLIR: the plugin dialect. Developers can build on this IR and its API, covered by the IR translation provided by our framework, so that their work can be used on many compilers. The plugin dialect is provided by our framework to developers, and it is based on MLIR. We chose MLIR because it is a high-quality infrastructure that can benefit multiple compilers.
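The integrity and compatibility checks just mentioned are conceptually simple, and a minimal sketch shows the shape of both. The SHA-256 comparison below is standard practice; the version-gating policy is an illustrative assumption, not Pin's actual rule.

```python
import hashlib

def verify_integrity(plugin_bytes: bytes, expected_sha256: str) -> bool:
    """Reject a plugin whose SHA-256 digest does not match the published
    one -- the kind of check the client performs before loading."""
    return hashlib.sha256(plugin_bytes).hexdigest() == expected_sha256

def is_compatible(compiler_version: tuple, supported: list) -> bool:
    """Load the plugin only for supported compiler versions.
    (Illustrative exact-match policy; a real check might use ranges.)"""
    return compiler_version in supported

payload = b"fake plugin payload"
digest = hashlib.sha256(payload).hexdigest()
assert verify_integrity(payload, digest)                 # untampered: accepted
assert not verify_integrity(payload + b"!", digest)      # tampered: rejected
assert is_compatible((12, 3), [(10, 3), (12, 3)])        # known version: loaded
```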
So we want to use MLIR and its general representations to help us quickly develop an IR for plugin development, and using MLIR as an abstraction helps us realize interoperability of IR across many compilers. Another important reason is MLIR's convenient and fast infrastructure, which helps us convert between IRs quickly and efficiently. So what is MLIR, and why does it have this power? Chris Lattner and his co-authors published the MLIR paper at CGO 2021, and it shows that MLIR began with the observation that modern machine-learning frameworks are composed of many different compilers which do not share a common infrastructure or design principles. The compiler industry has the same problem. The MLIR project aims to tackle these challenges directly: by making it cheap to define and introduce new abstraction levels, and by providing out-of-the-box infrastructure to solve common compiler-engineering problems, it reduces the cost of building domain-specific compilers and of connecting existing compilers together. That is actually the biggest reason why we chose MLIR.

Maybe some people think MLIR is only used in the field of AI; however, the "ML" in MLIR does not refer to machine learning but to multi-level. We believe that using MLIR in traditional compilers is very promising. MLIR provides a declarative system for defining IR dialects, which facilitates expansion and evolution. It also provides a wide range of common infrastructure, which makes development work easier, and very simple and convenient conversion with existing dialects. That's why we think MLIR is the right fit for this project.

As we know, MLIR's design supports extensibility. The figure shows the basic unit in MLIR: the operation. Everything, from instruction to function to module, is modeled as an operation in this system.
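The "everything is an operation" structure can be sketched as a toy data model. This is not MLIR's API, just a minimal Python imitation of its nesting: operations hold regions, regions hold blocks, blocks hold operations again, so a module, a function, and an add instruction are all the same kind of node.

```python
from dataclasses import dataclass, field

# Toy imitation of MLIR's uniform structure (not the real MLIR API).
@dataclass
class Region:
    blocks: list = field(default_factory=list)

@dataclass
class Block:
    operations: list = field(default_factory=list)

@dataclass
class Operation:
    name: str
    regions: list = field(default_factory=list)

# Build: a module containing a function containing one add instruction.
add = Operation("arith.addi")
func = Operation("func.func", [Region([Block([add])])])
module = Operation("builtin.module", [Region([Block([func])])])

def count_ops(op: Operation) -> int:
    """Walk the nested structure and count every operation."""
    return 1 + sum(count_ops(inner)
                   for r in op.regions
                   for b in r.blocks
                   for inner in b.operations)

print(count_ops(module))  # module + func + addi = 3
```

Because a single recursive walk handles every nesting level, passes written against this shape do not need special cases for modules versus functions versus instructions; that uniformity is what makes defining new dialects cheap.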
The MLIR dialect for our plugin framework is the plugin dialect. We have designed its statements and types, and they conform to MLIR's conventions. One important task is to introduce GCC's intermediate representation, GIMPLE, into the MLIR toolchain. So we designed and implemented the translation from GIMPLE to the plugin dialect, including function and operation translation. On the plugin server we use the plugin dialect for plugin development, and we use it in cross-process communication to pass IR data and API calls to the compiler. When the user runs a plugin, the plugin logic is executed step by step, and the compiler is scheduled to operate on the IR according to the API calls made by the developer. At that point, the translation module translates the required compiler IR into the plugin dialect and passes it to the server to execute the plugin logic. For example, on the GCC client, IR data is exchanged through translation between GIMPLE and the plugin dialect; on the LLVM client, conversion is performed through the LLVM dialect corresponding to LLVM IR, and the MLIR infrastructure is used to complete the translation quickly.

Now let's revisit the issue mentioned earlier: compilation would fail when the GCC version used to compile the module was even slightly different from the one used to build the plugin. This forces plugin maintainers to spend a lot of energy maintaining the plugin per GCC version. But if we develop based on Pin, because the plugin logic is decoupled from the compiler, the plugin maintainer only needs to focus on maintaining their own plugin; the GCC client supports multiple versions of GCC. The division of responsibility is clear, the efficiency is higher, and the maintenance workload is also greatly reduced. Next, let's look at two cases.
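The GIMPLE-to-plugin-dialect translation described above is, at its core, a mapping from one operation vocabulary to another. The sketch below uses hypothetical statement-kind and dialect-operation names (the real tables in the framework are far larger and carry types and operands in full); the one design point it does capture is keeping unknown statements as an opaque op, so partial coverage never breaks the pipeline.

```python
# Hypothetical mapping from a few GIMPLE statement kinds to
# plugin-dialect operation names (names invented for illustration).
GIMPLE_TO_DIALECT = {
    "gimple_assign_plus": "plugin.add",
    "gimple_assign_mult": "plugin.mul",
    "gimple_call":        "plugin.call",
}

def translate(gimple_stmts):
    """Translate (kind, operands) GIMPLE statements into plugin-dialect
    ops; unknown kinds become an opaque op rather than an error."""
    out = []
    for kind, operands in gimple_stmts:
        op = GIMPLE_TO_DIALECT.get(kind, "plugin.opaque")
        out.append((op, operands))
    return out

stmts = [("gimple_assign_plus", ("a", "b")),
         ("gimple_call", ("printf",)),
         ("gimple_goto", ("L1",))]   # no mapping yet -> opaque
print(translate(stmts))
```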
The first case is an optimization pass called array widen compare. This is an optimization feature of GCC for openEuler: it uses a wide data type to compare multiple elements at a time. As the figure shows, this optimization can improve SPEC performance by 7%, and it took us months to develop for GCC. Making it available in LLVM as well would have required roughly another month of development. Instead, we migrated this optimization onto Pin, and it took only 7 days. Now the optimization is available for both GCC and LLVM and can be enabled quickly, which reduces the maintenance cost for developers.

Now let's go back to the earlier idea: we can develop a NEON-to-SVE translation based on Pin, on the plugin server, and use the respective plugin systems of GCC and LLVM to launch the compiler clients and enable the capability. Thanks to the design of our framework, we don't have to modify the compiler source code, so we can shorten the time to build and test new features. The design of the plugin server and the compiler client allows developers to write plugins against the plugin dialect, and both GCC and LLVM are enabled.

Finally, let's introduce our future plan. Okay, it's my turn again. Now that you know how the plugin framework Pin works, we will show you our future plan. It contains at least four items. The first is to improve the coverage of IR functionality and the transformation utilities. The IR of the GCC and LLVM infrastructures is very large, and currently we cover about 70 to 80% of the IR capabilities of GCC; there is still more to enhance, step by step, based on usage. The second item is user experience. As I said, for plugin usage we need at least logging, integrity verification, and compatibility; we understand those three items are very useful for the plugin user.
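The array-widen-compare case described earlier can be sketched in Python: instead of comparing one byte at a time, check eight bytes at once as a single 64-bit word and fall back to a byte scan only inside the mismatching word. This is the idea of the optimization, not its actual GCC implementation, and it assumes equal-length inputs.

```python
import struct

def first_diff_bytewise(a: bytes, b: bytes) -> int:
    """Baseline: compare one byte at a time (assumes len(a) == len(b))."""
    for i in range(len(a)):
        if a[i] != b[i]:
            return i
    return len(a)

def first_diff_widened(a: bytes, b: bytes) -> int:
    """Widened compare: check 8 bytes per step as one 64-bit word,
    scanning byte-by-byte only inside a mismatching word."""
    n, i = len(a), 0
    while i + 8 <= n:
        wa, = struct.unpack_from("<Q", a, i)
        wb, = struct.unpack_from("<Q", b, i)
        if wa != wb:
            break          # mismatch is somewhere in this word
        i += 8
    while i < n and a[i] == b[i]:
        i += 1
    return i

x = b"abcdefghijklmnop"
y = b"abcdefghijkLmnop"
assert first_diff_bytewise(x, y) == first_diff_widened(x, y) == 11
```

In compiled code the wide load replaces eight compare-and-branch iterations with one, which is where the reported speedup on comparison-heavy SPEC workloads comes from.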
But plugin developers might need other capabilities, such as debugging, and if users have more requirements, we can also consider adding them to our roadmap. That is the second item. The third is to extend the plugin mechanism. So far, our plugin mechanism basically focuses on the compiler middle end, which is the mechanism that both the GCC and LLVM infrastructures provide. But a compiler also has a front end and a back end, and there is the linking stage, popularized by link-time optimization. We would like to extend our work to the other stages of compilation, the whole source-to-binary build process, so we can add capabilities end to end. That would require changes to the current GCC and LLVM infrastructures, so we put it as our third step, and we need to align and collaborate with the compiler upstreams on it. The fourth item, as I said, is to contribute upstream: not only keep this in our openEuler community, but extend the work, both the current framework and the compiler infrastructure end to end, in collaboration with the upstream communities, to make it better in the future.

So basically those are the four items I would like to list here. But I understand this list is endless, and we would like to call for more contributions from the community so we can create a more powerful infrastructure for compiler plugins, because we think it is very helpful for the compiler community. At the same time, if you don't want to develop the compiler but just want to use it, your feedback is also very helpful, and we can improve step by step in the future. Those are the future plans, and lastly I will provide some additional information.
All our source code is open source, and we provide documentation for users to follow. There is also a forum where you can discuss and leave feedback, and you can send us email or scan the QR code to communicate with us and share feedback based on your experience. We really appreciate your suggestions. As I said, openEuler is just a baby compared with other very powerful communities; we are just at the beginning, but we would like to embrace collaboration and embrace innovation together, to build a brilliant future together. Thank you.

I'm not sure, do you have any questions? Yes? So you mean mixing those transformations together, or something else? I'm not sure I understand your question. Could you... Yes? You mean how to make sure the translation is correct, right? Okay. As I said, we will add more debugging capability in the future, so the user can determine whether there is a logic failure in the plugin itself or a failure in the transformation. So far we do it with some internal tools: we use a mechanism that checks the program works with our transformation and also works without it, so we can compare the transformed and untransformed code to make sure it behaves the same. But we really lack an engineering mechanism for debugging and isolating problems between the plugin logic and the infrastructure itself. That is the work we are doing as our next step. So far we work in this comparison manner; it's not very efficient, but it works. I'm not sure whether I answered your question. Okay.

Yes, sure. 80% of the IR? Yes. You mean different IRs have differences between them, and there is a common part; so far we provide the common 80%, right? Let me put it this way: we provide about 70% of the GCC IR and make sure GCC goes through first. Then we work on the LLVM part.
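The with/without-transformation comparison described in the Q&A is differential testing, and a minimal harness for it looks like the sketch below. Everything here is illustrative: in the real workflow the two functions would be the same program compiled with and without the plugin's transformation, not two Python functions.

```python
import random

def reference(xs):
    """Stand-in for the untransformed program: sum of squares."""
    return sum(x * x for x in xs)

def transformed(xs):
    """Stand-in for the transformed program; it must preserve the
    reference semantics (here, an equivalent rewrite of the loop)."""
    total = 0
    for x in xs:
        total += x * x
    return total

def differential_test(runs=100, seed=0):
    """Run both versions on random inputs and report the first input
    where they diverge, or None if none is found."""
    rng = random.Random(seed)
    for _ in range(runs):
        xs = [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
        if reference(xs) != transformed(xs):
            return xs   # a failing input to debug against
    return None

print(differential_test())  # None: no divergence found
```

As the speaker notes, this catches divergence but does not localize it; pinpointing whether the fault is in the plugin logic or the IR translation needs the debugging support planned as the next step.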
On the LLVM side, the community is also working on MLIR-based translation, and we will try to provide the capability so that those inconsistencies can be handled inside our framework itself; then you don't need to go back to the LLVM IR, and can just use the similar capabilities provided by MLIR. So we do it not in parallel but step by step. If you are interested, maybe we can talk offline and exchange contact information. Okay. If there are no more questions, that's all. Thank you for your attendance. Have a good day.