Hello everybody. Thank you very much for joining my presentation today. My name is Bing Huo, and I work at a semiconductor memory company in Munich. The topic of today's presentation is the challenges of deploying eBPF as a tracing tool on embedded systems, and an alternative solution for embedded systems: libtracefs and libtraceevent.

I have two goals for today's talk. The first is to share what we learned and the challenges and problems we faced while using eBPF as a tracing tool on embedded systems. The second is to contribute to the current discussion in the Linux community, where eBPF portability has been an issue under improvement, for example around libtracefs and libtraceevent.

This is the agenda of today's talk. First, a brief introduction to the Linux tracing systems. Second, eBPF and the challenges of deploying eBPF as a tracing tool on embedded systems. Third, libtracefs and libtraceevent; this is the main topic of today's presentation. I will give you a very simple example based on libtracefs and libtraceevent that shows how to get trace data from kernel space. Then there is a comparison between trace tools built on these two different tracing infrastructures, eBPF-based versus libtracefs/libtraceevent-based. Last is the conclusion of my presentation.

This is the Linux tracing system. We have a lot of trace tools in user space for Linux and the Linux kernel, and sometimes it is very easy to get confused. We also want to know how they all fit together. To better understand that, we split the whole Linux tracing system into two spaces: at the top is user space, and at the bottom is kernel space. In kernel space there are two layers: the first is the data extractors, and the second, at the bottom, is the data sources. A data source is a place in the kernel where trace data is produced.
If you look at the data sources, we have a bunch of them. A data extractor is a method, or a way, to get the trace data out of the data sources and return it to user space; we also have a bunch of data extractors. Then there are the trace front-end tools in user space, that is, the trace applications or programs, which are responsible for configuring and attaching to the enabled tracepoints and probes and organizing the trace data. Some of the trace front-end tools can preprocess the trace data before storing it to local disk or elsewhere, or display the trace data on the console in a nice way.

Looking at the data sources, kprobes, uprobes, and static tracepoints are used by most of the trace front-end tools. Among the data extractors, eBPF has more data sources than the others: eBPF can get data from kprobes, uprobes, and static tracepoints, and it can also get data from USDT probes and perf events. That is why we say eBPF is much more flexible: it gives you more options for getting data out of kernel space.

So, what are BPF and eBPF? eBPF is derived from BPF. BPF is short for Berkeley Packet Filter, originally used by the network stack for packet filtering. eBPF was then redesigned with more options and data structures, which took it beyond network packet filtering; it has now become a Linux tracing infrastructure in the kernel. That is eBPF: extended BPF. With eBPF we can pass user-space parameters to the kernel, and we can retrieve trace data from kernel space. In kernel space, eBPF uses a virtual machine as its execution engine. With this engine we can execute bytecode in a safe manner and get trace data efficiently.

If we look at the right side, that is an eBPF-based trace data path graph. In user space we have the eBPF program, or BCC, or bpftrace.
You use Clang or LLVM to compile your eBPF program into eBPF bytecode. The eBPF bytecode is then passed to kernel space via the bpf() system call. In kernel space, the eBPF verifier performs safety and bug verification. Once verification passes, there is another compiler, or translator, in kernel space: the JIT, the just-in-time compiler, which translates the eBPF bytecode into the machine instruction set of the specific architecture. After that, whenever your tracepoint or probe is hit during execution, the trace data is filled into eBPF maps. In user space you get a notification, and then you can use the bpf() map-lookup system call to read the data from the eBPF maps.

One point should be highlighted here: with eBPF, we can specify exactly what kind of data to get from kernel space, and we can organize our own trace data format or data structure. Because we can let the eBPF program filter trace data inside kernel space, eBPF lets us get trace data in a lightweight way. That is a very general introduction to eBPF.

Now let's look at how to enable eBPF in kernel space. First, of course, there is CONFIG_BPF, then CONFIG_BPF_SYSCALL, because we want to use the bpf() system call to set up tracepoints and get trace data from kernel space. Third is the JIT, the in-kernel compiler. We also prefer to use static tracepoints, so we should enable CONFIG_BPF_EVENTS; to enable that option, other options need to be enabled as well, for example kprobes and uprobes, and you also need to enable ftrace and perf events. Then, depending on your kernel version, you enable CONFIG_BPF_JIT (or CONFIG_BPF_JIT_ALWAYS_ON). Finally there is BTF, the BPF Type Format. This is actually a very useful feature for eBPF, because it helps us eliminate dependencies at runtime. But the problem is that this option has only been available since kernel 5.2.
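Collecting the options above, a kernel configuration fragment for eBPF tracing might look like the following. This is a sketch; exact option names and availability vary by kernel version and architecture:

```
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
CONFIG_BPF_JIT=y
CONFIG_BPF_EVENTS=y
CONFIG_KPROBES=y
CONFIG_KPROBE_EVENTS=y
CONFIG_UPROBES=y
CONFIG_UPROBE_EVENTS=y
CONFIG_FTRACE=y
CONFIG_PERF_EVENTS=y
# BTF, only available since kernel 5.2:
CONFIG_DEBUG_INFO_BTF=y
```

Every one of these options costs kernel image size, which is exactly the trade-off discussed next.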
Also, if you enable this option, it adds about 1.5 megabytes to your kernel image size. I think this is not a big problem for data center, network, or cloud customers. However, embedded customers care about fast boot, and on the CPU side the resources are limited; a 1.5 megabyte increase in image size is a challenge for embedded systems. Also, if we want to get trace data efficiently, we want to use the latest eBPF data structure in the kernel, the ring buffer map, but this data structure has only been available since kernel 5.8.

If you look at the kernel versions actually in use on embedded systems, some customers use the latest kernel, but some customers still use 2.6 or 4.x kernel versions. And considering kernel image size, even the embedded customers using the latest kernel version will not enable all the kernel configuration options required by eBPF. This is a challenge for deploying eBPF on embedded systems.

Now let's look at the dependencies for compiling and deploying eBPF-based tools on embedded systems. There are already several eBPF-based trace tools we can use directly, for example bpftrace and the BCC-based tools; you can find them on the BCC GitHub. You have to use LLVM and Clang to compile the BPF program into bytecode. Also, for cross-compiling, especially for ARM-based systems where we want to cross-compile: if you look at the bpftrace build files, they are very complicated, with a lot of dependencies you need to install on your build server or your target platform. BCC uses Python, and it also needs the LLVM runtime. I think it is not feasible for an embedded system to use BCC, because installing LLVM on a target embedded system is very difficult.

Then we have ply. This is a very good example for embedded systems; it is a good way to begin using eBPF-based trace tools on an embedded system, because it depends only on libc.
The problem is that ply requires enabling more kernel options; you should also use a recent kernel version and enable the BPF options, which will increase your kernel image size. Another problem we found with ply is that when we did benchmark testing under high workload, there was event loss; it only prints how many events were lost per CPU. I think we should look into that later. Also, BTF and CO-RE are very good options and features for eBPF, since they eliminate dependencies at runtime, but they require customers to use a recent kernel version and enable more options. As I mentioned, for embedded systems, even customers using the latest kernel version will not enable all the kernel configuration options required by eBPF.

So the challenge for eBPF on embedded systems is clear: we cannot guarantee that our different customers are all using a similar, or the latest, kernel version. Even when they use the latest kernel version, considering fast boot and kernel image size, they will not enable all the kernel options required by eBPF. Also, embedded systems mostly use ARM-based platforms. To cross-compile, you need to set up a lot of libraries on your build server for your target platform, and even then, cross-compiling your eBPF-based tool on the server side cannot guarantee, 100% guarantee, that the binary executable will work on your target customer platform, because some library may still be missing on the target. Of course, the clean solution is to compile your eBPF-based trace tool on the target platform itself, but that is not feasible for embedded systems.

So we started thinking about an alternative solution that allows us to develop trace tools for embedded customers. That is how we came to libtracefs and libtraceevent, two user-space library APIs for the long-established ftrace infrastructure.
libtracefs is extracted from trace-cmd, the user-space front-end tool for ftrace. With this library's API we can easily configure tracefs. The libtraceevent library provides a very rich API to access kernel tracepoint events; we can also use this library to parse and analyze the raw trace data from trace events. The only requirement for these two libraries is that CONFIG_FTRACE is enabled. This is very good news for developing trace tools based on these two libraries, because on most embedded customer systems ftrace is enabled by default; even on some products already in the field, ftrace is enabled. So we can easily deploy trace tools based on these two libraries on our embedded customer platforms.

Let's look at the libtracefs/libtraceevent-based data path. A trace tool based on these libraries uses the libtracefs API to enable tracepoints through the tracefs file system. Then we can use the APIs provided by libtraceevent to filter, analyze, and parse the trace data. Of course, before doing that, you need to read the raw trace data from the ring buffer. If you look at the left side, the tracefs-based trace tool's data path is very simple, but the cost is that you have a very heavy preprocessing program in user space, which costs CPU cycles and takes some time.

So, how do we use libtracefs and libtraceevent to get trace data? I have summarized four major steps. The first is to configure tracefs with the libtracefs API; there are well over a hundred APIs to help you configure tracefs in a very convenient way. The second is preparation: you allocate and initialize the trace event parser (tep) handle, then allocate a kbuffer, because we will read the raw trace data from the ring buffer, and then load a page of trace data into the kbuffer. You also need to load the trace event format, because it is very important:
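Since the whole approach rests on ftrace being available, a quick check on the target is worthwhile before deploying such a tool. A minimal sketch, assuming only the two common tracefs mount points:

```shell
#!/bin/sh
# tracefs is usually mounted at /sys/kernel/tracing (newer kernels)
# or /sys/kernel/debug/tracing (older kernels, via debugfs).
if [ -d /sys/kernel/tracing ] || [ -d /sys/kernel/debug/tracing ]; then
    echo "tracefs available"
else
    echo "tracefs not found; try: mount -t tracefs nodev /sys/kernel/tracing"
fi
```

Either way the script exits successfully; it only reports which situation it found.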
the libtraceevent library uses this format to parse the trace event data. The third step is to enable the trace event and register an event handler. The last step is a while loop that reads the trace_pipe_raw file, that is, the per-CPU ring buffer, from kernel space through tracefs.

Here is a very simple example showing how to use the libtracefs API to configure tracefs. First, of course, you need to find out where your tracefs directory is, because different Linux distributions may mount tracefs in different places. With the tracefs_tracing_dir() API you can easily get the path of your tracefs directory. Second, you enable a specific static tracepoint; of course you can also enable a kprobe with the libtracefs API. Here, as an example, I enable the block_rq_issue static tracepoint. After that, you just turn tracing on. While tracing is on, you read the trace_pipe_raw data, the binary raw data from the ring buffer, in the while loop. When you finish tracing, you use the libtracefs API to clean up your tracing setup.

This is a trace event format file. This format actually describes the layout of the binary raw data in the kernel ring buffer; it explains how to read each trace record from the kernel ring buffer. libtraceevent mainly uses this format to parse, filter, and retrieve fields from the trace data. If we look at this format, each static trace event has these kinds of fields. The first is the unique event ID; each static tracepoint has its own event ID. Then come the common fields; all static tracepoints have similar common fields. Third are the fields specific to this particular trace event. Last is a C-style print statement used by the kernel to print the trace data as a string to the ftrace trace file.
Here is an example showing how to use libtraceevent to parse and filter your trace data. The first step, of course, is to allocate a tep handle, the trace event parser. Then you load the common trace format, because, as I mentioned, each trace event has these common fields; you load them into the parser. Third, you load the format of the specific trace event into the parser. Here, as an example, I load the block_rq_issue trace format into the parser. Then you can register an event handler and let this handler preprocess your trace data.

Because each CPU has its own ring buffer, you need to find out how many CPUs your system has and keep reading the data from each per-CPU ring buffer. Reading the raw buffer is actually a file-based operation. The first step is to open the trace_pipe_raw file with the O_RDONLY and O_NONBLOCK options. After that, you read a page into a buffer, and then you can use the kbuffer_read_event() API to get each event record from that page. After that, you can parse it, or you can print it to a trace sequence; at that point the registered event handler is triggered, and the event handler does the trace data filtering. After finishing with one event, you move on to the next.

Here is the example handler for block_rq_issue. We want to find out what the command name is, what the device ID is, and what the logical block address is; most importantly, what the data length is. We also want to understand what the operation of this request is, and we can get the timestamp. For example, if we want to get the PID, we can use "common_pid" as the key. These strings come from the common format fields shown here; we use the string as the key in this API to get its value. After you get the PID, you can get the command, the command name.
That is a string, so you can use the tep_get_field_raw() API to get the string-based value. Here I just print everything, because once you have all this raw data from your trace event, this is only a very simple demo. Currently the libtraceevent library does not provide a hash table to let you organize your trace event data the way BCC or bpftrace do; you need to build that yourself.

One point to highlight is that the event handler is not mandatory. That means you do not have to use the tep_print_event() API to print your event trace data to a trace sequence: after the kbuffer_read_event() API you have already got the event trace data, so you can parse and preprocess it directly there. You do not need to print and thereby trigger the event handler; that saves some CPU cycles, and the overhead is relatively lower compared to triggering the event handler. This is my example; you can get the source code from the link. I have two examples: one uses an event handler, the other does not. You can have a look at them.

This is the comparison. For libtracefs/libtraceevent-based tools, portability is higher, because they only need ftrace enabled in the kernel configuration. But the cost is that the overhead is higher, because we must read the raw data from the ring buffer for each CPU. For eBPF-based trace tools, BCC, ply, and bpftrace, the dependencies are heavier and portability is lower: you need a recent kernel with the right configuration on your target platform. But flexibility is higher: with eBPF you can get lightweight trace data from kernel space, and you can specify exactly what kind of data you want from kernel space.

This is the conclusion of my presentation. It is very clear that eBPF is very powerful; it is mainstream and a hot topic in the community and at a lot of conferences. But for embedded systems, portability is key.
So, how can we make an eBPF-based tool work successfully on many different customer platforms with different kernel versions and different kernel configurations? This is a big challenge for embedded systems. BTF and CO-RE are very important features for improving the portability of eBPF-based trace tools, but there is still a gap: sometimes you enable them in your target kernel and still cannot guarantee that no library is missing on your target platform. There is hope that our customers will adopt the latest kernel version in their future designs, enable the eBPF-required options in the kernel configuration, and at least try out eBPF-based tools for embedded systems on their target platforms.

libtracefs and libtraceevent provide a very rich API that allows us to get trace data from the kernel in a convenient, confident way. But the cost is probably higher: we have to read the ring buffer in a while loop, which consumes CPU cycles, so the system overhead is higher. Here is a suggestion; I do not know whether it is possible or not: allow a handler to be attached to the tracepoint itself, which would mean we could stop some of a tracepoint's data from being written to the ring buffer at all. If you look at what a static tracepoint prints, there is a lot of data in the ring buffer that we do not need; we just want to filter for the specific data we expect.

That is all for my presentation. If you have any questions, we can discuss them in the chat. Thank you all very much for your time. Thank you.