Hello, everyone. Welcome to this session, eBPF-Powered Observability for Telecom CNFs. My name is Junichi Kawasaki. I'm a network engineer at KDDI, a telecom operator in Japan. Currently, I'm working on monitoring and orchestration for autonomous networks.
Let me start my talk with some background. We telecom operators are responsible for providing reliable networks to customers. In the 5G and beyond-5G era, networks will serve a wider variety of use cases, including mission-critical ones. So we have to maintain carrier-grade quality: five-nines availability, or even more in the future. Of course, we have to keep this carrier grade not only in PNFs and VNFs but also in CNFs, cloud-native network functions. For this purpose, we need to transform our operation style into proactive operation. Proactive operation here means acting before customers are impacted. One of the key enablers for this is predictive failure detection using AI/ML. However, ML technology alone is not enough. As you know, machine learning models produce outputs based on their training datasets. In other words, future failure prediction is impossible without good information, that is, observability. So eBPF is expected to play another key role in proactive operation, because it can provide fine-grained information without modifying CNF code.
So this is our proposal. It performs predictive failure detection by combining machine learning and eBPF. It consists of a data collection part and an analysis part. Our target environment is 5G core CNFs deployed on container infrastructure. In data collection, there are two ways of collecting. One is eBPF-based, collecting fine-grained information from the infrastructure. The other is exporter-based, collecting basic metrics with cAdvisor as well as 5G-specific data from the container layer. In the analysis step, ML models are developed with the stored datasets, and they perform time-series analysis for failure prediction.
Next, I will explain how to empower 5G CNF observability with eBPF. First, let me share which eBPF tools we implemented. We take these two failure scenarios on our experimental 5G core testbed. Number one is an increasing packet loss rate in the C-plane. Number two is increasing computing resource usage on one of the UPFs. With these scenarios in mind, we selected tools from the BCC and bpftrace libraries to derive useful insights. From over 100 tools, we chose the ones listed in the table on the right. It includes many TCP-related tools as well as some processing-related ones. We selected the promising tools.
But the raw data derived from eBPF is not provided per CNF container. Each tool outputs all the data traced on the server without considering which data relates to AMF or SMF. From the AI point of view, the data should be differentiated per CNF type for analysis. So we coordinate the data by key information. We use two types of mapping, PID-based and IP-based.
In PID-based mapping, the PID (process ID) or the PID namespace is used as a key. After creating a CNF container, we execute the docker ps command like this to get the PID, and then type the ls command with an argument including that PID to get the corresponding PID namespace. As shown here, by referring to the PID namespace, the collector is able to know which CNF the data is coming from.
The other type is IP-based mapping, especially for TCP-related tools. They output traces with a source and destination IP. For example, this .14.45 is one of our AMF IPs. So the collector is able to take the data whose source IP is .45 as data related to AMF. As additional information in this mapping, it also calculates the maximum, minimum, and average values from this histogram and adds them as supporting information.
Here is a dataset obtained after mapping. Our system collects this JSON file every 10 seconds.
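The two mappings and the histogram summary described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the collector used in the talk: the lookup tables, the addresses (documentation-range placeholders, since the talk only shows partial IPs), the namespace inode, the event field names `pidns` and `saddr`, and the choice of bucket midpoints for the average are all hypothetical.

```python
import os
import re

# Hypothetical lookup tables; addresses, inode number and CNF names are placeholders.
IP_TO_CNF = {"192.0.2.45": "amf", "192.0.2.46": "smf"}
PIDNS_TO_CNF = {4026532281: "amf"}


def read_pidns_link(pid):
    """Return the PID-namespace symlink target of a process (Linux only)."""
    return os.readlink(f"/proc/{pid}/ns/pid")


def parse_pidns(link):
    """Extract the namespace inode from a target such as 'pid:[4026532281]'."""
    match = re.fullmatch(r"pid:\[(\d+)\]", link)
    if match is None:
        raise ValueError(f"unexpected namespace link: {link!r}")
    return int(match.group(1))


def cnf_for_event(event):
    """Map one traced event to a CNF name, or None if it matches no CNF."""
    if "pidns" in event:
        # PID-based mapping: the PID namespace is the key.
        return PIDNS_TO_CNF.get(event["pidns"])
    # IP-based mapping: the source address is the key.
    return IP_TO_CNF.get(event.get("saddr"))


def summarize_histogram(buckets):
    """Summarize (low, high, count) buckets as min, max and average.

    Assumption: the average weights each bucket midpoint by its count.
    """
    filled = [(low, high, count) for low, high, count in buckets if count > 0]
    if not filled:
        return None
    total = sum(count for _, _, count in filled)
    weighted = sum((low + high) / 2 * count for low, high, count in filled)
    return {
        "min": min(low for low, _, _ in filled),
        "max": max(high for _, high, _ in filled),
        "avg": weighted / total,
    }
```

On a Linux host, `parse_pidns(read_pidns_link(pid))` would give the key for the PID-based table, while TCP-oriented traces would go through the `saddr` lookup instead.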
As you can see, the dataset includes the collection time, the name of the CNF, 5G-related information, and cAdvisor information, including basic CPU and memory metrics. The eBPF data related to AMF is outlined in orange, presenting runqlat, tcpdrop, and TCPV metrics; only partial information is shown here due to space limitations. This JSON file includes all data related to the 5G core per CNF type.
Regarding the analysis part, we use an LSTM model, which is one of the major deep learning models for time-series analysis.
Finally, I will show you the experimental results to share how much eBPF can contribute to LSTM-based failure prediction. We used this 5G core CNF network, which was created on a Linux server with Kubernetes. Two types of data, eBPF and cAdvisor, are collected from the network. For comparison, we developed two LSTM models. One is trained with all data, including eBPF. The other is trained without eBPF. In the evaluation, the trained model takes multivariate test data as input and outputs the predicted number of registration failures. We evaluated the prediction performance by determining, with a threshold, whether the output prediction can detect the problem or not.
Here is the prediction performance at 150 seconds after the failure event starts. This is the result for case number one. The first model, using eBPF, shows high performance. You can also see the prediction accuracy in the graph above. The solid line is the true values, and the dashed line is the predicted values. At 150 seconds, this model predicted the future increase in failed registrations. On the other hand, the model without eBPF did not get a good score, and there is a big difference between the predicted and true values. This result shows that eBPF contributed greatly to developing the LSTM prediction models.
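The threshold-based evaluation mentioned above could look something like the sketch below. The talk does not name the score it reports, so the use of precision, recall, and F1 here is an assumption, as are the function names and the rule that a series counts as a detection when any value reaches the threshold.

```python
def detects_failure(values, threshold):
    """Return True if any value in the series reaches the threshold."""
    return any(v >= threshold for v in values)


def score_detection(cases, threshold):
    """Score threshold alarms raised on predictions against the true series.

    cases: list of (true_series, predicted_series) pairs, each a list of
    registration-failure counts over the prediction horizon.
    """
    tp = fp = fn = 0
    for true_series, predicted_series in cases:
        actual = detects_failure(true_series, threshold)
        alarm = detects_failure(predicted_series, threshold)
        if alarm and actual:
            tp += 1  # problem predicted and it really happened
        elif alarm:
            fp += 1  # problem predicted but none happened
        elif actual:
            fn += 1  # problem happened but was not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

Running the same scoring on the model trained with eBPF data and on the one trained without it would give a like-for-like comparison of the two.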
In conclusion, in our practical experiment, eBPF tools with several mapping processes enabled 5G CNF observability, and LSTM models trained with the dataset obtained through this observability achieved high-performance failure prediction, which will hopefully lead to proactive operation in future networks. Thank you for watching. I hope you have enjoyed it and learned something useful for your activities.