Hi everyone, this is Xuxia from China Mobile. In this part, I will introduce the lessons learned from the cloud-native design of a network function.

This is the agenda of our topic. First, we introduce the design strategy of cloud-native network functions. Second, we take the user plane function (UPF) as an example and describe its microservice design. Third, we separate the data and the business logic to make the UPF a stateless network function. Then a telco load balancer is explored to support the UPF. As the network function is divided into multiple microservices, we set up a dedicated telco monitoring system for the UPF. Lastly, as XGVela provides a configuration management as a service (CMaaS) system, we integrate CMaaS into the UPF to test its feasibility.

So, what kind of network function can be called cloud-native by design? From the top of the picture, we should break the monolithic architecture of the network function into microservices in order to make it more flexible. Then we refine common capabilities required by upper-layer services into PaaS functions on the platform layer. For the infrastructure layer, we use containers to deploy network functions. This is our goal for telco cloud-native network functions.

This picture describes our microservice design of the UPF. We separate the monolithic UPF into four parts. The green one on the left side is UPF-OM, which is responsible for operation and management. The orange modules are UPF-PFCP and UPF-Explorer. UPF-PFCP communicates with the SMF, which represents the N4 interface. The UPF-Explorer module communicates with the mobile edge computing platform and is responsible for communication with edge UPFs and the open capability interface. The blue box indicates UPF-U, which is responsible for data plane communication and represents the N3 interface.

We separate the UPF according to the following principles. First is the 3GPP specification: we cannot change the functionality and protocols of the UPF. Second, we should obey the multiple-plane isolation principle. The microservice design should also follow the single responsibility and interface independence principles. Most cloud-native applications are stateless, so we try to separate the data and the business logic. Moreover, we focus on UPF flexibility and keep the PDU sessions in a database in order to achieve elastic scaling.

This picture indicates how we deploy the microservice UPF. The network function is deployed as four clusters. It is difficult to make all the modules stateless. As the N4 server cluster is used to connect to the SMF, it stores the connection with the SMF and the PDU sessions. The OM cluster also stores application management data. We only make UPF-U a stateless module. The UPF-PFCP cluster stores the PDU sessions in an external database, and UPF-U reads the stored data, so a PDU session can be taken over by the right UPF-U instance. Therefore, UPF-PFCP, UPF-Explorer, and UPF-OM are stateful applications, and these microservices are deployed as StatefulSets to achieve high availability. UPF-U is a stateless application. It is deployed in Deployment mode, and high availability is achieved through multiple replicas.

This picture indicates the stateless design of UPF-U. The PDU sessions of the UEs are extracted from the control and data planes, and we use an independent Redis state center to store them. As UPF-U is stateless, if one instance fails, we initialize a new one, and it reads the same PDU sessions from the Redis store.
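As a rough illustration of this pattern (not the actual UPF code), the following minimal sketch assumes the redis-py client, a hypothetical "pdu:<session-id>" key schema, and JSON-serializable session state:

```python
import json
import redis

# Sketch only: the host name, key schema, and session fields are illustrative.
r = redis.Redis(host="redis-proxy", port=6379, decode_responses=True)

def store_pdu_session(session_id, session):
    # UPF-PFCP writes the PDU session established over N4 into the external store.
    r.set("pdu:" + session_id, json.dumps(session))

def load_pdu_session(session_id):
    # Any UPF-U replica, including a freshly started one after a failure,
    # can rebuild the forwarding state for this session from the shared store.
    raw = r.get("pdu:" + session_id)
    return json.loads(raw) if raw else None

# Example: a session written by UPF-PFCP is later picked up by a new UPF-U instance.
store_pdu_session("seid-1001", {"ue_ip": "10.0.0.15", "teid": 4660, "qfi": 9})
print(load_pdu_session("seid-1001"))
```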
The advantage of a stateless application is flexibility. The stateless UPF-U can be flexibly scaled out, and PDU sessions can be scheduled to different instances. As the stateful business data still needs high availability, we designed a high-performance Redis cluster.

This picture describes how we deploy the Redis cluster. It is a multi-master, multi-slave cluster with a peer-to-peer architecture: there is no central node, so there is no single point of failure. The green dotted line represents the gossip protocol, which is used to exchange the messages that maintain the cluster topology and the cluster metadata. All data is read or written through the proxy. When UPF-PFCP needs to write data into the Redis cluster, it first communicates with the Redis proxy and writes the key-value data into one master-slave pair. If UPF-U wants to read data, it communicates with the proxy, which randomly sends the query to one of the master nodes. If that master node stores the key, it returns the value. If not, the node returns a MOVED message that carries the target node information, and the proxy resends the request to the right target. The multi-master, multi-slave design improves read/write efficiency and reliability.

This picture indicates how we ensure high availability and recover when a master node fails. In our implementation, we use a one-to-one backup strategy: a master node has exactly one slave node, and the slave node copies all the data from the master node.

In our implementation, we use manual scaling to add a new master-slave pair to the existing cluster. A Redis cluster has only 16,384 slots, numbered from 0 to 16,383. When the cluster is working normally, each master node is responsible for a portion of the slots. Each key is mapped to a slot, and the master responsible for that slot provides service for the key. If we want to scale the cluster, we add a new master-slave pair. At this time, no slots have been allocated to the new pair, so the cluster starts to move some existing slots to the new master node, and some existing key-value pairs also migrate to it. In the end, the cluster has four master-slave pairs. This rebalances data among the master nodes, and the slot migration is transparent to the application.
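To make the slot mechanism concrete, here is a small sketch of Redis Cluster key hashing (HASH_SLOT = CRC16(key) mod 16384). The slot-to-master assignment below is only an example layout, not our actual deployment:

```python
def crc16_xmodem(data: bytes) -> int:
    # CRC16/XMODEM, the checksum used by Redis Cluster key hashing.
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def hash_slot(key: str) -> int:
    return crc16_xmodem(key.encode()) % 16384

# Three masters, each owning a contiguous slot range (illustrative layout).
owners = {(0, 5460): "master-1", (5461, 10922): "master-2", (10923, 16383): "master-3"}

def owner_of(key: str) -> str:
    slot = hash_slot(key)
    return next(name for (lo, hi), name in owners.items() if lo <= slot <= hi)

print(hash_slot("pdu:seid-1001"), owner_of("pdu:seid-1001"))

# Scaling out means a fourth master takes over part of each range; the keys in
# the moved slots migrate with them, which is why rebalancing is transparent
# to the application.
```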
In order to meet the requirements of flexibility and reliability, the UPF also needs to balance and distribute workloads across multiple processing units, just like web applications. Among open source and commercial software there are already successful load balancers, such as Envoy and Nginx. However, they only support layer-3 IP, layer-4 TCP/UDP, and layer-7 HTTP load balancing; they cannot support UPF load balancing in the telco field. We need to develop a telco load balancer that fits the UPF, supporting the Packet Forwarding Control Protocol (PFCP) that runs between the UPF and the SMF, and the GPRS Tunneling Protocol for the user plane (GTP-U) that runs between the UPF and the radio access network. In existing open source Kubernetes, the load balancer is provided by cloud providers, and the load balancing services of existing public cloud vendors likewise only provide layer-3, layer-4, and layer-7 HTTP load balancing.

The UPF is a high-data-throughput application. Compared with web applications, it has higher requirements for data forwarding performance. The UPF also requires multiple network planes: the pod IP can only be used for the signaling plane, which has low performance requirements, while the data plane should use physical NIC virtualization technology to accelerate data plane processing.

We developed an enhanced load balancer as a Kubernetes service, which is called ELB. ELB is an implementation of the Kubernetes load balancer for the telco cloud. It provides telecommunication protocol identification and high packet throughput, supports telco-specific protocol types such as HTTP on the control plane and PFCP and GTP-U on the user plane, and can identify UE and UE session information and distribute and bind traffic based on that UE information.

For the ELB cluster, we set up an ELB controller, which is deployed on the Kubernetes master node, and several ELB worker nodes. The ELB controller monitors creation, modification, and deletion operations, generates forwarding rules, and transfers the rules to the related ELB worker nodes. The ELB worker nodes act as the data plane gateway for the north-south traffic of the Kubernetes cluster and forward control and data plane traffic to UPF-PFCP and UPF-U instances according to a specific load balancing algorithm. The control plane is mainly composed of the ELB controller, which dynamically generates and configures load balancing distribution rules according to the creation of UPF microservices in Kubernetes. The ELB controller sends the load balancing distribution rules to the ELB workers, and the ELB workers forward UE context and other data messages to a certain service instance of the UPF, according to the weight requirements and the distribution rules.
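As a simplified illustration of this UE-aware, weighted distribution (the real ELB parses PFCP and GTP-U headers in the data path, which is not shown here), a worker could bind each UE or session key to one UPF-U instance roughly as follows; the backend list and key format are hypothetical:

```python
import hashlib
import itertools

# Hypothetical UPF-U backends with weights taken from a distribution rule.
backends = [
    {"name": "upf-u-0", "ip": "192.168.10.11", "weight": 2},
    {"name": "upf-u-1", "ip": "192.168.10.12", "weight": 1},
]

# Expand backends by weight so higher-weight instances receive more sessions.
weighted_pool = list(itertools.chain.from_iterable(
    [b] * b["weight"] for b in backends
))

def pick_backend(ue_key: str) -> dict:
    # Bind all traffic of one UE/session (keyed, for example, by TEID or SEID)
    # to the same UPF-U instance, so the per-UE state stays on a single backend.
    digest = int(hashlib.md5(ue_key.encode()).hexdigest(), 16)
    return weighted_pool[digest % len(weighted_pool)]

print(pick_backend("teid-0x1234"))  # the same key always maps to the same instance
```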
Next, we introduce the telco monitoring system. This is the overview. The monitoring platform is based on Prometheus, focusing on the collection, storage, query, display, and post-processing of monitoring indicators. Prometheus collects basic platform indicators, container indicators, and performance indicators of the network function itself. Post-processing includes temporal aggregation of indicators, spatial aggregation, configuration of indicator alarm routing, and root cause analysis.

This is the alert system. It includes performance alerts, log alerts, and network function alerts. For performance alerts, Alertmanager completes the notification for performance data and sends the alert to the corresponding receiving backend. For log alerts, ElastAlert completes the log alert notification and sends the alert to the relevant receiving backend according to the webhook configured by the user. Lastly, for alerts reported by the network function, the network function can directly report alarms, and the alarm data is stored in ES (Elasticsearch). The OMC is responsible for taking the alarm information out of ES and visualizing it.

This is the log system. We use Filebeat to collect the logs reported by the containers and the whole system. Filebeat is a lightweight log collector that monitors and collects log files directly. The log information collected by Filebeat is sent to Kafka. Kafka plays a role in peak shaving and valley filling, effectively reducing the impact of a large volume of transmissions on the system. Logstash is responsible for transmitting data from the input end of the pipeline to the output and provides powerful filters to meet various scenarios. Elasticsearch is an analysis engine designed for horizontal scaling, high reliability, and convenient management. Kibana can present the data in the form of graphs and has an extensible user interface.

Lastly, we use the open source code of XGVela in the UPF configuration process and integrate configuration management as a service (CMaaS) with the UPF. This is the use case. The network administrator configures the IP addresses of the N3, N4, and N6 interfaces of the UPF through the NETCONF or CLI interface. The CIM container is deployed in the same pod as the UPF microservice container, following the sidecar model. The configuration management service (ConfD or Netopeer2) updates the configuration information to etcd. The CIM watches for new UPF configuration data in etcd, calls the RESTful API of UPF-OM, and issues the configuration command. UPF-OM returns the result of the configuration command to the CIM, and the CIM informs CMaaS through Kafka that the configuration operation was successful.

That's all. Thanks for your listening.
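The configuration flow above could be sketched roughly as follows, assuming the python-etcd3, requests, and kafka-python libraries; the etcd key prefix, UPF-OM endpoint, and Kafka topic are hypothetical names for illustration, not the XGVela CIM implementation:

```python
import json
import etcd3                      # etcd v3 client
import requests                   # to call the UPF-OM RESTful API
from kafka import KafkaProducer   # to notify CMaaS over Kafka

# Hypothetical names used only for illustration.
ETCD_PREFIX = "/upf/config/"
UPF_OM_URL = "http://upf-om:8080/api/v1/config"
RESULT_TOPIC = "cmaas-config-result"

etcd = etcd3.client(host="etcd", port=2379)
producer = KafkaProducer(bootstrap_servers="kafka:9092",
                         value_serializer=lambda v: json.dumps(v).encode())

# Watch for UPF configuration written by ConfD/Netopeer2 into etcd.
events_iterator, cancel = etcd.watch_prefix(ETCD_PREFIX)
for event in events_iterator:
    if not event.value:           # skip delete events in this simple sketch
        continue
    config = json.loads(event.value.decode())
    # Push the configuration (e.g. N3/N4/N6 interface IPs) to UPF-OM.
    resp = requests.post(UPF_OM_URL, json=config, timeout=5)
    # Report the result of the configuration operation back over Kafka.
    producer.send(RESULT_TOPIC, {"key": event.key.decode(),
                                 "status": resp.status_code})
```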