Today my colleague Wijo and I will present this session together. Let me take a moment to introduce ourselves. Wijo is a senior architecture leader at DaoCloud, responsible for all the network projects there, and I am a product manager at DaoCloud, responsible for the container platform.

Okay, let's start. First, let's take a look at the CNIs currently available in Kubernetes. We can categorize these CNIs into two groups: underlay CNIs and overlay CNIs. Obviously, overlay CNIs are more popular than underlay CNIs, for example Calico, and Cilium with its eBPF acceleration capabilities. Compared to underlay CNIs, overlay CNIs have a lower dependency on the underlying physical network and are easier to use. However, in some scenarios underlay CNIs cannot be replaced, so next let's go through those scenarios.

The first scenario is about traditional applications migrated from hosts to Kubernetes. The networks of traditional applications have some typical characteristics. Let's look at the red side: multicast and broadcast are required, and ARP is necessary. Second, some of these applications are exposed by the IP address of a VM or physical server, and they are isolated by firewall policies, with no NAT through the physical network. And last, the network of these applications is isolated by VLAN subnets for business traffic acceleration. For example, an application may have two interfaces, one for TCP traffic and another for UDP traffic, and sometimes the log traffic is very heavy, so it should be isolated to avoid impacting the business traffic.

When these applications are migrated to Kubernetes, some companies may want to save cost, so they will keep the original network pattern. There are two ways to migrate. The first way is as follows.
The application is moved to Kubernetes as-is, without any architecture transformation. The second way applies to applications deployed on VMs: when they are moved to Kubernetes, we can use technology such as KubeVirt, which means that after the migration the application still runs on a VM, but the VM is managed by Kubernetes. In both cases, the network pattern stays the same as the original. So here an underlay CNI is more suitable than an overlay one, because the application needs to communicate with the physical network, it needs a fixed IP for the pod, and we need subnet isolation.

Okay, let's move to the second scenario. It's about communication outside the cluster. Sometimes the service registry center is deployed outside the cluster, maybe in another cluster, and in this case the pod should be accessible from outside. Additionally, some middleware or databases will be deployed across clusters for availability, so data synchronization across clusters becomes crucial, and network connectivity and performance are very important. Of course, other CNIs have solutions for this scenario: for example, Calico has a BGP mode and Cilium has a tunnel mode. But the underlay solution is more lightweight and easier to use in this scenario.

Okay, the last scenario is about AI large models. Everybody knows that in 2023 AI large models are very popular. Let's look at ChatGPT 3.5 from OpenAI: it has about 175 billion parameters, and it utilized 10,000 GPUs on 2,000 nodes. Due to distributed computing, it needs a lot of computing power. We can see in the middle of the slide that the communication between computing nodes has reached hundreds of gigabits per second, and the bandwidth in the data center is over 800 gigabits per second. As we know, most AI large models are deployed on Kubernetes, so the network in Kubernetes could become the bottleneck.
Some cloud providers adopted RDMA technology in order to reduce training time and improve GPU utilization, and of course offload the CPU, so they can save a lot of cost. Looking back at the underlay CNI: it can work with RDMA, and the performance is great.

After talking about the scenarios, we can look at the strengths of the underlay CNI. It has high performance by using RDMA with SR-IOV and VLAN. It can reduce application migration cost by keeping the same traditional network pattern. It can do bandwidth isolation for different business traffic, with strong VLAN network isolation and firewall isolation. And of course it can communicate across clusters.

Okay, then let's look at the solutions in open source. There are a lot of good projects that try to satisfy the requirements in these scenarios, but there are some problems. The first problem is the communication ability of the underlay CNI. For example, the health check for a Macvlan pod IP cannot work if we just use the open source project, and the communication between the node and the pod IP will sometimes fail. The second problem: as we know, Kubernetes clusters have become larger and larger, so when, say, 1,000 or 2,000 IPs are allocated at the same time, the IPAM will face a challenge. We have tested it, and there is no efficient IPAM allocation mechanism. And the last problem is the limitation of communication between multiple CNIs. As we talked about, a traditional application migrated to Kubernetes may have several interfaces, so conflicts in the routing table or misconfiguration may cause connection issues.

Okay, that's why we published the project Spiderpool. Spiderpool can run on bare metal, VMs, and cloud, and it supports the scenarios we talked about, such as AI training and business traffic acceleration.
It also supports log traffic acceleration, and of course it can speed up the performance of Redis and other middleware. In the middle part, we can see that Spiderpool can work with Macvlan, IPvlan, and the SR-IOV CNI, and it has a lot of features: multiple interfaces for pods, an efficient IPAM mechanism for large-scale cases, dual-stack support, and it can work with RDMA and eBPF. We also have a new feature, the egress gateway, which Wijo will talk about next.

Okay, next is about the milestones of Spiderpool. We started with a private project in 2015, our own in-house CNI. Then around 2019 it was upgraded to work with the Calico CNI and the Cilium CNI. In 2020 we published Spiderpool on GitHub and had our first release, and now we have released version 0.8. Okay, those are the milestones of Spiderpool. We have talked about the scenarios and why we built Spiderpool; next, Wijo will show the architecture and the features of Spiderpool.

Okay, thank you, Chupin. In the next part I have a quick demo. Don't worry about it, the demo will keep running well, not like yesterday's demo. Let's get down to business. Spiderpool is easy to set up; we can see there are just several pods in the cluster. We are going to create two SpiderIPPools with IPv4 and IPv6 addresses, then create a Macvlan CNI configuration, and finally create applications to test. We have created an application named server, and we are going to create another application named client to test the pod IP and the cluster IP of the server. We go inside the client pod and visit the cluster IP of the server application. Oh, it works. And we know that Macvlan alone cannot reach the cluster IP, so this is one of the capabilities Spiderpool enhances. Next, we are going to show how to access another pod through an RDMA interface.
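As a rough sketch of the demo objects, the two SpiderIPPools might look like the following. This is an assumption-laden sketch: the API version, pool names, and subnet values are taken from my reading of the Spiderpool docs and may differ across versions.

```yaml
# Hypothetical IPv4 pool for the demo (addresses are made up).
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderIPPool
metadata:
  name: demo-pool-v4
spec:
  subnet: 172.18.0.0/16
  ips:
    - 172.18.40.40-172.18.40.49
  gateway: 172.18.0.1
---
# Hypothetical IPv6 pool for the demo.
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderIPPool
metadata:
  name: demo-pool-v6
spec:
  ipVersion: 6
  subnet: fd00:172:18::/64
  ips:
    - fd00:172:18::40-fd00:172:18::49
```

The pools would then be referenced from the Macvlan configuration or from pod annotations when the server and client applications are created.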
We run a latency server in one pod and access it from another pod. Oh, it works. Okay, that's the quick demo. So let's move on to look at some details.

The figure shows the architecture of Spiderpool. It integrates Multus, as well as a network operator and an RDMA device plugin. It also has components such as the Spiderpool agent, the Spiderpool controller, and a CNI chaining plugin that communicates with the local agent to enhance the capabilities of Macvlan, IPvlan, and SR-IOV.

First, the Multus enhancement. When writing a NetworkAttachmentDefinition for Multus, you need to write a JSON-format chain, and if there is, for example, a comma mistake in the chain, it leads to pod start-up failures. So Spiderpool utilizes the CRD SpiderMultusConfig to automatically generate the Multus NetworkAttachmentDefinition. There are some advantages. First, the CR configuration in YAML format is less error-prone. Second, the generated NetworkAttachmentDefinition includes the chaining plugin of Spiderpool, which helps reduce usage complexity. Furthermore, default values are set with best practices, which helps reduce the configuration workload.

IP addresses are managed by the CRD SpiderIPPool, which implements strong verification to avoid IP overlap between pools when updating them. For underlay networks, an IP address needs to be assignable to a pod on any node, which is different from overlay IPAM based on pre-allocated IP blocks. The right figure shows an example of a SpiderIPPool: it includes IP addresses and optional affinity settings. The affinity settings determine whether a pod can successfully allocate an IP address from a specific SpiderIPPool, and there are various ways to specify a SpiderIPPool for a pod, as listed in the slide.

This is the node affinity use case of SpiderIPPool. When nodes in a cluster are deployed across network regions, west and east, how can you customize the IP addresses of replicas on different nodes of a deployment?
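Going back to the SpiderMultusConfig CR mentioned a moment ago, a minimal example might look like this sketch. The field names (`cniType`, `macvlan.master`, `ippools`) follow my understanding of the CRD and should be checked against the version you run.

```yaml
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderMultusConfig
metadata:
  name: macvlan-demo
  namespace: kube-system
spec:
  cniType: macvlan
  macvlan:
    master: ["eth0"]          # host parent interface (assumed name)
    ippools:
      ipv4: ["demo-pool-v4"]  # hypothetical SpiderIPPool name
```

Spiderpool would generate the corresponding Multus NetworkAttachmentDefinition from this CR, with the chaining plugin appended, so you never hand-write the JSON chain.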
You can achieve this by creating multiple instances of SpiderIPPool: for example, one SpiderIPPool has affinity with the eastern nodes and another one has affinity with the western nodes, and they have different subnets. This way, when pods are created, the IPAM can select the right SpiderIPPool based on the node where the pod is located.

In some scenarios, pods need a fixed IP. For example, a firewall rule needs a fixed source IP in the packets to enforce security, or the service of a stateful application could be exposed by a fixed IP. In the community, I notice a common manner of fixing IPs in several CNI projects: they just annotate the pods with IP addresses. I think that is very prone to IP conflicts between applications, and it's almost impossible to observe the total IP usage. So Spiderpool, on the one hand, implements strong verification of IP overlap between SpiderIPPools when updating, and on the other hand, as shown in the right diagram, an IP pool can be bound to a stateless or a stateful application, so that all replicas are restricted within a set of IP addresses. Furthermore, if pod affinity is configured, it can ensure that the pool is exclusively occupied by the matched application. In addition, each StatefulSet pod and each KubeVirt virtual machine can keep a persistent IP even after restarting.

SpiderSubnet is an experimental feature. Suppose there is a platform department responsible for the cloud-native network, while the application department is only responsible for the applications. If the application crew needs to ask the platform crew for help to figure out which IPs are available every time they create a SpiderIPPool, what a communication burden. The CRD SpiderSubnet aims to solve this issue.
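The east/west use case above might be expressed with two pools like this sketch; the label key and the shape of the `nodeAffinity` field are assumptions on my part, so treat this as illustrative only.

```yaml
# Pool for nodes in the "east" region (subnets are made up).
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderIPPool
metadata:
  name: pool-east
spec:
  subnet: 10.10.0.0/24
  ips: ["10.10.0.10-10.10.0.50"]
  nodeAffinity:
    matchExpressions:
      - key: topology.kubernetes.io/zone
        operator: In
        values: ["east"]
---
# Pool for nodes in the "west" region.
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderIPPool
metadata:
  name: pool-west
spec:
  subnet: 10.20.0.0/24
  ips: ["10.20.0.10-10.20.0.50"]
  nodeAffinity:
    matchExpressions:
      - key: topology.kubernetes.io/zone
        operator: In
        values: ["west"]
```

With this in place, a replica scheduled to an eastern node would get an address from pool-east, and a western replica from pool-west, without any per-pod configuration.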
The platform crew creates a SpiderSubnet with all the available IP addresses in the subnet, and then the application crew can use IP addresses from the SpiderSubnet to create SpiderIPPool objects. Actually, there is no need for the application crew to create a SpiderIPPool manually at all: they just specify the SpiderSubnet object in the pod's annotation, and an exclusive SpiderIPPool will be automatically created and bound to the application, and it will dynamically change its number of IP addresses according to the application's scaling events.

What is a zombie IP? A zombie IP is an IP address that remains occupied by a deleted or dysfunctional pod. In an underlay network, such an IP address is locked, which blocks the creation of new pods. The Spiderpool controller takes charge of reclaiming zombie IPs. Basically, it reclaims zombie IPs taken by deleted pods, and furthermore it reclaims zombie IPs taken by deleting pods whose deleting state has lasted longer than the graceful termination timeout. This is especially useful when a node breaks down, as it ensures IP availability for newly scheduled pods.

As we know, underlay networks have various access limitations, so Spiderpool has made enhancements, as shown in the diagram on the right. Spiderpool inserts a veth interface into the pod to connect it to the host. By setting up routes on both the pod and the node, they can access each other even when their subnets are different, which enables smooth pod health checks. Spiderpool also helps improve the ability to access cluster IPs. On the one hand, traffic to cluster IPs can be directly forwarded to the host and handled by kube-proxy. On the other hand, it introduces the cgroup eBPF from Cilium to resolve the cluster IP inside the pod's network namespace; this forwards the traffic directly through the Macvlan interface and provides better performance than the kube-proxy path: up to 25% improvement in latency and up to 15% improvement in throughput.
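To illustrate the SpiderSubnet workflow described earlier: the platform side defines the subnet once, and the application side only annotates its workload. The annotation key and value format below are from my reading of the docs and may vary by version; the names and image are placeholders.

```yaml
# Platform crew: declare the available underlay addresses once.
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderSubnet
metadata:
  name: subnet-demo-v4
spec:
  subnet: 10.30.0.0/24
  ips: ["10.30.0.10-10.30.0.200"]
---
# Application crew: just reference the subnet in the pod template.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
spec:
  replicas: 2
  selector:
    matchLabels: {app: demo-app}
  template:
    metadata:
      labels: {app: demo-app}
      annotations:
        # Ask Spiderpool to auto-create a dedicated SpiderIPPool
        # from this subnet and scale it with the replicas.
        ipam.spidernet.io/subnet: '{"ipv4": ["subnet-demo-v4"]}'
    spec:
      containers:
        - name: app
          image: nginx   # placeholder image
```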
Third, about IP conflict and gateway detection: a chaining plugin of Spiderpool uses ARP probing to detect IP conflicts and gateway reachability, to ensure the availability of the pod's network.

A pod may need multiple underlay network interfaces to access different services in isolated subnets. On the one hand, Spiderpool can insert multiple underlay network interfaces through Multus. On the other hand, Spiderpool performs additional adaptation work. First, the IPAM supports allocating IP addresses from different subnets for different network interfaces. Second, it tunes the routing for multiple network interfaces. As shown in the figure, you can keep the default route of eth0 in the main table, move the default route of net1 to table 100, and direct traffic to cluster IPs and the local host through the veth interface. Finally, when external requests are sent to a specific network interface of the pod, it guarantees that the data paths for request and response are the same, avoiding packet loss.

In addition to an overlay network interface, a pod may need a secondary underlay network interface. For example, it can use the underlay interface to transmit data separately, avoiding any impact on the overlay network, like the VM migration traffic of KubeVirt. Or it can use the overlay interface for TCP and the underlay interface for RDMA. In this scenario, the veth interface is not inserted by Spiderpool; Spiderpool just helps to tune the routing: the default route of the underlay interface net1 is moved to table 100, and all other traffic goes through the overlay interface eth0. Therefore all traffic flows smoothly.

On cloud platforms, there are currently just a few underlay solutions: Cilium, and the underlay plugins provided by cloud vendors. However, underlay plugins may not work due to IP and MAC issues in VPC networks. Is there a universal underlay solution that can work on all public clouds? Spiderpool aims to implement these capabilities.
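The multi-interface setup described above is usually driven from pod annotations. Here is a sketch, assuming two Macvlan attachments and the `ipam.spidernet.io/ippools` annotation for per-interface pools; the attachment and pool names are hypothetical, and the exact annotation format should be verified against the Spiderpool docs.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: multi-nic-demo
  annotations:
    # Attach two extra underlay interfaces via Multus.
    k8s.v1.cni.cncf.io/networks: kube-system/macvlan-conf1,kube-system/macvlan-conf2
    # Allocate each interface from its own subnet's pool.
    ipam.spidernet.io/ippools: |-
      [{"interface": "net1", "ipv4": ["pool-east"]},
       {"interface": "net2", "ipv4": ["pool-west"]}]
spec:
  containers:
    - name: app
      image: nginx   # placeholder image
```

Spiderpool would then tune the routing tables inside the pod so that replies leave through the same interface the request arrived on.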
As shown in the figure, you can create SpiderIPPools with valid IP addresses from the VPC, and Spiderpool utilizes IPvlan to solve the MAC address issue. Finally, pods succeed in communicating in the VPC network. This approach is particularly suitable for hybrid clouds, as it provides a unified network solution. Currently this solution has been verified on AWS and ACK.

A major advantage of underlay networks is the integration with RDMA. We know that RDMA offers significant performance: low network latency and high network throughput, while offloading the CPU. As listed in the slide, Spiderpool provides several options to work with RoCE and InfiniBand.

Spiderpool introduces an egress policy feature for the underlay network. We have created a new project named egressgateway, which cooperates with various CNIs such as Spiderpool, Calico, Flannel, and Weave. Its features include shared or exclusive egress IPs, multiple gateway class instances, active-active gateway nodes, support for TCP and UDP, and dual stack.

IPv4 addresses in an underlay network are limited, and after a node suffers a power outage, pods start slowly for various reasons, including IP allocation issues. To address this, we conducted an extreme test where the number of IP addresses in the IPAM matches the number of pods: we create 1,000 pods at once and monitor how long it takes, to evaluate the efficiency and stability of the IPAM. As shown in the figure, the report covers the latest versions of Spiderpool, Kube-OVN, Calico, and Cilium; some of them failed the test, and others succeeded. As we know, the IPAM of an overlay CNI is based on pre-allocated IP blocks, so the competition for IP allocation is not very intense. In contrast, underlay is different: it aims to ensure that an IP address can be allocated to pods on any node, so the competition for IP allocation is very intense. But amazingly, no matter whether the IP quantity is restricted or not, Spiderpool's performance is the best.
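For the egressgateway project mentioned above, a hypothetical sketch of its objects might look like the following. I am guessing at the API group and field names here, so check the project's docs before using anything like this.

```yaml
# Declare which nodes act as egress gateways and which egress IP to use.
apiVersion: egressgateway.spidernet.io/v1beta1
kind: EgressGateway
metadata:
  name: egw-demo
spec:
  nodeSelector:
    selector:
      matchLabels:
        egress: "true"      # nodes labeled as gateway nodes
  ippools:
    ipv4: ["10.6.1.55"]     # egress IP used as the outgoing source IP
---
# Bind a set of pods to that gateway.
apiVersion: egressgateway.spidernet.io/v1beta1
kind: EgressPolicy
metadata:
  name: demo-policy
spec:
  egressGatewayName: egw-demo
  appliedTo:
    podSelector:
      matchLabels:
        app: demo-app       # pods whose outbound traffic uses the egress IP
```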
We also conducted network latency tests on multiple CNIs. This test involves sockperf testing between two pods located on two nodes. Calico was configured to work with native routing and the iptables data path, while Cilium was configured with native routing and eBPF acceleration turned on, but no BIG TCP. Spiderpool was tested with the Macvlan data path. In the left figure, taking the pod IP as the destination, Spiderpool demonstrated good latency performance. In the right figure, taking the cluster IP as the destination, Spiderpool with cgroup eBPF is the best.

Here's another benchmark, a Redis test against Calico, Cilium, and Spiderpool. In the left figure, taking the pod IP as the destination, Spiderpool demonstrates good throughput performance, and in the right figure, taking the cluster IP as the destination, Spiderpool with cgroup eBPF is the best.

Here's the QR code for feedback and the GitHub links of Spiderpool and egressgateway. I'd like to hear from you. Any questions? Thank you.