Hello, let's get started. Hello, everyone. Welcome to our topic. Today we are talking about Open vSwitch in Neutron, the performance challenges we have run into, and a solution we came up with based on Open vSwitch hardware.

Let's introduce ourselves first. My name is Kong Yongsheng, and I work for UnitedStack. UnitedStack is an OpenStack service and product provider. We have a booth in the hall, so you are welcome to visit our booth. Bo, can you introduce yourself?

Hi, I'm Bo. My name is Yang Bo, and I come from 99Cloud. I'm also the administrator of TryStack.cn, a test-bed project of the Chinese OpenStack user group.

So, why did we choose Open vSwitch? In April this year, the OpenStack user committee conducted a survey. It collected 197 OpenStack deployments from about 400 respondents, and it shows that about 39% of those deployments are using the Open vSwitch solution. So the Open vSwitch agent is very popular with OpenStack deployers. We have the reference URL, so if you are interested in the survey, you can go there. That is why we chose Open vSwitch as the topic today. Bo will do the following presentation for this session.

So, today I'm going to share with you some Open vSwitch usage in OpenStack, and especially my experience using Open vSwitch to build TryStack.cn, a complete OpenStack SDN environment. Next, we will talk a little bit about OpenFlow, and about the problems I met when deploying Open vSwitch with OpenStack. The last part is possible acceleration solutions for Open vSwitch, especially for the performance issues.

OK, first, let's go through Open vSwitch usage in Neutron. As we may already know, a typical OpenStack deployment usually has two networks: the private network, for VMs communicating with each other, and the public network, which helps VMs reach the outside. In the TryStack.cn architecture, we install most OpenStack components in one box: Keystone, Glance, the Cinder API, the Nova API, and so on; in particular, the Neutron server is also on our controller node. We also have a network node. This node runs most of the Neutron-related components: the L2 agent, the L3 agent, the DHCP agent, and the metadata agent. The rest of the machines are installed as compute nodes, set up with nova-compute and the OVS agent, which is also known as the L2 agent here.

So we can boot a VM on a compute node. When the VM is up, the data path looks like this: the VM sends packets to Open vSwitch, they go through the private switch and reach the network node, and the last stop is the public switch. This public switch may be connected to the public router.

When we expand the compute nodes, we run more VMs here. The compute nodes are connected through the physical network. As we know, there are the VLAN mode and the GRE mode. The GRE mode is drawn here with dashed lines, and the physical network with solid lines. If the VMs want to reach each other, the packets go into the GRE tunnels. As we already know, Neutron supports a couple of network types, like GRE mode, VLAN mode, and flat. In this case, we are using GRE tunnels.
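For reference, GRE mode is enabled through the OVS plugin/agent configuration. A rough sketch, as it looked around the Grizzly/Havana releases; the file path, section name, ID range, and IP address here are illustrative, not the actual TryStack.cn values:

    # e.g. /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini
    # (the section appears as [OVS] in some releases)
    [ovs]
    tenant_network_type = gre
    tunnel_id_ranges = 1:1000
    enable_tunneling = True
    integration_bridge = br-int
    tunnel_bridge = br-tun
    local_ip = 192.168.1.11    # this host's GRE tunnel endpoint address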
OK, let's look at it again. These networks are isolated by VLAN tags, so the blue tenant's VMs can only reach the blue VMs, and the green ones can only reach the green ones. All of these VMs are bridged onto Open vSwitch in this case.

For example, this is the VLAN bridging. The VM's network device shows up as a tap device bridged onto br-int. br-int has a connection to the physical network device: actually br-eth1 is also a bridge, and it connects br-int to eth1, with a veth pair in between. The right side is a compute node, so it has just one bridge. The left side is the network node, but it also acts as a compute node, so it has two bridges, br-int and br-ex. br-ex is bridged to eth2 and is able to reach the public router. So the VLAN bridging is pretty simple, and when we change to GRE mode it gets a little more complicated.

Here, both hosts have one more bridge, br-tun, which is for the tunnel connections. The eth1 interface has an IP address on both hosts; it is used as the tunnel endpoint. You can see that br-int has a patch port connecting it to br-tun; this is what patches br-tun and br-int together. So when the VM has a tap port on Open vSwitch, its traffic can go from br-int to br-tun and be sent to another machine through these tunnels.

After that, we can talk about the Neutron workflow with these plugins. First, we start the Neutron server on the controller node. The Neutron server prepares the database connection and the message queue connections. If the Neutron database already exists, it does nothing more; but if the Neutron database does not exist, the first time we start Neutron it will create the database schema. Otherwise it does nothing in this diagram.

Next, we start the Open vSwitch agent. When we start the OVS agent, it checks the br-int bridge. If br-int does not exist, the OVS agent fails here. It also prepares the tunnel connections, so it will create the br-tun bridge. Once the OVS agent is started, we have Neutron networking connected, so we can start the rest of the Neutron components. Next we start the L3 agent, and you can see it checks whether br-ex exists; if br-ex does not exist, it also fails here. Then we can start the DHCP agent and the metadata agent. These three agents, the L3 agent, the DHCP agent, and the metadata agent, have no particular ordering, so you can start them at the same time.

When these agents are ready and the Neutron server is already started, we can create networks. We usually create a network through the CLI command, or through a web request, for example from Horizon. Whether it is the CLI or the web request, the command is sent to the Neutron server through the REST API. The Neutron server then puts the request onto the message queue. So when we create a network, the DHCP agent receives the request, prepares the qdhcp namespace here, and creates a port for DHCP. At this point, you can use the ovs-vsctl command to show the ports on Open vSwitch, and you can see the DHCP port in the qdhcp namespace. We usually create a network with a subnet, right? So if we have a subnet, we need the DHCP port here.
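As a quick illustration of that last point, these are generic OVS and iproute2 commands you could run at this stage; the bridge and namespace names follow the usual Neutron conventions:

    # list the ports attached to the integration bridge
    ovs-vsctl list-ports br-int

    # show all bridges, including the patch ports between br-int and br-tun
    ovs-vsctl show

    # the DHCP port lives in its own network namespace, named after the network UUID
    ip netns list
    ip netns exec qdhcp-<network-uuid> ip addr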
The next step is creating routers, because the VMs attached to Open vSwitch need a router to reach the outside network. So when we create a router, the L3 agent prepares the qrouter namespace on the network node. Once the qrouter namespace is set up, the whole network environment is set up, so we can boot a VM on this network, and reach it from outside or from the same host. When the VM boots, it sends a DHCP request to the qdhcp namespace, and the DHCP server returns the IP configuration. In this case we use dnsmasq, so dnsmasq returns the IP configuration to the VM. Then the VM has its gateway set up here, and now the VM can be reached from the network. That is what I wanted to say about Open vSwitch in Neutron.

So let's go through the OpenFlow section. OpenFlow is actually a control protocol for network management, so you hear it a lot around SDN concepts and OpenFlow controllers. In this case, we have an Open vSwitch installed here, and it connects to an OpenFlow controller, like Ryu or OpenDaylight. It connects to the controller over a secure channel, so we can have a control center to manage these Open vSwitches centrally.

A flow usually contains at least one match field describing the flow, and at least one action. In this case, we have a flow whose match is in_port=2, and the action is output:7. So with this diagram, a packet coming in on port 2 will be sent out on port 7. In the next one, we specify in_port=1 and the output is flood; this is a broadcast flow. These are simple flows. The next slide will explain the flow tables with L2 population. So Yongsheng, would you like to take this one? Sure.

This slide shows how the Open vSwitch agent organizes the flow tables. You can see that we have many flow tables here. Table 0 is the first table: for traffic from the VM, it directs the traffic to table 1. Table 1 tells the difference between unicast and multicast traffic. If it is unicast traffic, it directs the traffic to table 20. Table 20 passes the traffic to the GRE tunnel port, and then the traffic goes out to the remote endpoint. That is the unicast traffic. If it is multicast traffic... there is an animation to show it, every time it is animated. If it is broadcast traffic, table 1 directs the traffic to table 21, and table 21 floods it to the GRE tunnel ports. If the traffic is coming from the outside through the GRE tunnel port, it also first goes to table 0. Table 0 knows that it came from outside, so the traffic is directed to table 2. Table 2 converts the GRE tunnel ID to the local VLAN ID, and then the traffic is passed to table 10. This is how the flow tables are organized: unicast traffic, then broadcast traffic, then incoming traffic; you can see how the flow tables are organized in the Open vSwitch agent.

In the Havana release, the Neutron team introduced a new mechanism, L2 population. L2 population fills in the forwarding database entries of the Open vSwitch bridges for tunnel networks, and the GRE tunnels are managed by the Open vSwitch agent as needed: the needed tunnel port is only created when traffic requires it. This is different from previous releases. Go back. Yes.
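To make those flow examples concrete, here is roughly what they look like with ovs-ofctl; the bridge name br0 is just an example, while br-tun is where the OVS agent installs its own, more elaborate rules:

    # packets arriving on port 2 are sent out on port 7
    ovs-ofctl add-flow br0 in_port=2,actions=output:7

    # a simple broadcast flow: packets arriving on port 1 are flooded
    ovs-ofctl add-flow br0 in_port=1,actions=flood

    # inspect the tables the OVS agent programs on the tunnel bridge (0, 1, 2, 10, 20, 21, ...)
    ovs-ofctl dump-flows br-tun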
All right, here are some problem statements. I ran into these problems when I used Open vSwitch in the TryStack.cn deployment, and this section also gives an overview of the Open vSwitch performance issues. Let me ask you guys: how many people here have already deployed Open vSwitch? Do you use GRE tunnels for the Open vSwitch connections? Yes.

In the TryStack.cn deployment we have about 10 servers; that is not many servers. In this diagram we have 12 servers. As we know, a GRE tunnel connects server to server; it is actually a point-to-point connection. If we have 12 servers, you can see that each server connects to all of the others, so the number of tunnels per server is n minus 1. When these tunnels are created, each server handles about 11 tunnels. This tunnel performance issue appears in all such deployments, and we see a lot of flow table entries in Open vSwitch. That is why I bring this problem up: we had to find a solution to the tunnel performance issue.

The next problem is that if we use GRE tunnels, a VM talking to another VM on a different host will hit MTU issues. When packets enter the tunnel, a GRE header is attached to the packet, which makes it larger than the MTU and tends to break connections, so you may see compatibility problems with some HTTP servers. In our case, the VMs on the GRE tunnels were not able to access some websites; in particular, most Chinese portal sites were not accessible.

The next problem is flow matching. In Open vSwitch, the first packet of a flow always misses in the kernel flow table, and on a miss it has to go up and ask ovs-vswitchd, which runs in user space. Even if it eventually hits a flow there, that is still the slow path, so if we have a lot of short flows, the performance decreases considerably, especially in older versions of Open vSwitch. If you are running version 1.4 of Open vSwitch, it only uses a single core on your server; if you have a multi-core server, only one core is used. The latest versions have changed that: thanks to the multi-threading mode and megaflows, the flow table performance of Open vSwitch has improved. But even if you have multiple cores to match these flows, there are still locking and lock-contention problems. So the question is how we match these flows quickly and improve the CPU usage.

When I deployed OpenStack with GRE tunnels and Open vSwitch for the first time, the network node had a lot of ports, because we had about 100 networks in Neutron and about 50 routers. So the Open vSwitch on the network node had about 300 ports, and the flow table was very long. This situation put us in a difficult position; the bandwidth is very limited. So the first problem is how we improve the flow matching.

Then we found a solution. We thought: how about we put Open vSwitch into a hardware switch, so it can be accelerated by the hardware chip, in particular using an ASIC or FPGA to accelerate the flow matching. So we found a network switch vendor called Centec, and their team is working closely with us. So we have an improved architecture here: the hardware switch runs Open vSwitch, so we can still use the Open vSwitch agent to manage this hardware switch. It also creates the tunnels between the hardware switches, so we can decrease the number of tunnels. And in hardware, we can put a lot of flow table entries in the switch. This model of switch can handle about 20K flow entries, and about 40K at the maximum. I don't remember the specific number, so please correct me if I am wrong.
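As a side note on the GRE MTU problem mentioned above, a common workaround (a generic sketch, not necessarily what we did in TryStack.cn) is to have the Neutron DHCP agent push a smaller MTU to the VMs through dnsmasq, so the encapsulated packets still fit into the physical 1500-byte MTU:

    # /etc/neutron/dhcp_agent.ini
    [DEFAULT]
    dnsmasq_config_file = /etc/neutron/dnsmasq-neutron.conf

    # /etc/neutron/dnsmasq-neutron.conf
    # DHCP option 26 is the interface MTU; the exact value depends on the
    # encapsulation overhead (1454 leaves room for GRE or VXLAN headers)
    dhcp-option-force=26,1454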
So we have this switch connected to the compute nodes. You can see the compute nodes sit in a rack, and this switch is a top-of-rack switch. We connect the racks through these switches, so the tunnels cross the racks: if we have two racks, we just need one tunnel here. On the compute node we can still use the Linux bridge; we have benchmarked it, and the Linux bridge is always faster than Open vSwitch. So you can still use Open vSwitch on the compute node, or you can choose the Linux bridge to bridge the VMs to the switch. This is the hardware switch solution.

As you may know, other switch vendors may also have Open vSwitch solutions, like Pica8. I don't know if the Pica8 guys are here, so I'll just bring it up. Why did we choose Centec? Because Centec has an amazing product for OpenStack, and they are near us: we are based in Shanghai, and they are based in Suzhou, the nearest city.

And there is another solution. If you are interested in flow processing, you may have heard about DPDK from Intel, and PF_RING or netmap. I'll just say a little bit about DPDK. DPDK is basically a bunch of API and SDK libraries. It has three key components: DPDK provides CPU affinity, UIO, and huge pages. Huge pages are provided by the Linux kernel, and the UIO driver comes with the Intel SDK. Let's start with huge pages: they can improve the efficiency of memory usage. UIO lets you drive the network device from user space. When I saw the Open vSwitch slides at a Linux conference, they presented a possible performance solution that keeps all packets in user space, with no need to switch into the kernel. So DPDK's UIO provides a way to process these flows in user space. CPU affinity provides the ability to dedicate one CPU core to handling one network device: if we have four CPU cores, we can let core one focus on processing the packets of eth1 and core two focus on the packets of eth2.

And for the GRE tunnel problems, we can change to VXLAN. We have not used VXLAN, because VXLAN is very new for us and we are not quite familiar with it, but the newest Neutron release already supports VXLAN, so if you guys want to try it, it is a good choice.

The last part is some Open vSwitch debugging tips. If you have problems using Open vSwitch, here are some tips for debugging. The first is to test basic connectivity, because it is quite possible to bridge the wrong interface onto Open vSwitch; for example, we intend to bridge eth1, but eth1 has no wire connected. You can use tcpdump to check for the expected packets on the wire; tcpdump is very easy to use. Another method is to just use a plain Linux bridge without Open vSwitch. But usually we have an OpenFlow setup on Open vSwitch, so the packets may not hit a flow. We can use some Open vSwitch tools to debug this, like ovs-ofctl, ovs-appctl, and ovs-dpctl. These tools are all very good; often they are all you need to address your network issues.
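As an illustration of those tools, here are a few generic commands; the interface and bridge names are just examples:

    # check whether the expected packets actually show up on the physical interface
    tcpdump -n -i eth1

    # dump the OpenFlow tables, with packet counters, to see which flows are (not) being hit
    ovs-ofctl dump-flows br-int

    # kernel datapath statistics: "missed" lookups are packets that took the
    # slow path up to ovs-vswitchd
    ovs-dpctl show

    # flows currently cached in the kernel datapath
    ovs-dpctl dump-flows

    # in recent versions, trace how a packet entering on port 1 would be handled
    ovs-appctl ofproto/trace br-int in_port=1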
So, this session only has 40 minutes, and that is all the slides we have. We have a couple of minutes if you want to ask some questions.

We use Open vSwitch in our OpenStack Grizzly environment, and we found that OVS seems to work a little better in the Ubuntu environment than in the Red Hat one. For example, in some use cases OVS drops ARP packets in the Red Hat environment, and we tried the 1.4, 1.6, and 1.9 versions of Open vSwitch, but I guess there are still some problems with packet transmission in Open vSwitch. So can you give us some advice about debugging Open vSwitch in the Red Hat environment? Thank you.

I'm not familiar with the Red Hat environment, but I have had these issues on Ubuntu as well. If you use version 1.4, that is actually an old version. You can see packets dropped because the CPU cannot keep up with the flow matching anymore, so you may want to upgrade your Open vSwitch, or you can decrease your flow table size.

I have a question. Great presentation, by the way, a lot of useful information. You mentioned the Linux bridge is faster than the OVS bridge. What kind of performance testing did you guys do, and why do you think the Linux bridge is faster, from an implementation point of view?

Actually, we did some benchmarks with a large number of flow table entries. The Linux bridge basically puts the packets through the device directly, but Open vSwitch always has an upcall to contact ovs-vswitchd, so there is a context switch there. But in the newest version of Open vSwitch, the difference is not that big.

OK. Another one? Was VLAN an option for you, or did you have to use GRE tunnels?

Excuse me?

So you used GRE for tenant isolation?

Yeah.

Could you have used VLANs instead?

Yeah, we finally had to use VLANs instead.

OK. You still wind up with the VLAN cap, 4096.

Yeah. We also have a booth in the hall, so if you have another question, you can just find me there. Thank you. Thank you.