In the meantime, I'll just go ahead and introduce myself, and you all can do the same. Maybe you want to go first, since your name is? Yeah. Hello, everybody. My name is Yohaku Sonen. I work at Nokia Mobile Networks as a software architect. Yeah, hi. I'm Ajay Simha. I'm an NFV architect at Red Hat. I work in the group that produces all the reference architectures, and I focus on the NFV reference architectures. All right, today we are going to share with you our experience in DPDK troubleshooting. Although the title says troubleshooting, I know at least some of you may not have the architectural background of how DPDK works, so we'll start off with that. These are the things we expect you to walk away with: an overview of DPDK, what can go wrong in a DPDK installation and how to troubleshoot it, and lastly the tuning of DPDK, which is critical for getting high performance: what you can and should do to get high performance, and how to look at stats. Because once your interfaces are in DPDK, they're no longer in the kernel, so you cannot do "ip link show"; your links won't show up, and so on. So we're going to show you important and useful commands that you can execute on Red Hat Enterprise Linux. This is the agenda. We're going to walk you through how we got here. What are the telco requirements? DPDK, the Data Plane Development Kit, is a method for getting high throughput, and most of these requirements came from network function virtualization, which is the main driver and motivator for developing things like SR-IOV and DPDK. So we'll cover the telco requirements at a high level, then the journey to DPDK, how we got here all the way from virtio; then DPDK in the data plane, which covers hardware tuning, what needs to be done there, how you get the throughput, what kind of throughput you can expect using DPDK, CPU core allocation, and high availability. We'll cover high availability only at a high level, but we will cover it, because it's really important. And lastly, matching the title, the final section focuses on troubleshooting and its various aspects: what can go wrong at install time and what you should do about it, what it looks like in an HA environment, and finally the performance-related pieces. Getting to the telco requirements, there are basically two major pillars. Telcos, service providers, operators, whatever you call them (and I'm sure there are a lot of us in this room; I consider myself telco, because I've been working with service providers since 1998) will say these are the two major pillars: high availability and performance. Then again, they may come back and say: by the way, I also need service assurance, I need billing, other things too. Everything is equally important, but these two are the most important. We have heard of situations where telcos say they do not want to set aside a node to do control functions only, because they want maximum throughput; they want to equate the number of CPU cores they allocate to the number of subscribers who are paying for them. That's how detailed they get trying to get ROI on this.
So first of all, maximum subscribers per core: how do you get that? How do you get high availability as close to five nines as possible? Because here, remember, we have a tiered architecture. You have your data center hardware at the bottom; on top of that, the network function virtualization infrastructure, which is typically OpenStack in our case; and on top of that, the VNFs. So high availability is tiered as well. You cannot get high availability unless you have some cohesive method from the top to the bottom of the stack. Even though we're talking about troubleshooting and performance, remember that without HA, if your node goes down, failure means zero packets per second. So let's remember that. That's why it's always important to think about high availability. Telco applications and services typically serve millions of subscribers. We may not realize this, but even the small tier 2 and tier 3 mobile operators are talking about 8 million, 10 million subscribers, and it can be higher. What used to happen is that hardware-based solutions provided all of this, because operators had multiple refrigerator-sized boxes with dedicated ASICs and hardware doing exactly what was needed to get the acceleration and the high availability. Now we need to get that in a virtualized environment: as much performance as we can, the maximum packets per second, ideally without drops. On drops, I'll show you a chart with some testing done by our larger team, which does performance testing. You can see that if you tolerate drops, even minuscule ones, you can get much higher throughput, and with zero packet loss the PPS is obviously going to be lower. Network equipment providers, the people who provide the VNFs, as well as the telcos, have tried to move away from solutions where the VNFs have any dependency on the hardware. The reason I bring this up here is that SR-IOV works great; you get fantastic throughput. The only problem is that the VNFs and the drivers for the NIC are tightly coupled. So if the VNF vendor moves forward, or, more typically, the operator or service provider changes the hardware, there has to be a dialogue between these parties: hey, I changed this, and things are not working anymore. And "not working" is not something that is tolerated in a telco environment. Even when you have outages, they have to stay within a certain number of seconds or minutes per year, and you have to report them to the FCC, at least in North America. So it's a pretty serious affair to keep all of this working all the time. Ideally, many of the operators and service providers I've talked to want to get as close to the cloud model as possible. They're willing to go for the flexibility while sacrificing a little bit of performance. Again, this is a bit of a gotcha, because they may say they are willing to sacrifice some performance, but you'll find they want more performance later on. It's always a catching-up game, where technology is trying to catch up and give you more and more packets per second.
Yeah, and then let's take a quick overview of the playground where VNFs live. This is a picture of the Nokia AirFrame solution. On the right-hand side there's the VIM layer, where in the Nokia case VMware is also supported. Then there's the hypervisor layer, and all of this operates on top of the data center hardware. On top of that are the actual VNFs. Those VNFs may have certain requirements, not exactly the same across all VNFs, but in many cases they have in common that high network throughput as well as low latency is required. For example, this is true in the radio and core cases. That's where we see terms like real-time capability, and throughput needs to be optimized. DPDK is one approach to meeting these challenges. Thank you. So now let's talk about the journey from virtio all the way to DPDK, how we got here. When people started thinking about NFV initially, it was all virtio, and performance was considerably low. You could easily get maybe 600 to 800 Mbps out of a 10 gig link. That is simply not acceptable from an NFV telco point of view. So picture two 10 gig NICs in this whole scenario I'm going to paint for you. First of all, you have all this hypervisor layering. There was actually one older OpenStack document that showed seven-plus layers a packet had to cross on egress, all the way from the VM until it made it out of the physical NIC, the pNIC. That was ridiculous. Things started improving, and OpenStack made modifications so there were not so many layers, but the performance of virtio in its native form is still not good enough. The next stage was that the NEPs started saying: OK, we need to get as much throughput out of every single 10 gig as possible, because you're setting aside one whole server to do certain functions, and you need the 40 gig, 60 gig, 100 gig throughput out of the whole box. So this is PCI passthrough. The issue with PCI passthrough: say you have two VNFs, VNF1 and VNF2, each with an eth0. The 10 gig is going to be completely dedicated to VNF1, because that's how PCI passthrough works. The hypervisor takes the 10 gig NIC and presents it directly to one of the Ethernet interfaces on the virtual machine. VNF1 uses it, and that's the end of it; VNF2's eth0 has no place to connect to. That is the problem with PCI passthrough. You could add more NICs for all of this, but again, think in terms of monetization: you want your cores and your NICs used to the maximum so that you're getting paying subscribers on them. There's another issue with this scenario: VNF1 may not really need the whole 10 gig. It may want, say, 1.6 gig, and there's no way to share the rest; the remaining bandwidth on the 10 gig is simply wasted. Then came SR-IOV, which solved that elegantly by taking the physical NIC and creating a physical function and virtual functions out of it. You typically have one physical function, which has all the characteristics of the hardware, all the registers, whereas the virtual functions, VF1 through N, typically have only the receive and transmit queues. You can think of it like a switch built into the NIC card itself; it's not really part of the OS, which is why it's drawn in its own box. With this, VNF1's eth0 can connect to virtual function 1, and VNF2's eth0 to virtual function 2, both of them sharing the 10 gig bandwidth and getting better utilization out of it. The one issue, which I already alluded to earlier, is that VNF1 and VNF2 have to carry driver support for the exact family of NICs, an X540 or whatever it is. So if the private cloud changes to a new generation of NICs, the VNFs have to be rebuilt to gain that support, which is the problem we want to get away from.
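As a concrete aside, not from the slides: on a reasonably recent Linux kernel, carving VFs out of a PF is done through sysfs, and looks roughly like this (the PF name ens1f0 is hypothetical):

    # Create 4 virtual functions on the physical function (name is illustrative)
    echo 4 > /sys/class/net/ens1f0/device/sriov_numvfs

    # The VFs appear as their own PCI devices...
    lspci | grep -i "virtual function"

    # ...and as vf 0..3 entries under the PF, with their MAC addresses
    ip link show ens1f0

Each VF can then be handed to a VM via PCI passthrough, which is what gives SR-IOV its near-line-rate performance.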
This is what DPDK typically looks like. There are a couple of fundamental concepts you need to understand in DPDK. One is that DPDK uses something known as a poll mode driver, or PMD: a polling mechanism instead of interrupts. Another, equally important concept is that OVS-DPDK resides in user space. Sitting in user space, thanks to other optimizations such as vhost-user, it is able to communicate directly with the VM and bypass the kernel. That is where you get the optimization and the speed. As for the poll mode drivers, the way they work is that you dedicate a core to the polling. In my case, in my lab, I have 48 CPUs, 0 through 47, so I need to dedicate one of them, saying: you shall only serve this polling function. It's very important to understand how this works, because in DPDK the flexibility comes with complexity. If you tune it and go through that extra effort, you get a pretty good solution that is more cloud-ready than SR-IOV. So in the DPDK data plane, these are the things to consider: hardware tuning, what kind of throughput you can get and should expect, CPU core allocation, and high availability. We'll go through those now. From a hardware tuning point of view, first of all, we talked about zero packet loss. It is fairly important that we get this high throughput with no packet loss. For that you will need later hardware generations; an older one is not supported. There are a lot of tweaks involved, and if you use Red Hat OpenStack director for installing OpenStack, a lot of them are done for you. As we go from OSP 10 to 11, 11 to 12 and so on, more and more of this will be done for you, completely transparently. But you will need hardware that supports these things, the right generation of CPU and NIC. For example, the NFV lab that I work in uses X540 NICs, and some of the other labs have ordered X710s; those kinds of NICs are required. Secondly, it's also important to go into the BIOS and disable things like the C3 power state, and take all these extra steps that are required. The reason they are required is that if you don't, it will cause a lot of local timer interrupts.
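A quick way to sanity-check the power-management state from the OS side, again purely illustrative and not from the slides:

    # Show which C-states the CPUs can enter and their exit latencies
    cpupower idle-info

    # Or inspect the cpuidle states directly through sysfs
    cat /sys/devices/system/cpu/cpu0/cpuidle/state*/name

    # Latency-tuned hosts commonly also cap C-states on the kernel command
    # line, e.g. intel_idle.max_cstate=1 processor.max_cstate=1
    cat /proc/cmdline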
If you look at the interrupts, in /proc/interrupts, it will show you all the CPUs and all the interrupts associated with each CPU, and the local timer interrupts are pretty bad. I will show you some charts of what a tuned profile looks like versus a non-tuned profile. This is something you'll find even on the OVS web page: these are the things they recommend you do to get high performance. Now, I know there's a nice picture on the right side, but I'd like you to focus on the left side, which covers some important things. For the test that produced these results, two virtio-net interfaces were used in the VM, along with two 10 gig interfaces, and testpmd was the test tool. Our group has since developed a new tool called pbench, which is also available for public consumption; I will show you that and share the link with you. Traffic was bidirectional, across various profiles. In the left column you have the packet sizes. If you talk to any service provider, in any RFI or RFQ, people will always bring up 64 bytes, the smallest packet size. But in a true Internet mix, whether it's a typical mobile network, Netflix, whatever, it's a mix of all these packet sizes: 64, 256, 1024, and 1500. In fact, there's a very useful command I came across, whose output I've included, which shows you the counters per packet-size range; you can actually look at that in DPDK. Now, if you look at column two, with 0.002% loss, and I'll take the 64-byte packet size since it's the one mobile operators and telcos discuss all the time, you could get almost 12 million packets per second if you tolerated that kind of loss. But remember, not all traffic is TCP; not everything is going to retransmit. A lot of applications are built on UDP, and a drop is just overhead; you don't want to throw away a packet if you can avoid it. Then compare that with 0% packet loss. And remember, this is for two cores, not one. Looking at column five, which shows 7.34, realistically about 3.7 million packets per second per core is what you can expect. That is a huge improvement over virtio, and closer to the cloud-ready model. With SR-IOV, you can get close to line rate: 9.3 to 9.6 Gbps on a 10 gig interface. Again, we don't like to talk in terms of Gbps, because frame size comes into play: the smaller the frame size, the lower the Gbps. It's simple math; you take the packet size, multiply by the millions of packets per second, and you get the Gbps, so small packets mean less Gbps and large packets mean more Gbps at fewer packets per second. (For reference, 10 GbE line rate at 64 bytes, counting the 20 bytes of preamble and inter-frame gap, works out to about 14.88 million packets per second.) All right. This diagram shows, on the bottom, the physical cores 0 through 3, and core 0 is allocated to the host. We do the CPU isolation and say: these are the cores the host can use for its own housekeeping and other functions,
things that have nothing to do with NFV or DPDK; you can set aside CPUs for that. You also need to set aside a CPU for the poll mode driver, which runs on behalf of the host. And then the physical CPUs 1 through 3 are given to the VM to use. I'll show you the configuration parameters in the director that you would set to achieve this. I guess there are no questions so far; it should be pretty clear and easy to understand. So talking about... yeah, we should wait till the end. OK, let's wait till the end. All right, so DPDK HA. The way we have designed this for the lab and the upcoming reference architecture document: what I don't show here is that NIC 3 and NIC 4 are also bonded. That's my network isolation for all other traffic, like internal API, storage, management, all of that. And if you're using VXLAN, the tenant traffic that goes between the controller and the compute nodes also rides on the bond on NIC 3 and NIC 4. I don't show that here because it's out of scope, but I wanted to mention it. NIC 5 and NIC 6 are dedicated to the data plane in the NFV lab that I work in. What we do here is bond those two using an OVS DPDK bond. And in this example, VM1 is also running DPDK, with its own poll mode driver, and it polls. That's the other thing you have to consider: if OVS-DPDK in user space outside the VM delivers a packet, then unless the poll mode driver in the VM is active and goes and fetches that packet, you can have very poor throughput. Both sides have to be in sync for this to work well. So in this example, the way we do it, eth0 connects to the DPDK bond, NIC 5 connects to switch 3, and NIC 6 connects to switch 4. So if either switch 3 or switch 4 fails, you still have at least half the throughput. That is the whole point of creating these bonds; a manual sketch of the bond follows below. Finally, we'll start looking at the troubleshooting part. There are three aspects to this. First, installation: what to set for the install, what can go wrong during it, and how to look at that. Second, as we said, HA is important and we want to use DPDK bonds, and we'll show you a command to inspect them. Lastly, we'll look at the performance-related pieces: how to inspect a node with show commands, how to look at counters, and what the important things to look for are. Hey, how are you, man? Do you need a poll mode driver, DPDK, to enable SR-IOV as well? You sort of made them sound mutually exclusive. OK, thanks, Arnais. So you're talking about a scenario... there's only one scenario where you would use a poll mode driver in the VM and SR-IOV on the host. Is that the one you're talking about? Possibly. Yeah, that's the only scenario; otherwise, you don't really need a poll mode driver for SR-IOV. So there is such a scenario, but I don't know of any NEPs or telcos wanting to use it. Yeah, they are. They are. OK. We should talk later, at the end. OK, yeah, please pull me in; I'll be happy to discuss that. Do you think it will exist? Yeah, I know it will work, but whether you... that's again going back to... it doesn't take away the problem of using SR-IOV, right? You still need to use SR-IOV there; it'll just increase the performance from the VM point of view, I guess. Yes, do pull me in. I would like to learn what you guys have done, and we'll take it offline. Thank you.
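For reference, here is the manual sketch of the DPDK bond described above, roughly what you would type if you created it by hand instead of letting the director do it. Bridge and interface names are illustrative, and this is the OVS 2.5-era syntax:

    # User-space bridge: datapath_type=netdev puts it on the DPDK datapath
    ovs-vsctl add-br br-link -- set bridge br-link datapath_type=netdev

    # Bond the two DPDK-bound NICs (NIC 5 and NIC 6 in the lab design)
    ovs-vsctl add-bond br-link dpdkbond0 dpdk0 dpdk1 \
        -- set Interface dpdk0 type=dpdk \
        -- set Interface dpdk1 type=dpdk

    # balance-tcp bond mode; lacp=active negotiates with the switch pair
    ovs-vsctl set port dpdkbond0 bond_mode=balance-tcp lacp=active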
So from an installation point of view: in the lab, we have recently used Red Hat OpenStack director with Red Hat OpenStack Platform 10. These are the parameters set in the network-environment.yaml file that relate to DPDK. Some of them are used for other things as well, and you'll find similar configuration for SR-IOV and so on, but these are the ones I've listed as the ones you have to configure and touch for this to work. For the ones I've highlighted: HostCpusList can be specified as an enumeration, as done here, or as a range. For example, I could have just said 1 through 47 and used all of them, or I can select specific ones, which means the other CPUs are not set aside for this purpose. Now, some of this may change as these things progress and evolve, in terms of how they are interpreted; I'm already hearing from the developers at Red Hat that how these are set and interpreted to establish the tuning may change in upcoming releases. But for right now, what it means is: you specify an entire list under HostCpusList, and out of that list, NeutronDpdkCoreList says cores 4, 6, 20, and 22 can be used for the PMD, the poll mode driver. Then NovaVcpuPinSet gives the CPUs set aside for the VMs to use: 8, 10, 12, and so on. So, taken together, HostCpusList is a superset of NeutronDpdkCoreList and NovaVcpuPinSet. Some of this information is available on the Red Hat customer portal, including a complete sample network-environment.yaml with each of these configuration parameters explained; the link is provided here. If somebody wants a PDF copy of this presentation, I can take names and send it later. During installation, if your install went well, this is what the OVS-DPDK setup should look like. In our case we have a bond, so you will see the port shown as dpdkbond0 with two interfaces, dpdk1 and dpdk0. This is what a healthy installation should look like. There are some interesting ways to make mistakes in the YAML file: things like an extra space after a comma. I thought it would be ignored, but it's fed into the command line, which fails silently, and you'll never know about it. Those kinds of things can happen to you. But if everything went well, this is what it looks like. The type "dpdk" is important. And by the way, this bridge is referred to as an OVS user bridge, versus an OVS bridge; the OVS bridge is what you would normally use outside the DPDK environment. If you want to check what OVS-DPDK options are being set, after the install is done you can cat the file /etc/sysconfig/openvswitch, and you will see all the DPDK options there. Whether this file actually gets used also depends on which OVS version you're running: 2.5 uses it and restarts OVS, while 2.6 reads the options out of the OVS database, is what I've been told.
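Going back to those network-environment.yaml parameters for a moment, here is a minimal sketch of the DPDK-related chunk, using the core numbers from the discussion above. The values are illustrative, and the exact quoting and semantics follow the sample file on the customer portal for OSP 10, which, as noted, may change in later releases:

    parameter_defaults:
      # Overall list of CPUs set aside from the host (a superset of the
      # two lists below, per the discussion above)
      HostCpusList: "4,6,8,10,12,20,22"
      # Cores dedicated to the OVS-DPDK poll mode driver
      NeutronDpdkCoreList: "4,6,20,22"
      # Cores handed to Nova for pinned VM vCPUs
      NovaVcpuPinSet: "8,10,12"
      # Hugepage memory per NUMA socket for the DPDK datapath
      NeutronDpdkSocketMemory: "2048,2048"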
So anyway, you can cat that sysconfig file and you'll find that information there. This, by contrast, is what it looks like when DPDK did not get set up and you hit some sort of error. In this one scenario (there's another scenario we ran into that I haven't captured), there was a kind of race condition, and, like I said, I fat-fingered a space between two parameters that are fed to OVS on the command line. Obviously, on a command line you can't have spaces there. It barfed and silently did not install DPDK, and that's what the output looked like. Do you want to talk about this? Yeah, this dpdk-devbind is a pretty handy tool to bind and unbind devices, and also to check status, which means that if something goes wrong even at the very beginning, when you are first taking DPDK into use, it's worthwhile checking whether your interface even got created or not. Thank you. We talked about how we use bonding for DPDK, and the way to look at the bonds is the command "ovs-appctl bond/show" with the actual bond name, in this case dpdkbond0. From that, you can see all the details, the most important being that dpdk1 is the active slave. When we did the failover test, we actually went and failed the switch side of the active link and verified that it fails over to the standby, and so on. The other thing to notice is that the bond mode right now is balance-tcp. LACP, by which I mean 802.3ad, is going to be supported shortly, I think in OpenStack 11; I forget the details, but I can easily find out if someone emails me. Later on, normal 802.3ad bonding will be supported, but right now it is not, so if you want LACP-style bonding, you have to use balance-tcp as the mode. Then comes the final part of this presentation, which is troubleshooting: things to look for in performance. There are two aspects to it: things you can look at on the node, and measurements that are interesting, and not just from a troubleshooting perspective. Of course, making sure your RX and TX counters are going up and drops are not is obviously useful, but if you can also use these for stats collection, that's valuable too. Here are some of the commands I got to use in the lab. You can grep for tuned in the /etc/tuned boot command line file; it will show you the host CPU list that was used, the CPUs that, because of the tuning, are set aside for the host at boot time. Second, the OpenStack director installation creates a file called cpu-partitioning-variables.conf, which will also show what CPUs are being set aside for the host. And lastly, you can grep for vcpu in /etc/nova/nova.conf, and it will show you the vCPU pin list. These are the things that are useful to look at. The next example is taken from the output of a lab that our larger team uses for performance work. It's not from my lab, but it's very interesting data, so I thought I'd share it with you. This is graphing the local timer interrupts. Like I told you, you can cat or vi /proc/interrupts, and you'll see it's a huge, very long file; each CPU's row wraps across many lines.
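Collected in one place, those node checks look roughly like this; the paths are as on the OSP 10 lab nodes, so treat them as indicative:

    # Kernel boot arguments written by the tuned cpu-partitioning profile
    cat /etc/tuned/bootcmdline

    # CPUs the director install reserved for the host
    cat /etc/tuned/cpu-partitioning-variables.conf

    # CPUs pinned for VM vCPUs
    grep vcpu_pin_set /etc/nova/nova.conf

    # Per-CPU interrupt counts; LOC is the local timer interrupt row
    grep -E "CPU|LOC" /proc/interrupts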
Raw, it's kind of hard to view visually. But what they've done is build a tool, that pbench tool I talked about. It's available on GitHub, and I've provided the link at the bottom, so you can go grab it along with its README files; they use it for a lot of testing as well as tuning work. Here, if you notice, I've highlighted CPU 17 in that radio button, or whatever you want to call it, and it shows the average local timer interrupts at approximately 1,000 per second, which is very high. That means the CPU is doing something else. You should only see about two interrupts on a core dedicated to the poll mode driver or to a VM, if you have tuned your CPUs. If you don't tune your CPUs, this is what it looks like. So you may have run an installation thinking everything is great, and then you attach your traffic generator and you don't get the performance; these are the kinds of things you need to look at. And here's the example of a CPU that has been tuned: now I've selected CPU 1 in that radio button, and you can see that the average number of local timer interrupts is close to 2. Again, this is not the only thing; you can also look at /var/log/tuned.log, I think I showed that, and you will see that it actually attempted to do the partitioning and so on. A bunch of useful commands are captured here for the sake of reference. First, this one displays the packet counters and the drops; I've highlighted receive, transmit, and drops. These are things you can check quickly: make sure RX and TX are constantly incrementing and drops are not. Sometimes during transitions you can expect drops; even in hardware-based solutions you will see certain transient drops, and we tend to ignore those. The more important thing is that drops should not continuously go up, and there should not be large numbers. That's very important. The next one, again provided for reference, gives you the correlation between the port numbers and the port names so you can make the connection; I have not used it a lot myself. This next one is a very interesting command, which I alluded to earlier: it gives you counters per packet-size range, like 128 to 255 bytes on the RX queue, 1 through 64 for the small packet sizes, and so on. It's very cool, because you can extrapolate and figure out the traffic mix from it. And lastly, this one displays the PMD port allocation, showing that NUMA node 1 is where cores 14 and 15 are being used.
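For reference, sketches of those counter commands; the port name dpdk0 and bridge name br-link are illustrative:

    # Packet and drop counters for a DPDK port; RX/TX should climb, drops
    # should stay flat (on newer OVS the statistics column also carries the
    # per-packet-size counters such as rx_1_to_64_packets)
    ovs-vsctl get Interface dpdk0 statistics

    # Correlate OVS port numbers with port names
    ovs-ofctl show br-link

    # PMD cycle statistics and the PMD-to-port/rx-queue allocation per NUMA node
    ovs-appctl dpif-netdev/pmd-stats-show
    ovs-appctl dpif-netdev/pmd-rxq-show    # OVS 2.6 and later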
I think that's all I had. If you have any questions, we can take them now. So, a question on packet capture: tcpdump is not available with DPDK, because it works in kernel mode, but there is a utility that replaces it; can you explain a bit about how you do tcpdump-like troubleshooting in a DPDK environment? OK. First of all, for the sake of transparency, we ran into the exact same problem. We wanted to see packets; I love to see packets from the source all the way to the destination, every hop of the way. And I quickly found out that tcpdump doesn't work; in fact, if you do an "ip link show", the links don't even show up in the output. So there is actually an ovs-tcpdump, I'm told by Flavio at Red Hat, I think, who is kind of a guru in this area. He said you can use that, and it will show you all the output. But if you do a normal deploy using Red Hat OpenStack director, you don't get it. What I understand is that you may have to build your own DPDK, and there's a tools directory in there; from that, you can start using it, and I think that ovs-tcpdump will be available. I've not had time in the last month to dig into this, but if you leave me your email address, when I find out, I can share it with you. OK. Thank you. To add to that, you might also consider trying a port mirror, again within OVS. The key there is that any interface assigned to the DPDK poll mode driver software no longer uses a normal Linux kernel mode driver, and tcpdump, of course, deals with kernel interfaces. Excellent slides. I recommend everybody in the room get these guys' email addresses and ask for these slides; this is golden information. Can you go back to the install slide with all the vCPU variables? There are some bits here I want to point out to everybody that are absolutely critical to understanding high-performance VNF compute nodes. This is a hyperthreaded system; you can tell by the numbering. The other thing: there was another session earlier today; find the slides and recording. NUMA node alignment, CPU pinning, 1G huge pages. All of these things work alongside DPDK, not only if you're deploying DPDK at the host level, which is what we've been covering in this session, but for any other VNF use case as well. Having 1G huge pages, nice neat blocks of RAM, and NUMA node alignment is critical. The short version: if you have a multi-socket system, you effectively have two motherboards connected via a QPI interconnect, like a bus. So every time you map a port, you want to map your CPUs and your RAM to the same side of the motherboard, the same NUMA node as whatever your transport mechanism is. Excellent stuff here; this slide alone is worth everybody reaching out to you for the slides. The other thing on performance that is always alarming the first time people run into it is the apparent CPU utilization of the poll mode driver. Your typical NMS guys will monitor it and go: oh, you installed this thing, and all of a sudden my utilization is 100% on these cores. No, really, it's a poll mode driver, and it's doing what it says: polling constantly for packets to be sent or received on a port. So that's a false positive, maybe, is the way to describe it, until you get people to settle down. But yeah, if you're still running classic OVS, you are failing your company, in my opinion. Thank you. Thank you for the comments. Just to follow up on what you said: does it make sense to track or measure the processing cycles for DPDK and the polling cycles for DPDK separately? That would give a sense of how much of the CPU cycles are being used for actually processing packets. Great question. No, do not do that. The CPUs you allocate to anything running DPDK, whether it's the many, many vendor-provided VNFs that employ DPDK within their own data planes, or DPDK in Open vSwitch, those cores are dead to you. They are gone. They are allocated for forwarding. You will not be using them for any other purpose.
You should not even want to use them for any other purpose; they are giving you much greater functionality in your environment. So don't bother tracking them: those cores are gone. I understand. What I meant was: is there any way to find out that the CPU cores allocated to a VNF are pegging at their threshold, and that that is the reason the performance is limited, that that's the bottleneck? How do you do that? OK, so let's say we were comparing classic OVS to OVS-DPDK; that's a good context for your question. Total packets per second of throughput is what you should look at, because if you're running classic OVS, on a good day you're getting 1.1 million packets per second, and you're done for the host; the rest of your host is sitting idle, the cores doing nothing. So packets per second is the threshold you're looking for. As soon as you move over, well, his chart already covered all of that: even lightly tuned, DPDK-enabled OVS is going to decimate that. Regarding coexistence, the shortest possible version of this: using OpenStack scheduler filters, you have groups of compute nodes that have ports allocated for SR-IOV, and groups of compute nodes where anything running OVS should be running DPDK-enabled OVS. Those hosts coexist, and the scheduler filters do a very nice job of picking where your workloads go. So if you're a telco, and part of your job is to schedule various Cisco, Juniper, Brocade, whatever, vRouters, firewalls, and load balancers that are compiled with DPDK, they will benefit from running on top of the DPDK-enabled vSwitch, and you retain the flexibility that our speakers have mentioned. They benefit even more if you can afford to give NUMA node alignment, CPU pinning, and 1G huge pages to the workloads on those SR-IOV-enabled interfaces. And the rest of your commodity, oversubscribed compute load should run on top of these nodes without any question whatsoever. So coexistence works great; you choose what your business needs are to determine where those workloads go. I'm being signaled that we've run out of time, so can we take your question offline? Or maybe one last question; make it quick. Thank you so much for attending. There are also reference architectures that we put out, two of them: one called deploying mobile networks using NFV, something like that, and a second one that is going to use OpenStack Platform 10. So look for those on access.redhat.com under reference architectures.