All right, we're live. Morning everyone, welcome to Berlin. Glad to have you here. I'm on stage today with Jaime. "Hello, my name is Jaime Camano. I'm a senior software engineer; I work in the NFV engineering team at SUSE. Mark here is my product manager."

So I've been working with OpenStack for a pretty good number of years, and we've really been developing the network migration process, specifically around NFV in the telco space. That's what this talk is mostly about, and there will be some setup that we're going to do for it.

How many of you were at the Vancouver Summit earlier this year? Anyone? All right. If you were at the demo theater, we gave part one there: T.R. Bosworth, the other product manager from SUSE, standing right back there, and I gave a talk. We're going to reference some of that, because this is effectively a sequel talk where we go into a little more depth on what you really need to do to build a telco NFV compute environment. You'll find part one at the following URL, or just do a search on "Software-defined all the way with OpenStack" and you'll find it in the OpenStack slides section.

A quick recap of what we discussed during that talk. Number one, migrating from a physical data center to a software-defined data center. There's basically a three-legged stool that makes that up, and you saw it in the keynote this morning from Jonathan: the three legs are compute, storage, and network. We walked through migrating the compute portion, migrating from, say, a traditional SAN over to a Ceph-based product; our Ceph-based product inside SUSE is called SES. Then we started talking about migrating networking, and we broke out enterprise NFV versus telco NFV. I'm going to give you a real quick brush-up on that.
It's going to take about five minutes of the talk, and then we're going to go into some further details that you need to understand when it comes to doing telco NFV.

So, software-defined. This is one of SUSE's architecture slides; here's what a software-defined infrastructure really looks like from our product set. It's a fairly similar kind of environment to what you're going to get across the open source world, because we all work from a similar set of software; it's the packaging, distribution, and support that tend to vary.

Let's re-establish what we were talking about. This is a little bit of catch-up from part one, "last week" on SDI with SUSE, and that's what we're going to run through for a moment. This is what a physical data center looks like. Those of you who have been network administrators or have worked in DCs for a lot of years understand this typical structure. You're dealing with a different set of compute devices, mainframes, minis, micros, PCs, and so forth, sitting on top of storage, say a storage area network, and then networked together. I know this pointing is not going to come out on the recording, but it will help you all in the audience: basically everything is networked together here; that's your set of storage switches that link together your array, and then you obviously have your compute layer. Okay, so a standard DC. Everyone should be comfortable with this.

Now let's move on to migrating the compute layer. What you end up doing is basically migrating away from the mainframes, minis, and so forth that were in the prior set, and going to a commodity-based KVM hypervisor. We do have some special flavors out here as well.
There are multiple types of hypervisors: SUSE supports multiple hypervisors in our cloud, OpenStack is built to be able to do that, we also support containerization, and OpenStack supports Ironic. So across all of those you can basically support everything from standard Linux KVM all the way through to bare metal. A very flexible environment. That's your compute layer.

Moving on to the storage layer, you'll notice that the switch types have changed and the type of boxes down here has changed. This is a Ceph environment that exposes block, file, and object storage to you, all off commodity hardware. And what really happens across this entire set, even though they're drawn differently, is that down here it's basically the same type of Intel or AMD x86/x64 hardware across all of those devices. So it really allows you to economize and minimize the different types of hardware that you have to maintain.

All right, let's make sure we're clear on NFV; a real quick touch point. What is NFV? It is the virtualization of network functions. Put very, very simply, NFV is about taking software that used to run in an appliance, a specific telco appliance, taking that software out of the box, and running it in a Linux process somewhere. Fundamentally, at its core, that's what NFV is all about. I'm not going to spend a whole lot of time on this slide; you'll be able to download the slides, and you can go back for a more detailed discussion from last time.

I'm not going to spend a whole lot of time on this one either. How are we doing on time? Actually, we've got quite a bit of time. But I'm not going to spend a whole lot of time here, other than on enterprise virtualization.
Go back to that prior slide. What you had in this era was, say, a dedicated hardware appliance, a box that was running DHCP and so forth, and that was actually running in your network layer. This, by the way, is the standard symbol for a router in the network world, if you're not real familiar with it. So you're running all of those devices in an appliance in the network layer.

How do you migrate? With enterprise NFV, and I want to make sure that we're very specific about enterprise versus telco, those functions actually migrate down into instances running on that compute layer. Put very simply, you're taking the software out of the firewall that used to be up here, and you're running it, say, in an iptables instance down here inside KVM. Now, that's enterprise NFV. It has different performance requirements than telco NFV does. Enterprise NFV is not about massive throughput; it's about functionality and reliability. So you'll actually get a little more mode-one requirements in enterprise NFV than you might in telco NFV at times, especially in cloud-native type applications.

All right, moving on to the next slide. Let's zoom in on an NFV node and talk about what telco NFV really looks like. We're going to get into data plane acceleration first: the driving factor when you get into telco NFV is data plane acceleration. It turns out there are a few others that Jaime is going to spend some time focusing on, but the real thing you need to do first is accelerate data through the system.

Why is that? For those of you who have never, like, built your own network switch, here's what a switch is made up of. It's a small, low-end processor, either an Intel Celeron or an Atom, and I'm going to use those names because you're probably very familiar with them; we're partners with all the vendors out there. So a Celeron or an Atom is basically your management processor.
That's what runs the network operating system on that switch. Then you have one or more ASICs, and each ASIC or FPGA (field-programmable gate array) actually lets you connect multiple ports, and the data that's traveling, say from one network device to another, goes through that ASIC. It does not go through the Celeron. That ASIC or FPGA is a very, very fast piece of equipment; the processor is not in the way, it just manages how that ASIC is configured. That's why switches can be fairly inexpensive: you're not paying for a big Xeon, which is a more expensive piece of hardware, and as a switch company you can produce ASICs more cost-effectively.

All right, so you need to replace that type of performance on a general-purpose CPU that also has to do other things on the compute node. That's a problem. Data plane acceleration is all about how you make that faster. We have a pure user-space technology in OpenStack, OVS-DPDK, that runs 100% in user space, which solves a whole bunch of operating system issues: context switches, things like that. We have a hybrid approach in SR-IOV, where a physical function on a NIC exposes multiple virtual NICs, each one basically user-space mapped into a running instance; that's a little bit more hardware-focused. And then we have a very hardware-centric approach, which is PCI passthrough. Those three technologies go from very software-defined up through more hardware-focused, and that is presented in this particular slide.
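(An aside that is not in the talk itself: OVS-DPDK's user-space datapath allocates its packet buffers from hugepage-backed memory, so an NFV compute node needs hugepages reserved before that datapath can start. On any Linux box you can check what is available, as a minimal sketch:)

```shell
# Check hugepage availability on a Linux node; OVS-DPDK needs
# hugepages reserved here before its user-space datapath can run.
grep -i hugepages /proc/meminfo
```
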
This was something that was a lot of value to people when we talked to them at the Vancouver Summit. It really tends to make clear to folks what these technologies do and how they actually work between user space, kernel space, and so forth. Once again, the slides will be available for you to download later.

All right, let's move on past the data plane now, because a fast data plane, it turns out, is not enough; you actually need more than that. And why is that? Well, I touched on it a little bit with kernel context switches. How do you ensure that you get repeatable, definable performance out of the kernel for the tasks that are running, if all of this is running in user space? If you have one task that has niced itself up to, like, the highest priority on the system, you could actually freeze your DPDK instance out and not get the throughput that you need, even though the data plane can operate fast; it's just not getting any processing time. So you need to solve that problem.

How do you solve those kinds of problems? The answer is, first we need to make sure we understand what the OS does. This will be OS 101 for those of you who didn't take any OS or computer science classes. What does an OS really do? Four basic concepts. One: multiple tasks run on a system, so how do you time-slice and protect between them, and make sure they can't stomp on each other? That's task one. Number two: how do you make sure that you actually context-switch between these things correctly? Well, that's what a preemptive multitasking operating system does.
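(A quick illustration of that priority point, not part of the talk itself: the standard `nice` tool is one way a task shifts its own scheduling priority, which is exactly the mechanism a badly behaved task could use to crowd out the DPDK threads:)

```shell
# With no command, nice prints the calling process's niceness (typically 0).
nice
# Run a child at niceness 10 (lower priority); the child's own nice reports 10.
# Raising priority (a negative niceness) would require root.
nice -n 10 nice
```
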
Linux, being Unix, is a preemptive multitasking operating system. Next, the kernel has to protect the hardware: all of these tasks access the same hardware, so you have to have something protecting it, and a kernel is in place to do that. And then you need a patching and update facility.

Let's take a look at how these things actually apply in the NFV world. First, we've covered the performance section; we've got a fast data plane running. That's four different technologies, and I added in one more that actually has a huge amount of activity going on that we're not going to delve into in this talk. But if we do, say, a part three at a future summit, then we'll start delving a little more into VPP, because that is a very exciting area right now.

Updatability: telcos require what I call mega uptime, minimum five to six nines. Their networks are built out of a lot of mode-one stuff, and when your cell phone doesn't work, customers tend to get pretty unhappy. The telco loses business; customers flip to different plans or different providers. So they need major uptime, much more so than the vast majority of other networks do. This, by the way, is a real issue, and Jaime is going to go into how we actually solve that particular problem.

Predictability: we need more than a general-purpose task scheduler. When you're processing protocols in the network world, you have to run on a very defined, repeatable basis; otherwise you get that voice chatter where the line breaks up, or the video stops playing.

So that said, let's make an NFV compute node. You've got the floor. You've got nine minutes. Okay.
Thank you, Mark. So, what we did to showcase some of the features Mark talked about before is that we deployed a SUSE OpenStack Cloud on top of SUSE Linux Enterprise compute nodes. We have a two-compute-node cloud, and one of the nodes we have designated as an NFV-capable compute node, where we have enabled some features that we think bring NFV capability to the compute. The two basic features we have for that are live patching, which gives you updatability, and the real-time kernel, which gives you predictability for your tasks. We're going to showcase those specific features through a couple of videos that we made, so you can see what we're talking about.

We're going to start with SUSE live patching. I'm going to start the video. Can you stop it there for a second so I can explain? Sorry about that; just pause it.

So, what we have on the right is a video conferencing application that is being served on the non-real-time, non-NFV compute node. On the left, we are logged in to the NFV compute node. The NFV compute node is running a VNF that provides routing, firewall, and DPI services for all the traffic that is going to the video conferencing application. In this shell on the left, we're going to showcase how we use live patching. Not that kind of live patching... yeah, kind of like this kind of live patching. That's right. So you can play it. Okay?
So basically, we patch the kernel using SUSE's software management tool, which is called zypper. You use the same tool to install software, to make updates, and to install kernel patches. First, we're just listing the patches; we're specifying the repository that holds all the kernel patches, just for simplicity of the output.

To really drive home the point of what's happening here: in a telco environment, you want to keep your node up and running all the time, but kernel CVEs specifically keep coming through. If you want to apply a security patch to a kernel, you need to inject it into the kernel, and to pick that up on the vast majority of systems out there you need to reboot, which just blows your SLA for the whole year. With SUSE live patching, you're able to apply that patch while the system is running, and that's exactly what this is demonstrating: the video player is served through a VNF running on the node, and you're patching a CVE into the kernel at the same time. At the end of it, you've got your kernel patch picked up and you don't have to reboot the node. Say a year into the process, after you've applied multiple CVE fixes, you can turn around, reboot that system, and pick up the CVE fixes that have been patched into the on-disk kernel. Pretty compelling technology that we believe you need for an NFV environment.

Can you go back a second so I can explain? That's going to be tough for me; how far back do you want me to go? Okay. So basically, we listed the security patches that we had available, and now we are installing one through zypper, like you install any other piece of software. It's a bit slow because it's installing the patch in the kernel and building the modules.
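(For reference, the command sequence being narrated looks roughly like the sketch below. The exact patch names are site-specific, and zypper and kgr exist only on SUSE systems, so each call is guarded; treat this as a sketch, not a transcript of the demo:)

```shell
# Live-patch workflow sketch for a SLES node. zypper and kgr are
# SUSE-specific tools, so guard each call for portability.
if command -v zypper >/dev/null 2>&1; then
    zypper list-patches --category security   # list pending security patches
    sudo zypper patch --category security     # apply; a live patch loads as a kernel module
fi
if command -v kgr >/dev/null 2>&1; then
    kgr patches   # live patches currently applied
    kgr status    # "ready" once every process runs the patched functions
fi
echo "live-patch check complete"
```
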
A few seconds later, the kernel has been updated, so we are running a kernel with the security patch applied. Now, we have different tools to inspect the status of the kernel update. One of them is kgr: "kgr patches" lists the kernel patches that are currently applied on the system, and you can see that the patch we just installed is already applied. "kgr status" tells you whether the system is completely working on the latest version of the kernel; when it says "ready", all the processes are working on the latest version of the patched kernel. The kernel patch is installed as a kernel module: the technology we're using is called kGraft, it's applied as a kernel module, and it redirects the kernel functions that need to be patched to new versions of those functions.

All right, on to the next piece. Okay, and now we're going to take a look at the SUSE real-time kernel. The node designated as the NFV compute is running a real-time kernel, and I'm going to show a few bits about it. This is the same setup: the video conference is running on the right, and we are logged in to the NFV compute node on the left.
We check that it's effectively running a real-time kernel, built for real time, and we're going to check the QEMU process that is running the VNF. We want to do this because we're going to inspect that process in detail, to check whether it's really running under a real-time scheduler.

This VNF is running on two virtual CPUs, and only one of those vCPUs is running under a real-time scheduler, because there are some management tasks that still need to run as non-real-time. The first virtual CPU is running on physical CPU 2, and the second virtual CPU is running on physical CPU 3. We can see here that only the vCPU on physical CPU 2 is running under the first-in-first-out (FIFO) real-time scheduler; the other CPUs and threads of the QEMU process are running under normal scheduling.

Now what we're going to do is run a stress test, and we're going to verify that the stress test does not interrupt our service on the video conference. So we're going to stress all the cores except the cores where the VNF is running.
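(The checks Jaime is describing use standard Linux tooling. As a sketch, with this shell's own PID standing in for the QEMU vCPU thread:)

```shell
# Report a process's scheduling policy; an ordinary shell shows
# SCHED_OTHER, while the demo's pinned vCPU thread showed SCHED_FIFO.
chrt -p $$
# Giving a thread a FIFO policy needs root, e.g. (hypothetical TID):
#   sudo chrt --fifo -p 1 <tid>
# CPU affinity works the same way: pin a child to CPU 0 and confirm
# the kernel's view of its allowed CPUs.
taskset -c 0 sh -c 'grep Cpus_allowed_list /proc/self/status'
```
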
When you're running real-time VMs, you want to run them on different cores for isolation. So we are going to test that the isolation is actually in effect, and that the stress test does not affect the VNF. We check that the stress test is running on CPUs 1 and 4 to 11, so it's not running on CPUs 2 and 3, where the VNF is running. And if we inspect all the threads of the stress test, all of them are running under a real-time scheduler.

While the stress test is going on, there are really no effects; it does not affect the video conference at all. The video conference is still isolated on its own CPUs, running in real time, unaffected. If we run top, we can see that the stress test is really loading the system, with all these threads running stress against it, and if we look at the per-CPU view, we see that the VNF's CPUs are more or less free of work, but the other cores are overloaded. So this really shows how real-time helps in isolating your tasks on the compute node. And of course, it's also about predictability, which is very difficult to show in a demo, but all the tasks that run in real time have predictable behavior with regard to timing. That's all that we wanted to show in this talk. Thank you.

All right, and we are at zero left on the clock. So, a quick summary: those are the three pieces we think you really need in the telco NFV space. Obviously the data plane, where a huge amount of work is being done in the OpenStack environment, and in terms of uptime, we feel there's still huge value in the operating system to bring here; I want to make sure you understand what those principles are. You all have a great summit; thank you for attending. Swing by the SUSE booth, we're at B3, if you want to chat, right over there. You all have a great day. Thank you.