Hi everyone, thank you for coming, and I hope you enjoy the OpenInfra Summit. What I'm going to talk about is power saving and NFV, even if they shouldn't go pretty well together. I'm Christophe Fontaine, and I'm part of Red Hat.

What is the challenge? With NFV, the constraint is extremely simple and complex at the same time. Simple, because one packet in means one packet out. Complex, because you have to do that for maybe five, ten, twenty million packets per second. So how do we reduce the carbon footprint while maintaining five-nines HA? Basically, five-nines HA means only a few minutes of unplanned downtime per year. How do we achieve that? Basically, you have an active/passive system, or a system which is not used at full capacity, and this really means a lot of cores just idling, waiting to work, just in case.

Of course, we talk a lot about carbon footprint. It really depends on the energy mix: in France we have a lot of decarbonized energy, so that's 70% nuclear and 20% renewable. But it's not only that. When we talk about carbon footprint, it's also about manufacturing, and what we can do to improve that. Of course, on the manufacturing part we cannot do anything, let's be honest. But what can we do to improve the actual consumption? There is of course the workload scheduling, but that's the responsibility of the VNF orchestrator. There is the ability to optimize the VNF itself, but as the infrastructure provider, honestly, we can't do much. So on which layer are we able to act? Basically, at the infrastructure layer.
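As a side note, the "few minutes per year" behind five nines can be checked with a short calculation. This is a minimal sketch, assuming a 365-day year; the function name is illustrative and does not come from the talk.

```python
def downtime_minutes_per_year(nines: int, days_per_year: float = 365.0) -> float:
    """Return the yearly downtime budget, in minutes, for an availability of N nines."""
    # Five nines means 99.999% availability, i.e. an unavailability of 10^-5.
    unavailability = 10.0 ** (-nines)
    minutes_per_year = days_per_year * 24 * 60
    return unavailability * minutes_per_year


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(f"{n} nines: {downtime_minutes_per_year(n):.2f} minutes per year")
```

For five nines this gives a budget of a little over five minutes of downtime per year, which is why so much capacity is kept idle "just in case".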
That's what we provide to the VNF vendors. And so the real goal is to reduce the power usage of each individual node. What I'm going to present is a very small system, because it only has six cores with hyperthreading. We will be using, of course, Open vSwitch with OVS-DPDK, with 10 gig NICs, and to inject traffic we will be using TRex on another set of cores.

So, show me the numbers, of course. But first, how do we gather the power consumption? The first option would be to have some kind of smart plug or multimeter, and that's great for a home lab, but let's be honest: it's not realistic in the data center. So we all have two tools that we can already use. The first one, of course, is IPMI. Connecting to the BMC, you can just gather the power consumption of your server, power supply by power supply. By the way, just by tuning the way you configure the power supplies, whether it's balanced or optimized mode, you can save up to five watts with a single BIOS setting. The other way to gather data is to use the PCM tools and pcm-power. That way you will be able to gather, socket by socket, the power consumption of your CPUs.

So what I have at home is an HP DL Gen9. What can we achieve at the lowest possible power consumption? We do have a BIOS and UEFI configuration which is called static low power. Basically, you just scale down the CPU frequency and other tunables such as the PCIe speed, where you limit yourself to Gen1, and you can have very good numbers: you can go as low as 9.5 watts idling. But let's be honest, with this kind of performance you cannot go very far. With TRex here, we cannot even achieve six million packets per second. Just a quick reminder: for a 10 gigabit interface with 64-byte frames, you need about 14 million packets per second to be at line rate. So that's far from being enough.

So what happens when we enable the NFV tuning?
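As a side note on the line-rate figure quoted above, the number can be derived from the frame size. This is a minimal sketch, assuming the standard Ethernet per-frame overhead of 20 bytes on the wire (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap) and a 64-byte frame that already includes the FCS. The function names are illustrative.

```python
# Bytes that occupy the wire for every frame but are not part of the frame itself:
# preamble (7) + start-of-frame delimiter (1) + inter-frame gap (12).
ETHERNET_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Maximum number of frames per second a link can carry at a given frame size."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def per_packet_budget_ns(link_bps: float, frame_bytes: int) -> float:
    """Time available to process one packet, in nanoseconds, when running at line rate."""
    return 1e9 / line_rate_pps(link_bps, frame_bytes)


if __name__ == "__main__":
    pps = line_rate_pps(10e9, 64)
    print(f"10G line rate at 64 bytes: {pps / 1e6:.2f} Mpps")
    print(f"Per-packet budget: {per_packet_budget_ns(10e9, 64):.1f} ns")
```

The result is slightly under 15 million packets per second per port, which is the figure the six million packets per second of the low-power configuration has to be compared against.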
You use the static high performance mode, which uses a fixed 2.6 gigahertz on my platform without any trouble. You use the tuned cpu-partitioning profile, and that's all fans on, everything on. You don't have any kind of control over the P-states from the software perspective; you don't even expose them. As a matter of fact, you even disable the cpufreq driver, the Intel cpufreq driver. When you just boot up, without any workload or OVS running, you are already using 34 watts. Quick reminder: that's to be compared to the 9.5 watts that we had before, and only for the CPU part.

So let's try to do a live demo, and pray to the gods of live demos. I'm logged into my platform, and as you can see right now, what we have on the left is the traffic generator. If I just try to inject some traffic, I'm doing something really simple: I will not go up to 100%, I'm just running at 10 million packets per second, five per port. We just wait for a couple of seconds, and as you can see right there: no packet loss, of course, hopefully. That's what the tuning is for.

On the upper right side, we can see cpupower monitor showing whether the cores are actually working or sleeping. That's why you have, on the top right, C1, C1E, C3 and C6. Those are the sleep states of the different cores, and the deeper the sleep state, the higher the latency to wake up when you have something to do. As you can see right now, with no packets, we are already consuming 38 to 39 watts on the CPUs.

So how can we go lower? I load the cpu-partitioning custom profile, and I will tell you afterwards what the magic is. The magic, basically, right now is: hey, we can put these unused cores to sleep. Those cores are just waiting to be working, so we can just put them to sleep. And as you can see right now, we just reduced by three watts. But something you may not have seen is the CPU frequency of OVS, because now fewer cores are running at 2.6 gigahertz.
We can have two lcores running at 3.2 gigahertz, and of course this means an increase in power consumption, and it doesn't make any sense. So let's disable turbo boost. Live demo, of course. So now we are at 35 watts, and now I will set the CPU frequency to 2.6 gigahertz, and as you can see, we are already reducing the power consumption to 27 watts. So that's great, that's a good start. How can I go still lower?

But of course, what really matters for NFV is zero packet loss. So for every tunable, I will restart TRex to verify that I still have one packet in equals one packet out. So that's good.

But still, that may not be enough. We see that we still have those cores at 2.6 gigahertz. What we'll also do is set up the cpufreq driver: either we disable it, or we use the custom cpufreq configuration that I have right now. If I cat /proc/cmdline, you can see that intel_pstate right there is set to passive. What does that mean? That means that the cpufreq driver is actually the Intel one, but in a way that we can configure it. So with cpupower we can set the frequency governor to ondemand. The ondemand governor will just scale the CPU frequency up and down for those cores. Cores 0 and 6 are just running Linux services, so we do not need them to be running at 2.6 gigahertz all the time; they can just go to sleep.

Also, what about the other cores? From a user perspective, we could also set their frequency to 1.2 gigahertz. So, with frequency-set, I will just lower that: this will be cores 2 to 4 and cores 8 to 10. Let's do that. Okay. As you can see, I lowered the frequency of the cores which are not used. The power consumption didn't decrease much, but that's better than nothing anyway. And what can we do now?
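Before moving on, the frequency cap applied above to cores 2 to 4 and 8 to 10 can be sketched in code. The demo uses the cpupower tool; this sketch instead targets the kernel's cpufreq sysfs files (`scaling_max_freq`, expressed in kHz), which is an assumption about an equivalent mechanism and not what was run on stage. The helper names are hypothetical, and writing the values requires root on a machine with a cpufreq driver loaded.

```python
from __future__ import annotations

from pathlib import Path

# Standard location of the per-CPU cpufreq files on Linux.
SYSFS_CPU_ROOT = Path("/sys/devices/system/cpu")


def parse_cpu_list(spec: str) -> list[int]:
    """Expand a CPU list such as "2-4,8-10" into sorted individual CPU numbers."""
    cpus: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def plan_max_freq(spec: str, max_khz: int,
                  root: Path = SYSFS_CPU_ROOT) -> list[tuple[Path, str]]:
    """Build the list of (sysfs file, value) writes that cap the given CPUs."""
    return [
        (root / f"cpu{cpu}" / "cpufreq" / "scaling_max_freq", str(max_khz))
        for cpu in parse_cpu_list(spec)
    ]


def apply_plan(plan: list[tuple[Path, str]]) -> None:
    """Write each planned value. Needs root privileges on a real system."""
    for path, value in plan:
        path.write_text(value)


if __name__ == "__main__":
    # Dry run: only print what would be written (1.2 GHz = 1,200,000 kHz).
    for path, value in plan_max_freq("2-4,8-10", 1_200_000):
        print(f"{value} -> {path}")
```

Separating the plan from its application keeps the dry run safe to execute anywhere, and makes it easy to re-check zero packet loss after each change, as is done throughout the demo.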
You can see that on cores 1 and 7, OVS-DPDK is actually running, but no traffic is flowing. So what can we do? OVS-DPDK, by default, by definition, will just poll the physical hardware queues whether we have packets or not. We used to have something very interesting back in the day, which was the interrupt mode. Let's enable that. That's a new tunable in OVS that's still pending review, but as soon as we enable that, here it is. What do we get? We get 12 watts idling. So remember, we went from 38 watts down to 12 watts.

But once again, what if we inject some packets? I will cross my fingers. Here it is: zero packet loss, still running at 5 million packets per second. The frequency will be fixed at 2.6 gigahertz because, as you may know with turbo boost, every time you switch from one frequency to another you have a slight delay during which, well, we cannot process packets. That's about 10 microseconds, so at 10 million packets per second, that's maybe 100 packets. This isn't a lot, but if you change frequency multiple times per second, you may end up in a situation where you lose some packets and drop them.

So what's next? As you can see, there is the OVS-DPDK NAPI mode: well, the patch is already in review upstream, by David Marchand. What else do we have? The custom tuned profile. We need the custom tuned profile. What do we do? We enable C-states, we re-enable them of course, and we also modify the minimum performance setting of the cpufreq driver so that we can scale up and down from 1.2 gigahertz to 2.6 gigahertz on demand.

The last thing that we need to do as well is to enable those C-states for the virtual machines. And how can we do that? Of course, we could do that with Nova, so that Nova could act on that. But what can we do today? Today we can just put in a libvirt hook in order to enable those C-states on demand, because we are actually using CPU pinning.
We know that those CPUs and those cores are dedicated to the virtual machine, so thanks to that libvirt hook we can just bring them up and down as we need.

So, a few words about software optimization. Sometimes I hear: hey, should we go with hyperthreading, yes or no? The answer is, of course, yes. When we go from one PMD to two PMDs, the power consumption only increases by 1%; the packets per second, on the other hand, go up by 18%. And what about the pipeline complexity of Open vSwitch? When we go from, well, a loopback to the NORMAL action, we get an 8% increase in power consumption. So the depth of the pipeline is extremely critical.

Then what about a newer, bigger system? If we do the same kind of test, we go from 90 watts down to 37 watts. That's pretty awesome. Of course, the latest chips have better power gating capabilities, so hopefully we can go even lower. And turbo boost may be enabled, carefully, as I showed you before. Thanks a lot.

So we have less than one minute, maybe for one question. Yes?

So the question is: is there any dependency on the NIC for OVS-DPDK? So yes, because we need the proper support in the driver in DPDK. That's why I was using the X710 for OVS-DPDK and the Intel Niantic NIC for the traffic injection. But we're actually working to enable that on other NICs. Yes?

So the question is: what is the support status of the NAPI mode? That's the interrupt mode, and that's built into DPDK. Within the DPDK driver you need the support for interrupts, and then within OVS, David implemented that. Yes, the patch is under review, so hopefully very soon. I cannot guarantee anything about that.

Well, we're over time. Thank you very much, and have a good summit.