Hi, welcome to KubeCon, CloudNativeCon North America 2020 virtual. My name is Dave Kremmens, I'm a cloud server architect working in the network platform group at Intel, and today I'm going to talk about how we can enhance the Kubernetes networking model with SmartNICs.

For today's agenda, I'm going to talk about the edge, bare metal as a deployment target for the edge, and Kubernetes. Then we'll cover the Kubernetes networking model, the simple requirements inherent in Kubernetes itself from a networking perspective, and some of the trends we've seen over the last number of years, especially as new industries onboard to the Kubernetes platform. We'll then discuss SmartNICs and the categorizations we can apply to them, and then disaggregation, which I think is a key aspect of how we can enhance the Kubernetes networking model, along with some more discussion around offload techniques.

So I want to start by discussing the edge just for a moment. We keep hearing questions about what the edge actually is, and the edge is simply geographic distribution: computing done at or near the source of data. As you can see here, we have different aspects of the edge: the on-premise edge, the access edge, the near edge and the far edge. It's all in terms of the point of presence, where exactly it sits from a geographic perspective and what kind of computing we can do at that location. We also tend to see close tie-ins and alignment between edge computing and the 5G world. The move to 5G is driving changes in edge computing, and edge computing in turn is driving adoption of 5G solutions, which come with the promise of lower latency, higher capacity and increased bandwidth. When we look at the deployment models available from an edge perspective, we see public models, private on-prem models, and then hybrid models, which are a combination of both. And there are four main markets targeting edge computing today: IoT, enterprise, telco and cloud.

So why would we look at a bare-metal deployment for the edge? We still have numerous legacy applications that require virtualized platforms, so inherently there are numerous deployment models for the edge: bare metal, virtualized, paravirtualized, et cetera. But edge deployments need small footprints. Even as we see cloud processing capabilities moving towards the edge, and the telco industry adopting cloud-native patterns and applying them at the edge, the footprint becomes very critical. The previous slide showed the geographic distribution of edge deployments, which essentially means we need to be very aware of the surrounding environment wherever an edge deployment lands. We don't have the space of a modern data center, and we have little tolerance for anything that eats into server capacity, like virtualization, so we need to try and avoid that tax if possible. The lead time for new hardware at an edge location is also very long, so we need to prime our deployment correctly and get it right the first time. And I think bare metal is ideal for the likes of network functions that really require deterministic and predictable performance. And why is that?
Why is bare metal a good choice for network functions as we transition from the VNF world to the CNF world, or as we move legacy VNFs and orchestrate them in something like Kubernetes? Well, we have full access to the hardware, which is fantastic, and by virtue of that full access we automatically reduce the amount of resources we need. We can look at a scale-up model, where we scale up the capabilities of our bare-metal platform, versus a scale-out model, and I think that aligns nicely with the smaller footprint: we scale up as opposed to out. We also have options to leverage accelerators like FPGAs, QAT or GPUs for further performance gains and more acceleration options. And with bare metal, I'm a firm believer that we are capable of generating higher throughput, lower latency and superior performance, which pretty much aligns with the 5G promise and even the edge deployment requirements themselves.

Another key aspect of why bare metal is very applicable to the edge is that it's more dynamic. We don't have taxes like the virtualization layer, and we don't have static configurations that need to be in place. We can provision our system, configure our host options, have our OS running, and then have a number of runtimes in place that provide abstractions for the software that's going to run on that particular platform. These components can be swapped easily, and that facilitates infrastructural changes with little side effect on applications. So I do believe the edge is prime for bare-metal deployments. My key takeaway is that the bare-metal edge is better designed to address the needs of telco and can deliver on the speed and performance required by 5G solutions today and in the future.

So we have the edge, we have bare metal; how does Kubernetes fit into the picture here? Let's examine some trends out there today. Forecasters are predicting huge increases in edge computing, and this essentially comes down to the sheer quantity of edge instances compared to centralized cloud servers. These edge workloads will continue to rise; they'll grow more complex and become more demanding. When we get into this particular arena, infrastructure and platform resources need careful management, so we need to be mindful of how we manage these aspects to ensure we can meet the requirements of edge workloads. Kubernetes is going to play a very critical role in this space, because inherent to Kubernetes is the ability to abstract the infrastructure capabilities while still providing a robust and scalable platform. That essentially is perfect for the likes of the edge, because Kubernetes doesn't really know or care whether it's targeting a cloud deployment, an on-prem deployment or an edge deployment; one of its key strengths is the abstraction around the infrastructure.

So with this in place, how do we ensure that we can leverage the edge platform to do what it's meant to do, or in other words, how do we get the best out of an edge deployment? We need to start thinking differently. We need to be aware of the boundaries and separate out the concerns, and to achieve that we can look at things like disaggregation and distribution. We need to look at how we manage the resources on our platforms so that we get the best out of them.
And we need to ensure that we have instrumentation built in from the ground up, so that we have observability capabilities in place that allow us to collect, process, learn and even optimize. Kubernetes is going to be a big player in the edge arena, and one area in particular that I would like to focus on from an edge perspective is the Kubernetes networking model.

The networking model in Kubernetes today has some simple requirements: all pods and node agents can communicate with all pods without NAT, and the IP that a pod sees itself as is the same IP that others see it as. These are the primitives of Kubernetes networking. But that hasn't stopped advanced networking models with complex properties being deployed on Kubernetes all the time. And what do I mean by complex properties? I mean things like tunnels and overlays, advanced service meshes with sidecar deployments, IPsec and zero trust. We have data plane technologies from the low-latency, high-performance domain, things like DPDK and SR-IOV, et cetera. And then, pertinent to telco, there are the multiple network interfaces required on a pod. Remember that Kubernetes only supports a single interface from a control plane perspective; any other interfaces available in the pod are not visible to the Kubernetes control plane. My point here is that Kubernetes networking has advanced as the years have rolled on and as new workload types have onboarded to the Kubernetes platform, and these types of properties require platform resources to deliver on their respective claims.

So let's take an example: OVN for Kubernetes. The idea here is to highlight that there are complexities involved in provisioning a Kubernetes network to ensure that we have connectivity, that traffic can be directed on egress or accepted on ingress, and so on. We have network policies in place and a whole plethora of operations and behaviors defined from a networking perspective. Even here, just to provision a simple pod with OVN and ensure the right integration with OVS, we have three streams of work: the Kubernetes pod creation flow (steps one to four, or at least the Roman-numeral version in the diagram), the network settings generation flow, and the network settings application flow. You will see that there are a number of different contenders in the Kubernetes networking space; other SDN controllers have equally complex properties and configurations that they need to deploy and manage, and these are absolutely critical for different types of applications to function correctly. So Kubernetes is not all that prescriptive in terms of what it mandates from a networking perspective, but we still need the extra complexity and characteristics of the SDN controllers applied today to ensure that we can meet the demands of the workloads now running on Kubernetes. Another key point here is that change is inevitable. Networking architectures are continuing to evolve, and when they evolve they become even more complex. This is already happening, and the edge will be targeted with these types of complex networking deployments.
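Just to make that earlier multiple-interfaces point a little more concrete before we move on: below is a minimal sketch in Go, using client-go, of the de-facto way a pod asks a meta-plugin such as Multus for a secondary interface, via the "k8s.v1.cni.cncf.io/networks" annotation. The attachment name "sriov-net-a" and the kubeconfig path are placeholders I've assumed for illustration; the thing to notice is that the control plane still only ever sees the primary interface.

```go
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the default kubeconfig location (an assumption for this sketch).
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// Ask a meta-plugin such as Multus for a second interface via the
	// NetworkAttachmentDefinition annotation. The attachment "sriov-net-a"
	// is a placeholder; it would be defined by the cluster admin and could
	// map to an SR-IOV VF, a VLAN, and so on.
	pod := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			Name: "cnf-example",
			Annotations: map[string]string{
				"k8s.v1.cni.cncf.io/networks": "sriov-net-a",
			},
		},
		Spec: corev1.PodSpec{
			Containers: []corev1.Container{{
				Name:    "cnf",
				Image:   "busybox",
				Command: []string{"sleep", "3600"},
			}},
		},
	}

	created, err := client.CoreV1().Pods("default").Create(context.TODO(), pod, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
	// Note: pod.Status.PodIP (once assigned) only ever reflects the primary
	// interface; any extra interfaces exist inside the pod but are invisible
	// to the Kubernetes control plane.
	fmt.Println("created pod", created.Name)
}
```

Everything beyond that primary interface, the routes, the extra attachments, the policies around them, is left to the CNI plugin and whatever SDN controller sits behind it.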
On top of the connectivity itself, we've got things like network observability, artificial intelligence and machine learning, and service assurance in terms of the whole closed-loop model. These kinds of constructs require extra processing: they need much more time on the CPU, more access to memory, and there's a lot more traffic going over the buses compared to very simple networking requirements. So we need to be careful, because as the networking models evolve, the computations are also getting bigger. At the same time we see an explosion in bandwidth capabilities and an increase in the amount of traffic that needs to be processed, and this is disproportionate to the amount of resources available on our platforms. So while the pipeline gets bigger, while our fabrics and bandwidth increase and the amount of traffic increases, our platforms are struggling to keep up, to ensure that we can still provide deterministic performance and abide by the SLAs we've agreed to. So why would we spend our platform resources on what we might call infrastructural boilerplate if we don't have to? We do have options in terms of disaggregation, where we can leverage the likes of hardware offloads, and this I think is something the Kubernetes networking model can easily benefit from.

So from a networking perspective, SmartNICs are being discussed and targeted at things like the edge, the cloud, and on-prem and hybrid models. But what do we actually mean by a SmartNIC, and how does a SmartNIC afford us the opportunity to accelerate not just the networking aspects but multiple aspects of the entire workflow? If you look for a definition of a SmartNIC, it's hard to settle on any one in particular, so I've put up a number of definitions you would probably come across if you start researching what SmartNICs are: a network-attached acceleration platform, a new processing environment, a target for a network pipeline, a programmable data plane, a location to run infrastructure management components (move some of the control plane aspects onto the SmartNIC). Another target for the SmartNIC, then, is as a guarantee of network integrity: move the root of trust directly into SmartNIC hardware. Let's provision our networking model, provision all of our policies, and enforce them at the SmartNIC layer, freeing up our platforms for processing. We take the trust and security model and move it down a layer into the actual SmartNICs. This allows the networking model to be programmed through trusted means, so that tenants can operate without breaching security boundaries.

When we look at SmartNICs, we also see different variations in terms of categorization: systems-on-chip and discrete versions. What this particular diagram is really trying to say is that the degree of smartness may vary. We need to look at things like configurability and offload capabilities, and we need to look at flexibility and efficiency too; there's a delicate balance between the two, and both are required. If we look at the system-on-chip model, we have the likes of programmable cores if we're using an ASIC.
We've also got the FPGA system-on-chip, which allows us to do configurable logic. If we move across and look at some of the discrete models, we have ASICs, which are more limited in terms of their flexibility, they're hard to change; then we have combinations like ASIC plus FPGA, which provide both the efficiency and the configurability or flexibility required; and then maybe a full-blown FPGA. As I said, both flexibility and efficiency are required. With this, we can look at things like performance, security and offloads, and these are the types of things we can move to a SmartNIC to allow Kubernetes to focus on what it needs to focus on: ensuring that workloads are running in their containers, in their pods, on their platforms. This type of deployment of SmartNICs really conforms to what we call a hybrid computing model, where we have cores available on our platform with memory and storage, and we also have cores and accelerators via the SmartNICs. These are domain-specific architectures, and the domain in our case, for this particular talk, is the network processing capability required especially by the telco domain. To keep it abstract for this talk, I'm going to define a SmartNIC as a platform that has processing capabilities and also offload acceleration capabilities.

So we spoke about the boundaries in terms of what our platforms at the edge should do, and we spoke about how we could provide some disaggregation at the edge. The edge, based on the diagram in the first slide, doesn't have an abundance of resources. It's not like a data center; it's constrained. But that doesn't stop us wanting to bin-pack as much workload as possible at the edge, to deliver on what edge computing claims to be able to deliver. We want to provide predictable models for deterministic performance. But how are we going to achieve this? How do we disaggregate concerns from our main platforms, alleviate the pressure on them, and move that work elsewhere? Well, we want to leverage the sidecar pattern but apply it to the infrastructure. Just as was done for layer-four to layer-seven applications with the service mesh architecture, deploying sidecars that take care of the data plane within the pods, we want to take the same concept and apply it to the infrastructure.

Now, if we do that, what does it yield for us? It means network flows can be programmed and offloaded. We have things like OVS with traffic control (TC) offload, and rte_flow for the likes of DPDK; we can offload these directly onto the SmartNIC in this case. Traffic can be forwarded between the physical functions and the virtual functions without going through software, so we have acceleration almost immediately: no software in play, purely a hardware concern. We have inline processing that can stay in the NIC, as opposed to look-aside processing, and transmit directly to the target, so there's no data movement back and forth with the host. We've eliminated that, and we've created the boundary I've been speaking about. It stays in the NIC, so the utilization of our platform cores is very, very low with this type of offload.
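To give a rough sense of how that hardware path is exposed to Kubernetes today, here is a small sketch in Go of a pod spec requesting a single SR-IOV virtual function through a device plugin resource. The resource name "intel.com/intel_sriov_netdevice" is an assumption, those names are configured per cluster, but the effect is that the scheduler only places the pod on a node where a VF is available, and the NIC can then switch that pod's traffic between PF and VF entirely in hardware, as described above.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// Device plugin resource names are configured per cluster; this one is only
	// an assumed example of what an SR-IOV network device plugin might advertise
	// for VFs on a SmartNIC.
	vf := corev1.ResourceName("intel.com/intel_sriov_netdevice")

	pod := corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "offloaded-cnf"},
		Spec: corev1.PodSpec{
			Containers: []corev1.Container{{
				Name:  "dataplane",
				Image: "busybox",
				// Requesting one VF steers scheduling to a node whose NIC can
				// switch PF<->VF traffic in hardware, keeping the data path
				// off the host cores.
				Resources: corev1.ResourceRequirements{
					Requests: corev1.ResourceList{vf: resource.MustParse("1")},
					Limits:   corev1.ResourceList{vf: resource.MustParse("1")},
				},
			}},
		},
	}
	fmt.Printf("container requests: %v\n", pod.Spec.Containers[0].Resources.Requests)
}
```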
We can also look at programming our security policies and having them managed by the SmartNIC: we can have our access control lists managed there and our network policies managed there. We can do eBPF offloads, so filtering, load balancing, monitoring, et cetera, all of the things that come with eBPF technology. And we can also leverage the SmartNIC to deploy observability pipelines. Observability by today's standards is very important. We're already in what I would call a reactive model, where we have the standard process of collecting information, generating some insights, and then trying to leverage them. It could be the case that we detect that a particular network slice in a 5G scenario is going to run into trouble delivering on its SLA, so we can use observability pipelines to ensure that we start to provision a new slice, or increase the capability or bandwidth available in that particular area. But that's reactive: we collect information and then make a decision or take an action based on what we've collected. What happens when we want to start moving towards proactive models, where we have intelligent systems deployed on our SmartNICs that are able to over-provision as needed, or move pods between different locations? All of this, again, is something that I believe will come to the forefront once we have fully embraced edge computing and the 5G solutions that will accompany it.

But the very important point here, in terms of disaggregating concerns and processes from the edge perspective, is that this shift is not new. We want to facilitate advancements in Kubernetes networking, but without the cost of the extra CPU cycles. This is a perfect fit for edge scenarios, and it follows the exact same patterns that have been applied across data centers and clouds for the last number of years. So SmartNICs provide programmable solutions: we can enhance our overall networking model while Kubernetes continues to orchestrate business value. And that's the key message here, and it is very applicable to edge computing.

So now that we have opportunities to enhance the networking model via SmartNIC offload, and with SmartNIC deployments for numerous targets like the edge and far edge, the cloud, enterprise, et cetera, what other offloads can we leverage? How can we also disaggregate other aspects, maybe even of the control plane? We can use things like dedicated FPGA devices, Quick Assist Technology (QAT) for things like encryption and compression, and GPUs for graphics-intensive processing or heavier mathematical computations like, for instance, machine learning. My point here is that data planes can be offloaded, but so too can control plane concerns like encryption and compression in the case of a QAT device, which can offload that work and save more cycles. This embracing of acceleration and potential offloads is, I think, paramount for edge computing to succeed, and Kubernetes is well equipped to handle these types of scenarios. What we need to bring to the table is these particular offload capabilities, and ensure we have successful orchestration in place, which I think is very possible, as SmartNICs are already deployed today. There have been numerous talks over the last number of KubeCons around SmartNICs and offloading things like OVS and the datapath, and so on.
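Coming back for a second to the idea of having our network policies and ACLs managed by the SmartNIC, here is a minimal sketch in Go of a perfectly ordinary Kubernetes NetworkPolicy. The point is that the workload and the API object don't change at all; whether the rules end up enforced in iptables, in eBPF, or pushed down as ACL rules into SmartNIC hardware is left to the CNI, and the hardware path here is my assumption rather than anything the API itself specifies.

```go
package main

import (
	"fmt"

	networkingv1 "k8s.io/api/networking/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// An ordinary NetworkPolicy: only frontend pods may reach backend pods on ingress.
	// Nothing in the API says where this is enforced; a SmartNIC-aware CNI could,
	// in principle, translate it into ACL rules programmed on the NIC.
	policy := networkingv1.NetworkPolicy{
		ObjectMeta: metav1.ObjectMeta{Name: "allow-from-frontend", Namespace: "default"},
		Spec: networkingv1.NetworkPolicySpec{
			PodSelector: metav1.LabelSelector{
				MatchLabels: map[string]string{"app": "backend"},
			},
			Ingress: []networkingv1.NetworkPolicyIngressRule{{
				From: []networkingv1.NetworkPolicyPeer{{
					PodSelector: &metav1.LabelSelector{
						MatchLabels: map[string]string{"app": "frontend"},
					},
				}},
			}},
			PolicyTypes: []networkingv1.PolicyType{networkingv1.PolicyTypeIngress},
		},
	}
	fmt.Printf("policy %q selects pods with labels %v\n", policy.Name, policy.Spec.PodSelector.MatchLabels)
}
```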
But this talk really is about enhancing Kubernetes as it's deployed at the edge, and how SmartNICs, as well as other offload technologies and hardware offloads, can provide new opportunities for us. So thank you very much. I hope you enjoyed this. Thank you.