I'm John Belamaric, and joining me today is Stephen Wong. We'll be talking to you today about some ways to utilize Kubernetes within the telco networking environment. A little bit about me: I joined the Kubernetes community a little over five years ago, did a lot of work in SIG Network, particularly around CoreDNS, and later moved to SIG Architecture, where I've been involved in the conformance project. I also initiated the production readiness review, and I'm a co-chair there in SIG Architecture. Stephen? Hi, as John said, I'm also from Google, but prior to Google I made my contributions more to the telco open source community, by co-founding the OpenStack Tacker project, which is the VNF manager. I was also one of the initial cores for Neutron networking-sfc, service function chaining, and I founded the OPNFV Clover project, which was trying to use cloud native open source projects to address telco use cases. Great, thank you, Stephen. And as I said, we're going to talk about deploying network functions across the entire set of compute resources from edge to core. So Stephen, why don't you take it from here. Next slide, please. So I think it goes without saying that the context of what we want to do in this particular presentation is network function centric. So things about deploying network functions, configuring network functions, how you manage network functions in general. In a 5G network, they are complex. One of the big reasons, and it goes without saying, is that on the 3GPP side they try to separate the control plane from the data plane. So now the EPC, the 5G core, is fully separated; the components are spread all over the place. And on the O-RAN side, they are also doing separations between control plane, user plane, and data plane.
So there is no longer a single piece of software; it's a fully distributed thing. And because 5G has so much more bandwidth, you are starting to run many things on the edge. And telco is not just a single edge, it's many tiers of edges. Many of those components are spread over the edges to meet performance and latency requirements. And that adds to the complexity of both deploying and managing those network function components. Next slide, please. Now, the transition from virtual network functions to more containerized network functions. One of the big things here is MANO, management and orchestration from ETSI. It's basically built on IaaS clouds, the infrastructure-as-a-service cloud. So the abstractions and everything are aimed at managing just those sets of hard infrastructure. But containers, particularly on Kubernetes, are very different abstractions and concepts. So when you transition from a MANO virtual infrastructure management interface to more containerized network functions, you start having these current hybrid models that you can see, where MANO is optimal for managing OpenStack-like things, OpenStack-like IaaS abstractions, whereas it's kind of a poor fit when you start looking at Kubernetes workloads. So what are we looking at moving forward in 5G networks? Because of the nature of containerized network functions becoming more ubiquitous, given that you're running on the edge, where the resource constraints make it more optimal to run containerized network functions, it's more of a unified cloud native management. That's what we're looking at: a world where you are using basically just one model, one management platform, and it can manage both VMs and containers in a very, very unified manner. Next slide, please.
So of course, given that we're here, given that the ONE Summit is actually an extension of KubeCon, we're talking about using Kubernetes to manage both the VNF workloads and the CNF workloads. But it obviously doesn't work out of the box. Kubernetes is very good at mapping the application needs to the infrastructure. And now the applications are the network functions, so their demands are not the standard single compute-and-infrastructure, simply-open-a-port-on-the-network, flat layer 3 network, which is what traditional Kubernetes is. So obviously, one thing that almost all network functions require is multiple interfaces. So things like Multus, and SR-IOV, which does kernel bypass, which is actually very tough for containers and requires you to actually create mappings directly from your virtual functions to interfaces, to physical interfaces. All those things need to be integrated into Kubernetes to make it a little easier to spin up those pods. You are actually looking at networking that is far more complicated. A CNF as an application is a lot more demanding on networking, and a lot more knowledgeable about the network. They don't just abstract the network into connectivity policies and port numbers. They actually have to physically configure a VLAN, a VRF, whatever little things on the network need to be in place; they have that knowledge. So all those things are obviously not part of Kubernetes today. So the idea we have here is extending Kubernetes to manage all the things that the network functions actually demand. Next slide, please.
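To make the multiple-interface point concrete, here is a minimal sketch, as plain Python data, of the Multus convention: a NetworkAttachmentDefinition that delegates to the SR-IOV CNI plugin, plus a pod requesting the extra interface through the standard Multus annotation. The names (`sriov-net`, `upf-0`) and the CNI config fields are illustrative assumptions, not from the talk.

```python
import json

# Hypothetical example: a Multus NetworkAttachmentDefinition delegating
# to the SR-IOV CNI plugin. The VLAN and names are placeholders.
net_attach_def = {
    "apiVersion": "k8s.cni.cncf.io/v1",
    "kind": "NetworkAttachmentDefinition",
    "metadata": {"name": "sriov-net"},
    "spec": {
        # CNI config handed to the SR-IOV plugin on the node
        "config": json.dumps({
            "cniVersion": "0.3.1",
            "type": "sriov",
            "vlan": 100,
        })
    },
}

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {
        "name": "upf-0",
        # Multus reads this annotation and attaches a secondary
        # interface wired to the sriov-net attachment definition.
        "annotations": {"k8s.v1.cni.cncf.io/networks": "sriov-net"},
    },
    "spec": {"containers": [{"name": "upf", "image": "example/upf:1.0"}]},
}

print(pod["metadata"]["annotations"])
```

The key point is that the secondary interface is requested declaratively, through an annotation the Multus meta-plugin interprets, rather than by scripting the node's network stack directly.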
But then if you do it with little knobs and little bits and pieces of the OpenStack model, the OpenStack Neutron model (and I mean nothing bad about OpenStack), the imperative model, where you create a Neutron network, create a Neutron port, and plug a Nova VM into a Neutron network through that port, is extremely imperative, extremely prescriptive. And in an environment where you're fully distributed, with the entire infrastructure basically hosting a set of abstractions of your network functions, that becomes very unscalable and extremely unportable. So the thing that Kubernetes promotes, and that we would like to see even the network functions adhere to, is more of an intent-driven model, where you say something like, hey, I want to deploy a 5G network on this set of clusters. In this little diagram, it's basically just telling you: I want to get something to drink because I'm thirsty, and then you go figure out how to get me something to drink. Or, I want a soda. Whereas saying I want a Coke, a 12-ounce Coke, in this particular cup, please wash it and bring it to me in this room: if you extrapolate that into what a network function's requirements are, it would never really fly, because the environments are just so much more distributed. There are multiple clusters that you have to manage, and you basically get into situations where it's very hard to manage, and any change requires a complete change of your current deployment scripts. Next slide, please. So what do we need? Basically, cloud management at every single tier. The concept of Kubernetes, where it's intent-driven, should be applied at every single layer: not just your infrastructure needs, but also the network function configurations, where it can actually be driving the infrastructure. For example, your IP address may have to be applied.
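The imperative-versus-intent contrast can be sketched in a few lines of Python. The `cloud.*` calls and the `FiveGCoreDeployment` kind below are hypothetical stand-ins, not real APIs; the point is only the shape of the two models.

```python
# Imperative, Neutron-style flow: every step is scripted, ordered, and
# must be re-ported for each site. (cloud.* is a hypothetical client.)
def deploy_imperative(cloud):
    net = cloud.create_network("n6-net")
    port = cloud.create_port(net, vlan=200)
    vm = cloud.boot_vm("upf", image="upf-img")
    cloud.attach(vm, port)
    return vm

# Intent-driven flow: one declarative record; a controller owns the "how".
intent = {
    "kind": "FiveGCoreDeployment",   # hypothetical CRD kind
    "spec": {"function": "upf", "clusters": ["edge-west-1", "edge-west-2"]},
}

def reconcile(intent, actual_clusters):
    """Return the clusters where the function still needs to be placed."""
    wanted = set(intent["spec"]["clusters"])
    return sorted(wanted - set(actual_clusters))

print(reconcile(intent, ["edge-west-1"]))  # -> ['edge-west-2']
```

When the environment changes, the imperative script must be rewritten; the intent stays the same and the controller simply computes a new difference.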
Obviously it needs to be applied at the kernel level so that you know how to address them. But it also, well not maybe, for sure, needs to be applied at the network function level, in the configurations. So at every single level, you want that to be managed in a uniform way. And to do that, obviously, there is a well-known way of extending Kubernetes' current resources, which is custom resource definitions. And CRDs usually come with operators now; the operator pattern is the predominant pattern for putting CRDs in place. So we're looking at a world where you start adding CRDs that address more than just your infrastructure needs, or your different types of Kubernetes cloud provider needs. The domain too: your RAN, your core, your 5G core, your 5G RAN, what is needed, what kind of configuration changes are needed, can now be expressed in CRDs, and then you can use operators to process those intents. And those things go beyond just a node; they go to an entire network, across different clusters on different sites. And so now you have no out-of-band configs at the user level, because the operator would be the one actually taking that intent and then talking to their own element manager, or using gRPC or NETCONF or some other things. And obviously this is way more than a Google-only effort. So we are doing this presentation as a way to start calling for industry people that think the same way to come together and try to make this happen. Next slide, please. And of course, this is just that one. Yeah, so exactly. So thank you, Stephen. So now we're at a stage, after we do everything Stephen just talked about, right, where we have at the different tiers along the edge, the remote edge and sort of tier one, two, whatever we want to call them, all the way up to say a hyperscaler region of a cloud provider, a consistent control point, right?
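A minimal sketch of the operator pattern just described: a level-triggered reconcile step that reads a custom resource and pushes the derived config southbound. `RanConfig`, its fields, and `push_netconf` are made-up stand-ins for a real CRD and a real NETCONF/gRPC call.

```python
def push_netconf(device, config):
    # Placeholder for a real southbound call (NETCONF, gRPC, element
    # manager API) to the network function. Here we just record it.
    device["applied"] = dict(config)

def reconcile_once(cr, device):
    """One pass of the operator's control loop: compare desired state
    from the CR against what the device reports, act only on drift."""
    desired = {"ip": cr["spec"]["ip"], "vlan": cr["spec"]["vlan"]}
    if device.get("applied") != desired:
        push_netconf(device, desired)
    return device["applied"]

# Hypothetical custom resource expressing network-function intent.
cr = {"kind": "RanConfig", "spec": {"ip": "10.0.0.5", "vlan": 300}}
device = {}
print(reconcile_once(cr, device))  # -> {'ip': '10.0.0.5', 'vlan': 300}
```

Because the loop only acts when actual state differs from desired state, running it repeatedly is safe, which is exactly what removes the need for out-of-band, user-level configuration.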
That's the unified cloud native management piece. And so we're done, right? We can now just deploy a network function anywhere. Kubernetes will pick it up. It'll render those infrastructure changes down below, all the way down as deep as we need, without any human intervention, and we can all just relax. Of course, that's not true; a lot more is needed. We need to be able to do the things that come before we put the network function at that tier. We need to figure out which tier to put it on. We need to look at the requirements, for the latency or whatever it may be, and use that to select the tier. We need to figure out, once we know we want it on, say, the far edge, well, where on the far edge? Because there are thousands and thousands of sites, and do we need it everywhere? Do we need it just in certain regions? So we need to be able to figure out those locations. Okay, so now we've picked the location, we've picked the function, we've picked the tier, and we need to specialize the config, because that function's config is going to be different for that particular tier and location and other factors than it would be anywhere else. Once we've figured out what the config looks like, we need to deliver it there. We need the control plane to get that configuration so that it can act on it. And of course, that doesn't even start to cover all the monitoring and service assurance and everything we need to do to operate that network function at scale, and just make sure that the network itself is up and running. So, we were given 30 minutes for this talk, so I'm not going to talk about all of that, and even if I wanted to, I don't have solutions for all of that. But what I will talk a little bit about is config specialization, that one line in there, config specialization. Why do we pick config specialization to talk about? Well, because we believe it's one of the hardest problems in there.
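A small sketch of what config specialization means in practice: a base configuration (plain data) is specialized per tier and site by a function rather than by hand-editing, so every concrete variant stays traceable to the base. The field names and tier defaults here are illustrative assumptions.

```python
# Base package: the generic form of the function's config.
base = {"kind": "Deployment", "name": "upf", "replicas": 1, "site": None}

# Per-tier defaults; a real system would derive these from requirements
# like latency or expected load at that tier.
SITE_DEFAULTS = {
    "far-edge": {"replicas": 1},
    "regional": {"replicas": 3},
}

def specialize(base, tier, site):
    """Produce a site-specific variant without mutating the base."""
    out = dict(base)
    out.update(SITE_DEFAULTS[tier])
    out["site"] = site
    return out

print(specialize(base, "regional", "us-west-1"))
# -> {'kind': 'Deployment', 'name': 'upf', 'replicas': 3, 'site': 'us-west-1'}
```

The base stays untouched, so the same specialization can be re-run for thousands of sites, and every variant can be diffed against the base to see exactly what was changed and why.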
All of those other problems are hard too, but this is one of the hardest ones. The reason is that it goes back to what Stephen started with, which is complexity. Configurations are really complex. So let's start to break down the problem and make it a little bit less onerous. We'll start by categorizing our configurations into two broad categories. One we call provisioning config. This is Kubernetes stuff, for the most part, when we're talking about a Kubernetes control plane. It's those manifests that we use to say what container to run, how to configure the node, how to configure the cluster, how to configure the underlying infrastructure. This is typically done today with infrastructure-as-code tools, and there are certain best practices out there for how it's done. The other piece is the network config. Stephen referenced this a little bit when he talked about how, when you change the IP address, you're going to have to let the kernel know, maybe, but you're also going to have to potentially reconfigure the function. So these are the actual network function configurations that realize the telco network, as opposed to the control plane network and control plane pieces. Today this is often done with NETCONF. So you actually talk to the actual container: it comes up, runs some service that's listening, and you talk to it with NETCONF and it changes itself internally. Or you SSH into that box and run some commands. Or there are vendor-specific element managers, vendors here being the network function vendors; they may have element managers that manage their network functions and allow you to configure them in different ways. Okay, so we have these two broad types of configuration, provisioning config and network config. Another dimension of this is the day-zero, day-one, and day-two considerations. Day zero being design: how do I hook all these things together?
What do I want, which function talks to which other function, what level of scale do they need, et cetera. Day one being, how do I actually provision these? That's more what we're talking about here. And then day two: how do I scale it? How do I make changes to it? Both of these types of configs vary based upon all the things we talked about before: the tier, the location, the specific function. Probably there are particular use cases and requirements for that particular network that we're trying to render. And they change in interrelated ways: you change one, you have to change both, because some of them change together. So like we said, lots of complexity. We can't address it all today. Let's drill into provisioning config and focus on one thing. And like Stephen said, this is nothing we could possibly do on our own; we're drilling narrower and narrower for this conversation, but as a community, eventually we would hopefully talk about all of it. Okay, so thinking about provisioning config: like I said, that's typically handled today with infrastructure as code. So basically, templates. You think about a template configuration. It's got some if-then-else statements in it, it's got some loops in it. It takes in a set of parameters and it renders a set of manifests at the end. So that works in simple cases. It works great in simple cases. It's pretty easy to understand. Conceptually, somebody new to this is sort of like, well, I just want to make it slightly different in this environment, and that's great. But as you get more and more of these sets of configuration, and more and more use cases where you want to deploy that same set of configurations, the complexity quickly ramps up. So these become thousands and thousands of lines.
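A toy illustration of the template approach being described: control flow lives outside the config, so every site-specific difference becomes another loosely-typed input knob. The parameter names (`tier`, `ha`) and the branching rules are invented for the example.

```python
from string import Template

# A tiny manifest template; real ones run to thousands of lines.
manifest_tmpl = Template(
    "kind: Deployment\n"
    "name: $name\n"
    "replicas: $replicas\n"
)

def render(params):
    # Typical template logic: branches keyed off untyped, unvalidated
    # knobs. Each new use case tends to add another branch or parameter.
    replicas = 3 if params.get("tier") == "core" else 1
    if params.get("ha") and params.get("tier") != "core":
        replicas = 2
    return manifest_tmpl.substitute(name=params["name"], replicas=replicas)

print(render({"name": "amf", "tier": "edge", "ha": True}))
# Debugging means tracing which combination of inputs and branches
# produced "replicas: 2" in the manifest that actually hit the cluster.
```

Even at this scale, the parameters form an ad hoc shadow API with no schema; the talk's point is that this is exactly what stops scaling once there are hundreds of such knobs.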
When you actually get it running on the cluster, if there's a problem with it, you have to backtrack somehow to figure out which of all of those parameters, where in all of those if-then-else statements, is the bug that caused this particular value to get delivered to the cluster based upon this particular set of inputs. And it can be really hard to backtrack that. This model also lends itself to what we call a black box model. You think of these sets of configuration as something you don't understand: I don't understand that, you just tell me what parameters there are, I'll just tweak these knobs on the outside, and you'll render it. But the reality is we have to support these networks that are running, these network functions that are running. So we have to understand, in the end, all of the stuff that's going on and running in these environments. Not only that, when we start to treat it as a black box, and we start to say, but we need to deploy this in these different tiers, in these different sites, for these different customers that have slightly different requirements, then all of a sudden the number of parameters explodes. And so you have to understand a new sort of shadow API of hundreds or thousands of parameters, which have no consistent schema, have no validations, and are just a hodgepodge of key-value pairs. And that, again, is just unmanageable from a debugging and a keeping-track and a policy enforcement and consistency point of view. So, okay, we've got complexity, complexity. I haven't offered you any solution yet, right? Let's just pause a moment and go back and think really basic computer science. What are machines good at, and what are people good at? People aren't good at handling all those parameters.
You know what, machines are really good at processing data, and one thing we've learned in our industry is that if we want those machines to be even more effective, and we want to be able to understand how to tweak the processing of that data, we can impose a little bit of structure on the data. So the examples here that I'd like to think about are file streams. In Unix, every file is a stream, it's just a stream. And so we've got tools like sed, which I believe stands for stream editor, which can just go through that stream and replace things. Okay, fine. Well, let's put a little more structure on it than a stream, because a stream is already a structure, but let's put a little more structure. Let's put line breaks in there. Now we've got lines. Oh, now we can do a few other simple things. We can count the lines. We can compare the lines and see if they're the same or different, and we know how to break it up. We can come up with tools like awk, which can do all kinds of crazy things on those lines. And then let's take it a step further. We take those lines and we break them into fields. We go from just rows to rows and columns, and now we've got this table structure. Wow, we've come up with SQL, and entire industries worth billions and billions of dollars are built on top of SQL, on this really pretty simple structure of just taking a file and breaking it into cells and tables, rows and columns. And we can create a whole language that can operate on top of those and build repeatable operations on top of those. So this is sort of the key insight: looking at this configuration problem, and the amount of complexity that we're talking about in this configuration problem, and saying, how do we apply the basic best practices of computer science, the best lessons we've learned in computer science? We do it by treating the configuration as data, and giving that data structure.
So in our particular case, that's the Kubernetes resource model. This concept of treating the configuration as data, and operating on that structured configuration as data, is, shockingly, called configuration as data. So like I said, the basic goal is to represent that config with a simple data model. We're choosing YAML and the Kubernetes resource model here. That gives us a well-defined schema, and KRM is also an extensible schema. And then we start to build tools on top of it. So some of the tools that have been built on this model are kpt and Kustomize, both open source tools. And once we have that, we're now working with our configurations in a design that's intended to support automation. So we can package up these configurations into packages, and those packages can be composed. They can be edited in place, both by the consumer and by the original author. So think about Git, for those of you who are familiar with the basic software engineering practice of source code control; Git is the most popular source code control manager these days. And you think about: you write some code, somebody clones that repository, they might change the code. You change your original code, and now people can merge that back down. Now, that's done in source code with just basic file merge, line by line, right? The structure in question there, as we talked about earlier, is line structure. With config as data, using a KRM and YAML model, we can actually take that a whole other step higher, because we understand more than just the line structure. We understand the field structure of that KRM resource, and we can actually have semantic knowledge. We can say, we know what labels are. Metadata labels, we know what those are. And so if I take this package and I clone it and I add some labels to it, and then the original author adds some labels in their package, when I go to merge it, I don't just have to overwrite my changes with what the original author now did.
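The "labels are additive" merge just described can be sketched in a few lines: because we know the field semantics of a KRM resource, the local fork's labels and the upstream author's new labels can both be kept, where a line-based merge would report a conflict. The label values here are invented for illustration.

```python
# Upstream package after the original author added a "version" label.
upstream = {"metadata": {"labels": {"app": "smf", "version": "v2"}}}
# Local clone of the package, where we added a "site" label.
local = {"metadata": {"labels": {"app": "smf", "site": "edge-west-1"}}}

def merge_labels(upstream, local):
    """Field-aware merge: labels are a map, so both sides' additions
    survive; local values win on a genuine key collision."""
    merged = dict(upstream["metadata"]["labels"])
    merged.update(local["metadata"]["labels"])
    return merged

print(merge_labels(upstream, local))
# -> {'app': 'smf', 'version': 'v2', 'site': 'edge-west-1'}
```

A line-by-line Git merge of the two YAML files would flag the `labels:` block as conflicting; knowing the schema turns that conflict into a clean, automatic union.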
I actually know I can merge those things, because labels are additive, right? So it's an even higher level of structure, and an even higher level of automatability, than we get out of just treating the files as basic structured text files. Okay, so what does this result in? The idea here is that, because of the machine processability of this configuration, we no longer feed some inputs in and get results out that look really different from the original package. They're all still structured. There are no if-then-else statements, there's no looping. There may be machine processing that results in a different config, but it's all traceable. And the idea is that it can be traceable all the way from the user intent down to that infrastructure intent. And that traceability and that declarative nature, we even referred to declarative earlier, talking about how Kubernetes is declarative, and how we can add our resource, our infrastructure resource configurations, into Kubernetes and it can declaratively configure the infrastructure. We can take that, that's sort of southbound, looking down. We can actually take that up one more level and say, now, with configuration as data, we can declaratively create the intent of how a constellation of network functions might hook together. And one of the great things about declarative is it really helps you with the day-two operations, because you can start to look at the intent. You can have an automated process that looks at the intent and looks at the actual state of the world, and then does an active reconciliation. That's what Kubernetes does. It happens to do it for containers; it happens to do it for compute, storage, and basic networking. We want to extend it for more networking, but then we want to extend it upwards, so that we can express the intent of those different network functions and declaratively, actively reconcile downward.
Okay, so a lot of abstract stuff, but I hope the message comes through: 5G networks are really complex, new use cases are making things more complex, and declarative management and configuration as data are proven to reduce complexity, because these are the ways that Kubernetes works on the set of data it already manages. We believe we can take these same concepts and apply them to network function provisioning, and potentially beyond that, working as an industry together. There have been existing efforts around this, a lot of them around the infrastructure pieces in particular. These are disjoint, though, meaning that different projects are creating them, running them for their own specific use cases, kind of siloed off from the big picture that we're talking about here. They're also incomplete, so you end up with gaps between them, at least incomplete in the sense of the vision we're talking about. So what we would like to propose is that we start a new open source initiative, and it's really looking at this bigger vision. How do we apply declarative management a little bit higher up the stack, for network function provisioning, and for provisioning sort of constellations of network functions, and have that render all the way down to the infrastructure, so that there are no out-of-band changes? There's no human going in and making any changes. We deliver this declarative intent, and we end up with a bunch of network functions running, talking to each other, and scaling out those little gears. So that's all we have for you today. Thank you very much. And please, my email address and Stephen's email address are there; reach out to us. We definitely know that we can't do this alone. We're going to need a lot of people from across the industry to make this a reality. Thank you so much. Thank you.