Go ahead and get started. Cool. Can we see everything in presentation mode? I can. Awesome.

So effectively, I'm going to be talking to you today about Network Service Mesh, which is a new thing that's emerging to deal with what some people might think of as the more esoteric networking problems in cloud native. Having presented it a large number of times now, it's finally occurred to me that almost everything goes better as a story. So this is a narrative introduction to Network Service Mesh that talks through it from the point of view of a developer who has one of the many classes of use cases that Network Service Mesh solves very, very well.

So, our protagonist. Meet Sarah. Sarah is writing a Kubernetes app to be deployed in the public cloud, and one of the pods in Sarah's app needs secure access to her corporate intranet. That's a networking requirement, and this is the way it looks from Sarah's point of view. She has her pod. It has its existing Kubernetes interface that interacts with the rest of the cluster and does all the things she knows and loves. And she needs some other thing that will give her an L2 or L3 connection back to her corporate intranet, and somewhere along there, security happens. She doesn't really know, or want to know, or care what constitutes security. So this is conceptually what Sarah wants from the system: her normal interface, plus a second connection reaching back to the corporate intranet.

Unfortunately, that is not what Sarah is going to get if we go with something that looks like Neutron. If you go that route, this becomes almost a developer's definition of hell. How do I find out what subnets the second interface connects to? Who defines that subnet? What size does the subnet have to be? OK, great, now I'm going to define interfaces from that subnet to a VPN gateway pod. Oh, crap. Routes. I was told there would be no routing. But in fact, if you want some traffic to go to your corporate intranet, you need some routes that point there, but not all routes. Replicas have this way of growing; if your subnet ends up being too small, now you've got to start all over again, and that means you have to change your interfaces again. And then God help you if you pick a subnet that's incompatible with something in the corporate intranet, which you typically find out as you go along. And God help you again if your corporate network guy re-IPs something in the intranet, and now that subnet is incompatible again. And then somebody decides they want to do something on your end about security, say for whatever reason they would like to have a first-pass firewall in the cloud, and now it becomes Sarah's problem to wire all of that in as well. It actually gets worse, because corporate intranets, as we all know, are not static. The prefixes in those corporate networks change, and that means the routes in Sarah's pods have to change.

So this is where we see Ariandre, the NSM spider, coming in saying, maybe I can help. So who are you, right? This is Ariandre, the NSM spider. It's the mascot for the Network Service Mesh team, because a spider for weaving a web seemed like a good idea. So what is Network Service Mesh? Because "network service mesh" contains "service mesh," it's a very natural thing to ask: is that like Istio? And the truth is that it is actually a lot like Istio.
All the key things that Istio does for TCP and HTTP connections, Network Service Mesh does for IP, Ethernet, and whatever other L2 or L3 protocols you actually care about. It turns out that most people most of the time care about IP and Ethernet, but there's a whole slew of others that people care about as well.

So this is Sarah explaining her problem to Ariandre. Sarah just wants to connect her pod, which does exactly what she wants it to do with Kubernetes, securely to the corporate intranet. That's all she wants. To do this with Network Service Mesh, effectively what you do is say: OK, we have a new resource. We're doing this with CRDs, called a network service. You give it a name, in this case Secure Intranet Connectivity. It has a spec with a selector that matches the things that provide it, and a definition of channels, and these channels are things that specify the payload that comes in. You can think of a channel as being analogous to listening on a TCP port, but it seemed profoundly unwise to, once again, further overload the word "port" in a network context. And of course, this looks a lot like Service resources in Kubernetes: you use selectors on pods to find the pods that provide the network service, and you expose channels that are like listening on ports. The one big difference is that in Network Service Mesh we talk almost exclusively about the payload. You'd say my payload is IP, or my payload is Ethernet. You wouldn't talk about any kind of an underlay technology, because that's not something Sarah actually gives a shit about.

So Sarah needs to deploy a deployment for the VPN gateway pod and a network service resource, and that's very close to being all that is actually needed. The truth is, as you'll see shortly, there is also a need for an init container and a config map telling it what to do. The init container is something that's being written by the Network Service Mesh project, so it's the kind of thing where you just roll out the standard container and drop in the config map that tells it what network services you want connected to your pod (a rough sketch of both pieces appears just after this exchange). Any questions so far?

I guess, to step back a little bit: Network Service Mesh as a project, where is it located, or stewarded, or governed? It currently lives on GitHub. One of the reasons I'm talking to you guys, independent of the fact that you're fun and I like talking to you, is that we're in the process of trying to figure out the right formal home for Network Service Mesh. We went and talked to Kubernetes SIG Networking, and SIG Networking feels strongly it should be a Kubernetes working group. As we bubble that conversation up, because they're in the process of redefining Kubernetes working groups, it's also not clear whether that is or is not the right home for it. So we're talking to various people to get advice on what they think the right formal disposition would be. But right now it's just a lot of code being written very quickly on GitHub by folks from a bunch of different companies. So far we've got participants from Cisco, Ericsson, Palo Alto Networks, Red Hat, AT&T, Bell Canada... I'm missing some. I'd have to go back to get the complete list of companies from which we have individuals participating, but there's a lot of activity, a lot of interest.
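For readers following along, a minimal sketch of the two pieces Sarah ends up deploying, as described above, might look something like this. The apiVersion, kind, and field names are illustrative guesses based on the description in this talk, not the project's actual schema.

```yaml
# Hypothetical sketch of the network service resource described above.
# apiVersion, kind, and field names are illustrative, not the real NSM schema.
apiVersion: networkservicemesh.io/v1alpha1
kind: NetworkService
metadata:
  name: secure-intranet-connectivity
spec:
  payload: IP                    # Sarah talks about payload (IP or Ethernet), never underlay
  selector:                      # matches the pods that provide this network service
    matchLabels:
      app: vpn-gateway
  channels:                      # analogous to listening on a port, without reusing the word "port"
    - name: default
      payload: IP
---
# Hypothetical config map dropped into Sarah's deployment for the stock NSM init
# container, telling it which network service(s) to request a connection to.
apiVersion: v1
kind: ConfigMap
metadata:
  name: sarah-nsm-config
data:
  networkservices.yaml: |
    - networkService: secure-intranet-connectivity
      localMechanism: kernel-interface   # most app pods just want a kernel interface
```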
Yeah, that's very good. On the implementation that's being used, is this based on any other open source projects, or is this a net-new initiative?

So I would draw a distinction between "can use" and "requires." For example, there are a lot of people involved in the community who like VPP as a data plane, so it is certain that that will be one of the data planes supported under Network Service Mesh. However, we also feel very strongly that we have to be data plane agnostic, and we're taking great care to be data plane agnostic. So you could say, can it use VPP? Yes. Does it require VPP? No. Does that make sense? Yeah, it does. From a control plane perspective, is there reuse there? Not a ton. You're getting into some interesting topics there that I'll touch on in later decks. There's a tremendous amount of flexibility, as you'll see as we go along with Network Service Mesh. One of the interesting bits of flexibility is that not only can you compose things together to create a network service, you can also compose the data planes and control planes semi-independently, which opens up a whole world of cool things you could do. But that is a much deeper conversation, one that I would love to have, but it's probably best to get the basic concept across first. OK, does that make sense? Yeah. But if you're getting at things like, could I use Network Service Mesh in a way that would advertise BGP routes for things? Yes, yes, you could. Is BGP a first-class concept in Network Service Mesh? No, it's not. We tend to be very compatible with the kinds of things you may want to do under the covers while preserving a very, very simple surface for actual consumers. Cool.

OK, so the next question Sarah would have is: OK, no interfaces, no subnets, no routes — how does all this work? And this is where we get to the basic concepts. Network Service Mesh has three basic concepts. You've already met the first one, the network service. This is the logical concept of the thing that you want. Now, it turns out that, generally speaking, app developers never want a subnet. I've literally stood in front of an audience of 600 app developers and asked, how many of you ever want to know that a subnet exists, and couldn't get a single person to raise their hand. That's not something they actually want. They want things much more like what Sarah wants: I want secure connectivity to my corporate intranet. It is, in fact, a service that happens to act on an L2 or L3 payload. So that's the first concept: this logical service.

Hey, Ed, not to play devil's advocate here, but something I've been dealing with at MasterCard is that I don't know if I want my developers to even know anything about the network. I think they should just be granted access to the corporate intranet and to the internet and to other segments, like databases. Do you really think there's even a need for developers to have to say, I want to subscribe to this channel and get access to this environment? Or do we, as network administrators, just make that all part of the definition of what the deployment needs?

So you can do that if you have sufficient control over the underlying network environment, but you don't always have that control.
And it sometimes turns out that the things you want to do end up requiring a degree of flexibility that does not play well with having total network control. Yeah, so on that note, you can see the need for both. You need something that lets them be more specific, where they can make their own definitions or subscribe to certain definitions, and there could also be a need to hide all of that under the covers from them. Right, right. And again, this is the semi-independently composable data planes and control planes in Network Service Mesh. One of the things you can do in Network Service Mesh is, as the network guys, still influence things under the covers. One thing that I sadly will not be presenting in this deck: how many folks in the room are familiar with segment routing v6 and SID stacks? Yep, definitely. Great. So one of the things you can actually do in Network Service Mesh: SRv6 can be thought of as just another tunneling option between Sarah's pod and whatever network service Sarah is dealing with. There are mechanisms in Network Service Mesh that would allow you to insert a proxy in between that could essentially both sniff the SID stack going back and forth and augment the SID stack. But I'm getting ahead of myself. Effectively, we have no problem in Network Service Mesh with the network doing work. It's just a sliding scale for us. You pick your poison.

So again, the network service is something like secure intranet connectivity. The second concept is a network service endpoint. It's a pod that's providing the network service that you want. Now, since you brought this up, I will point out that this doesn't necessarily have to be a pod. It could be something in your physical network. But for the moment, we're just going to talk about pods providing network service endpoints, because it simplifies the conversation. And this is exactly like Endpoints for Services. I think the only major difference is we got advice from the SIG Networking folks that effectively came down to: for the love of God, don't make it plural, that was a very bad idea. And so we have, in fact, not made it plural when we talk about resources. An example of this would be the VPN gateway pod, right? That's a network service endpoint.

And then finally, the third concept is this L2/L3 connection between your pod and the network service endpoint. It's tempting to think about this as an interface, and it can be an interface. Usually, for most application pods, this will get instantiated as a kernel interface. There are people who want to do NFV kinds of use cases where they want other kinds of mechanisms for their local connectivity. They may want memif or vhost-user or whatever, and there is sufficient flexibility built into Network Service Mesh that you can do that. But your run-of-the-mill developer is probably just going to get a kernel interface.

So what about subnets? One of the things that we actually realized fairly early on is... Go ahead. Ed, long time — this is Madhu here. It's exciting to see this. Just for me to understand: is this also applicable to Windows pods, or is it purely Linux so far? It could be applicable to Windows pods. I do not talk about Windows pods because, quite honestly, I am unfortunately relatively ignorant there.
If you are in a position to become a bit more involved, to make sure we stay on a track where we can also do this for Windows pods, I would be ecstatic. Come on, Ed, anytime. I'm sorry, what? I said anytime, I'd love to work with you. OK, awesome. If you could drop me a line, that would be hugely helpful so we can sync up, because I very much would like to make sure that we have a nice happy landing on the Windows side. My major problem is I quite honestly don't know enough to know whether or not I'm leaving the right architectural white space, and we don't have anyone else in the community who is a big Windows guy. So that would be very helpful to us.

Sure. What I'd ask is, if you go back to the previous slide: this is always a challenge. The kernel interface is probably easy to do on the Windows side. But the moment you go to the exotic things like memif and vhost-user, we have to understand what kinds of interfaces they have. If you look at the HNS APIs today, they're very network-centric. So we have to see what the equivalents are for such exotic things, even if they're not there currently. Yeah, and one of the things I would actually love to do with your assistance is try to make some of these things a little more generic. For example, memif is a fairly straightforward shared memory mechanism for two containers on the same box to communicate with each other. So if you have shared memory and you have something that looks like a Unix file socket, it shouldn't take a lot to make memif work on Windows. Yeah, that's right. Things like named pipes, right? The equivalent there of the Unix socket is a named pipe. And that's why, if you look at the terminologies — we have made this mistake in the past, at Docker, where we used Unix-specific terminologies, and finding the appropriate terminology on the other platform is hard at a late stage. So let's try and see if we can find a common ground there. That is actually a really good point. Yeah, I would love to work with you to make sure we don't fall into that error, because I'm acutely aware of how being parochial in one's view leads you to make decisions you didn't have to make, decisions that make life harder later. I call this architectural white space: 90% of the time you can take an action that is exactly the same cost as the one you might have otherwise taken, but six months from now, when you go to do the next thing, choice A is really expensive and choice B is really, really cheap. Let's make choice B, right? Cool. Thank you, I do appreciate that very much.

Hey, Ed, just one more question. Yes. So, in this particular example, where does the VPN gateway live? Is it somewhere on a public cloud? Because you're trying to connect to machines on the corporate intranet side. Where is it located — what are the possible locations, where do you support such things? OK, so Network Service Mesh itself is not the thing that's providing the VPN gateway pod. The VPN gateway pod is a network service. We connect network services, we don't traditionally provide the network services, right?
So, the choice of where you put the VPN gateway pod: probably it would be something that runs on the same cluster as Sarah's pod and then has a connection back to some VPN concentrator. I thought I had a picture that looked like that — actually, I don't, apologies. But effectively, yes, the VPN gateway pod probably is something that runs in the cluster and then does something like IPsec back to a VPN concentrator on your corporate site. But you have a lot of choices about how and where the network service is stood up. Does that answer your question? Yeah, yeah. I was trying to understand whether some of the constructs you're explaining here may require more support in public cloud. You may not have that much control over the network if you are running in a public cloud, as opposed to having your own network. You will see as we go along that we don't need that much control over the network. Network Service Mesh tends to be very opportunistic: if you don't have a lot, it works; if you have more, there are more things you can do with it. So let me skip ahead a few slides.

So, there was the question here: what about subnets? Network Service Mesh is dealing with L2/L3 connections that are point-to-point cross-connects between your pod and the network service you want, so you don't really have to think about subnets in that context. One of the interesting things here is that if you want a bridge domain, and there are people who do for all kinds of good reasons, that bridge domain itself is a network service, and all Network Service Mesh would do is connect you to the bridge domain network service. So by virtue of being point-to-point cross-connects, we don't have the same kinds of crazy subnet issues.

Then, of course, there's the question for Sarah of routes. Routes are a big pain for her — getting the right prefixes and that kind of stuff. In looking at these kinds of problems, one of the things we realized is that addresses and routes for the L2 and L3 connection naturally should come from the network service endpoint, like your VPN gateway, because your VPN gateway probably has a pretty good understanding of what IPs are going to be validly assignable for talking to your corporate intranet, and it probably has a pretty good understanding of what prefixes should be routed to your corporate intranet. That's not something you want to have to manage in the great giant IPAM in the sky. It really is a contract between your VPN gateway and Sarah's pod. So from Sarah's point of view, that is something that should be handled automatically. There's a little bit of thought required on the part of the person who is deploying the VPN gateway pod.

And then, finally, there's this new firewall pod that her security people want to stick before the VPN gateway pod. Now, Network Service Mesh is a mesh. So far, I've just shown you single point-to-point connections. In this example, once you introduce that firewall pod, the firewall pod and the VPN gateway pod are working together to provide the secure intranet connectivity service. From Sarah's point of view, it's just one service; the fact that there are multiple things composed to provide it is not really her problem. So the question becomes, how does that work? And the answer, of course, is we do what we always do in these situations.
We do analogies to what's going on in service mesh. So, how many folks here are familiar with virtual hosts and route rules in Istio? Me. Cool. I would expect that's most of us. Effectively, what they let you do is bring in policy that influences how connections to services happen. Network Service Mesh has an analogous concept. We call them network service wirings, and they are very much like the virtual hosts or route rules that allow you to bring policy to how L2 and L3 connections happen when someone reaches out to a network service.

So just to walk through an example, say we want the situation where you've got Sarah's pod connecting to a secure intranet connectivity service, but under the covers it first hits a firewall pod, and the firewall pod talks to the VPN gateway pod. For that, you would need two network service wirings (a rough sketch of both appears right after this exchange). The first one, which I have creatively named secure intranet connectivity wiring one, has a target of secure intranet connectivity. In other words, if you are trying to reach the network service secure intranet connectivity, this network service wiring might apply to you. Wiring one has a qualifier which says: if your source is not something that provides secure intranet connectivity — like, for example, Sarah's pod — then this policy applies to you. And then the action is to route to a destination with a pod selector, basically on the label firewall equals true. So if you are Sarah's pod and you reach out to connect to secure intranet connectivity, you will be connected to something labeled firewall equals true. This picture shows it graphically: Sarah's pod is trying to connect to secure intranet connectivity, and because it is not itself providing secure intranet connectivity, it matches secure intranet connectivity wiring one and therefore gets connected to the firewall pod. Makes sense so far? OK, it makes sense. Cool. So then the next question is, how does the firewall pod get connected to the VPN gateway pod?

Before we move on to that, tell us a little bit more about the actions. Can you — appending may not be the right word — but can you sort of chain these together? So it depends on what you mean by chaining. By chaining, do you mean: can you take multiple actions, so you route to a firewall pod and then you do something else? Right. Yes. We're very, very early in our thinking about this, and we would welcome more participation from folks thinking a little bit deeper about it. I will tell you right now, the reason we call this out as an action is that one of the use cases people would like to be able to handle in the future is an action that basically says: please go spawn a firewall pod in this cluster if you don't have one already, and then route to it. Essentially, the dynamic spawning of network service endpoints is something that's on our radar. In that case, this would work out as a chain, where your first action would be spawn-if-not-in-existence and the second action would be the route. But that's not fully thought through yet in the community. It's been bandied about, but it's not fully thought through, and we would welcome more participation from folks to work on it. That definitely is an interesting area for me — it's what I'm working on right now.
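As a reference for readers, here is a rough sketch of the two network service wirings in this example: wiring one, just described, and wiring two, which is walked through next. Again, the kind and field names (target, qualifier, action) simply follow the wording of the talk and are illustrative, not a published schema.

```yaml
# Wiring 1: anything that does NOT itself provide the service (e.g. Sarah's pod)
# gets routed to the firewall pod first.
apiVersion: networkservicemesh.io/v1alpha1
kind: NetworkServiceWiring
metadata:
  name: secure-intranet-connectivity-wiring-1
spec:
  target: secure-intranet-connectivity
  qualifier:
    sourceNotProviding: secure-intranet-connectivity
  action:
    routeTo:
      podSelector:
        matchLabels:
          firewall: "true"
---
# Wiring 2: requests coming FROM the firewall pod are routed on to the VPN gateway,
# so the firewall pod never needs to know the VPN gateway exists.
apiVersion: networkservicemesh.io/v1alpha1
kind: NetworkServiceWiring
metadata:
  name: secure-intranet-connectivity-wiring-2
spec:
  target: secure-intranet-connectivity
  qualifier:
    sourcePodSelector:
      matchLabels:
        firewall: "true"
  action:
    routeTo:
      podSelector:
        matchLabels:
          app: vpn-gateway
```

The shape is the point, not the exact spelling: the chaining policy lives entirely in the wirings, so inserting a new hop later (like the IDS pod discussed below) means adding a deployment and a wiring, without touching Sarah's pod or the firewall pod.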
Oh, it ends up being massively interesting for all kinds of folks, not least because if you can provide scope there — if you can say, essentially, I'm interested in this auto-spawning behavior at cluster scope or at node scope — it allows you to deploy network service endpoints sparsely. Imagine I've got a distributed bridge domain that I want to have as a network service, and for this to work really efficiently I would like a network service endpoint on every node where someone is asking for it. I don't want to run 5,000 network service endpoints for this thing all over my cluster when I might have 20 nodes that actually need the damn thing running. That would be horrible. So that kind of problem is what we're thinking about here. But anyway, I've wandered a little bit into the weeds.

So, how does the firewall pod get connected to the VPN gateway pod? This is a totally reasonable question, and for that we introduce a second network service wiring, secure intranet connectivity wiring two. Its target is exactly the same, secure intranet connectivity. But in this case it has a qualifier that says: your source has pod selector firewall equals true. What this allows the firewall pod to do is to say, OK, I'm handling firewall behavior, and I realize that I need secure intranet connectivity myself, because I don't really do that. So I'm going to reach out and ask to be connected to secure intranet connectivity, and this network service wiring will then match that request and route it to the VPN gateway pod. Now, one of the lovely things here is the firewall pod does not have to have any comprehension that the VPN gateway pod exists. And if I decide I want to insert a new pod between the firewall pod and the VPN gateway pod, I just have to change my network service wiring. I don't have to change my firewall pod; nothing about my firewall pod changes. Graphically, it looks like this: you've got your firewall pod reaching out for secure intranet connectivity with an L2/L3 connection, it matches secure intranet connectivity wiring two, and it gets connected up to the VPN gateway pod. Makes sense? Cool.

So, the next question Sarah has, because this has been a pain point for her, is what happens when IT decides to put something else in there for more security — say IT decides they now want an IDS pod in the chain. Effectively, you just need a deployment for those new pods and a network service wiring that connects them, and that's it. There are no interfaces, no IPs, no subnets, and no routes that Sarah has to worry about in any of this. All of those concepts continue to not be part of the northbound API for the application developer. And you don't need a new version of Kubernetes. This works with stock, unaltered Kubernetes, and you don't have to use a specific magic CNI plugin, because Network Service Mesh does not actually use CNI. CNI does a great job for Kubernetes networking; it turns out to be the wrong tool for solving these kinds of problems. Which sort of begs the question, how the hell does this actually work? So we're going to drill down a little bit, because Sarah is inquisitive. Yes?

Yeah, sorry to interrupt you. You said that there is no CNI required, but I think you're also saying that if there is already a CNI installed in the Kubernetes cluster for some reason, Network Service Mesh will work with it. Am I right? Yeah, absolutely. Let me be very specific.
Network Service Mesh does not do CNI's job at all. It's completely orthogonal to CNI. Clearly, if you want Kubernetes networking, you have to have some kind of CNI installed. We literally don't care which, because we don't do our work via CNI. So, to make a slightly political joke: if you like your CNI, you can keep your CNI. And we, in fact, don't provide a CNI at this time. Yeah, thanks. Everybody likes CNI. Come on. Hey, man, I am not here to speak ill of CNI. It works for the Kubernetes networking that we have today, and one of the reasons we work with completely unaltered Kubernetes is that we are not trying to change anything about existing Kubernetes networking. It works well; let it continue to do its job. Neither are we looking for alterations to the device plugin mechanism or any of the rest of that. We can use those things, particularly for handling physical NICs and SR-IOV. We think they're great. We don't need them to change.

So how does the magic work? You've surely seen this picture before, right? It's from Phil Casado's presentation, where you have the sidecar in the picture. In Network Service Mesh, we have something kind of like this that does service discovery and routing. It's called the network service manager, and you run it as a daemon set, so you have one on each node. Now, we have a slightly different situation here than you have with service mesh. With service mesh, everything is running over TCP, and every kernel has a TCP stack, so the connection management piece doesn't have to be handled by the thing that's handling your service discovery and routing. In Network Service Mesh, we do have to handle the connection management, because there is no one true connection manager for L2 and L3 connections in networking. In fact, there are at least 50 of them, all of which have active advocates, and all of which are actively at war with each other. And we just don't want to get into that.

So the NSM init container, when your pod comes up, reads the config map to figure out what network services Sarah's pod needs to be connected to — in this case, the VPN gateway. And it sends a gRPC call to the network service manager to request an L2/L3 connection to the secure intranet connectivity network service. The request connection carries any information needed to be clear about how you want the connection to be handled locally in your pod. This is where you would specify what we call the local mechanism. You would say something like: I would like a kernel interface, and I would like that kernel interface to be named eth2; or, I would like a memif; or whatever. You basically give a preference-ordered list of what you would like in terms of the local mechanism for connecting this network service to you. We're going to talk first about the case where the VPN gateway pod is on the same node, just because it's the simpler case; we'll talk later about node to node. If the VPN gateway pod happens to be on the same node, then you send a request connection to it, and it sends back an accept connection, again over gRPC. And the network service manager creates and injects the interface, whatever mechanism it is, into the VPN gateway pod from the data plane, creates and injects the interface into Sarah's pod, and cross-connects them. It's a very simple cross-connect. And how long does this take?
The init container, once it reads its configuration, runs very shortly after Sarah's pod comes up. Yeah, just like init containers. That's the way init containers work. Now, I will point out, if Sarah wanted to write an ultra-smart pod that dynamically requested connections to network services throughout its lifetime, she certainly could. But for this use case, I can't imagine why she would want to. This is very much take something off the shelf, configure it, put a config map in, and go.

So the blue block here, the NSM — is that something of the control plane, deployed on a per-node basis? Yeah, it is very much a distributed control plane for cross-connects. OK. But it's only doing the cross-connects piece of it. The other nice thing is this means the network service manager is the only thing in this picture that has to have any kind of privilege in the system. There are people who've done things where you insert a privileged init container into Sarah's pod; I know that right now the Istio guys are trying to dig themselves out of that hole, because it's dreadful security-wise. But here, the network service manager is the one doing all the manipulation of the data plane and the injecting of the interfaces. And it's actually pretty agnostic as to what that data plane is. It could be the kernel, it could be a vSwitch — it could be OVS, it could be VPP, it could be whatever it is you happen to be using.

OK. The VPN gateway pod — is that necessarily deployed as a daemon set, because you have user pods on that node that want to take advantage of that network service? No, no, absolutely not. You would not necessarily deploy the VPN gateway pod as a daemon set at all. I'm just progressively building up the example, which is why I show them on the same node. Next up, I'll show the case where the VPN gateway pod is on a different node. OK. With new ideas, you want to build them up bit by bit. Yeah. Cool. So basically the NSM in this case does both the service discovery and sets up the connection. Because, again, there is no TCP for L2 and L3 connections.

So, getting to your question: what if the VPN gateway pod is on a different node? All right, great. In that case, it starts out the same. It looks exactly the same to Sarah's pod and its init container; they literally cannot tell whether the VPN gateway pod is on the same node or not. The init container sends its request connection to the NSM on node one, NSM1. And in the Kubernetes API server, where we have these resources for network service endpoints and network service wirings, NSM1 looks up — or more likely reads from its cache — the network service endpoints that are providing the service, looks through the network service wirings to identify an appropriate candidate, and figures out that there is an appropriate candidate on node two that is managed by NSM2. Part of the network service endpoint tells it how to talk to NSM2. So NSM1 sends a request connection to NSM2 over gRPC. Now, this is really close to the request connection that gets sent from the init container to NSM1, but the mechanisms are different. Rather than talking about local mechanisms like kernel interfaces or memif, in this case you're talking about remote mechanisms, typically tunneling mechanisms, like VXLAN and GRE.
And so in that request connection, you send a preference-ordered list — these are the mechanisms I would prefer, these are the constraints I have on the parameters for them, et cetera — from NSM1 to NSM2. NSM2 then goes ahead and does exactly the same thing it would do in the same-node case with the VPN gateway pod. The VPN gateway pod literally can't tell whether you're on the same node or a different node. NSM2 creates and injects the interface, sets up its end of the tunnel, and then sends an accept back to NSM1 with the selected tunnel mechanism and tunnel parameters, based upon the preferences it got from NSM1. NSM1 then creates and injects the interface into Sarah's pod and completes its end of the tunnel, and we now have a cross-connect. Questions?

Yeah, I don't know if maybe I missed a point, but how does NSM1 know that the VPN gateway is on node two? It looks up the network service endpoints for secure intranet connectivity in the API server. So, step two. That's step two, OK. It looks up the network service endpoints and it looks up any applicable network service wirings. Logically it's looking them up in the Kubernetes API server, but we all know that everybody keeps a local cache, so it's relatively light.

So here I have a question. If the VPN gateway pod is on the same node, we won't create a tunnel, right? We only create a tunnel if it's on a different node. I'm sorry, I didn't quite understand you, you broke up. Could you repeat that? If the VPN gateway pod is on the same node as Sarah's pod, we won't create a tunnel — we create a tunnel only if they are on different nodes? Oh yeah, you only need a tunnel between them if they're on different nodes. If they're on the same node, you can just do the cross-connect. So who is creating this tunnel? Is it NSM or Sarah's pod? The tunnel is being created by the network service managers, not by the NSE, because the network service manager is the one that is talking locally to your node's data plane, whether that's the kernel or a vSwitch. The VPN gateway pod just knows that it's getting an interface injected for the network service. So only the NSM knows that the two pods are on different nodes — neither Sarah's pod nor the VPN gateway pod knows about it? Exactly. Sarah's pod doesn't know, the VPN gateway pod doesn't know. Now, we do have mechanisms, which I don't talk about in this deck, to allow, for example, Sarah's pod to ask to please be connected to somebody on the same node if at all possible, because there are definitely use cases where you care about locality. But in this very generic use case, yeah.

This also has an interesting benefit, by the way, if you've ever had the experience of following the bouncing tunnel preference, where somebody decides that MPLS over GRE is the cat's meow, and then a year later they decide that MPLS over UDP is what they really want. Sarah's pod and the VPN gateway pod have no notion of what the underlay is in terms of the tunnel selected. So if you want to introduce a new tunnel mechanism, all you have to do is teach the network service mesh. You don't have to update the hundreds and hundreds of possible network service endpoints or pods in order to use it. They will just get whatever falls out of the negotiation between the NSMs.
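To make the request flow above concrete, here is an illustration of the information carried in the two gRPC requests: the local one from the NSM init container to NSM1, and the remote one from NSM1 to NSM2. This is not the actual protobuf API — it is just the fields described in the talk, rendered as YAML for readability, and the mechanism names and parameters are assumptions.

```yaml
# Local request: NSM init container in Sarah's pod -> NSM1 on the same node.
# A preference-ordered list of local mechanisms.
requestConnection:
  networkService: secure-intranet-connectivity
  localMechanisms:
    - type: kernel-interface
      parameters:
        name: eth2             # "I would like a kernel interface named eth2"
    - type: memif              # fallback: shared-memory interface for VPP-style workloads
---
# Remote request: NSM1 -> NSM2, when the chosen endpoint lives on another node.
# Same shape, but the preferences are tunneling mechanisms rather than interfaces.
requestConnection:
  networkService: secure-intranet-connectivity
  remoteMechanisms:
    - type: vxlan
      parameters:
        vni-range: "5000-6000"   # illustrative constraint on tunnel parameters
    - type: gre
# NSM2 answers with an acceptConnection carrying the selected mechanism and its
# parameters; neither Sarah's pod nor the VPN gateway pod ever sees the tunnel.
```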
Cool, any other questions? So there's a natural question here: how does the network service endpoint resource get into the API server? And that's fairly straightforward. When the VPN gateway pod comes up, it sends a gRPC call to its local network service manager saying, hey, I'm exposing a channel, and the network service manager goes and creates the network service endpoint in the Kubernetes API server, so that it is present there to be discovered by an NSM when the time comes. And I believe that is my last slide. So if folks have questions or things they want to discuss — and I want to make sure I capture the folks who are interested in getting involved, particularly the gentleman who was interested in helping out with Windows, to make sure certain other aspects are covered in a way you think is reasonable. Do we have other questions from folks?

So, Ed, in your diagram you had mentioned the data plane could be either some kind of vSwitch or the kernel. Does it also work with the normal Linux bridge that comes by default? So you actually can cross-connect things via whatever data plane you would like. The data planes we are currently looking at: we definitely have people who are interested in doing it directly via kernel cross-connects, with veth pairs and the like — that's one. We definitely have people who are interested in doing it with VPP. We actually have a lot of interest in some of the things that I didn't cover here about how you use Network Service Mesh with physical NICs and SR-IOV. And we've definitely had people express interest in using OVS as the data plane. So we're carefully architecting it so that the data plane is pluggable and you can simply use whatever your desired local data plane is.

OK, so does that mean you are making the NSM really robust, so that it is able to go and talk to these different data plane options and program them? That is the goal. Like any open source project, it's going to depend on people showing up to write that little bit of data plane code. But this hopefully also makes the Windows discussion easier, because I know Windows has quite different data planes than we have in Linux. It also allows you to make different choices depending on your needs. If your needs are relatively lightweight, the convenience of using the kernel as your data plane is probably going to win out. If you have really intensive needs, you may want a stronger data plane that is more efficient and more performant, something more like VPP. People will have different needs, and their needs will evolve over time, and what you'd ideally like to do is allow them to select the thing that meets their needs and evolve it easily over time. Any other questions?

Yeah, one more question. You said the NSM will set up the interface and the tunnel on both sides, for the VPN gateway and also for Sarah's pod. So where is that actually set up? Is it inside the pod or on the host? Are you asking where the tunnel is terminated? Yeah — is it set up inside the pod, or inside the host or VM where the pod resides? The tunnel interface. So basically, what the pod sees is some kind of mechanism, whether it's a kernel interface or the Windows equivalent or a memif or whatever thing was negotiated there. The actual termination of the tunnel happens in the data plane. So the pod does not see a VXLAN tunnel. So what does the pod see then? I'm sorry, what?
What does the pod see then? The pod would see, in this example, a kernel interface. That's what the pod sees. The VPN gateway pod sees a kernel interface. The fact that it happens to be, just to pick a really bizarre example, an MPLS over GRE tunnel between them is nobody's business but the network service mesh's, right? So, by the way, I do want to make sure I capture this: whoever it was from Microsoft on the chat, could you drop me a note on how to get in touch with you? Because I do want to make sure we leave space for Windows. Yeah, OK, cool. So, do folks have other questions? Do you have comments? Are there other use cases you want to bring up?

I guess one of the things I wanted to bring up at this point was: where do you see this fitting in? If we were to look at this inside the CNCF, where would you see it fitting? Well, there are some interesting things about it. Things that I did not have space for in this deck because of time constraints: it turns out that you can have something we call an external network service manager, an ENSM, that manages physical network stuff and looks exactly like any other network service manager from inside the cluster. So you do have the ability to interact outside of just the Kubernetes cluster, and there's potentially a great deal more breadth here than just Kubernetes. So part of what we're looking for is advice as to what the best formal vehicle forward would be, from your point of view as the CNCF networking group. Effectively, the advice from SIG Networking was: we think you should become a Kubernetes working group. I'm curious what the advice is from the CNCF networking working group. What is it? We've identified four possibilities so far. One is a Kubernetes sub-project under SIG Networking. One is a Kubernetes working group. A third would be a CNCF working group, and a fourth would be a CNCF project. And we're just trying to sort among those options based upon the feedback we're getting from the broader community, because we feel it would be useful to have a formal home, but we're not hugely pedantic about what that ends up being.

Yeah, you actually touched upon an important point. That is, how do you use this mechanism to leverage existing virtual appliances that might be deployed in the network? Because most adopters in the cloud environment come and drop their clusters into an existing network that already has all these appliances in place. So a mechanism to also direct traffic to, say, an appliance outside the cluster, and then go out to other places from there, would have a lot of value. Yeah, give me one second to stop sharing and start sharing. Well, actually, I don't remember quite where that slide is. But effectively, what you have is a physical appliance out there in the world. You just end up deploying a network service manager that speaks the network service manager to network service manager gRPC API on one side, so that other network service managers can talk to it, and then does whatever it needs to do to manage those physical appliances on the other side. And we're quite agnostic as to what that is.
So, if the VPN gateway pod in this example were actually a physical box being handled that way, then effectively you would still have Sarah's pod talking to NSM-1 and NSM-1 talking to NSM-2, but rather than NSM-2 talking to a pod on the node, it would go and speak NETCONF/YANG or some other set of mechanisms to configure that physical appliance and the connectivity to it. Yeah, or in the case of a cloud environment, maybe it would have to talk to the cloud's control plane, because the appliance is already there, so it's just a matter of going and configuring it. Yep, yep, absolutely. And the thing is, we're completely fine with that. This is part of why the pod-to-NSM and NSM-to-NSM APIs are kept separate: you can have something that looks like a network service manager to the outside world but is managing, frankly, anything it needs to manage on the other side of its communication, right? And then Sarah's pod doesn't know any different, NSM-1 doesn't know any different. It all looks the same. And that ends up being massively powerful when you're dealing with existing appliances, because again, we are just looking here at simple cross-connects. Yep, cool.

I guess maybe a couple of closing pieces, then. One is just to summarize the project overall: NSM is about bringing NFV to cloud native, or cloud native to NFV, whichever direction you want to state it. But the particular case we went through today was less necessarily NFV and probably more of a concern for those who need to bridge disparate networks and data centers — the common use case being, I have some stuff on-prem and I'd like to access my cloud-based container deployment, which makes a lot of sense. The other use cases NSM facilitates, my hunch is, are things that are layer three and below — MPLS and things that are service provider oriented. It's not that it doesn't serve things above layer three, but the distinction between what you get out of NSM versus other common service meshes is: well, hey, those other service meshes don't address layer two and layer three. Right now they don't facilitate — I'm assuming this is something like an IPsec-based VPN or whatever it is — they don't facilitate that. It very well could be; we're happy to negotiate whatever connections are doable on both ends. So you're absolutely right: NFV is certainly one of our use cases, but it turns out there are lots of other use cases that look sort of like the one I've shown here for illustration, which are more enterprise-oriented use cases. One of the other sets of use cases people have brought up: you've got people with existing OpenStack clouds who'd like to connect a pod to a Neutron network. They don't want to have to backhaul the entire Neutron concept space into Kubernetes, because that sort of gives you exactly the hell that Sarah was talking about. Well, a Neutron network is a perfectly fine network service as far as we're concerned, and that would be a case where you would have some kind of ENSM, or external NSM, to connect you to it. We've got people who want to do bridge domains. We have people who want physical NICs or SR-IOV NICs providing connectivity to specific networks, like a radio network service or that kind of thing.
You know, so we've got a very broad set of use cases. You're absolutely right, though, that the demarcation for us is more L3 and below, because the existing service mesh stuff that Istio is doing — they're doing a kick-ass job for L4 through L7. They really are. There's absolutely no point for us to go try and play in that space. But it should also be pointed out that, because of the miracle of layering in networking, you could deploy a traditional application service mesh at L4 through L7 over a network service mesh at L2 and L3, and that works perfectly fine. And we actually have people who want to do that. Our basic position is: isn't it wonderful when things work? Yeah, that is pretty cool.

You know, something that sparked my interest, and that I think has a lot of potential play, is the bridging between this world that we just walked through and the physical infrastructure, with the management of those networks. Oh, yeah. Yeah, no, that ends up being really, really cool, because again, the world continues to look simple to all the stuff in the cluster, even though there's something horrendously complicated on the outside. You insulate them from it by having very simple APIs and by dealing with very simple cross-connects. I was literally, just before I got on this call, chatting with somebody who has a MACVLAN use case where he's got a physical trunking network coming into a node. He wants to deploy a pod that consumes that physical VLAN trunking network service and then also exposes a virtualized VLAN trunking service to other pods — that kind of stuff. So there's a huge amount of flexibility here, and we address a bunch of use cases in a way that, I mean, we are trying very hard to feel cloud native, instead of backhauling the slap-a-V-in-front-of-it, cloud 1.0 concepts.

Yeah, so you were saying the Kubernetes project is interested — I mean, clearly they're interested in things like bare-metal provisioning of nodes to get clusters up so that people can use Kubernetes — but it sounds like they've also shown interest in the physical, kind of NFV aspect, of having a separate SIG that... Right. So, being careful not to speak on behalf of Kubernetes, because I'm in no way able to do that, but speaking as someone who has been an observer at many of the SIG Networking meetings, many of the Resource Management working group meetings, and many others: my perception is they readily acknowledge the validity and utility of the use cases themselves, but they are looking for solutions that do not require them to do radical overhauls of the system. By radical overhauls — there's literally a proposal right now that involves taking the device ID that is used between the kubelet and the device plugin in the device plugin API and adding it to the pod spec. They don't want to do that kind of shit, but they would like to solve the use case, and we're trying to provide a way that solves the use case, does not actually change what they're doing, and is relatively clean. So, in some respects, this is to facilitate migration from, or integration with, pre-existing technology and deployments — OpenStack being a good example. Maybe — I know we're out of time. All right, out of time.
Maybe there's a couple of other questions, I think, for me to be able to give them good feedback about where the best home would be — whether or not a sub-working group here would make sense, or something along the lines of the meetings that are being held today for the Ligato community, which I'm just mentioning because that's kind of the umbrella under which NSM falls today. How active are those discussions? Quick one: when do you guys meet? So we meet every Friday morning at 8 a.m. Pacific. Let me actually stick a link into the chat for the basic jumping-off point. So basically, this is your basic jumping-off point. It lists out in the readme a bunch of the different collateral — there's much more collateral than I showed here — and gives you the link to the weekly meetings, the calendar, the mailing lists, and pointers to the IRC channel. As you all know, communities tend to differ in their personalities. The Network Service Mesh community seems to be a very IRC-oriented community, for some reason, as it turns out. So, yep.

Just one more question. Have you thought about, or is this robust enough to expand to, other orchestrators as well, not just Kubernetes? OK, so expand on what you mean by other orchestrators — that sounds fascinating. Like something like Mesos, for example. Oh, yeah. So I would imagine the general pattern to be easily adaptable to something like what Mesosphere is doing. I'll be honest, I don't know enough about Mesos to know quite where the sharp pointy bits are going to be. But I imagine it probably could be, and that's probably something else where getting someone involved in the community early would help, because there is a tendency to glom onto the Kubernetes concepts. And I have no doubt that we are glomming on in places more strongly than we have to — it's just human nature — and maybe glomming a little less strongly in places would make it easier to bring the same pattern to other orchestrators. Thanks. So, yeah.

We've gone over by a few minutes, so I'm going to go ahead and close out. That's fair. Apologies, I let it run over. There was definitely good interest — a lot of questions, a lot of use cases to go through. You guys can feel free to hang on if you have some more time, Ed; just because I have to drop doesn't mean everyone needs to go, and I think the recording will keep going after I leave. But thanks for your time today, Ed. We'll follow up with some other discussions we want to have, and I think we'll get folks involved in Network Service Mesh so we can sync up. Yeah, please do, because again, I'm really excited by some of the things on the Windows and Mesos front, where by getting early participation, we can avoid making life hard later. That would be wonderful. Great. Thanks, everyone. All right, cool. I actually do have to run at this point, but any closing questions before I do? Thanks, Ed. This was good. Very interesting. Excellent. Talk to you later. Bye-bye.