I'm Shane Utt. I work at Kong on Kubernetes networking; I'm a chair of SIG Network and a maintainer of Gateway API. I'm Surya, and I'm an engineer on the Red Hat OpenShift networking team. I mostly work in the SIG Network Policy API working group, and together with Shane and Antonio we're going to give the SIG Network intro and updates. So welcome all, and let's get started with our agenda. We'll first look at the APIs that SIG Network is responsible for in Kubernetes, more or less: Services and EndpointSlices, Gateway API, the Network Policy API, the Admin Network Policy API, and also what's new in the 1.27 release. Those are some of the things we'll look at, and then over to Shane to talk about the Gateway API bits.

Yeah. So first we'll talk about Service. This is commonly one of the first APIs somebody runs into: when you set up a Deployment, you might get a Service for it by running kubectl expose. It enables grouping pods together and exposing them as a network service. The Service gets an IP address it can be reached on, and requests are routed to one of the associated endpoints. Endpoints track IPs and ports for pods. We used to have, and still have, the Endpoints API, which had some limitations; you could only get up to 1,000 pods per Service. More recently we have EndpointSlices, its successor: it shards the endpoints and is much more scalable. EndpointSlices also allowed us to do things like dual-stack, topology, and terminating endpoints.

Building on top of that, we have the Ingress API, which is very common; most people are aware of it, and it's been around for five-plus years. It does basic host and path matching and TLS configuration, and it is simply and broadly implemented: there are 20-plus implementations of Ingress, you can find it everywhere. But there are limitations we ran into over the last few years. There are many non-portable extensions; what we ended up with was an annotations Wild West, where basically for everything an Ingress controller wanted to do you'd end up with a custom annotation, and it kept going until, today, no two Ingresses do anything remotely similar to each other. It had an insufficient permission model, it mainly focused on HTTP traffic, and, as we'll get to in a couple of slides, it was limited to north-south traffic. Which led us to Gateway API.

Gateway API is the next generation on top of this, or rather an alternative. It handles routing and load balancing in a similar way, but with way more features than Ingress provided. It's expressive, extensible, and role-oriented: as you can see here, we have GatewayClass, Gateway, and HTTPRoute. HTTPRoute is the closest thing to Ingress, and GatewayClass is kind of similar to IngressClass if you're familiar with it, but an admin might create a Gateway while a developer creates an HTTPRoute. We'll talk a little more about that. There are more than 20 implementations and a number of integrations with it today, despite it still being in beta, so it's very popular and a lot of people are implementing it for their solutions. We graduated to beta last year, and our intention, we'll see how it goes, is to try to GA before Chicago this year.
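As a rough illustration of the Service just described — the name my-app and the port numbers are placeholders, not from the talk:

```yaml
# A Service that groups the pods labeled app=my-app and exposes them on a
# single cluster-internal IP; this is roughly what `kubectl expose` creates.
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app        # pods matching this label become the Service's endpoints
  ports:
  - port: 80           # port the Service is reached on (its ClusterIP)
    targetPort: 8080   # port the pods actually listen on
```

The endpoint controller then maintains EndpointSlice objects listing the matching pod IPs and ports, sharded so a single Service is no longer limited the way the old Endpoints object was.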
To give a little more of a view of how far we go beyond Ingress: we talked about Ingress only doing HTTP traffic; we do HTTP, gRPC, TCP, UDP, and TLS. We also have non-route types, which you saw, like GatewayClass and Gateway, and ReferenceGrant, which is kind of a special one; I'll talk a little bit about those.

As a starting point, when you start with Gateway API you tend to create a GatewayClass. This is effectively just a resource, like IngressClass, that lets the resources belonging to it understand which controller is responsible for provisioning them and handling their lifecycle. In this case, the acme controller would be responsible for any Gateways associated with it. Then we have a Gateway; this is a very simple Gateway that just listens on port 80 for HTTP traffic, and you can see from the acme reference that it's associated with the GatewayClass we showed on the previous slide. Then you can attach routes. The most common route right now, the one heading for GA and the most mature, is HTTPRoute, which is the parallel to Ingress. You have another attachment here: parentRefs let you attach to Gateways, and a route can have multiple parentRefs, so it can actually attach to multiple Gateways. And it's pretty simple; I put a comparison up, so on the right there is an HTTPRoute and on the left there is an Ingress that does roughly the same thing, but the right side is far more extensible, with a lot more fields available if you dig in a little deeper — this doesn't go super deep.

And that's kind of how it looks: all these different route types can attach to Gateways, theoretically any of them if a Gateway supports them; there can be multiple Gateways and you can attach different routes in any combination, and some may only do UDP, for instance.

Then there's one of the non-route APIs, which we won't go into in much depth, but one of the problems we ran into in Gateway API was that we wanted an HTTPRoute, or any route, to be able to reach a Service in another namespace. That's a boundary with some security concerns. So we came up with ReferenceGrant, which is basically a two-way handshake: on the left, the HTTPRoute in the store namespace wants to reach a Service in the backend namespace, and it needs a ReferenceGrant; somebody with RBAC permissions in that other namespace needs to say "yeah, that's fine," which tells the controller it's okay to start forwarding traffic there. That is a very cursory overview of these APIs, but I'll have a link in a minute that can get you to the next steps, because Gateway API is very big.

So we've covered Service, which is where a lot of people start with getting traffic into their applications; Ingress, which is the stable existing thing for HTTP traffic; and Gateway, which covers the gamut of everything else. We are also working on service mesh; this is a newer thing that's happened within the last year with what's called the GAMMA project.
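Pulling the resources just described into one hedged sketch — the names (acme-lb, infra, store, backend, store-svc) and the v1beta1 API version are illustrative only; check the current Gateway API docs for the exact versions:

```yaml
# GatewayClass: ties Gateways to the controller that provisions them.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: GatewayClass
metadata:
  name: acme-lb
spec:
  controllerName: example.com/acme-gateway-controller
---
# Gateway owned by that class, listening on port 80 for HTTP.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: my-gateway
  namespace: infra
spec:
  gatewayClassName: acme-lb
  listeners:
  - name: http
    protocol: HTTP
    port: 80
---
# HTTPRoute attached to the Gateway via parentRefs; a route may list several
# parentRefs and so attach to multiple Gateways.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: store-route
  namespace: store
spec:
  parentRefs:
  - name: my-gateway
    namespace: infra
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /store
    backendRefs:
    - name: store-svc
      namespace: backend   # cross-namespace backend, so a ReferenceGrant is required
      port: 8080
---
# ReferenceGrant lives in the *target* namespace and completes the two-way
# handshake allowing HTTPRoutes from "store" to reference Services here.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-store-routes
  namespace: backend
spec:
  from:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    namespace: store
  to:
  - group: ""
    kind: Service
```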
That is actually nestled under Gateway API right now, so it's a sub-project of that sub-project. GAMMA stands for Gateway API for Mesh Management and Administration. The idea is that you can use HTTPRoutes, and in theory in the future GRPCRoutes and the rest, for east-west traffic instead of just north-south traffic. There are six-plus implementations — these are the ones I was able to confirm do it — and there are conformance tests; I merged them just days ago, so they're actually in there now, and those implementations are starting to get them working and set up. It's starting to gather a lot of steam, so if you're interested in service mesh, the GAMMA project is a good place to hone in on. Because Gateway API is so vast, and there were I think eight-plus talks at this conference alone about it, I've only scratched the surface. Please do go to our website and check it out. We're also in the #sig-network-gateway-api channel in the Kubernetes Slack, and that's our repository if you just want to go straight to the repo.

Network Policy is another subgroup within SIG Network. Can we have a show of hands if you've used network policies before? That makes my job way easier, then. It's a stable core v1 API that we've had for over five years now. It was designed with tenant owners, or namespace owners, in mind: for app developers who want to secure their workloads and define rules for how layer 3 or layer 4 traffic is allowed to flow between pods and between namespaces. One such use case is what you see here: you want to be able to say, I want my backend pods to receive traffic only from my frontend pods. That's a simple use case, and you can enforce it using network policies. And that's what a sample YAML looks like.

Network policies are namespace-scoped. If you define a policy within a namespace and don't specify a selector in the policy's subject, such as a pod selector, it selects all the pods within that namespace; but you can use a pod selector in the spec to select a subset of pods. Peers are what you define to express the relationship: who you should talk to, or who you don't want to talk to. The rules are of two types, ingress rules or egress rules, depending on which traffic direction you want to control, and peers can be other namespaces or pods within the same namespace. Note how ingress and egress rules complement each other. Basically, when you create a network policy, everything works until you create the policy; but as soon as you create it, everything stops working. There's a default deny that comes with it, and on top of that you layer the allow rules you define. If you create only an ingress policy, egress is still allowed by default, and vice versa, so that's something to be careful about. The API design is a bit implicit in nature, because you don't really expect that default deny to be put in place — you only said what you want to allow, you never said "deny everything else," but that's what you get as a gift along with the allow. I'll come back to that implicit part later. Now let's talk a bit about peers in the network policy API.
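A hedged sketch of the frontend-to-backend use case above — the namespace and labels are invented. Listing Ingress in policyTypes is what triggers the implicit default-deny for ingress on the selected pods:

```yaml
# Backend pods in the "demo" namespace may receive traffic only from frontend
# pods in the same namespace; all other ingress to them is implicitly denied.
# Egress is untouched because no Egress policyType is listed.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-from-frontend
  namespace: demo
spec:
  podSelector:
    matchLabels:
      app: backend        # the policy's subject: a subset of pods in "demo"
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend   # the allowed peer
    ports:
    - protocol: TCP
      port: 8080
```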
You can express peers as pods or namespaces; that covers the east-west traffic scenarios. But if you also want to restrict traffic going north — egress from your pods to outside the cluster — you can use something called an ipBlock in the spec. It lets you define the set of CIDRs you want to restrict or allow traffic to or from.

And you might think: all of this is there for developers, but what can administrators of a cluster do to enforce stricter rules that cannot be overridden by the policies defined by namespace owners? That's why we have a new policy, the Admin Network Policy API, in SIG Network's Network Policy API working group. It's relatively new; it's been around for almost exactly a year — the KEP merged a year ago — and the API repo lives out of tree. So it's still in the early phases: it's a v1alpha1 API, and we welcome contributions.

Let's look at what the API entails. If you are an administrator for a cluster and you want to express cluster-scoped rules for regulating traffic — say you want the sensitive namespace in your cluster to not be able to receive traffic from any other namespace — you can express such scenarios easily with an admin network policy. The API defines two kinds of CRDs: AdminNetworkPolicy and BaselineAdminNetworkPolicy, and the precedence looks like this: if admin network policies are defined in your cluster, those rules are evaluated first, with the highest precedence. If no match is found there, you fall down to the network policies. And if there's no match there either, then as an administrator you may still want a default guardrail across the cluster to help secure the workloads, and that's where a baseline admin network policy comes into play; you can have at most one, and it acts as a default fallback.

The API design is explicit in nature; we tried to learn from the network policy design and hopefully have done better — we welcome feedback. As you can see in the sample YAML, we have a new field called priority, which doesn't exist in network policies; it's powerful, and it lets you set the precedence of your rules in case you have overlapping subjects and peers. We also have explicit Deny and Allow actions you can specify instead of an implicit deny: you get what you ask for, there's no implicitness here, and it mimics the traditional firewalls that operators and administrators are used to. Like I said, it's a v1alpha1 version of the API. We support east-west only; we are in talks about supporting north-bound traffic and more use cases there, hopefully in the next version of the API. And peers, in this case, are just pods and namespaces as of now. We are aiming for beta by the end of this year — let's see. The current status is that there are some implementations in progress, we're getting feedback, and based on that we're changing some parts of the API and seeing what's feasible and what's not.
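Two hedged sketches of what was just described: an ipBlock egress peer, and an AdminNetworkPolicy with an explicit priority and an explicit Deny. The CIDRs, names, and labels are placeholders, and since AdminNetworkPolicy is a v1alpha1, out-of-tree CRD, the exact field shapes may have shifted since this talk:

```yaml
# NetworkPolicy egress rule using an ipBlock peer to constrain cluster egress.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-egress
  namespace: demo
spec:
  podSelector: {}          # empty selector: all pods in the namespace
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 10.0.0.0/8
        except:
        - 10.0.5.0/24
---
# AdminNetworkPolicy (policy.networking.k8s.io/v1alpha1): cluster-scoped,
# evaluated before NetworkPolicies, with explicit Allow/Deny/Pass actions.
apiVersion: policy.networking.k8s.io/v1alpha1
kind: AdminNetworkPolicy
metadata:
  name: protect-sensitive
spec:
  priority: 10             # lower value = higher precedence among ANPs
  subject:
    namespaces:
      matchLabels:
        tier: sensitive    # the namespaces these rules protect
  ingress:
  - name: deny-from-all-namespaces
    action: Deny           # explicit deny, nothing implicit
    from:
    - namespaces:
        namespaceSelector: {}   # any namespace in the cluster (v1alpha1 shape)
```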
So if you are in the networking ecosystem and you have use cases for securing workloads across your cluster, please try this out and reach out to us; we'd be more than happy to see whether we can accommodate your use cases. Feedback is appreciated. In addition to implementations, we're also looking into conformance testing, following patterns similar to the Gateway API project, so that CNI plugins can implement the API in a seamless way and report their results back to us, and we can say whether they conform to the API or not. We have a subgroup within SIG Network called the Network Policy API working group; there's a Slack channel on the Kubernetes Slack where you can find us, and we have bi-monthly meetings on Tuesdays at 6 p.m. European time. Please do join us if you're interested — we have a lot going on and would welcome all kinds of contributions.

Moving on to the networking components. We have a proxy that is the default implementation of service proxying in Kubernetes, called kube-proxy. It's been around for a really long time, and you heard what Shane said about Services and EndpointSlices: what kube-proxy does is convert those Kubernetes API objects into networking rules. We support two backends as of today. One is the iptables backend, which is the default, where iptables rules are created for every Service and EndpointSlice; the other is the IPVS backend. Moving forward, we actually have a new KEP, which I've linked on the slide, for a new backend based on nftables. iptables has been around for a really long time and has been the default packet filtering system in the Linux kernel, but I hear it's getting deprecated — we have Dan in the audience, so if you have questions please do reach out to him — and it has some disadvantages. There is no support for incremental updates, for example, and it does not scale well as cluster sizes increase: with a large number of services it takes a long time for the rules to sync. And the larger issue, of course, is the deprecation; we want to prepare for the move towards the successor, nftables, which is more performant and efficient, solves some of these issues with iptables, and is where all the new features are going, not iptables. So eventually we do want to move to nftables. That's something we have in store; it's being discussed, and there's a KEP out there, so if you're interested please do reach out and comment on the KEP. And I'll hand it off to Antonio, who will talk about the new things that we have.

So, okay. Well, you may be wondering what all these people are doing, right? We try to adapt: we have different requests for new features, we have to deal with technical debt, we have to do a lot of things. As Surya said before, we have to fix kube-proxy; there are two KEPs going on with kube-proxy, and that is mostly Dan's work — he is here, so if you have bugs or something, go to him, okay? You can track the state: we have a project board, and if you're curious you can go there; we try to keep it updated and you can see what we are doing. We have these kube-proxy things, and lately we started on what you can see with Gateway API. Gateway API says GA there, right? But it's not really GA.
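Going back to the kube-proxy backends mentioned above: the backend is selected through kube-proxy's configuration (or its --proxy-mode flag). A minimal hedged sketch — at the time of this talk only iptables and ipvs are selectable, and the nftables mode exists only as the KEP proposal:

```yaml
# KubeProxyConfiguration fragment: the "mode" field picks the backend.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"   # default is "iptables"; an "nftables" mode is what the new KEP proposes
```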
What we'd say is that this group showed other projects a path to delivering value quickly. If everything has to merge into Kubernetes core, it's very slow — things were stuck for, I think, three years of discussions and some regressions — and what Gateway API demonstrated is that you can create a working group, work with CRDs, and get things working faster than in core Kubernetes. So we have all these working groups; the admin network policies one is another good example. And we now have a working group that has been going for, I think, more than six months on multi-network, which a lot of people will be interested in. We really don't know yet: they are targeting an implementation in core, but right now it's in inception, incubation, okay?

We also have other KEPs going on in the APIs. We realized that we don't have a clear concept or definition of the networks in the cluster, and we want to provide a better experience for users — to be able to define or expand their cluster network, pod network, or service network. There are two KEPs in alpha, for multiple cluster CIDRs and multiple service CIDRs, but we are still discussing how we can provide the best user experience for all of you.

We have another, smaller feature that allows people to reserve the lower band of the NodePort range. Imagine you have a node port that you want to use for an application beforehand; if you choose a random value, some other Service in the cluster could grab it. With this, similar to a feature we already did for ClusterIPs, you can reserve the lower part of the NodePort range, and dynamic allocation will never pick a port from that range. It's a small but very useful feature. In addition, the other, more important KEP we are doing is topology aware routing, which I'm going to talk about soon.

We have the expanded DNS configuration — this is a funny one. You know that with libc you could only configure, I don't remember the exact number, a few name servers. That changed several years ago; you can now have, I think, up to six or more name servers. So we had to wait two or three years for the container runtimes to implement it and for support to cascade through all the new versions, for people to finally be able to use six name servers in resolv.conf. That's an example of how things can take a long time to reach people.

And for graduations, we had two. One feature that was very important for zero-downtime deployments: the terminating endpoints feature. Then, what is the other one? Sorry — internal service traffic policy. This was a funny one; we had a lot of discussion because we have another feature, topology aware routing, and they overlap. Basically, what we decided is that with internal traffic policy, if the traffic is internal and the policy is Local, it is only able to reach the pods — the endpoints — that are on the same node. It mimics the external traffic policy on a Service, but for internal traffic.

To expand a bit more on one of these features, and on what I was trying to say before: right now we don't have a clear definition of what the networks in a cluster are — well, we may have a definition, but we don't have full agreement on it. We simplify the networks into a pod network, a node network, and a service network.
Everything is configured beforehand: whatever installs your cluster configures the flags. We intend to come up with a better design that allows people to modify, manipulate, and resize these networks without having to recreate the cluster. The goal would be zero disruption — being able to add a service network or new pod networks so you can grow and resize your clusters independently of the network.

The other feature I mentioned before, which took a long time to get implemented in kube-proxy and recently went to beta in 1.26, is terminating endpoints. The key to this feature is that until now, endpoints only had a binary state: an endpoint only knew whether the pod was ready, and if the pod was ready it was present. With this feature, we have the lifecycle of the pod in the EndpointSlice, so implementations are able to track whether the pod is ready, whether it is not ready but terminating and still able to serve traffic, or whether it is fully terminated. The way to use it, and the problem it solves, is zero downtime on rolling updates. You can see it in the graph: you have a load balancer, and when you use externalTrafficPolicy: Local, the load balancer health-checks the nodes to know on which nodes you have a pod. The moment a deployment rollout starts, the pod starts terminating — it's important that your application handles the termination signal so it can still process traffic — and during the period when the health check starts to fail, it is still going to serve traffic. Before, because we didn't have this state in the EndpointSlice object, kube-proxy automatically removed the pod, so all that traffic was dropped. With this, kube-proxy knows: okay, this is the only pod alive, I know it's terminating but it's able to handle traffic, so let it keep serving — while the failing health check tells the load balancer, don't send me new traffic here. This way, when you roll out, you have zero downtime. It looks simple, but it took a lot of work to get it working correctly.
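To make the rolling-update scenario above concrete, here is a hedged sketch of the relevant fields; the names and IPs are invented. The EndpointSlice conditions are what kube-proxy reads: a pod draining gracefully shows ready: false, serving: true, terminating: true, and with externalTrafficPolicy: Local it can keep receiving traffic on its node until the rollout completes (internalTrafficPolicy is the in-cluster counterpart mentioned earlier):

```yaml
# Service using node-local external routing; the load balancer health-checks
# nodes and only sends traffic to nodes that still have a local endpoint.
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
  externalTrafficPolicy: Local
  internalTrafficPolicy: Cluster   # set to Local to keep in-cluster traffic node-local
---
# Fragment of an EndpointSlice for that Service during a rollout: the old
# pod is no longer ready, but is still serving while it terminates.
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: my-app-abc12
  labels:
    kubernetes.io/service-name: my-app
addressType: IPv4
ports:
- protocol: TCP
  port: 8080
endpoints:
- addresses:
  - 10.1.2.3
  conditions:
    ready: false        # failing readiness / marked for deletion
    serving: true       # can still serve traffic while it drains
    terminating: true   # pod is in graceful shutdown
```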
The last thing I want to talk about is topology aware routing. This is a very demanded feature, for several reasons. People, mainly in the cloud, want to keep traffic in the same zone — not just because it's more economical, but because you get performance and latency benefits. Some of you may have seen discussions on issues where we keep pushing back; not pushing back because we don't want to do it, but because one thing is what you expect and another thing is what happens when we implement it. On paper you have perfect symmetry and everything goes where it has to go, but in reality you don't have any control over the clients. So what happens when, I don't know, your scheduler puts 100 pods in one zone and those pods are the ones reaching the service? They are going to overload the service, and maybe you have other pods sitting in another zone that you're not sending traffic to. The thing is that we don't have a really reliable way to say: okay, this pod is overloaded, just overflow to another place.

And this is the main thing — why we're trying to get this right. Because if we offer this tool and people say, okay, I have a preferred zone, and then that preferred zone gives them an outage, that's even worse than paying for the inter-zone traffic. So what are you doing? You're doing everything wrong. The other point, and this is another observation we have: the network is not a workaround for the scheduler. Once you deploy something, you need a really good scheduling strategy. So with this feature we need to cover different angles: we need to cover the scheduling of the endpoint pods, and we need to cover how the clients' traffic is handled. We need to handle both of these dimensions, and this is what I personally am going to work on during this cycle — to see if we can get this right, or at least write some best practices, so we can have this feature and users can use it without black-holing traffic. And that's it. I don't know if you have any questions; feel free to ask.

A question about the status of endpoint slices? Okay, if you don't mind — because I just fixed that incident two weeks ago, okay? This is a problem that we have in SIG Network. It's what we talked about before in the other talk: what is the reference architecture, what is the reference implementation, and how do we define behavior? We have this program, the conformance tests, but it doesn't really fit our needs — and I know Shane touched on this for Gateway API. What we have are the e2e tests: when somebody adds a feature, there's an e2e test, and that e2e test defines the behavior. But what we realized is that we are not promoting this well. I went to Surya and asked, why are you not running this e2e test? "Oh, we are running the e2e tests — the ones that say conformance." Okay, that's the problem. We are not promoting that implementations of kube-proxy-like things need to run e2e tests that may not be tagged conformance but are effectively kube-proxy conformance. It's our fault in SIG Network for not making this clear to implementations: we define the API, we define the behavior, please run the tests; it is the responsibility of the CNI plugins to implement that behavior. We are trying to make it a little better, where we can implement more of the behavior in a core component that other proxy implementations can build on top of, but we don't have anything yet.

Nobody else? — So HTTPRoute was the most stable part of Gateway API? Yes. How about the others, what's their state? So GRPCRoute is moving really fast; we've got a lot of conformance going for that quickly, so it's actually tracking really well. TCPRoute and UDPRoute are ones that I'm personally working on — I focus a little more on the layer 4 side of things — and they've had some trouble: we have people doing them, but getting the conformance going has been difficult. It's kind of a weird space. So we have a project in Gateway API right now called Blixt, which is a reference and testing implementation of Gateway API that we're bringing in to be our CI testing tool, but it's also a layer 4 implementation — TCPRoute and UDPRoute specifically, using eBPF for the data plane — and that's going to help drive the conformance forward. So we should have conformance.
I'm literally right now working on getting to the point where that conformance is started, because conformance is at the core of how you graduate or go anywhere in Gateway API. So they're in alpha today. In theory, we're trying to get them in when the GA happens and HTTPRoute, Gateway, and GatewayClass reach v1 — hopefully at Chicago — and hopefully at the exact same time we have betas for TCPRoute and UDPRoute. On TLSRoute, I'm not really sure yet; that one's in an even weirder space. We do have some conformance tests and things like that, but we'll have to see how that one goes and play it by ear a little bit. Do you want to add to that, Nick?

For TLSRoute, we've got some conformance, and a fair amount of what's missing is getting implementations passing the set of conformance tests; you also need a certain amount of documentation to pass. We do have some conformance tests — they don't exercise everything, so we need more of them — and we also need implementations to pass them, but it's not that far away. We think the shape of the API is pretty good as we've defined it, and the same goes for TCPRoute: we don't think there's anything major missing in the API there. Yeah, they're pretty simple APIs. And it's possible — I don't know if we'll have the priority for it or not — that our reference implementation might help push that one along too, but again, we're playing it by ear and waiting for people to jump on it.

Any other questions? Go for it. Is that our last slide, actually? I thought we had — okay. Oh yeah, go ahead. — I heard you say it's still early days, and I'm also new to this work, so maybe it's just a matter of me catching up, but I'm wondering whether you're thinking about a feedback loop with scheduling. It's easy to imagine use cases where — I heard you say you've got to take responsibility for not creating these wildly out-of-balance situations — but sometimes, say with an HPC cluster, you kind of want that; and you can also see it running away with thundering-herd effects, where everything lands on one cell, which you don't want. So I'm really interested in how you're thinking about creating that feedback with scheduling.

I don't know, Rob, if you want to come up — come here, I can start. This is the problem with Kubernetes: you have different types of users. If you give this feature to one power user, they'll make it work perfectly — they control the network. But others can take it and create outages, and we don't want that. There are multiple problems and multiple solutions, and we are going to try to get to the right one. Rob, please — he's been thinking about this much more than me.

You can blame me for it. Yeah, so Antonio has been working with me on it. Topology is really hard to get right; we've gotten it wrong more than we've gotten it right, maybe, I don't know. What I'll say is we've been trying to take little baby steps towards something better. I think we all agree the feedback loop is the end destination we want to get to, but that's just a massive project with the capabilities we have today — I don't know of a way to do that with iptables, for example. Maybe there are other technologies, but these are all huge projects in and of themselves.
So the first thing we started with was hints: okay, we'll just try to proportionally allocate endpoints. That felt a little too magic to a number of people. And so now we're starting to look at, well, we'll give you the dangerous thing. It will work for some cases, but we have to plaster warning labels all over it. It will unlock some real use cases, so we want to at least let it exist — but be aware that until we have a feedback loop it comes with some danger, as Antonio was saying. But I'll give it back to you. Yeah — basically, we're going to build on top of that.

Okay, I was just going to say before everybody goes: please do check that out. That's our README; it has a ton of stuff in it. We'd love to have you join the SIG Network channel. A lot of people come into SIG Network wanting to get involved but don't know where to start — we are happy to help guide you. The README includes all of our different meetings for the various groups, so find something you like or that gets your attention and join it; we have meetings all throughout every week. Thank you for coming.