Okay, I want to thank everyone for joining us. Welcome to today's CNCF webinar, What's New in Linkerd. I'm Libby Schultz; I'll be moderating today's webinar, and we'd like to welcome our presenter, Oliver Gould, Linkerd creator and CTO at Buoyant. A few housekeeping items before we get started. During the webinar you're not able to talk as an attendee. There's a Q&A box at the bottom of your screen; please add your questions there, not the chat, and we'll get to as many as we can at the end. This is an official webinar of the CNCF and as such is subject to the CNCF code of conduct. Please don't add anything to the chat or questions that would violate that code of conduct; basically, please be respectful of your fellow participants and presenters. Also, please note the recording and slides will be posted later on the CNCF webinar page at www.cncf.io/webinars. With that, I'll hand it over to Oliver.

Thanks, Libby. Before I dive into this, if you can figure out how to give me a reaction, give me a thumbs up in Zoom. Are you alive out there? Are you able to interact? Yeah, we've got some hands there. Thank you. Great. Before I dive in, how many of you are actually running Kubernetes in production? Give me a hand if you are. Cool, we've got a few there. Excellent. How many of you carry a pager for Kubernetes in production, where you're actually responsible for things running? Okay, we've got a few. Anyone running Linkerd in production? Okay, great, we've still got a few. Excellent. That's what I like to see.

My hope today is that if you haven't used Linkerd before, this talk gives you enough to get started and to understand what problems it solves and how it works. If you have used Linkerd before, I hope this serves as an invitation to get more involved in the community and contribute in one of many ways. With that: my name is Oliver Gould, I'm the creator of Linkerd, and I've been working on it for the past few years. The talk will start with a brief overview of Linkerd and service meshes and what they do, then I'll take you on a tour of Linkerd's features. Then we'll get to the good stuff, a demo, and we'll take questions after that.

Great. Linkerd has been around a while; we've been working on it since 2015, I believe, and it's been in production for over four years now. We've got a really active community; many of you are already in the Slack or already working with us on GitHub, and we really appreciate that. It's been in production at a whole wealth of different kinds of companies: very small, very large, big traffic, small traffic, everything. We've been part of the CNCF since early 2016, if I recall correctly, so we've been in the CNCF for a long time. That means we're committed to open governance. This isn't a corporate project; it's a community project, even though Buoyant, where I work, does a lot of the work on it.

Linkerd comes out of my experience, our experience, working at Twitter. I was a production operations engineer at Twitter from 2010 to 2015, and that's where we saw one of the first modern microservice deployments. This was on Mesos, of course, because Kubernetes wasn't around yet, but a lot of the same problems and primitives were there. I was on call for service discovery and traffic management and a lot of the core concerns that Linkerd deals with. Then in 2016 we launched Linkerd 1.0, which was a JVM-based proxy. It wasn't Kubernetes-specific.
It was super generic: it could tie together Mesos, and Consul, and Kubernetes, and all sorts of things. Over time we learned that that heavyweight flexibility wasn't really a good fit for Kubernetes, so we created a new version, Linkerd 2, which is really lightweight, meant to just get up and running, and tied tightly to Kubernetes. That's really what I want to talk about today: that new version of Linkerd, which has been around since 2018, I believe; we'd been working on it since sometime in 2017.

Out of the box, Linkerd gives you three core pillars of value. First is observability. It can be really hard to understand what's going on in your cluster. kubectl is great, but it can only tell you whether things are up or down; it can't tell you whether things are slow or fast, or whether they're failing requests, in a way that's decoupled from the rest of the Kubernetes ecosystem. We also have support for tracing, and I'll get into more of that later. The other big thing we do is connectivity: load balancing, timeouts, routing, connecting clusters. Making sure a pod can talk to another pod or another service is a big part of what Linkerd does. And we do that securely, with mTLS by default for everything. We integrate with projects like cert-manager, which was just admitted as a CNCF project. Congrats, cert-manager. I'll get into more of these details later.

Let's start, though, with microservices. I assume most of you are basically familiar with them, but just to lay the groundwork: microservices contrast with what we had before, the LAMP stack, in that we've taken what used to be a bunch of library interactions and linking and put that over the network. Now we can separate business logic over the network, and things call each other via APIs. The service mesh is a way to add rich operational functionality, the things we were just talking about, as a sidecar. You don't have to embed this in a library in your code. Of course you can, but this is about having a proxy that takes on a lot of these concerns outside of your application, so you get them uniformly, regardless of what application stacks you're using. Whether they're homogeneous or heterogeneous, a service mesh can make that uniform and work well.

This is all powered by a control plane. You have the Kubernetes control plane, which is the Kubernetes API server and its extensions, and Linkerd provides a set of control plane APIs that interact closely with the Kubernetes APIs to power the proxies. In Linkerd, the proxy is basically unaware of anything having to do with Kubernetes; it only knows about the control plane API. The control plane itself is very tightly coupled to Kubernetes.

It looks like this: the proxies are injected as sidecars, so within your pod we add a container, the proxy, and that's done by a control plane webhook called the proxy injector. Also in the control plane we have a certificate authority, which the proxy uses to establish identity; we'll get into those details later. We've built the proxy in Rust and most of the control plane in Go. We ship an instance of Prometheus and Grafana by default. We have great Helm charts. We support SMI. We're big fans of the CNCF ecosystem, and we really try to leverage as much of that ecosystem as we can for whatever isn't in our core competency.
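In practice, meshing an existing workload looks something like this. A minimal sketch using the Linkerd CLI; the deployment name and namespace ("web", "myapp") are hypothetical:

```shell
# linkerd inject just adds the injection annotation to the manifest;
# the proxy-injector webhook then adds the sidecar container on admission.
kubectl get deploy web -n myapp -o yaml \
  | linkerd inject - \
  | kubectl apply -f -
```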
Our goal is to do this without adding complexity. Kubernetes itself is complex, and that's really what we find: no one should want to adopt a service mesh. Most people are having enough trouble just adopting Kubernetes. Linkerd is meant to get installed, get out of your way, and then grow with you as you have more problems to solve. We really don't want you to try to solve all of your traffic problems on day one; that should be an incremental path where you can be successful. To that end, it installs, you add it to your application, and almost no configuration is necessary to get started. We also focus hard on minimizing Linkerd's resource requirements, and I'll show you some examples of that later.

Part of our goal of simplicity is that we don't try to invent new abstractions that are specific to Linkerd. We embrace Kubernetes primitives, sometimes to a fault, I would argue. We don't want you to have to introduce new kinds of abstractions to think about and manage; we try to use things that are already well understood and well supported.

And we try to do that securely by default. We add security by supporting mTLS in the proxy: mutual identity, where every proxy generates its own private keys, and those private keys never leave the pod. We automatically secure everything we can, and we don't break anything we can't secure. That's important because some core Kubernetes traffic, health checks, readiness probes, et cetera, can't be secured by default today. So we take an incremental approach to improving things and making them auditable.

We've really focused on secure foundations. Our proxy is written in Rust, a memory-safe native language; I'll show more examples of that later. And again, built on Kubernetes: the goal for Linkerd is that you can install it, add TLS, add policy, and get going.

One of the biggest questions we get about the proxy is: does Linkerd use Envoy? The answer is no. We've tailored our proxy to the service mesh use case. It's not a general-purpose configurable proxy; there's no config file where you can change how the proxy works or add plugins to it today. That's one, for simplicity; two, for minimizing resource overhead; and three, for security. We find that the more configurable something in this critical part of the stack is, the harder it is to audit and the harder it is to gain confidence that it's secure. So we minimize the proxy's flexibility and tailor it to the use case, for those reasons. We've had it audited a few times now; we do security audits basically once a year, and things have gone really well. We've had a few very minor issues, but nothing scary.

This is all built heavily on the Rust networking ecosystem, and if you haven't played with the Rust stack of Tokio, Hyper, Tonic, and Tower, these projects are really exciting. We've invested heavily in them; in fact, a lot of code has been forked out of the proxy and moved upstream into those projects. We're big fans of the open source ecosystem and of making things better there.

Okay, that's just setting the stage. Now I want to go into a bit more detail about the proxy's, or really Linkerd's, features in general.
The single sharpest tool in our tool belt is the load balancer. Every proxy implements what's called a peak-EWMA load balancer; EWMA is an exponentially weighted moving average, and we use the latency of individual requests to inform the load balancer. There's no centralized load balancer state, no single load balancer you go through: every pod has its own little load balancer embedded in its proxy, making its own decisions based on the latencies it observes. (I'll sketch the idea in a moment.)

This is all tightly coupled to Kubernetes services. Again, we don't add new primitives; we use Services for service discovery, which means we benefit from things like service topology. I think I saw Matei on the attendee list; Matei was one of our interns this summer and implemented the service topology work. Service topology is a new Kubernetes feature, available since 1.18 or 1.19, I think, that lets you express node affinity, failure zones, and things like that, and make Kubernetes services aware of them. Linkerd now honors that as well. It also means we can bypass kube-proxy. And again, the big goal here is that you can just add this proxy to your application, without reconfiguring the application to work any differently, and it benefits from all of this logic. That's amazing.

Here's a really good example. I don't want to spend too much time on it, but it shows what a load balancing algorithm can actually do for reliability. Say we use something really naive, like a round-robin load balancer. This is actually a test we ran: there are ten instances, and one instance is slow, with a two-second latency; everything else is quite fast, around 100 milliseconds. With a one-second timeout and a round-robin load balancer, we get about a 95% success rate. Just by changing the load balancing algorithm to least-loaded, we can push that over 99%. And with an EWMA load balancer, which is latency-aware, we can get better than three nines. In my experience, at least, that can be the real difference between being woken up and sleeping through the night when your pager goes off. So I really do think the load balancing in Linkerd is maybe underrated, but it's a really important thing that it does.

Probably the most important thing Linkerd does, though, is add mutual TLS to all connections. When we say that, we mean that if there's a proxy on two pods and those pods communicate, we establish TLS for that connection: mutual TLS, where both pods identify themselves to each other, and we have a secure connection. This is all bootstrapped off of Kubernetes service accounts; again, we don't want to add any new API surface area. Every pod generally has a service account. We take that token and use it to authenticate ourselves to the CA, which gives us short-lived certificates that we rotate throughout the lifetime of the pod. The private key never leaves the memory of the pod. This really is an important tool in a zero-trust architecture. You can use things like cert-manager to bootstrap this and help manage that CA. And if you already have TLS in your infrastructure, Linkerd works with that transparently: we won't add a new layer of TLS on top, but we'll let it pass through as TCP traffic. Up until 2.9 — so in Linkerd 2.8, for instance — we only supported mTLS for HTTP communication.
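Coming back to the load balancer for a second, here's the sketch I promised of the peak-EWMA idea. This is a simplified model, not Linkerd's exact implementation: each endpoint keeps a latency estimate that decays toward recent observations, and the balancer picks the cheaper of two randomly sampled endpoints (power of two choices):

```latex
\[
\mathrm{ewma}_e \leftarrow \alpha\, r + (1 - \alpha)\,\mathrm{ewma}_e
\qquad \text{(updated on each response with round-trip time } r\text{)}
\]
\[
\mathrm{cost}(e) = \mathrm{ewma}_e \cdot \left(\mathrm{pending}_e + 1\right)
\qquad \text{(latency estimate scaled by outstanding requests)}
\]
```

The effect is that a slow endpoint's cost rises quickly and it gets picked less often, without any shared state between the balancers.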
With 2.9, we've added support for both the load balancer and mutual TLS for almost all TCP traffic. There are some classes of TCP traffic it doesn't apply to yet, but we'll be fixing that in 2.10. That's really exciting; it's the big thing we did in 2.9.

Here's my favorite Linkerd feature, and I don't think it's in any other service mesh — you'll have to check with them; I haven't really done my research there — but it's something we specifically designed into Linkerd: protocol upgrades between proxies. This is a hand-drawn diagram I did earlier today. If you have HTTP/1.1 communication and you're doing many requests at once — say 1,000 concurrent requests from one pod to another set of pods — that's 1,000 active TCP connections, one per in-flight request. With HTTP/2 multiplexing, the proxies establish at most one connection between each other. It gets mTLS'd once, and then we multiplex all of those requests across that single connection and demultiplex them on the other end, so we don't change your application's semantics in any way. Testing I've done just in the last few weeks shows that this really, really reduces the resource requirements of the proxies. Connections can cost quite a lot in terms of buffering and things like that, so by having only one connection we bring the proxies' memory usage way down. This, again, speaks to why we chose to implement our own proxy: we can do special, smart things within the scope of the proxy. Again, no application changes; it just works out of the box. Most people don't even know about it.

Another feature we implemented, probably about a year ago now, is traffic splitting, which is basically stochastic weighted routing; a popular tool, Flagger, can be used to drive it. It says: if you have multiple services you want to send traffic to, you can weight between them — say, 20% of traffic here and 80% there — and Linkerd will honor that. I'll show this off in the demo. Again, this only worked for HTTP traffic until 2.9; as of the 2.9 release we support it for TCP as well, though at the connection level — we can't be any smarter than that. It's a really convenient tool, especially for multi-cluster routing.

The Service Mesh Interface is a project that was initially sponsored by folks at Microsoft, and we've been heavily involved in it, as have most of the other service meshes. It provides basically three core APIs. One is the TrafficSplit I just described. Another is a telemetry API, SMI metrics, which is really just a uniform set of metrics that works with any service mesh. And then there's a policy API as well, another CRD that lets you express policy. Linkerd doesn't support the policy API yet; we're working with those folks on revving that API so we can support it in a better way. That's probably 2.11, so not the next release, but the one after.
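For reference, an SMI TrafficSplit is just a small CRD. A minimal sketch, with hypothetical service names and the 20/80 split from above; weights are relative:

```shell
cat <<EOF | kubectl apply -f -
apiVersion: split.smi-spec.io/v1alpha1
kind: TrafficSplit
metadata:
  name: server-split
  namespace: myapp
spec:
  service: server        # the apex service that clients address
  backends:
  - service: server-v1   # 20% of traffic
    weight: 20
  - service: server-v2   # 80% of traffic
    weight: 80
EOF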
In 2.7, I think, or maybe 2.8 — the last release — we started adding multi-cluster capabilities. This lets you sync services across clusters and route to them through a gateway or an ingress on the other cluster. It's really cool, and it works really well with traffic splitting, so the application doesn't have to know where a service is hosted. You just say, I want to talk to the foo service, and if foo is actually in an east cluster, that can be made totally transparent to the app. You can even do weighted shifting and things like that. This is great because there's no single point of failure: we're not going through any kind of centralized load balancer, and there are no special network requirements. You don't have to have a flat network where everything is addressable. We try to maximize flexibility. And this is really the first Linkerd extension: most of the core Linkerd experience doesn't know anything about multi-cluster. It's a pure add-on that leverages the Kubernetes API and a little bit of Linkerd's service discovery implementation to make this really cool.

Another reason we focused on implementing our own proxy was that we really wanted an excellent Prometheus experience. I think it's gotten better over the last year or so, but for a long time Envoy's Prometheus integration was really difficult; Envoy started with a StatsD, push-based model. We've taken another approach, which is to use a lot of the Kubernetes metadata: the deployment you're talking to, the service account, the pod name, all sorts of labels for that pod — all of those get hydrated onto Linkerd's metrics, so we can give you really rich dashboards and queryability. We've also extended this to work with OpenAPI (Swagger) specs as well as gRPC, so you can take those route definitions, import them into Linkerd, and have all of that information show up in your Linkerd metrics too. The Swagger and protobuf enhancements do take some configuration, but otherwise we try to make all of this work out of the box: no configuration necessary, we ship Prometheus by default, and you can get up and running. In 2.9 you're also able to bring your own Prometheus: if you already have a Prometheus in your cluster, you don't have to run Linkerd's as well; you can configure Linkerd to talk to your main Prometheus, and it all works great.

We get a lot of folks asking for distributed tracing with OpenCensus or OpenTelemetry. Linkerd can work in that world too, but the dirty secret of distributed tracing is that you actually have to change your application to integrate with it; there's no way to make it totally transparent to the application. If you have tracing set up in your ecosystem, Linkerd will emit spans and show you where you're hitting Linkerd hops in the service mesh, but it isn't something that just works out of the box. We do have a good integration there, though.

What will work out of the box is Linkerd's tap functionality, which is what I call ad hoc tracing: at runtime, without prior configuration, you can query proxies directly and say, show me metadata about the traffic going through you. This isn't aggregated into metrics where you already know what you're looking for; it can be used in a discovery or exploratory mode. That does mean your headers and all sorts of private information could be exposed by this API, so it's all locked down with RBAC. You can set role bindings to lock it down further if you need to, or you can turn it off entirely. In fact, in 2.10 we'll be making tap an optional component, so you don't even need to run it by default.
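Using tap from the CLI looks roughly like this; the resource names are hypothetical, and the command is subject to the RBAC lockdown just mentioned:

```shell
# Stream live request metadata (method, path, latency, TLS status)
# straight from the proxies of a deployment, no prior configuration.
linkerd tap deploy/server -n myapp
```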
Okay, there are a few more things in 2.9 that I haven't talked about yet. One of the bigger ones was a Summer of Code project that added multi-arch builds. Now you can use Linkerd on your Raspberry Pi clusters, or your Graviton clusters if you're on AWS. With ARM taking on a lot more importance in the world, and being much more power-efficient from what I can tell, it's really exciting that Linkerd works in those environments out of the box today. No configuration changes: if you have a mixed cluster with some ARM nodes and some x86 nodes, they'll just pull the right images and it works transparently.

As I mentioned, Matei added service topology support. This also means we now support the new Kubernetes EndpointSlices API. EndpointSlices are designed to let your clusters have much larger services; the Kubernetes API itself could get very slow with large services and large deployments, and we support the new APIs that make those cases more efficient. As I mentioned a second ago, we also let you bring your own Prometheus and Grafana, so if you already have those, there's no reason to install a second copy. And in 2.10 we'll be making a lot of these components properly optional, installed separately.

Then, where I spend most of my time: the Linkerd proxy had a ton of changes between 2.8 and 2.9. We basically rewrote the entire service discovery mechanism. Instead of looking at Host headers and the headers of requests, we now do all discovery based on the target IP. If it's a service IP, we know we can do load balancing on it; if we're talking directly to a pod, we just forward to the pod without doing load balancing. This means we work well with ingresses: if you're using Envoy as an ingress, for instance, you can configure session stickiness or anything like that in Envoy, and Linkerd will honor it without interfering.

We've done all of this work basically to support mTLS for TCP communication, so most TCP protocols are now secured, load balanced, and routed with traffic splits out of the box. The protocols this doesn't cover are server-first protocols — things like MySQL or SMTP, where the server speaks first to the client. That's because we do protocol detection: for the most part, there's no configuration needed to tell us what protocol you're speaking. We look at the first few bytes of the connection and say, oh, this is HTTP/1, this is HTTP/2, and we forward it. 2.10 will support mTLS for those protocols as well, and we'll also support multi-cluster for TCP connections. I'll show a workaround sketch for server-first protocols in a moment.

As we've gotten more and more production users, some people doing failure testing realized it could take Linkerd a while to reconnect after a node outage and things like that. This actually turns out to be because of kube-proxy: kube-proxy, the iptables-based service discovery and load balancing scheme that Kubernetes ships with by default, can be quite slow to update in some environments. So we now bypass it entirely when we talk to the control plane; we load balance requests over all of the control plane's pods, and it's much more resilient to those kinds of failures.
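On that server-first workaround: in 2.9 the usual approach is to tell the proxy to bypass those ports entirely. A sketch — the annotation name is real, but the workload and port are hypothetical, and patching the pod template triggers a rollout:

```shell
# Skip the proxy for outbound MySQL traffic (server speaks first, so
# protocol detection can't see it in 2.9). The annotation goes on the
# pod template, where the proxy-injector reads it.
kubectl patch deploy client -n myapp --type merge -p '
spec:
  template:
    metadata:
      annotations:
        config.linkerd.io/skip-outbound-ports: "3306"'
```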
We also adopted a new runtime in the proxy. Previously, the proxy would only use at most one core; we ran a single thread. That's great for most applications, where you just want Linkerd to be small and out of the way, but some folks want to push 50,000 requests a second through a pod, and that's going to be really hard to do on one thread. So we now support scaling this up. It's an annotation — or, if you set CPU limits via a mutating webhook or something like that, the proxy will pick up those settings and limit itself to the number of CPUs you configure. All of this work has let us reduce the latency, the CPU usage, and the memory usage of the proxies, which speaks to one of our big motivators: having the lightest footprint we can, especially in the proxy.

Okay, we finally got to the fun part, and I think I have enough time to get through it. Instead of giving you a flashy microservices demo — we've done Emojivoto in the past, and I think in the 2.8 webinar we did multi-cluster and things like that — I'm actually just going to show you the dev environment I use for testing a lot of the proxy changes and Linkerd changes in general. I'm a big fan of k3d; it's much like kind, in that you can run a cluster in Docker on your host with no big Kubernetes setup. I'll be using the latest Linkerd stable release, which came out on Monday, and a load test tool I wrote called ort — it might change its name — which is really just a client that generates lots of requests, gRPC, HTTP/1, or TCP, plus a server that supports those. It's what we use for load testing and reproducing bugs.

Without further ado, let's see if I can show you what's going on. If the font's too small, just say so in the chat and I'll try to fix it. Instead of showing you how to set up a k3d cluster, I've already set one up and deployed the app to it. We have, I think, nine or twelve pods running in the app, something like that. k3d itself takes about 500 megs or so in this environment. I've been running these load tests with no Linkerd installed yet; this is just the load generator running on a host under my desk in my office, which I have not been to in a long time.

So let's install Linkerd, and I'll take you through that experience. Before we actually install, we check that the cluster is ready: if I'd installed Linkerd previously and not uninstalled it properly, or if there's clock skew, we run a bunch of checks just to make sure Linkerd can be installed. Great, that worked. So now I'll install it: linkerd install just generates a lot of YAML, and we apply it. This will take a few seconds. We can see Linkerd is — one, two, three, four, five, six, seven, eight, nine, ten — ten components right now, all of them pretty small. You can run these in HA mode, where each runs multiple replicas. And as I mentioned, we now support turning off Prometheus and Grafana, and we'll make things even slimmer in the next few weeks, so that the web components are optional too.

We'll give this a second to start up. But that alone won't do very much for us: we've installed Linkerd, but the application isn't meshed yet, and in fact we see that here, where there's only one container in each of those pods. To validate that Linkerd is coming up correctly, we can use the linkerd check command, which runs through a whole bunch of common problems that can happen during an install and validates that they aren't there.
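The flow I just narrated boils down to a few commands; a sketch, with a hypothetical cluster name:

```shell
k3d cluster create demo                # local cluster running in Docker

linkerd check --pre                    # validate the cluster before installing
linkerd install | kubectl apply -f -   # install emits YAML; apply it
linkerd check                          # wait for the control plane to be ready
```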
One of the biggest things it checks is that all the pods come up, so we'll wait for them to initialize. This will take hopefully not too long, a couple of seconds; I'm much more impatient about this when I have people watching. Okay, it looks like we're almost there.

I don't know if you noticed, but before we installed Linkerd, this host was at about two gigs of usage, mostly k3d and Docker. Now, with Linkerd installed, the largest part of Linkerd memory-wise is the Prometheus instance, which runs around 100 megs or so in this environment. And if we look at the proxies, we see them all running, most of them well under 20 megs. The one running with Prometheus, which talks to all of the pods in the cluster, is a little higher, around 20 megs.

Okay, so I've got that all set up, and now I'm going to upgrade my test. Don't worry too much about this; it's a Helm chart I use for configuring different topologies to test. Let's see: we're going to run at most three requests at once per load generator, and we're going to inject Linkerd. This inject setting really just adds a single annotation to the pod manifests so that the Linkerd controller knows to inject them. By default we won't add Linkerd to things; you can annotate either namespaces or the workloads themselves to get things running — I'll show the annotation itself in a moment. So we roll our test environment, and we see those pods restarting and coming up now with two containers. Let's see if this works: there's this check --proxy command, which again looks at the health and makes sure the proxies have all started, that they have service accounts in those pods, et cetera, so they can actually run.

And now we actually have Linkerd running in this environment, so let me open a dashboard. This is really what we get out of the box with Linkerd: we immediately have a whole bunch of metrics about what's going on in the cluster. We can take a high-level look at the namespaces; we see that the linkerd namespace and the ort namespace are the only ones injected, and we're doing around 2,500 requests a second in there, which makes sense. The other thing I've done is install a traffic split, which we'll see here. The load generator is talking to a single service called server, and that's being split equally across three separate services: server-0, server-1, and server-2. They all have the same weight, and they're all doing about the same amount of traffic. I've handcrafted some Prometheus queries here to show the TCP load generator; we haven't wired many of the TCP-oriented metrics into the dashboard yet — that'd be a great place for folks to help us in the 2.10 release — but we see about the same amount of traffic being spread equally across all of those connections.

And now we can actually modify the traffic just by editing this resource. I'll open the TrafficSplit resource, and I can say: send 80% to server-2, and the remaining 10% each to the others. This will take a couple of seconds, mostly for Prometheus to do some scrapes: Prometheus scrapes every 10 seconds, so the actual change is instantaneous, but it'll take a minute to show up in the dashboard.
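The injection annotation I mentioned a moment ago is just this; annotate a namespace (or a workload's pod template) and roll the pods. The namespace name is hypothetical:

```shell
kubectl annotate namespace myapp linkerd.io/inject=enabled

# Existing pods aren't mutated in place; restart them to pick up the proxy.
kubectl rollout restart deploy -n myapp
```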
Oh, one of the other interesting things we get here is a topology out of the box. We haven't had to configure anything, and we're not using tracing, but just from the metrics involved we can get a kind of call graph for your system. If we look at one of the server deployments, for instance, we can see all of the deployments that talk to it. As I mentioned, there's also a TCP load generator talking to it, but it's not showing up, because we need some enhanced metrics there, or some UI work to show those metrics. And if we look at, say, the gRPC load generator, the dashboard is actually doing tap: it's looking at and counting live requests. It shows that in the last 10 or 20 seconds we've been looking at this, we've been doing a lot more requests to the heavily weighted server and much fewer to the others. We also see that in our metrics here. And if we want to go even further, we can take all the load off of one of them, and in a few seconds we'll see — I think server-1 just dropped off the map there. As I mentioned, we ship a Grafana instance by default, so we can explore some of this directly from here. And we see in this request rate graph that this backend started at equal traffic, was reduced to about 10% of the traffic, and is now at 0%. That was server-1; server-2 shows the opposite. That's enough fiddling around with Linkerd, I think.

So, looking forward: there's a lot on the roadmap, and the community is working on a bunch of different things, not all of it at Buoyant. One of the bigger features we're working on for 2.10 is really minimizing the core control plane. TLS, service discovery and load balancing, and proxy configuration are really what you need to run the service mesh, so we want to pare the core down to only those concerns. Things like Prometheus, Grafana, and tap will become an extension you can add onto Linkerd with a single command, but they won't be part of that core control plane. This should also make it easier to upgrade the viz components without having to upgrade the core components. Ultimately we want that core set of functionality to change much less frequently; becoming boring is the goal.

Today, multi-cluster only supports HTTP traffic, so the big thing we're doing for multi-cluster in 2.10 is supporting all TCP traffic, including server-first protocols. That means we'll start doing mTLS for those protocols as well. It will require some configuration to enable, but it's just an annotation; it's not that heavy. As we saw in this demo, some of the TCP metrics haven't been wired through into the UIs — I guess I missed showing one of the better UIs where TCP metrics are used — and we also want to validate that TLS is actually working. We have this edges command, which lists all of the — in this case, deployments — that are talking to each other. You can also do this by pod, where you get a lot more. The cases here without identity are probably from when we were doing the restart: we didn't have service discovery for some of those connections, so identity couldn't be provided for them.
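The edges command looks roughly like this; the namespace is hypothetical, and the output includes client and server identities when a connection is mTLS'd:

```shell
linkerd edges deployment -n myapp   # deployment-to-deployment edges
linkerd edges pod -n myapp          # the same view, per pod
```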
But we want to keep enhancing a lot of this functionality with the new TCP data we have. There's a newer feature in Kubernetes, still not widely supported, called bound service account tokens. Today, every pod has a single service account token that it uses for everything, whether that's modifying Kubernetes resources or doing identity provisioning. The new beta feature allows service account tokens to be bound to different uses. What we'd really like is short-lived service account tokens that are used only for provisioning identity, so they can't be confused with anything else: if that token ends up somewhere else, it can't be used to get Linkerd identity.

Traffic policy, as I mentioned, is a big concern for us, and we've been pretty incremental about it. The goal is first to get everything TLS'd, then get that working multi-cluster, and once we have those core connectivity concerns covered, we can start thinking about policy in a more serious way. That's a big part of what I think differentiates Linkerd from other service meshes, especially Istio: we haven't started with all the features and then tried to productionize them all slowly. We've really focused on being production-ready and incremental in our approach, and that's what we're going to do for traffic policy as well.

There's a group of folks outside of Buoyant who are very eager to implement FIPS 140-2 support, which basically means government use: if you want to use Linkerd in government applications, our TLS implementations today are not validated to that standard. So some folks are working on letting you swap out the TLS implementation for one that is validated, and I'm really excited about that as well.

Right now, Linkerd only works within Kubernetes clusters. You can of course talk to resources that aren't in the Kubernetes cluster, but those aren't secured and managed in the same way, so we're really eager to let you add proxies to things that are not in Kubernetes, to support that use case as well.

And finally, there's proxy-wasm, a new WebAssembly API — WASM being neither web nor assembly — which is supported by Envoy, and we've started to experiment with what it would take to support that sort of thing in our proxy. I'm still a little conservative about adopting leading-edge technologies like that; from a security point of view, I think there may be some risk there. But ultimately it means you might be able to write plugins in any number of languages — JavaScript, Go, Rust — and have them just drop into the proxy and work without having to change proxy code, which is really promising. But don't let that stop you from getting involved in writing proxy code.

It's important to call out that not all contributions to Linkerd have to be code contributions. We have a community anchor program, which we just started over the past couple of months, where we want folks in the community who are getting value out of Linkerd and solving their problems with it to talk about that. We want you on the stage at KubeCon, telling your story, and also telling the good and the bad of what it's like to use Linkerd. If that's interesting to you, go to that URL, and we'd be happy to help with both the material and the opportunity to do those talks and blog posts.
And finally, it's a community project; I've said this many times today. The real value of Linkerd is that we've got a great set of folks who are always adding new things to it and helping us find bugs before we get to stable releases. We do edge releases basically every week and stable releases about every two months or so — this last one took a little longer. We'd love for you to get involved on GitHub; there's a "help wanted" tag where it's great to just chime in that you want to help and start working. We love code contributions. I also love it when people help others debug things on Slack: we have a lot of new people coming in and asking questions, and it's great to have people who've already been there answering them, so it's not just a few of us all the time. We've got mailing lists, we've got monthly community calls, and we have the security audits, which the CNCF arranges for us. With that, I'd love to hear what questions you have, if any.

Yep. Let's go ahead and drop all the questions into the Q&A box, and Oliver, you can take it away. We have about 10 minutes left, so we'll get through as many as we have time for. Do you want me to read them for you?

Nope, I've got it. Just catching my breath. The first question, from Sanjay, is: what's the equivalent of strict and permissive mTLS modes in Linkerd, compared to Istio? I sort of answered this earlier: we don't have a strict mode today. There are a number of gotchas in the way of that, so we're working on adding policy; I want to get the first pieces of that into the 2.11 release. Today we're opportunistic about adding TLS, so we're effectively in permissive mode, and we provide auditing tools to let you debug and alert on that yourself.

Next question: with the 2.9 release, has Linkerd been optimized for smaller ARM devices like the Raspberry Pi? It's a great question, Prené. I wouldn't say it's been totally optimized; we haven't spent a lot of time in the ARM environment yet, so it's somewhat experimental at this point. But our focus in general is on being really lightweight. We're minimizing the control plane, so you have a very light install footprint, and the proxies themselves are quite small. If we go back to this environment, which is running our load tests: we started around two gigs and we're still under three gigs, and most of that is Prometheus, right? The largest proxy is Prometheus's, at 28 megs; everything else is 21 megs or lower, which is really, really lightweight, especially if you compare it to other service meshes. So I think we'll do better on a Raspberry Pi than some of the others, but there's probably more work to do there. We'd love help.

And Sanjay also asked: what's the plan for implementing authorization rules in Linkerd? We're targeting something like the SMI traffic policy API. SMI has a traffic policy CRD, but it doesn't quite fit our model of the world, so we're probably going to submit another revision and work with them to rev it. That design is probably kicking off in earnest in December; we need to get through KubeCon and get some of the 2.10 work in place, but we'll start designing it toward the end of the year, and once 2.10 goes out, we're going to start working on it full time. It'll be the big feature in 2.11, and I'm really looking forward to it.
And I would love input if you have pain points you've seen in the existing implementations, in either SMI or Istio; we'd love to hear what's worked well for you and what hasn't.

John noted that I said HTTP/2 multiplexing reduces network load, and asked whether it reduces the number of ports used in a cluster, maybe preventing port exhaustion. It definitely can reduce the number of raw sockets and file descriptors a proxy uses. Because we're proxying TCP connections, if we weren't doing anything smart here we'd really double the number of connections in a pod: we have to terminate the connection when we accept it, and we have to establish new connections on the way out. Multiplexing lets us work around that cost. It means, one, we use far fewer ephemeral ports on the outbound side; on the inbound side it isn't really port exhaustion, but it's a similar problem, where we try to reduce the number of sockets and file descriptors the proxy uses. And as we've seen in testing, that can really have an impact on memory consumption. So yes, our goal is to minimize the operating system resources we need to do these things.

Any other questions? I've gotten through them too quickly, I think. Does anybody else have anything? There we go: does Linkerd support network policies, similar to Calico, or should one just use Calico? Today, I would say use both. Defense in depth is a better approach than having just one layer, so it's good to use network policies where you really want to enforce where traffic can and can't go. But I think it's also important to use things like mTLS for the zero-trust aspects of securing communication in flight and things like that.

CMS asks: can we segregate traffic between namespaces? For example, all namespaces can talk to Linkerd and cert-manager, but not to others. This is a great feature request, and it's definitely a use case we'll consider as we implement traffic policy; it's an obvious kind of policy you'd want to express. I think today the way to achieve that would be a layered approach, something like network policies layered with Linkerd — I'll sketch that below. It's on the roadmap, but it's not a full Linkerd feature today.

Excellent. All right, you have time for one more question if someone wants. All right, well, you can also jump into our Slack; I'm happy to answer questions there, and I'll be around for some of the afternoon. Oh, we have one more popping up. One more question, okay: are there plans to have beginner and intermediate coursework? Great question, Conrad. I believe a free CNCF course just went live. I don't have the link handy, but yes, there is coursework available through the CNCF; I believe it's free, and it should be focused on beginner and intermediate for sure.

Good question from Jindong: is there a downside to HTTP/2 multiplexing? It can add latency and CPU usage in some cases, so it's really a trade-off between the memory footprint and socket overhead on one hand, and some CPU and a little bit of latency on the other. We see higher tail latencies in the HTTP/1.1 case, but slightly higher average or P50 latencies in the HTTP/2 case. So there is a little bit of a trade-off there — that's a really good question — but we think HTTP/2 is a better default for the lightweight approach. Okay.
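Coming back to that namespace-segregation question, here's the layered-approach sketch I mentioned, using a plain Kubernetes NetworkPolicy. All names and labels are hypothetical, and enforcement depends on a CNI that implements NetworkPolicy, such as Calico; Linkerd's mTLS then secures and identifies whatever traffic is allowed through:

```shell
cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-platform-only
  namespace: myapp
spec:
  podSelector: {}            # applies to every pod in the namespace
  policyTypes: ["Ingress"]
  ingress:
  - from:
    - namespaceSelector:     # only admit traffic from labeled namespaces
        matchLabels:
          team: platform
EOF
```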
Thanks so much, Oliver, for a great discussion, presentation, and Q&A. That's all the time we have for today, everyone, and thanks for joining us. The webinar recording and slides will be online later today, and we look forward to seeing you at a future CNCF webinar. Everybody have a great day. Bye-bye.