Hi everyone, I'm William Morgan. I have an official title these days. I used to be the person who speaks a lot about Linkerd; now I have an official role in the project, which is called director. So I'm officially in a markdown file in a GitHub repo with that. Thank you all for coming. This is our KubeCon EU project update talk. I'm gonna try and keep this pretty high level and pretty informal, and there'll probably be lots of space for questions at the end, so feel free to come up afterwards. And if I don't get a chance to answer your questions myself, a bunch of the Linkerd maintainers are hanging out at the CNCF project pavilion; there's a Linkerd booth there. So please come say hi to us, we'd love to talk to you.

All right. I've titled this talk, and I think the title has changed a little bit since what's officially on the schedule: I'm gonna talk about VM support, I'm gonna talk about egress, I'm gonna talk about SPIFFE, and then I'm gonna talk about more. So hopefully this is what you were expecting. Let's go ahead and get started. I'm gonna warn you, I've got two or three slides in here that we've been using for probably eight years. Sometimes we'll update the numbers and stuff, and one day we'll give these another bit of polish. Just imagine a bunch of beleaguered open source nerds frantically trying to put slides together at the very last minute. Definitely not what happened with this talk, but you can imagine that happening maybe for some of these other talks.

So what is Linkerd? It's a service mesh. I do have one slide about what a service mesh is, so don't worry if you've never heard that term before; get ready for a thrilling journey. Created originally by a company called Buoyant. Eight-plus years in production, almost 10,000 Slack members. So if you have a moment, go to slack.linkerd.io; you don't have to say anything, just log in. That'd be great, because I'd love to finally get that Slack to 10,000 people. Lots of GitHub stars and contributors and things like that. And we've been a CNCF project since almost the beginning of the CNCF; I think we joined in 2016 as the fifth project. They didn't have incubation back then, they called it inception, so we joined as an inception-level project.

All right, but ultimately I like to think about this in terms of a job. What job does Linkerd have? Linkerd's job is to give every platform engineer in the world the tools they need to create a secure, reliable, and observable cloud native platform. Now, Linkerd is pretty Kubernetes specific, so if you're not using Kubernetes, it can't really do its job. But if you are, this is how I think of what we're trying to provide to you: we want to give you the ability to build a platform on top of Kubernetes that has those three properties. Secure, reliable, observable. Linkerd is not a complete solution; we can't fix every security aspect of running a modern cloud native application, but we can do a chunk of it, and same thing for reliability and observability. So that's our job.

All right, as promised: what is a service mesh? It's an infrastructure layer that provides those three kinds of properties at the platform level, and I'll talk about some of the features we have that fall into each of those buckets.
You can think of it as an L7-capable network. That's debatable from the network engineering point of view, but it's a nice model. It is uniform across your entire application, so we don't require you to make application changes. And the way that Linkerd works specifically, because there are different implementations, is that we have these proxies. We call them micro-proxies because they're really small and they're really specific to Linkerd's use case. We stick them right inside the pods, which is called the sidecar model, and they handle all the traffic to and from those pods. Then we've got a set of other processes that live in a namespace somewhere, called the control plane. So those proxies are the data plane, and the control plane gives you the ability to manipulate those proxies as a whole. In the olden days, the thought of deploying 10,000 proxies was horrifying. The magic of Kubernetes is that we can do that and make it usable for you. All right, is this making sense so far to everyone? I see nodding heads. Good.

Okay, one talk happening at this conference that I wanna call out right off the bat: I'm gonna talk a little bit more about Rust in the proxy, and that's been the interesting story for Linkerd so far, but we've actually started to introduce Rust in the control plane as well. That's a little more interesting in the sense that it's a part of Linkerd that has to interact with the Kubernetes API. The proxies themselves are totally independent of Kubernetes; they don't know anything about it. They're specific to Linkerd, so they only talk to the Linkerd API. They're not general purpose proxies, they're not like Envoy or Nginx, and they don't know anything about Kubernetes. In the control plane, of course, your whole job is to interface with Kubernetes, so we started to do that in Rust. It has been complicated, let's say. It's been fast, safe, insane, apparently. So please, if you have time tomorrow, catch this talk: Matei, one of our Linkerd maintainers, will be talking about some of the work we've been doing there.

All right. If I were to describe the design philosophy of Linkerd: we wanna do less, we wanna make it simple, especially simple to operate. Our goal is, if you are the poor, beleaguered SRE or platform owner or whoever is tasked with building a platform on top of Kubernetes, we wanna give you something that basically just works out of the box. It shouldn't cause a crazy amount of resource consumption, it shouldn't require you to wake up at three in the morning, and you should be able to build a mental model of how Linkerd works, so that when Linkerd does something unexpected, you can understand why it's happening. That's the goal; we're never perfect, but that's our design philosophy, and it's informed a lot of the way we've developed features, the architecture, and even the choice to use sidecars, which is, I wouldn't say controversial, but it's not the only option in the modern service mesh ecosystem.
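To make "simple to operate" concrete, here's a minimal sketch of how workloads get meshed. The linkerd.io/inject annotation is the real mechanism; the namespace name is just an example.

```yaml
# Mesh every pod created in this namespace: Linkerd's mutating
# admission webhook sees the annotation and injects the linkerd-proxy
# sidecar automatically. (Namespace name is illustrative.)
apiVersion: v1
kind: Namespace
metadata:
  name: emojivoto
  annotations:
    linkerd.io/inject: enabled
```

That's essentially the whole operator-facing surface for the data plane: annotate, roll your pods, and the proxies come along for the ride.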
All right, so what makes Linkerd unique? Many things, but one big one from the architectural standpoint is that we build it on top of these Rust micro-proxies, custom built for Linkerd. It's a part of the stack that we own and control, and Rust gives us a whole bunch of cool properties. If you're a languages nerd: Rust has a very powerful and sophisticated type system. That probably doesn't impact you in a direct way if you're a consumer of Linkerd, but if you're a programmer, it's cool. More importantly, Rust compiles to native code, so we can make these proxies as fast as possible, as fast as C or C++, and that's really important for a proxy. Go is great, and there are lots of components of Linkerd on the control plane side that we do in Go, but on the proxy side we really need something like Rust, so we can be as fast as computationally possible. It also gives us really nice security guarantees. The whole point of Rust, the reason the language was invented, more or less, was so you could have a language that's as fast as C and C++ but that circumvents an entire class of memory safety vulnerabilities: buffer overflow exploits, the whole set of memory management bugs that Rust can prevent for you. If you're building a security-focused product, that's a really powerful property, because you get to avoid a whole bunch of the classic issues you see with languages like C and C++. The other thing that I think is kind of cool is that the state of the art in user-space networking is all happening in Rust. These libraries are where the brightest minds of asynchronous user-space networking go to spend their time, so we get to take advantage of really cool stuff happening under the hood.

Ultimately, though, from the operator perspective, we really want this proxy to be an implementation detail. We don't want you to have to think about the proxy as a new operational component that you have to care for and feed and maintain. If you're familiar with the good old days, where we had our Nginx in the front and then our Rails app and our database, each of those components required a lot of care and feeding. We'd like the proxies in Linkerd to not have that property, and we get 99% of the way there; they should be, as much as possible, an implementation detail. All right, how's everyone doing out there? Lots of thrilled faces. Okay, excellent.

All right, I mentioned this word very briefly: sidecars. I'm not going to talk too much about sidecars versus eBPF versus ambient versus whatever. I included that in earlier versions of this talk, and I've got a blog post with the results of our analysis. But there is a talk that Matei, a Linkerd maintainer, and Mike Beaumont, who I think is a Kuma maintainer, are doing about sidecar containers in Kubernetes. That's happening on Friday, and I'd encourage you to go visit it. There's a little bit of service mesh content in there, since these are two service mesh projects, but what I think is more exciting is that sidecars are a new official part of the Kubernetes API as of only last year. We've been talking about sidecars since, like, 2015, but it's always just been a deployment model, a model that says: you stick a container next to another container. Now, for the first time, there's an official sidecar container mode in Kubernetes, and that fixes a bunch of the warts involved with plain sidecars, especially around things like Jobs and containers that are expected to terminate.
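For the curious, here's a minimal sketch of that native sidecar mode (Kubernetes 1.28 and later): a sidecar is simply an init container with restartPolicy set to Always. The image names are hypothetical.

```yaml
# Native sidecar: an init container with restartPolicy: Always starts
# before the app containers, keeps running alongside them, and is shut
# down after they exit, which is what fixes the Job-termination warts.
apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  initContainers:
  - name: proxy
    image: example.io/proxy:v1   # hypothetical image
    restartPolicy: Always        # this marks it as a native sidecar
  containers:
  - name: app
    image: example.io/app:v1     # hypothetical image
```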
So there's cool stuff in that talk, please go check it out. Okay, I don't think this talk is gonna be super long, but we'll do our best.

All right, the Linkerd year in review. Imagine the year starts with KubeCon EU 2023 and ends with KubeCon EU 2024; I think we had three major releases in that timeframe. The first was Linkerd 2.13. In this release we added what we call dynamic request routing. This is routing of traffic based on the specifics of HTTP and gRPC requests: you can now route individual requests based on headers, based on verbs. Not based on the body; we don't dig into that. And this was the second release where we started relying very heavily on the Gateway API. There are probably 17 other talks at this conference about the Gateway API, so I'm not gonna repeat too much about what it is, but it's a cool set of CRDs and an API that's built into your Kubernetes cluster today that gives you fine-grained control over traffic matching, traffic routing, and things like that. The Gateway API is still evolving, especially in the service mesh space, and we're tracking that evolution. I think the vision for Linkerd is that at the end of the day you're basically configuring Linkerd almost entirely through Gateway API primitives. Then you have some really nice advantages: A, the types are on your cluster already, which is nice, so we don't have to install a whole bunch of new CRDs; and B, it's the same configuration that you could potentially be using for your ingress, and maybe for egress control as well. So there's this glorious YAML-based future you can all envision, where you're using Gateway API types to control every aspect of L7 traffic routing in your cluster. We're not quite there, but we're taking steps in that direction. We introduced Gateway API types in Linkerd 2.12 for some of the authorization policy stuff; in 2.13 we started using those same types for request routing. There are some cool examples of things you can do with this. Usually you have to do a little work to get to the point where you're doing things like sharding per region, because of the nature of how the Gateway API types work, but it's now possible in a way that it wasn't prior to 2.13.

Okay, the other thing we added in this release was circuit breaking. This is the ability to know when an endpoint is failing and to stop delivering traffic to it. And remember, Linkerd is operating in the L7 world, so by failure we don't just mean it's refusing connections, although that's a class of failure we can detect; we mean it's returning 500s. So if you're talking to an endpoint that's overloaded and starts returning 500s, rather than adding more load and making things worse by retrying or whatever, we can short-circuit that. We can say: let's back off, let's not send any traffic there, let's let it recover, and every once in a while we'll try it, and when it's healthy again we'll gently ease traffic back onto it.
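Here's a hedged sketch of what those two 2.13 features look like in practice. The HTTPRoute shape is standard Gateway API; the failure-accrual annotation names are recalled from the 2.13 docs, so verify them against your version. Service names, header names, and values are made up.

```yaml
# Dynamic request routing: requests carrying "x-canary: true" go to a
# canary Service; everything else falls through to the primary.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: web-route
  namespace: demo
spec:
  parentRefs:
  - group: core
    kind: Service
    name: web        # the in-mesh Service this route attaches to
    port: 8080
  rules:
  - matches:
    - headers:
      - name: x-canary
        value: "true"
    backendRefs:
    - name: web-canary
      port: 8080
  - backendRefs:
    - name: web
      port: 8080
---
# Circuit breaking: mark an endpoint unhealthy after consecutive
# failures, back off, then ease traffic back. (Annotation names from
# memory of the 2.13 announcement; double-check before use.)
apiVersion: v1
kind: Service
metadata:
  name: web-canary
  namespace: demo
  annotations:
    balancer.linkerd.io/failure-accrual: consecutive
    balancer.linkerd.io/failure-accrual-consecutive-max-failures: "7"
spec:
  selector:
    app: web-canary
  ports:
  - port: 8080
```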
All right, so that was 2.13. Remember, there were three releases in this past KubeCon-EU-centric year. In 2.14, we introduced what we're calling flat-network multi-cluster. Linkerd has had multi-cluster ever since 2.9 or so, in the olden days, and the way that worked is we had a gateway component that sits on the destination cluster. And by multi-cluster, of course you can always run multiple Kubernetes clusters; what we really mean here is communication between Kubernetes clusters. Workload one should be able to talk to workload two, and it should do it in a way that's secure, in a way where you can control the traffic, and in a way where workload one doesn't know where workload two is. The application is decoupled from the cluster topology. That should all be controllable at runtime, and you should be able to add a new cluster and gradually ease traffic over, and all that fancy stuff. That's always been the case, well, since Linkerd 2.9, and we did it by adding this gateway component: workload one sends traffic through the gateway, and then it hits workload two.

But if your underlying network allows pods to route traffic to each other anyway, then we don't really need that gateway component. It's another thing to run, another source of latency. So we added the ability for you to have direct pod-to-pod communication across clusters, while still preserving all those same properties: mutual TLS for security, and the dynamic request routing we just talked about. And the reason this came in 2.14 is that we started seeing a lot more planned use cases for multi-cluster Kubernetes. In the early days, the majority of the multi-cluster use cases we saw had kind of evolved: you started with one cluster, and then some other team added another cluster, and, oh shit, now these two things need to talk to each other. Or you acquired a company, and that company had their clusters, and so you got into a state where those clusters were naturally running in different networking environments or different clouds or whatever. Now, a few years on, we're seeing a lot more companies saying: we're gonna deploy 70 clusters, and we're gonna do it on purpose, because we want the high availability or whatever it is. In that world there tends to be a lot more planning behind the underlying networking infrastructure, so you see a lot more of these shared flat networks. So this is a nice property to have. Am I making sense so far? Okay.
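As a hedged sketch of what this looks like day to day: once two clusters are linked, you choose which Services cross the boundary with a label. The label key is real; the remote-discovery value for flat-network mode is from memory of the 2.14 announcement, and all names are made up.

```yaml
# Export this Service to linked clusters. In 2.14's flat-network mode,
# the label value tells Linkerd to resolve the mirrored Service in the
# other cluster directly to pod IPs, skipping the gateway hop.
# (Verify the exact label value against your Linkerd version.)
apiVersion: v1
kind: Service
metadata:
  name: web
  namespace: demo
  labels:
    mirror.linkerd.io/exported: remote-discovery
spec:
  selector:
    app: web
  ports:
  - port: 8080
```

Callers in the linked cluster then see a mirrored Service (something like web-west), and the application stays decoupled from the cluster topology.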
I talked about the Gateway API. I'm not gonna get too far into the gory details, though if you love API conformance and all that stuff, there are plenty of other people here to talk to about it. Like I said, that's our goal, and we did achieve conformance in 2.14. It's a moving target, and we're gonna keep up with that moving target and move us all towards the glorious YAML-based future I alluded to.

All right, and then finally, the most recent release was 2.15, which adds mesh expansion, also known as VM support, though really it's support for non-Kubernetes workloads generally. They don't have to be on VMs; they could be on physical machines, they could be on whatever. Mesh expansion is basically the ability to run the Linkerd data plane outside of Kubernetes. The control plane still has to run on Kubernetes, so we're giving you something that's still a Kubernetes-centric tool. I didn't even talk about this at the beginning, but the way that you configure Linkerd, the way you interact with it as an operator, is all through CRDs on the Kubernetes cluster. You're not calling some HTTP endpoint and making a POST; you're updating custom resources or annotations on the cluster, and that's how you configure Linkerd's behavior. In this world, you can now run the proxy outside of Kubernetes, connect it back to your control plane, and get the same properties we've been giving you for pod-to-pod communication, now for pod-to-external communication, and sometimes back to internal as well.

We had to solve a bunch of interesting problems to do this. In Kubernetes land, Kubernetes provides a whole lot of stuff we can use. For example, it provides things like service account tokens, which we can use to bootstrap identity. Once you're in VM land, you don't have anything; you just have a process running on a machine. Kubernetes provides the ability to transparently inject the proxy into a pod based on an annotation, because that annotation triggers a mutating admission webhook, and so on. In VM land, you don't have any of that; you just have a process running on a machine. Kubernetes also gives us some guarantees around our ability to manipulate L4 traffic, so we can make sure that all TCP communication gets routed through the proxy in the pod. In VM land, we don't have any of that either. So there were a bunch of problems to solve here. One interesting aspect, the identity component especially, is solved for us by another CNCF graduated project called SPIFFE, and SPIRE, which is the reference implementation of SPIFFE. So we adopted SPIFFE for identity for those workloads. There's a lot more interesting stuff we can do with this in the future, but it's a starting point. We introduced a new ExternalWorkload CRD that allows you, the operator, to represent the VMs in the mesh. You can select VM workloads through label selectors, you can transparently route the traffic, and you get a lot of the nice properties you get from multi-cluster. So that's pretty exciting. The asterisk I'll put on here is that this is Linux VMs only; we don't have Windows support yet. Might be coming.
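A minimal sketch of that ExternalWorkload resource, representing a VM (or bare-metal host) in the mesh. The field names here are recalled from the 2.15 docs and may not be exact; the identity, IPs, and ports are made up, so treat this as illustrative only.

```yaml
# Represents a non-Kubernetes workload in the mesh. Services can select
# it by label, and the control plane issues it a SPIFFE-based identity.
apiVersion: workload.linkerd.io/v1beta1
kind: ExternalWorkload
metadata:
  name: legacy-api
  namespace: mixed-env
  labels:
    app: legacy-api          # selectable via ordinary label selectors
spec:
  meshTLS:
    identity: spiffe://root.linkerd.cluster.local/legacy-api  # SPIFFE ID
    serverName: legacy-api.mixed-env.cluster.local
  workloadIPs:
  - ip: 10.0.15.31           # where the VM's proxy receives traffic
  ports:
  - port: 8080
    name: http
```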
All right, and if you like SPIFFE and you like mesh expansion, we've got another great talk tomorrow from Zahari, one of our Linkerd maintainers. I'd encourage you to go check that out, because that's where you get into the gory details and the technical trade-offs. I'll leave that up for one more second.

All right, the other big announcement we made with Linkerd 2.15 is that the project is no longer going to publish stable release artifacts. For the past eight years we've been publishing stable release artifacts; we're gonna continue publishing the edge release artifacts, but stables are now left to the vendor community. The edge release artifacts have all the code in the main repo up to the point where they were cut: all the bug fixes, all the security remediations, all the latest features. We run edge releases ourselves in production. They're great, they work wonderfully, and we would love for you to run these edge releases and help us get a really fast cycle of reporting bugs, fixing them, and speeding up the pace of project iteration. The primary reason, speaking from the project perspective, is that we want to improve the pace of development. In Linkerd, release engineering is a lot of work; anyone who's been in that role knows just how difficult it is, especially when you get into the cherry-picking and backporting of changes. That work is orthogonal to what we were doing in Linkerd to develop features, and we need that rapid feedback loop. And ultimately there's a vendor community around Linkerd that we think can do a better job with stable release artifacts. So that's a change that, for many of you, will require a little thinking through how you want to upgrade Linkerd in the future. I'm happy to answer questions about that, either in person or in front of the crowd.

All right, what's coming next? This is my tentative roadmap, and I put "tentative" on there because we are a pretty nimble project and we can reorder stuff, especially when we have really clear, compelling use cases, when we have users who say: this would solve a big problem for us. Those are all ways in which this roadmap has shuffled. One thing we've gone back and forth on is the prioritization of egress versus ingress. Where we've landed is that the next major release, 2.16, will be focused on egress control. Ingress is a much harder problem; we're gonna start some of that design work now, but I think we're gonna land it in 2.17. I would expect both 2.16 and 2.17 to land this year, so we're not talking 2025, but that's the ordering we have so far.

There's also a little bit of cleanup. If you read through my 2.15 announcement, we got some stuff out the door as pragmatically as we could, and there's a little cleanup to do in terms of feature parity. Gateway API parity: if anyone's using ServiceProfiles, we're in this in-between situation where some features are on ServiceProfiles, some are on the Gateway API types, and you can't really mix them together in certain circumstances. It's a little ugly, and we'd like to get out of that as soon as possible. We know the scope of work there and we're gonna do it. Mesh expansion for private networks: I talked about the shared flat network, which is the use case where mesh expansion works today. When you have private networks, there's a gateway component we need to add. Again, the work is scoped and we know what we have to do. There are a couple of other little cleanup things that'll happen in the 2.15 timeframe. In 2.16, IPv6 support; we should have had that a year ago, but if you're watching the GitHub repo, you'll see some of those PRs landing already, so that's in the late stages of execution. And then egress is what I think will be the big-ticket feature for Linkerd 2.16. The way it works today is that Linkerd will happily route any traffic you want outside of the cluster. It doesn't give you great metrics for that traffic, for maybe-not-ideal reasons, and it doesn't really give you great control over that traffic either.
Both of those things end up being really important, especially in security-conscious environments. So we're gonna fix that: we're gonna give you all the metrics and control and everything else you'd expect from Linkerd for traffic leaving the cluster. And then ingress, hopefully everyone knows what that means: traffic coming into the cluster. The proxy at this point is certainly capable of handling this in terms of load and things like that, but ingress is a pretty large feature set, so we have to think through what portion of it we implement first, what comes later, and where it sits on the roadmap. There's a wide spectrum from "I can relay a TCP connection" to "I can replicate everything Nginx does," and we're gonna be somewhere in that spectrum. But I'm really excited for that, because once we have 2.17 in place, we have a really compelling story for traffic throughout the entire cluster: coming in, flowing around within the cluster, and then exiting.

All right, and that's it. This is another slide that's been around since 2018: please get involved. It's all just as true now as it was then. Development is all happening on GitHub, everything's Apache 2.0, it's a CNCF graduated project; none of that has changed. We've got a community at slack.linkerd.io, we've got mailing lists, we've got security audits. So come on in and join us. I'd love to have more folks involved in contributing, in testing, and in helping other users.

All right, with that, let's turn it over to questions. We have eight minutes for questions, and I have five chairs up here. I'm kidding. There should be a mic on either side of the hall if anyone wants to come up and ask a question.

Q: Hi, William. About that change in 2.15: putting on your open source hat, if I'm a user, how would I go about building release artifacts for 2.15? I'm not asking for the full answer, but where is 2.15? It's not a tag yet, is it?

A: Right, there's no tag for 2.15. There's an edge release that basically is what would be tagged, so we've considered just tagging that and getting on with it. Today, if you wanted something called 2.15, I would take any of the edge releases from the announcement onward; you can consider that 2.15. You can consume those directly, or, since the entire build process is open source, you could take that build process and funnel into it whatever portion of the code you wanted. Does that make sense?

Q: Yeah. Thank you.

Q: In our network setup, we are routing traffic between services via a cluster-external load balancer. Would mesh expansion help us track what's going on, or are we just doing it wrong?

A: In your cluster... say that again, external load balancer?

Q: We have two services in our cluster, but they talk to each other via a cluster-external HAProxy. Are we doing it wrong, or can mesh expansion help us find out which service is talking to which?

A: This is within the same cluster?

Q: No.

A: This is across clusters? Okay, but they're both Kubernetes clusters?

Q: The services are in the same cluster, but they talk via an HAProxy, which is outside of the cluster.

A: Yeah, if you can remove that HAProxy entirely, I don't think you would even need mesh expansion.
A: Can they just talk directly to each other, or is there a reason they go through that HAProxy? That's the question.

Q: We would need to rework everything, because it's kind of old.

A: Yeah. So mesh expansion would not be helpful for that. Since they're in the same cluster already, I think the work would be to remove that HAProxy. What mesh expansion would allow you to do is run a Linkerd proxy in front of that HAProxy, but I don't think that solves anything for you.

Q: Okay, food for thought. Thank you.

Q: Hi, thank you for your presentation. I have one question. I see new features, new developments; is there going to be support for OpenShift deployments of sidecars in unprivileged mode? OpenShift has opinions about running workloads: you can't run them with whatever UID you want, you can't run them privileged by default, and so on. Are there any plans to support OpenShift?

A: Yes, I would like to support it. I think we support it fairly well today, but it seems like we keep finding new issues with OpenShift. Not with open source, that's fine; with OpenShift. From my perspective, if we don't support OpenShift, that's probably a bug in Linkerd.

Q: But currently, I know that I either need to run the container in privileged mode to use iptables, because the network plugin doesn't work, since OpenShift handles that in its own way. And the Linkerd proxy also wants to run with a specific UID, as far as I remember.

A: Okay, so there is a specific issue today that's preventing it from working on OpenShift.

Q: Yes, that's what I'm asking.

A: I would say come find me and I'll grab an engineer, and let's figure it out. Linkerd should have support for OpenShift. From my perspective, we should fix all that stuff and make it work.

Q: Okay, thank you.

A: You're welcome. All right, we have three more minutes; I think that's time for one more really juicy question. Exciting, compelling. I see someone stepping up to the mic. All right, final question, make it good.

Q: Hi. I've seen that 2.17 is about ingress. Do you have any insight into whether you're going to use Pingora from Cloudflare?

A: For ingress, are we going to use what from Cloudflare?

Q: Pingora. It's a Rust library that Cloudflare released at the beginning of the year, and they've done pretty good work with all the libraries you already use: Tower, Hyper, et cetera. I was thinking maybe you might add it to your stack.

A: That's a great question. I have no idea what the answer is, but yes, there is a cool Cloudflare library built on top of Tokio, Tower, h2, the same stack we're using already. Can we use it? I don't know, but I know who would know. So thank you, great question.

All right, folks, thank you very much. I'll be around for a little bit, and I'll be at the Linkerd booth. I really appreciate you coming all the way here. Thank you very much.