All right. If you haven't seen it yet, we're going to be doing live hands-on stuff with everybody today, so you're going to need a Kubernetes cluster, and it'd be convenient if it wasn't running on your laptop. The folks at Civo are giving away credits for people who sign up at that link, so if you're planning on doing this and you need the link, raise your hand; otherwise we'll move on pretty quickly. The actual live workshop portion uses that GitHub repo on the slide. You'll need to clone it, but you can also visit the URL on GitHub, where the instructions live in a nice friendly README, because we're all going to be typing stuff in from that README. If you've cloned it and you've got a cluster, can I get a thumbs up? All right, there's one thumbs up. OK, there's another.

When we get into the workshop portion you're going to want the kubectl CLI, the helm CLI, and maybe the Civo CLI if you're using that cluster. If you're not, you'll need Docker and k3d or kind; really, just some way to spin up a Kubernetes cluster. There are also a couple of other commands, like jq and OpenSSL, which most people tend to have. They're not strictly necessary to complete all of this; you can finish without them. They just let you inspect the certificates that you get, which is kind of neat. And we've got someone who'll help you understand what the OpenSSL output actually means, and it's not me. So with that, you want to get the slides? Sure. Let's do it.

All right, welcome. Hello, and welcome to Zero Trust Networking in Practice with Jason Morgan and Ashley Davis. So I'll go first. Hey, everybody, I'm Jason. I'm a technical evangelist for Buoyant, the company that makes the Linkerd service mesh, so it's my job to tell you why you should use Linkerd and why Linkerd is wonderful. We've got a booth if you want to stop by with more questions later, or I'll also be around after this. You can find me on Twitter at Jason Morgan, on GitHub if you want to find me for some reason, and as @Jason on the Linkerd Slack.

Hi, everyone. I'm Ash, or Ashley, I don't mind either way. I'm a software engineer at Jetstack, which is the originating company for cert-manager, and I'm also a cert-manager maintainer. I'm a TLS-certificates-and-cryptography kind of person; I'm in that mold. I've really been looking forward to this. It's my first in-person event like this, so it's a bit of a milestone for me, and it's really cool to have people here. My social media links are there if you want to get in touch after this, and I'll talk a bit more about the cert-manager and Jetstack booths after we finish the workshop.

All right, so let's start with why use cryptographic identity. The theme we're going to keep coming back to is that when you're trying to secure traffic between workloads in a Kubernetes cluster, we think the only thing you should be relying on is the cryptographic identity of the pod. I think that's really important, and it's fundamental to being able to communicate securely within a cluster.
So to start with TLS: what we've built on TLS is effectively the underpinning of the internet; it's how we securely access websites. It's extraordinarily hard to forge, and it gives a workload a way to prove it is who it says it is without having to transmit its private key. It relies on delegating trust to third-party authorities: essentially, the source of trust for our applications becomes a chain of certificates rooted in something that every workload can rely on. And of course, you can't visit your bank, or Netflix, or look at these slides, or access GitHub, or do any of this stuff without TLS. So we feel it's a pretty good foundation to build on.

And with that, we're going to ask Ashley. When we're talking about these certificates, why should we use cert-manager? I can run bash scripts and create certificates. Well, I can't use OpenSSL, actually, but I can copy and paste some stuff. So why should I be looking at cert-manager?

To be honest, I'm not sure there are many people in the world who would say they can use OpenSSL; it's a pretty tricky tool. So yeah, we asked earlier if anyone had used cert-manager before, and a few hands went up, which is what we'd expect, because a lot of people have heard of cert-manager. It's almost like a standard part of a Kubernetes cluster. It's not standard, but people treat it like that: they install it at the beginning, when they're setting up a cluster, and that's super cool. The idea, if you've not heard of it before (and you don't need to have, to get through this workshop), is this: if you've used Kubernetes, you've written YAML for something like a pod or a service or a deployment, specifying what you want. Cert-manager does the same thing, but for certificates and issuers and the concepts around them, and it ends up putting your certificate into a Kubernetes Secret where your application can use it. There are plugins for just about any way you might want to issue certificates, which makes it quite helpful, because whatever you need probably already exists: AWS, GCP, ACME and Let's Encrypt, it's all there. And yeah, a lot of GitHub stars, which some people really care about, and it is always nice to see the number going up.

One thing that's really cool is that we actually just hit incubating in the CNCF. We'd been a sandbox project since mid-2020, I think, if I remember the history right, and we made incubating in October this year. It's very recent; it still feels like an announcement every time I say it. It took a lot of work to get there, but it's a really cool milestone for us moving forward.

Yeah, super exciting stuff. All right, so we've talked a little about cert-manager, and an important concept in cert-manager is what actually lets us get our individual certificates. Do you mind sharing a bit about issuers in cert-manager and how they relate to what we're doing today?

I would love to. If you've used cert-manager you'll be familiar with issuers, so I'm not going to deep-dive on them, but issuers are how you get your certificates: you specify how it is that you obtain whatever certificate you need. We broadly split them into two types.
There's the type most people actually use cert-manager for, which is getting certificates from external APIs, like Let's Encrypt. That's probably what you want for a public-facing certificate, and it's also probably what you want if you store your CA certificates, your issuing certificates, outside the cluster in something like HashiCorp Vault or Venafi TPP. But cert-manager can also do in-cluster issuance: it has all the tools you need to create and sign certificates and keep all of that stored within your cluster. I'm not necessarily saying that's what you'd always want in a production environment; having your root external to your cluster is actually a pretty good idea, and that's probably what I'd recommend if you were doing this for real. In this workshop, though, we'll be using the in-cluster issuers, because they're super simple and have no external dependencies beyond cert-manager itself, which we need anyway. But as I mentioned earlier, there are plenty of external issuers available if you want to play around with them.

OK. And with that, let's talk about what we can do with these certificates, and give a quick overview of what mutual TLS, or mTLS, is. With regular TLS you have a client and a server. When you go to GitHub, your browser is the client and GitHub is the server. When you build the connection, you verify the identity of the server, then establish a secure session using its public key, and that gives you an encrypted channel between you and GitHub, which is pretty handy. Mutual TLS adds a second step: instead of just the client identifying the server, the server also positively identifies the client. In this case, both sides of the conversation know for sure who they're talking to, and they have a secure link over which to talk, which is pretty nice. Since we're already doing a TLS handshake, mutual TLS doesn't really add a lot of overhead. The reason you don't use mutual TLS when you talk to GitHub is that GitHub doesn't care about client certificates; it can't know who everybody coming in is. That's what application-level authentication is for. In our Kubernetes cluster, though, we control all the components, so we can easily set up an environment where both the client and the server trust the same certificate authority and use that information to positively identify each other. And once we have mutual TLS, that's the building block for cryptographic identity, which lets us do things like set up policy in our cluster that starts from trusting nothing and then explicitly allows what's permitted.

One thing that brings to mind is that you need a lot of certificates to achieve this. Every pod, in theory, needs a certificate, and therefore an identity to use when authenticating to other services. That raises the problem of how you manage all those identities and certificates. Cert-manager can do the mechanical part of that, but it's useful to have something else helping manage the identities at a different level, and I think that's where Jason comes in.
Yeah, so let's talk about Linkerd. Linkerd is a CNCF project, and it's the only service mesh to hit graduated status within the CNCF, which means it's on par with things like Kubernetes when it comes to maturity. It was created by the folks at Buoyant, the company that pays me to do talks like this. It's been around for a while, and it's used by small startups and by really big companies; if you were at Valencia earlier this year, the folks from Xbox talked about how they use Linkerd to do multi-cluster and add mTLS for Xbox Cloud.

A little about how Linkerd, or any service mesh, works. What you've got here is a generic Kubernetes cluster with an application. We have an ingress, could be NGINX, could be Ambassador, pick your ingress. Then we've got a web front end and two backend services, foo and bar. The way Linkerd works is you install a control plane, which is what we're going to do today; that's your interface for managing the mesh. Then we sit a little proxy, in this case the Linkerd2 proxy, next to each of your applications and route your traffic through it. So each connection, say between web and foo, goes from web, to its proxy, to foo's proxy, to foo. The proxies hold the information about our certificates; they also know how each thing is performing, so they can make intelligent load-balancing decisions, stuff that's beyond the scope of this workshop. The core of it is that the Linkerd proxies intercept your traffic and do some useful things with it, and you interface with all of that through the control plane.

And with that, I hope you've had enough slides, because now we're going into the demo to actually run the workshop live. Any questions before we move on? Great. So if you haven't done it yet, go to the cert-manager workshop link from the beginning slide, and we can pull that back up if you need it. We're going to work through the README in that repo as today's exercise. At this point, if you have questions, raise your hand and either Ashley or I will come over; we also have Flynn in the back, who'll be happy to help if you run into problems. I'm going to walk you through the steps and we'll do them live as we go. Sound good?

Does anyone not currently have a Kubernetes cluster they can use? OK. You can also use a K3s or k3d cluster, or kind, or Docker Desktop's built-in Kubernetes. If you need help getting a cluster running, let me know and we'll come over to you in a few minutes. For those of you with a cluster running, make sure you have helm, kubectl, and curl; you might want jq and OpenSSL, though you can get through without them. The README walks through k3d on Docker, or kind.

So let's go into the actual steps. The first thing I want you to do, those of you with a cluster, is deploy our application. This is the demo application we're going to set up policy for: first we take the app, then we add it to our Linkerd mesh, and then we apply policy to lock down who can access what. Sound good?
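If you're still sorting out a cluster, a minimal local setup looks something like this, assuming you have Docker plus either k3d or kind installed; the cluster name here is arbitrary:

```
# Either of these gives you a throwaway local cluster for the workshop.
k3d cluster create workshop
# ...or:
kind create cluster --name workshop

# Confirm kubectl is pointed at the new cluster before continuing.
kubectl cluster-info
```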
All right, so we'll start here. You can see I've got a couple of terminals open; we'll do our work on the left side and show things about the cluster on the right-hand side. Can folks see this OK? And like I said, please do follow along. First I'm going to create my booksapp namespace, then download and run the demo application. On the right side you can see some components being created. Booksapp is a fairly simple application: it's got a traffic generator, a web front end, and two back ends, books and authors, which lets us control who's talking to what.

Now we want to actually install cert-manager. We're going to use Helm for all of our installs today. In general we recommend that when you're running Linkerd, especially in a production or near-production scenario, you use Helm to manage your Linkerd installs, and the same is true for cert-manager. Is everyone fairly familiar with Helm? Do you want me to walk through what the Helm commands do? Well, I'm going to do it anyway, because I'm paranoid. What we do here is add two repositories, two sources of Helm packages: one for Jetstack, who make cert-manager, and one for Linkerd. Once we've added those repos, we run a little update command to make sure we've got a local copy of our package indexes.

Next, we actually install cert-manager, and let's talk about what this command does. We're doing helm install of the cert-manager package: you give it a name, then the reference to the package within the repository; you tell it what namespace to install into, tell it to create that namespace as it goes, install the custom resource definitions, and give it a particular version. I can't tell you how valuable it is to always explicitly set the version when you're installing things with Helm. Part of the fun here is that you're installing the latest version of cert-manager, which is only a week or so old; it was released very recently, so you're using the latest and greatest tools. So here we see cert-manager getting installed. Let's close that out so we can see everything going on. That was quick; we're not running locally. Folks following along, is this all going smoothly so far? Killer, awesome.

Now that we've done that, we're going to install another cert-manager component called cert-manager-trust, and we're using a different install command just to keep you on your toes. Here we use helm upgrade with the --install flag: it upgrades the release if it's there, and does the install for you if it's not. It's pretty handy.
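The command sequence looks roughly like this; the chart versions are illustrative, so pin whatever the workshop README pins:

```
# Add the two chart repositories and refresh the local index.
helm repo add jetstack https://charts.jetstack.io
helm repo add linkerd https://helm.linkerd.io/stable
helm repo update

# Install cert-manager with an explicit version pin (example version).
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager --create-namespace \
  --set installCRDs=true \
  --version v1.10.0

# Upgrade-or-install the trust component into the same namespace.
helm upgrade --install cert-manager-trust jetstack/cert-manager-trust \
  --namespace cert-manager --wait
```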
A bit of background on that tool, trust-manager: it's quite a new project and something we've been focusing on a lot recently. If there's one thing I should get as a tattoo, it's "getting a certificate is only half the problem." That would be the chest tattoo I need. Trusting that certificate is the other half, and there isn't much out there right now about how to solve that half. There's some upstream work going on in Kubernetes itself, but we really want to get involved and help solve that problem, and trust-manager is a big part of that. It helps you get your cluster to a point where you can trust certificates effectively, which goes back to what Jason mentioned earlier about delegating trust to third parties and verifying the certificates you've got.

All right, with that done, we're going to create a namespace for our Linkerd install. We're doing this now because we're about to create a bunch of certificates for it, and they're going to need a home; specifically, their home is the linkerd namespace. So: kubectl create namespace linkerd. Great. And now we'll apply, and then read through, this bootstrap.yaml manifest. For those of you following along, I think that's in the root of the repo. Oops, sorry, important point, folks: please go to the appropriate folder before doing this. Great, there we go.

So let's talk about what this was. Do you mind talking through what objects we're creating and why? Sure, yeah. The structure we're using here is a root certificate, which will live in the cert-manager namespace, and that issues what we call an intermediate certificate, which is the one that will issue the identities we'll actually use for our pods. The way we do that links back to those issuers I talked about earlier. We have Issuers and ClusterIssuers; the distinction isn't so important here, except that ClusterIssuers apply to every namespace, whereas Issuers are namespaced. Here we're just going to use ClusterIssuers.

So we create a self-signed issuer, which does pretty much nothing; it's incredibly simple. It just tells a certificate to sign itself using its own private key. That makes it a root certificate, and root certificates are what we use as the basis of trust within TLS. If you were doing this in production, I'd maybe suggest you generate this root outside your cluster. In PKI installations I've managed before, at pretty big companies, we tended to use a Raspberry Pi and do this entirely offline; we'd desolder the Ethernet ports so you couldn't go online with it, and generate everything air-gapped. We're not going to walk that mile here: we haven't bought everyone a Raspberry Pi, and we're not going to damage a load of Raspberry Pis when they're so hard to get at the moment. So we'll just do it in our cluster, because that's fine here.

This Certificate resource is kind of the core of how cert-manager works: we ask for a certificate with a particular name, tell cert-manager via the secret name where to store it, and give it a long duration, because root certificates are a pain to rotate, so they tend to live longer and we protect them more. We also specify which issuer we want to use to issue this certificate; in this case, the self-signed one. We then use what we call a CA issuer; it's kind of hard to see, but it's on line 49 there. A CA issuer uses a different certificate in the cluster to issue other certificates. So in this case, we're creating a CA issuer backed by the root certificate we just generated, and using it to issue the intermediate certificate that we need now.
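The shape of bootstrap.yaml is roughly this; the names, durations, and secret names here are illustrative, and the real manifest lives in the workshop repo:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: self-signed          # signs certificates with their own keys
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: linkerd-trust-anchor
  namespace: cert-manager
spec:
  isCA: true
  commonName: root.linkerd.cluster.local
  secretName: linkerd-trust-anchor
  duration: 87600h           # long-lived root: rotation is painful
  privateKey:
    algorithm: ECDSA
  issuerRef:
    kind: ClusterIssuer
    name: self-signed
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: linkerd-trust-anchor # CA issuer backed by the root's secret
spec:
  ca:
    secretName: linkerd-trust-anchor
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: linkerd-identity-issuer
  namespace: linkerd
spec:
  isCA: true
  commonName: identity.linkerd.cluster.local
  secretName: linkerd-identity-issuer
  duration: 48h              # short-lived intermediate, rotated for us
  issuerRef:
    kind: ClusterIssuer
    name: linkerd-trust-anchor
  usages:
    - cert sign
    - crl sign
---
# trust-manager Bundle: copies the root out to where proxies can see it.
apiVersion: trust.cert-manager.io/v1alpha1
kind: Bundle
metadata:
  name: linkerd-identity-trust-roots
spec:
  sources:
    - secret:
        name: linkerd-trust-anchor
        key: tls.crt
  target:
    configMap:
      key: ca-bundle.crt
```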
This is a process you might need to repeat in the future. (Someone's really enjoying the workshop.) It might need repeating because the intermediate certificate won't last as long as the root; if it did, there'd be no point having two certificates in the first place, you'd just use the one. So yeah, the intermediate you see here lasts only 48 hours, and you can see we're referring to that issuer, that ClusterIssuer, sorry, which represents the root certificate we generated. There's a bunch more TLS-y stuff we don't really need to worry about, apart from isCA at the top: we do need this certificate marked as a CA certificate so it's able to issue others. That's really all we need to set up the issuing certificates. We won't be writing YAML for the individual certificates for the components, for the pods we run, because that's something Linkerd will do using the intermediate we're generating; it's just the CAs that we're using cert-manager for here.

The one last final step is to generate a Bundle. This comes back to that trust problem, and this is where the trust-manager component comes in to help: we take the CA certificate we generated, the root, and copy it into the linkerd namespace so it can be injected into sidecars and used for trust purposes. So to summarize all that TLS-y certificate stuff I've just said: we have the root certificate, which is the basis of all trust and foundational for our service mesh here. Only the root needs to be distributed, and we distribute it using trust-manager. We've used cert-manager to generate that root and an intermediate off it, and the intermediate will be rotated semi-frequently, because it's short-lived; that's the whole reason we have it.

Yeah. So earlier today we ran a workshop on running Linkerd in production, and if you're using something like cert-manager to handle this intermediate certificate, this issuer, it dramatically simplifies operations. It lets you do something like set a 48-hour lifespan on your per-cluster certificate and have that be essentially a trivial operation for your cluster admin. We can set up clusters with this basic configuration, know that no individual cluster will ever have an issuer lasting more than 48 hours, and have it rotate automatically without us doing any work. And when you start thinking about things like multi-cluster Linkerd, each individual cluster in your environment can only be trusted for a maximum of 48 hours at a time.

All right, does that make sense? Are we still able to follow along with the README? OK, and if you want to go ahead, do feel free; we'll just talk about it when you get stuck. So we just looked at the file, and we can actually do some inspection on the certificates we got. Ashley's going to tell us a little about what the output is telling us. Sorry, let me run this again through less so you can talk about it. So yeah, this, I will admit, is a mess. We're using OpenSSL here because most people have it installed; there are other tools for viewing these things, and there are even ways to get OpenSSL to print this out in a prettier way, but who knows how to do that?
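The inspection command looks something like this; the secret and namespace names assume the bootstrap step above:

```
# Pull the root CA's certificate out of its secret and pretty-print it.
kubectl get secret linkerd-trust-anchor -n cert-manager \
  -o jsonpath='{.data.tls\.crt}' | base64 -d \
  | openssl x509 -noout -text | less
```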
It's a pretty complicated tool, so I'm not going to dive into every line here, and I don't think it'd be particularly interesting for me to start reading out a public key. What's important for this root certificate is that the issuer and the subject are identical. That's actually the definition of a self-signed certificate. People often use "self-signed" to mean private, so their organization has a CA certificate and they call it self-signed colloquially, but the real definition is that the issuer and the subject match. That means there's no way to trust this certificate except explicitly. We won't go too deep into how TLS trust works, but if you're interested, come and ask me; I always like to talk about it. The point is that because this is self-signed, we can't find a different certificate which signed it, because by definition there is no such certificate, so we need something to distribute this root for others to be able to consume it. You can also see that the validity is quite long, and that it's marked as a CA, which is that line at the bottom under basic constraints. That's important.

All right, cleaning up and moving on. Let's read out the intermediate certificate, and you can tell us what exactly that's telling us. Forgot to pipe it to less this time. So yeah, again, it's a mess. It's still OpenSSL. This is the intermediate certificate. You'll notice in the command Jason just ran that we're looking at the linkerd namespace now, because that's where this certificate lives. Crucially, this time the issuer and the subject are different, because this was issued by the root certificate, and the issuer is the name of the root. Again, importantly, this is a CA; you see that at the bottom there. Beyond that, the main thing to notice is that the validity duration is very short. If something goes horribly wrong and the private key for this certificate is exposed, the absolute worst case is that someone can use it for two days. Revocation is actually a pretty difficult thing to do with certificates in production, so the shorter-lived your certificates are, the smaller the blast radius if something goes wrong, because they'll have expired by the time anyone can do anything particularly evil with them.

OK. So now we're actually going to do a Linkerd install. Today we're installing Linkerd 2.12.2, I'm pretty sure; I'll double-check which version I actually put in there. As of Linkerd 2.12, the core control plane is installed with two different Helm charts: one specifically for the Linkerd CRDs and one for the Linkerd control plane itself. That's what we're doing here, and there are a ton of comments in the README about what we're doing and why, so if you have more questions, or if I miss something, please do check what it says in there. So we do the CRD install. That's happy; we see no new containers get created. Then we go ahead and install Linkerd itself. A little about this command: when we install Linkerd, we override the default install behavior and say, hey, listen, identity is coming from an external CA, and this is the scheme we're going to use, and then we specify the chart.
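Roughly, the two installs look like this; the identity flags are the ones the README sets, and a version pin is omitted here, so use whatever the README specifies:

```
# CRDs first, then the control plane itself.
helm install linkerd-crds linkerd/linkerd-crds \
  --namespace linkerd

# identity.externalCA tells Linkerd not to mint its own issuer; the
# kubernetes.io/tls scheme says the issuer secret is a standard TLS secret.
helm install linkerd-control-plane linkerd/linkerd-control-plane \
  --namespace linkerd \
  --set identity.externalCA=true \
  --set identity.issuer.scheme=kubernetes.io/tls
```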
This tells Linkerd to look for secrets already existing in the linkerd namespace, which will allow it to start handing out certificates to the individual workloads. This is the last time I'll talk about the hierarchy, I promise: you've got a root CA, which establishes trust for Linkerd at all, and you'd use that root CA to sign the individual cluster certificates for each Linkerd install. Then inside the cluster we have our intermediary, or what we call in Linkerd terms our issuer certificate. That's the certificate Linkerd will use to create individual workload certificates, so that workloads can trust each other and communicate securely over the network. So let's run this, and then run linkerd check to tell us our Linkerd environment is ready.

I would love it if you all had questions; now's a great time. And if you don't have questions, I have a threat, which is a terrible networking joke. Y'all asked for this, in a way. I have a UDP joke for you; you may not get it.

While that runs, let's talk about what components a core Linkerd install includes. It's three basic components. There's the component of Linkerd responsible for creating these identities, using all these certificates I've spent the last 30, 35 minutes talking about: that's the identity service. We have the proxy injector, which actually adds the Linkerd proxy to your workloads; you don't define the whole proxy in YAML, you just define an annotation on your namespace or on your workload that says, hey, please inject the proxy here, and the proxy injector takes care of it. And last but not least, we have the destination service. The destination service does a whole bunch of things, but its key job is making sure the proxies know about each other and can make smart decisions about your traffic. Sound good?

It might be worth saying that when you run linkerd check locally, it does warn you that the issuing certificate is expiring soon, because it's so short-lived. Yeah, you see it there. That's because we're racing ahead and intentionally issuing a very short-lived one, with the intention that it rotates out of scope very quickly, so it's not a warning we need to worry about here. It is a really important warning if you're hand-rolling your certificates, though. Pay attention to it, because an expired issuer is one of the few ways you can get Linkerd to stop your production traffic. With something like cert-manager rotating our certificates, we can safely ignore it; cert-manager is one of those things where the automation makes your life dramatically easier.

OK, so now we're going to install the Linkerd dashboard, which is called Linkerd Viz. We'll use this UI a little during the demo, and I'll run some commands that help us see what's happening to our traffic as we start locking down the cluster. I've got to say, I absolutely love Linkerd Viz. I think it's a really cool tool; I wasn't very familiar with it a few weeks ago, but I've already started enjoying seeing it in action. It's going to give us a lot of insight into what our traffic is doing.
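Viz ships as its own chart; a minimal sketch of the install (again, the README pins a version):

```
helm install linkerd-viz linkerd/linkerd-viz \
  --namespace linkerd-viz --create-namespace
```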
So, why do you use a service mesh? At least the folks I know who use service meshes do it because they want mTLS, or they have some requirement for it; they want to lock down the interactions between the services in their clusters. They might want gRPC load balancing; if you have that problem, you know who you are, and if you don't, don't worry about it. And a lot of folks want standard metrics from all their workloads. Linkerd Viz is the tool that exposes all those standard metrics.

So with that, we're going to run linkerd check again. Actually, we're just going to run linkerd viz check. The check CLI has a couple of idiosyncrasies: you can check all of Linkerd, or you can check a specific component. In this case we just check the viz component. It's good to go; close that out.

Now that we've got it, we're actually going to add an application to the service mesh. I talk about this a lot in the text, but essentially, we're going to tell the Kube API that these pods need to be injected, and our proxy injector watches for people trying to create pods. If it sees someone creating a pod with this annotation, it mutates it; it's a mutating webhook. In this case, I'm going to use the Linkerd CLI to do the injection. Typically, when folks use Linkerd in production, they set the annotation on the namespace in whatever CI system actually creates objects in their Kube cluster, or they set it on the individual workloads. This line here adds one line of YAML, one annotation, to the four deployments in the booksapp namespace. So: kubectl get deploy -o yaml, which outputs the deployments as YAML; run linkerd inject against that YAML; and send it right back to the Kubernetes API. Does that make sense? Everyone following along, have we gotten at least to this point? Awesome, thank you, folks.

Let's run a watch command here on the right. I'm also going to do myself a favor and change my current namespace to booksapp, so from here on out every command I run is in the booksapp namespace, and I'll create two more little windows, because I want to show you all a bunch of stuff.
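The round trip we just ran, plus the watches worth keeping open, look roughly like this:

```
# Pull the deployments, let the CLI add the inject annotation,
# and send them straight back to the API server.
kubectl get deploy -n booksapp -o yaml \
  | linkerd inject - \
  | kubectl apply -f -

# Handy to keep running off to the side while we break things later:
watch linkerd viz stat deploy -n booksapp
watch linkerd viz authz deploy -n booksapp
```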
On the right-hand side it's happening pretty fast, but where before I had a bunch of pods running with one container per pod, now they've all been recreated with two containers per pod. That second container is the Linkerd proxy. So what's happened is we've gone from each of these components talking directly to each other, to all their traffic being routed in and out of a little proxy, the Linkerd2 proxy, and there's some neat stuff we can do with that traffic. One of those things is linkerd viz stat deploy: we can see details about the traffic between these applications. Suddenly, without me changing anything on my application, without creating any custom resource definitions, without doing anything fancy beyond the injection, everything's still talking to each other, just behaving a little better than before: I can see the percentage of requests hitting each API that succeed, I can see the latency, and I can see the volume of traffic in my environment. This is important because as we go into the policy side of actually locking down our cluster, we're going to break all this traffic, and now we can see what's happening live.

And last but not least, we'll run a watch on linkerd viz authz deploy. I think I got that right. Yeah, great. It's a little small, but what this tells us is the existing authorization policies in our environment and how much traffic is being allowed or denied by each policy. Let me make it a little bigger, sorry; that was just ridiculous. Even at that size, it's still easier to read than OpenSSL's output, I'll put it that way.

Quick dive into policy in Linkerd. Linkerd handles the authorization side of authentication-and-authorization; I'll say authz so I don't have to keep making that noise. The way Linkerd works, and one of the things that keeps it fairly simple to use, is that we rely on native Kubernetes objects wherever possible. When we route traffic between services in Kubernetes, we use Kubernetes Services. When we figure out the identity of a pod, we go with the identity Kubernetes assigns it: our authentication mechanism is the Kubernetes API, specifically the service account you've assigned to your individual workloads. That's how the certificates get created; they're based on the service account you gave your pod. Right now we see a lot of traffic flowing through: nothing is unauthorized, and we have a good bit of traffic that is authorized. Great, and we're going to totally mess that up as we go.

We can run another linkerd check, because I like to show off how fancy this tool is; we'll run it against the booksapp namespace just to validate that our proxies are in good shape. And look at that, it's happy, though of course the issuer certificate is going to expire in less than 60 days, so watch out.

I've put the commands I just ran under watch in the README. If you have a terminal you can split, I'd really highly recommend setting up linkerd viz stat deploy and linkerd viz authz deploy in separate panes, because I find it's really useful to watch what happens as I apply policy. All right, with that, I'm going to send the applications in booksapp into namespace jail. The first half of this, or I guess the middle third, is isolating the applications in the booksapp namespace so they can only talk to each other. Actually, I'm lying a little: so they can only talk to each other, and so they still allow Linkerd Viz, our dashboard, to gather statistics about how they're performing. And the viz stuff lives in a different namespace from the application; that's the key here, right? Yeah: our booksapp lives in the booksapp namespace, and Linkerd Viz in the linkerd-viz namespace. So let's go ahead: first off, I'm going to create a default inbound policy for my namespace.
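Setting the default is one annotation; a sketch, using the namespace-level form:

```
# Default-deny inbound traffic for everything in booksapp. Proxies read
# this at startup, which is why we restart the deployments next.
kubectl annotate namespace booksapp \
  config.linkerd.io/default-inbound-policy=deny
```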
What I want to do is tell everything in the booksapp namespace: hey, listen, unless you have explicitly been authorized to see some incoming traffic, deny it. Deny everything by default. One thing you'll notice after I set that annotation is that nothing happens. Nothing changes: I've still got traffic, I don't see any new policy, none of my pods suddenly crashed and burned. The reason is that this default annotation, which can be set either at the cluster level or at the namespace level, is read by the proxy when it starts. So to have it take effect and start causing us problems, we need to actually restart these pods. So let's do that: kubectl rollout restart, with a little tab completion, on the deployments in the booksapp namespace.

Now we see our traffic cratering. You're not going to get a lot of detail on that, though, because while the components of booksapp can't talk to each other anymore, viz can't talk to booksapp anymore either, so it's not getting any updated stats. We've well and truly broken this. Oh, shoot, I did want that; sorry, give me one second, folks: watch kubectl get pods. Great, because there's going to be one more thing we want to know about our pods. So we've done the rollout restart, and everything came back up and is actually perfectly fine. That's because there's one thing Linkerd allows even when everything else is denied by default: it explicitly carves out exceptions for the kubelet to do liveness and readiness checks on your pods. If nothing else works, we at least let your pods start. That's why they could come back up even though we told everybody no traffic was allowed through.

So you can go ahead and run the commands again, or if you're running them off to the side, enjoy that. The README just tells you in depth what I've said out loud. Now we want to carve out an exception for the Linkerd Viz components to speak to our individual applications, so we can get some metrics back. Let me show you. By now, yeah, great: you see no authorization rules at the bottom, and absolutely no traffic for our environment at the top right. The reason is, one, there's no traffic, and two, our dashboard can't talk to the workloads anymore. So we're going to create two manifests, and we'll read them in a second. First I'll define the admin server port on all of these pods, the Linkerd admin port, and then I'll create a policy that allows our Prometheus instance and our metrics API to talk to that admin port. So let's read that out: manifests, booksapp, allow-admin... what was it? Admin server, there we go.

Let me talk about the hierarchy of objects in Linkerd policy, or even step back a little. With Linkerd, by default, to get mutual TLS, to get metrics, to get the smart routing decisions, you need zero custom resource definitions. Policy is the place where you can break your applications and where you have to start getting into custom resource definitions. So it's kind of like Spider-Man: with great power comes great responsibility, right?
If you want the power of policy, you're going to take on more administrative work as platform engineers. I'm making an assumption: everybody here works in or around platform engineering? OK, killer, that's what I was hoping to see. So we have this hierarchy of objects. Our custom resource definitions are Servers, HTTPRoutes, MeshTLSAuthentications, NetworkAuthentications, and then AuthorizationPolicies. A Server defines a port: it tells Linkerd about a particular port and how it works. In this case, I'm creating a Server called linkerd-admin in the booksapp namespace; it finds all the pods, whatever labels you have, and binds to the linkerd-admin port. That's a port explicitly named in the YAML manifest that defines your workload. Make sense? So far this does nothing except tell Linkerd there's a port, and, oh, by the way, the protocol for this workload is HTTP/2.

Then, after the admin server, we cat manifests/booksapp/allow-viz. There we go. Now we see two new object types. First we create a MeshTLSAuthentication: that's a meshed identity, in the booksapp namespace, called viz-apps, and it applies to Prometheus from the linkerd-viz namespace and to tap from the linkerd-viz namespace. That explicitly authorizes Prometheus and tap to talk to our workloads on the linkerd-admin port, because the object above, the AuthorizationPolicy, binds our MeshTLSAuthentication rule to the Server. So we have port, rule, and rule binding via the AuthorizationPolicy. Still making sense? Killer. I appreciate the head-nodders and thumbs-uppers out there, by the way. Very grateful; there's been some excellent head-nodding.

Now that we've done this, I actually do want to show you the terminal: where before we didn't have any metrics, now we're starting to see some things appear. I once again have stats in the top right about what's going on in our environment. There's very little traffic, basically just the health checks rolling through, but there's some, which is nice. And I can see there's a policy in place, and on my default-deny policy about 10 requests per second are getting denied by Linkerd. A good metric to watch as you set up policy is whether anything is being denied or unauthorized: if you've set this up correctly and you're not in the middle of a terrible security incident, that should always be zero.
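Put together, the three objects look roughly like this; names and selectors are illustrative, and the real manifests are in the workshop repo:

```yaml
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  name: linkerd-admin
  namespace: booksapp
spec:
  podSelector: {}            # every pod in the namespace
  port: linkerd-admin        # named admin port on the injected proxy
  proxyProtocol: HTTP/2
---
apiVersion: policy.linkerd.io/v1alpha1
kind: MeshTLSAuthentication
metadata:
  name: viz-apps
  namespace: booksapp
spec:
  identities:                # Linkerd identities come from service accounts
    - prometheus.linkerd-viz.serviceaccount.identity.linkerd.cluster.local
    - tap.linkerd-viz.serviceaccount.identity.linkerd.cluster.local
---
apiVersion: policy.linkerd.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  name: allow-viz
  namespace: booksapp
spec:
  targetRef:                 # the port...
    group: policy.linkerd.io
    kind: Server
    name: linkerd-admin
  requiredAuthenticationRefs: # ...bound to the rule
    - group: policy.linkerd.io
      kind: MeshTLSAuthentication
      name: viz-apps
```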
So now we've got that, we're actually going to allow some application traffic through. I created one Server object for all my linkerd-admin ports; now I've got three more ports I care about: my web front end, my books service, and my authors service. We'll give each of those its own Server object, and then we'll look at just one of them, because they all look basically identical. So let's go create these things. Cool. And let's look at one. Again, very similar to what you saw before: you call out a named port. If you go back and look at the booksapp YAML you curled down, you'll see a port defined, called service, under each deployment. We're telling Linkerd explicitly it's going to be an HTTP/1 connection, so you don't have to worry about protocol detection. And then we add in a selector: very similar to a Kubernetes Service, you have to find a way to identify which pods belong to it. Which pods belong to this authors Server object? Same idea.

That said, just because we've created the Servers doesn't mean we have any traffic; that's what allow-namespace is going to do. So we apply it, and then we look at what it did for us. Applying this allow-namespace YAML created an authors-booksapp policy, a webapp-booksapp policy, and a books-booksapp policy, and it also created one MeshTLSAuthentication. So let's actually look at that. I'll go through one of these, and the other three are identical: we created one AuthorizationPolicy per Server, and we bind every single policy to the same MeshTLSAuthentication identity. That identity, as we'll see in a second, gives anything in booksapp access to talk to anything else in the booksapp namespace. If you look at our traffic on the top right, you'll see we're now back to our lightly broken application, as opposed to our totally broken application: we have traffic rolling through, and we don't see things being unauthorized anymore. And look at the end; this was my favorite part. I thought it was really nice to be able to do *.booksapp: if you're a service account in the booksapp namespace, you're valid to access the service port on any of these components. So we've restored traffic to our environment: we can get metrics, our components can talk to each other, and this is the core of setting up a namespace jail. What do y'all think? Straightforward? Would you feel comfortable doing this in your environments?

All right, then I have a question for the confident folks in our audience. Say I'm using an NGINX ingress, it lives in the ingress-nginx namespace, and it runs under a service account called ingress-nginx. Do y'all have a general idea of what I'd need to do to allow it to access my web front end? I've got a Linkerd hat for anybody who answers, even if you're wrong; not now, but tomorrow, come find me at the booth and I'll give you a hat. Not you, Flynn. So, you're in the right general direction. Let's start with: which objects do we care about? Which Server object do I need to bind my authorization policy to? I'm trying to access the web front end, so it's the webapp Server object. We take the Server object for webapp and bind a new policy to it, and then we create a MeshTLSAuthentication for our NGINX ingress and bind that new MeshTLS object to the webapp Server. That would allow traffic from our ingress through to our application. General sense? Thank you very much for trying; I really appreciate it, and I'm definitely going to give you a hat tomorrow.
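A sketch of that answer; it assumes an ingress running as service account ingress-nginx in the ingress-nginx namespace, and a Server named webapp fronting the web deployment:

```yaml
apiVersion: policy.linkerd.io/v1alpha1
kind: MeshTLSAuthentication
metadata:
  name: ingress-nginx
  namespace: booksapp
spec:
  identities:
    - ingress-nginx.ingress-nginx.serviceaccount.identity.linkerd.cluster.local
---
apiVersion: policy.linkerd.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  name: allow-ingress
  namespace: booksapp
spec:
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: webapp             # the Server fronting the web front end
  requiredAuthenticationRefs:
    - group: policy.linkerd.io
      kind: MeshTLSAuthentication
      name: ingress-nginx
```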
Long story short, as you've probably figured out, this isn't a trivial exercise. There are really good docs; inside the README I point you at the policy reference. Just exercise caution and test things out in a reasonable way before you roll this into production. As our founder put it when he talked about it, this is the biggest foot-gun Linkerd has ever given our users, so don't use it to harm yourself. It's also a really powerful way to ensure that only the right things are talking to each other in your environment.

So where are we on time? We have a bit more, right? It's a 90-minute session. OK, so the next section. What we've done so far is coarse-grained policy in Linkerd, meaning we bound rules to a port, but it didn't let us get any more specific than that. With Linkerd 2.12 we can do fine-grained policy. Because Linkerd is aware of your API traffic, it understands who's talking to what, it understands HTTP verbs, and it understands who's calling which path, so we can write rules that go down to the per-verb and per-path level inside your environment. In fact, the way Linkerd sets those default exceptions for readiness and liveness checks is fine-grained policy built on HTTPRoutes: it allows the kubelet, even though it's not part of the mesh and never will be, to reach whatever readiness and liveness endpoints you've defined inside your workloads. OK, I'm going to show you how to break that in your environment. Does anyone need a break?

One thing you may have noticed: there are four workloads in the booksapp namespace, and we never created a Server object or any policy for the traffic generator. That's because traffic isn't hosting anything; it only makes outbound calls. So no one is allowed to talk to traffic, but traffic is allowed to talk to everyone.

OK, so we've isolated the namespace; now let's do some locking down within it, using HTTPRoutes. The first thing I'll do is create a route for our authors service, and I want you to watch what happens when we do it; in particular, keep an eye on the get pods watch, because something interesting is going to happen when we run this command. So we've created our first HTTPRoute policy. Give it a minute; I want to make sure my demo breaks correctly and as expected. It's tense, right? Are you tense? I'm tense. It's so easy to break things when you don't mean to break things. And there it is. If you look at our environment, the authors pod just became unready: we have two containers, a Linkerd proxy and the actual authors container, and authors is no longer ready. So what happened? Let's actually take a look at the object. We created an HTTPRoute, and what it says is: for these paths, on these verbs, here are the rules about who can talk to what. We haven't bound anything to this HTTPRoute yet, so we haven't authorized anybody through it; these are just the rules for this part of our application. And going back to earlier: when Linkerd looks at your environment, it creates default exceptions, default HTTPRoutes, for the liveness and readiness probes. It assumes that once you start creating your own HTTPRoutes, you no longer want its defaults.
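The read-only route looks roughly like this; the paths and the apiVersion are illustrative and may differ from the repo's manifests depending on your Linkerd version:

```yaml
apiVersion: policy.linkerd.io/v1alpha1
kind: HTTPRoute
metadata:
  name: authors-get-route
  namespace: booksapp
spec:
  parentRefs:                # attach the route to the authors Server
    - group: policy.linkerd.io
      kind: Server
      name: authors
  rules:
    - matches:
        - path:
            value: /authors.json
          method: GET
        - path:
            type: PathPrefix
            value: /authors/
          method: GET
```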
You're now expected to explicitly specify who's doing what with your application, and because I haven't allowed health checks through, they stop passing. The README talks through this in depth; again, a lot of this is intended for you to do self-paced. But the gist is: it's not working because Linkerd deleted the default routes it had created for you.

So now we set up a probe exception. The probe manifest creates a route, then a type of authentication, and then a policy binding them together. Let's look at it. We create a route for /ping. Then we create a NetworkAuthentication that basically says anybody from any IP address, even clients that aren't part of the mesh, may be authorized; that's what lets the kubelet in, since it will never have a mesh identity. And then we bind the two together with an AuthorizationPolicy: we bound the NetworkAuthentication called authors-probe to the HTTPRoute that is the authors probe. And now our pod becomes ready once again, because its health checks are allowed.

Now we're going to actually allow some traffic in. Worth noting before I do this: the success rate for the webapp service has cratered to 11 percent. That's because every time it tries to talk to authors, which it does a lot, it gets denied. And we can see, well, you can maybe see, a bunch of unauthorized traffic rolling through; I'm not going to try to decipher that mess. When we create this next policy, we'll see a bunch more of it get authorized. And we can close our pods watch now, because there's nothing new to see there. Is that invisible? All right, sorry about that.

So let's look at what we did. I already did the get routes, so I'll show you the get policy. The policy says that the booksapp web app, or books, or, sorry, this is reused YAML from something else I do, my Buoyant Cloud agent, can all talk to the authors service on those routes, not ports, routes, and it binds them together. So if we go look at the application, it loads, right? But if I go to make a change, say I add an author... no dice. This doesn't work, because we can't update authors: we only gave ourselves GETs. No PUTs, no DELETEs. Sorry, Neil Gaiman, you're out, man; I hated Sandman, it's gone. Nope, can't do that either, because I have no policy for it. (Also, I didn't hate Sandman, it's amazing. Is that the same guy? I feel like it's the same guy.) Anyhow, that's what's going on here. And our success rate for webapp stays low, because webapp spends a lot of its time trying to update and delete authors, and that's not going to work. Our app looks good, but it isn't really working the way it's designed, so let's fix it: we'll create a route that allows us to modify authors, and then bind that route to our server.
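The write route is roughly the same shape with different verbs; the paths are again illustrative:

```yaml
apiVersion: policy.linkerd.io/v1alpha1
kind: HTTPRoute
metadata:
  name: authors-modify-route
  namespace: booksapp
spec:
  parentRefs:
    - group: policy.linkerd.io
      kind: Server
      name: authors
  rules:
    - matches:
        - path:
            value: /authors.json
          method: POST
        - path:
            type: PathPrefix
            value: /authors/
          method: PUT
        - path:
            type: PathPrefix
            value: /authors/
          method: DELETE
```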
This new route gives whoever gets attached to it the ability to do DELETEs, PUTs, and POSTs, on the paths within the API where that makes sense for the application. You've hopefully noticed that where the last section, locking down the namespace, was not trivial but relatively straightforward, getting more fine-grained with policy requires a lot more knowledge of your application. So just be aware: as you get more granular with your security, you need to get correspondingly deeper into how your app actually behaves.

Now we bind something to that route. What we did here is bind webapp and books, giving them permission to modify authors. So even though traffic can talk to authors, it can't make any changes to them; only the web app front end can, because it's the only service that should be allowed to. Authors is a protected back end. Make sense? And that's the demo. It's meant to show you how you'd do this in practice. Who's considering doing policy? Who in here was a Linkerd user before this? Right, nobody. Cool. Well, I hope you're all interested in trying Linkerd, and if you do try it, I'd love to talk to you about using policy with it. And with that, let's go to our cleanup slides. I've got time for a few questions before we leave the demo.

One thing I would say, and I'm not employed by Buoyant or anything, I'm not even really a Linkerd user: I've been really struck by how simple this is to use. We're talking about some pretty complicated concepts, and it's really impressive to me how simple it is relative to other service meshes I've used. I think it's a really great tool, and I'm certainly going to be playing with it a lot more following this. Thanks; well, I'm glad to hear it, and I hope I've encouraged you all to try it out.

If you have questions, I'll be at the Buoyant booth basically all week, and there'll be lots of other folks over there; there's also a Linkerd booth. If this was cool and you want to learn more, buoyant.io slash SMA will get you all the workshops we've published online. They're recorded, you can go through them, and there are examples out there for all this stuff. In the next one we do a deep dive into everybody's favorite topic, CNI plugins, and how we modify the iptables rules on your Kube cluster; we keep it fast-paced and exciting. Beyond that, if you think Linkerd is cool but you'd rather pay somebody else to do this work for you, check out buoyant.cloud, or go to buoyant.io slash demo, where you can sit with me for 30 minutes and I'll tell you why you should pay us to run Linkerd for you. Well, pay Buoyant; I don't get the checks, unfortunately. But I'll talk to you about it.

Yeah. We're also going to have a Jetstack and a cert-manager booth in the Project Pavilion. The cert-manager booth is well worth a visit; this is your advance notice, so you can get in early. We have a Raspberry Pi and a little printer, and we'll print out a little certificate that you can take home as memorabilia. It's really fun; I'm told it was really successful and popular at KubeCon Valencia. Unfortunately I wasn't there, but yeah, really popular there.
Please do come along and get your own certificate, with a little seal and everything; it's a really neat little thing. Also, do check out the Jetstack blog; we have some pretty cool stuff on there. And when you're dealing with certificates in any meaningful, large-scale deployment, that usually introduces management challenges: how do you keep track of them, that kind of thing. That's Jetstack's product area, so if you want to go above and beyond the open-source cert-manager project, it's worth visiting my colleagues at the Jetstack booth to talk about Jetstack Secure and the product offerings they have for helping manage your workload identities. Yeah, I'm going to go get a certificate tomorrow; I'm excited. I hope to see some of you there. All right, that's it. Thank you all so much for coming. Really appreciate it.