Hi everyone. Welcome to the introduction to Kubernetes for developers. My name is Jason Van Brackle. I'm a unicorn engineer with Defense Unicorns. If you have any questions about this session, you can hit me up on Twitch, on LinkedIn, and on Twitter. If you'd like to reach Defense Unicorns, you can reach us on the Kubernetes Slack in the Zarf channel. You can also reach us on LinkedIn and Twitter. Today, I'm going to show you some Kubernetes concepts. We'll go over some tools you need to get started, and then we'll throw the kitchen sink at you. This won't make you a Kubernetes expert. This is the start of your Kubernetes journey, but I hope that with what you've learned today and with the tools that you have, you can start understanding Kubernetes better.

What is Kubernetes? Its job is to orchestrate containerized applications, and that's it, but it is complicated. It doesn't just run the applications in containers; it handles all the storage and networking and everything else we need to run containerized applications at scale. It's complex because of that scale. It is a reference API, and we'll also talk about Kubernetes physically speaking. That reference API is just a REST API. For each individual item within Kubernetes, there is a REST API for it. You can talk to Kubernetes through the REST API, and all the CLIs you'll use will be talking to Kubernetes through that REST API. Every time you want, for example, a pod, something is calling a REST API with a specific format that is expected by all things in Kubernetes.

Physically, well, it's less physical now than it used to be. It used to be that one node had one function, but now this could be one node or this could be thousands of nodes. Essentially, you have your REST API over here, and your cloud controller manager talking to your cloud API. That could be Azure, that could be AWS. And then you have various pieces of Kubernetes for storage and for setting up your worker nodes to run your applications. Go ahead and check out the links.
The PowerPoint will be in a link at the end of this presentation, so all the links I show you here will be available to you. Also, it's a lot of YAML. It's YAML on YAML on YAML, or JSON, but the YAML or JSON represents the REST APIs we've been talking about. For each entity on each REST API, there is a corresponding YAML or JSON artifact. Versioning is also fairly complex, and while this is a joke, it's kind of true. The major version here hasn't changed since 1.0, and there have now been 27, 28 releases of Kubernetes. The major version is the "big things break" number, and all of the various fixes, security patches, and small tweaks happen on the end there.

So you're a software developer. Why should you care? Well, chances are you're probably doing a few of these things. One, you're probably containerizing your applications to run in Docker containers, or maybe under containerd. Maybe your company's gotten really into Kubernetes, and they say, here, learn Kubernetes. Or perhaps you're taking an existing monolith and breaking it into microservices, and Kubernetes will enable your microservice architecture. And Kubernetes has quickly become the de facto substrate for cloud-native applications. So generally, these are the three reasons why developers need to learn Kubernetes.

So let's start with some Kubernetes concepts. Again, these are all APIs in Kubernetes; Kubernetes has many APIs. I'm here to help you get started, to know just enough to navigate your way through these APIs so that when you read the documentation, these concepts make sense. Most of these APIs you won't create on your own; they'll be created by other APIs. But we're going to layer these concepts. We'll build one on top of another so you can understand what you're looking at. So the first item here is a pod. The pod is the smallest atomic unit that Kubernetes cares about.
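To make this concrete, here's a minimal sketch of a pod manifest with two containers sharing the same Linux network namespace. The names here are just for illustration:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web            # illustrative name
  labels:
    app: nginx
spec:
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
    - name: redis
      image: redis
      ports:
        - containerPort: 6379   # reachable from nginx at localhost:6379
```

Because both containers share one network namespace, anything running in the nginx container can reach Redis at localhost:6379.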
The containers in a pod are co-scheduled, which just means they will be scheduled onto the same node. Inside of that pod is a shared Linux network namespace and a shared file system. So in this scenario, we have nginx over here and we have Redis, and something running in nginx could talk to Redis over localhost on 6379, which is just the default port that Redis listens on. So now you know about the pod. The pod's job is to manage the container lifecycle within the pod, and that's its only job. Again, we're going to layer these concepts; everything has one job.

Next is a replica set. This is another API you probably will never create on your own, but it builds on top of the pod. Where the pod is concerned with containers, the replica set is concerned with counting pods. In this case, we have a replica set where we have specified, via the replica set specification, that there must be 12 instances of this specific pod with the nginx and Redis containers inside of them. So let's say I'm clumsy. I'm walking through the data center with my coffee (nobody should walk in the data center with coffee), I take a sip, I trip, and I spill coffee on a server and the whole server goes down. A little bit contrived, but you know, we have some sort of outage on that server. The replica set's job would be to spread the 12 pods that you desire across the remaining infrastructure. Kubernetes is eventually consistent. It will do its best to meet the declarative specifications that you make, or die trying. Well, it may not literally die trying, but it will at least yell at you while it's trying. There will be various events and logs, and Kubernetes will spit out things to tell you that it's not quite working right. You'll be able to get access to that via the CLI, via the APIs, or via the observability tools you learn as you go. So the replica set deals with the count of pods.
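A sketch of what that 12-replica example might look like as a ReplicaSet manifest. Again, you'd normally let a Deployment create this for you, and the names here are illustrative:

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: web-rs
spec:
  replicas: 12          # Kubernetes keeps 12 copies of this pod running
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx      # must match the selector above
    spec:
      containers:
        - name: nginx
          image: nginx
        - name: redis
          image: redis
```

If a node goes down, the ReplicaSet controller notices the pod count dropped below 12 and schedules replacements on the remaining nodes.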
The pods themselves deal with the containers specified within them. Deployments are exactly what they sound like: deployments manage replica sets. We have replica sets of pods with a specific configuration, and deployments tend to handle migration from one version to another, or a change in image or a change in image tag. That's what deployments are for, and they control how those changes get rolled out. You can specify a rollout strategy, for example a percentage over a period of time, where each new-version pod comes up as an old one goes down. There are lots of ways to do that. So in our example, we have this deployment with four of our v1 pods. We change the image specification for the Docker image we're using, and now we have our v2. Very slowly we roll out v2 pods, and the v1s go away until we have all v2s. And if something goes wrong, we can completely roll back to our v1.

Now this has some implications for how we develop our software, especially software that surfaces its own APIs. Let's say we're in a scenario where we have v1 and v2 of a specific microservice running at the same time within our architecture. We may have some compatibility issues here. We may have to detect API changes. Our software has to be a little more flexible to deal with a microservices architecture. And again, these concepts all build on one another.

So how do they relate? As we said, the deployment manages replica sets, the replica sets manage pods, and the pods manage containers. Kubernetes generally doesn't care about what's down inside the containers; the pods care. Remember, the pod is our smallest atomic unit. So again, deployments manage replica sets and replica sets manage pods, but in a large infrastructure, how do things know what they belong to? Enter labels and selectors. Labels and selectors are the only grouping mechanism within Kubernetes. So let's talk about how this works.
Our deployment, in its specification, will have a selector. That selector will be a key-value pair; in this case, it's app=nginx. The replica set will have a label, and this label is app=nginx. Since this label matches this selector, the deployment knows that it manages this replica set. And then the replica set itself also has a selector, app=nginx, and that matches labels on pods. Again, the only grouping mechanism within Kubernetes is these labels and selectors.

Now, there's another type of key-value metadata called an annotation, and here's the differentiation between annotations and labels when you start looking at the APIs. Annotations are things that third-party components attached to Kubernetes care about, and labels are things that the Kubernetes APIs care about. So if we were looking at Zarf, for example, which is a project I work on, we annotate various Kubernetes APIs for things Zarf cares about. But here, the labels and selectors are about things Kubernetes cares about.

Now, these labels and selectors are arbitrary; they don't all have to match. Most of the time, you're going to see the deployment's selector, the replica set, and the pod all have matching selectors and labels. But again, these are arbitrary. The deployment could have one selector-and-label match with the replica set, while the replica set and pod have a different selector-and-label combination. All that's important is that the query represented by the selector matches the labels on the downstream API it's in charge of managing. I'm beating a dead horse a little bit, but I'm doing that so that you understand this concept, because it is the grouping mechanism for Kubernetes.

So let's take a look at one of these. I'm just going to create a deployment really quickly with kubectl, and we'll look at the selectors and labels. We'll use two tools: kubectl to create it, and k9s to take a quick look at it.
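Here's what that selector-and-label wiring looks like in a Deployment manifest. This is a sketch with illustrative names; the key point is that the Deployment's selector query matches the labels stamped onto the pod template:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deploy
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx        # the query this Deployment (and its ReplicaSet) uses...
  template:
    metadata:
      labels:
        app: nginx      # ...matches the labels on each pod it creates
    spec:
      containers:
        - name: nginx
          image: nginx
```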
So I'm going to do a kubectl. First, we need a cluster: k3d cluster create. k3d, that's k3d.io, allows you to create non-production clusters on your local machine inside of Docker. It's quite literally k3s, a small Kubernetes, running in Docker containers. That will create a cluster for me, and you know how to attach to your cluster because it's in your kubeconfig file. That's in your home directory, at ~/.kube/config. You can see here, this is the default authentication paradigm for Kubernetes. There are other ways to do it, but that's the one that comes out of the box with most Kubernetes clusters. Now I'm going to do kubectl: we're going to create a deployment with an image of nginx. Oops, I always forget the name every time. So we have this deployment.

Inside of Kubernetes, you have this concept called namespaces, a lot like Linux namespaces. These are logical separations in the cluster. You can go deep-dive in the documentation on this; I'm just mentioning it because I'm going to look at everything in this namespace. So kubectl get all, and I think I said default. By default, you're in the default namespace. There is also a namespace called kube-system. Don't muck around in there; if you muck around in there, you're going to break your cluster. But you can see here, we have a set of pods with different names that have statuses and may have been restarted. Like I said, you see lots of things about your cluster just by talking to the APIs, and each one of these represents its own REST API. But we're here to look at our nginx deployment. We can see here we have a deployment and a replica set and the pod. The only thing we had to create was the deployment. So let's describe it now: kubectl describe, and we're going to look at the deployment. We can see here it has its own labels; we called it app=test.
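The commands from that demo, roughly, were as follows. This assumes k3d and kubectl are installed, and the cluster and deployment names are just what I typed:

```shell
# create a throwaway local cluster inside Docker
k3d cluster create demo

# create a deployment from the nginx image
kubectl create deployment test --image=nginx

# list everything in the default namespace: the deployment,
# its replica set, and the pod all show up
kubectl get all -n default

# inspect the deployment, including its labels and selector
kubectl describe deployment test
```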
Actually, this will be easier in k9s. We can look at the YAML in k9s pretty easily. You can do it here, or from your command line using kubectl get deployment with -o yaml to see the YAML that was used to create it, or you can use -o json if you want the JSON version. But let's take a look at the YAML real quick. You can see here we have this selector with matchLabels of app=test. Again, this is the key-value pair that all the replicas should have. And the specification area is for the image nginx, and that will actually specify the replica set. Here are the available replicas we heard about, and here's what the pod looks like.

So let's take a look at the replica set. Oops, replica set. That's why I can't spell kubectl, right? We can see here that all the pods are going to have this template, where it has the labels app=test. You can see the selectors again; this selector, this query, is how we know which pods belong to us. And if I look at the pod (let's go grab the pod and scroll up here), you can see in the metadata at the top that the labels are app=test. That's how it knows these belong to it.

Let's see if we can edit one of these. I'm going to show you something. We're going to edit this pod and, if we can, we're going to try to erase these labels. So now we have two pods. Why do we have two pods? Remember, our deployment is in charge of making sure that we have replica sets, and our replica sets make sure that we have pods. When I removed the labels from that pod, the replica set said: I'm managing all pods with the label app=test. When I removed those labels, that pod was no longer associated with that replica set. So the replica set did its job: it made sure it created its one replica of that pod. The original pod is now orphaned. We can delete the pod.
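The label-removal trick I just did in the editor can also be done non-interactively: a trailing minus on the key removes a label. The pod name here is hypothetical:

```shell
# remove the app label from the pod; the ReplicaSet will spin up a replacement
kubectl label pod test-7f9c6bd6d8-abcde app-

# both pods are now visible: the orphan and the replacement
kubectl get pods --show-labels
```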
We can do things with that pod. This is useful in different scenarios where we might have to diagnose something. Services, which we'll look at next, work on labels and selectors as well, so we could orphan a pod from a service the same way and examine the running pod while it sits there, without traffic being sent to it. That might be one place where we'd do this.

And again, we also have the k9s tool. k9s is interesting: it uses the same kubeconfig file, and it allows us an easier, more visual way of looking at things. To navigate, I hit colon and type the kind of object I want to look at. For namespaces, I can hit default and see all the pods. If I want to see all my deployments, I can see my deployments. To exit, it's just like vim: colon, q, and we're out.

So we've talked about all the ways to deploy things, but how do we expose things? That is the Services API. When you create a Kubernetes service, you're defining an internal DNS entry that can be used for a group of pods. It's quite literally servicename.namespace.svc.cluster.local by default. This gives you a consistent endpoint for a group of pods, again by selector and label. And there are different types, so we'll talk about a couple of different types and how they work. Just know that services use a selector that matches the labels on pods. We'll look at it visually in a moment.

First, services have NodePort. For every node within your infrastructure, we talk on the same port. Let's say we create a service of the NodePort type that listens on port 32000. That means on port 32000 on every node, in this example, the service attaches to all of our app=nginx pods. So regardless of which node we go to, the routing will route to one of those pods.
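A sketch of the NodePort service from that example. It listens on port 32000 on every node and forwards to pods labeled app=nginx; names are illustrative:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc
spec:
  type: NodePort
  selector:
    app: nginx          # traffic goes to pods carrying this label
  ports:
    - port: 80          # the service's cluster-internal port
      targetPort: 80    # the container port on the pods
      nodePort: 32000   # exposed on this port on every node
```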
So even if you hit a node that has no pods deployed to it, the routing rules within the node will make sure that traffic gets to one of those pods. When you create a Kubernetes Service, what it modifies depends on how things are configured. It could be modifying IPVS, it could be modifying iptables, it could even be driving hardware configuration on your router (the way, for example, some Cisco gear works with Kubernetes). So how the service works is dependent on the underlying infrastructure, either software or hardware, but in a NodePort scenario the behavior works the way I described.

Load balancers are special. Remember when we were looking at that diagram, with the cloud provider API and the cloud controller manager? The cloud controller manager is optional. It basically has API keys to talk to your cloud provider and create load balancers on your behalf. That quite literally means an Azure load balancer or an Amazon load balancer, for example. So in this case, your public IP, let's say on port 80, hits your load balancer, which is configured like any other Amazon or Azure or GCP load balancer, and then traffic comes back here and works just like what's called a ClusterIP service. The NodePort routes to a ClusterIP service, the load balancer routes to a ClusterIP service, and ClusterIP just means "within the cluster": I have a DNS endpoint discoverable by other things inside the cluster, but not outside the cluster. Once you get past that load balancer boundary, everything else works essentially the same way. So we've talked about deployment methodology and we've talked about exposure methodology. Let's talk about configuration.
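The LoadBalancer variety looks almost identical as a manifest; only the type changes, and the cloud controller manager provisions the external load balancer for you. A sketch with illustrative names:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-lb
spec:
  type: LoadBalancer    # cloud controller manager provisions, e.g., an AWS or Azure LB
  selector:
    app: nginx
  ports:
    - port: 80
      targetPort: 80
```

Behind the provisioned load balancer, routing to the pods works just like a ClusterIP service.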
If you read the 12-factor manifesto for application architecture, it talks about keeping the configuration and the application separate and applying them separately, and this is how that works in Kubernetes. I warn you: Secrets are not secret. They're just base64 encoded. If you have access to the box and you can see the secrets, you can decode them; they are not really secret. "Secret" is just the API name. If you actually want secret secrets, you'll want some sort of secure key-value store. Your cloud providers have secure key-value stores, and if you're running on-prem, consider something like HashiCorp Vault; it works pretty well. Oh, and on the load balancer front, if you're working on-prem, I like MetalLB. I've had good success with that. There are lots of options out there.

These configuration values get exposed to your pods either as environment variables or as files in the local file system. So you can decouple your configuration from your application code base. Let's take a look. This is some really simple YAML for a ConfigMap. The Secret is nearly identical, except the values are all base64 encoded. You have an apiVersion, you have a kind (that's just your API), some metadata like its name and namespace (everything gets a namespace), and you can pass labels and annotations as well. And then data. This could be mounted into the file system as a file called item1 with the contents of the file being "dev stuff". Or these could be environment variables item1 and item2 with values "dev stuff" and "more dev stuff". And you could separate your ConfigMaps and Secrets for your various environments. Using Secrets is a pretty common way of handling things like TLS certificates for the underlying web-based containers, authentication, and things of that nature. It's a pretty common paradigm.
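Here's roughly the YAML I was describing: a ConfigMap and the nearly identical Secret, where the values are base64 encoded. Names and values are illustrative:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: default
data:
  item1: dev stuff
  item2: more dev stuff
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
  namespace: default
type: Opaque
data:
  item1: ZGV2IHN0dWZm    # base64 of "dev stuff" -- not actually secret!
```

You can see how un-secret that is: echo 'ZGV2IHN0dWZm' | base64 --decode prints "dev stuff".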
The last high-level API we're going to talk about (because this is enough to set up a simple infrastructure for an application to test with) is ingress. Ingress defines how traffic outside the cluster gets routed to inside the cluster. This is your HTTP and HTTPS, your layer 7. This is how layer 7 gets to your services, which are layer 3 and 4. It's quite literally the translation from HTTP into TCP/IP or UDP/IP. It routes traffic to internal services based on host and path. Ingresses are implemented by an ingress controller, which is a container specifically for doing ingress work, usually provided by nginx or HAProxy; AWS ELBs or Azure load balancers can also act as ingresses. And like we said, it takes your domain or your path, kind of like a route in ASP.NET MVC or in Spring, and converts that route, through the ingress controller, to a service. This might be where you strip off TLS, or you pass TLS through, depending on what you need and what these things are going to do. So maybe this one's traffic is unencrypted and you terminate TLS here, or maybe you just pass the TLS through the services to the underlying pods.

Ingresses and services have some caveats you have to know. I mentioned namespaces. The ingress controller itself can live in a different namespace, but the ingress rule that you make, the Ingress API object, and the Service it routes to have to live in the same Kubernetes namespace. If they don't, things don't work right. And once you get past this ingress boundary, the services work just like we talked about earlier, routing inside of your cluster. One other thing to note: remember how I said the only grouping mechanism was labels and selectors? Well, here I'm lying. Ingresses talk to services by name.
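A sketch of an ingress rule routing by host and path to a service. Note that it references the service by name, so both must live in the same namespace; the names here are illustrative:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
  namespace: default
spec:
  rules:
    - host: example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nginx-svc   # referenced by name; must be in this namespace
                port:
                  number: 80
```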
And that's why they have to be in the same namespace for that whole name paradigm to work. So let's take a look at a big picture of how this might all fit together. This might be a simple infrastructure for, let's say, an internal application. Now, I haven't mentioned DNS yet, but there is an ExternalDNS project. If you search in your browser for "external-dns Kubernetes", you'll find a subproject that allows your Kubernetes cluster to manage DNS entries based on services. So you could do everything publicly, or have a local DNS server and MetalLB for your load balancer and do everything internally, but you can manage all of that. Your client does a DNS request, which ExternalDNS has configured; that returns an IP for your load balancer, and you hit that Kubernetes load balancer, whether that's MetalLB internally or your Amazon or Azure load balancer.

Then the load balancer spreads traffic out to ingress controllers, and I have this drawn with a dashed line for a reason. There are two patterns that are pretty common with ingress controllers: some run the ingress controllers in Kubernetes, some manage them outside of Kubernetes. Regardless, there tends to be a separate set of nodes or infrastructure set apart for doing ingress, because the ingress controllers handle a lot of traffic. So it's not uncommon to have separate nodes for ingress controllers and to route load-balancer-to-ingress-controller traffic as quickly as possible. Now, it's also possible to have the ingress controllers share nodes with the other traffic running within your system; dedicated nodes are just a common pattern. And just know, 90 to 95 percent of the time, I see these inside the cluster now. Early in the Kubernetes days I sometimes saw them outside, but generally nowadays I see them inside the cluster. The ingress controller then knows, by HTTP route, to send traffic on to these services.
Now, if the route doesn't match, there's a default backend, and your ingress controller pod will know how to handle that default backend. Then traffic goes from the ingress controller to the service, and the service will have a selector, in this case app=myapp, and that selector will match labels on these pods to get traffic to them. I haven't mentioned it, but pods can have multiple labels, so these pods can literally receive traffic from both services, or from no services; you get the idea. This is a high-level picture of what you could build just with the simple APIs you've learned.

So let's talk about some tools: tools for running clusters locally, running in the cloud, and dealing with scale. Again, we're building on concepts here. For running locally, we've used k3d. This is just k3s in Docker, so if you have Docker, you can run k3d. k3s is a small, lightweight Kubernetes distribution that works great on everything from your large data center to your Raspberry Pi. I use k3s and k3d almost every day for running locally. You can consider Rancher for multi-cluster management, or Docker can run its own Kubernetes cluster. You have kind, which is Kubernetes in Docker. That project is run by the Kubernetes community, and it's just like k3d except it's upstream Kubernetes instead of k3s. And there's KWOK, for when you get really advanced. This is Kubernetes without kubelet. Going back to that cloud controller manager diagram, you have the kubelet here doing management of things on your node, working alongside your Docker or containerd engine, manipulating things on the node. With KWOK you can create thousand-node clusters without kubelets to do massive scale testing on small form factors. But you're getting pretty advanced when you get down there, and you really know Kubernetes when you get there. In the cloud, if you're already working with a cloud provider, you might as well keep working with them.
You've probably done your IaC with them, and you probably have lots of scripts for getting your stuff up and running with your cloud providers, so go ahead and get started with their Kubernetes offering. Just understand that these Kubernetes offerings have some quirks to them and can get a little expensive from time to time. If you want something a little less expensive, I use Linode and DigitalOcean every day. I have a soft spot for them, especially for Linode. I used to run a Kubernetes meetup back in the day, and we went to Linode to do our meetups, and that was a lot of fun. So, soft spot for them. Plus, they're still relatively inexpensive. They've been bought by Akamai, but the APIs are still Linode-branded things. Feel free to give those a try.

Then, dealing with scale. There's a whole discussion we could have today or some other time about telemetry and observability. People generally start with one of these frameworks: either ELK, that's Elasticsearch, Logstash, Kibana, or PLG, which would be Promtail, Loki, and Grafana. These are open source observability stacks. ELK has been around longer. You can swap the L, Logstash, for Fluentd; that's another pattern you'll see. I'll just warn you: be careful what you start watching and monitoring and alerting on. Alert blindness is a real thing, and then your alerting won't be useful to you anymore, and you'll just be wasting cycles on something that's not really useful to you. So pick one that you like, one that your company already uses, one that your friends use, or buy a third-party SaaS. There are actually really good third-party observability platforms out there that will do tracing and observability, like Lightstep, for example; they're big contributors to the OpenTelemetry project, and it's worth your time. And then, if you're getting into supply chain security, Chainguard Images are really, really cool and very, very secure. We use them at Defense Unicorns.
They're used in some government projects as well. And then if you're getting into the software supply chain and packaging, especially bringing software across the air gap, check out Zarf. Full disclosure: I work for Defense Unicorns, and we're a major contributor to Zarf. It's a big part of what we do. It's a project that we started, fully open source and free, just like the other ones we've talked about this evening.

And then just know you're scratching the surface. There is a lot to Kubernetes, and you'll think you're just getting started and then find out there's a whole lot of other tools to learn. It's not just the ones you just saw on the screen, it's all of these. This is the cloud native landscape. All of these tools, or many of these tools, work together to create your infrastructure. Now, just know you don't have to know all of these tools. You pick one or two out of here that work for you. You're probably working in Helm, maybe some buildpacks. You might use Chef for setting up nodes. You might use a message bus. You're probably going to use a database, and that's not what I would have told you three years ago. Back then it was "don't do databases in Kubernetes", but nowadays I don't have a problem with that; I do it myself. And then CI/CD is going to be important. The downside is there's a ton to learn. The upside is there's a ton to learn, and it's all really cool. There are lots of really great open source projects that would all love your contribution. So get involved, try some things out, learn some things, and feel free to hit me up if you have any questions about anything. I've been doing this for a little while now and I'm more than happy to help.

To get in touch with me and the Zarf team: we're on the Kubernetes Slack in the Zarf channel. I hang out in that channel all the time; you can hit me there. I'm @JBB in most Slacks: the Rancher Slack, the Kubernetes Slack, the Istio Slack, and a few others.
I do a Twitch channel right now on Tuesdays and Thursdays, about noon Eastern, testing out Kubernetes things, testing out open source projects, just hanging out while I'm doing my work, maybe doing some coding. If you have any questions for our larger team, you can hit Defense Unicorns on Twitter and LinkedIn. So thank you for your time. If you'd like a copy of this PowerPoint, hit the QR code or the link; all the links are in there.

Real quick before we leave, since we have a couple of minutes (I'm looking at my timer): with k3d, we created a cluster with k3d cluster create. And kubectl is an easy download. If you're on Linux or macOS, the brew package manager is an excellent place to start; you can go to brew.sh, and it handles both Linux and macOS packages. A lot of the tools I install are just brew install k9s, brew install kubectl, brew install k3d; you get the idea. That's the quick way. It's not always the most secure way for a full-scale deployment, but it's a nice way to get started. So with that, thank you to the Linux Foundation, thank you to Open Source Summit, and thank you for your time today. And as always, I will see you in the community. Have a great day. Bye now.