My name's Phil Lombardi. Some of you may know me from the DevOps community in my own Boston, some of you may have seen me traveling around giving talks at various conferences, and some of you may have actually used the tools I've written. I'm currently at a company called DataWire.io, where we're building open source tools for developer productivity on Kubernetes. The topic of today's talk is Kubernetes in about 45 minutes, and its subtitle is really "everything you need to know to be dangerous." So without further ado, let's get going.

All right, so we already know who I am, I think. You can find me on Twitter, by the way, if you want to tweet at me: The Big Lombowski. Go for it.

So why are we all here? A couple of reasons. Either you're curious about Kubernetes and you're not using it yet, so you want to know what Kubernetes is and what the ecosystem around it looks like — containers, Docker, things like Envoy. I'll talk a little about all of those, but it will primarily be a deep dive into the core things you need to know if you're using Kubernetes. Or you're already invested in Kubernetes, but you're looking at developer productivity: you have developers using it, and you want to see some tools that may make their lives easier. The second half of the presentation will be focused on that kind of stuff. Or you're here because this is the last presentation, and you just feel guilty about leaving. Question? Oh, is that you? You're the one? OK. I totally understand.

Anyway, the agenda is four parts, and then we can do Q&A, assuming I actually get this thing done in 60 minutes — which, based on the title, we should be able to do. Part one is containers, Docker, and Kubernetes. I noticed some people during Richard's talk, about an hour ago, had questions about what containers are, so I figured a good primer on containers, Docker, and Kubernetes would be useful. Then I'll cover core concepts for working with Kubernetes — the actual constructs you run into every day. Then I'll talk a little about development workflow on Kubernetes, so you can try to make your developers more productive. And finally, I'll go into logging, debugging, and resiliency on Kubernetes, and talk about some tools you can use there, with a focus on log aggregation and that kind of stuff.

So what is a container? It's a common question from people who have been working primarily with virtual machines. They hear about people running code inside containers and wonder, what is it? And the common answer is "well, it's not like a virtual machine," which doesn't really describe anything. So here's what I'll tell you: a container is a form of virtualization, but unlike a virtual machine, you're not abstracting and virtualizing the hardware layer, and you're also not running a full operating system stack inside it. A container is really, basically, process virtualization.
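You can see this for yourself with a couple of commands — a minimal sketch, assuming Docker is installed; alpine and the sleep command are just stand-in workloads:

```
# Start a container in the background
docker run -d --name demo alpine sleep 300

# From the host, the "containerized" workload shows up in the
# ordinary process table, like any other process:
ps aux | grep 'sleep 300'

docker rm -f demo   # clean up
```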
You take the networking stack, the filesystem stack, and the process stack from the host operating system and put a view over them, so that the running container only sees what's in its own group. Everything else is opaque to it — it doesn't even know the rest exists. And yet, if you look at it from outside the container, you'll just see a regular process.

Containers themselves are not actually the deployable artifact; for that, there's this thing called images. The container is the runtime; the image is the shippable artifact — immutable, deployable, and runnable. The beauty of images is that they're treated like static binaries. Think about Golang, which has become really popular in the last two or three years: its claim to fame is that you get a single shippable artifact, and you can hand that off to somebody and it will run on whatever they're running. There's no need to worry about libraries that have to ship with it, no need to worry about the runtime — just hand them the Go binary and it's good. Containers and container images are basically the same concept: you have an image, you can ship it off to somebody else, you can push it up to somewhere they can pull it from, and they can run it as if it were a single static binary.

All of this has been popularized mostly by the Docker toolchain and the Docker ecosystem, but Docker is certainly not the only form of containers in the world. rkt (Rocket) is another popular alternative — and by popular I mean some people are using it, but it's not ubiquitous. Similarly, LXC has been around for a long time, but it's nowhere near as common as Docker. People have adopted Docker as the primary container engine for working with these things.

So, a quick dive into what Docker is. It's a combination of things. It's a tool: if we work with containers, we're all familiar with the Docker CLI, which provides the ability to run containers, build container images, and do a whole bunch of things around inspecting and running containers. It's an ecosystem: there's a bunch of tooling built up around Docker — Docker Machine, Docker Swarm, that kind of thing — plus orchestration tools built around it, including Docker Swarm, Kubernetes, Amazon ECS, et cetera. And because it's a combination of a tool and an ecosystem, it's what we these days call a platform for building things. For our purposes, it's important because it's the default runtime in Kubernetes. When people talk about Kubernetes, they're usually talking about a Docker-based Kubernetes. There is the ability to run rkt as another container engine, but since rkt isn't very popular, you don't usually run into it in the wild.

Why containers? The reason containers have really taken off is that developers find them fast and easy to produce. There's a convenient format for describing what a container looks like, and the toolchain is simple enough that they can specify in about ten lines how to package up their entire app.
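As a sketch of those "about ten lines" — the base image, paths, and names here are illustrative, not from the talk:

```dockerfile
# Package a Java app as a container image
FROM openjdk:8-jre
WORKDIR /app
COPY build/libs/blog.jar .
EXPOSE 5000
CMD ["java", "-jar", "blog.jar"]
```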
That file gets fed right into Docker, which builds the image, and then the image can be shipped off to a Docker registry where someone else on the team, or an external organization, can just as easily get what was produced. That's a pretty big speed-up and a pretty big packaging improvement over what developers were using previously.

In the past, developers might package things as jar files or zip files and ship those off somewhere, or they might create VM images if they were doing immutable infrastructure. Going down the VM image path, the speed of actually producing the image is considerably slower than producing anything else. In a VM image setup, you have to start up the VM and run through its init process, then pull down system updates to make sure the image is up to date, then layer in all the application-specific stuff you want, then shut the VM down, then take the disk the VM was running on and convert it into some format that can run on a hypervisor later on. That's slow compared to building a Docker image. If you just want to package up, say, a Java application so you can run it again, the VM approach can take ten minutes even with the fastest tooling — letting something like Packer do the work for you will still take a long time compared to specifying how the image looks with a Dockerfile and running it through the docker build command. It's ten minutes versus sixty seconds or less.

It's also a great way to isolate different components in your system. Say you're working with a Java application where there's the app code and then the libraries that go with it: you can split those up so that the libraries ship in a different container, link it over with the app, and update the library container at any point in time. It also lets you ensure a reproducible runtime for your app along the dev, build, test, and prod pipeline you might create. As operations, we can make the test, stage, and prod parts more or less completely reproducible, but there's long been the "it doesn't build on my machine" or "it doesn't run on my machine" problem — which is an easy thing to fix in the Docker world, because you can package the entire build infrastructure needed for a code base right into the Docker image and share that with your team. Rather than having people install Gradle or Maven or pip and all the Python stuff or the Go toolchain, you package all of that into the Docker image, ship that around, and you have reproducible builds at any point in time.

And it's easy to share. If you're working on a team, the Docker registry system is all built around the ability to pull images from a central location, layer changes onto them, and push them right back up with a named tag. You can go to the person next to you in the office and say, just pull this image, and they'll have it.
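The whole sharing loop is just a handful of commands — registry and tag names here are illustrative:

```
docker build -t registry.example.com/team/blog:1.0 .
docker push registry.example.com/team/blog:1.0

# ...and whoever you hand that name to just pulls and runs it:
docker pull registry.example.com/team/blog:1.0
docker run registry.example.com/team/blog:1.0
```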
The same works with an external partner or a customer: pull this image and you'll have what we were working on. So that's Docker, that's containers. What is Kubernetes?

At a very technical level, Kubernetes is all about running a massive number of containers, and it's based on lessons Google learned building out infrastructure like Borg. The idea is that it can schedule all kinds of containers. It can schedule long-running things — your typical web services, or the microservices that make up your application — but it can run short-lived processes as well. A typical use of the short-lived cycle in Kubernetes is running database migrations before a container comes up: you bundle all the database migration logic inside a container that runs before the other container starts, and now you have reproducible database migration code that can be put into a container and shipped around. Similarly, you can schedule cron jobs and run those as containers too.

I personally like to think of Kubernetes as a distributed operating system or a process manager, where the things being run are containers, and Kubernetes is responsible for managing a whole bunch of worker compute nodes and putting stuff out there. That analogy works for some people and not for others, but it's the one that works for me. Richard ran with "the POSIX of the cloud" as another example.

Trying to get at "what is Kubernetes" at a different level, I'll go with an office tower analogy. Think of your typical product as the Chrysler Building: your company is building the Chrysler Building, and the business logic contained within it is the offices and the workers, the individual functions going on inside that building. Kubernetes is the frame your stuff is built atop. It's not the framework you build your applications with — it's not Django or Ruby on Rails or Spring. If you think of your application frameworks as the plaster, the lighting, and all that, Kubernetes is the load-bearing walls, the steel, the wiring behind the walls — the thing that lets you send messages from floor 50 to floor 7, or get your visitors from the lobby all the way up to the top of the tower. That's what Kubernetes is doing. It's the foundation for your app, the foundation for your team to build a platform.

Why are people adopting Kubernetes? Basically because it has grown the most popular ecosystem out there in terms of contributors and rate of change. There are a couple of other options in this field: Amazon ECS, Docker Swarm, HashiCorp's Nomad, and Mesos. All of these are good tools, and they all have drawbacks. With Amazon ECS you're locked into running on Amazon — you can't run it locally, so you have that problem. Docker Swarm carries the reputation Docker has had at times for engineering quality.
HashiCorp's Nomad is a bit more of an unknown tool, though very customizable and very productive if you want a minimal design. And then there's Apache Mesos, which doesn't really belong in this category, because Mesos is more of a generic scheduling framework. People lump Mesos and Kubernetes into the same bucket, but when you think about what Mesos actually does underneath, it's about scheduling resources rather than containers — the design is built around any kind of abstraction, so it could be virtual machines, it could be actual bare metal. They're not quite parallel. The point is, they're all a lot less popular.

Why has Kubernetes taken off? It seems to have figured out how to build the biggest ecosystem going. It's a combination of big-name backers: Google obviously started the project and has put a lot of money into it, Red Hat is working on it, Oracle is working on it, Amazon is working on parts of it, and Microsoft is heavily involved. It's pulled in all these big companies, which have put a lot of engineering effort into it. It's very open to contributions, so people can propose things and get them in very quickly. The release cadence is three months, like clockwork — they don't spend any time dilly-dallying with releases. If something's not ready in three months, it just gets pulled off the release train and they go. So 1.7, 1.8, 1.9 — all very predictable in when they're going to occur, which makes it really easy to predict when features are actually going to land.

It's runnable just about anywhere you want to run it. You can run it in the cloud — very typical on top of Amazon, on top of Azure, or on Google's cloud — but bare metal is another option, or you can run it locally on your desktop, so you can have a local Kubernetes cluster available to you at any point in time. And really, it's the unprecedented cloud portability it gives you. You're no longer necessarily bound to running on Amazon: if you want to put your compute workloads on Google and keep your storage in Amazon, that's an option available to you. It gives you a migration path and a way to get away from Amazon's domination of the marketplace.

Let's briefly talk about the Kubernetes architecture. It sounds more complicated than it actually is. The setup is: you have one or more masters (more than one for high availability), followed by one or more nodes that actually take workloads — the nodes are what run the containers. On the master you have the API server, which is what you usually talk to with the kubectl command, plus the scheduler and the controller manager, which are responsible for placing containers out on the nodes and for making sure things actually exist — I'll get into that in a second; the way Kubernetes handles the world is kind of like a thermostat. The nodes are very simple to think about: Docker is obviously the actual runtime, the kubelet is the orchestration engine for talking to Docker, and kube-proxy is what enables communication between the nodes — everything flows through kube-proxy to talk to the other nodes. So the kubelet handles the actual orchestration, and Docker handles the runtime of the containers.
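If you want to poke at those moving parts, they're visible with ordinary kubectl commands — a quick sketch; exactly what shows up in kube-system depends on how your cluster was deployed:

```
kubectl get nodes                          # the masters and workers
kubectl get pods --namespace kube-system   # system components: kube-proxy, DNS,
                                           # and often the scheduler etc. as pods
```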
So what are the big five things you really need to know when using Kubernetes? It's pretty simple: there are the things that run your code, the things that let you connect to your code, and the things that let you configure your code. In Kubernetes, those are pods, deployments, services, config maps, and secrets — the latter two, config maps and secrets, having pretty much the same interface, with some subtle differences I'll talk about.

A pod, you say. Kubernetes groups containers: you don't tell Kubernetes to run a container directly, you tell it to run a pod, and a pod is a grouping of one or more containers that are strongly related to each other. By strongly related, think about a typical application like a blog, where you'll have a front-end tier — maybe that's Nginx, or a static site, or an API server — plus a comments server and a story server, and maybe a Redis instance for caching. Persistence we'll just leave out for now. That's your application at a glance: you need all those components, and you need to deploy them. When you think of a pod, think of those components as all local to each other — they're going to be deployed onto a single worker within the Kubernetes cluster. So when we talk about pods, what we're really saying is that the containers share the same host. All the containers within a pod have the same IP address and the same port space. Because they share the same host, they can also do things like use Unix domain sockets, and they can share the filesystem. Basically, the idea is that there's locality when you're in a pod.

The pod is also the unit of scaling in Kubernetes. Say you want to run two, or three, or 50, or 100 instances of your app: you bundle them up into pods, you tell Kubernetes, I want 50 of these out on my worker cluster, and it goes and schedules all those pods across the entire system. And now you have the problem of — well, hold on, I'm moving ahead too fast. So: you've got all these pods. Each has its own individual address and its own port space, and each pod's containers are local to the worker node it lands on. I like to think of a pod as basically a host — a virtual machine, a regular bare-metal host, your laptop. You can reference localhost: you can talk to the other stuff in the pod as localhost, since they all share the IP and the port space. I'm going to hit that again.

Important to note about pods: they're not durable. What that means is, yes, you can deploy the pod, but Kubernetes is not going to ensure the pod stays running — if the node fails, the pod goes away. But that gets to my last point here: you don't really interact with pods much.
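Still, since everything else builds on it, here's a minimal sketch of a bare pod spec — names and images are illustrative, not from the talk:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: blog
  labels:
    app: blog
spec:
  containers:
  - name: frontend
    image: registry.example.com/team/blog-frontend:1.0
    ports:
    - containerPort: 5000
  - name: redis            # co-located cache: the frontend reaches it
    image: redis:3.2       # as localhost:6379, since they share the pod
```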
You need to know about pods at the abstraction level, because they're the most primitive thing you interact with in terms of specifying the shape of your application. But very rarely do you put a bare pod definition into your Kubernetes configuration, ship it off to the API server, and let it deploy just that pod. That's an atypical use case. What people really use is this thing called deployments.

Deployments are basically a mechanism to configure, scale, and update applications. Going back to our blog: say you want to run three of them. You tell Kubernetes, create a deployment with three replicas of this. The deployment mechanism also lets you specify how you want updates done. It can do a full-on recreate of the application every time — not even rolling, just put out new pods and shut down the old ones, without thinking about safety. Or it can do a specialized rolling update: it can go through the entire fleet, say 50 pods, and one by one update them to a new image, and you can specify a scaling factor. The point is, you only have to tell it what to do. It's a declarative system: you specify what your application looks like by specifying a pod, you specify how you want the update done and how fast, and Kubernetes does all the mechanical work for you internally. In that way, it operates like a thermostat. When you set your thermostat to 70 degrees, you don't worry about how to make your house 70 degrees — you let the furnace figure out how to get the temperature up, and the thermostat tell the furnace when to cut off, and all that jazz. It gets you from current state to desired state.

So, great. We've used deployments and we've got 50 of these pods running, and now we actually want to talk to them. As I mentioned earlier, we have a very tricky problem: all those pods have individual IPs and individual ports, so how do you communicate with them as if they were a single application? What you need is a service. It's actually a well-known problem with a well-known solution we're all extremely familiar with: DNS. Services basically configure an internal DNS server for you. You give the service a name, and you give it a selector. You can label the pods you put out through deployments, and the selector matches those labels — so I might label pods with app=blog plus an environment, and the selector on the service would be app=blog, env=prod. When you hit that DNS name — blog — it sends you right into the actual blog app, doing a weighted round-robin (not a dumb round-robin) across the individual pods.

There's a long form of the name as well: blog.default.svc.cluster.local. You only use that in special cases — like when you're actually using namespaces in Kubernetes. The "default" part is the thing that would change; it could be something like your team name, or staging, or prod could also be namespaces. There's no real policy around how you should use namespaces; people have different strategies for it.
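To make the deployment side concrete, here's a rough sketch of the three-replica blog deployment described above — API version and names are illustrative (clusters of the 1.7 era used extensions/v1beta1):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: blog
spec:
  replicas: 3                # "I want three of these out there"
  selector:
    matchLabels:
      app: blog
  strategy:
    type: RollingUpdate      # or Recreate: kill everything, then start fresh
    rollingUpdate:
      maxUnavailable: 1      # "how fast": one pod at a time
  template:                  # the pod spec lives inside the deployment
    metadata:
      labels:
        app: blog
        env: prod            # these labels are what service selectors match on
    spec:
      containers:
      - name: blog
        image: registry.example.com/team/blog:1.0
        ports:
        - containerPort: 5000
```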
Back to the service side: you hit that name and it resolves through to the pods — nothing we haven't seen before with A records. What starts to get cool is the selector capabilities. You can create two or more services — n services — and start using the label matching. Looking at the blog pods: blog-0 and blog-1 have app=blog and env=prod. Then there's this poorly labeled blog-0 — which should read blog-2 — that just has app=blog and no env on it. Think of it like a Venn diagram: when you match all the labels, that's where traffic goes; if you match only some of them, that's where it goes. So we have this blog service, and if someone inside the cluster hits http://blog, they end up at blog-0 and blog-1, the left-most side. And if they go to http://blog-staging, they end up doing something interesting: they go across all three — blog-0, blog-1, and blog-2. (So: labels, good.) That's a really useful feature. It sounds like, why would I want to do that? But think about it: if you were doing canary deployments on Kubernetes, you'd have a mechanism to start feeding some amount of traffic into a new container you just brought up, just by using labels on the service.

There are several types of services in Kubernetes, and they're all pretty useful. The first thing to know is that most of the time you're going to use a ClusterIP. If you map a ClusterIP to something in EC2 specifically, it's a private load balancer — a private ELB. It has no public endpoint: the only way to talk to a ClusterIP is from inside the cluster. So if you have a service foo, it can talk to bar if bar is a service, but you cannot come in from the public internet and talk to either foo or bar if they're both ClusterIPs.

NodePort opens up a port on the underlying host machine and routes traffic in. This is how you actually end up getting public traffic from the outside world into your cluster. You'll open a NodePort service — typically for an API gateway type of service, a front-end service of some sort — and traffic can come into that. If you do that, you're responsible for updating some kind of DNS system with the actual IP addresses of the individual nodes, since it's going to open the port on all the nodes across the cluster.

The third type is the LoadBalancer service, which is interesting: it's only available on certain platforms. You'll bump into it on Amazon and on Google, and you may bump into it on Azure, but I'm not sure. What it does is actually go off and create an ELB for you — or a Google load balancer, or whatever the equivalent is — and manage all the work of adding the nodes behind it to the service pool. So people can talk to an ELB endpoint, you can put a DNS record on that, traffic comes in through it and routes to one of the worker nodes in the Kubernetes cluster, which then forwards it to whatever container is actually running the code.
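In YAML, the service fronting those pods is only a few lines — a sketch, names illustrative; the type field is where ClusterIP versus NodePort versus LoadBalancer gets chosen:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: blog
spec:
  type: ClusterIP      # NodePort opens a port on every node;
                       # LoadBalancer also provisions a cloud LB where supported
  selector:            # Venn-diagram matching: a pod needs BOTH labels
    app: blog          # to be picked up by this service
    env: prod
  ports:
  - name: http         # naming the port is what gets you an SRV record later
    port: 80
    targetPort: 5000
```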
Finally, there's the ExternalName service, which is often forgotten but incredibly important. The ExternalName service isn't about talking to something inside the Kubernetes cluster at all — it's about talking to an external service outside the cluster. Think of a use case where you need to talk to a Postgres database. In some situations you may want that Postgres database to be an RDS database run by Amazon; in other situations you may want to just run Postgres as a container, perhaps for development speed. The ExternalName feature lets you put up a name that acts as a redirection point: your business logic looks up the place it's going to talk to via the ExternalName service, the service hands back the backend it's going to talk to, and that backend can change depending on what you want it to be. One day it could be an RDS database, another day a database in a Docker container, a third day some other thing. The point is to avoid injecting configuration into the application that says, go talk to this RDS-specific CNAME they set up for me — and then having to change that later when you want to go into development mode. Instead, the code says: we talk to this ExternalName DNS record, which is a CNAME pointing to one of these other things. Very useful.

Anyway, to summarize services: they create DNS A records for the ClusterIP, NodePort, and LoadBalancer types — but not for ExternalName, which is a CNAME. They have powerful label-matching capabilities that let you route to different pods within your Kubernetes cluster, based on what's basically a Venn diagram. And the third thing, which I find really powerful but a lot of people don't know about: Kubernetes supports SRV records, which are really useful if you need to do a port lookup and you don't want to hard-code the port into your application code. Your developers will often hard-code 5000 into their app, or 5673, or whatever. Instead, they can do an SRV lookup in their code, and the port number they have to talk to comes from Kubernetes — which is really useful if you're changing things up and you don't want developers making things brittle by picking random ports and relying on them forever.
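A sketch of that indirection, plus the SRV lookup shape for named ports — the RDS hostname is illustrative:

```yaml
# "db" resolves to whatever backend you point it at today
apiVersion: v1
kind: Service
metadata:
  name: db
spec:
  type: ExternalName
  externalName: blog-prod.abc123.us-east-1.rds.amazonaws.com

# And the SRV record for a named service port, so code can discover
# the port instead of hard-coding it:
#   _http._tcp.blog.default.svc.cluster.local
```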
Config maps and secrets. We've covered the things that run code and the things that let you talk to the code that's running; now, how do you configure the code so it does what you want? There are two constructs for this. The first is the config map. It's the age-old problem: containers — or VMs, if you were doing immutable infrastructure — are immutable, so how do you get configuration data into them? Lots of us have probably come up with horrible solutions for this over the years. I know I've written some terrible things that did it with S3, and some really bad code a couple of years back that talked to Consul and pulled data out at boot time for a VM, without really thinking about things like versioning or any of that.

Config maps are the answer. At deployment time, you specify all the configuration you need and put it into Kubernetes. When a pod that references a config map comes up, Kubernetes will automatically inject all the configuration into the pod — either as environment variables or, if you need to do really advanced configuration, by laying entire files down onto the filesystem through a volume.

Secrets are a cousin of the config map. They're kind of hacky these days — you may or may not want to use them, depending on how concerned you are about security at your company. The problem with secrets in Kubernetes is that they're stored in plain text on the master, and the answer from the Kubernetes people so far has been "well, make sure you secure the master," which isn't a very satisfying answer. They are working on making that not the case, but that was the state of things as of 1.7, last I checked; I think in 1.8 they're finally rolling out some security there. The big difference between config maps and secrets is really about how they're issued out to the nodes. A secret is only shared from the master, where it lives, over to a node if a pod there actually needs it — if a pod says, I need this secret, give it to me, the master ships it over; otherwise it stays only on the master. Further, on the node it lives in tmpfs — in memory. Kubernetes won't write secrets out to disk when it ships them over to a node, so they live there only ephemerally.

I put this comparison together because I know a lot of people are really familiar with AWS and EC2, and it's sometimes easier to map concepts over than to understand them through all the features. A pod maps really nicely to an EC2 instance. A deployment maps really nicely to an auto-scaling group plus a launch configuration — with a little bit more going on, because a deployment has policy around how to upgrade, which launch configurations and auto-scaling groups don't really have; there you have to orchestrate it yourself, whereas deployments have some built-in policy around it. Services map to, basically, an ELB — I should have put "DNS record" in there as well. And config maps and secrets have no analog in the EC2 world, other than building some custom solution on DynamoDB or S3, or running Consul or etcd yourself.
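Before moving on, a sketch of what the config map side looks like in practice — names and keys are illustrative:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: blog-config
data:
  LOG_LEVEL: info
  settings.yaml: |          # whole files work too, mounted via a volume
    comments_enabled: true

# ...and in the pod/deployment spec, pull it in as environment variables:
#   env:
#   - name: LOG_LEVEL
#     valueFrom:
#       configMapKeyRef: { name: blog-config, key: LOG_LEVEL }
# or lay it down as files through a configMap volume. Secrets wire in the
# same way, just via secretKeyRef / a secret volume instead.
```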
Let's talk a little bit about developer workflow on Kubernetes. Part of our role as operations, DevOps, platform engineers, infrastructure engineers, whatever — it's really about aiding our developers, making it so they can ship code quickly and efficiently, do it correctly, and be productive. And Kubernetes is awesome for that: it has a lot of power, a lot of flexibility, and it can satisfy just about any use case you throw at it. But great power comes with a lot of potential for learning pain, and learning pain is probably the single greatest thing that slows engineers down, because they'll start digging into every little detail they don't need to think about.

So how do we make developers productive? Part of it is really about laying down some standards — I'm not saying there's a particular silver bullet here, but it's about standards, and about making sure configuration and things like that are specified in a way nobody has to guess about.

How do we do that? Kubernetes has this thing called a manifest. A manifest is basically a giant blob of YAML — and by giant I mean under 200 lines. It can be split across one file or n files, and it can be written in YAML or JSON. I strongly recommend avoiding the JSON format, for a number of reasons, mostly around comments. What you do with this YAML is write out what you want, declaratively: this is what the state of my world looks like — it's going to be a pod, it's going to have four containers: Redis, comment server, post server, front end. You put that in the file, and then you tell the kubectl command — which is basically the interface people use to talk to Kubernetes — go apply this thing. kubectl takes that config file and makes a whole bunch of API calls; there's business logic built into kubectl. It checks the state of the world, computes a diff of what has to be done, and applies the changes it sees to bring you from the current state of the world to the desired state. Basically, as I mentioned a little earlier, the thermostat model.

Finally, about manifests: while you can write all the config hard-coded — there's a field for putting a Docker image name or tag right in there — you'll then be updating files all the time as changes come in, and in a fast-paced system that can be a negative. These days I find that parameterizing the templates, and then running something over them at deployment time to swap the parameters for what you actually want, is a much more satisfactory solution. There are a bunch of mechanisms for this: you can do it in Python with Jinja2, Go obviously has templating built in, and I have, in a panic or a time crunch, done it with sed and not been particularly proud of myself.

I often get asked by people starting to build on Kubernetes: how do I structure an application in a sane way? This isn't about how to lay out your source code, or really even where to put your Dockerfile. But I do generally suggest people create a k8s directory — k8s being the abbreviation for Kubernetes — or something similarly named, and start putting YAML files in there describing what's needed to run the application. Put your manifests in there in concrete form, with all the values hard-coded in, and then you can just run kubectl apply on that entire directory and kubectl will know what to do: it'll create the deployments correctly, write the services out to the cluster, and set everything up just the way it's supposed to be. Or you take the approach I mentioned before and have something code-gen the templates: take a template, render it with the values you want, and then, same idea, kubectl apply the output.
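As one low-tech sketch of that render-then-apply loop — the Python/Jinja2 one-liner here is just a stand-in for whatever templating step you use, and the file names are illustrative:

```
python -c 'import sys, jinja2; print(jinja2.Template(sys.stdin.read()).render(tag="1.0.3"))' \
  < k8s/deployment.yaml.j2 | kubectl apply -f -
```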
I keep harping on avoiding hard-coding because it's bad — I've done it before. Really try to do the templating if you can. A further lesson I've learned: avoid hard-coding namespaces directly into your templates; instead, prefer to pass the namespace as a switch to the kubectl command. The problem with hard-coding a namespace is that you're telling anyone who runs and deploys your application that you want it in that namespace, and that's really not your policy — it's not your decision to make. People use namespaces for a variety of things: some use them as environment policy, segregating prod, staging, and dev; some use namespaces per team; some per app. There's no universal policy, so it's not really polite to hard-code a namespace in there and call it a day. Try to avoid doing that.

I really strongly recommend sticking to YAML rather than going to JSON. I have a love-hate relationship with YAML, as many of you probably do — it's very easy to read, terrible to write, and has lots of ambiguity. But at the end of the day, the ability to put comments in there — to describe what some of your stuff is actually there for, or to say "there's a reason I picked this particular update policy for this pod, don't change it" — is really valuable, just from the standpoint of documenting what's going on and making sure that's available in source control and all that. I also recommend keeping your Kubernetes manifests in a single file. For a long time I split them out across a number of files, and what I realized is that people don't know where to look for things. So I've consolidated on keeping everything in a single YAML file — name it whatever you want: service.yaml, hello.yaml, deployment.yaml. It makes it easy to throw in comments people can search for, and you can look at the whole structure in a single editor window. It just makes life easier.

So, development workflows. There's no real silver bullet for workflows; they're different for everybody. Our industry loves to pick one awesome single thing that will work for everyone, and it just never happens to be the case. That's fine — Kubernetes can accommodate a whole bunch of different workflows. What you really need is tools that can adapt to changing requirements and to the process around what you're doing; you can't pick a tool that will straitjacket you into a certain methodology. Personally, I've had great success doing trunk-based development, using parameterized templates as I've mentioned, and structuring things as a monorepo — the entire company's code living in a single repository in Git, or whatever other version control you're using.
Or the pseudo-monorepo, which I've actually fallen in love with a little more recently: at a per-application level, I keep all the services that compose an application in a single repository, but a company may have several applications to build, and each of those can be its own monorepo.

And then, really, you need to offer your developers tooling that lets them go fast. If you make their lives easy, they'll be quick to adapt to whatever you throw at them. They don't want to spend time down in the trenches thinking about the infrastructure or the process any more than they have to; they're focused on clearing their plate of whatever their boss, or their JIRA tickets, are telling them to do for business-level functionality.

One tool I've adopted recently — and I'll put a little caveat here: I work for the company that builds this. But DataWire is an interesting company in that nobody is really forced to use the tools we produce; we all kind of do our own thing and over time arrive at consensus on the right solution. It's a somewhat interesting engineering culture. I started out with the approach of a single directory holding my Kubernetes manifests, using sed to template things out. I was very happy with that solution, but it was really confusing for other team members to work with, hard to debug, and kind of brittle — tied to what I was doing. So my co-worker — my boss, really — Rafi started building this thing called Forge, and for a long time I was pretty resistant to working with it. It wasn't my code, it wasn't the thing I built. But at the end of the day, it makes sharing projects way easier, it has a pretty flexible process built in, and it automatically knows how to do template parameterization with Jinja2. It took me a little while to finally go, I'm just wasting time — and adopt the thing. But I've adopted it, I really like it, and it works really well. I've actually recently started moving it beyond my development workflow into my production workflow: I wired it into CI and let it do its thing on the CI server as well.

It's awesome because I can be building an app composed of one, or 20, or 50 microservices, and it knows how to build them all from source, turn them into Docker images, push those images, and deploy onto Kubernetes, based off basically two config files: the config file for Forge itself and the config file for Kubernetes. What's even better is that it's incremental: I can put it inside a monorepo, and if I only change one service, I don't have to rebuild and redeploy the world — it will incrementally recompile that one thing and push just that out, which is really great for development workflow. It computes the diff of changes and pushes those updates to Kubernetes. Works great, love it. If I had more time I'd probably do a demo — but Richard actually did a demo earlier, so if you were at the earlier presentation, you saw him using it briefly.

All right, final topic: logging, debugging, and resiliency in a Kubernetes setup.
So Kubernetes has built-in log collection for all the containers running in the system: it logs and stores standard out and standard error. This is not going to be good enough for you in operations, so you'll need to bring something else into the fold — but kubectl logs is more than good enough to hand to your developers and let them go crazy with debugging. For operations, you'll want to hook into the Fluentd that's in there and send logs off to Elasticsearch, where you can do real queries on what's there, offer an API to your developers for queries, and avoid having to be a grep wizard figuring out what's wrong.

But there's more to logging than application logs. Getting what your app server is doing is cool, but there's logging along the request path through all of your application that's really important and often missed. When something fails, how do you know where it failed? The request went through A, B, C, D, E, F, G — all these services — and somewhere something failed and cascaded up, but you have no logging about why it failed, no logging about the parameters, no way to trace it. When you get to this kind of point — and you should get to this point really quickly if you're building microservices — you want to start thinking about a service mesh. There have been two talks over the last two days about this, so I won't go into too much detail, but just to recap: a service mesh is a dedicated infrastructure layer for service-to-service communication, to make it reliable and safe. Kubernetes and the CNCF are aligning heavily on Lyft's Envoy, which runs as a sidecar proxy to your services: you schedule an Envoy proxy to run next to the blog stuff, and communication calls coming into it and going out of it all pass through that proxy on the way to wherever they're going. With that, you get request-level logging, so you can trace the traffic. It gives you request IDs, and you can feed all of that into something like OpenZipkin or Jaeger, which lets you see what's going on in your system — not only what was in a request and where it went, but also how long it took to complete, so you can do performance analysis on that information as well. Really powerful stuff.

Back to kubectl logs for developers. Your developers are going to want those logs, and the kubectl logs command is unfortunately not very powerful. Here's the real problem with it: you have n pods running, with n containers inside those pods, and it can only get you the logs for a single container in a single pod at a time. There's no way to get an aggregate view across the cluster. The correct solution is to spin up Elasticsearch and store all that stuff for them, but if they're just working and you haven't gotten that far in setting up the infrastructure, hand them one of these tools — kubetail, ktail, stern, that kind of thing. They're all fantastic, and they all do similar things. I use stern personally and I really like it: it works amazingly well. You basically tell it, go to these pods, collect all the logs, and show them on my screen, rather than having to run all the various commands yourself or write a loop.
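For flavor, the difference looks roughly like this — pod, container, and namespace names are illustrative, and it's worth checking stern --help for the exact flags in your version:

```
# Built-in tooling: one container in one pod at a time
kubectl logs blog-2837421-abcde -c comment-server

# stern: an aggregate tail across every pod matching a query
stern blog --namespace prod
```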
So, debugging on Kubernetes. Kubernetes is reasonably complex — not in the sense of the machinery that runs it, but in the number of interactions between the pieces. There are a lot of failure points: talking to Docker, Docker talking to the internet, Docker talking to private registries, misconfigured things. All of those send you on a troubleshooting loop, and it took me a long time to build up my own set of troubleshooting guides, only to find out they'd already written all of this down. There are two URLs I find really handy — these two are fantastic. Use them; they'll save you a lot of time, and I wish I'd seen them before I started debugging things myself. I will say one of the unfortunate things about Kubernetes is the documentation: while it's all there, it reminds me of the Amazon docs, which, while also all being there, are a nightmare to actually traverse. If you want to find out how to do something, it's not clear at all, and you end up doing a blog-post search on Google to figure out what you actually want to do.

But sometimes you want to debug in a cooler way than just looking at logs — or your developers are working on a shared development cluster. There are tools for this. One of the classic problems is: how do I run a local editor against the code that's running in the cluster, or hook up a debugger to the code that's running in the cluster, and keep my native toolchain — without having to SSH in and just use vi or cat and grep on the remote host, or trying to pull the logs out of a log aggregation system? Sometimes you actually want to run code locally and see real traffic coming through it.

For that, there's this thing called Telepresence. Another disclaimer: I work at DataWire, and Telepresence is one of our tools, but it's open source. It allows you to bridge a local laptop or workstation into a Kubernetes cluster, so that traffic on the cluster can talk to the code running on your laptop, and vice versa, the code on your laptop can talk to stuff in the cluster as if it were all native — it feels like being on the same network, like a VPN. Even cooler, it projects the filesystem stuff from the Kubernetes cluster back onto your machine, so you'll have access to whatever the cluster has for volumes and you can work with the same data sets. It makes for a really cool development experience. So basically: it proxies the network requests, the environment variables, and the volumes — and you code locally. You can use your favorite editor — I'm an IntelliJ guy, so I often want to be able to use IntelliJ anywhere, and I can. I can also hook up a debugger to the code, and I can handle requests coming in and inspect them as if they were just there.
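A sketch of what that looks like with the 1.x-era Telepresence CLI — flags from memory, so check the Telepresence docs for your version; the deployment name, port, and command are illustrative:

```
# Swap the blog deployment in the cluster for a process on your laptop,
# routing cluster traffic on port 5000 to the local process:
telepresence --swap-deployment blog --expose 5000 --run python3 app.py
```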
Another use case that's really awesome, and not on the debugging path: say you have a team of developers spinning up on a project, and every night you do a nightly build of a shared development environment for them. You crank out all the base services they're going to use, but inside their own little bunker, each person is working on their own services and calling out to the others. It enables a really unique collaborative development experience that I don't think any other tool out there enables: each person can be working on their particular service, calling across the room through the Kubernetes cluster, using all the Kubernetes infrastructure you set up — they can call the stuff in the cluster, but also the other person's stuff that's running on their laptop. Really cool, really powerful, a unique development experience that makes things really fast and enjoyable, especially when you're doing rapid prototyping and just hacking away.

So anyway, wrapping up. Kubernetes is awesome; there's a lot of power and flexibility in it. As operations engineers and platform engineers — whatever your title may be — our role is to empower developers to work faster and better and produce the business logic they're supposed to be working on, and we can help them do that with Kubernetes. Our role is also to make sure the business continues to operate, and much of what Kubernetes and the service mesh stuff allows is the ability to build a resilient, safe platform that your developers can work on quickly, protected from failures. The service mesh can do things like circuit breaking, so when one service fails it doesn't cause a cascading failure — which lets you let the developers ship code much more quickly, without your pager blowing up all the time.

Anyway, we're done — DevOps Days is coming to a close soon. Thank you for listening to me ramble up here for the last, I don't know, 45 minutes. If you're building cloud applications on top of Kubernetes, if you're doing cloud-native stuff, please check out our tools: forge.sh, telepresence.io, and getambassador.io. Forge is the build/deploy development tool I was talking about; Telepresence is the remote proxying tool; and Ambassador is basically a full-on HTTP API gateway built on top of Lyft's Envoy, so you get all that nice service mesh goodness not only internally to your cluster but all the way from the internet down. Thank you.

[Audience] I'd be interested to know in more depth how that works. It makes sense if it's just a web service, but if you're going to set up, say, a five-node Elasticsearch cluster or something — the data needs to be present, obviously, and if one node goes down and it spins up a new one, the data needs to be present there too. Can you describe in more detail how that works?

So, I haven't really worked with building stateful stuff on top of Kubernetes, so I'm going to be the wrong person. There are features for doing this — there's this thing called a stateful set, and if you want to do stateful applications on top of Kubernetes, you should look at stateful sets to see what they're telling you to do.

[Audience] Does Kubernetes provide a Docker engine, or do you need to install Docker and the Docker engine independently of Kubernetes?
So the question was, does Kubernetes provide a Docker engine built into it? When you deploy Kubernetes, you pick the container runtime, and Docker is one of the runtimes you can choose. You could also choose rkt as another container runtime. There's also — CRI-O, I can't remember exactly — which is kind of an abstraction over all of these. The point is, it doesn't mandate Docker, and in fact they're trying to get away from mandating Docker and basically make it pluggable across container engines.

OK, so the question was, do you need to install Docker independently of Kubernetes on each node, is that correct? Trying to explain this correctly: they're two independent programs, but yeah — on each node you deploy, you're going to have to install Docker next to the kubelet and kube-proxy. Does that sound about right? Yes — but there are tools that do the deployment for you. kops is a common tool people use for deploying Kubernetes themselves; there's Kubicorn, which is also pretty cool; and kubeadm is a very low-level tool. They'll handle all the nitty-gritty of "I need to deploy these three things together to make an actual node or a master."

[Audience] Does Kubernetes provide any kind of interface to tie in with auto-scaling? For example, in that big complicated YAML file you were talking about, could I specify the equivalent of AWS or GCE auto-scaling rules?

Auto-scaling rules — oh, like scaling on traffic? I don't know; I haven't had to do it, so I don't know. That's a great question — sorry I don't have a good answer for you.

OK — Phil, thank you so much. This was super informative.