 different ways of understanding Kubernetes architecture. My name is Kim Schlesinger, and I want to tell you a little bit about my career and how I got to the tech industry. Currently, I am a developer advocate at DigitalOcean, where I focus on cloud native technologies. Before this job, I was a site reliability engineer for a few years. Prior to that, I was a JavaScript developer. And my first few jobs out of college, I was an elementary school teacher teaching here in the United States. I bring up my career timeline to tell you how I got into tech. So when I left the classroom and I was looking for a new career, I eventually landed on wanting to write code, wanting to be in the tech industry. So I attended a coding boot camp where I learned front end and back end JavaScript and then became a web developer for a few years. Although I enjoyed web development during that time, I realized I was really interested in ops and DevOps and system administration and how do applications get on the internet and how do they become accessible to users. And so I eventually found my way into a job as a site reliability engineer. I started that job in 2018, and I was working at a company called Fairwinds, which is a Kubernetes consulting agency. And so my first set of experiences as a system administrator and as an ops person was as someone who was a Kubernetes administrator, which was really, really tough. So there are a lot of people like me who start in the tech industry and they are already working immediately in the cloud. And I'm going to refer to these people as the cloud native generation. Cloud native generation is anyone who starts their career by writing software and deploying it into the cloud. And it can be really difficult as someone in the cloud native generation to really fully understand the systems that we're working with because it's very hard to build mental models of things that you can't see and touch. 
So we all use mental models to think about and interact with systems like Kubernetes, and a good definition of a mental model is any framework that you hold as a mental representation of an external reality. When I started at Fairwinds and I had more senior engineers mentoring me and trying to help me understand Kubernetes, the system itself, often static block diagrams would come up. We've all seen things like this, and they are really helpful diagrams and representations of Kubernetes, but they're most helpful if you understand the technologies and the practices that Kubernetes is replacing and automating. And since I didn't have that prior experience, these block diagrams were extremely difficult for me to understand, and I didn't find that they were helpful. So I want to get you thinking about this: okay, maybe these block diagrams describing Kubernetes are helpful for me, but they might not be helpful for the people that we're hiring. I think it's really important to be thinking about this because we want to be hiring, training and retaining early career engineers who have limited computing experience. People like me, who didn't grow up with computers, who didn't get a computer science degree. We can still be excellent engineers, but we have to be a little bit more creative and a little bit more deliberate about how we help people understand the systems that we're working with. And so in this talk, I'm going to be showing you three different ways of visualizing and understanding Kubernetes clusters that aren't static block diagrams. So if you're someone who's mentoring more junior engineers, these are the takeaways that I hope that you'll have. The first one: when you are creating a representation of a Kubernetes cluster, create diagrams that show the passage of time. The second takeaway that I hope you have is that you should use distributed tracing tools as a learning tool, not just as something for identifying and debugging errors.
And then finally, you should build 3D models of Kubernetes clusters so people can actually touch and feel something and develop a richer mental model of the Kubernetes system. So in order to show you what those takeaways might look like, I'm going to give you three examples of alternative ways of understanding Kubernetes architecture. The first is a time sequence diagram. The second is combining distributed traces with a diagram. And then the third way is to actually show you what a 3D model of a Kubernetes cluster looks like. So let's get started with the time sequence diagram. For the example that I'm going to be demonstrating, you have to imagine that I'm a Kubernetes user and I want to create a BusyBox pod. I have this pod manifest on my computer. It's a YAML manifest. And I'm going to run the command kubectl create -f pod.yaml. Here's the time sequence diagram for that sequence of events. So let's go through this. On the left, there's me, Kim, and I'm running that kubectl command, kubectl create -f pod.yaml. And that request goes to the API server. Immediately, the API server goes on to step two, where it saves the new state in etcd. And then the controller manager is checking the API server for changes. The scheduler is also watching for changes. And the API server says, hey, there's a pod and it doesn't have a node name. The scheduler then says, I'm going to assign that pod to a specific node. The API server saves that state in etcd. And then we go to the kubelet, which is looking for any newly assigned pods. The API server binds the new pod to that node. The kubelet then creates that pod and starts the container with the container runtime. Then the kubelet sends back the pod status to the API server. And the API server saves that state into etcd. So that's what happens when I run kubectl create -f pod.yaml. So why is this time sequence diagram different and potentially a better learning tool than the block diagram of Kubernetes that we saw before?
So the first reason I really like time sequence diagrams is that from top to bottom, you see the HTTP requests flow, and they're representing time. So you see what happens first, what happens second, what happens third. And you get a sense of how this request triggers different operations in the Kubernetes cluster. So that's one thing I like. Another thing I like is there's a clear separation between the actor, that's me, the control plane, which is on this light yellow background with the components of the control plane, and then the node, which in this case is that light blue. So you just get to see the separation of what am I doing, what's the control plane doing, and what does a node do. Another thing I like is that the actor icon, which in this case is a stick figure of me, invites you to think about who and what can communicate with the API server. Sometimes it is individual developers pushing code to the API server, maybe for a staging cluster. But often we have really complex CI/CD systems that are actually creating the pods. And so the actor could be something automated like Jenkins. And the time sequence diagram, because it has those icons, invites you to think about who's actually doing that thing that's causing this sequence to happen. And then finally, the thing I like about the time sequence diagram is that when the API server and the kubelet communicate, and the first time that happens is in step eight, you can see that they're located on two different virtual machines. There's that white space in between them, and you just get a sense of like, oh, okay, there's a lot of networking going on because there's at least two virtual machines talking to each other. So really pretty interesting. So that's one alternative way of visualizing Kubernetes, through a time sequence diagram. In that time sequence diagram, we're actually getting a sense of what events are happening to create that pod.
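As a rough illustration of the pod-creation sequence walked through above, here is a minimal Python sketch that replays the steps as an ordered list of messages. It is a toy model, not real Kubernetes client code: the component labels, the node name `node-1`, the message strings and the exact ordering of the watch steps are all assumptions made for illustration.

```python
# Toy model of the pod-creation sequence; not real Kubernetes client code.
# Component labels, "node-1", and the message text are illustrative assumptions.
events = []   # (sender, receiver, message), in the order they happen
etcd = {}     # stand-in for etcd; only api_server_save() writes to it


def record(sender, receiver, message):
    events.append((sender, receiver, message))


def api_server_save(pod):
    """The API server is the only component that persists state to etcd."""
    etcd[pod["name"]] = dict(pod)
    record("api-server", "etcd",
           f"save state (node={pod['node']}, phase={pod['phase']})")


def create_pod(name, node="node-1"):
    pod = {"name": name, "node": None, "phase": "Pending"}
    record("user", "api-server", f"kubectl create -f pod.yaml ({name})")
    api_server_save(pod)

    record("controller-manager", "api-server", "watch: check for changes")
    record("scheduler", "api-server", "watch: found a pod with no node name")
    pod["node"] = node
    record("scheduler", "api-server", f"assign pod to {node}")
    api_server_save(pod)

    record("kubelet", "api-server", "watch: found a newly assigned pod")
    record("kubelet", "container-runtime", "create pod, start container")
    pod["phase"] = "Running"
    record("kubelet", "api-server", "report pod status")
    api_server_save(pod)
    return pod


pod = create_pod("busybox")

# Print the events top to bottom, like reading a time sequence diagram.
for step, (sender, receiver, message) in enumerate(events, start=1):
    print(f"{step:>2}. {sender:>18} -> {receiver:<17} {message}")
```

Reading the printed list from top to bottom gives the same sense of ordering as the diagram, and every write to the `etcd` dictionary goes through `api_server_save`, which mirrors the point that state is saved by the API server.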
So let's go on to our second way of visualizing Kubernetes, which is a combination of distributed traces and diagrams. Distributed tracing is sometimes also called distributed request tracing. And it's a method to profile and monitor applications, especially microservice applications. And the CNCF has the OpenTracing and OpenTelemetry projects that create standards for how tools that do this are built. I really like distributed tracing tools because of the visuals that you get. And I've done a lot of learning by seeing distributed traces in production. So for the example that I'm going to show you today, I have deployed a sample application. It's called Meow Micro, and it's two Go microservices. There are two parts to the application, the Meow Client and the Meow Server. And the Meow Client accepts requests via a REST API and then uses gRPC to communicate with the Meow Server. And I want to give a big shout out to Fernando Diaz at GitLab, who published a tutorial and the Meow Micro code that I'm using in this example. So what's going on is we have a Kubernetes cluster. The Kubernetes cluster has an ingress, the NGINX ingress, so that traffic can enter the cluster. And then Jaeger, starting at NGINX, actually traces the request as it gets to the server, which is incredible. So let's take a look at what's going to happen. We're imagining that a user sends this curl request to an IP address, the Meow endpoint, and we're sending a POST request. So we're actually sending this data, and it's a key value pair. The key is name and the value that you're sending is Meow Mixer. So this is what it looks like in Jaeger. And I really like this. At the very top of the Jaeger user interface, we see this request has come in through the Meow endpoint. And then if you look below where it says service and operation, we've got NGINX and that endpoint. And then we have an indentation. We see that request goes from NGINX to the Meow client.
And there's a function in the Meow client service that sleeps the application. The nice thing about Jaeger and a lot of distributed tracing tools is that you can unfold the trace and get a lot more information. So in this view, we've unfolded the sleep trace and we can see a lot of information, like these tags, the process, the host name. The thing that I really like is that you can see the IP address of the node that this service is running on. And so if you sort of know the layout of your Kubernetes cluster, you can say, oh, it's actually specifically on this node. We'll look at that a little more closely in a moment. So here are some benefits that distributed tracing tools offer for creating mental models. The first one is the flame graph, which is this graph that we see moving from top to bottom. And so the flame graph, like the time sequence diagram, shows the request move from top to bottom, and it represents the passage of time, which is a really important concept when you're trying to debug a Kubernetes cluster. The second thing that I like about the Jaeger traces is that they show how many milliseconds it takes for each request to complete, which is incredible. The next thing I like about the Jaeger user interface is that different colors represent different services. And so in this case, NGINX is represented with yellow and the Meow client service is represented in that teal color. And finally, I already mentioned this, but just to underline it again, you can unfold each trace and get more information, like which virtual machine is the node that's hosting that specific container. So if you have people who are new to or curious about Kubernetes, setting something up like this and letting them unfold and explore is going to be a really powerful learning tool. So this is distributed tracing, but in this particular example, I think distributed tracing plus a diagram is particularly powerful.
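To make the idea of nested, timed spans concrete, here is a minimal Python sketch of how a trace like the one described above is structured. It uses only the standard library and does not call Jaeger or OpenTelemetry; the `span` helper, the service labels and the operation names are hypothetical, chosen to mirror the NGINX, Meow client and sleep spans from the talk.

```python
import time
from contextlib import contextmanager

# Hand-rolled stand-in for a tracer; real tools such as Jaeger collect spans
# from instrumented services. All names below are illustrative assumptions.
spans = []   # [depth, service, operation, duration_ms], in start order
_depth = 0   # current nesting level, i.e. how far a span is indented


@contextmanager
def span(service, operation):
    """Record one span and nest any spans opened inside it."""
    global _depth
    entry = [_depth, service, operation, None]
    spans.append(entry)
    _depth += 1
    start = time.perf_counter()
    try:
        yield
    finally:
        entry[3] = (time.perf_counter() - start) * 1000
        _depth -= 1


def handle_request():
    # The request enters through the ingress, reaches the client service,
    # and the client runs a function that sleeps.
    with span("nginx", "/meow"):
        with span("meow-client", "POST /meow"):
            with span("meow-client", "sleep"):
                time.sleep(0.01)


handle_request()

# Indentation shows parent/child spans; the number is the span duration.
for depth, service, operation, ms in spans:
    print(f"{'  ' * depth}{service}: {operation}  ({ms:.1f} ms)")
```

The indented output is a plain-text version of what the tracing UI draws: each child span sits under its parent, and a parent's duration is always at least as long as the spans nested inside it.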
So I created this diagram using Lucidchart that shows the request that we just saw the distributed traces for. And so let's take a look at this. On the right side of the screen, there's the computer, and you see, like, I would be running that curl request, so sending some data to the microservices. So we see first that the request hits the NGINX ingress load balancer, which is how traffic gets into the cluster. And then that load balancer sends it to the NGINX ingress controller. And then the controller sends the request to the Meow client. That request then keeps going toward the Meow server, and then the sleep function gets run. So the way that I developed this is that I looked at the distributed traces in the Jaeger UI and then built out this diagram on my own. So here are some benefits of this diagram that you don't necessarily get with just distributed tracing. The first is you see the computer outside of the cluster making the HTTP request. Once again, it invites you to wonder, well, who's making that request or what's making that request? Is it an actual person or is it some sort of automated system? The second thing that's interesting about this is that you see the request go to the load balancer and then to the ingress controller and then the server. And so you get a better sense of how traffic gets from the internet into a Kubernetes cluster, and a load balancer is usually the way. And then the most interesting part of this, I think, is that the control plane is not involved in this particular process. Somebody is just sending a request to an IP address, to a specific endpoint. And so you can see there's nothing going on in the top part of this diagram. So that is distributed tracing plus a diagram, which I think is a great way to understand the passage of time, but also to understand how things flow through Kubernetes when you're just interacting with the applications that you've deployed. I also want to call out a very cool project.
It's a Weaveworks project called kspan, and Bryan Boreham created this project. And this project is very cool because what it does is use Jaeger to trace Kubernetes events. So in my distributed traces and in this diagram, I'm not tracing Kubernetes events. I'm tracking an HTTP request that goes into a cluster. But this particular project shows you all those events that my time sequence diagram was trying to represent. So check out this project. There is a KubeCon talk about it from earlier in the year 2021. And there's a lot of information about it and it's really exciting. It's still experimental, but it's something that I'm super excited about. So I just talked about some of the benefits of distributed tracing plus diagramming. Before that, we talked about time sequence diagrams. And now we're getting into like the wackiest, like the most fun part, which is that you can represent Kubernetes clusters with a 3D model. And I don't mean a 3D model like you're using some software to develop things that look like they're three-dimensional. I mean something like this, which I've built out, where I used clay and marker and twine to represent a Kubernetes cluster. So let's see exactly what I'm representing in that 3D model. This is the exact same example that I did for the time sequence diagram. We're imagining that I am the user who is creating a BusyBox pod by creating the pod through a YAML manifest. So running this command again, kubectl create -f pod.yaml. And so this is the 3D model of that particular request. It's similar to the time sequence diagram, but doesn't have things like a representation of the passage of time. But you see on the left side of the model is me at a computer, and you see the request flow into the API server. And then you see on the right side, we've got the controller manager. It's also talking with the API server. We see lots of requests going from the API server to etcd.
And then you eventually see the API server and the kubelet are communicating with one another. And then the kubelet only sends one request to create the pod. So it's sending it to the container runtime interface. So I understand this is a little bit wacky. This is very homemade. You can see the glue that's stuck on the cardboard. But there are some real benefits to being able to touch and feel a Kubernetes cluster, even if it's a model that you're making up on your own and you're not actually touching hardware. So here are some benefits of using the 3D model and some observations about what I learned about Kubernetes building this thing out. The first one is that in this 3D model, you do get the sense that only the API server is communicating with etcd. That's true, and you can see it here: there's only one place that those strings are going from etcd, and it's to the API server. The other thing you really get a sense of with this 3D model is that the API server is servicing a ton of requests. And so in building this model out, I was like, oh, yeah, we call the API server the brains of Kubernetes. And I have experience with API servers being down and clusters being unavailable, but here you can see that nothing can happen until lots of things go through the API server first. And so you get that sense here. And then the thing about this particular request that I'm modeling is that you see the complexity of what's happening in the control plane. Even when you're creating just one pod, a lot of things have to happen for that image to create a container on a VM that we have access to. So that is the 3D model. So I showed you the time sequence diagram. I showed you how to use distributed tracing and a diagram, and then the weirdest and most fun one, for me anyway, was the 3D model. So let's do a recap of why all this stuff is important.
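The two observations from the 3D model, that only the API server talks to etcd and that the API server sits in the middle of everything, can also be checked on a small graph. The following Python sketch is a rough stand-in for the clay-and-twine model: each tuple is one "string" between two components. The component labels are assumptions, and the scheduler link is carried over from the time sequence walkthrough rather than from the description of the physical model.

```python
from collections import Counter

# One tuple per "string" in the model. Labels are illustrative assumptions;
# the scheduler link comes from the earlier sequence walkthrough.
connections = [
    ("user", "api-server"),
    ("controller-manager", "api-server"),
    ("scheduler", "api-server"),
    ("api-server", "etcd"),
    ("api-server", "kubelet"),
    ("kubelet", "container-runtime"),
]

# Count how many strings are attached to each component.
degree = Counter()
for a, b in connections:
    degree[a] += 1
    degree[b] += 1

# Which components are tied directly to etcd?
etcd_peers = {a if b == "etcd" else b
              for a, b in connections if "etcd" in (a, b)}

# The component with the most strings attached is the hub of the model.
hub, hub_links = degree.most_common(1)[0]

print("components connected to etcd:", sorted(etcd_peers))
print(f"most connected component: {hub} ({hub_links} links)")
```

In this toy graph the only neighbor of etcd is the API server, and the API server has more links than any other component, which is the same impression the twine gives in the physical model.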
So in order to learn and in order to interact with complex systems, we all develop mental models, which are frameworks in our minds that represent an external reality. And so everyone that interacts with Kubernetes has some sort of mental model about it. There is a group of people, and I am part of them, who I call the cloud native generation, and we are people who have started our ops careers purely in the cloud. We don't have any prior experience with all of the technologies and the practices that Kubernetes has replaced. And our go-to way of modeling Kubernetes clusters right now is block diagrams. And although I don't think they are harmful, I think we need more engaging and interesting ways of visualizing Kubernetes so we can help people in the cloud native generation get up and going. And so here are the takeaways for you if you are visualizing or diagramming a Kubernetes cluster. First, you want to create diagrams that show the passage of time. A time sequence diagram is a great way of doing that, as well as using distributed traces that show you from top to bottom how things are unfolding in the cluster. The second thing is to use distributed tracing technology as a learning tool. Jaeger is not the only distributed tracing technology. There are things like Lightstep and Honeycomb and Datadog and others, and you can use those both as a tool for knowing what's going on in your cluster and being able to identify errors, but also as a learning tool. And then the last thing is that you should build 3D models of Kubernetes clusters. For people who don't have a lot of prior experience in computing, touching something creates a much richer mental model than just imagining an abstract system in your head. So all of this is important because we want to be able to hire, train, and retain early career engineers who don't have a lot of experience in computing. So I used a lot of resources to prepare this talk and I've listed them out here.
There are some outstanding examples of Kubernetes time sequence diagrams you can find here. And I created my sequence diagram at SequenceDiagram.org. For distributed tracing I used Jaeger and was tracing ingress-nginx, so there's some documentation there. And then, like I said, I'd like to call out again Fernando Diaz and his tutorial on setting up distributed tracing in Kubernetes. And then look into kspan, that tool that uses Jaeger to visualize Kubernetes events. There is a KubeCon talk from 2021 about it, as well as the GitHub repo with the documentation. Thank you so much for coming to my talk. My name is Kim Schlesinger. I work at DigitalOcean and would be happy to chat with you soon.