Today we are going to look at how to debug Kubernetes controllers from your IDE. I am Surinder Ravichandran, and I work as a senior software engineer at F5 Networks. Without further delay, let's start. Here is the agenda for this presentation: we'll look at a little background on where this problem starts and how to address it, then a short introduction to debuggers in general. After that we'll look at various patterns for connecting the IDE where you write your code to your Kubernetes cluster, so you can debug your controllers directly. We'll see a short demo, summarize the talk, and then take questions.

About six months back, I was asked to start a new project to take over F5's ingress controller for Kubernetes and OpenShift. During the discussions with the other developers, I asked: how do I debug this? The answer was: listen carefully. You add a bunch of print statements, compile the code, build and tag a Docker image, push it to Docker Hub, go to Kubernetes, change the deployment to point at your debug image, redeploy, and then read the logs. For the next several months that is what I did, every single time, and even today I am still doing it. For small problems it works well enough; adding print statements and re-running makes perfect sense. But in the container world it is painful, because the controller runs in an isolated environment and there is no connection between your IDE and the running process. Even for an ordinary containerized application this is difficult, and for a controller it gets harder still: a controller needs to talk to the Kubernetes API constantly in order to function, so you have to keep that connection alive while you also set up a debugger connection. As time went on, new hires joined our team, and when I explained this process to them they got fed up: in my previous development environment I got a debugger-enabled setup with the click of a button, why can't I do that here? So I spent some time looking at the ways to solve this problem, and there are a couple of patterns you can use to modify your existing workflow and enable this debugging functionality with little effort.

So, what's wrong with print statements? I already described the primary problem: they slowly eat your developers' time. In other environments they make sense, but when you are debugging containerized applications, every iteration costs you, little by little. They are also not reliable. What I mean is that you often don't know exactly where to look: you assume the problem originates at one point, put a print statement there, and then the code path may not even reach that point. So you have to iteratively probe various points along the code, recompile, push to Docker Hub, and redeploy again and again until you finally pinpoint the location of the problem. Print statements also lack the big picture: you get the value of one variable, and that's all.
You don't know what is actually going on or what the overall flow is. With multi-threading it gets even harder: a variable might be changed in one thread but not in another. And print statements only work forward in time. Say you have a loop of 100 iterations and the bug shows up on the 50th: you cannot go back to the 49th. You have to start the whole run again and watch for it. Another big problem is that if you don't have a proper review process in your workflow, those print statements end up in your production code. That is pretty bad: they won't just live in your environment, they may end up in your customer's environment once the code ships, and they can even turn into a security bug.

Okay, so what's the solution? Use a debugger. Let me give a short introduction; it won't take much time. What exactly is a debugger? If you've watched the Doctor Strange movie, he has the Time Stone. A debugger is the exact equivalent of the Time Stone for developers. You can control time: stop it, move forward, go backward, do all sorts of things. And not only can you stop execution at a point in time, you also have all the values and all the information at your fingertips, not only in your program but in every dependent library as well.

How do we put that into practice? Keep this model in mind: a debugger is a program that runs another program. You invoke your application inside the debugger, and the debugger exposes a TCP port. All you need to do is plug a CLI or an IDE into that socket, and then you control the debugger; and when you control the debugger, you control the application running inside it.

Now, what is specific to controllers? For this model to work, you need to set up these pieces. On one side you have the IDE, and on the other the debugger, which runs your controller instead of some generic app, and the controller in turn needs access to your Kubernetes API. So there are two TCP connections involved. One is the debugger connection, where your IDE or CLI talks to the debugger. The other is the API connection, where your controller talks to the Kubernetes API. Whenever you have to set up a debugging environment, take a step back and map out which component sits where: is the IDE local, is the debugger remote, where is the Kubernetes API? You can shuffle these components around to make it work.

So let's look at the patterns. The easiest way to get controller debugging working is to keep your controller out of the cluster. You don't need to run the controller inside Kubernetes at all; you can keep it outside, because all the controller needs is a connection to your Kubernetes API.
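For reference, here is a minimal sketch of that out-of-cluster pattern, modeled on the client-go example discussed next. Exact signatures vary slightly between client-go versions (recent versions take a context argument to List), so treat this as a sketch rather than the exact code from the demo:

```go
package main

import (
	"context"
	"fmt"
	"path/filepath"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/homedir"
)

func main() {
	// Reuse the same kubeconfig that kubectl uses (~/.kube/config).
	kubeconfig := filepath.Join(homedir.HomeDir(), ".kube", "config")
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		panic(err)
	}

	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// List pods in all namespaces, as in the demo; a breakpoint after this
	// call lets the IDE inspect every item in the result.
	pods, err := clientset.CoreV1().Pods("").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Printf("There are %d pods in the cluster\n", len(pods.Items))
}
```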
Okay, so if your Kubernetes cluster is running somewhere remote, you can expose the API server's address by whatever means you like, for example through a load balancer; you just create a service and expose it. Then, in your local environment, point your kubectl at that Kubernetes API. Once that is set, everything else is taken care of by the code itself. If you go to the client-go library's GitHub page, you will see multiple examples, one of which is the out-of-cluster configuration, where you run the controller outside Kubernetes. All it does is go to your home directory, fetch the Kubernetes configuration file, and use the credentials there to connect to your cluster. So all you need to do is modify maybe two or three lines in your existing controller. This is the 20% of the work that gives you 80% of the results; you don't have to work very hard to solve the problem this way.

Let's see a short demo of this. Sorry, my keyboard is not here. Okay, I'm inside the client-go repository, and if you go to examples, there is the out-of-cluster configuration. It just fetches the kubeconfig from your home directory and uses it to create the clientset. Let's see that in action. You don't have to do anything special; if the kubectl command in your local terminal can communicate with your Kubernetes cluster, this code can use the same connection. I have put a breakpoint in the exact same file we just saw on the web. This is IntelliJ IDEA, so I just click the debug button and everything works fine. For example, you can see here it has fetched the list of pods from the Kubernetes API, and you can go ahead and look at exactly what it contains: it fetched 14 items, and you can open each one of those items and see its structure and contents. And not only here; you can see the list of variables that exist right now and the current state of each one at this particular moment. You can step over to the next line of execution, step into the library that is actually doing the work, step back out of it, and control everything.

So that is the out-of-cluster configuration. Let's come back. This works fine, but the problem is that you need to change the code, and you cannot change the code every time you want to debug the controller. In some cases you may need to work on a customer's Kubernetes environment and debug something there. In those cases, you now have to do the 80% of the work to achieve the remaining 20% of the automation. You can take the in-cluster configuration example available in the client-go library and use that. In this situation, looking at the IDE, debugger, and Kubernetes API picture: your IDE sits on your laptop, and your controller runs inside the Kubernetes cluster.
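A minimal sketch of the in-cluster counterpart, based on client-go's in-cluster example (again, the List signature varies slightly across client-go versions):

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	// InClusterConfig reads the service account token and API server address
	// that Kubernetes mounts into every pod, so no kubeconfig is needed.
	config, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}

	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	pods, err := clientset.CoreV1().Pods("").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Printf("There are %d pods in the cluster\n", len(pods.Items))
}
```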
In this case, you don't have to worry about the controller connecting to the API; that part is automatic. What we need to do instead is build the Docker image so that it contains a debugger. I use Delve in this demo; I'll show that shortly. Once the debugger is in your Docker image, you can use a service to expose its TCP port. In this setup, Delve listens on TCP port 40000 for its API commands, so we just need to expose that port through a service.

Let's say you have a production Dockerfile like the one on your right, the one you use day to day. It has two stages, builder and runner: the builder fetches the dependent libraries from the internet and builds your code, and the built binary is then copied into an Alpine image, which simply starts the application. On the left is the modified Dockerfile, where we now start a debugger, and the debugger runs your controller. What I did is: at the point where I fetch the dependent libraries, I also fetch the Delve debugger; I pass a couple of flags to disable compiler optimizations; and in addition to copying the controller binary, I copy the Delve binary into the container as well. Then, instead of starting the controller directly, I start Delve, which listens on port 40000 and executes the controller, and finally I expose that port. Once that is done, I deploy this controller to Kubernetes and expose port 40000 as a NodePort service, so that I have direct access to the node: when I hit the node on that node port, the traffic is forwarded to the debugger port, and the debugger takes care of the rest.

After that, you may need to modify your IDE settings. You have to tell the IDE not to look for the debugger on the local machine, but at this particular IP address and port. In IntelliJ IDEA, you use a run configuration template called Go Remote, and all you give it is the exposed port number and the address; depending on how you exposed the service, that is either an IP address or a load balancer DNS name. In Visual Studio Code, you modify the launch.json file and edit two values, port and host. Once you change these, your IDE knows exactly where to send the debugging commands.

Let's see a short demo of this as well. Here, this one is the out-of-cluster file and this is the in-cluster file, and I am inside the GOPATH; it is important that you are inside the GOPATH when you build the binary. I have the Dockerfile I showed a moment ago: this is the debug file and this is my regular file. I have already built a Docker image from it, which is located here. So all I need to do is deploy this modified debugger image, and everything should work fine.
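To make the Dockerfile changes described above concrete, here is a minimal sketch of a Delve-enabled build. The image tags, module path, and output paths are illustrative, not the exact ones from the demo:

```dockerfile
# --- builder: compile the controller without optimizations and fetch Delve ---
FROM golang:1.21 AS builder
WORKDIR /src
COPY . .
# -N -l turns off compiler optimizations and inlining so Delve can map
# instructions back to source lines and show local variables.
RUN CGO_ENABLED=0 go build -gcflags="all=-N -l" -o /out/controller .
RUN CGO_ENABLED=0 go install github.com/go-delve/delve/cmd/dlv@latest

# --- runner: ship the controller together with the debugger ---
FROM alpine:3.19
COPY --from=builder /out/controller /controller
COPY --from=builder /go/bin/dlv /dlv
EXPOSE 40000
# Delve listens on port 40000 and launches the controller under its control;
# the IDE then connects to this port through the NodePort service.
ENTRYPOINT ["/dlv", "exec", "/controller", "--headless", "--listen=:40000", "--api-version=2", "--accept-multiclient"]
```

A NodePort service targeting port 40000 (or a kubectl port-forward, as covered later) is what makes this port reachable from the IDE.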
So that is my image, the modified one, and when I run it, the pod gets created. Let me put a breakpoint in exactly the same place, maybe here, and then hit go. Yes, it is in debug mode. If you look at the variables, you get exactly the same experience as the out-of-cluster configuration, exactly as if the debugger were on your local computer. This way, wherever your Kubernetes cluster and your deployments are, you can hook everything up and start debugging. Just like that, I can look at the pods, browse the items, and even evaluate an expression against a particular one: say I want the fifth pod's name, I can evaluate it and get the result immediately. And you can do this from any of the IDEs: not only IntelliJ, you can do it from Vim, you can do it from Visual Studio Code. All you need is a mental map of those components: IDE, debugger, and API.

Let's get back to the presentation. While doing this, you may run into problems establishing connectivity, on the debugger connection or on the API connection; both need to be reachable from your IDE. There are a few things you can do to overcome that. One is setting up an SSH proxy: if you have SSH access to a machine on the same local network as your nodes, master and worker nodes alike, you can create a SOCKS proxy or do port forwarding and configure that on your local machine, so that a remote port shows up as a local port on your laptop; you can then use that tunnel for either the API connection or the debugger connection. There is also kubectl proxy: you don't have to set up anything, just issue the command and it creates a proxy for you and makes the API available on your localhost, which is handy for the out-of-cluster configuration; similarly, kubectl port-forward can make the debugger port available locally so you can point your IDE at it. There is another awesome tool called Telepresence, which acts as a two-way proxy: you install Telepresence on your local computer and as a container in your Kubernetes cluster, and it takes care of the network connectivity between the two; it creates the proxies automatically and it just works like magic. If you are running your Kubernetes cluster in a cloud environment, you can always create a VPN connection between your laptop and the virtual private cloud the cluster lives in, expose the service as a NodePort, and access it directly. One more, slightly complicated, option is VXLAN: you can configure a VXLAN connection to add your local system as a node on the VXLAN that is established between the masters and the workers, join it directly, and then you have the connectivity as well.
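As a concrete example of the simplest of those options, something along these lines usually suffices (the deployment name and bastion host below are placeholders):

```sh
# Forward the Delve port from the controller's pod to localhost, so the IDE
# can connect to localhost:40000 even without a NodePort or load balancer.
kubectl port-forward deployment/my-controller 40000:40000

# Or open a SOCKS proxy through a bastion that can reach the cluster network,
# and point the IDE (or kubectl, for the API connection) at localhost:1080.
ssh -D 1080 -N user@bastion.example.com
```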
So there are a few catches with this. When you do the in-cluster configuration and build the Docker image, you need to build from the correct GOPATH. Otherwise the source paths baked into the binary will not match the paths on your machine, the IDE cannot work out which point in the code your breakpoint corresponds to, and it cannot stop execution there and step forward or backward. Also, in some IDEs, remote debugging against a host other than localhost is a premium feature that you have to pay for. In those cases you can use one of the proxy methods we discussed to expose the port as a localhost port and communicate with that instead; that should be fine. And one more thing: you have to create two different workflows. Keep your production Dockerfile and your debugger Dockerfile separate, and use a make target to split the workflow, so that passing an additional parameter builds a debugger image while the default builds the production image (see the Makefile sketch at the end).

So, the final part of the presentation. To summarize: when working with containers, Kubernetes, and controllers, avoid print statements; always use a debugger to debug your code. We saw the out-of-cluster configuration, which is easy to set up, and the in-cluster configuration, where you modify the Docker image you produce and deploy it slightly differently to make it work. Always keep your production and debugging controller images, and their workflows, separate. And for the networking issues, there are multiple ways to overcome them; you can choose whichever one gets you through.

Here are the links; the code we discussed is available there. The first one contains the in-cluster configuration and the Dockerfiles for it, and the second one is the client-go repository. When you start writing controllers you will obviously end up in that repository, and if you go to the examples, there are several you can use right away. With that, I conclude my talk. Now that you have the Time Stone, you can control time as well as your controllers. Happy controlling. Thank you. We can take some questions if you want.

Thanks for sharing. In your solution, do you need to change the debugger to dlv? You can use dlv, you can use GDB, or you can use Mozilla's rr, anything you want; it doesn't matter. I noticed that when you start the controller in the container, you add an extra entrypoint, /dlv; what is the purpose of that? Okay, let me go there. This one, the entrypoint /dlv, right? dlv is the Delve binary. I see, it's the debug tool; so you use dlv to start the controller. Correct, yes. Got it. Any other questions? All right, okay, thank you.
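As mentioned in the catches above, it helps to keep the two image builds behind separate make targets. A minimal Makefile sketch, assuming the debug Dockerfile lives next to the production one as Dockerfile.debug (the image name and file names are illustrative):

```makefile
IMAGE ?= example.com/my-controller
TAG   ?= dev

# Production image, from the normal Dockerfile (the default target).
.PHONY: image
image:
	docker build -f Dockerfile -t $(IMAGE):$(TAG) .

# Debugger image, from the Delve-enabled Dockerfile.
.PHONY: debug-image
debug-image:
	docker build -f Dockerfile.debug -t $(IMAGE):$(TAG)-debug .
```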