Hello, everyone. Thank you all for being here. My name is Sergio, I'm from Brazil, and this is my first talk at this Linux conference, so I hope you enjoy it. We're going to talk about Linux kernel debugging. The title is "Going Beyond printk Messages", so one thing we are not going to talk about here is putting print messages in the kernel to debug it, right? Well, let's try. Okay.

As I said, I'm from Brazil. I've been working with embedded Linux for about 15 years, and today I have a company called Embedded Labworks. We do a lot of training in Brazil, around 40 trainees per year, so I'm used to talking a lot; let's see if I can talk for only 30 minutes here. I also have a blog in Brazil where I write a lot about embedded Linux, I'm a Linux kernel contributor, and I contribute a little bit to Buildroot and other open source projects.

So this talk is not about printk and everything related to printk. We're also not talking about static analysis tools, like finding bugs in the kernel by looking at the source code. We're not talking about fuzzing tools, testing tools that find bugs in the kernel. We're not going to talk about user space debugging; it's only kernel debugging. And this is also not a tutorial, because it's just 30 minutes, and we can't cover all of these tools as a tutorial in just 30 minutes.

I know there have been some talks about debugging the Linux kernel in the past, so I'm trying something a little bit different here: I will try some live demos. I know it's risky, but I'll try it anyway. I have a board connected to my machine, a board from Toradex, so if I have a problem here I will call the guys from Toradex to help me. It boots the Linux kernel over the network through TFTP and then mounts a root file system built with Buildroot, so it's a very small system with just the tools necessary for my talk. I have a lot of things to talk about, a lot of tools to show you. Who here is a kernel developer? Great. We're going to talk about a lot of tools to debug the kernel, and I hope you enjoy it.

Before I start talking about the tools, I just want to say a little bit about the process of debugging. To debug something, we have to understand the problem and we have to reproduce the problem. If you don't understand the problem, you're not going to be able to solve the issue quickly. And you have to be able to reproduce it, because if you can't reproduce it, you won't know when you have actually solved it. Then you have to identify the source of the problem, and finally you have to fix it. I would say these are the four main steps to debug anything in software, and this talk is focused on step three: finding the source of the problem.

There are different kinds of problems we can have in the kernel. We can have crashes: the kernel just crashes, you get a kernel oops or a kernel panic, and there are tools to debug that and find the source of the problem. There are problems like deadlocks, where you have a lockup in the kernel, something hangs and stops working, and there are other tools to identify and fix that kind of problem. You can have a logic problem, where everything is working but the result is not what is expected. And you can have leaks: some resource leaks, and the resource could be memory, a file descriptor, a socket.
You can also have performance problems, like the CPU usage being much higher than expected, things like that. So here we have five different kinds of problems, and we can use different tools to debug them and find their source.

The first tool I would mention is our brain, the knowledge, because if you don't have the knowledge to debug problems, you won't succeed. If you don't know how the kernel works, or what virtual memory is, how would you debug a problem with an invalid memory access? You have to know how the system works. The second kind of tool or technique is what we call post-mortem analysis: the problem already happened, we have the logs or a core dump, and we have to look at them and find the source of the problem. Another technique is tracing: trace the system, identify which functions are being called, how they are being called, how much time they take, and use that to find the source of the problem. Another kind of tool is interactive debugging: you start GDB and debug the system interactively, you run the code step by step, set breakpoints, and so on. And there are also what I call debugging frameworks: tools built into the kernel to find problems like memory leaks and lockups. So we have these five kinds of tools and techniques to debug those five kinds of bugs.

I created a kind of table to try to identify the best tool for the job. You could use printk and try to solve everything, right? No. Our objective here is to talk about the other tools. Knowledge helps with every kind of problem: if you don't know how the system works, it will be very difficult to solve anything. Post-mortem analysis, having logs and dumps, helps with some kinds of problems, like crashes; the best way to find bugs that crash the system is to look at the logs, at the kernel oops or kernel panic, or in our case at a dump of the kernel, to find the source of the problem. Logs and dumps can also sometimes help with lockups, when a task hangs in kernel space and you can take a look at it. For logic problems, you have the logs, so you can look at the execution of the system, the application, or the kernel and try to find the bug. But for resource leaks and performance, logs normally don't help. Tracing and profiling can also help with a lot of different problems, like lockups: you can trace the system and find the functions that are running, so if it stops in one function you know it's hanging there and you can look at the problem more easily; we're going to use tracing here to find a lockup in the kernel. Tracing and profiling also help with performance: tools like perf have tracing and profiling features that can help you find latency problems in kernel space. Interactive debugging helps with some kinds of problems, like crashes; we're going to use GDB here to look at crashes too. Also lockups and logic problems: your program, in our case the kernel, doesn't do what it should, so you start a GDB session to see why it's not doing what it's supposed to do. But for leaks, probably not, and for performance, of course not, because GDB makes the kernel run slower, so it only makes things worse. And then there are the debugging frameworks, where you have a kind of tool built to find a specific problem, for example a memory leak.
The kernel has a tool to find memory leaks inside the kernel; that's one example of a debugging framework. The kernel also has a lot of options we can enable to find lockups and things like that. This table was created by me, so of course we could argue a little about which tool is best for which kind of problem, but the main message is: you have a lot of different tools that behave differently, and you sometimes have to find the best tool for the job to fix bugs faster.

That said, let's start with kernel oops analysis. A kernel oops is the kernel's way of telling us that something bad happened. When the kernel detects a problem, like an invalid instruction or an invalid memory access, something from which it cannot continue execution, it prints a kernel oops message, and in that message we have a lot of useful information for the debugging process. Sometimes the kernel oops generates a kernel panic, and when a kernel panic happens the kernel stops: it doesn't run any tasks anymore. There is documentation in the kernel source tree to help us debug this kind of problem, using the kernel oops to find the source code that caused it. And that's what we are going to do here.

So this is our first demo. For the kernel developers in the room: you're going to see a lot of bugs here, but these bugs don't exist in the mainline kernel; I put them in for this talk, so don't worry. I'm going to cause a crash in the kernel. There is a bug somewhere: when I try to read a GPIO, the kernel crashes. I think this is big enough for you to see, right? So we have a kernel oops followed by a kernel panic; the system just stopped execution, the kernel is not doing any more scheduling, things like that.

What do we have here? Looking at the slides is better; I will come back to the terminal. From the bottom up, we have the stack trace. With the stack trace we know which functions were called, and the last function that was executed before the crash; I put it in red to make it easier to see. We have the program counter, which is like the instruction pointer for those of you who work with x86, so we have the address where the crash happened. And above it, the kernel is able to resolve that address to a function name. Normally the kernel is compiled with an option called CONFIG_KALLSYMS, and with this option the kernel can resolve addresses to symbols, like function names. So here it is saying that the crash was in the read function of the mcp23xxx GPIO driver, at offset 34. So we have the function and the offset.

What can we do with it? We can use several tools to find the source of this problem. One of them is called addr2line, one of the tools from the toolchain. Let's call it here. I'm working with a board based on the ARM architecture, so I have to use an ARM toolchain, and I'm going to use addr2line to convert that address into a line; that's what the tool does. I pass -f, which also shows the function name, and -e, which takes the kernel image in ELF format with the debug symbols. I am here in the Linux kernel source tree, and I have the vmlinux, the ELF image of the kernel with the debug symbols, because I compiled the kernel with debug symbols; the shape of the full command is sketched below.
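Something along these lines (the address here is a made-up placeholder, and your cross-toolchain prefix may differ):

  # resolve an oops program counter to a function name and file:line
  arm-linux-gnueabihf-addr2line -f -e vmlinux 0xc0654a2c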
And the last parameter is the address, which I just copy from the oops message. Let's go to the beginning of the oops message... it's here, so I just copy it and put it here. Right. And it is saying that this program counter, this address, is associated with this source file in the kernel, at line 357, and there we have the line that caused the crash. As you can see, the mcp pointer is NULL here, because I changed the code so we could cause this crash; the real code doesn't have this bug.

How else can we debug this? There is a tool in the scripts directory of the Linux kernel called faddr2line. With this tool we just give it the kernel image in ELF format with the debug symbols, plus the function and its offset, which is exactly what we have. And the result is the same: you don't use the program counter, but from the function and the offset you can resolve the line of code. Another way is to use GDB. We can just run GDB here... I forgot to give it the vmlinux. So I start the GDB from my toolchain with the kernel image, and now I ask GDB to list what is at that address; I could use the address or the function plus the offset, it's the same, so I'll take the function and the offset. Same result. I can also open the TUI mode of GDB; sometimes the TUI mode doesn't work, let's try again. Very good. So in any of these forms, with addr2line, with that script from the kernel, or with GDB, you can find the source code that caused the crash. It's all in the slides.

But what if you can't access the kernel console? Here I am on the kernel console, connected to a serial port, so I can grab the kernel oops and analyze it. But what if I don't have access to it? How can we debug a crash in the kernel then? Because when the kernel crashes and panics, you lose the connection with it; if you are not connected to something that is receiving the messages, you just lose them.

There is a framework in the kernel called pstore. It's a generic framework to store data, and there is a driver for this framework, ramoops, that is able to store the kernel messages, the logs. You can store the full kernel log, or just the kernel oops and panic messages, in memory. You reserve part of the memory for this; of course, the kernel can't touch that memory for anything else, and the contents have to survive the reboot. So we reserve part of our RAM to store the kernel oops and panic messages, and if the kernel panics, you should configure it to reboot on panic. It must be a software reboot, of course, because with a hard reset you just lose the contents of memory. So if the kernel panics and reboots, afterwards you have access to the kernel oops and panic messages.

To configure pstore, you first have to enable pstore in the kernel. Then you configure the device tree of your board to reserve part of your physical memory for these messages. This is an example of how it's configured in my device tree: you just enable the ramoops driver with a specific configuration that reserves part of the memory; I'm reserving two megabytes here to store kernel messages. Let's try that now and see how it works. The first thing I will do is configure the kernel to reboot on panic, by passing it a kernel parameter. Let me take my... So I'm passing the kernel the option panic=3, which means that when we have a panic, the kernel will reboot after three seconds.
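Once everything is in place, reading the records back after a panic looks roughly like this (the record file names depend on the ramoops configuration, so treat them as placeholders):

  # mount the pstore file system if it is not mounted already
  mount -t pstore pstore /sys/fs/pstore
  ls /sys/fs/pstore
  # each captured oops/panic log shows up as a dmesg-ramoops-N record
  cat /sys/fs/pstore/dmesg-ramoops-0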
Let me reboot it. That's my timer here. Great. Now I will access the board via SSH, but let me show you something first: pstore is exposed to user space via a virtual file system, so we have to mount it in some directory, and the default location is /sys/fs/pstore. It should be empty now, because I just did a hardware reboot. Now I'm going to crash the kernel. But since I am on SSH, it's just going to freeze; I don't see the kernel panic here, because I'm not connected to a console of the kernel, this is a pseudo-TTY, so I won't see the panic message. The session should close in a few seconds... it closed, and the kernel rebooted. I SSH in again, and now I have the files from the crash inside the pstore virtual file system. I can print them... oops... and there is the same crash, and I can take a look at it. So if you don't have access to the console, you can still retrieve the last kernel log, or the last kernel oops and panic, and analyze it. That's pstore.

There is also a feature called kdump. Has anyone here used kdump? I must say it took a while for me to make it work on ARM, but it worked. Kdump is a mechanism to take a core dump of the kernel. It uses the kexec system call to run another kernel; you can use kexec just to reboot into a new kernel, you give it a kernel image and it runs it. And kdump is the mechanism you use when you want to take a core dump of the kernel, for example of a kernel that crashed: you can capture an image of its memory. I will show you in the terminal how it works.

First you have to enable kdump support in the kernel, and you should install the kexec-tools in your root file system; that's what I've done. Oh, and you have to give the kernel an option, let me show you here, called crashkernel=, where you give the size of the memory you want to reserve for the crash dump kernel, the kernel that will capture the core dump for you. I'm giving it 64 megabytes. I'm using the same kernel image; I didn't create two different kernel images, so it's the same kernel image that will dump my crashed kernel.

And then, how do you use this feature? First you run the kexec tool to load your kernel image into this reserved range of memory. I have a script here to do that; this is my script. We run the kexec tool and give it some parameters: -d for debugging, to show debug information; the type of my image, which is a zImage, an ARM image; my image, which is in my boot directory, the same image that I'm booting here; and the command line to pass to this kernel. It could be any command line; normally you would use a small RAM disk image here, but I'm just booting over NFS again, with the same root file system. There are also some parameters that the kernel documentation recommends passing to this crash dump kernel, which I'm giving here: network support, and they say you shouldn't boot it with SMP support, so you run with just one CPU to take the core dump. So I'm going to run this script to load my kernel into memory. Great, it is loaded, it is configured. Now, if there is a kernel panic, kexec will just run this kernel image so we can take a core dump of the crashed kernel. So I'm running the same crash command again; let's see if it works.
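By the way, the load step in that script amounts to something like this (paths, console device and extra options are placeholders, not my exact command; depending on the board you may also need to pass a device tree blob):

  # load the crash kernel into the memory reserved by crashkernel=
  kexec -p /boot/zImage -d --type=zImage \
        --command-line="console=ttymxc0,115200 root=/dev/nfs ip=dhcp maxcpus=1 reset_devices"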
"Loading crash dump kernel", "Bye!", and then the kernel reboots into the other kernel that was loaded with kexec. And this kernel provides me a file... let me remember the name of the file... vmcore, I think: /proc/vmcore. This is the core dump of the crashed kernel, an image in ELF format. You can save it as a core file and open it, for example, with GDB; there is also a tool called crash, built on top of GDB, to analyze this kind of image. Copying this image takes a while because it's big, so I'm not going to copy it now; I already copied it to my machine, it is here, and the vmcore file is here. What you should do is simply copy it: since it lives in memory, in the proc file system, you copy it to disk, and that takes a while. This message here is normal, by the way; there is a bug involving kmemleak, which is another debugging framework, and I didn't have time to look into it, but everything works fine despite that kmemleak bug.

So I have the core dump of the kernel here, and I can just open GDB. I give GDB the kernel in ELF format with the debug symbols, and then the core file, and I'll start it in TUI mode. And there we have the line of the source code that caused the problem. So, a lot of different ways to get to the same result. Very good. My problem here is the 30-minute limit of the talk. That's it for crash dump analysis.

Let's talk a little bit about debugging the kernel with GDB, when you want to debug the kernel interactively. The problem is that you would be using the kernel to debug itself, so it's not an easy task. You have the source code on one side and the kernel running on the target on the other side, and the solution is a client-server architecture: you have a GDB server inside the kernel and the GDB client on your machine, sending messages to that server. That's what we're going to do here. The kernel has an implementation of the GDB server called KGDB, and you can use it to debug the kernel; that's the architecture.

How do you make this work? You have to enable KGDB in the kernel; KGDB has been available for a long time. You can use it to communicate with the kernel over a serial port or over the network, but the network support is not mainline, you have to apply patches to the kernel to use it; as far as I know, the mainline kernel supports debugging via serial port. So first, you compile the kernel with KGDB support. Second, you put the kernel in debug mode, and then you use GDB to connect to the kernel and start the debugging session. Instead of going through the slides, which are available on the event website, I will show you a GDB session here.

What I have here: I already compiled the kernel with KGDB and everything we need to debug the kernel with GDB. The first thing I will do is reboot, because the kernel just panicked. So the first step is already done, the kernel is compiled with KGDB support. The second step is to put the kernel in debug mode. You can pass parameters on the kernel command line to do that, or you can put the kernel into debug mode at runtime. At runtime, we just need to set a parameter telling the kernel which serial port to use for debugging, and then send a command to enter debug mode; I'm going to use the SysRq mechanism for that.
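In practice that boils down to something like this (the serial device name is a placeholder for whatever your board uses; my actual script may differ slightly):

  # tell KGDB which serial port to use (kgdboc = kgdb over console)
  echo ttymxc0 > /sys/module/kgdboc/parameters/kgdboc
  # drop the running kernel into the debugger with the SysRq 'g' trigger
  echo g > /proc/sysrq-trigger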
I have a script here to do that, kgdb; it's very simple: it configures the kernel with the name of the serial port to use for debugging, and then it puts the kernel into debug mode. I'm going to run it... and it's running. But I forgot something: I'm using the serial port for the console, and I'm trying to debug the kernel over the same serial port, and that's not going to work very well. So I will do something here. There is a project that works as a proxy for the serial port, available on kernel.org, called agent-proxy, and I'm going to use it. I wouldn't need it if I weren't using the serial port for the console, but I do want the serial port as a console too. So I stop my serial terminal program and start the agent-proxy. I'm almost out of time, let me finish here. I run the proxy, and now I'm just going to telnet... sorry. This proxy creates two TCP ports: one for the console, the other for the GDB connection. It works as a proxy: it receives the data and decides, this goes to the console, this goes to GDB. One of the ports, 5550, is the console, and the console is working. The other one is for GDB.

So now I put the kernel in debug mode again and start the debug session. Great. Now I go to the kernel source tree, and I run my cross GDB with the kernel image with the debug symbols. Let me open it in TUI mode. And I connect using the command "target remote", giving it localhost and the other port from the agent-proxy. It is connected now, and it is stopped at a breakpoint. I'll just tell the kernel to continue, and now it should be running; forget about all these strange things in the console. So it is running, and I can interrupt it at any moment by typing Ctrl+C. What I'm going to do now is set a breakpoint; let me see if that works. It is stopped in the debugger, so I set a breakpoint on a function of the kernel; that function is in the gpio-keys driver. Continue again, I just press a button on my board... and it stops in that function. I can do the usual debugging things, like stepping through the code, printing variables, and so on. So I can debug the kernel with GDB. One cool thing is that when you are running the kernel under KGDB, if you have any crash, it will stop in GDB. So if I trigger that crash again, we have the crash, GDB stops the execution, and you can analyze the crash from here.

Right, I think we are out of time, and I have a lot more tools to show you. Does anyone here want to go get coffee? No? You're fine? I'm fine. Okay, ten more minutes? Okay. Great. Right.

So, another thing I wanted to show you is tracing. I don't know if anyone here was at Steven's talk earlier today; he talked a lot about the history of tracing in the kernel and the various tracing tools we have inside the kernel. Today Linux has very good support for tracing. There are two kinds of tracing: static tracing, where you instrument the kernel at compile time by putting trace points in the code, and dynamic tracing, where you instrument the kernel at runtime, at basically any position in the code or in memory. Here are some examples of how the kernel implements tracing.
ftrace uses the -pg option of GCC to instrument all functions: the compiler adds a call to a hook function at the entry of every kernel function. Just to show you, when you enable tracing in the kernel you get this: a call to a function that the tracing code can implement in order to instrument the kernel. You also have trace events, to instrument specific events in the kernel, like scheduling or, in this example, GPIO events; there is a whole framework for adding trace events to the kernel. And we have kprobes: kprobes is a framework that lets you instrument basically any kernel address; it places a breakpoint there and runs whatever you want in the kernel. So these are some examples of the frameworks the kernel provides, and we have a lot of tools that use these frameworks. You probably know ftrace; trace-cmd, which is a command line tool for ftrace; KernelShark, a visual tool for the files generated by trace-cmd; SystemTap; perf; LTTng; and many more tools that use these kernel frameworks.

ftrace is a very useful tool because it's very simple: it is able to trace the kernel using static and dynamic probes, and the interface is just files. You read from and write to files in a virtual file system called tracefs, and with that you can trace the kernel. It's a very, very simple interface, although sometimes, depending on what you want to do, ftrace can be difficult to use directly. Some examples: you enable ftrace in the "Kernel hacking" menu of the kernel's menuconfig. Using ftrace is simple: you mount tracefs in a directory (the default today is /sys/kernel/tracing), and there you have the available_tracers file, so you can trace functions, you can use the function graph tracer and see the functions in a C-like call graph, you can trace the latency of tasks, you can trace a lot of things. For example, to trace functions you just write "function" to the current_tracer file and cat the trace file, and the kernel starts tracing all the functions.
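A minimal session along those lines would look roughly like this (on most systems tracefs is already mounted, so the mount step may be unnecessary):

  mount -t tracefs nodev /sys/kernel/tracing
  cd /sys/kernel/tracing
  cat available_tracers            # e.g. function_graph function nop ...
  echo function > current_tracer   # start tracing every kernel function
  head trace                       # look at what has been captured
  echo nop > current_tracer        # stop tracing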
There are a lot of files you can use to configure ftrace. You can set filters, for example "I just want to trace this driver", and restrict tracing to the functions of that driver. trace-cmd is a command line tool that writes to these files, so it is easier than writing to them by hand, and it generates a file called trace.dat that you can open in KernelShark. Let me show you an example of using these tools. Let me just reboot; this is the last thing I'm going to do, because we are out of time.

I have a bug here, and again, the bug doesn't exist in the mainline kernel: when I try to set the cpufreq scaling governor to ondemand, the command freezes. I don't know what is going on; I don't know if everything is freezing or just this command. I can try to SSH in to see if the system is still running... SSH works, so it looks like the system is running, but that command is frozen, it is hanging inside the kernel. So how can we use ftrace to debug this kind of problem? The operation is very simple, it's just writing to a file, but it freezes, it hangs inside the kernel.

This is the command I'm going to use to debug what is going on with the ondemand scaling governor. I'm using trace-cmd: I ask it to record events from the kernel, to use the function graph tracer, to not trace functions running inside interrupts, and to follow this command, so it traces only the kernel functions executed on behalf of this command, the command that is freezing inside the kernel. When I run it... it gives this error; well, it should give an error, because something is wrong, right? Let me try again here, with the echo... it is running, it is tracing the kernel, it is generating the trace.dat file, and it is not returning, because it is hanging inside the kernel. So let's just reboot the board.

If you look at the directory where we ran the command, we have these files. They are kind of temporary files from trace-cmd; we still have to generate the trace.dat file from them, because the command didn't complete, so trace-cmd wasn't able to generate the final trace.dat file. There is a trace-cmd subcommand, restore, that can take these per-CPU trace files, from CPU 0 and CPU 1, and generate the final trace.dat file. You run it, it takes a while... I also have the result here on my machine to show you. The trace.dat file is here, and we can open it with KernelShark, a visual tool that lets you look at the trace in a more graphical way.

So here I have all the functions executed by my task. Let me see... 178, that's the PID of the process of the command we ran, and I will filter by this PID, because I don't want to see any other process here. Now, how do we find out where the kernel freezes, where the problem is? I'm going to search for a function of the kernel that I know is called on writes, because that's what I'm doing, I'm writing to a file. I search for sys_write, and sys_write is here. I can follow all the calls: sys_write, ksys_write, vfs_write... Since I'm doing function graph tracing, I can see who is calling whom. vfs_write calls __vfs_write, which calls, let's see, the kernfs write handler, and the call goes into the sysfs code; now we're inside the sysfs code, which calls the store function for scaling_governor.
So we know the problem is around here, because we are trying to set the governor: that calls cpufreq_parse_governor, which calls mutex_lock, then find_governor, then try_module_get, then mutex_lock again, and that last call apparently never returns. A mutex_lock should return quickly, but this one didn't, so there is something wrong with this call. What can we do? We can just go to the function cpufreq_parse_governor and look at these calls. I'm going to open the kernel source code here... cpufreq_parse_governor... oops... cpufreq_parse_governor, here. We can see that this function calls mutex_lock and find_governor, here is the function. So we have mutex_lock and find_governor, but it never reaches the mutex_unlock, because we don't see it in the trace: just find_governor, then try_module_get, that's here, then another mutex_lock that doesn't return; it never gets out of that mutex_lock. And as you can see in the code, it is in a deadlock, because it is taking the same mutex twice with mutex_lock. That's the problem. So that's another way of debugging the kernel, in this case a kernel that hangs: we found the bug that hangs the kernel. Very good.

Well, let's finish the talk. There are more tools here, but I don't have time to talk about them; please take a look at the slides after the talk, and if you have any questions, send me an email. The conclusion is: know your tools. I see a lot of developers who, when they have a bug in the kernel, just reach for printk for everything, and sometimes we have more efficient tools for the job. So know your tools and use the right tool for the job; that's the conclusion. There are many more tools to talk about: SystemTap, perf, eBPF; there are a lot of uses of the eBPF framework nowadays. And printk is not the problem; the problem is not using printk, the problem is using only printk to debug the kernel. And debugging is fun. Thank you.