 Hello, guys. Myself is Vandana Salve. I have been working with electronic systems like almost more than 20 years. I've done driver development for various subsystems like networking, embedded devices, ARM, SOC boards, and various kernel subsystems as such. And I'm also been recently getting involved with open source presentation. This is like my fourth presentation in open source submit as such. And this time, I decided to do some ramp on sessions for the newbies and the presentation who would want to know about the kernel tracings as such. So my today's talk, I am going to cover the basic tracing tools and utilities that we can get started off with as such. So the overall agenda is like I plan to cover up the tracing tools like S-Race and F-Race as such. And then understand what are the different options that are available with those tools and how it is helpful for tracing and debugging the applications as such. So just to get started with what is tracing as such. So basically, tracing helps us to log information over the program's flow as such. And to elaborate, it's like when we add the print messages into the code as such to debug the flow as such. So what we are basically doing is like we are adding our own tracing system altogether. And as the project goes development as for the requirement and how the product flow goes on, the debug prints can be added. And all these debug messages, they are the static trace point as such. And it works for a particular project, but it won't be, it is not very scalable as such. And for that, instead of having the individual or static tracing system, what a project can do is like use the tools that are already available that can help for later analysis as well as to store the log and then work on those logs to understand the analysis of the code flow or to debug the code or to get most other stakes from that execution of the programs. And that's where these tools, S-trace and F-trace, are useful. So what is S-trace? What S-trace is and how it is useful for monitoring and debugging as such. So basically, this tool is to trace system calls and signals that are being used in the system call that are part of the application as such. And this tool, the back end, it has been implemented using the P-trace system call, which basically does the work of tracing the live processes. And how do we decide when to use this S-trace? Like what kind of answers it gives to the questions like what are the different system calls that are implemented in the application or what are the files that are been touched by that application? And along with that, the information, like for a particular system call, what are the input parameters in arguments that are there, that are being passed? And what are the written values that are coming from, what is the result of the system calls? Like whether it was successful or whether it failed and if it is failed, what are the different causes or the reason for that system call failure altogether? And so here we are talking with the system calls like system calls are the ones that are being used by the application, user space applications. And basically, they are the mechanism to go and talk to the kernel that does the higher priority work altogether. So whenever the application wants to go and talk to the hardware or to the kernel, it goes to this system calls. And based on the different resource management, there are different system calls altogether, like more than 400 system calls that are present in the Linux kernel. And if we want to get more details, then we can do a man page on sys calls as such. Unfortunately, some issues with my laptop where I was planning to show you the demos and go through the execution that I would not be able to do it right now. So it would be more verbal things that I would be presenting as such. So the system calls that have been used in the applications, they are not directly the system calls, APIs as such. But rather they are like the wraparound functions that are part of the standard library, libc as such. So basically, the applications that goes and makes the calls into the wrapper function that is part of the libc standard library. And the standard library goes and talks with the kernel interface that goes and actually executes the system calls. So what does this library wraparound function does is like it packs the system call arguments altogether into the registers and puts the unique system call number and then flips the application which is executing user mode to the kernel mode. And then the kernel mode, then the remaining part is gets executed into the kernel space as such by executing the corresponding system call handler. And each of these different system calls, they are being identified by the unique system call number. And the kernel maintains the table or indexes the system call handler based on the system call number as such. And here in this case, like when the system call gets executed in the kernel, it would return back. It will perform its operation and then correspondingly based on the execution flow, it will return back the written values and whatever the written output that has to happen. So OK, so S-rays interprets and records the system calls which are called by the process. And also it also interprets the signals if the applications are receiving the signals. And basically, when the output of the S-rays is like it outputs the information in a symbolic form as such, like it gives the system call names, its arguments, and what is the written value. Again, like if the system call is passing the input parameters which has a number of different flags as such. So those flags, if there are multiple flags, then whatever the bit mask or together that everything is displayed, so that you get to see if a particular. Now, here in this case, the example that I've took is the open system call. Here the application basically is opening this libc file, that .so.6. And for that open system call, the flags that are passed is like real only and close exact. So these are the input parameter that goes to the open system call. And then the last parameter that equal to 3, it's like the written value. The open system call when we make there, we expect when it's successful, it returns the file descriptor. So here in this case, this is the number 3 is the file descriptor, that is the written value of the open system call. Now, if the system call has not been successful, then it returns the error, that is minus 1. And a corresponding error number has been set as such. So here in this case, if you take the access first line, here in this case, it is not successful. And the reason is that this particular file, preload file does not exist. That is no such file or directory. That is the corresponding text string corresponding to the error number that has been written as part of that system call execution. And along with that, like if the particular system call is using passing a structure as such. Structure has a lot of members fields in that case. All that structure and its contents can also be seen as part of the output of this S trace comma. So here in this example, f star does not have that. So it's not seen. But basically, let me see if I have another example. In that case, we'll see that if there is. OK. So here in this case, I wanted to show with a small example, like Hello World program that is right. Now here in this case, we see the example, which is a simple printf statement, which would do the work of outputting that string on the standard output as such. So when we execute, when we want to go and press this program as such, what all system call that has been used by this particular application. Now here in this application, we see we are directly not using any system call as such. The printf is the function. Basically, it is not a system call, but it's a library function, which is part of libc. And the difference between system call and the library function is library functions. They are implemented as part of libc. And they might be implemented within the libc itself, or they might go and make a call to the system call as such. So here in this case, when we do strace after compiling this program as such, what basically we see is that now when we are first, the first line of this strace is the exact v, which is like the shell is loading this hello program as such. And loading into that, that's where this exact v comes into picture. And the input parameters, that is the program that is getting executed along with the other parameters that that particular system call takes as such. And when you see the output of the strace, it contains a lot of system calls other than the exactly the code that has been implemented. So basically, those are the set of system calls that are, I don't have the detail log as such. But that is there to load the shared libraries, like application one it is executing. Now we are in the hello world, there is nothing we are much using as such. So there is not, but still it will be loading the shared Lipsy library and the corresponding any libraries that it is there. And then eventually at the end, what we see here in this case, we see that Lipsy has been opened and its contents are being read, loaded into memory. And then there is a write system call that is going. And if you see the parameters of the write system call is like it is using the file descriptor one that is started output. And this string that is hello world string that is there, that is the second parameter to the write system call that is the buffer, the static buffer. And the number of bytes that would be read. So the printf eventually has gone and made a call to the write system call. So that's where when we do a strace on this application, hello world application, we would see this. And then there is another system call that is exit group, which is, which has been called by the exit library function that is there to successfully complete the execution of the program. So here in this case, what to just to summarize that is the strace output shows the name of the system call, what are the input parameters that have been passed and the return value. So here in this case, this for the write system call, this 12 is the number of write, basic functional write is like it returns the number of bytes that has been written. It's either to the file, here the file is a standard output. So the hello world is that contain, that is the 12 number of bytes as such. And in case of error as such, like here in this case, we see that it displays the corresponding error string as such. So this is what we talked about. So basically like the small hello world program that boiled down to two system calls, like is write and exit group as such. So here the strace is just tracing the program that has been given as the input parameter and we're just logging the strace log by the minus o option. Now another thing is like if the application has a child process, so by default strace does not traces the child process as such. So that if the application has to be traced with a child process, it has to be traced then that can be done with this minus f option. And okay, so here is this simple program where we create one child process using the fork system call and then just print the ID, PID process ID of the child process in the child process as such. And the parent process is just waiting for the child process to complete as such. So when we do a strace on that, on this particular application, here's a simple output that says that it just shows the parent process ID and the child process ID. But like if you look into the strace log of that child, in that case what we see that the output shows the process ID like the instruction that are being executed by the parent process and the child process as such. So here we see there are two PIDs that is 2946 and 2947. So 2946 is parent process and 2947 is the child process. So each line of the trace, it has been prefixed with the corresponding PID. And here we are in the application we use the fork call. So fork is basically a wrapper function that basically goes and calls the clone system call inside the kernel. So that's where we see this clone system call. And then in the application, here the parent process is just waiting for the child process to complete. So the weight is again a library function wrapper or function which around this weight for system call as such. So here we see that the weight system call has been called. And here you see like 47 in that case the instruction that is this PID of the child is executed it prints that. It's basically not just printing that one statement as such. Then again the parent process it is waiting and then after the child gets terminated it resumed back as such. And also you can see from this is like the parent process received the sick child signal. Like when the child process completed it sends the signal back to the parent process. And that signal capturing information about signals are also been captured as part of this S trace. S trace will also capture the signal so that as received by the process. Now we are in this first example what I have the exa deco this is simple application but it can it can receive the input parameters like if you want to execute some different function by the child process that it will do through this EXIVI. So here in the first example the input was not given but like if we again execute the same process then giving the command or the application PWD to be executed by the child process. So here in this case I have just the snippet of the trace as such but that will help to explain that okay. So here in this case 30 16 is the parent process and the 30 17 is the again the child process. So here again the okay. So here in this case EXIVI is like if you look at this input parameters it shows that in the name of the process and the parameters that are been passed. Like here we are passing this been PWD as one of the parameters. So those are all things and here one more thing is like in the S trace along with minus V I've also given minus V option that is verbose. Like if you want to basically by default the S trace abbreviate does not shows all the all the number of parameters it just abbreviates them. So if you want to have the detail information like here by giving minus V verbose option we can see that the third parameter that has been part is the environment. So what are the different environment variable that goes to goes when a new process is created and executed all those things all that information is been captured when we go to for a more verbose mode with minus V option okay. And yeah so here again then the closed system all that creates the process and when the child process execute first it prints the PID okay and then it calls the AXEV as such. Now here in this case the AXEV is like we have given the input parameter that PWD that is print the working directory. So that particular application will be executed and it just prints the working directory as the output as such okay. So this way a child processes can be traced with the help of the airstrace with specifying the option as such. So along with that there are different filters that can be used for tracing as such. Like if you are in a given scenario like if you suspect that there are some particular system calls that are been given problematic as such. So in that case by default the airstrace gives the flow traces all the system calls as such but if you want to do trace only a specific system calls then those can be specified with minus E option just give the N number of system call that you want as the output of the airstrace rather than every other every system call that as part of the application as such. And again like we can categorize we can do the tracing based on the category like if you want to trace only the system call that are taking file name as an argument as such like the system calls open, start, drunk or change mode all those things. So that can also be specified with minus minus E trace equal to file that is a file category. Similarly like if you want to trace the system calls involving process management all the process management system calls then which are like clone, AXEV, wait for, XZSS all the process related information that tracing can be done with specifying trace equal to process. Then similarly for the network related system calls the system calls that is socket, buy and listen, accept, send message or lessons if you tend to say that there is some problem with the network flow or trying to debug a network performance as such and those things corresponding system calls can be traced separately by giving these filters. Similarly we have the filters for memory management as such. I would have been really happy like to show this demonstration of those things but I don't have that access so unfortunately that we would not be able to see but these are the commands that can be used through which we can see like when you case of memory management like MAP system call or MLOC system call MPROT those system calls traced can be collected. Also like you can do filtering based on the path name or filtering based on the signals that have been captured by the applications. And similarly like if you had to trace a particular life process PID then that can be specified by minus P option. Again like as for the security rules that you can only trace the process that are owned by you as such. And to get the overall summary of the system call we have this nice option called minus C option which gives the overall execution time the counts time as well as number of system calls that number of instance a particular system call has executed and the overall report of that application what are the different system calls and how much time it took. Now I just took this map shot of this LS command that is executed. So here it shows that these are the number of system calls that are been made. A map called 19 times and so on. And again like if a particular system call has failed then those are been the count has also been shown here. So this helps to give the overall stats status of the application that is running. With the again this having this time information helps to understand like what particular system call whether it has incurring any latency or as such. So that summarizes the S trace as a tool to collect the information and use for the further analysis. Now let's look at another tool that is the F trace. F trace is the function tracer that is part of the Linux kernel which was added sometime back in 2008 by Stephen Rostin and the development and evolve. It has been evolving since then as such. So basically it's a function tracer. Let's let you watch and record the execution for of the kernel functions. Here in S trace what we from the application we can see only the system calls execution flow but like if you're if you want to get into deeper into the current functionality and see what particular current functions what the status of that in that case F trace is useful. So what it can do is that it you can trace the function calls and find out which system calls are called when you run particular application and it can also profile application with the measure execution time and find out the bottlenecks as such. So let me, okay. So to use F trace like the F trace is built will be useful when it is built with this GCC minus PGA option that is performance which allows when minus GCC is like it allows to prevent the machine instruction to the prologue of every function a non-enlightened function. And with that it has to redirect the execution of the F trace so my time for planes or the plugins that are being implemented, okay. And when this tracing is enabled when the tracing in a enable it will generate the trace events and the data as such which can be collected or collected and stored by the F trace in the ring buffer in the kernel ring buffer. And all that info data is available through the virtual file system they call as a word stress Fs file system which will be available. Basically it has to be mounted and that's the usually it sets in this is kernel tracing path as such. So that's where it provides that is the interface that can be used by the application to do the tracing altogether. And as it is like there's interface is a file system interface then basically it helps to provide the simple file operations and tools like echo and cat to interface with that framework. So here's another snippet I've taken that is we mount the file stress Fs onto this CIS kernel tracing path and if you look into that it shows the different files that are there that is the interface to the F trace as such. So what with the help of the F trace what can be traced us? So we can trace the kernel events and the different traces. So the events kernel events they are they can be static or they can be dynamic like static events in the kernel that provides the various data and it's like it comes with a lower overhead as such but the drawback is like the trace code is limited and it is like the it is defined at the build time as such and the events can be dynamic that is by using k probes implemented through dynamic probes injected at runtime. And this kernel events they are broken down into different categories or groups like the events from for the scheduler subsist scheduler as such or networking subsystem member power management or interrupt as such. Based on this you can apply different filters to get the events for schedulers or networking or inter management as such and as they are built in the kernel defined or implemented in the as a build time function you can just if you just grab onto trace underscore you'll see like lot of trace events are being added in the kernel for the various subsystems as such. And traces are another that is provided specific functionality like by specifying the filters to trace only the functions or function graph as such in that case traces can be used. So here just to show you about how the events can be used means first is enable the events like we echo right one into this switch. Now here in this case particularly I've taken the example of scheduler switch when of the process switching takes place and if you want to trace all those kind of events then in that case we go and write one into this particular switch scheduler switching enable flag as such and then by using the kernel buffer as we see that ftrace logs all the events into the kernel buffer ring buffer as such and that ring buffer can be read through this trace file. So you can just do a cat or less on that trace trace or trace pipe these are the files through which all that ftrace output is seen as such. So here in this case particular snippet here it shows all the information it tracks events information whenever context switching or process switching is happening all that are being captured. And here particularly to understand what exactly this information like here it is giving the process ID and this is the name then the PID on which processor it is CPU it is executing and then which is the previous process Prevcom is the name of the previous process that is executed it's PID and other information all together. So what all this format that information we can see from this another file that is even format what all that information is exported or been available. So here in the format file it shows that what are the different fields what this different fields corresponds to as such. Okay. And here again in one of the yeah so here in this case there is one one field called as the state field that contains the information the process state as such. So here in this case we see that what is the different state where the process in the sleeping state or the other states and like if you want to get the information what are the different process states like if you can do a PS command it will show you the process states the different flags as such. Or moreover if you do a band page on PROC AFS it also provides the detailed information what this X means what T means and the different values altogether. So from this that helps us to understand like what are the things that are going during the process contact switching. Okay. Now this the that was the output that we get from the after is as such. So basically to get that information the event that is that is causing like here in this case again we are still talking about the switching function as together. So here in the scheduler this is the address event that gets called that is trace schedule switch which generates those events as such. Okay. Now this now if as a current developer or as a subsurface thing when you want to implement this events as such for monitoring that flow or such. So that whole implementation is supported with help of the trace event macros as such. So the trace event macros has to be implemented into that code flow as such. And so this is like particularly I just taken a snapshot from the schedule.hcd file where this particularly trace event is there been defined for the schedule switch event as such. And if you want to explore more and know how that has that can be implemented then there is sorry it's not a sample spelling mistake is happened. So it's a sample directory the trace events in that directory we have some sample code. You can take a look at that and compile and see how that trace even macros are being implemented and that can be used as a starting point to implement further into your drivers or current subsystem wherever you want to. Okay now that was about the current events and let's look at the traces. So traces define the functionality inside the kernel for tracing functions or enabling latency tracing as such. Again now here in this case to enable the tracer like suppose a function tracer in that case so what we have to do is we have to echo the function into that current trace. So by default it is not there is no tracer been unable. So here by using this echo function what you are doing is basically enabling the function tracer okay. So once that is done okay. So here and once it is there then you can check the events by using the by doing cat on this trace or cat trace or cat trace pipe as such where you would see all the current functions that are been executing at that moment okay. So in this thing there is a line that shows that when the schedule switch is getting got executed as such okay. So another okay here in this function we see the functions that are been executed. Now with the help of this graph function graph tracer that allows it shows the code flow as such that it not only traces the function entry but also the return of the function means it shows the complete flow how the function that is getting executed and the output is in like C code style information with the time duration that each of this function has taken the time for execution okay. So this helps us to see if there is any bottlenecks or the delays taken by particular code as such or functions as okay. I mean those were pretty straightforward like function or function graph as such. Again that there are multiple filters that are available for like tracing for filters for a particular setting by PID or other particular function like suppose you are implementing some driver RDC driver or something. So you want to trace only those functions as such so that can be set up with the F trace filter options as such okay. And similarly now this again like going for the particular subsystems like scheduling or networking based on that set those filters and then collect the trace information okay. So here events just to summarize the traces events and traces, events are the static points as traces add functionality dynamically and you can have your own options as such and again traces can also include the events as such. And the dynamic events they are same as the static but they are not created with the trace event macro as such. They are injected through the breakpoints and Cape Probes is one of the framework that has been used and this I write I don't plan to go into much into Cape Probes as such so I'm just putting up the information that you can look at those documentation as such or to get to know more about the Cape Probes but that is the main framework that is as a background for the dynamic events as such. And this is just to summarize like the F trace is the kernel framework that is there and okay so this is the trace command is the front end to the F trace like you can use that by the application again it's been devled by Steven Ruskin and that's I think that's the time up so I would stop over here. Any questions I could take on.