My name is Jerry Cooperstein. I'm the director of training at the Linux Foundation and also the maintainer and developer of the LF320 class, which is titled Linux Kernel Internals and Debugging. Today I'm going to give you a brief introduction to the use of kernel modules. So let's get started.

The Linux kernel often uses modules, which are pieces of code that can be loaded on demand and unloaded when they are no longer needed. Modules are often used for device drivers, but they can be used for many other purposes, for instance different network protocols or different file systems that may be needed at various times. The code is loaded only when necessary. Almost every kernel facility, whether it's a device driver or something else, can be developed so that it can be loaded as a module rather than built into the kernel image that is loaded every time the system boots. So let's take a look at how modules are written and how they're used in Linux.

Here's a very trivial example of a kernel module. We have the code here. There are two basic functions, which are in virtually every kernel module. There's an initialization function, which gets called every time the module is loaded. In this example it's called myinit, and you'll notice it's marked with the __init attribute. Then there is an exit function, in this case called myexit, which is marked with the __exit attribute and gets called every time you unload the module. So when you load the module with the insmod program, which we'll mention later, the initialization callback is called, and the exit function is called when you unload it. The system knows that these are the callbacks associated with loading and unloading because we register them with the module_init() and module_exit() macros. So this is a complete module. All it does is say hello when it's loaded and goodbye when it's unloaded.
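For readers following along without the slide, a module of the kind described above might look like the following minimal sketch. It is not the class's actual source: the function names myinit and myexit come from the talk, while the message text and the author string are assumptions made for illustration. It has to be built against a configured kernel tree rather than compiled as an ordinary program.

```c
/* hello.c: minimal sketch of a loadable kernel module (not the class's own file) */
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>

/* Called once when the module is loaded, for example by insmod. */
static int __init myinit(void)
{
	pr_info("Hello: module loaded\n");
	return 0;	/* a negative errno value here would make the load fail */
}

/* Called once when the module is unloaded, for example by rmmod. */
static void __exit myexit(void)
{
	pr_info("Goodbye: module unloaded\n");
}

/* Register the two callbacks with the module loader. */
module_init(myinit);
module_exit(myexit);

MODULE_AUTHOR("A. Student");	/* placeholder name, documentation only */
MODULE_LICENSE("GPL v2");
```

The messages go to the kernel log, so they can be read with dmesg after loading and unloading the module.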
So it's really just the usual equivalent of a hello world program. There are two additional lines at the end: the MODULE_AUTHOR and MODULE_LICENSE macros. The MODULE_AUTHOR macro is just for documentation, to give yourself credit. The license does matter, though. Here we're using the GNU General Public License version 2, and that has some implications for how your module may be used later and what it has access to. In this session it really won't matter, but it is important.

So I've written that trivial module, and it's included in the solution set for the class, so everybody has access to it and can use it as a template for doing more interesting things. How do I actually compile it? After that we'll talk about how we get it loaded.

Compiling modules involves what's called the Kbuild system in the kernel. It's a somewhat idiosyncratic but easy-to-use set of scripts, et cetera, that are used to compile kernel code, and modules in particular. For normal software packages, you're usually just told what headers you need and what libraries you have to link against, given some advice about optimization options and flags, and then you go ahead and compile. You can't really compile kernel code that way. It's very important to compile in a way that respects how the kernel is configured and uses the same options that were used when the kernel itself was compiled. In this section of the class we talk in detail about how to do this. I won't go through the details here, but we give you pretty precise prescriptions for how to do it. It's really not that hard. Basically, you jump into the kernel source tree, do your compile, and then get out.
There's also a script which we make available to students called genmake, which will do a lot of the work for you automatically. Given a directory which contains source code for kernel modules as well as application programs, it'll identify which files are kernel modules and then write a makefile for you, so that when you want to compile the kernel code you can just say make and not have to do anything more complicated. So it makes things rather easy.

A question that often comes up is whether there are any advantages to having a device driver or some other facility, such as a file system, built into the kernel so that it's always present when the kernel starts, versus having it loadable on demand as a module later. Generally speaking, we try to have as much as possible loaded as modules. There's no reason to load things before you need them, so it's better to load things on demand. Also, if you're a Linux distributor, you don't know exactly what hardware or software people need, so it makes sense to compile everything as a module and then load only the pieces you need as you need them. There are no real disadvantages to having something loaded as a module rather than built in, so for the most part we do things that way.

However, in embedded situations, where we sometimes want to keep things as small as possible and boot as fast as possible, people sometimes dispense with modules and just build everything in. Their needs never change, so there's really no reason to have this kind of optional dynamic loading and unloading. That's a design decision you make with embedded-type systems.
Loading and unloading modules is done with insmod for loading and rmmod for unloading, and in the text we talk in detail about the different options you can apply to these programs. We also talk about the modprobe program, which is used for automatic loading of modules. It knows where to find modules on your file system and can load the appropriate module, for instance when a device is found. So we talk in detail about how to configure and use those programs too.

We have a section here where we go in detail through the module structure. There's a structure called struct module that's associated with every module and contains every bit of information the system knows about that module. The most important pieces are exactly which symbols that module needs to use, and which other modules or device drivers might be calling code that's in your module. So it's who I refer to and who refers to me, and the kernel has to do very careful bookkeeping on all of that to make sure we don't get some kind of bug where we load something and then it tries to call a function that doesn't exist.

The next section talks about how the kernel keeps track of who is using modules. There are two ways modules can be used: by applications or processes on the system, and by other modules or device drivers that are loaded after this module is loaded. It's very important to keep that count accurate, because you cannot unload a module which is being used by somebody else. That's sure to crash your system, so we talk about how to keep track of that. In the early days, when I first began working with Linux, somebody writing a module had to take care of incrementing and decrementing usage counts manually, which is an error-prone process.
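The reason the count has to be accurate can be seen in a small user-space model. The following is a sketch of the bookkeeping idea only, not kernel code: the struct and the function names (fake_module, fake_get, fake_put, fake_unload) are invented for this illustration. A module can be unloaded only when nobody holds a reference to it.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's per-module record. */
struct fake_module {
	const char *name;
	int refcount;	/* how many users currently depend on this module */
	bool loaded;
};

/* A user (a process or another module) starts using the module.
 * Fails if the module is not loaded. */
static bool fake_get(struct fake_module *m)
{
	if (!m->loaded)
		return false;
	m->refcount++;
	return true;
}

/* A user is finished with the module. */
static void fake_put(struct fake_module *m)
{
	if (m->refcount > 0)
		m->refcount--;
}

/* Unloading is refused while anyone still uses the module. */
static bool fake_unload(struct fake_module *m)
{
	if (!m->loaded || m->refcount > 0)
		return false;
	m->loaded = false;
	return true;
}
```

A forgotten fake_put leaves the module pinned forever, and an extra one can let the count reach zero while another user is still active, which is what made the manual approach error-prone. In the kernel itself the corresponding calls are try_module_get() and module_put().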
Nowadays most of this is done automatically by the system, and we talk about how to set up your modules so that this automatic reference counting can take place. We also talk about the cases where we don't keep track of this, such as network drivers, and why it makes sense there not to try to maintain an accurate count of users.

The next section has to do with module licensing: whether you use an open source license or a proprietary license, and what the technical implications are. We don't want to get into the legal aspects. Our instructors are not lawyers and they don't want to give legal advice, and most companies have a lot of people working on this. So in this class we concentrate on the technical aspects, what the advantages and disadvantages are of the various licenses. We also show you how to find out at run time what license the various modules on the system have, if you don't know that in advance.

The last section I want to discuss on modules is about exporting symbols. As I said earlier, when you load a module it may require other modules to be loaded first, or your module may need to be loaded before another module can be loaded. You may have functions defined in your code that are needed by code loaded later, higher on the stack. To make that possible you need to export the symbol, and there are a number of macros, such as EXPORT_SYMBOL, that take a function or variable name and make it available to any code that's loaded later. This is one essential difference between modules and built-in code that's always loaded when the kernel starts. The kernel itself is one big monolithic program, and you have access to virtually all symbols in the kernel from any code in the kernel, as long as the symbol is not declared with the static keyword. Modules have a more restricted API: they can directly access only functions and variables which have been exported.
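The restriction can be pictured as a lookup table. The sketch below is a user-space model, not the kernel's implementation: the table, the symbol names, and the fake_resolve function are all invented for illustration. It also models one technical effect of licensing, namely that symbols exported with EXPORT_SYMBOL_GPL are available only to modules that declare a GPL-compatible license.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One entry in a hypothetical table of exported symbols. */
struct fake_export {
	const char *name;
	bool gpl_only;	/* models EXPORT_SYMBOL_GPL versus EXPORT_SYMBOL */
};

/* Only names listed here are visible to "modules". */
static const struct fake_export fake_exports[] = {
	{ "open_helper", false },
	{ "gpl_helper",  true  },
};

/* Can a module with the given license status link against this name? */
static bool fake_resolve(const char *name, bool module_is_gpl)
{
	size_t n = sizeof(fake_exports) / sizeof(fake_exports[0]);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(fake_exports[i].name, name) != 0)
			continue;
		/* GPL-only exports are hidden from non-GPL modules. */
		return !fake_exports[i].gpl_only || module_is_gpl;
	}
	return false;	/* never exported, so not visible at all */
}
```

In real kernel code the entries come from EXPORT_SYMBOL() or EXPORT_SYMBOL_GPL() placed after the definition of the function or variable being exported.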
So in this section we talk about how to export various symbols, how to restrict the use of exported symbols, and some other variations such as exporting per-CPU symbols. We discuss all of that here. It's pretty straightforward, but you have to understand the implications. In the next section, which is optional and which we don't discuss here, we talk about how modules resolve symbols and find their hexadecimal addresses. That's kind of a technical detail.

Finally, we have a number of laboratory exercises that we do in class. Almost all the sessions have multiple exercises. Many of them are C programs that need to be written, perhaps both a kernel module and a testing program in user space. There are generally too many to do in real time in class, but the idea is that you pick the ones that interest you most and try to write them from scratch. For the others, you simply test the solutions and maybe make some enhancements to them.

In this session, the first lab is to write a module that iterates over every process in the system and prints out various information about it. This is essentially a variation of the ps command, written so that it runs as a kernel module. The second lab is a module you write which extracts certain information about your system, such as what CPU it's running on and what version of the operating system. The third one plays around with modules and exporting symbols. You write modules that export symbols and then you load other modules which try to reference them. Those are the three labs that are done in this session.

I'd like to thank you for watching this brief excerpt from our LF320 Kernel Internals and Debugging class. We encourage you to go to our website to find out more details and look at the detailed outline for this class, and we hope that you'll partake of training with the Linux Foundation soon. Thank you.