Hello, everyone. I'm Yu Hao from the University of California, Riverside, and I'm excited to share with you our paper titled "SyzDescribe: Principled, Automated, Static Generation of Syscall Descriptions for Kernel Drivers." Currently, I'm a final-year PhD candidate at UC Riverside under the supervision of Professor Zhiyun Qian. I focus on Linux kernel security, kernel fuzzing, symbolic execution, and static analysis, and I have published papers in top academic conferences in computer security, for example S&P, CCS, and NDSS, and in software engineering venues. Before we go deeper into the presentation, I would like to provide some background and motivation for our work. First, syzkaller, the state-of-the-art operating system kernel fuzzer, has found more than 4,600 bugs in the Linux kernel over the past few years. One necessary component of syzkaller is a collection of syscall descriptions, which include the syscalls, their arguments, and the dependencies between them. However, to our knowledge, current syscall descriptions are largely written manually, which is both time-consuming and error-prone. Now I will show an example syscall description for the KVM driver, which includes the following necessary components. The first is the syscall interface, which allows user-space applications to interact with the driver. For example, we can see the syscalls open and ioctl here, which are the most common ones; there can also be read, write, and others. The next is the device file name, which is used as an argument to the syscall open. Here, for example, it is /dev/kvm. The next is the command value, which selects the sub-interface of ioctl; it is the value of the second argument of ioctl. Treating each command value as a separate syscall interface allows syzkaller to generate test cases more effectively.
Next is the argument type, which covers the other arguments to the syscall interface, for example the third argument of ioctl. Here, for example, it is kvm_userspace_memory_region, which is declared in the file. The last one is the explicit dependency between those system calls, mainly involving file descriptors that are returned from one syscall interface, for example the return value of the syscall open, and used in another, for example as the first argument of ioctl. Explicit dependencies allow syzkaller to generate valid test cases. Note that, for some complex drivers, a file descriptor can also be returned by a syscall interface other than open. For example, here this ioctl also returns a file descriptor, fd_kvmvm, which is then used as the first argument of the next ioctl. We refer to such dependencies specifically as non-open file descriptor dependencies. Note that it is almost impossible to fuzz those syscall interfaces correctly without knowing these non-open file descriptor dependencies. Since current syscall descriptions are largely written manually, not all kernel drivers have manually created syscall descriptions, even though kernel drivers account for about 70% of the kernel's lines of code. So the goal of this work is to replace this manual step: where humans write descriptions today, we would like to use our tool, SyzDescribe. We designed a new tool, SyzDescribe, which is able to statically, accurately, and automatically generate syscall descriptions for Linux kernel drivers. SyzDescribe only needs a compilable kernel, or even a single module, and it directly generates syzkaller-compatible syscall descriptions. The generated syscall descriptions are better than the manually created syzkaller descriptions to some degree and are closest to the ground truth.
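To make the components of a syscall description concrete, here is a minimal Python sketch that models the KVM example from the talk: syscall interface, device file name, command value, argument type, and file descriptor dependency. It is not the slide's text and not SyzDescribe's output. The `Call` class and both helper functions are hypothetical, the rendered lines are only a simplified imitation of syzkaller's description syntax, and the choice of `openat$kvm`, `KVM_CREATE_VM`, and `KVM_SET_USER_MEMORY_REGION` as the three calls is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Call:
    name: str                     # syscall variant, e.g. "ioctl$KVM_CREATE_VM"
    params: List[str]             # parameters, already rendered as text
    needs: Optional[str] = None   # fd resource consumed as the first argument
    makes: Optional[str] = None   # fd resource returned to user space


def render(call: Call) -> str:
    """Render one call as a simplified, syzlang-like description line."""
    line = f"{call.name}({', '.join(call.params)})"
    return f"{line} {call.makes}" if call.makes else line


def order_calls(calls: List[Call]) -> List[Call]:
    """Order calls so every fd resource is produced before it is consumed."""
    available, pending, ordered = set(), list(calls), []
    while pending:
        ready = [c for c in pending if c.needs is None or c.needs in available]
        if not ready:
            raise ValueError("a call needs a resource that nothing produces")
        for call in ready:
            ordered.append(call)
            pending.remove(call)
            if call.makes:
                available.add(call.makes)
    return ordered


# Deliberately listed in the wrong order to show what the dependency buys us.
CALLS = [
    Call("ioctl$KVM_SET_USER_MEMORY_REGION",
         ["fd fd_kvmvm", "cmd const[KVM_SET_USER_MEMORY_REGION]",
          "arg ptr[in, kvm_userspace_memory_region]"],
         needs="fd_kvmvm"),
    Call("ioctl$KVM_CREATE_VM",                      # non-open fd producer
         ["fd fd_kvm", "cmd const[KVM_CREATE_VM]", "type const[0]"],
         needs="fd_kvm", makes="fd_kvmvm"),
    Call("openat$kvm",
         ["fd const[AT_FDCWD]", 'file ptr[in, string["/dev/kvm"]]',
          "flags flags[open_flags]", "mode const[0]"],
         makes="fd_kvm"),
]

for call in order_calls(CALLS):
    print(render(call))
```

Without the `fd_kvmvm` edge, nothing tells a fuzzer that the memory-region ioctl must be issued on the descriptor returned by the VM-creating ioctl rather than on the one returned by open, which is why the non-open file descriptor dependency matters.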
The key insight of SyzDescribe is that we summarize and model the programming conventions of kernel driver development. This allows us to understand how a kernel driver is initialized and how its interface is constructed. At a high level, we can then statically reconstruct the initialization of a kernel driver by following the summarized initialization process. Now, let me take a closer look at device drivers in the Linux kernel. There are three first-level types of devices in the Linux kernel: character, block, and network. In our work, we consider character devices and block devices because they can be accessed from user space through device files in /dev. The figure shows the output of ls -al in /dev, which lists the device files. The b and the c here mean block device and character device, respectively. The two numbers shown for each device file are its major number and its minor number. The combination of the major number and the minor number is called the device number. The device number not only uniquely identifies a device and its file name, but is also associated with a set of syscall handlers, for example ioctl. So with the device number, we can know the target function of ioctl. The strings such as ptmx and ppp are the device file names. Next, we go a little deeper into the source code of device drivers. In Linux, drivers live in kernel modules, and each kernel module has well-marked entry and exit points, usually defined by macros, such as module_init here for the entry point and module_exit for the exit point. A single driver is usually defined within a single kernel module. It is also possible for a driver to be defined across multiple kernel modules that depend on each other, but here we only show a simple example.
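As a small illustration of the device number described above, the sketch below mirrors the kernel-internal encoding from include/linux/kdev_t.h, where the minor number occupies the low 20 bits. The helper names and the tiny lookup table are hypothetical. The major and minor pairs used for ptmx, ppp, and kvm are their conventional assignments on Linux, not values read from the slide.

```python
MINORBITS = 20                       # width of the minor field in the kernel's dev_t
MINORMASK = (1 << MINORBITS) - 1


def mkdev(major_no: int, minor_no: int) -> int:
    """Combine a major and a minor number into one device number."""
    return (major_no << MINORBITS) | minor_no


def major(dev: int) -> int:
    return dev >> MINORBITS


def minor(dev: int) -> int:
    return dev & MINORMASK


# Character and block devices have separate number spaces, so the key
# includes the device type ("c" or "b") as well as the device number.
DEVICE_FILES = {
    ("c", mkdev(5, 2)): "/dev/ptmx",
    ("c", mkdev(108, 0)): "/dev/ppp",
    ("c", mkdev(10, 232)): "/dev/kvm",
}


def lookup(kind: str, major_no: int, minor_no: int):
    """Return the device file registered under this type and number, if any."""
    return DEVICE_FILES.get((kind, mkdev(major_no, minor_no)))


print(lookup("c", 10, 232))
```

Because one number identifies both the file name and the handler set, the same key can later be used to join the two kinds of kernel objects that carry those two pieces of information.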
We can see that the driver's device file is initialized in the module init function, at lines 6 to 11, which involves the creation of two objects. One object corresponds to what we can call the driver, which is a struct cdev here, and the other corresponds to the device, which is a struct device here. The struct cdev determines the related syscall handlers: for example, the cdev is used in this function, which eventually sets the related syscall handlers. The device structure determines the device file name: for example, the device object is used here, which eventually sets the device file name. Another important point is that each device must have a unique device number. For example, at line 6 we can see the device number for this driver. This device number decides both the device file name and the syscall handlers, and we can also use it to pair the driver structure with the device structure. Then, if user space issues two syscalls, for example an open on this device file followed by an ioctl, they eventually reach the syscall handlers defined here. That is how a device driver works at the source code level. Next, I will show you the SyzDescribe design. The core design of SyzDescribe is based on modeling the well-established contract between the core kernel and the driver side that we just introduced. We have two analysis phases. In the kernel module analysis phase, we recover the syscall handlers and the device file names. In the syscall handler analysis phase, we recover the command values, the argument types, and additional syscall handlers. First is the kernel module analysis phase. SyzDescribe detects kernel modules by their initialization functions, which are marked by the macro module_init.
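The pairing of driver and device structures through the device number can be pictured as a join on that number. The following Python sketch is only a model of the idea under simplifying assumptions: the two classes stand in for struct cdev (with its handlers) and struct device, each driver object covers exactly one device number, and all names such as `xx_open` and `xx_ioctl` are hypothetical. The real analysis works on compiled kernel code rather than on ready-made records like these.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

DevNo = Tuple[int, int]              # (major, minor)


@dataclass
class DriverObject:                  # stand-in for struct cdev and its handlers
    dev: DevNo
    handlers: Dict[str, str]         # syscall name -> handler function name


@dataclass
class DeviceObject:                  # stand-in for struct device
    dev: DevNo
    name: str                        # name that appears under /dev


def pair(drivers: List[DriverObject],
         devices: List[DeviceObject]) -> Dict[str, Dict[str, str]]:
    """Map each device file to its syscall handlers via the device number."""
    by_dev = {drv.dev: drv for drv in drivers}
    interface = {}
    for device in devices:
        drv = by_dev.get(device.dev)
        if drv is not None:          # unmatched objects are simply not paired
            interface["/dev/" + device.name] = drv.handlers
    return interface


drivers = [DriverObject((240, 0), {"open": "xx_open", "ioctl": "xx_ioctl"})]
devices = [DeviceObject((240, 0), "xx"), DeviceObject((241, 0), "orphan")]

print(pair(drivers, devices))
```

The `orphan` entry shows the flip side of this design: when the number on one of the two objects cannot be determined, there is nothing to join on and no interface is recovered for that device file.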
Then SyzDescribe recognizes the existence of any kernel driver and recovers the basic interface created and exported to user space, that is, the supported syscalls. For example, this driver supports open and ioctl. It also recovers the device file name. As we showed in the figure, two types of objects are defined during the initialization of a kernel driver, and they collectively define the basic driver interface. Our goal is to identify these two types of objects and associate them. One is what we just called the driver object, for example the cdev structure for character devices or gendisk for block devices. The other is the device object, which is the structure device. To pair those two types of objects, we rely on the device number assigned to each of them. We can track both driver and device objects by type, as just mentioned, and we can simply inspect their relevant critical fields at the time the objects are registered to recover what we need. Syscall handlers are stored in file_operations structures, which are assigned to the driver object, and device file names are stored in the device object. For each discovered syscall handler, SyzDescribe attempts to recover additional details about the interface. This includes the command values and the argument types supported by the ioctl handler. In addition, SyzDescribe can recover additional syscall handlers, which is needed to recover the non-open file descriptor dependencies we just mentioned. Finally, SyzDescribe translates everything it found into syscall descriptions that syzkaller can use directly. For most kernel drivers, the main driver logic is encoded in the ioctl syscall handler. We identify the command values of ioctl by checking how the command argument is used in switch-case statements and if conditions.
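As a rough picture of what "checking the use of the command argument in switch-case statements and if conditions" means, the toy below scans C source text with regular expressions. This is a deliberately simplified stand-in: SyzDescribe analyzes compiled LLVM IR, not source text, and the handler, the `XX_CMD_*` names, and the assumption that the parameter is literally called `cmd` are all hypothetical.

```python
import re

# A hypothetical ioctl handler, in the spirit of the example driver.
HANDLER_SRC = r'''
static long xx_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
    if (cmd == XX_CMD_RESET)
        return xx_reset(file);
    switch (cmd) {
    case XX_CMD_1:
        return xx_do_one(file, arg);
    case XX_CMD_2:
        return xx_do_two(file, arg);
    default:
        return -ENOTTY;
    }
}
'''

CASE_RE = re.compile(r'\bcase\s+(\w+)\s*:')      # labels of switch cases
IF_RE = re.compile(r'\bcmd\s*==\s*(\w+)')        # direct comparisons on cmd


def command_values(src: str):
    """Collect constants that the command argument is compared against."""
    return sorted(set(CASE_RE.findall(src) + IF_RE.findall(src)))


print(command_values(HANDLER_SRC))
```

Each recovered constant then becomes its own ioctl variant in the generated description, which matches the earlier point about treating command values as separate syscall interfaces.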
For the argument type, we model common kernel functions such as copy_from_user and copy_to_user, as at line 30 or 34. We can see that the destination argument of copy_from_user has the type struct xx_type, and that this call sits under the command value CMD_1, which allows us to infer the argument type for that command. The next thing is to recover additional syscall handlers. In addition to the module init function, syscall handlers themselves can also create and register additional syscall handlers. This is most common in ioctl handlers, where additional struct file objects are created. Inside a struct file object, there is a pointer to a struct file_operations, which corresponds to the set of syscall handlers associated with this file object. In addition, as shown in the figure above, a struct file object is typically paired with a file descriptor that is returned to user space. For example, in the figure below, at lines 37 and 41, we can see that the ioctl handler for the command value CMD_2 creates a file descriptor and a file object and pairs them through a function called fd_install. It then returns the fd to user space. Next, I will show you the evaluation results of SyzDescribe. First, we present the overall results of the syscall descriptions generated by SyzDescribe. Later, we also compare them against the results from another static solution, DIFUZE, and a dynamic solution, KSG, as well as against the syzkaller descriptions, which are the manually created syscall descriptions from the official syzkaller repo. The syzkaller configuration here comes from syzkaller and tries to enable as many drivers as possible in QEMU. As we know, a kernel built with allyesconfig cannot boot, but dynamic solutions need a bootable kernel.
So we also compare the results under the syzkaller config. Before comparing SyzDescribe with the other static solution, DIFUZE, I will briefly introduce its overall workflow for generating syscall descriptions. First, DIFUZE identifies syscall handler structures from a predefined list of structure types, for example the figure on the left. Second, DIFUZE identifies the corresponding device file name, which is any constant string used near the reference to the structure, for example the string in the right figure. Finally, DIFUZE applies an inter-procedural static analysis to the syscall handlers to recover the arguments. At a high level, SyzDescribe has several advantages over DIFUZE. The first and most important difference is the modeling of the kernel driver. Instead of using a predefined, heuristic list of structure types, SyzDescribe is based on the well-established contract between the core kernel and the drivers. So SyzDescribe can recover syscall handlers and device file names in a principled way, which means fewer false positives and false negatives for syscall handlers and device file names compared to DIFUZE. Another benefit is that SyzDescribe gains a more accurate indirect call resolution from the kernel module analysis phase, which improves the results later when recovering the arguments of syscall handlers. Since DIFUZE does not handle indirect calls, it misses command values and argument types that are located behind indirect calls. Another issue with the predefined heuristic list of structure types is that the list is likely not compatible with newer kernels, whereas the contract SyzDescribe is based on is unlikely to change, so it can work on newer kernels. In addition, SyzDescribe can recover non-open file descriptor dependencies, which means more syscall handlers. Finally, SyzDescribe directly generates syzkaller-compatible syscall descriptions.
Since our evaluation is based on a newer version of the Linux kernel, 5.12, we had to port DIFUZE to a newer version of LLVM, which is required by Linux 5.12. We also fixed some issues in DIFUZE; the fixed version is marked as DIFUZE_F. The issues fall mainly into two categories. The first is correcting hard-coded offsets for certain types that have changed in kernel 5.12. The second is fixing the failed tracking of copy_from_user destination objects due to changes in the compiled LLVM bitcode. We can see that SyzDescribe recovers fewer syscall handler structures than DIFUZE or the fixed DIFUZE. In fact, however, most syscall handler structures found by DIFUZE or the fixed DIFUZE are false positives, given that they recover only 48 or 60 device file names, whereas the mapping should typically be almost one-to-one. To understand the correctness of the syscall descriptions generated by the different solutions, we manually collected the ground truth and took a deeper look at the details. We randomly picked 100 kernel drivers as the dataset for the comparison against DIFUZE. During ground truth collection, to mitigate potential errors on our side, we always cross-validated against the syscall descriptions generated by SyzDescribe and those from syzkaller, which are the manual ones, to see whether we had missed anything. Overall, it took more than one person-month to collect the ground truth for the 100 drivers. To our knowledge, no other work has built a ground truth dataset for such an evaluation. We can see that the results match our high-level comparison: SyzDescribe has a significant advantage over DIFUZE in all aspects. For the fuzzing comparison, we picked 30 of the 100 kernel drivers, based on whether they are compiled and available in QEMU. SyzDescribe has a significant advantage in both coverage and the number of crashes, which is expected given the accuracy results.
It is worth noting that SyzDescribe achieves much more coverage and many more crashes for KVM than DIFUZE, because SyzDescribe recovered three non-open file descriptor dependencies for KVM. Next, we compare SyzDescribe against the dynamic solution, KSG. Again, I will first introduce its workflow. First, KSG scans all the device files available under /dev, for example the figure on the left. Second, it hooks the syscall open on those device files and gets the syscall handler structures, for example the figure on the right. Finally, it applies an intra-procedural symbolic execution to the syscall handlers to recover the arguments. At a high level, SyzDescribe also has several advantages over KSG. The biggest difference is that SyzDescribe is a static solution and KSG is a dynamic solution. KSG requires a compilable, bootable kernel that can run in QEMU or on a physical machine. Dynamic instrumentation is also necessary, and since it requires a whole kernel, KSG cannot work if there is only a single loadable module. Besides this fundamental difference, a dynamic solution needs a model of the kernel driver just as a static solution does. KSG, however, has only a limited model, which leads to false negatives or even false positives in the syscall handlers it recovers. Another limitation of the dynamic solution is that it does not cover every compiled module, because some require additional dependencies. For example, we may need to plug in the related hardware before the kernel module's device file appears under /dev. Dynamic instrumentation also means that KSG cannot directly work on kernels with a fixed configuration, for example the Android kernel. The lack of inter-procedural analysis gives KSG many false negatives when recovering command values and argument types.
We can see that, under the syzkaller configuration, SyzDescribe recovers 185 syscall handlers while KSG recovers 88. The reasons SyzDescribe finds more syscall handlers are the ones we just mentioned. We also took a deeper look at those results. To be fair to KSG, we include only the subset of the 100 drivers that are compiled under the syzkaller configuration, which is 70 drivers; the other 30 are compiled only under the allyesconfig configuration. We can see that the results again match our high-level comparison. KSG has false negatives in all aspects, especially for command values and argument types, because of its lack of inter-procedural analysis. Because the syscall descriptions generated by KSG in many cases incorrectly associate different drivers with the same set of syscall handlers, we compare against KSG by fuzzing the whole kernel with all available descriptions. SyzDescribe again had better results in both coverage and the number of crashes. We also compared SyzDescribe with the syzkaller descriptions. Because 57 drivers are missing completely from the syzkaller descriptions, to be fair we include only the subset of drivers that they cover, which is 43 drivers. Overall, SyzDescribe generates either more command values or more argument types for 13 drivers. We analyzed those 13 drivers to understand why we generate more, and we found bugs in the syzkaller descriptions, which are written by humans. We summarize the results by type of bug for those drivers, which are identified by their corresponding ioctl handlers. We can see that SyzDescribe recovers 78 missing command values or argument types across a total of 13 drivers. In addition, we see that the manual syscall descriptions have three false positives with regard to argument types across two drivers.
These results mean that ongoing human maintenance is needed, yet almost no one is doing it. We also found that only one of those bugs, this one, was eventually fixed, in January 2022, before we started to report them. This analysis shows that such syscall description bugs can persist for a long time. So far, we have reported all the bugs to syzkaller, and all of them are fixed. We also shared our generated syscall descriptions with them. Then we inspected the gap between SyzDescribe and the ground truth and summarized the reasons, which will be helpful for future improvement of SyzDescribe. For syscall handlers, the major reason for false negatives is that syscall handler structures are stored in dynamically constructed variables. For device file names, the main reason for false negatives is that the names are also constructed dynamically. As for false positives, the reason is that the major number or device number is generated dynamically, so we cannot match the driver object and the device object accurately. For command values, a false negative in the syscall handlers brings false negatives in the command values. The second reason is non-trivial use of the command value, which brings both false positives and false negatives. For example, in the figure above, some calculation is performed on the command value before it is used. Another example is the figure below, where there is a function pointer array indexed by command value, so we cannot track them. For argument types, the main false negatives come from the false negatives of the syscall handlers and command values. Other causes are incomplete data flow tracking and the failure to model additional functions that are defined in inline assembly code, for example get_user, which is similar to copy_from_user.
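To illustrate why non-trivial uses of the command value are hard to recover, the toy below shows a hypothetical handler that first computes on the command (taking its low 8 bits, as the kernel's `_IOC_NR` macro does) and then dispatches through a function pointer array. It is not the code from the slides, and the textual scan is only a stand-in for the real analysis, but the effect is the same in spirit: there is no switch case or direct comparison from which a constant could be read.

```python
import re

# Hypothetical handler: arithmetic on cmd, then table-based dispatch.
TABLE_SRC = r'''
static long (*xx_handlers[])(struct file *, unsigned long) = {
    xx_do_zero, xx_do_one, xx_do_two,
};

static long xx_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
    unsigned int nr = _IOC_NR(cmd);   /* calculation before use */
    if (nr >= ARRAY_SIZE(xx_handlers))
        return -ENOTTY;
    return xx_handlers[nr](file, arg);
}
'''


def literal_command_values(src: str):
    """Constants compared directly against cmd in case labels or ifs."""
    cases = re.findall(r'\bcase\s+(\w+)\s*:', src)
    ifs = re.findall(r'\bcmd\s*==\s*(\w+)', src)
    return sorted(set(cases + ifs))


def ioc_nr(cmd: int) -> int:
    """Low 8 bits of an ioctl command, i.e. what _IOC_NR extracts."""
    return cmd & 0xFF


print(literal_command_values(TABLE_SRC))   # nothing to match against
print(ioc_nr(0x5802))                      # 0x5802 is an arbitrary example value
```

Recovering the valid commands here requires reasoning about the value of `nr` and the size of the table, rather than just reading off compared constants.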
For the fuzzing results, SyzDescribe still makes a significant improvement in coverage compared with the syzkaller descriptions. However, although more code coverage makes sense as a goal, more coverage does not directly mean finding more bugs. SyzDescribe triggers fewer crashes than the syzkaller descriptions, because the manually curated descriptions can include things that are out of scope for the current SyzDescribe. Overall, the ground truth results are better than syzkaller with respect to both coverage and the number of crashes, and this is what SyzDescribe aims for. We also tried to generate syscall descriptions for the Android kernel of the Pixel 6. However, when fuzzing the Pixel 6 kernel, crashes are captured by the ramdump mode in the Pixel kernel, and there is no public documentation about it, so we do not know the root cause of those bugs. Here are some examples of the bugs found while fuzzing the Pixel 6 kernel. Next are the limitations and future work. First, some arguments take specific values or value ranges, which our tool does not handle yet. Second, we currently support only the most common syscall, ioctl, so we will support other syscalls, such as read and write, as the next step. Besides, there are other explicit dependencies to handle. We will also consider syscalls for other kernel components, for example network devices, file systems, and BPF, in order to fuzz them. We have open-sourced our tool in this repo. Finally, there are three takeaways from our presentation. We designed a new tool, SyzDescribe, to generate syscall descriptions for Linux kernel drivers, which can significantly improve the effectiveness of fuzz testing for device drivers and help identify more bugs in them. We evaluated the accuracy of SyzDescribe against other solutions, as well as manually created syscall descriptions, and showcased its effectiveness in the real world.
This is based on the data gathered after analyzing hundreds of kernel drivers. We also open-sourced the implementation of SyzDescribe, which can help other researchers and people in this field build upon our work. That's all. Thank you, Sarah. Any questions or discussion?
For the new syscall descriptions that were built, did those get fed back to syzkaller?
Sorry, please be louder, I cannot hear you clearly.
Oh, the new syscall fuzzing descriptions that were generated, did those get sent upstream to syzkaller for general use?
You mean whether the syscall descriptions we generate can be used directly by syzkaller?
Right.
Yeah, the syscall descriptions we generate can be used almost directly by syzkaller, because syzkaller has documentation about how to add a new syscall description. We need to at least generate some command values on the syzkaller side to make the new ones work. So they are almost directly usable in syzkaller.
My question is, is this a static tool or a dynamic tool? In the previous slides, I saw that you construct relationships between some runtime objects. But I think if you use static analysis, maybe you can get the structure of the input of core modules.
You mean whether this tool is a totally static solution?
Yes.
Yeah, our solution is totally static. In fact, when we designed the tool, we considered that a dynamic solution might be more straightforward and accurate in some cases, but it sometimes needs more engineering work, so we chose not to use one. To make the solution as simple as possible so that it can be used in the real world, we chose only the most basic requirements and used only static analysis. So if you have compilable source code, you can generate syscall descriptions without any other requirement, such as hardware.
But I see in your slides there is some relation construction, such as struct file and struct device, right?
Yeah.
How do you do that?
You can find the name string of the driver.
Yes. We can perform a static analysis. Our tool is also based on LLVM, so it is a static analysis at the LLVM IR level. We can get the type information and other necessary information, perform inter-procedural analysis, and track the data flow.
So for the kernel crashes that you found on the Pixel 6, I'm wondering if you've reported any of those to the Android bug bounty program.
In fact, I sent the picture to some of their employees, but they said that there are some files on the phone, and if they can get those files, they have internal tools to read the information and find the root cause. With only the information I had, they could do nothing.
Okay. You mentioned that a bunch of the false positives that you got were because the device numbers were dynamically generated in some cases?
Oh, yeah.
Do you have any ideas for how you would address that in the future if you continue to work on this?
"Dynamically" here means, for example, that there is a function that generates a new device number or major number, and the code then uses that number somewhere else. With traditional static analysis, it may be possible to track the data flow, but it is hard to be very accurate. So in our future work, which in fact I am doing now, I want to apply symbolic execution, which is more accurate but may face the path explosion issue. Since we already have the static analysis, we can perform symbolic execution based on its results, so we only explore a limited set of paths in the driver and can then track a more accurate data flow, or even value flow, of those dynamically constructed values to get more accurate results. In fact, I'm currently working on that now. So that's all.