All right. Good morning, everyone. This is Zhenpeng Lin. I am a PhD student from Penn State. My research focuses on exploitation and defense in complex software, especially OS kernels and browsers. In this talk, I'm going to talk about exploitation, specifically a new approach for more precise exploitability estimation. There are a lot of kernel bugs. Syzbot, the continuous kernel fuzzing platform, is a public platform that collects bugs found by syzkaller; you can find the bug report and the PoC to help you reproduce each bug. There are over 4,400 bugs found by syzkaller in the past four years. Currently, there are about 3,000 fixed bugs and over 1,000 open bugs. The open bugs haven't had any fix yet. These unfixed open bugs are potential public zero-days, and the number of open bugs keeps growing; it grew by 100 bugs this year. You would be interested to know their exploitability, because some exploitable bugs don't get enough attention from kernel developers and remain unfixed all the time. That leaves a large time window for attackers to write exploits for those public zero-days and attack kernel users. Studying the exploitability of fixed bugs is also interesting, because vendors don't adopt all the fixes from upstream. You can always find a lot of upstream fixes being ignored by vendors, and those ignored fixes could include critical fixes for exploitable bugs. Vendors just don't know which ones are critical, because they don't know the actual exploitability of those bugs. Knowing their exploitability gives attackers the potential to find zero-days in vendor kernels. From the defender side, we should know the exploitability before attackers do. In general, studying the exploitability of bugs gives you an estimation of the consequence of a bug, whether it's exploitable or not. 
When we know a bug is exploitable, we should fix it as soon as possible to reduce the time window of being exploited, because the attacker may already know the exploitability and have started writing exploits. It should have top priority compared to other bugs. Knowing the exploitability could also promote fix adoption for vendors. Vendors have the responsibility of protecting their users; when there is a critical fix for an exploitable bug, they should adopt the fix immediately. Studying exploitability could also guide the direction and design of hardening. When we find a new exploitation method that can always be used to exploit a certain type of bug, we say the exploitability of this type of bug has increased. From that, we know there is something missing in the kernel to mitigate the attack, and we know where the hardening effort should go. Knowing exploitability could have a huge impact on vulnerability management, but knowing exploitability is challenging. To prove the exploitability of a bug, the straightforward way is to write an exploit. However, the kernel is a piece of complex software; it often takes an expert days to write a working exploit. As a result, given the large number of bugs in the kernel, it's not realistic to spend that manual effort on them one by one. Another idea for determining exploitability is to prove unexploitability: if you can prove certain types of bugs are unexploitable, you can conservatively treat the rest as exploitable. However, proving unexploitability is even harder, because you have to go through all the execution paths to show that no path leads to exploitation. Some academic research has done this, proving unexploitability on toy examples, but when facing the kernel, which has millions of lines of code, it's currently not realistic. A practical way forward is to approximate exploitability. 
That is, based on the error the bug shows, we approximate the likelihood of exploiting the bug. But how do we approximate the likelihood of exploitation? Can we do it based on the read or write ability of UAF and out-of-bounds bugs? The answer is no. The read and write ability of some memory corruption bugs could give you a concrete memory corruption result, for example, overwriting function pointers. But what if they don't? Many UAF and out-of-bounds bugs cannot overwrite function pointers directly without some sophisticated exploitation methods, and this may make you underestimate the exploitability when you cannot find the overwriting ability directly. In general, most UAF bugs and out-of-bounds bugs are exploitable. For UAF bugs, when you find the original UAF object doesn't provide the overwriting ability you want, you can transfer the UAF to another object; after doing that, you can easily obtain another overwriting primitive. UAF bugs can also be used to leak kernel information and bypass mitigations. Recent research has shown that most UAF bugs in the kernel can be utilized to do so. For out-of-bounds bugs, a recent exploit write-up showed that an overflow of four zero bytes is powerful enough to achieve exploitation in the kernel, so it's very likely that other out-of-bounds bugs with limited overwriting ability could be exploited in a similar way. Conservatively, we treat all UAF bugs and out-of-bounds bugs as exploitable. Since the kernel lacks effective hardening against these types of memory corruption, the advance of offensive techniques has demonstrated that most UAF and out-of-bounds bugs are exploitable. As such, we approximate the likelihood of exploitation based on the type of bug. When a bug shows a UAF or out-of-bounds error, we think it is very likely to be exploitable. Other types of bugs, like warnings, general protection faults, and null-pointer dereferences, are less likely to be exploitable. 
This approach to determining exploitability is straightforward and practical. It doesn't need any complex analysis or computation: we see what the bug looks like, and then we approximate. But the question is how reliable this approximation is. The answer is that it can sometimes underestimate exploitability, because it's possible that a severe memory corruption bug just doesn't show memory corruption behavior, so you don't know it's a memory corruption bug. For example, you see a warning for a UAF bug, so you don't know the bug could actually cause a UAF error. It's also possible that a severe memory corruption bug only shows limited memory corruption ability. Even if you see memory corruption behavior, you may not be able to find the full memory corruption ability it actually has. For example, you only find a null-pointer dereference, which is considered unexploitable in the Linux kernel, but the bug could actually cause a UAF error. I have a detailed example of this in the following. So how can we improve the reliability? Our solution is to find the true effects of bugs. Here we discuss how errors of bugs are ignored by the fuzzing tool and how we solve that. There are several situations. The first situation is that syzkaller generates incomplete error reports. When it fuzzes the kernel with panic_on_warn set, it will miss the potential KASAN error right after a warning. On the left side is a bug report from syzbot. Since panic_on_warn is set, the kernel crashes right after generating the warning message. On the right side is a log of triggering the bug with the panic_on_warn flag disabled: a KASAN error shows up afterwards. Comparing these two logs, the KASAN error is not shown in the first one. The bug actually can cause a UAF error, but due to the panic_on_warn setting, the KASAN error is missing. Another issue in syzkaller is that it only reports the first error that the kernel triggers. 
This is another bug report from syzbot. In the kernel log, you can find two errors that the kernel triggers. However, syzkaller only generates a report for the first one, ignoring the second, and you will see that the report title for this bug is a warning. In this case, the exploitable KASAN error is ignored as well. Looking at how this happens: a bug can generate several errors after being triggered. Running the PoC, you will find several errors triggered in the kernel along the way. For example, the PoC triggers the root cause and then causes a warning in the kernel, and then the continued execution of the PoC makes the kernel generate a KASAN error. The solution to this is trivial: we don't have to change anything about the input; we just need to capture all the errors the kernel reports and present them accordingly. Other than that, there is another situation that we call multiple error behaviors of bugs, and this is the main content I'm going to present today. Multiple error behaviors means that when we trigger the bug differently, with different inputs following different execution paths, the bug shows different error behaviors. For example, a bug could be triggered and the kernel generates a warning message without any observable memory corruption. But if you trigger the same bug differently with a different input, you may find a UAF error in another place, which is totally different from the warning. This is different from finding errors ignored by syzkaller. In the previous examples, following the original execution path with the same input, you would be able to find the ignored UAF error afterwards. But here, you have to vary the execution path and find the path that leads to the KASAN error, which is totally different from the paths leading to the other behaviors. So to improve the reliability of exploitability estimation, we should expose as many error behaviors as possible to avoid underestimation. 
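The "capture all the errors" idea can be sketched as a small log scanner. This is a minimal illustration, not syzkaller's real report parser: the marker patterns are a simplified subset, and the sample log lines are made up.

```python
import re

# Simplified markers for common kernel error reports (illustrative subset;
# the real report parser uses a much larger set of patterns).
ERROR_PATTERNS = [
    ("WARNING", re.compile(r"WARNING: .*")),
    ("KASAN", re.compile(r"BUG: KASAN: [\w-]+ in .*")),
    ("GPF", re.compile(r"general protection fault.*")),
    ("NULL-deref", re.compile(r"BUG: kernel NULL pointer dereference.*")),
]

def extract_all_errors(log: str):
    """Return every error report found in a kernel log, in order,
    instead of stopping at the first one."""
    found = []
    for line in log.splitlines():
        for kind, pat in ERROR_PATTERNS:
            if pat.search(line):
                found.append((kind, line.strip()))
    return found

# Hypothetical log: a warning followed by a KASAN use-after-free.
log = """
WARNING: CPU: 1 PID: 1034 at net/core/stream.c:205
Call Trace: ...
BUG: KASAN: use-after-free in tun_napi_del
"""
print([kind for kind, _ in extract_all_errors(log)])  # both errors reported
```

A first-error-only reporter would title this bug as a warning; scanning the whole log surfaces the KASAN use-after-free as well.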
Let me show you an example of what multiple error behaviors look like in the kernel and discuss our approach to finding them. This is a real kernel bug, and here are some code snippets of the bug. Don't worry, it is really simple, and I will go slowly. There are three functions contributing to the bug. Let's go through them one by one. The first one is the tun_attach function. The flag here is controlled by the user. In this function, if the NAPI flag is enabled, it will initialize a timer and link the current NAPI object to the list in the device. Otherwise, it will not. The second function we need to look at is tun_detach. These two are a pair: attach and detach. Here, if the NAPI flag is enabled, it will cancel the timer and remove the NAPI object from the list. Otherwise, it will not. But eventually, it will destroy the file, and inside the destroy, it will free the NAPI object properly. The last function we need to look at is the function that frees the device. When it's called, it will go through the NAPI list to see which objects are still in the list and destroy them one by one. Now let's look at how the bug happens. Since the flag is controlled by the user, users can specify inconsistent flags between the tun_attach function and the tun_detach function. Specifically, if users disable the NAPI flag at tun_attach, the code here will not be executed, so there will be no timer and the NAPI object will not be in the list. Then the users enable the NAPI flag when calling tun_detach. It will cancel the timer, and inside the timer cancel function, it will dereference a pointer inside the timer object. But since the timer is not initialized, there is no valid pointer in the timer object. So the kernel will dereference a null pointer and generate a null-pointer dereference error. The bug can actually cause a totally different error behavior if we trigger it differently. This time, we enable the NAPI flag in the tun_attach function. 
Then we will execute the code here: we will have a timer initialized, and the current NAPI object will be linked into the list. And we disable the NAPI flag when calling tun_detach, so the code here will not be executed. Eventually, the NAPI object will still be in the list, but it will be freed by the destroy of the file afterwards. After this, we free the device. The kernel will go over the NAPI list in the device and access the NAPI object, which was freed in the tun_detach function. This will result in a UAF error. This case tells us that if we always follow the same execution path, we will never find the other exploitable behaviors of bugs. Now, look at the exploitability of those two errors. The exploitability of these two errors is totally different as well. The execution leading to the null-pointer dereference will always dereference a null pointer, since the timer object is not initialized and there is no valid pointer. In the kernel, mapping memory at the zero address is not allowed, so this case is very unlikely to be exploitable. However, exploiting the UAF error of this bug is straightforward. The attacker can spray objects to occupy the memory of the freed NAPI object and then trigger the free of the device, which will destroy the NAPI object that the attacker controls. In the destroy, the kernel will free the SKB in the NAPI object, and eventually it will call a function pointer inside the sprayed object, so the attacker can overwrite the function pointer to obtain control-flow hijacking ability. In summary, bugs may have multiple error behaviors. In this case, if we don't expose the UAF error of the bug and only rely on the null-pointer dereference behavior, we will probably underestimate the exploitability of this bug. But if we can find the multiple error behaviors, we can have a more precise exploitability estimation and ease the process of developing exploits. In this case, it goes from a null-pointer dereference to a very straightforward UAF exploitation. 
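To make the two paths concrete, here is a toy Python model of the attach/detach flag-inconsistency pattern described above. It is not the real kernel code, just a state machine that mirrors the logic: depending on which way the user-controlled NAPI flag is inconsistent, the same bug surfaces as a null-pointer dereference or as a use-after-free.

```python
class ToyDevice:
    """Toy model of the NAPI flag bug: the error behavior depends on
    whether the user-controlled flag is consistent between attach and
    detach (simplified, not the actual kernel logic)."""
    def __init__(self):
        self.timer_initialized = False
        self.napi_in_list = False
        self.napi_freed = False

    def attach(self, napi_flag: bool):
        if napi_flag:
            self.timer_initialized = True   # timer init
            self.napi_in_list = True        # link napi into device list

    def detach(self, napi_flag: bool):
        if napi_flag:
            if not self.timer_initialized:
                # timer cancel dereferences a pointer that was never set up
                return "null-pointer dereference"
            self.timer_initialized = False  # cancel timer
            self.napi_in_list = False       # unlink napi
        self.napi_freed = True              # destroy of the file frees napi
        return None

    def free_device(self):
        # freeing the device walks the napi list and touches each object
        if self.napi_in_list and self.napi_freed:
            return "use-after-free"
        return None

def trigger(attach_flag: bool, detach_flag: bool):
    dev = ToyDevice()
    dev.attach(attach_flag)
    err = dev.detach(detach_flag)
    return err or dev.free_device()

print(trigger(False, True))   # flag off at attach, on at detach
print(trigger(True, False))   # flag on at attach, off at detach
print(trigger(True, True))    # consistent flags: no error
```

The two inconsistent inputs take totally different paths to totally different errors, which is exactly why a single reproducer underestimates this bug.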
Finding multiple error behaviors of a bug is not trivial. As shown in the motivating example, to find a bug's other error behaviors, we need to find a totally different execution path that triggers the bug. We also don't want too many false positives, since that would make the exploitability estimation imprecise: if we don't have the input, we may not be able to reproduce the error behavior. As such, we don't consider using static analysis to find them. Instead, we use fuzzing to find these other error behaviors. There are several challenges we need to address when applying fuzzing to finding multiple error behaviors. First, we want to trigger the bug differently. In other words, we have to test the same code snippet over and over again, while the nature of fuzzing is to find as much new code coverage as possible. Aiming to maximize code coverage, the fuzzer will easily detour from the path to the bug and test other code that is new to it. So traditional fuzzing, by design, will not be efficient. An enhancement to make sure the fuzzer does not detour is to restrict the fuzzing scope. However, the question here is: how do we restrict the fuzzing scope, and what is the proper scope for each bug? We don't want a very narrow scope or a very large one. If it's too narrow, we will miss potential error behaviors; if it's too large, it will not be efficient. It should depend on what the bug is like. Here we propose our approach based on some observations about kernel bugs. First, the design of the Linux kernel is object-oriented: it uses different objects to represent different protocols, different drivers, different kernel modules, and so on. Second, based on our observation, most kernel bugs result from incorrect usage of kernel objects. In the motivating example, the bug results from incorrect usage of the tun object: the users can specify inconsistent flags between the tun_attach function and the tun_detach function. 
Third, when the incorrectness propagates to different places through the object, we will find an error behavior there. Based on these observations, we propose our approach: object-driven kernel fuzzing. The insight of our approach is to use the reachability of certain kernel objects as an additional fuzzing feedback. When the bug is triggered differently, the feedback is different. In addition, the object is an ideal restriction of the fuzzing scope: all the related code containing operations on the object will be included. There are two steps in our approach. We first use static analysis to find the objects; then we enhance kernel fuzzing using the reachability of these objects as additional feedback. This is the overview of the static analysis part. We take the crash report as input and then use backward taint analysis to identify the related kernel objects. After identifying the object candidates, we use a heuristic to filter out abstract objects. Abstract objects are objects in the abstraction layer, like file, which represents an open file. When used, they are instantiated into different concrete types of files, like tun_file in the motivating example. We construct a structure graph that connects all the structures based on their definitions: if one structure is referenced by another, there is an edge connecting them. Then we apply the PageRank algorithm to calculate their popularity. The top 5% of objects we treat as abstract objects, and we do not include them in the fuzzing. Here is an example of how we identify the objects for fuzzing. We start from the code where the crash happens; in this case, it's a null-pointer dereference in the hrtimer_active function. Then we backward-analyze the def-use chain of the data. Along the way, we collect the type information of the data flow. In this example, we first find the timer structure in the hrtimer_active function. 
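The abstract-object filter can be sketched with a tiny PageRank over a structure-reference graph. This is an illustrative toy, not the tool's implementation: the struct names and reference edges below are invented, and a real structure graph over the whole kernel would have tens of thousands of nodes.

```python
def pagerank(edges, nodes, d=0.85, iters=50):
    """Plain power-iteration PageRank; dangling nodes spread rank evenly."""
    rank = {n: 1.0 / len(nodes) for n in nodes}
    out = {n: [dst for src, dst in edges if src == n] for n in nodes}
    for _ in range(iters):
        new = {n: (1 - d) / len(nodes) for n in nodes}
        for n in nodes:
            if out[n]:
                share = d * rank[n] / len(out[n])
                for dst in out[n]:
                    new[dst] += share
            else:  # no outgoing references: distribute rank to everyone
                for m in nodes:
                    new[m] += d * rank[n] / len(nodes)
        rank = new
    return rank

# Hypothetical structure graph: an edge A -> B means struct A has a
# field referencing struct B, so widely referenced structs rank high.
nodes = ["file", "tun_file", "napi_struct", "hrtimer", "sock"]
edges = [("tun_file", "file"), ("sock", "file"),
         ("napi_struct", "hrtimer"), ("tun_file", "napi_struct"),
         ("tun_file", "sock")]
rank = pagerank(edges, nodes)
# Treat the most popular structure as abstract and exclude it from fuzzing.
most_popular = max(rank, key=rank.get)
print(most_popular)
```

In this toy graph, the generic `file` struct ends up most popular because multiple concrete structs reference it, which is the intuition behind dropping the top few percent as too generic to represent a specific bug.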
Then we find a napi_struct and a tun_file struct in the tun_attach function. After this, we filter out the timer structure because it's too popular to represent the bug. With the objects in hand, we then use a customized compiler to instrument the basic blocks that involve operations on the critical objects. The instrumentation sends object feedback to the fuzzer when it's executed. So in addition to code coverage, we also have object coverage as feedback for the fuzzer. During fuzzing, only inputs reaching these objects are interesting to the fuzzer, and the fuzzer will try to find as much object coverage as possible. We implemented our approach as a tool based on syzkaller and ran a large-scale experiment with it. We randomly chose 60 kernel bugs found from 2017 to 2021. Each bug comes with a patch. We set up a separate virtual machine for each bug, and for the comparison, we ran both syzkaller and our tool on these cases for seven days. After seven days, we manually analyzed the newly found bug reports in each case and identified which ones tied back to the same bug we were analyzing. In this way, we were able to evaluate how many error behaviors each tool found. Our results are twofold. The first part I want to discuss is exploitability escalation. Among the 60 kernel bugs, 44 of them are less-likely-to-be-exploitable bugs, which show WARNING, general protection fault, INFO, null-pointer dereference, and so on. In seven days, syzkaller identified that four of them have other error behaviors that are likely to be exploitable. Our tool identified that 26 of them can actually cause likely-to-be-exploitable behaviors. In addition, we found that on average, each bug in the dataset has three error behaviors. The second part is exploit potential. In the dataset, 16 of the bugs are likely-to-be-exploitable bugs. We added them to the dataset because we wanted to evaluate our tool's ability to find more exploit potential. 
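The object-coverage feedback loop can be modeled with a toy coverage-guided fuzzer. This is a sketch under heavy simplification: the `execute` oracle, the block IDs, and the mutation are all made up; the point is only the feedback rule, where inputs that never touch an instrumented object block are discarded, and inputs contributing new code or object coverage are kept in the corpus.

```python
import random

# Basic blocks instrumented because they operate on the critical object
# (hypothetical IDs for illustration).
OBJECT_BLOCKS = {10, 11, 12, 13}

def execute(program: int):
    """Pretend executor: maps an input to the set of covered block IDs."""
    covered = {program % 10}              # some generic code coverage
    if program % 3 == 0:
        covered.add(10 + program % 4)     # this input reaches an object block
    return covered

def fuzz(rounds=200, seed=1234):
    rng = random.Random(seed)
    corpus = [0]
    code_cov, obj_cov = set(), set()
    for _ in range(rounds):
        parent = rng.choice(corpus)
        child = parent + rng.randint(1, 5)   # trivial "mutation"
        covered = execute(child)
        obj_hits = covered & OBJECT_BLOCKS
        if not obj_hits:
            continue  # object unreached: not interesting, scope restricted
        # Keep inputs that contribute new code OR new object coverage.
        if (covered - code_cov) or (obj_hits - obj_cov):
            corpus.append(child)
            code_cov |= covered
            obj_cov |= obj_hits
    return obj_cov

print(sorted(fuzz()))
```

Because anything that misses the object is dropped, the corpus stays concentrated on inputs that exercise the buggy object's operations, which is the scope restriction described above.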
This ability helps us find better primitives when writing exploits for bugs. For example, you may find a UAF, but it requires a specific privilege to trigger. There may be other execution paths that don't require any privilege to trigger the same UAF, and our tool can help you find such paths automatically. In comparison, syzkaller found that one of them has other exploitable behaviors, while our tool successfully identified that eight bugs have other exploit potential. In summary, our tool is much more efficient and effective than syzkaller. If you look at the time used to find the new error behaviors, most of the new error behaviors identified by our tool took only minutes; syzkaller took days to find them. In this sense, our tool is much more efficient than syzkaller. I want to highlight several takeaways here. First, a kernel bug can have multiple error behaviors. In our experiment, we found that for 34 out of the 60 bugs in our dataset, our tool could find at least one additional error behavior. We believe this is not an issue unique to the kernel; bugs in other complex software like browsers may share the same issue. Second, multiple error behaviors contribute to more precise exploitability estimation. Multiple error behaviors represent different effects of a bug, and exposing all possible error behaviors helps us understand the worst effect of a bug. Our experiment showed that 26 out of 44 bugs in our dataset could have a more precise exploitability estimation through exposing their other error behaviors. This not only benefits the defender but also potentially benefits the attacker: an attacker can easily find the most straightforward exploitation path through another error behavior exposed using our tool. I have an upcoming talk at Black Hat Europe. In that talk, I will talk about a real-world exploitation in the CentOS kernel. 
The bug was first found with a warning error; using our tool, we successfully turned this seemingly non-memory-corruption bug into an exploitable UAF, and the UAF demonstrated successful exploitation with all the mitigations bypassed. It's really interesting. Third, we have proposed an automatic approach that takes a bug report as input to identify objects and then utilizes those objects as additional feedback to the kernel fuzzer to find multiple error behaviors. So finding multiple error behaviors automatically is possible. Last, in comparison to syzkaller, we showed that our approach utilizing kernel objects as feedback is much more effective and efficient. That's all of my talk. I would like to thank my collaborator, Yueqi Chen, for his help, and my advisor, Xinyu Xing, and my mentor, Kang Li, for their guidance and advice. This is Zhenpeng Lin from Penn State. My research focuses on exploitation and defense, especially for OS kernels and browsers. I'm looking for an internship for next summer, so if you know of any openings, feel free to send me a message. Here is my Twitter, and you can find my other information on my personal website. Yeah, now I'm happy to take questions. [Audience question] Which one? The very first slide? This one? Okay. [Audience] How could we know that something is a bug and not know its exploitability? Does that make sense? Okay, so for exploitability: if a bug is exploitable, we say the bug has exploitability. But not all bugs are exploitable. For a successful exploitation, we mean the attacker can achieve the operations they want on the system, but not every bug allows them to do that. Thank you.