Hello, everyone. Thanks for joining me today, either virtually or in person. My name is Weiteng Chen. I'm a fifth-year PhD student from the University of California, Riverside. Today, I would like to introduce one of my works on triaging kernel out-of-bounds write vulnerabilities, which was published at USENIX Security 2020. According to syzbot, Google's continuous kernel fuzzing platform, there were more than 1,000 Linux kernel bugs discovered during one single year. Given such a large number of bugs to deal with and the lack of human resources, it's impractical to fix all of them in a timely manner. That data was from almost three years ago. How about now? This is a screenshot I took about one week ago. As we can see, there are still more than 1,000 open issues, and among them, about 75% were reported more than 100 days ago. Even for the fixed bugs, the average patch delay is still about 51 days. Although we cannot fix all of them, we can at least prioritize the fixes for the bugs that are definitely exploitable. To that end, one promising direction is to automate the process of vulnerability assessment. One plausible but trivial solution is to look at the crash type. For example, use-after-free and out-of-bounds write are considered the most severe bug types because they are more likely to be exploited. That being said, it still requires more context to determine whether any particular vulnerability can truly be exploited or not, as I will explain in the following slides. In this work, we focus on out-of-bounds (OOB) write, which is one of the most common classes of security bugs. Given one PoC that can trigger an OOB bug, we would like to facilitate the process of generating instruction-pointer-hijacking primitives, which clearly demonstrate the highest severity of the bug and the need for a fix as soon as possible. Meanwhile, the system should also help analysts understand the root cause and the security impact. This is a simplified version of CVE-2018-5703.
As we can see, there is a syscall handler, socket, defined at line 8, which allocates an object of type type1. We refer to this kernel object as the vulnerable object. There is a type confusion bug at line 11 inside the syscall accept, where the vulnerable object is cast into an object of a different type, type2. If we take a close look at the definitions of these two types at lines 1 and 2, we can see that the structure type2 actually contains a field of type type1, so the size of type2 is larger than the size of type1. Therefore, when we access the second field, option, at line 12, we have an out-of-bounds write. Also, we can find that the value we use to overwrite the vulnerable object at line 12 comes from the global variable gsoc, which is defined at line 6, and we can see that the field option is initialized to some constant. At this point, we may conclude that this bug only allows the attacker to overwrite past the vulnerable object at a fixed offset with a constant, which sounds less exciting. However, that's not true if we look at the whole picture. So far, we have only looked at the first two syscalls, socket and accept, because our PoC only needs to invoke these two to trigger the bug. If we explore some other portion of the kernel code, we can see that there is actually another syscall, setsockopt, defined at line 13, in which we can change the value of the global variable gsoc, which in turn controls what we can write at line 12. We can then exploit this more powerful write primitive to easily achieve root privilege. Therefore, depending on the context, we may have different security impacts, even though we trigger the same bug at the same line. Considering this example, what are the challenges we face in implementing such a tool to evaluate security impact? Well, the first challenge is that the initial PoC we have does not necessarily manifest the complete capability of the bug.
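The pattern on the slide can be sketched in userspace C. This is only an illustration of the type-confusion shape described above; the struct layouts, field names, and the constant are stand-ins, not the real kernel code from the CVE:

```c
/* Userspace sketch of the slide's type-confusion OOB write.
 * All names (type1, type2, gsoc) mirror the slide, not real kernel code. */
#include <stdlib.h>

struct type1 { long a; };                           /* vulnerable object */
struct type2 { struct type1 inner; long option; }; /* larger, confused type */

/* Global template object; option starts as a constant. */
struct type2 gsoc = { .inner = { 0 }, .option = 0x1234 };

/* socket(): allocates the vulnerable object, sized for the SMALLER type. */
struct type1 *do_socket(void) {
    return malloc(sizeof(struct type1));
}

/* accept(): type confusion -- the type1 object is treated as a type2,
 * so the store to ->option lands past the end of the allocation. */
void do_accept(struct type1 *obj) {
    struct type2 *confused = (struct type2 *)obj;
    confused->option = gsoc.option;                 /* out-of-bounds write */
}

/* setsockopt(): turns the constant into an attacker-controlled value. */
void do_setsockopt(long val) {
    gsoc.option = val;
}
```

The key point matches the talk: with only socket and accept, the write value is a fixed constant, but once setsockopt is in the picture, the same store at the same line becomes an attacker-controlled write.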
To give you some sense of how common this problem is, we actually have a follow-up work that measures this phenomenon, and we found that among more than 1,000 seemingly low-risk bugs, for example out-of-bounds reads and warnings, 147 of them actually have high-risk impacts. For those who are interested, please refer to our paper for more details. Here are some more examples of out-of-bounds write vulnerabilities. The first one is from CVE-2016-6187. It's basically an off-by-one bug, and it only allows us to overwrite one byte with the value 0. But the size of the vulnerable object comes from user input, which is controllable by the attacker. This can be handy when developing exploits. The bottom one is from CVE-2017-7184. Although the size of the vulnerable object is fixed and the attacker can only set one bit, the offset of the overwritten address is controllable by the attacker, and the bug can be triggered multiple times to set multiple bits. As we can see, for different vulnerabilities, the capability of the overflow varies significantly. So the second challenge is how we can model the capability of an overflow. Intuitively, there are three dimensions we can consider: how far the write can reach, how many bytes we can overwrite, and what values can be written. Going back to the motivating example, we know the size of the vulnerable object, which is the size of the structure type1. We also know the offset from the starting address of the vulnerable object to the overwritten address, which is also the length of the structure type1. This vulnerability allows us to overwrite 8 bytes with user-controllable data. However, the effect of this OOB write really depends on the object following the vulnerable object, so we refer to that object as the target object. In the motivating example, there is another structure, type3, with a function pointer at the beginning, defined at line 3.
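The three dimensions just mentioned can be written down as a record, which is roughly what a capability summarization carries per write. The field names here are my own illustration, not the tool's actual data structures:

```c
/* Sketch of one OOB-write capability along the three dimensions from
 * the talk: reach (offset), length, and writable values. Names are
 * illustrative, not from the actual system. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct oob_capability {
    size_t  vuln_size;   /* size of the vulnerable object (0 = symbolic) */
    size_t  offset;      /* distance from object start to the write */
    size_t  length;      /* number of bytes written */
    bool    value_ctrl;  /* is the written value attacker-controlled? */
    int64_t fixed_value; /* the single value, when value_ctrl is false */
};

/* CVE-2016-6187 as described: one byte, fixed value 0, while the
 * object size (and hence the reach) comes from user input. */
static const struct oob_capability cve_2016_6187 = {
    .vuln_size = 0, .offset = 0, .length = 1,
    .value_ctrl = false, .fixed_value = 0,
};
```

Note how the two CVEs above differ along different axes: one has a controllable size but fixed value, the other a controllable offset but only a single bit.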
We can see that if we can allocate an object of type3 and put it next to the vulnerable object, then we can trigger this overflow to modify the function pointer in the target object, and then we can easily hijack the control flow by invoking the last syscall, defined at line 19, to trigger the dereference of the function pointer we just modified. So now the question becomes how we can put these two objects together. Well, there is actually a well-known technique called heap feng shui. The idea is that we can manipulate the heap layout because the behavior of the Linux allocator is deterministic by default. Of course, there is an option in the kernel configuration that allows you to change this default behavior. Since there is nothing new about it, and this is not the focus of our work, the details are omitted in the interest of time. Given that each target object is unique in terms of the critical field we want to overwrite and how we can exploit it, the challenge is how we can evaluate exploitability against different target objects, and how we can do it efficiently. In this work, we focus on OOB write vulnerabilities. We assess an OOB write vulnerability by trying to exploit it and generate IP-hijacking primitives. Some modern defenses, like kernel address space layout randomization (KASLR), SMEP, and SMAP, are considered out of scope. Also, we use some well-known heap feng shui strategies rather than exploring new ones. Here's the overview of our system, which consists of four components. Given one PoC, our goal is to assist exploit generation. Although the whole process of exploit generation sounds complicated, the design principle here is that we can actually decouple the capability summarization from the rest of the pipeline. In this way, we reduce the search space to the point where we only need to check if there is a target object that matches the capability of some writes. This is actually exactly how security experts develop an exploit.
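The determinism that heap feng shui relies on can be shown with a toy bump allocator. This is of course not the kernel's SLAB/SLUB allocator, just a minimal illustration of the layout goal: back-to-back allocations of the same size end up adjacent:

```c
/* Toy deterministic allocator: like-sized allocations made back to
 * back land adjacently, which is the property heap feng shui exploits.
 * This is a userspace illustration, not the kernel allocator. */
#include <stddef.h>
#include <stdint.h>

static uint8_t arena[4096];
static size_t  cursor;

static void *toy_alloc(size_t size) {
    void *p = &arena[cursor];
    cursor += size;          /* next allocation starts right after */
    return p;
}
```

In a real exploit, the "allocations" are triggered indirectly through syscalls that make the kernel allocate the vulnerable and target objects in the same size class.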
So the first step is to perform vulnerability analysis to pinpoint the vulnerable object, as well as all the vulnerability points where out-of-bounds accesses occur. In addition to using the kernel address sanitizer (KASAN), we use symbolic execution to collect more information and improve precision. This is because KASAN sometimes misidentifies the vulnerable object if the overwritten address is not close to it. By using symbolic execution, we can trace back any pointer to figure out which object is the vulnerable one. I will give more details later. We then also use symbolic execution to summarize the capability of one particular PoC and store the result in a database. For the same bug, we may have different PoCs, and hence different capability summarizations. Now that we have the database of capability summarizations, we can check the target objects one by one to see if any of them matches the capabilities. In the case where the vulnerability can be triggered multiple times, our system allows combining different capabilities. If we find a solution, we can synthesize an exploit. Otherwise, if we are not satisfied with the initial PoC we already have, we can explore around it to find more interesting inputs and then repeat the process. We extended Syzkaller to support this crash exploration mode. Now, a little bit of background on symbolic execution before we dive into the details. Symbolic execution is a program analysis technique that emulates the execution of a program. The key idea is that each variable can now be associated with a symbol. For example, we can assign a symbol alpha to an int variable i, and each symbol represents a set of values. For each program statement, instead of computing a concrete value, it generates formulas over symbols as the result. When encountering a conditional branch, it can use a constraint solver to check whether both paths are feasible.
If so, it forks the execution into two states, each of which adds one additional path constraint. In our work, since we already have the PoC, we know the path the program will execute, so we can guide the symbolic execution to follow that exact path without forking. Next, we summarize the capability, which is the crucial part of the work. Recall the three dimensions I mentioned earlier: for each out-of-bounds write, we can extract its offset, length, and value with symbolic execution. For example, when we first encounter the kmalloc call at line 9, we assign a unique symbol to the returned pointer. Then, when symbolic execution reaches the vulnerability point at line 12, the first operand of the write is the overwritten address, which is a symbolic expression in the form of a base pointer plus some offset. The base pointer is the unique symbol we assigned to the vulnerable object earlier, and the offset is just some constant. The second operand of the write is also a symbolic expression, derived from the user input, and we can use constraint solving to get the valid value range for the input. One single path executed by the PoC can contain multiple out-of-bounds writes, and thus we collect a set of out-of-bounds write summarizations. In the motivating example, we concluded that the OOB value is controllable by the attacker, but that doesn't mean its value is arbitrary. In fact, there is a check at line 14 which prohibits the value from being negative one. Since path constraints can affect the value, we consider them part of the summarization. Similarly, the size of the vulnerable object can affect the offset and should be included as well. If we go back to the initial PoC that only executes the first two syscalls, socket and accept, the summarization is shown at the bottom of the slide, in which the size of the vulnerable object is the length of the structure type1.
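A write summarization can be sketched as the (offset, length) pair plus a predicate standing in for the collected path constraints. Here the predicate encodes only the one constraint from the running example, "value != -1"; everything else is illustrative:

```c
/* Sketch of one summarized OOB write: where it lands, how long it is,
 * and which values the path constraints permit. The concrete numbers
 * and the "!= -1" rule come from the running example on the slides. */
#include <stdbool.h>
#include <stdint.h>

typedef bool (*constraint_fn)(int64_t value);

struct write_summary {
    uint64_t      offset;   /* from the vulnerable object's start */
    uint64_t      length;   /* bytes written */
    constraint_fn allowed;  /* path constraints over the written value */
};

/* The check at line 14 of the example forbids -1. */
static bool not_minus_one(int64_t v) { return v != -1; }

static const struct write_summary motivating_write = {
    .offset = 8, .length = 8, .allowed = not_minus_one,
};
```

In the real system the constraint is a symbolic formula handed to an SMT solver rather than a hand-written C predicate, but the role is the same: it restricts which payloads the write can actually produce.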
The path constraint set is empty, and there is only one OOB write, which overwrites 8 bytes with a constant. After exploring around the initial PoC, we may get a new PoC that gives us a more complete capability. Its summarization is shown at the bottom left of the slide, where we can see that the value used to overwrite the vulnerable object is now a symbolic expression whose value can be arbitrary, except negative one, due to the path constraint we collected. We can even define a partial order over these capabilities to compare them, which helps us filter out redundant summarizations. In order to explore other parts of the kernel code related to the vulnerability, we propose a novel capability-guided fuzzing solution. Existing coverage-guided fuzzing solutions are ineffective here because they only focus on coverage and are insensitive to the capabilities we care about. Therefore, we use dynamic instrumentation to hook all the vulnerability points and collect more information, such as the offsets, lengths, and values, as feedback, and we maintain different queues for the different kinds of feedback. As I mentioned earlier, the effect of the overflow really depends on the target object we choose. In general, we consider three types of objects: objects with function pointers, objects with data pointers, and objects with special fields, like UIDs and reference counters. For each target object, we create a template which specifies the size of the object, the offset of the critical field we want to overwrite, and the type of the critical field, which in this case is a function pointer. The template also specifies the payload for the critical field, which is the new value this function pointer should become. It also requires some code snippets that can trigger the allocation of the target object and the dereference of the function pointer. These are useful when synthesizing the exploit.
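The target-object template just described can be sketched as a struct, with the alloc/deref code snippets represented as callbacks. The concrete values and names below are placeholders of my own, not entries from the system's actual database:

```c
/* Sketch of a target-object template per the talk: object size, the
 * critical field's offset and kind, the payload to plant, and snippets
 * that allocate the object and trigger the dereference. Illustrative. */
#include <stddef.h>
#include <stdint.h>

enum field_kind { FIELD_FUNC_PTR, FIELD_DATA_PTR, FIELD_SPECIAL };

struct target_template {
    const char     *name;
    size_t          size;         /* size of the target object */
    size_t          crit_offset;  /* offset of the critical field */
    enum field_kind kind;         /* function ptr, data ptr, or special */
    uint64_t        payload;      /* value the field should become */
    void (*alloc_snippet)(void);  /* code that allocates the object */
    void (*deref_snippet)(void);  /* code that dereferences the field */
};

static void alloc_stub(void) { /* e.g., a syscall that allocates it */ }
static void deref_stub(void) { /* e.g., a syscall that uses the ptr */ }

static const struct target_template example_target = {
    .name = "type3", .size = 16, .crit_offset = 0,
    .kind = FIELD_FUNC_PTR, .payload = 0xdeadbeef,
    .alloc_snippet = alloc_stub, .deref_snippet = deref_stub,
};
```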
In this work, we manually collected some promising target objects used in the wild, but there is also some other interesting work on automating this process. If you are interested, please refer to the two papers listed here. Recall the heap feng shui technique, with which we can pre-arrange the heap layout so that the target object is placed right after the vulnerable object. Given the capability summarizations and one particular target object, we construct a memory model containing only these two objects, as shown in the figure. At this point, the memory model is filled with concrete values, so there is no difference between the memory model and a regular memory region in a running program. We then update this memory model with the OOB writes; specifically, we update it with the symbolic expressions from the OOB writes in the summarizations. For example, the write shown on the right part of the slide can reach the first 8 bytes of the target object, and the value used to update the memory model is symbolic. As we can see, the overwritten memory is now marked in black, and its content is a symbolic expression. This memory model is no different from a regular memory region, except that we can store symbolic expressions into it and read symbolic values back from it. We can then query the constraint solver to see whether the critical field can be overwritten with the desired payload with respect to the path constraints. From the memory model, we can see that the memory storing the function pointer of the target object is exactly the overwritten region where we stored the symbolic value, and that value can be arbitrary except negative one, due to the path constraint. Clearly, this function pointer can be modified to the desired payload, and our system can produce a solution for it.
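The matching step can be made concrete with a toy check. This sketch replaces the constraint solver with two hand-written predicates: a range-coverage test and the "value != -1" path constraint from the running example. Everything beyond those two facts is illustrative:

```c
/* Toy version of the memory-model query: lay the target object right
 * after the vulnerable object, apply one summarized write, and check
 * whether it covers the critical field AND permits the payload. */
#include <stdbool.h>
#include <stdint.h>

/* Does the write [w_off, w_off+w_len) fully cover [f_off, f_off+f_len)?
 * All offsets are relative to the vulnerable object's start. */
static bool covers(uint64_t w_off, uint64_t w_len,
                   uint64_t f_off, uint64_t f_len) {
    return w_off <= f_off && f_off + f_len <= w_off + w_len;
}

/* Stand-in for the path constraints: the example forbids -1. */
static bool payload_ok(int64_t payload) { return payload != -1; }

static bool can_hijack(uint64_t vuln_size, uint64_t w_off, uint64_t w_len,
                       uint64_t crit_off, int64_t payload) {
    uint64_t field = vuln_size + crit_off;  /* field's absolute offset */
    return covers(w_off, w_len, field, sizeof(uint64_t)) &&
           payload_ok(payload);
}
```

With the example's numbers (an 8-byte vulnerable object and an 8-byte write at offset 8), a function pointer at the start of the target object is reachable, while one at offset 8 into the target object is not, which is exactly the type4 failure case discussed next.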
On the other hand, if we have a target object like structure type4, which has a function pointer at offset 8, the constraint solver would fail to find a solution, since the vulnerability does not allow us to tamper with the second 8 bytes of the target object. So now let's go through a concrete example to see how this works. This is CVE-2016-6187, an off-by-one bug where the attacker can only overwrite one byte with the fixed value 0. The PoC shown here is written in the language supported by Syzkaller. It only needs to invoke two syscalls, open and write, to trigger this bug. By taking advantage of Syzkaller, we can convert it into C code with source-level instrumentation. Since we need to run it under symbolic execution, we need to make the syscall arguments symbolic. However, we don't want to make all syscall arguments symbolic, which would slow down performance significantly. Instead, we selectively make some arguments symbolic according to their types declared in the Syzkaller specifications. For instance, types like file descriptors and constants do not need to be symbolic. Now, with the PoC, we can perform symbolic execution to precisely identify the vulnerable object and the out-of-bounds writes. In the report, we also show the call site of the allocation of the vulnerable object and the stack trace for each out-of-bounds write, which can be helpful for understanding the bug. Now that we know where the out-of-bounds accesses occur, we run the PoC under symbolic execution and hook all the corresponding instructions to extract symbolic expressions in order to summarize the capabilities. As we can see, the summarization is shown at the bottom of the slide, in which the size of the vulnerable object and the offset of the OOB write are both unconstrained symbols derived from the user input. To evaluate exploitability, we need to choose a suitable target object.
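The selective-symbolization policy just described boils down to a per-type decision. The enum below is my stand-in for the type information in the Syzkaller syscall specifications, not the tool's real representation:

```c
/* Sketch of the selective-symbolization policy: resource handles and
 * spec-fixed constants stay concrete; other argument kinds become
 * symbolic. The enum is an illustrative stand-in for syzlang types. */
#include <stdbool.h>

enum arg_kind { ARG_FD, ARG_CONST, ARG_INT, ARG_BUFFER, ARG_LEN };

static bool should_symbolize(enum arg_kind kind) {
    switch (kind) {
    case ARG_FD:     /* file descriptor: value is a runtime handle */
    case ARG_CONST:  /* fixed by the specification */
        return false;
    default:         /* plain integers, buffers, lengths, ... */
        return true;
    }
}
```

Keeping descriptors and constants concrete avoids paying solver cost on arguments whose values cannot meaningfully vary.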
The structure key is a commonly used target object, which contains a reference counter at the beginning, at offset zero. Although the reference counter is a four-byte integer, this vulnerability allows us to modify its least significant byte, which is already enough to set the reference counter to zero, because the original value of the counter is usually small and controllable. Thus, our system can produce a solution specifying concrete values for the syscall arguments we made symbolic earlier. By setting this reference counter to zero, we effectively convert the bug into a use-after-free, and the structure key actually has some data pointers we can leverage to achieve control-flow hijacking. The last step is to synthesize the exploit. We first concretize the symbolic arguments based on the solution we got, and then we add some more instrumentation to perform the heap feng shui strategy, ensuring that we can put the target object next to the vulnerable object. We evaluated our system against 17 different out-of-bounds write vulnerabilities in the Linux kernel. Seven are from the CVE database, and the rest, without CVE numbers, are from syzbot. For the seven CVEs, four have public exploits; with our system, we could generate seven more. Note that we count the number of exploits based on the target objects we choose. For those without CVE numbers, only one has been studied and thus has a public exploit; with our system, we also generated seven more exploits. As shown in this table, we break down the time cost for each step. The solving time per target object varies from as little as one second to about three minutes, indicating that our system can efficiently search through hundreds of target objects. The time cost for symbolic execution is also reasonable: all of the cases finish within a few minutes.
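The one-byte trick on the reference counter is easy to check in isolation. This sketch assumes a little-endian layout (as on x86) and simulates the single out-of-bounds byte write of 0 described above:

```c
/* Simulate the off-by-one write of 0 onto the least significant byte
 * of a 4-byte refcount. On little-endian machines, if the current
 * count is small (< 256), this drops it to exactly zero, which is what
 * turns the bug into a use-after-free. */
#include <stdint.h>
#include <string.h>

static uint32_t zero_lsb(uint32_t refcount) {
    uint8_t bytes[4];
    memcpy(bytes, &refcount, 4);  /* little-endian byte order assumed */
    bytes[0] = 0;                 /* the single OOB byte write of 0 */
    memcpy(&refcount, bytes, 4);
    return refcount;
}
```

For a counter holding, say, 3, zeroing the low byte yields 0; the kernel then believes the object is unreferenced and frees it while references still exist.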
For the fuzzing part, we attempted to compare our solution with the vanilla Syzkaller, but it is not designed to explore around a crashing input, and thus failed to produce any result within the limited time budget. Note that we only apply fuzzing when necessary, meaning that our system only performs fuzzing when it fails to produce a solution for the initial PoC. Some takeaways. First, one vulnerability can manifest different security impacts, and fuzzing can be a good strategy to expose them. Second, different out-of-bounds vulnerabilities exhibit a wide range of capabilities, so it's necessary to model capability formally. Lastly, kernel exploits are multi-interaction in nature, which allows the exploit crafting process to be modular. Although in this work we only focus on OOB write, the principle that we can separate capability summarization from exploitability evaluation can also be applied to other types of vulnerabilities, such as use-after-free and double free. That's all for my talk. Thanks for listening. I'm happy to take any questions. Yeah, go ahead. [Audience question, partially inaudible, about what happens when the solver cannot find a solution or cannot determine the result.] Yeah, so actually, if it times out, then we consider there to be no solution, and we interpret that as no solution. Right. So it's not guaranteed that we can always find a solution. Any other questions? OK, cool. Thanks, everybody.