Hello. Thanks, Elena. Good morning, everyone. I'm Tony from Intel. Glad to see you here, and thanks for coming to my talk today. Today I'm going to talk about exploitable Linux kernel vulnerabilities from 2017 to 2019.

Here is the agenda. First of all, before the technical discussion, I will show you some background information about Linux kernel vulnerabilities. Then I will introduce the basic kernel exploitation techniques. Next are the case studies. I chose two exploit cases which I think are interesting. One is the eBPF verifier bypass and the other is the timerfd race condition. And the last part is the conclusion.

I think everyone sitting here today is already familiar with the Linux kernel and its main distribution families. But for a newcomer to Linux, one of the most confusing things is how many distributions and versions there are. Maybe Ubuntu, Red Hat Enterprise Linux, or Android are the ones most people are familiar with, but there are hundreds of others as well. As we all know, these distributions have more or less modified the Linux kernel to meet their customization needs. So how can we ensure the security of so many distributions? I think there is no universal solution. But the least we should do is pull in the patches from the mainline kernel in a timely manner and enable more of the widely deployed security mitigations. That sounds simple, but in practice it is not.

Then I wish to quickly review the Linux kernel vulnerability trends from 2017 to 2019. Going by the number of CVEs, 2017 is a very special year. That year's CVE count is twice that of the second-highest year. And from the 20-year CVE trend chart below, we can see that the total number of CVEs was slowly increasing before 2017 and surged in 2017. After that, the number of CVEs returned to normal and tends to be stable. I also summarized the CWEs of the Linux kernel vulnerabilities in the past three years. CWE is the Common Weakness Enumeration.
A CWE is more like a category describing where a vulnerability comes from, and each CVE belongs to a CWE. Here I list the top six CWEs. Basically, these CWE types are also the common types of exploitable vulnerabilities. For example, CWE-416, use-after-free, which is well known and has a high probability of exploitation for privilege escalation.

On this page, I list some representative kernel vulnerabilities along a timeline. There are about one or two representative kernel vulnerabilities each year which can be used to get universal root. Of course, this is only based on my personal opinion, and it does not include the vulnerabilities we are going to talk about today.

Although we have been talking about CVEs in the previous slides, what we need to emphasize here is that Linux security fixes are not equal to CVEs. Actually, only a small part of kernel fixes get CVEs. Just like Greg said, who is a stable Linux kernel maintainer and a Linux Foundation Fellow: if you are not using a stable or long-term kernel, you have an insecure system.

Finally, at the end of the overview part, I will give some notes about the following technical discussion. The talk is mainly focused on the overall Linux kernel exploitation techniques. The selected cases are based on the x86 and ARM architectures, with Ubuntu and Android as the target platforms. And of course, all the vulnerabilities mentioned next have already been made public, mitigated, and fixed.

So the next part is kernel exploitation for privilege escalation. First, let's take a look at the definition of privilege escalation, described in a simple sentence: a user receives privileges they are not entitled to. Of course, there are multiple ways to achieve privilege escalation on Linux. For example, exploiting root services, weak system configuration, and even available passwords. Besides this, the attacker can also exploit SUID executables or sudo rights.
I list CVE-2019-14287 here because it is a sudo security policy bypass vulnerability that was disclosed just a few days ago. Finally, what I want to introduce is exploiting kernel vulnerabilities. It's our main topic today.

To make the following case studies easier to understand, let's have a look at the common rooting flow on Linux. The whole flow consists of two stages. The purpose of stage one is to get an arbitrary kernel memory write, and stage two is to get privilege escalation. Of course, there are some steps to bypass the related mitigations in the process. For the vulnerability exploit itself, there are often different techniques for different vulnerability types. For the other parts, like controlling the execution flow and getting arbitrary memory write and privilege, there are also some general tips. I will detail some on the next slides.

First, controlling the execution flow. This can usually be done by modifying a victim function pointer or the stack register. Of course, there are some criteria for choosing the right function pointer. The first condition is that it must be reachable. If it can't even be triggered from user space, there is no point in talking about the later conditions. The second condition is that a shorter call path with fewer checks is better. This reduces the difficulty of constructing special data to bypass the checks and eventually trigger the controlled pointer. And the last one is that it can't affect the data structures used in our escalation process.

Of course, the choice of the target function pointer is usually based on the vulnerability type. For example, if you are dealing with an array vulnerability, like an array index out of bounds, we usually have the option to directly modify a pointer in struct file_operations. For a stack overflow, the return register is a common victim target. For a heap overflow, we choose to modify a pointer in an adjacent victim heap object.
And last, for use-after-free, we usually overwrite the victim pointer by means of heap spray. What we need to think about here is whether the control flow hijack primitive is enough for kernel privilege escalation. In a few cases, like old kernels with fewer mitigations, the answer may be yes. But for most cases, arbitrary kernel memory read or write is much better because we can utilize it to bypass many defenses.

Then let's talk about getting arbitrary memory write. Usually, there are two ways here. The first one is to exploit a vulnerability directly. But as the community and developers have paid more and more attention to security in recent years, this type of vulnerability has become less and less common. Of course, according to the quality of the vulnerability, the memory write can be divided into arbitrary memory write and restricted memory write. The difference between them is whether the memory location and the written content are arbitrary and controllable.

The other way is to modify the address limit. First, let's look at what the address limit is. Put simply, the address limit can be seen as a partition between user space and kernel space. It's a member of the kernel structure thread_info. One note here: starting with some version of the kernel, some thread_info fields like the address limit have been moved to thread_struct. But this doesn't affect our follow-up analysis.

The usual pattern for setting the address limit in the kernel is as follows. With set_fs(KERNEL_DS), the process gains access to the entire address space. After completing the required operations, the process is restored to its original, normal accessibility. But if the kernel could be made to oops, or the code flow could be hijacked between the two set_fs calls, the second call, which restores the address limit, will never be made. That leaves kernel data open to being overwritten from user space.
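The address limit pattern just described can be modelled in a few lines. This is only a user-space sketch, not kernel code: the constants, the Thread class, and the helper function are hypothetical names chosen for illustration, and the limit values are assumptions rather than real kernel constants.

```python
from dataclasses import dataclass

# Illustrative limits only (assumptions, not real kernel constants).
USER_DS = 0x0000_7FFF_FFFF_F000      # upper bound of user space in this model
KERNEL_DS = 0xFFFF_FFFF_FFFF_FFFF    # "no limit": every address passes the check


@dataclass
class Thread:
    addr_limit: int = USER_DS


def access_ok(thread, addr, size):
    """Model of the range check applied to pointers supplied by user space."""
    return addr + size <= thread.addr_limit


class Hijacked(Exception):
    """Stands in for an oops or a control-flow hijack inside the window."""


def helper(thread, callback):
    """Hypothetical kernel helper that follows the usual set_fs pattern."""
    old_fs = thread.addr_limit
    thread.addr_limit = KERNEL_DS    # set_fs(KERNEL_DS)
    callback()                       # a pointer the attacker may control
    thread.addr_limit = old_fs       # set_fs(old_fs): skipped if callback never returns


def escape():
    raise Hijacked


if __name__ == "__main__":
    t = Thread()
    kernel_addr = 0xFFFF_8800_0000_0000
    print(access_ok(t, kernel_addr, 8))   # limit is USER_DS
    try:
        helper(t, escape)
    except Hijacked:
        pass
    print(access_ok(t, kernel_addr, 8))   # limit was never restored
```

In this model the second check succeeds only because the restoring assignment was skipped, which is the whole point of the technique.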
Let's look at the two code snippets from the real kernel. We can see that in both functions, between the two set_fs calls, there is a pointer which can be controlled through the first parameter of the function. If we can direct the code flow to one of these two functions and, at the same time, control the first parameter, then we can jump out of the function and skip the address limit recovery. In the code, that is the set_fs(old_fs) call. After that, we can get arbitrary memory write.

So the last part is gaining root privilege. The most common and classic method is to call the following two functions. It has been used by many real-world attacks. In addition to using function calls to modify the process credentials, we can also modify them directly. If we want to modify them directly, we first need to locate the position of the process-related structure, and then find the location of the cred structure based on its offset. One thing we need to pay more attention to here is that the offset may be different for different kernel versions. After that, the last step is to modify the process UID and GID through the arbitrary memory write.

Next are the case studies. The first case is the eBPF verifier bypass vulnerability. Before the detailed vulnerability analysis, let's first look at what eBPF is. eBPF was originally used for network packet filtering, and it can be used to run user-space code inside a sandboxed virtual machine in the kernel. We all know there are security risks in allowing user-space code to run inside the kernel. So before an eBPF program is loaded, a series of checks is performed. The first stage ensures that the eBPF program terminates and does not contain any loops. The second stage is more involved and requires the verifier to simulate the execution of the eBPF program one instruction at a time. The virtual machine state is checked before and after the execution of every instruction to ensure that the register and stack state are valid.
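The first verifier stage mentioned above, rejecting programs that may not terminate, can be illustrated with a depth-first search for back-edges in the control flow graph. This is a minimal sketch under stated assumptions: the tuple-based instruction format and the function names are invented for the example and are not the kernel's data structures.

```python
# Toy instruction format (an assumption for this sketch):
#   ("alu",)        falls through to the next instruction
#   ("jmp", off)    unconditional jump to i + 1 + off
#   ("jcond", off)  conditional jump: falls through or goes to i + 1 + off
#   ("exit",)       ends the program


def successors(prog, i):
    """Return the indices that instruction i can transfer control to."""
    op = prog[i]
    kind = op[0]
    if kind == "exit":
        return []
    if kind == "jmp":
        return [i + 1 + op[1]]
    if kind == "jcond":
        return [i + 1, i + 1 + op[1]]
    return [i + 1]


def has_loop_or_bad_jump(prog):
    """True if the program has a back-edge or can run outside its bounds."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = [WHITE] * len(prog)

    def visit(i):
        if not 0 <= i < len(prog):
            return True          # control leaves the program without exit
        if color[i] == GREY:
            return True          # back-edge to a node on the current path: a loop
        if color[i] == BLACK:
            return False         # already fully explored
        color[i] = GREY
        for s in successors(prog, i):
            if visit(s):
                return True
        color[i] = BLACK
        return False

    return visit(0)


if __name__ == "__main__":
    ok = [("alu",), ("jcond", 1), ("alu",), ("exit",)]
    looping = [("alu",), ("jcond", -2), ("exit",)]
    print(has_loop_or_bad_jump(ok), has_loop_or_bad_jump(looping))
```

The recursion is fine for toy programs; a production checker would use an explicit stack.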
This is a timeline of the vulnerability. Bruce found this vulnerability and developed a working exploit, but he didn't publish it. He just posted some information on Twitter, which you can see on the left. In early December 2017, Jann found it and reported it upstream. I think maybe he is sitting somewhere here now, because I see his talk on race condition exploitation is next to mine. After that, Bruce published the exploit code. In addition, in March 2018, another researcher, Vitaly, also publicly released a different version of the exploit code, and his tweet revealed that Ubuntu was still affected. In order to solve this, the mainline patch was also backported to the 4.4 stable kernel.

This section will discuss the root cause of this vulnerability. A simple description of the root cause is an inconsistency between the simulated execution in the verifier and the actual code running in the kernel. From the patch commit information, we can see that the check_alu_op function does not make a specific distinction between certain operations. The attacker can utilize this to achieve malicious code execution.

The following is an attack scenario for the verifier bypass, which I took from Bruce's exploit code. Let's observe the code which is responsible for processing the first instruction. First, the check_alu_op function is invoked to check the validity of the instruction. Here, we can see that the immediate value is retrieved from the instruction and stored into the reg_state. Note that the imm in the reg_state and the imm in the BPF instruction are both signed integer values.

Then we come to the second instruction. For this instruction, do_check has to decide which branch to take. The code snippet on the right is responsible for deciding which branch to take. At this point it is still comparing two signed integers.
Based on the source value given in the BPF program, the fall-through instruction will be taken, and the other branch, the push_stack operation, will be ignored. Instruction three is very clear: it assigns a value to the return register.

Let's look at instruction four. Since the earlier push_stack operation was not executed, when pop_stack is called to see if there are any instructions that still need checking, pop_stack returns negative one directly. Then do_check executes a break to jump out of the entire check process. As a result, if there are other instructions behind these four instructions, they will not be verified.

Next, let's look at the actual code running. We notice that the regs here are different from the regs used in do_check: they are unsigned 64-bit values. In the first instruction, the int value IMM, which is eight Fs, is sign-extended to an unsigned 64-bit value, which means that the new value stored in DST is 16 Fs. So when we compare the two values, the result here is exactly the opposite of the result from the verifier process. These two values are not equal. In this case, the jump branch will be taken and the malicious BPF instructions will be executed.

This page is a brief summary of the previous code analysis. That is, during verification the program goes to exit, but when the code actually runs, the program goes to the bad code. From here, we can more clearly understand the root cause of the vulnerability.

As for the privilege escalation, from the previous introduction we can already execute an arbitrary eBPF program. The next thing is to construct specific eBPF instructions to achieve arbitrary memory read or write. After that, we can modify the process's cred structure to get root privilege. Due to the time limit, if you want to learn more details about this part, you can refer to the public exploit code below.
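The disagreement between the verifier and the interpreter comes down to how a 32-bit immediate is widened before it is compared. The sketch below is an illustration only. It assumes, following the description of the upstream fix rather than anything shown in the talk, that the 32-bit move leaves a zero-extended value in the 64-bit register while the comparison immediate is sign-extended; all function names are hypothetical.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def to_s32(x):
    """Interpret the low 32 bits of x as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x8000_0000 else x


def sext64(imm32):
    """Sign-extend a 32-bit immediate to an unsigned 64-bit value."""
    return to_s32(imm32) & MASK64


def zext64(imm32):
    """Zero-extend a 32-bit immediate to an unsigned 64-bit value."""
    return imm32 & MASK32


def verifier_takes_jump(mov_imm, cmp_imm):
    """Verifier model: both sides are tracked as signed 32-bit integers."""
    return to_s32(mov_imm) != to_s32(cmp_imm)


def runtime_takes_jump(mov_imm, cmp_imm):
    """Run-time model (assumption): the 32-bit move zero-extends into DST,
    while the jump compares DST against a sign-extended immediate."""
    dst = zext64(mov_imm)
    return dst != sext64(cmp_imm)


if __name__ == "__main__":
    imm = 0xFFFF_FFFF
    print(verifier_takes_jump(imm, imm))  # verifier: values equal, falls through
    print(runtime_takes_jump(imm, imm))   # run time: values differ, jump is taken
```

For small positive immediates the two models agree, which is why the mismatch only shows up for values with the top bit set.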
We didn't introduce any mitigation bypass for this exploit. In fact, all the common mitigations are ineffective against it. In particular, the whole exploit doesn't need to know the address of a particular symbol. We don't need gadgets, so we don't need to know any symbol or address. Also, we don't execute user-space code or access user-space data in supervisor mode. So there is no need to consider KASLR, SMEP, and SMAP. As for KCFI, although it's not broadly adopted by the major Linux kernel versions, it was designed to defend against control-flow-related attacks. However, this exploit doesn't tamper with the control flow; it just utilizes native functionality to change the program's normal behavior. So KCFI is also ineffective.

A simple case summary: this should be considered a data-oriented attack, a type of attack with the ability to cause great damage. This type of attack can be used to modify the page table and change the memory permissions of the kernel code to be writable, and then inject malicious code into kernel space. For example, we can patch the implementation of the control flow check to bypass KCFI. The good news is that many Linux distributions are not affected, because they don't enable the eBPF feature or they don't allow normal users to access it.

The next case is the timerfd race condition. I found this issue and reported it to the Google Android security team in early 2017. I like to call it the time bomb vulnerability because it's related to the timer and it can be used to get universal root. timerfd is based on file descriptors, and there are three main system calls associated with it: timerfd_create, timerfd_settime, and timerfd_gettime.

Let's look at the vulnerability analysis. Both the timerfd_setup_cancel function and the timerfd_remove_cancel function perform list operations. The developers wanted to protect the might_cancel queueing only by setting the context's might_cancel flag.
It is a Boolean flag that is set to true or false to protect the list operations. But this protection method does not work against race conditions. Let's clarify this with a specific attack scenario on the next page.

First, let's look at the left diagram. Thread A sets the context's might_cancel to false and then deletes the node from the list. In the normal case, it's impossible for thread A to delete the node again because it will not pass the if check. However, if the CPU switches context before the check to run thread B, thread B sets the context's might_cancel to true, and then the CPU switches context back again, then thread A will pass the if check and delete the clist node again, causing a list corruption problem. In this case, the system will also crash.

There is another situation: the same list node is added twice. In this case, the node is no longer removed by the list delete operation, causing a dangling pointer problem. The dangling pointer problem can be converted to a use-after-free, and it's possible to get privilege escalation by exploiting this.

The following are the list operation details that explain why this causes a dangling pointer issue. So if we add the same node twice and then delete it, what will happen? After the execution of the related linked list operations, we find the node is still in the list after the delete operation. That means if we have already added the same node twice, we can never delete it.

Let's quickly go through the conventional UAF exploit. The first three steps use the race condition to cause the use-after-free problem. After that, we do the heap spray. The last two steps are triggering and getting code execution. But we are in trouble when triggering the victim pointer. There is a capable() check in our trigger path. An ordinary process doesn't have the CAP_SYS_TIME capability, so we can't reach the victim pointer.
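The double-add behaviour described above can be reproduced with a small model of a circular doubly linked list. This is a sketch only: it mirrors the pointer updates of a kernel-style list insert and delete, omits pointer poisoning and RCU details, and all names are chosen for the example.

```python
class Node:
    """A list node; a fresh node points at itself, like an empty list head."""

    def __init__(self, name):
        self.name = name
        self.next = self
        self.prev = self


def list_add(new, head):
    """Insert new right after head (same pointer updates as a kernel-style add)."""
    nxt = head.next
    nxt.prev = new
    new.next = nxt
    new.prev = head
    head.next = new


def list_del(entry):
    """Unlink entry by joining its neighbours (no pointer poisoning here)."""
    entry.next.prev = entry.prev
    entry.prev.next = entry.next


def reachable(head):
    """Names of the nodes still reachable from head, guarding against cycles."""
    names, seen, cur = [], set(), head.next
    while cur is not head and id(cur) not in seen:
        seen.add(id(cur))
        names.append(cur.name)
        cur = cur.next
    return names


if __name__ == "__main__":
    head, ctx = Node("head"), Node("ctx")
    list_add(ctx, head)
    list_add(ctx, head)      # the race lets the same node be added twice
    list_del(ctx)
    print(reachable(head))   # the node is still linked from head
```

After the second add the node's next pointer refers to itself, so the delete only rewrites the node's own links and leaves the head pointing at it. If the node is then freed, the head holds a dangling pointer.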
But as we all know, on Ubuntu no privilege is required to create a user namespace. So can we use this to bypass the permission check? The answer is also no. Let's look at the information I took from the man page for user namespaces. It clearly describes that some privileged operations, like those governed by CAP_SYS_TIME, CAP_SYS_MODULE, and CAP_MKNOD, are not associated with any namespace type. Only the initial user namespace can perform such operations.

Before I introduce the final solution, I will first introduce two concepts. One is TOCTOU and the other is the pipe subsystem. TOCTOU means time of check to time of use. The following simple example gives us a good understanding of that. After the access check, we link the file to the victim file, for example the password file, and then we can overwrite the password. It's just a simple example.

Then let's look at the pipe subsystem. Here I will introduce a system call related to the pipe subsystem: readv. This call uses a series of iovecs to describe the user buffers. If the number of iovecs is greater than eight, it allocates memory on the heap. Otherwise, it puts the iovecs on the stack. When no content is available from the write end, readv may block in the kernel, so the iovecs may stay in the kernel heap. So according to this description, we can use readv for heap spray. In this case, our victim buffer is greater than eight iovecs, so we can use this.

TOCTOU also exists in the pipe subsystem. We found that the pipe code checks each iovec and makes sure its iov_base points to user space. After that, when the pipe code performs the copy to user, it does not check iov_base again. So we can exploit this if we can modify the iovec in this interval. There are two types of vulnerabilities that can be used to modify the iovec in the interval between the time of check and the time of use, and the vulnerability we are discussing is actually a use-after-free vulnerability.
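The check-then-use gap described for the blocked readv call can be modelled as follows. This is a toy model, not the kernel's implementation: the Iovec and BlockedReadv classes, the USER_LIMIT value, and the dictionary standing in for memory are all assumptions made for illustration.

```python
from dataclasses import dataclass

USER_LIMIT = 0x0000_7FFF_FFFF_F000   # illustrative user-space boundary (assumption)


@dataclass
class Iovec:
    iov_base: int
    iov_len: int


class BlockedReadv:
    """Models a readv call that validated its iovec and then went to sleep."""

    def __init__(self, iov):
        # Time of check: the base address must lie in user space.
        if iov.iov_base + iov.iov_len > USER_LIMIT:
            raise PermissionError("iov_base does not point to user space")
        self.iov = iov               # the iovec stays around while the call blocks

    def complete(self, memory, data):
        # Time of use: the copy trusts iov_base without checking it again.
        memory[self.iov.iov_base] = data[: self.iov.iov_len]


if __name__ == "__main__":
    memory = {}
    iov = Iovec(iov_base=0x1000, iov_len=8)
    call = BlockedReadv(iov)                 # check passes, call blocks
    iov.iov_base = 0xFFFF_8800_0000_0000     # another bug rewrites the iovec
    call.complete(memory, b"payload!")       # data lands at the kernel address
    print([hex(addr) for addr in memory])
```

In the model, the rewrite of iov_base stands in for whatever memory corruption is available during the blocking window.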
So we should be able to use this to complete the full exploitation. The following are the detailed list operations. We combine the list operations with the pipe heap spray. I think this part is difficult to understand unless you do it yourself, step by step. So I will quickly go through it and only explain some results. If you are interested in this part, you can dive deeper into it offline.

In the first step, we add the victim context twice and then delete and free it. From the previous introduction, we know that such operations will not remove the victim node. There are dangling pointers in the heap buffer. So the next thing we need to do is the heap spray. We use readv to heap spray and modify the list node. Then we block readv and modify iov_base by deleting another node, context A. In the last step, we write our prepared content through the pipe to modify context B's next and prev list pointers. Finally, we delete context B to achieve a restricted memory write.

As for kernel address space layout randomization, an attacker does not even need an arbitrary read primitive to bypass it. With an info leak from dmesg, it can be easily bypassed, especially on Ubuntu, where there are no restrictions on access to dmesg. We can just trigger some warning information or force a page fault, because panic_on_oops is not enabled on Ubuntu. On Android, SELinux restricts access to dmesg, but there are a small number of vendors or products that allow access to dmesg, as shown in the picture below. Of course, in addition to this, we can also use hardware side channels, the timing attack method, to bypass KASLR.

Next is the SMEP bypass. The first method is bit flipping, where we use native_write_cr4. The sched_setaffinity system call is used to force the exploit program to execute on one CPU core, making sure that the user-space payload is executed on the same core where we have disabled SMEP.
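The two ingredients named above, a flipped control register bit and CPU pinning, can be shown as plain arithmetic plus one standard-library call. This sketch only computes values and pins the current process; it does not touch any register. The bit positions are the architectural ones for x86 (SMEP is CR4 bit 20, SMAP is bit 21), while the sample CR4 value and the function names are assumptions for the example.

```python
import os

X86_CR4_SMEP = 1 << 20   # Supervisor Mode Execution Prevention enable bit
X86_CR4_SMAP = 1 << 21   # Supervisor Mode Access Prevention enable bit


def clear_smep(cr4):
    """Return the CR4 value with only the SMEP bit cleared."""
    return cr4 & ~X86_CR4_SMEP


def pin_to_one_cpu():
    """Pin the calling process to a single CPU it is already allowed to run on.

    Returns the chosen CPU number, or None where the call is unavailable
    (os.sched_setaffinity exists only on some platforms, such as Linux).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    cpu = min(os.sched_getaffinity(0))
    os.sched_setaffinity(0, {cpu})
    return cpu


if __name__ == "__main__":
    sample_cr4 = 0x1407F0                    # hypothetical value with SMEP set
    print(hex(clear_smep(sample_cr4)))
    print(pin_to_one_cpu())
```

CR4 is a per-core register, which is why the pinning step matters: a change made on one core does not apply to code scheduled on another.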
I need to mention here that the sched_setaffinity system call can also be used to help improve the success rate of the heap spray.

Another method is exploiting the address limit. We introduced this previously. One situation to discuss here is what to do if the first parameter can't be controlled. The first parameter is RDI on x86 or the X0 register on the ARM architecture. If it can't be controlled, what can we do? Instead, we can control a pointer in the file structure, like aio_fsync and check_flags in the file operations, and then construct the parameters we need in user mode. Of course, SMAP will block this method. But if only SMEP is enabled, this one is valid.

Several years ago, ret2dir was also proposed to bypass SMEP and SMAP. However, with a kernel patch applied, the physmap pages are no longer executable. So currently it's not valid except for some older kernel versions.

So the case summary. In this case, attackers can't trigger the victim function pointer due to the capability check. As a result, we can't use traditional use-after-free exploitation techniques. But by combining the list operations with the pipe, we still achieve a memory write, although the written content is restricted. After that, we have two options. One is to modify the address limit to get memory read and write. The other is to directly redirect a driver's associated function to a specific gadget to achieve arbitrary memory read and write. For example, we can modify the function pointers of a specific driver, like ptmx, and make some command point to a specific gadget to directly get an arbitrary memory write.

Now the last part. The first point is that it is important to know your enemy. As we all know, defense is more difficult because it requires knowledge of the various attack means and attack surfaces. So we can't separate defense from attack. We need to keep abreast of the exploit techniques to build better defenses.
The second point is that security is not just a simple integration of various mitigations. For example, you may have already enabled kernel address space layout randomization, but it can be easily bypassed through nothing more than a dmesg info leak. As the saying goes, a small leak will sink a great ship.

The last point is that the widely deployed mitigations today are aimed at control flow attacks, and the attacks we talked about today are mostly based on control flow attacks. But we also need to pay more attention to data-oriented attacks. I have noticed that there are some papers that propose mitigations to prevent data-oriented attacks, and it can be predicted that there will be much related work in the future.

This is the reference list. That's all. Thank you. Any questions?

Questions? This is a big room, so I'll have to be fast running.

Hey, great talk. Have you looked at the exploitability of L1TF or all the other Spectre and Meltdown vulnerabilities that came out in the last two years? Because they are exploitable in a lab environment, but I have yet to see someone publicly claim that they can do that in a real native environment.

Do you mean Spectre and Meltdown?

Yeah, Spectre, Meltdown, and L1TF. Have you looked at how exploitable they are in a real environment where the cache is flying everywhere?

Yeah, I know what you mean. Spectre and Meltdown are CPU-based vulnerabilities. That's different from what we are talking about today, which is from the kernel and OS side. I know there are some techniques that use cache flushes and so on to exploit this, but I don't know many details about it. Sorry for that.

Any other questions?

Thanks for the talk. You mentioned the SMEP bypass with the native_write_cr4 function, and recently there was an additional mitigation, CR4 pinning. Did you have a look at it, and do you have any comments?

Yeah, I know there are some new mitigations adopted to mitigate this attack. We can't use this method there.
I introduced the next method. We can direct our code flow to this kind of address. In addition to controlling the code flow, if we can also control one register, like RDI or X0, we can prepare its content in user space and use this method to get the arbitrary memory write.

So the first method is not needed if you use the second one, right?

Yeah, thank you.

No more questions? Someone else has a question.

I have a question. You mentioned that we should be paying more attention to these data-oriented attacks. Is there any technique or group of techniques that you feel is the most prominent? I mean, from the defense point of view against data-oriented attacks, what should we as kernel security people be paying more attention to and looking into? I'm sure there are many proposed in the research. Do you have your own preferences, defenses that you think are a better shot to look into for kernel space?

Yeah, this is a difficult question. As I said at first, I think defense is more difficult than attack, because it requires knowing every attack surface and every attack means. So for defense developers, my one suggestion is that, besides focusing on the defense itself, it's better to know something more about the exploitation techniques. I think that's it.

So I guess your suggestion is that we should be educating ourselves more about data-oriented attacks and thinking more about how we protect against them?

Yeah.

So, any more questions? In that case, let's thank the speaker.