Hello everyone. Today we're going to talk about Phantom Attack: Evading System Call Monitoring. My name is Rex, and this is Junyuan.

Imagine an attacker compromises your Linux infrastructure. The attacker first compromises a web app through a remote code execution vulnerability and launches a reverse shell. He then discovers a vulnerability on the system that lets him escalate privileges, the sudo vulnerability CVE-2021-3156. Next he looks for secrets on the system, so he reads the /etc/shadow file, and he discovers additional lateral movement opportunities by reading the environment variables of the SSH process. Then he moves laterally to a second machine using SSH hijacking. As he is celebrating this moment, he discovers that his reverse shell connection is gone, and it doesn't take him long to discover that his IP is completely blocked.

Now let's take a look at the other side of the story. While all this is happening, our security engineer has received a bunch of Slack messages for the alerts generated by his latest cloud workload protection software. The reason the software can discover all these activities precisely is that it monitors the system calls and other process-related data.
For example, when the attacker launches the reverse shell, there will be a connect system call, and there may be additional system calls depending on the reverse shell that he uses. This is similar for the other activities. Throughout this talk, we are going to use the openat system call as an example.

So let's take a look at how one can use system calls and other process information to detect an attacker reading /etc/shadow. Here's an example rule. The rule tries to detect untrusted programs reading /etc/shadow. Let me explain what the rule means: it detects an openat or open system call with read permission where the filename is equal to /etc/shadow and the program is not in the allow list of programs that are allowed to read /etc/shadow. From this rule it should be obvious that the ability to precisely monitor system calls and their related data is critical for detecting this attack.

The agenda of this talk is as follows. We will talk about system call monitoring in more detail, and then about the two open source system call monitoring projects that we analyzed. Then we'll talk about the first vulnerability, the TOCTOU issue, which we use the Phantom v1 attack to exploit, and then the second vulnerability, a semantic confusion issue, which we use the Phantom v2 attack to exploit. Finally, we'll conclude the talk with takeaways. With that, I will hand over to Junyuan to talk about system call monitoring.

Yeah, so as Rex mentioned, system call monitoring is very important for detecting threats. So what is system call monitoring?
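As a rough illustration of the example rule described above, the check can be sketched in Python. This is only a sketch under assumptions: the `SyscallEvent` fields, the `ALLOWED_PROGRAMS` set, and the `rule_fires` helper are hypothetical names made up for illustration, and they are not Falco's rule syntax or API.

```python
import os
from dataclasses import dataclass

# Hypothetical allow list of programs permitted to read /etc/shadow.
ALLOWED_PROGRAMS = {"sshd", "login", "passwd"}

# Mask covering the access-mode bits of the open flags.
ACCESS_MODE_MASK = os.O_RDONLY | os.O_WRONLY | os.O_RDWR


@dataclass
class SyscallEvent:
    """Hypothetical record of one intercepted system call."""
    syscall: str     # e.g. "open" or "openat"
    filename: str    # path string as seen by the tracing program
    flags: int       # open flags passed by the caller
    proc_name: str   # name of the calling program


def is_open_for_read(flags: int) -> bool:
    """True when the access mode allows reading (O_RDONLY or O_RDWR)."""
    return (flags & ACCESS_MODE_MASK) in (os.O_RDONLY, os.O_RDWR)


def rule_fires(event: SyscallEvent) -> bool:
    """Untrusted program opens /etc/shadow for reading."""
    return (
        event.syscall in ("open", "openat")
        and is_open_for_read(event.flags)
        and event.filename == "/etc/shadow"
        and event.proc_name not in ALLOWED_PROGRAMS
    )
```

Note that the rule compares the filename string exactly as the tracing program collected it, which is why the accuracy of the collected system call data matters so much in the rest of the talk.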
System call monitoring can be defined as a technique to verify whether an application's system calls conform to the rules that specify the program's expected behavior at runtime. Here is a graph showing how system call monitoring works. When an application invokes a system call, the system call code path is executed. If there are any hooks in the code path, the attached programs are called to collect system call data, for example the system call arguments. The data is sent to a user space monitoring agent. The monitoring agent checks whether the application's system calls conform to the user-defined rules. If not, it may generate alerts.

So typically at least two steps are included in system call monitoring. The first step is called system call interception, which is to get notified when target system calls are invoked. In order to intercept system calls, you can use tracepoints or raw tracepoints. Both are static hooks placed in the kernel code. Raw tracepoints are the eBPF alternative to standard tracepoints. They are faster because they provide raw access to the arguments without processing. For system call interception, the kernel provides two raw tracepoints, sys_enter and sys_exit. These raw tracepoints are called through the functions trace_sys_enter and trace_sys_exit respectively. The first argument of these functions is the pt_regs structure, which saves the user registers during the user-to-kernel mode switch and includes the system call arguments. The second parameter is the syscall number. If any programs are attached to these raw tracepoints, they are executed with the same arguments as the functions. Tracepoints have low overhead, but they only provide static interception.

Different from tracepoints, kprobes and kretprobes provide dynamic hooks in the kernel. Using them, we can register programs on kernel instructions, for example on the system call code path. When the instructions are executed, the registered programs are triggered. A kprobe can be inserted on almost any instruction in the kernel. However, a kretprobe can only be inserted at function entry and exit. Kprobes provide
dynamic hooks, but they are slow compared to tracepoints, and you need to know exactly how data is placed on the stack or in registers in order to read system call data. You can also use the LD_PRELOAD trick to intercept system calls, but it doesn't work in all cases, for example when the application is statically compiled. The ptrace system call provides another way to intercept system calls. However, its overhead is high.

The second step of system call monitoring is called syscall data collection, which is to collect system call data, for example the system call arguments, after being notified of system call events. The program used to collect system call data is called a tracing program. As we mentioned before, tracing programs can attach to different hooks, like tracepoints, raw tracepoints, kprobes, or kretprobes. When the hooks fire, the tracing programs are called to collect data. There are different ways to implement tracing programs. You can use Linux native mechanisms like ftrace or perf events. You can also implement tracing programs as a kernel module or as eBPF programs, which allow the execution of user code in the kernel.

The open source projects Falco and Tracee both use similar techniques to monitor syscalls. Falco was originally created by Sysdig. It's one of the two security and compliance projects, and the only endpoint security monitoring project, among the CNCF incubating projects. It has 3.9k GitHub stars. It consumes kernel events and enriches them with information from the cloud native stack, like Linux, containers, and so on. Falco supports both eBPF and kernel module implementations of the tracing program. Tracee, on the other hand, was originally created by Aqua Security. It has 1.1k GitHub stars. It's basically a runtime security and forensics tool based on eBPF.

Unfortunately, these open source projects, and other projects using similar techniques, are vulnerable to attack during syscall monitoring. The
first vulnerability is time of check to time of use (TOCTOU). At the time of check, tracing programs collect syscall data. At the time of use, the syscall data used by the kernel is different from what the tracing program checked. Let's take the openat syscall as an example. Its second parameter is called filename, which is a pointer to a user space buffer. Between time of check and time of use, the buffer behind this pointer can be modified from user space. We will introduce the Phantom v1 attack, which exploits the TOCTOU issue.

The second vulnerability is semantic confusion. It means the kernel interprets data differently from tracing programs. For example, a symbolic link is interpreted differently by the kernel and by tracing programs. We will also introduce the Phantom v2 attack, which exploits semantic confusion. We will demonstrate that Falco is vulnerable to both Phantom v1 and v2, while Tracee is only vulnerable to Phantom v1.

In order to understand TOCTOU, we use the openat syscall as an example. We used kernel version 5.4.0, but regardless of the kernel version, if the monitoring software uses tracepoints in this way, the TOCTOU vulnerability will exist. To simplify, we only show the code that is related to the attack. When the openat syscall is invoked by an application, the syscall handler executes the trace_sys_enter function with two arguments. As we mentioned before, if any tracing program is attached to the sys_enter tracepoint, the program will be executed. After that, the syscall handler looks up the syscall table and jumps to the openat syscall to open the file. Before returning to the application, the handler calls trace_sys_exit with exactly the same arguments as trace_sys_enter. So similarly, if any tracing programs are attached to the sys_exit tracepoint, they will be executed.

As we mentioned before, the second argument of the openat syscall is the filename pointer, which points to user space memory. The filename is passed to the do_sys_open function, and the kernel copies it into the kernel buffer tmp using the getname function. After that,
the kernel uses the kernel buffer to call the internal function do_filp_open to open the file. This is the time of use of the syscall arguments by the kernel.

If we divide the openat syscall code path into two parts at the getname function, we get two sub code paths, CP1 (code path 1) and CP2 (code path 2). In CP1, the filename hasn't been copied into the kernel buffer yet. In this case, no matter where we place the hook in CP1, the attached tracing program has to read the user space buffer in order to get the filename, and that buffer can be changed by a user space attacker. For example, if we attach a tracing program to the sys_enter tracepoint, or to do_sys_open using a kprobe, then at the time of check the tracing program has to read the user space buffer to get the filename.

In CP2, the user space memory has been copied into the kernel buffer, which cannot be changed from user space. For example, if we attach a tracing program to the entry of the do_filp_open function using a kprobe, the tracing program can read the kernel buffer tmp to get the filename. That kernel buffer is not vulnerable to the TOCTOU attack. However, if the hooks are placed improperly in CP2, TOCTOU is still possible. For example, if a tracing program is attached to the sys_exit tracepoint, it will read the user space buffer to get the filename. As we mentioned before, we used kernel version 5.4.0, but regardless of the kernel version, if the monitoring software uses the tracepoints in this way, this vulnerability will exist.

Falco is vulnerable to TOCTOU, and the vulnerability is tracked as CVE-2021-33505, with a CVSS score of 7.3. In particular, the vulnerability exists in Falco versions older than 0.29.0 and in open source Sysdig. It also affects some commercial versions based on the open source agent. This was confirmed by the open source maintainers. Please contact the vendor for the affected versions.

The reason why Falco is vulnerable to TOCTOU is that it uses the sys_enter and sys_exit tracepoints to intercept system
calls. In that case, user space pointers are read directly by the Falco tracing programs, in both the kernel module and the eBPF implementations. In order to demonstrate the generality of the issue in Falco, we evaluated the syscalls in the Falco rules. Please note that we only considered system calls that include user pointers as arguments, like the open syscall. We found that Falco is vulnerable when monitoring most of the syscalls that we evaluated, except the execve system call, because Falco doesn't read the user pointers for the execve arguments directly. Instead, it reads the data from kernel data structures.

We also evaluated Tracee 0.4.0, and we found that it's vulnerable for many system calls, like the connect syscall. One thing I need to mention is that there's no CVE given, because the Tracee team mentioned that TOCTOU attacks on syscall wrappers or tracers are a well-known issue and Tracee is no exception. They also agreed that there is no CVE for these findings, and therefore we could talk about it publicly. I will let the audience interpret that. So I will hand over to Rex to explain and demo the v1 attack.

All right. The high-level idea for exploiting the TOCTOU issue is fairly simple. First of all, we trigger the target system call with malicious arguments, and we let the kernel read the malicious arguments and perform the intended malicious action for us. After the kernel reads them, we overwrite the data structure pointed to by the user space argument pointer with benign data. At sys_exit, the tracing program reads the data structure pointed to by the user space pointer and checks the benign data against the rule, and therefore the rule will not fire.

Although the high-level plan is simple, there are a few technical challenges that we need to overcome. The first one is: when does the kernel thread read the argument? And how can we synchronize the overwrite with the kernel thread's read?
Are the race windows big enough for the system calls that we're going to attack? And how do we ensure the tracing program gets the overwritten copy all the time?

Before I dive into the step-by-step exploitation, there are a few primitives used in the exploit that I want to talk about. The first one is the userfaultfd system call. This system call is designed so that a user thread can handle page faults, which are traditionally handled by the kernel. So what was the initial design intention? It was designed for memory externalization. In the case where you're running a distributed program, you can run compute nodes and memory nodes. When the compute node needs memory that doesn't exist on the compute node, the kernel triggers a page fault, and the user space fault handler reaches out to the memory node to get the desired memory. On the other hand, if the compute node is under memory pressure, it sends those memory pages back to the memory node. One very important fact about userfaultfd is that once the kernel thread triggers the page fault, the kernel thread is completely paused and waits for the user space program to respond. As some of you may already be aware, this has been used quite a bit in exploiting kernel race condition bugs.

The other two primitives I want to talk about are interrupts and scheduling. An interrupt notifies the processor of an event that requires immediate attention. It diverts the program control flow to an interrupt handler. Let's look at the picture on the right side. We have two cores and two tasks, one running on each core. On core 0, task A issues a system call, and the control flow transfers to the kernel thread that handles the system call. While it's running, the user thread on core 1 triggers an interrupt, and the way it triggers the interrupt is indirect, using a system call. Once the interrupt is triggered,
core 0 executes the interrupt handler, and after the interrupt is handled, it returns to the routine that handles the system call.

There are different ways to indirectly trigger an interrupt using a system call. One way is to trigger a hardware interrupt. This can happen when a program issues a connect system call: the CPU that is dedicated to handling networking interrupts gets interrupted. Another way is called an inter-processor interrupt (IPI). This can be done by issuing the mprotect system call. Once the mprotect system call is issued, the permission of the memory page is changed, and therefore all the CPUs that are caching those memory permissions need to be updated with the right permission.

For the scheduling primitives, the first one we use is sched_setscheduler. This changes the scheduling priority of a particular task. It is optional in the exploit, because for system calls with longer TOCTOU windows, such as networking system calls, we find that it's not needed to reliably exploit the TOCTOU issue. But for system calls related to files, the TOCTOU window is typically smaller, and with this capability we can exploit the TOCTOU issue 100% reliably. The second primitive we use is sched_setaffinity, which pins a task to a particular CPU.

Okay, so let me talk about the step-by-step exploitation in detail. Initially, we need to do some setup.
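The CPU-pinning primitive mentioned above can be tried from user space on its own. The following is a minimal sketch, assuming Linux and Python's standard `os.sched_setaffinity` wrapper around the same system call. The helper name `pin_to_one_cpu` and the choice of the highest-numbered allowed CPU are arbitrary illustrations, not what the exploit itself does.

```python
import os


def pin_to_one_cpu(pid: int = 0):
    """Pin a task to a single CPU taken from its currently allowed set.

    pid 0 means the calling thread. Returns the chosen CPU and the
    original affinity set so the caller can restore it later.
    """
    allowed = os.sched_getaffinity(pid)   # set of CPU ids the task may run on
    cpu = max(allowed)                    # arbitrary choice for illustration
    os.sched_setaffinity(pid, {cpu})      # restrict the task to that one CPU
    return cpu, allowed


cpu, original = pin_to_one_cpu()
pinned = os.sched_getaffinity(0)          # affinity while pinned
os.sched_setaffinity(0, original)         # restore the original affinity
restored = os.sched_getaffinity(0)
print(f"pinned to CPU {cpu}, then restored to {sorted(restored)}")
```

Pinning two cooperating threads to different CPUs, as the setup in the talk does, is a way to keep them from competing for the same core.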
We set up three threads: a main thread, a userfaultfd thread, which can run on any CPU, and an overwrite thread. The main thread is pinned to CPU 3. The choice of 3 here is because we ran one of our experiments on a four-core system where CPU 3 is used to handle the networking interrupts. If you are using IPIs, it can be any CPU. The main thread then maps a memory page, page A, which is not allocated yet, and registers the userfaultfd thread to handle the page faults generated for this page. On the overwrite thread side, we pin it to a different CPU because we want to reduce the interference between the overwrite thread and the main thread. The overwrite thread then just blocks on a conditional mutex once it has started.

After the setup, the main thread triggers a system call, in this case openat, and specifies a filename argument that points to page A. Keep in mind that page A is not allocated at this point, so the kernel thread triggers a page fault. Once the page fault is triggered, the userfaultfd thread writes the malicious filename to page A, then releases the conditional mutex, and then issues an ioctl system call to return execution back to the kernel. Once the conditional mutex is released, the overwrite thread starts running, and it first writes the benign filename.

In the last stage of the attack, once execution returns to the kernel, the kernel uses copy_from_user and gets the malicious pathname. This is the time of use. It keeps executing until sys_exit, where the tracing program reads the register and dereferences the filename value again.
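The ordering of reads and writes just described can be modeled without any kernel involvement. The sketch below is a deterministic, single-threaded model only: it uses a `bytearray` to stand in for the user space page and fixes the order of events by hand, so it says nothing about how to win the race on a real system. All function names are hypothetical.

```python
def read_c_string(buf: bytearray) -> bytes:
    """Read bytes up to the first NUL, the way a C string is read."""
    return bytes(buf).split(b"\x00", 1)[0]


def write_c_string(buf: bytearray, value: bytes) -> None:
    """Overwrite the whole buffer with value, padded with NUL bytes."""
    assert len(value) < len(buf), "value must fit with a terminating NUL"
    buf[:] = value.ljust(len(buf), b"\x00")


def simulate_openat(user_page: bytearray, overwrite: bytes = None):
    """Model one syscall: kernel read, optional overwrite, tracer read."""
    # Time of use: the kernel copies the filename out of user memory.
    used_by_kernel = read_c_string(user_page)
    # Inside the window: user space replaces the buffer contents.
    if overwrite is not None:
        write_c_string(user_page, overwrite)
    # Time of check: a tracer at sys_exit dereferences the same pointer.
    seen_by_tracer = read_c_string(user_page)
    return used_by_kernel, seen_by_tracer


page = bytearray(64)                      # stands in for page A
write_c_string(page, b"/etc/shadow")
used, checked = simulate_openat(page, overwrite=b"/tmp/benign")
print("kernel used:", used, "| tracer checked:", checked)
```

In this model the kernel acts on one string while the tracer evaluates its rule on another, which is the whole of the Phantom v1 idea. Without the overwrite, both reads return the same value.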
The tracing program's read at sys_exit is the time of check, so this is the TOCTOU window we have. Let's see how we can use the overwrite thread to make the overwrite successful. After it writes the benign filename, we issue a memory consistency update using CPU instructions. As the memory consistency update takes place, we want to widen the TOCTOU window so that we have enough time to update the value for all the CPUs. So what we do is issue an interrupt using system calls, which increases the TOCTOU window size. Then the memory consistency update applies, so the filename pointer now points to the benign filename. The tracing program then reads the benign filename and checks it against the rule, which will not fire.

Now let's take a look at the demo. In our first demo, we're going to attack Sysdig, using the connect system call as an example. We run Sysdig and ask it to monitor the connect system call, filtering the data on our attack program. In another window, we use tcpdump to monitor all traffic going to port 80, and then we run our attack program. The attack program connects to 1.1.1.1, but then overwrites this IP address with an IP address from LinkedIn.com. We can see that tcpdump reports traffic actually going to 1.1.1.1. This is what happens on the wire. But Sysdig reports that the attack program's connect call is connecting to 13.107.42.14, which is the IP address owned by LinkedIn.com. This shows that we can overwrite the address with an arbitrary value.

Now we'll show a second demo. In this demo, we attack Sysdig on the openat system call. The demo setup is slightly different. For openat, the TOCTOU window is small, so if we don't use CAP_SYS_NICE, the attack will sometimes fail, but if we assign CAP_SYS_NICE, we can achieve 100% reliable exploitation. Keep in mind that because the overwrite thread has the highest
scheduling priority, it can sometimes write the benign filename before the kernel reads. When that happens, the system call gets the benign filename and Sysdig also gets the benign filename, so no harm is done and no alert is fired either. Therefore we run the experiment 15 times using our script, and each time we check what the syscall reads and what Sysdig reads, and we compute the result from each observation. We print out the number of successful attacks, the number of attacks that did no harm, and the success rate. As you can see here, in the 15 experiments we succeeded three times, with no failures, and 12 times there was no harm. Therefore the success rate is 100 percent.

Lastly, we'll show a demo on Tracee as well, where we attack the connect call. The attack setup for Tracee is very similar to what we had for Sysdig. Again, Tracee monitors the connect system call, and we filter the data reported by Tracee on our attack program. We have tcpdump monitoring port 80, and then we run the attack. As you can see, the actual traffic on the wire is going to 1.1.1.1, but Tracee reports that the traffic is going to 13.107.42.14, which was overwritten by us. I just want to reiterate that CAP_SYS_NICE is optional.
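Going back to the 15-run openat experiment, the reported 100 percent can look surprising next to only three successes. The talk does not spell out the formula, so the following is an assumption: one reading consistent with the reported numbers is that the no-harm runs are excluded and the rate is successes divided by successes plus failures. The function name is hypothetical.

```python
def success_rate(successes: int, failures: int):
    """Percentage of successes among runs that were not 'no harm'.

    Assumed formula: no-harm runs (the overwrite landed before the
    kernel read) are left out of the denominator.
    """
    attempts = successes + failures
    if attempts == 0:
        return None  # nothing to compute a rate from
    return 100.0 * successes / attempts


# Outcome counts as reported in the demo.
runs = {"success": 3, "failure": 0, "no_harm": 12}
rate = success_rate(runs["success"], runs["failure"])
print(f"{sum(runs.values())} runs, success rate {rate:.0f}%")
```

Under this reading, a failure would be a run in which the kernel acted on the malicious filename and the monitor still observed it.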
Whether CAP_SYS_NICE is needed depends on the system call that you're attacking. For networking system calls it's typically not needed, and for file-based system calls it's typically needed to achieve 100 percent reliable exploitation.

Next I'll talk about the second attack, semantic confusion. The idea of the attack is fairly simple: the kernel and the tracing program can interpret data differently. We use a file link as an example. When the kernel reads a link, it resolves the link and reads the actual file. But when the tracing program reads a link, it just takes the link as the argument and uses that to check against the rule. Falco is vulnerable to this semantic confusion attack because it doesn't resolve the link in the system call. There's no CVE given, because they mentioned that symlink, symlinkat, link, and linkat are all monitored by Falco. But practically, the detection team needs to track all these symlink, symlinkat, link, and linkat calls in all of their file-based rules in case the attacker is using them. Tracee is not vulnerable to this attack because it uses a mitigation based on LSM hooks, which Junyuan will talk more about in a later slide.

I just want to quickly show an example of the Phantom v2 attack on Sysdig. Remember the rule that we talked about at the very beginning, which checks whether the filename is /etc/shadow. In order to exploit this, we can create a file link /tmp/shadow pointing to /etc/shadow. The kernel resolves the link and opens /etc/shadow, while the tracing program sees /tmp/shadow in the openat system call and checks that against the rule. It doesn't match /etc/shadow, and therefore the rule is bypassed.

With that, I will hand back over to Junyuan to talk about mitigations.

Yeah, so for mitigations, there are basically two approaches for the Phantom attacks.
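Going back to the Phantom v2 example above, the semantic confusion can be reproduced harmlessly in a temporary directory. This is a sketch under assumptions: a throwaway file stands in for /etc/shadow, and the two rule functions are hypothetical stand-ins for a monitor that compares the raw argument versus one that resolves links first. It is not how Falco or Tracee are implemented.

```python
import os
import tempfile


def naive_rule_fires(path_arg: str, protected: str) -> bool:
    """Compare the raw syscall argument with the protected path."""
    return path_arg == protected


def resolving_rule_fires(path_arg: str, protected: str) -> bool:
    """Resolve symbolic links on both sides before comparing."""
    return os.path.realpath(path_arg) == os.path.realpath(protected)


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        # Throwaway file standing in for the protected file.
        protected = os.path.join(workdir, "shadow")
        with open(protected, "w") as f:
            f.write("secret\n")
        # Symbolic link pointing at the protected file.
        link = os.path.join(workdir, "link_to_shadow")
        os.symlink(protected, link)
        # Opening the link follows it and reads the real file.
        with open(link) as f:
            content = f.read()
        return (
            content,
            naive_rule_fires(link, protected),
            resolving_rule_fires(link, protected),
        )


content, naive, resolved = demo()
print(f"read {content!r}; naive rule fired: {naive}; resolving rule fired: {resolved}")
```

Resolving the path in user space after the fact is itself open to races, which is one reason the talk points to reading what the kernel actually used, for example through LSM hooks.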
One approach is to detect that a potential exploit is happening. This was proposed and partially implemented by Falco, and it is included in Falco release 0.29.0. Basically, it tries to detect the abnormal behaviors used by the exploits. For example, it detects unprivileged use of the userfaultfd syscall, which has been implemented. One could also detect a user registering a memory address range, a user copying a contiguous memory chunk into a userfaultfd-registered range, and so on.

The second way to mitigate Phantom attacks is to read the data that is actually used by the system call in the kernel. In order to do that, you can hook LSM functions to get the system call data that is actually used. LSM hooks are a list of checkpoints that are placed in the kernel before operations happen on kernel objects. Here is a table showing the list of LSM hooks used by Tracee version 4, and the second column shows the system calls that are protected by the LSM hooks from Phantom attacks. You can also read the data that's used by the kernel from kernel data structures. For example, in order to read the arguments of execve, tracing programs can read them from the mm structure in the kernel. I will hand back over to Rex to conclude.

Okay. In this talk, we showed that the Phantom attack is generic. It exploits the fact that the kernel and the tracing program can read data at different times. This is exploited by Phantom v1. We also showed that the Phantom attack can exploit the fact that the kernel and the tracing program can interpret data differently.
This second issue is exploited by Phantom v2. We demonstrated that kernel raw tracepoints on system calls are not ideal for secure tracing, and other tracing implementations such as kprobes could also be vulnerable if they are not implemented properly. For mitigation, one can use detection of abnormal usage of userfaultfd, or ensure that the kernel and the secure tracing program read the same data and interpret the data in the same way.

If you're interested in discussing further, feel free to contact me on Twitter, and we'll share the GitHub link on Twitter as well. Before we conclude, we also want to thank all the people who helped during our research: Chris and Joe for the discussions on eBPF, kernel tracing, and TOCTOU. Lastly, we really appreciate the Falco open source team. They were very professional in handling the issue, and we had really good discussions. Thank you.