Alright, we've got a talk here on weaponizing hypervisors by Ali and Dan. We have a little tradition here at DEF CON for those who are new to speaking at DEF CON. Does anybody know what that is? It's called Shoot the Noob. So cheer them on as we do a shot here for the first-time speakers.
Hey guys, thank you for coming. Really excited to be here. It's our first time, as you saw. I'm Ali Islam, the co-founder and CEO of Newman. I've been in the cybersecurity industry for the last 13-plus years, and for a long time I worked for a company called FireEye, another cybersecurity company.
Hey guys, my name is Dan Regalado. I'm so excited to be here because, you know, it's my first time at DEF CON. So glad and blessed. I want to thank my wife, my brothers, and my friends, the Mexican band that's here, gentlemen. DEF CON speaks Spanish. Cool.
Okay, let's get started. Here's a quick overview of how we organized our talk. We're going to introduce some of the basic concepts. These concepts are really important: they are the foundation on which we build our detection cases and policies later. Then we're going to explain the whole embedded environment, the journey of setting up the board, and the environment we play in. Then we'll talk about the very important part, VMI on ARM; I'll explain shortly what VMI is. And finally we'll discuss different attacks, use cases, and some recommendations. So, why is the hypervisor so important and relevant?
As you can see, all of the major players, Automotive Grade Linux, Renesas, Denso, Alibaba, NXP, Perrone Robotics, Green Hills, Intel, and there are many more, now have a hypervisor in their architecture. The traditional use case for a hypervisor is efficient utilization of resources, and probably isolation. However, not many people have explored the hypervisor for security, in terms of how you can use it to build a really good security system. Isolation is there, but not the VMI that we're going to talk about.
The beauty of the hypervisor is that it exposes an interface called the introspection interface, also called virtual machine introspection. What it allows you to do is monitor the whole system from the outside. That's really powerful, and we haven't seen many people using it, especially on ARM. So we were very excited to start on it. It was a long, very hard journey because there's not much out there, only a few projects and papers.
Traditional antivirus, we all know, has a lot of issues. It sucks. But mainly, when any advanced malware comes into the system, the first thing it does is check whether there's an agent inside or whether it is somehow being monitored. If you're sitting outside of the operating system, you solve that very important problem automatically. So basically you have that sophisticated invisibility. Apart from that, not having anything inside the operating system also helps with the different certifications, because you're not messing with the device functionality.
Imagine you're securing a car infotainment system and suddenly there's a bug in your software that messes up everything, right? You don't want that.
Okay, so let's get started on VMI. What is VMI, or what kind of interface does the hypervisor expose? Basically what you have is raw memory. You take the raw memory and you use OS-specific knowledge to make sense out of it. You really have to know where the kernel is storing what, and then you build your logic around that. For example, if you take the raw memory and you know where the Linux task list is stored, you can go to those offsets and start decoding, and finally, as you can see here, you can decode and traverse the task list from raw memory.
To give you an example, if you have to read a kernel symbol value from raw memory, it looks like a simple thing if you are inside the operating system. From VMI, however, it's a very complex process. By the way, we are using LibVMI, which we highly recommend. It's an open source project. It provides the basic functionality that you need, and on top of that we build our own functionality. Some of that basic functionality is, for example, the caching, which I'll explain shortly. So we really recommend using LibVMI. Now let's say you have to read the value: you pass in the virtual address.
LibVMI will first check the System.map to find where the virtual address is. Then it takes that address and maps it through the page directory: part of the virtual address is added to the page directory base to get to the page table, and finally to the actual physical location in memory. Then it comes back and gives that value to the VMI application. So that's the general flow of how you get the value out. Also, once you go through this process, you're not going to do it again for this particular virtual address. That's where, again, LibVMI comes in. It has a very efficient caching mechanism. You can do it on your own too, but in case you don't want to write that work yourself, you can simply use LibVMI.
Okay. Another very important concept that is critical to what we're going to show is single stepping, because we're going to show you eventually how you can get the system calls out by staying outside of the operating system, through VMI and from raw memory. I'm sure you have all used debuggers and single stepping in your own use cases, including reversing. However, there are other single stepping and breakpoint mechanisms that I want to quickly touch on.
Hardware breakpoints are mainly used for code that sits in ROM. When the code is not in RAM, you cannot overwrite it with a breakpoint instruction. So what you do is use the special registers provided by ARM and Intel. For example, on Intel it's DR0 to DR3. You store the memory location values in those registers. And then the program counter comes to that particular memory location or that instruction.
It just halts the execution, and then you can analyze the whole system.
The other one is software breakpoints, the most common one, CPU-assisted, that we use in debuggers. For example, you set the trap flag in the EFLAGS register and then you let the CPU do the work. Once the breakpoint hits, the CPU automatically does the single stepping. But what really happens in the back end when we say single stepping? In the back end there is a special interrupt, for example INT 1. When the breakpoint hits, you do your whole analysis. However, in the back end, the CPU makes sure that it also executes the original instruction and then moves on. That's very important.
The third one is software breakpoints with no CPU assistance. We're going to use that in our implementation, and there are reasons for that, which I will explain once we are done with the basic concepts. But just keep the single stepping concept in your mind.
Apart from that, in our implementation, as I said, you're playing with live raw memory. So you really need to know the kernel internals: how the kernel is organizing the memory and how everything is working inside. You need to have that basic knowledge. We all know that for quite some time now we have had paging, virtual addressing, since the 286, where the real addressing mode ended. This is a typical two-level paging. We also call it SLAT, second level address translation, and it is implemented using the extended page tables and the virtualization extensions that the newer architectures provide nowadays.
To quickly follow the flow, like I was saying, if you are going to read a value, this is how it looks. You take the first 10 bits and add them to CR3. CR3 is a special register that stores the base of the page directory. That takes you to the base of the page table. Then you pick the next 10 bits and add them to get to the right page table entry. And finally you add the last 12 bits, which are the offset into a specific memory page, from which you then fetch the value.
Okay. This is another view of what I just explained. There are two levels of translation: first from VM virtual to VM physical address, and then from VM physical to machine physical. By the way, we're talking about the bare-metal, type 1 hypervisor, right? So you have the hardware, the hypervisor sitting right on top of it, and then there are different domains, or VMs. The second level of translation is handled by a table called the extended page tables, and there's an EPT pointer. The Xen hypervisor, for example, stores that per VM, and it is used to do the translation.
There's something very interesting about EPT. As you can see, the EPT is what eventually gives you the value from memory, right? It points to the eventual physical location where the value is. The newer virtualization extensions, for example on Intel and ARM, have a way to store multiple EPT pointers per VM. That is very powerful because, first, for each page table there are permissions, right? So you can assign different permissions to it.
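To make the 10/10/12 walk described above concrete, here is a minimal sketch in Python. It is a toy model, not LibVMI or Xen code: the dictionaries standing in for the page directory and page tables, and the names `split_virtual_address` and `translate`, are assumptions made up for illustration.

```python
def split_virtual_address(vaddr):
    """Split a 32-bit virtual address into its three fields (10 + 10 + 12 bits)."""
    dir_index = (vaddr >> 22) & 0x3FF    # top 10 bits: index into the page directory
    table_index = (vaddr >> 12) & 0x3FF  # next 10 bits: index into the page table
    offset = vaddr & 0xFFF               # low 12 bits: offset inside the 4 KiB page
    return dir_index, table_index, offset


def translate(vaddr, page_directory, page_tables):
    """Walk a toy two-level page table and return the physical address.

    page_directory: dict mapping directory index -> page table id (hypothetical)
    page_tables:    dict mapping page table id -> {table index -> frame base}
    """
    dir_index, table_index, offset = split_virtual_address(vaddr)
    table_id = page_directory[dir_index]             # step 1: directory lookup
    frame_base = page_tables[table_id][table_index]  # step 2: table lookup
    return frame_base + offset                       # step 3: add the page offset


# Toy data: one directory entry pointing at one page table with one mapping.
page_directory = {0x300: "pt0"}
page_tables = {"pt0": {0x005: 0x00042000}}

vaddr = (0x300 << 22) | (0x005 << 12) | 0xABC
paddr = translate(vaddr, page_directory, page_tables)
print(hex(vaddr), "->", hex(paddr))
```

With a hypervisor in the picture the same kind of walk happens a second time, from VM physical to machine physical, through the second-level tables.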
That way, for example, if you assign execute permission to one page and read/write to another page, you can play with those permissions. I'm going to show you some of the use cases later in the talk, but for now I think it's very critical to what we have done in our implementation.
The second interesting fact about EPT is that, as I said, you can store more EPT pointers, and apart from permissions, you can actually point an EPT to a different memory location. What that means is that if you have to translate one virtual address, you can play with that last level of translation, so at one time you are giving one value and at another time you're giving another value. So basically, using the same code, you are creating two different behaviors, which is very powerful. There are many use cases for it, and you can see the code snippet in Xen where it stores the pointers and the permissions. Yeah, that's the one I was talking about: the second EPT is actually pointing to another memory location.
I'll also talk about some of the implementation details. It looks easy, but it's not that easy. For example, you have to create all of the pages in memory yourself. The Xen hypervisor does have APIs for that, but they're not very well documented, so you have to play with them. For example, increase reservation is one that you can use to create pages. Once you create the pages, you have to maintain all the mappings. But Xen provides functionality to switch between different pointers.
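The idea of several views over the same guest page, each with its own target frame and permissions, can be modeled in a few lines of Python. This is only a toy simulation under stated assumptions: `MemoryView`, `ToyHypervisor`, and `read_like_malware` are hypothetical names, and nothing here calls the real Xen or LibVMI APIs.

```python
class MemoryView:
    """One view: maps a guest frame number to (machine frame, permissions)."""

    def __init__(self, name):
        self.name = name
        self.entries = {}

    def map_page(self, gfn, mfn, perms):
        self.entries[gfn] = (mfn, perms)


class ToyHypervisor:
    """Holds machine memory plus several views; exactly one view is active."""

    def __init__(self):
        self.machine_frames = {}  # machine frame number -> page content
        self.views = {}
        self.active = None

    def add_view(self, view):
        self.views[view.name] = view
        if self.active is None:
            self.active = view    # the first view added becomes the default

    def switch_view(self, name):
        self.active = self.views[name]

    def guest_read(self, gfn):
        mfn, perms = self.active.entries[gfn]
        if "r" not in perms:
            raise PermissionError(f"read of gfn {gfn:#x} denied in {self.active.name!r}")
        return self.machine_frames[mfn]


hv = ToyHypervisor()
hv.machine_frames[0x100] = "original code"
hv.machine_frames[0x200] = "code with hook"

hooked = MemoryView("hooked")
hooked.map_page(0x10, 0x200, "x")   # execute-only copy that carries the hook
clean = MemoryView("clean")
clean.map_page(0x10, 0x100, "rw")   # readable copy without any hook
hv.add_view(hooked)
hv.add_view(clean)


def read_like_malware(hv, gfn):
    """A read in the execute-only view faults; the handler serves the clean view."""
    try:
        return hv.guest_read(gfn)
    except PermissionError:
        hv.switch_view("clean")
        value = hv.guest_read(gfn)
        hv.switch_view("hooked")    # go back to the monitored view afterwards
        return value


print(read_like_malware(hv, 0x10))
```

The same guest address yields different content depending on which view is active, which is the "same code, two behaviors" property described above.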
Another very important aspect of this is the switching between different EPTs, or you can call them memory views, because essentially what you are doing is creating different views of the memory. You can present one view to one user and another view to another user, and the user can be a malware too. The best part has to do with the VM exit, which is a very expensive operation. What a VM exit means is that you're running your domain on top of the hypervisor and suddenly control has to be transferred to the hypervisor. That means the hypervisor has to store the context of all the virtual CPUs inside that domain and then take control. But when switching between these EPTs, there's no VM exit. So it's very fast, and I will explain later why performance is critical to anything you're doing, especially building a detection system using VMI.
So, okay guys, once we understood the basics, we started playing with an environment, right? We needed a board to play with this whole implementation, and in our case we didn't have a specific board. So we went to LinkedIn, we used Google, and we found Xilinx, you know, the FPGA creators. It turns out they have a pretty cool board, the MPSoC ZCU102. The problem was that the board is $2,500, just for a starter kit. So we talked to those guys and told them, hey, we have this project in mind. Kudos to Xilinx: they just shipped the board to our house, and then we started playing with the whole environment. So this is the board, with pretty standard inputs and outputs. We have an SD card for the root FS.
We have Ethernet, UART, JTAG, and we have the quad-core A53s, which are the ARM CPUs, plus a dual-core R5F. So this is the board. It is very expensive, but don't worry, what we are presenting can be done on any other board.
The first thing we learned, guys: well, I am a reverser, an exploit-related guy, and I had never played with this kind of board. It was very challenging because we are running the Xen hypervisor on it, and there is no implementation on this board with VMI introspection. So it was kind of hard for us because there is no support. We learned that there is the PetaLinux project, which basically helps you deploy everything onto the board so that you can boot it up. It's pretty cool. The only problem is that it works with specific images, and we have our own custom Xen hypervisor deployment, so that was a problem for us. That's the first thing. The second thing is that PetaLinux is for Xilinx, right? What if tomorrow we want to deploy the same environment on Renesas, NXP, or others? We don't want to be tied to Xilinx-related stuff.
So then we went to Yocto. Many of you guys know Yocto very well. In my case, I didn't have any idea. It turned out pretty cool. The only problem was that the rootfs at the end was BusyBox-based. That was a pain in the ass for us because, once we have everything up and running, we really need Python libraries to run our machine learning stuff, and it was a pain to compile a single library, like TensorFlow, for example. So it was pretty cool and pretty easy to run, but at the end of the day a BusyBox rootfs was a pain for us. So we ended up choosing debootstrap. Debootstrap allows you to have a rootfs, in this case an ARM64 Debian-based flavor. So you have apt-get and you can download all the libraries.
So that was the way for us to go. Very recommended over the previous ones.
Now the dev environment, guys. You don't want to push everything onto the board, right? Because the board is like the production environment. You really want to have a test environment. In our case, we picked schroot. It's a wrapper for chroot. It's pretty cool because we have our Ubuntu Intel x64 environment, as you can see in the picture at the bottom. From this Intel-based environment, via schroot and using QEMU, you can chroot into the ARM environment and do your testing, install libraries, connect to the internet, everything as if you were on the board. Once you have tested everything, you can jump onto the board. It's pretty convenient for us.
Now let's boot the board, right? What do we need to boot the board? Xilinx has a specific debugger, the Xilinx System Debugger client. What it basically does is read a Tcl file. Just so you guys have an idea, that Tcl file loads four different files. The PMUFW, which sets up the clock and the platform management on the board. Then the first stage boot loader, which is going to initialize U-Boot. Then U-Boot, which is going to allow us to boot the hypervisor, in this case Xen, then the Linux kernel, and finally the root FS. And the BL31, which is ARM Trusted Firmware. We didn't play with these components; we just used them with the Xilinx versions. It's important to mention that if you mix versions, the board will never boot. We had a lot of pain trying different versions, and it was not booting.
Once you reach this initial state over JTAG, you get into the U-Boot prompt, and now let's boot the board, right? The first thing is the device tree blob. That's where all the configuration for the board is located. So the first thing is the root location, right?
Where is the root FS that you want to boot from? In our case, it's an SD card partition, as you can see there. The second thing is our Xen hypervisor. We just convert it into U-Boot format with that command line, and then you start booting up the system. These are the typical U-Boot commands. First you load the Xen DTB, which is the device tree blob, at a specific location. Then you load the Linux kernel. You see the 80,000 address for the Linux kernel: that must be exactly the same address that you have in the DTB. Otherwise, when it is booting up, it's not going to find the Linux kernel, and therefore it's not going to boot. Then you load the Xen hypervisor, and finally you run bootm, which is basically saying, okay, boot the board at the address 140,000, which is the Xen hypervisor. The hyphen in the middle says that the root FS path should be taken from the DTB, and finally you have the Xen DTB. With this, we can boot into the board.
These specific steps, guys, took us many days and months, because there were a lot of issues booting the Xen hypervisor. Another important point here is that if you don't leave enough space between those load addresses that you see there, they are going to overlap, and then your board is going to crash. So we were just booting up and suddenly it was crashing, and it turns out that you need enough space for every single thing you load, but there is no validation of it. So you need to make sure that wherever the Xen hypervisor is loaded in memory, and the Linux kernel, each has enough space, so that they don't overlap and you don't get a kernel panic or a Xen hypervisor problem.
So let's get the damn syscalls out, huh? It almost became a dream for us to see the ARM syscalls on the screen because, you know, we'd been going back and forth. We tried different things.
We always got stuck because, as I said, it's not very well documented, and there are only a few research papers out there. So we were just trying different things, playing with the memory, and finally we figured out our own implementation.
As you guys remember, I was mentioning single stepping, and this is why single stepping is very important and why we're going to use the non-CPU-assisted single stepping. The reason is that we were using the Xen hypervisor, and the Xen hypervisor does not support single stepping on ARM. So we had to figure out our own way of doing what normally the CPU does. We found a very interesting paper online. We talked to that guy also, Sergey from Germany. Then we reviewed some other techniques online, and we finally decided to use this technique. It's a very fascinating one.
Remember, EPT helps you create different views of the memory, right? So the first step is to create two views of the memory. By views, I mean you take the whole memory, take all the pages, and create a copy of those pages, so you have these two copies in memory. Then, since the Xen hypervisor supports multiple EPTs, you point one EPT at each. Let's call one the default view and the other the single stepping view. So you have two EPTs now. The default view is the one you run in by default when the system starts. That is the view where the control will be and where execution will happen. And then you have another view waiting in memory, if you want to switch to it. So we have these two views, and remember, I'm now talking in the context of the syscalls. The goal is to start monitoring the operating system's system calls on an ARM platform, right?
So think of it like this: you want to monitor a particular system call, and its code is in memory somewhere. You find the location of that API in memory, and at the first instruction you put one breakpoint. On ARM, the breakpoint is this 0xD4000003. It's also called SMC. It's a special instruction. On Intel it's INT 3, so you can use the same technique on Intel too. But you don't have to use this one on Intel, because Xen actually supports single stepping on Intel, so you can just use that.
So on the first instruction, you put one breakpoint in the default view. And on the second instruction, you put another breakpoint, but in the single stepping view. Now what will happen? The control is in the default view. Once the first breakpoint hits, you do your analysis: you want to get the syscall, you note down all the parameters, what process called the syscall, whatever analysis you want to do, right? Then you switch the view. The control goes to instruction one in the second view. Instruction one gets executed, and then the second breakpoint hits. When the second breakpoint hits, you get control again and switch back to the default view. And then the second instruction gets executed. So basically that's how you single step on ARM if you're using the Xen hypervisor. Not many hypervisors support ARM, so Xen is probably your best choice, unless you want to build your own hypervisor from scratch; that is an option too.
So this technique is very fascinating. And as I was saying, this is one of the applications of the views that the underlying architecture provides through the virtualization extensions. So that's how we single step on ARM in our implementation.
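The two-view trick is easier to follow when traced step by step. The sketch below simulates it in plain Python under simplifying assumptions: instructions are strings, the string `"smc"` stands in for the 0xD4000003 breakpoint, a "view" is just a copy of the instruction list, and `run_with_two_views` is a hypothetical helper name. It does not touch a real hypervisor.

```python
BREAKPOINT = "smc"  # stand-in for the SMC opcode used as the breakpoint


def run_with_two_views(original, hook_index):
    """Simulate single stepping over a hooked instruction using two memory views.

    original:   list of instruction names (the unmodified code)
    hook_index: index of the instruction to monitor; it must not be the last
                instruction, because the step view needs hook_index + 1
    """
    default_view = list(original)
    step_view = list(original)
    default_view[hook_index] = BREAKPOINT   # breakpoint 1: default view
    step_view[hook_index + 1] = BREAKPOINT  # breakpoint 2: single stepping view

    views = {"default": default_view, "step": step_view}
    active = "default"
    pc = 0
    executed, events = [], []

    while pc < len(original):
        instr = views[active][pc]
        if instr == BREAKPOINT:
            if active == "default":
                events.append(("hook hit", pc))   # analysis happens here
                active = "step"                   # original instruction is visible here
            else:
                events.append(("step done", pc))
                active = "default"                # hook is armed again
            continue  # re-fetch the same pc from the other view
        executed.append(instr)
        pc += 1
    return executed, events


code = ["stp", "mov", "bl", "ret"]
executed, events = run_with_two_views(code, hook_index=1)
print(executed)
print(events)
```

Every original instruction runs exactly once and in order, while the monitor gets control twice: once to analyze the call and once to re-arm the hook.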
So let's take an overall look: as a detection or monitoring system using VMI, how do you monitor from the outside, using all the concepts that we have shared so far?
First of all, you put a hook on every function or API in memory that you really want to analyze. For example, you might always want to analyze sleep; malware uses sleep a lot. So the first step is to put in that breakpoint: go to that memory location and write that instruction.
In combination with that, you have to register an event. A good hypervisor always provides an event mechanism, right? If you want to monitor something, it gives you an event mechanism, and Xen does the same. So you register your event and you also register your callback. That's the second most important step, because with the callback function that you register, you're telling the hypervisor: hey, any time my breakpoint hits, give me control and execute this function. In this function you do everything. For example, once the breakpoint hits, you extract the parameter values of the function, the function name, the process name, whatever you want to do. The single stepping functionality that I showed you in the previous slide also goes in this function.
The third step is, obviously, that you need to single step. If you don't single step, what will happen? The breakpoint hits, the control stays there, it continues to execute the same instruction, and the system gets into a very unstable state. Imagine you're protecting a car infotainment system and then you crash it. So it's very important to make sure you single step properly.
And finally, once you're done monitoring, make sure you remove all your hooks from memory and do a clean exit. So that's basically how you do syscall monitoring through VMI. On ARM it was especially challenging because, as I mentioned, there was no single stepping and there was no documentation. It took us some time, but we're very happy that we finally achieved this.
So let me now show you what our moment of happiness looked like when we first saw this. This is our Xilinx board. We have our introspection framework; we call it the Newman introspection framework. And this is how the syscalls look, with the process name and everything. Now, see, we're typing, and you see all the syscalls being logged on the upper right. Now I type top, and you can see the process name top being tracked. Everything is happening from raw memory, outside the operating system. So that's pretty cool. And finally we do a sleep, and you're going to see the sleep being shown. And not only shown: because we have other components, it's also being monitored, as highlighted here. So not only do we get these syscalls out, we also send them to our machine learning model to monitor them. It took us quite some time to achieve that one.
Okay, guys, so let's get to the fun part. After this whole effort, as you can see, we now know the basics. We know how to use VMI on ARM to introspect into the machine, and we have the board and everything, right? Now let's see what we can do from an attack perspective, and for detection, with these components on the board. So the first approach is a typical malware coming into a car, into a medical device, right?
How can we tackle that problem from VMI? If you have been working at antivirus or sandbox-related companies, the typical way to do it is very common. But keep in mind, guys, that here we are out of the box. We don't have an agent inside, and that's a totally different beast.
So let's see the first example. In this example, we have a malware running. What we do in this specific scenario is use machine learning to detect attacks. We take, let's say, the infotainment system from the car and we profile it. We get all the processes running and feed them into our autoencoder, which is a neural-network-based approach. It's going to learn what a healthy system looks like: all the processes, all the different syscalls being executed. Once it has learned that, when a process is running in the infotainment system, it grabs that profile, which is on the left side, the actual profile, and that is fed into the autoencoder. The autoencoder then creates a new profile based on its learning experience. If the error between the reconstructed profile and the actual profile that was running on the device is really low, which means they are very similar, we know we are dealing with a benign process.
But what happens when we have a malicious process? The same thing happens. We have this malicious process, we profile it, we create the actual profile and feed it into the autoencoder. The autoencoder creates its own reconstruction of the profile, and it's going to realize that the sequence of syscalls being used is totally off. The error level is really, really high. That way it's easy for us, via machine learning, to detect that there is an anomaly inside the device. That's pretty simple. Step one.
Second approach, guys. Let's say that we have an exploit going on, right?
On the left side, you can see an application running, with the list of all the syscalls going on there. That is how it always executes. But what happens if suddenly that application is compromised? If you look on the right side, at the end, in the red box, and compare the two runs, the benign one, without being exploited, is going to exit the program. But the second one is going to call execve. Why execve? Obviously because it's gaining a shell. So in this case, we don't really need machine learning to realize that the flow has been affected, right? We call it S-H-I-T. At the end of the day, you just need to compare the sequences and see if there is any alteration. In this case, one exits properly and the second one calls execve. So it's easy to detect.
Now let's talk about the delay. You guys are familiar with antivirus and sandboxes, right? One of the techniques of the APTs, for example, is to delay execution: the malware starts and then waits. An antivirus needs to take a decision in milliseconds on whether something is malicious or not, right? It cannot be scanning the process all the time. So by delaying execution, those engines are going to give up and stop monitoring it, and that's a way to bypass them. That scenario is always a pain in the enterprise with antivirus. In our case it's the same issue, but we tackle it differently. For example, we have this pretty simple delay on ARM. It's just a simple loop that is going to delay execution for 8 seconds, 8 minutes, and after that it's going to trigger the shell. If you remember the previous slide, there we had the whole execution and at the end the execve being called. In this case, the execve is never going to be called, and therefore we're not going to see any anomaly.
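The sequence comparison for the exploit case needs very little code. Here is a minimal sketch, assuming syscall traces are available as lists of names; the traces and the function name `first_divergence` are made up for illustration and are not taken from the slides.

```python
def first_divergence(baseline, observed):
    """Return (index, expected, actual) for the first difference, or None if equal.

    A missing entry on either side is reported as None, so a trace that simply
    stops early (for example, a payload that is still delaying) is visible too.
    """
    for i, (expected, actual) in enumerate(zip(baseline, observed)):
        if expected != actual:
            return i, expected, actual
    if len(baseline) != len(observed):
        i = min(len(baseline), len(observed))
        expected = baseline[i] if i < len(baseline) else None
        actual = observed[i] if i < len(observed) else None
        return i, expected, actual
    return None


# Hypothetical traces: the benign run ends with exit, the exploited run with execve.
baseline = ["open", "read", "write", "close", "exit"]
exploited = ["open", "read", "write", "close", "execve"]
delayed = ["open", "read", "write", "close"]  # payload still spinning in its loop

print(first_divergence(baseline, exploited))
print(first_divergence(baseline, delayed))
```

The exploited trace is flagged at the point where `execve` replaces `exit`, while the delayed trace contains no suspicious call at all, only a missing tail, which is why a delay loop defeats a purely sequence-based check.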
So that's quite a challenge. What people do, for example in sandboxes, is hook sleep, nanosleep, or similar syscalls. Once they see the syscall trigger, let's say a sleep for 5 minutes, they change the 5 minutes to 0 at runtime to force the malware to execute. That is a cat-and-mouse game, because today it's sleep, tomorrow it's nanosleep, and there are going to be new techniques all the time, so you need to keep updating your hooks so that there is no way to bypass them. In this case, for example, we don't even use a syscall. It's just code that delays execution, so you cannot rely on hooking syscalls. This is a challenge for everyone.

In our case we tackle it in a different way. You don't want to monitor these executions all the time, right? We are in a car, and we don't want to delay execution. Can you imagine the GPS slowing down, or any application in the car? So we need to decide quickly. But if the application is running and it's sleeping, we cannot wait forever. So we have a way to tag the process: when the process starts running and suddenly goes to sleep, we stop monitoring, and we watch it in a way that lets us know when the malware wakes up. There is a way to do it with a profile-based approach that Ali is going to explain in a second. From the VMI perspective it's the same challenge, but we don't want to chase syscall implementations in order to understand when there is a delay. In our case we tag the process: you are sleeping, we sleep, so we don't cause any performance issue, but as soon as you wake up, we wake up. That is a profile-based policy that we are going to explain in a second.
Ali was explaining his technique for hooking into memory. He was telling us that you need to put a D4000003 in as an SMC hook. That's pretty cool, but it's pretty easy to bypass. I mean, if I have a kernel module, I can just go to the syscall, grab the syscall code, and find the SMC there. As malware, I'm going to be able to detect right away that I'm being monitored, and at that moment I'm going to say, you know what, man, I see this SMC hook, I'm being monitored, so I'm not going to execute anything, or I want to wipe the system. Or, even worse, if the memory page allows me, I can overwrite that SMC hook to put back the original state and then totally bypass the system. Then this whole technique shown by Ali is totally screwed, because it's pretty easy to identify. So I don't know how Ali came up with this technique, but anyways.

Okay, so let me think how we can answer Dan. You remember the views, right? I keep saying the views are very powerful, so here's another use case, and we'll see how we can answer Dan's question. In this case, again, we are creating two views. Actually, you don't need to duplicate all the memory pages. What we are essentially doing is assigning different permissions in different EPTs. So we have two EPTs: one is execute-only, the other is read-write only. Now what will happen? Say by default we are in the execute-only view, and the malware comes and tries to read the hooks that we have placed in memory. Since the page is execute-only, there will be an exception, control will be transferred to me, and I will see that the malware is trying to be smart. So what I will do is switch to the other view, which is the clean view, by the way. There are no hooks in this view, so that's how it works. So I have this view where there is no hook. I will
switch the view. I will let the malware read the memory pages in this view, and its read request will be satisfied. The malware will not know that there are any hooks or that it's being monitored. Once that read request is done, we switch back to the execute view. So you saw the previous application, where we created these views to do the single stepping. Now you saw another one, where we use the same capability that Intel and ARM provide to defeat malware that is trying to detect us.

Okay guys, so let's see some other examples. In this case we have anti-VMM. Hypervisors are also known as virtual machine monitors; it's another name. So here, again, I'm malware, right? I want to know if I'm being monitored, and if I am, I'm going to stop executing. Since we are switching views, as we explained, that has a cost in terms of time and performance. So what malware can do, and we just did a proof of concept here, is measure a baseline without the page-view switching. In this case the time is 38 nanoseconds. Then we measure the time it takes to write into the memory, and you will see in red that the difference is highly significant. The time to write into memory is totally different from the baseline. This is another technique that malware can use to detect that it is being monitored: just by measuring the time it takes to write into memory, it can realize the write is taking too long compared to the baseline.

Now let's talk about process killing, guys. When we kill a process, we all know how it works, right? If you want to implement it in code, you just need to call the kill syscall, and you pass the PID and
the way you want to kill it, right? -9 if you want to force it. That's the way it normally works in a user-mode scenario, and we all know it. But what happens if we want to implement a simple process kill in VMI? Keep in mind that we don't have any agent inside. What people or other companies do is just drop an agent inside, call kill in user mode, and the process is killed. But in our case, guys, we don't have an agent inside and we don't want one. We want to stay out of the box. So how can a simple process kill be done from VMI? We have a way to do it here. It doesn't need to be the perfect solution, but the idea is that you guys can see the challenge when you want to do something as easy as killing a process, but from the hypervisor.

Let's say that we want to kill process ID 300. We do it in the kernel, because this whole VMI, guys, is being executed in the kernel. We are not using user-mode syscalls, because that is too costly in performance, so we are just hooking in the kernel. When process 300 is running, what we do from VMI is monitor all the syscalls. When a syscall comes in from that process, we mess with the stack. The information that you see in green is the full stack dumped from one syscall. There is something called the saved user-state registers: when you transition from user mode to kernel mode, the user-mode registers are stored in the kernel, so that when the kernel resumes execution in user mode those registers are used again, right? So what we do in the kernel is get that stack content, find those saved registers, which are not being used by the kernel, and just nullify them. What's going to happen, guys, is that when the syscall returns to user mode, since those registers are totally nullified, it is going to
cause an exception and exit the process. As you can see, it's a totally different way. It's not the traditional one, but by taking advantage of VMI we are able to mess with the stack and force the process to die. It's important to mention that it's not easy to do. In this example you see the yellow box: if you mess with any kernel-related information, which is in the lower addresses, you're going to get a crash on the device. So you want to make sure that you are really overwriting only the user-mode register information. We learned that from experience, and then we came up with this specific offset, calculated dynamically, so that we always overwrite only user-mode registers. The way it works, guys, is that every time we send the device a kill for PID 300, it grabs all the syscalls, and every time it grabs one it messes with the stack. Sometimes the process gets the exception and dies on the first one, sometimes on the second or the third, and sometimes it takes 5 or 7 syscalls before it is killed. But this simple example, guys, as I said, is a totally different beast. As you can see, it's not easy to do at the VMI level if you want to be totally agentless.

So let's see an example of how we kill it on the device. Here we have our system. In the upper left we're going to run a simple piece of malware, Mirai, which is a very common one. In the upper right we have our inspector monitoring it. Here is Mirai; we execute it. As soon as we execute it, Mirai deletes itself and spawns a BusyBox process to pretend to be normal, and it has PID 336. So let's go back to the virtual machine that we're monitoring. We search for process 336 and we can see that it is there, running, right? 336 is there. We click on it and we can see BusyBox showing up. There is BusyBox. Now let's try to kill it from VMI. We go to our front end and look for process 336. We find it there and
then we click on the kill button. That sends a signal from VMI to kill it. Then we go back to the VM and search for Mirai, which is gone already; it disappears, as we said. Now let's search for 336, and you can see that it's totally killed. This is totally clean. We don't crash the system. This way of doing it is not the best way, but it works for us, and that way we are able to kill from the VMI perspective.

So I just want to quickly mention that once you get hold of VMI and understand it, it's pretty cool, and you can do a lot of other things. It's very powerful. One of the things we did in addition to detection is policy. Not only can you do detection with VMI, you can also implement policies, again without putting anything inside the operating system. One example that I want to quickly share with you: in a lot of infotainment systems, or any IoT device, for example a medical device, there are certain processes, very limited, which are responsible for going out to the internet. So what you can do is take the task list that I showed you and continuously monitor it from the outside. Since a socket is a special file, you can simply see if another socket is being opened. That might be slow, because you're actively traversing. Alternatively, you can hook connect or another network API, and then whenever that API gets called you check whether this process is allowed to communicate outside or not. So here's just a quick view of how it looks in our system.

Like Dan was mentioning, remediation is another aspect that we are tackling. Once we detect something, we need to kill it. It's not easy to kill from outside. We always have the option to put an agent inside, but we don't want to do that; we want to be totally out of the box. So you want to try different things. I mean, the method that we showed you, we figured that out on our own, but we have another
one that we're exploring right now. So you can try different things. You can make the parameters null and try some of those things.

I quickly want to share some recommendations with you guys for an end-to-end system. Let's say you're working with some other hypervisor. In our case, Xen is just the start; we are already working with another one, and that's a custom one. You really need a good way of placing breakpoints, an efficient single-stepping mechanism, and an event mechanism. We didn't talk too much about the event mechanism, because Xen provides a really good one, but if you're working with some other hypervisor you want to make sure you have a good event mechanism. Anything that you translate using the page tables or the page walk, you want to make sure you cache in an efficient hash. Multiple views are awesome, as you can see. We have a few other use cases which are patent pending, so we cannot talk about them; we combine the views with other techniques to really improve performance, and in a VMI monitoring system performance is the key. And finally, permission management.

Finally, guys, we are releasing some tools for you to play with. You can see the Dropbox there; you can go and check them out. We have the ARM and Intel VMI monitoring tools that you can play with, we have all the files that we used to boot up the Xilinx board, and finally we have an ARM64-based malware sample that you can use to run an end-to-end scenario. So do check that out and play with it.

Okay, so finally, the takeaways. In today's world of advanced malware, we really need to make the hypervisor smart. The hypervisor is everywhere, so you want to make sure you make it smart. And agentless is the way to go, because we know that any time you have an agent inside, it's a losing game. ARM
syscall monitoring was obviously a great achievement for us, but it's just the start. There's a lot more we can do now, so don't just stop at syscall monitoring. Think of new use cases. Switching between views especially is very powerful, so you can think of new use cases around that too. And finally, performance is the key. We have a patent pending on Newman adaptive monitoring, and we use that in our system. We're not going to talk about it right now; we need another talk for that.

Okay, finally, I really want to thank Sake, Matt, Stefano, and Waleed. I don't know if you guys are here, but these guys really helped us along our journey, and what we have achieved wouldn't have been possible without them. So thank you, guys. Thank you.