Thank you guys, this Congress has been a real experience. This was my first one, so it's great to be here as a presenter for the very first time too. We are both researchers from TU Munich, doing our PhDs, and this is the topic we have been looking at for the last three years. Briefly, what this talk is going to cover: we'll tell you what our motivation is, why we care about this technology, and what this technology actually is. We will talk about Xen and show you demos, but the three concepts we will really focus on in this talk are isolation, interpretation, and interposition. After we cover the basics, we will look at cloud security, how this technology applies to it, and discuss open problems. That will lead the discussion into kernel code and code integrity, which is what Tom is going to talk to you about, and then we'll have some conclusions. Our motivation has really been malware: malware collection, malware analysis, and using virtualization and virtual machines to do that. We have also been looking at intrusion detection and intrusion prevention. I was actually part of a DARPA Cyber Fast Track project last year that looked at that for seven months, and we created a prototype that we will show today. The technology is also applicable to debugging, and more importantly stealthy debugging, using virtual machines. And of course cloud security is a big part of it, as the entire cloud is based on virtual machines, but the upcoming areas are mobile security and using the ARM processors in your cell phones to provide some sort of protection against malware. What is not our motivation is DRM, espionage and stealth rootkits; this technology is also very applicable to those, but we are really interested in the defense side. So you have heard a lot about hacking; this talk is not about hacking, it's about protection.
So when you have virtual machines and you need to tell what's happening in a virtual machine, or even control it, the common approach today is that you install something in it, right? You have VirtualBox, you install the VirtualBox tools; you have VMware, you install the VMware tools; you have Xen, you install the Xen tools. That is very easy to implement and convenient; it can sometimes use shared memory, or most simply just use the network to communicate with your server outside. You can also use network monitors, Snort or whatever IDS you have, which is better than in-guest agents, which really have no isolation, right? You're running inside the virtual machine. Network monitors have isolation in the sense that they are outside of the machine, so they are more protected against attacks from within the same VM, but at the same time you lose context: you look at the network and see very limited information about what is actually happening within that virtual machine, especially if the traffic is encrypted. There are some steps toward using live forensics on virtual machines, or even on physical machines, where you can just scan the memory, which has isolation and context, but you really just have a passive view into the system. So all of these are valid approaches, but they all have their limitations. And this is where VMI really comes into play. The basic idea behind VMI is that you look at the virtual hardware of the system, and just from that you try to understand and reconstruct what is happening within. In that sense it's very similar to deep packet inspection, where you're looking at packets and trying to reconstruct what's happening in the operating system; you need some understanding of which operating system is running within that virtual machine, or, if you're doing deep packet inspection, which operating system segmented the packets, to be able to reconstruct it.
For VMI, the three points that we really want are isolation, interpretation, and interposition. By isolation I mean that you have some increased resiliency against attacks. You also have a complete view of the system: you have access to everything that that system is doing. And since you are at a more privileged level of the system, you can even interpose yourself into the execution of that machine. So we're going to look at these three things. First, isolation. This is why we are really moving things out of the virtual machine and not having in-guest agents: if you run the code outside, you can avoid in-guest hooks, which are the most common attack vector on anything that you run within that machine. That makes tampering with it harder, because you have isolation provided by the hypervisor. Of course, that depends on how good your hypervisor is and how hard it is to break out of the virtual machine, but provided your hypervisor is secure and your hardware is good, you have increased trust that the code is going to do what you want, because it's in a protected region. By doing this you also gain some performance, because instead of having to deploy anti-virus software in every virtual machine that you want to protect, you can have one anti-virus that protects all of your virtual machines. In that sense you can avoid things like the anti-virus storm, where all of your anti-virus scans kick in at the same time on your virtual machines and just kill the hardware. Interpretation is a very critical thing, because you really want to understand what's happening within the machine. With VMI we have a very heavy focus on memory, because memory is really the common point of all the hardware that the system is using. But we of course also have access to the CPU registers, the disk and the network, so those can come in handy as well. We're really going to be focusing on memory, though.
The reconstruction of the state, though, even for memory, is really hard, and there are many problems with it, mostly because of complexity. You look into a virtual machine and it probably has Linux or Windows running, and that's a large piece of software. Trying to understand what that large piece of software is doing just by looking at the virtual hardware is hard, and in the case of Windows you don't really have access to the source code, so you have to reverse engineer a lot of things. And even though your code is running outside of the virtual machine, the data that code is interacting with is still potentially tampered with. So we have a dilemma of what data can be trusted in the machine. But let's start at the beginning. If you're looking at memory, what you see with virtual machine introspection is physical memory. But of course that's not what the operating system is using: it's using paging and virtual memory, which has been around for a long time. The basic idea is that the operating system sets up the page tables and the hardware walks these page tables when it needs to translate a virtual to a physical address. So it's a nice interface between hardware and software, where software provides the data and the hardware actually walks it. The problem with this very basic thing already is that we have a bunch of different paging systems: 32-bit paging, two extensions to that, and then 64-bit paging. So now we have to reconstruct all of these paging mechanisms and emulate what the hardware would do. That part is fine; it's defined in the Intel manual, so it should be straightforward. Except there are three bits that the Intel manual says are up to the operating system to decide how to use.
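To make the software page table walk concrete, here is a minimal sketch of the 32-bit non-PAE translation that VMI tools re-implement in software. The dict standing in for physical memory, the addresses, and the helper names are all illustrative, not taken from any real tool.

```python
# Minimal sketch of a 32-bit (non-PAE) page table walk: the hypervisor-side
# code reads page table entries out of guest physical memory and emulates
# what the MMU would do. `phys` is a dict simulating 4-byte physical reads.

def read_phys_u32(phys, addr):
    return phys.get(addr, 0)

def translate_v2p(phys, cr3, vaddr):
    pde_addr = (cr3 & 0xFFFFF000) + ((vaddr >> 22) & 0x3FF) * 4
    pde = read_phys_u32(phys, pde_addr)
    if not pde & 1:                      # present bit clear: no mapping
        return None
    if pde & (1 << 7):                   # PS bit: 4 MB large page
        return (pde & 0xFFC00000) | (vaddr & 0x3FFFFF)
    pte_addr = (pde & 0xFFFFF000) + ((vaddr >> 12) & 0x3FF) * 4
    pte = read_phys_u32(phys, pte_addr)
    if not pte & 1:
        return None
    return (pte & 0xFFFFF000) | (vaddr & 0xFFF)

# Fake physical memory: page directory at 0x1000, one page table at 0x2000,
# one PTE mapping virtual 0x00400000.. to physical page 0x5000.
phys = {0x1000 + 1 * 4: 0x2000 | 1,      # PDE for vaddr[31:22] == 1
        0x2000 + 0 * 4: 0x5000 | 1}     # PTE for vaddr[21:12] == 0
```

The PAE and 64-bit walks follow the same pattern with more levels; the point is that every mode has to be re-implemented faithfully on the introspection side.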
And Windows does actually use these bits, at least two of them, which means there is a difference, depending on which operating system is actually running, in how these page tables look and what we can get from memory. So for example, Volatility has this line, when it does a translation, that looks at these software-defined bits in the page table entries and says that if the 11th bit is set but the 10th bit isn't, then that page is present in memory. And it does that for every translation, even if the guest is actually Linux, which is of course incorrect. Now this is not a big problem in itself, because, okay, we need to recognize that we are looking at Linux and not do that, but it already shows that accurate reconstruction is complex and you can easily miss things. And this is still the case with Volatility. There are of course other problems with memory: if memory is paged out, you don't have access to it directly. We do have access to the disk, but then we have to find the swap file and reconstruct it, which is again totally dependent on the operating system and how it's implemented, which just adds more complexity on top. With Xen, fortunately, we can avoid some of that complexity: for example, we can now inject page faults from the hypervisor to have the in-guest operating system bring those memory pages back from disk, so we don't have to understand the file system and the swap file. Of course that takes time, because the virtual machine has to execute to bring the page back, and it's only going to be available in the next release of Xen. But it's progress. Another problem with paging is that you need to find the page tables, and the way forensics tools find the page tables in memory is really just by having a signature that they scan for.
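The OS-defined bit check mentioned above can be sketched like this. This is a simplified illustration based only on what was just described (bit 11 set, bit 10 clear on Windows); the real Volatility code handles more cases.

```python
# Illustrative check of the OS-defined PTE bits discussed above: on Windows,
# a non-present PTE with bit 11 set and bit 10 clear still refers to a
# resolvable page. Applying this rule unconditionally to a Linux guest is
# exactly the bug described in the talk.

def pte_resolvable(pte, guest_os="windows"):
    if pte & 1:                       # hardware-defined present bit
        return True
    if guest_os == "windows":
        bit11 = bool(pte & (1 << 11))
        bit10 = bool(pte & (1 << 10))
        return bit11 and not bit10    # Windows-specific software bits
    return False                      # Linux: non-present means non-present
```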
So in this code segment here we have Volatility again, which defines those four bytes as the signature for scanning for page tables, but that's actually just the signature for a process; in fact, it's just part of a header for a process in memory, and of course that's not very robust. With VMI we can do a little better: we have access to the registers, so we can just read out the CR3 register from the virtual machine and use that for translation. So VMI has advantages over raw memory forensics. After we are able to translate virtual to physical addresses, we of course need to understand the kernel and reconstruct what the kernel sees and does, which really requires debug information. Microsoft gives that debug information away for free, but of course the format is proprietary. It's the PDB format, which fortunately has been slowly reverse engineered over the years, but it's a pain to work with. Recently, though, Rekall, which is a fork of Volatility by Google, added really nice support for it, where you take this debug information from Microsoft and just dump it into JSON files. So with Microsoft, between those tools, this is actually quite nice and very workable. With Linux this is a bit more problematic, because we have a ton of different kernels. Even if you're not taking custom compiled kernels into account, even if you just have stock Ubuntu kernels, there is really no cross-distro central repository that you can get this debug information from. Every distribution has its own; maybe they have it, and for old distributions it's probably gone. So getting the debug information is not as easy as with Microsoft, where there is one very nice central repository. There has been some work toward having a central place where you can grab the debug information, and that's the Fedora Darkserver. But it's not really used. I mean, how many people have even heard of it? I don't see any hands, right?
But there are at least some initiatives in that direction. Going back to scanning: even if you have the debug information, you need to find the data structures that you care about, right? You want to find the data that you're interested in, so you need to find first the kernel, and then the processes and files and whatever else you are looking for. On Windows, at least, this is done by looking for pool tag headers, which are essentially just debugging headers attached to the memory allocations that Windows makes when it allocates a file object, for example. Very similarly to the scan we saw before, these are usually four-byte signatures that you can scan for. The problem is that you get a lot of hits and you don't know which one is valid and which one isn't, because you have partial structures in memory, and old structures in memory, and then plain false positives, because you are scanning for four characters in memory and you're going to find a lot of those. And memory doesn't really get reset: if you free a structure, it doesn't magically disappear from memory. It's still going to be there; the operating system just treats it as if it's not there anymore. So you need more heuristics to validate those hits, which means even more complexity in trying to interpret which data is valid and which isn't, and that just makes everything more fragile. And of course people have discovered that this is fragile. In 2012 there was a paper about a one-byte modification which broke finding the page tables, and then Volatility just gives up, because it can't translate virtual to physical memory. And earlier this year there was another talk about just adding a ton of fake signatures, and then Volatility throws up as well, because it can't tell which one is real, and then you have to go through them by hand, and that's really not a feasible approach.
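A toy version of that signature scanning shows the problem directly. The tag value and buffer layout here are made up for illustration, not the real Windows pool header format; the point is that a stale freed copy and an unrelated string both produce hits, which is why scanners need extra validation heuristics.

```python
# Toy pool-tag scan: find every occurrence of a 4-byte tag in a memory dump.
# Three hits come back, but only the first is a live, valid allocation.

def scan_for_tag(memory, tag):
    hits, start = [], 0
    while (idx := memory.find(tag, start)) != -1:
        hits.append(idx)
        start = idx + 1
    return hits

memory = (b"\x00" * 16 + b"Proc" + b"\x01" * 12   # live allocation
          + b"Proc" + b"\x00" * 8                  # freed, stale copy
          + b"look, Proc in a string")             # plain false positive
```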
That really highlights that we have fundamental problems with trusting the data, and we don't have a good way to validate which of the data structures we find through scans are valid. And that's where interposition becomes critical: if you know the execution flow of the system and you can trap at specific locations, you know what state the system is in, and then you can avoid scanning. So for example, instead of having to search for the KDBG structure, which forensics tools use to map out some basic structures, we can just use the vCPU registers to automatically find the kernel, and since we use pre-generated debug data rather than the in-memory debug data, we then have a complete map of the kernel. Furthermore, the heap allocations that forensics tools use to find files and whatnot can be trapped automatically when the kernel actually allocates something on the heap. And what's great is that there is native support for this built into Xen, so you don't have to custom-patch some weird hypervisor; you can just use what's already provided. This is great because it was actually designed for debugging, but it's really unknown, there is not much documentation on it, and even reading the Xen API is not going to tell you much. You really need to look at some sample code hidden within the Xen source, but it's there. So let's look at what I'm actually talking about. This is not a live demo; I pre-recorded it, but for all intents and purposes this will do. I'm running Xen 4.4 and I have two VMs running, dom0 and Windows 7 32-bit. I can list all the processes running on the system. This is very similar to how forensics tools do it, so that works, and you can see that I see more processes running than what Task Manager shows us in Windows.
There are the kernel modules, which show up under ntoskrnl.exe, so you can tell what's loaded in memory for that machine, but these all use the kernel's internal data structures, which you may not necessarily trust. Now, what we see here is actually the live execution of the Windows kernel, where I trapped all internal system functions that result from system calls. So these are not the system calls themselves that I'm catching, but the functions that are being called from the system calls. As you can see, there's a lot happening within Windows even if I'm not doing anything, so it's impossible to follow on screen; I just dump it into a screen log so I can grep through the live output. These are the functions being called, and you can catch some of them, but I'm not doing anything and there is a ton of them being called all the time. This gives you an idea that you can actually jump into the execution of the system and see what's going on, and if you have a deeper interest in any of these functions, you can look at them more closely. For example, one of the functions I'm trapping is the heap allocation function, where I can check which structures Windows is actually requesting to be allocated. Heap allocations are of course very much in the fast path of the system, so if I just move the mouse around you see a ton of I/O structures being constantly allocated, just from moving the mouse. Now, if you have a deeper interest in some of those structures being allocated, we can do that as well. For example, if I want to check what files are being accessed by Windows, I can do that just by watching which objects are allocated on the heap. I just clicked on some random personalization dialog, and these are all the files that were accessed in the background that I'm not even touching. I think this is just still loading, right? It's still going, all right.
So if you want to debug an operating system, this is really great, because you have full understanding of what it's doing. And just to show you that this is actually what's happening, I just created a document here on the desktop, and as you can see there is still a lot going on, so I don't even necessarily see the file that I created. So I'm just going to grep that screen log to see if the file I created is actually in there, and yes, it's there, and it's actually test.txt.txt, because Windows automatically added the extension for me; I did not know that. The way this works is by using four types of events on Xen. Unfortunately this is Intel-only, but that's good enough. The first type is move-to-CR-register events. There are three control registers the operating system uses to define various features: CR3 of course holds the page table address used for the currently executing process's virtual-to-physical translation, while CR0 and CR4 hold various options and can be used to flush the TLB. We can trap every time those registers change, so we know when a new process gets scheduled in the operating system, and that's great. The second type, and the most important one I used here, is debugging breakpoints. These are really just the debug breakpoints you would use in GDB or OllyDbg, the int3 instruction, the hex 0xCC opcode. You can write that anywhere in the kernel code pages and it will trap, and you can configure the virtual machine to trap into the hypervisor when such a breakpoint hits, which is pretty great.
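The breakpoint mechanism can be sketched in a few lines. This simulates guest memory with a bytearray rather than writing through a real introspection API; the class and names are illustrative, but the core idea is exactly this: save the original byte, write 0xCC, and restore it when you want the original instruction back.

```python
# Simulated int3 breakpoint injection: the same idea as writing the one-byte
# 0xCC opcode into a guest's kernel code page from the hypervisor. The
# original byte is saved so the instruction can be restored after the trap.

INT3 = 0xCC

class BreakpointManager:
    def __init__(self, code):
        self.code = code          # bytearray standing in for guest memory
        self.saved = {}           # addr -> original byte

    def set_bp(self, addr):
        self.saved[addr] = self.code[addr]
        self.code[addr] = INT3    # overwrite first opcode byte with int3

    def clear_bp(self, addr):
        self.code[addr] = self.saved.pop(addr)

# x86 function prologue/epilogue: push ebp; mov ebp,esp; pop ebp; ret
code = bytearray(b"\x55\x89\xe5\x5d\xc3")
bp = BreakpointManager(code)
bp.set_bp(0)                      # guest now traps when this code executes
```

In the real system the restore step is combined with monitor trap flag single stepping: clear the breakpoint, single-step the original instruction, then re-arm it.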
The third event type is EPT violations, via the extended page tables, where we have a whole other set of page tables, maintained by the hypervisor, that map what memory is allocated for that virtual machine. Before EPT this was done via shadow page tables, but with EPT it's all managed by the hardware. The good thing is that you can set different permissions in the EPT tables than what the operating system sets within the virtual machine, so you can trap various accesses: reads, writes, or execute (instruction fetches) from memory. And the fourth one, which is also critical, is monitor trap flag single stepping, an invisible single stepping feature built into Intel processors, where you can single-step a virtual machine without anything within that machine knowing that it's being single-stepped. There are a bunch of other features that Intel now allows you to trap on, but these are the four that Xen currently supports, and as you can see, this is already pretty cool; you can do a lot with these four event types.
As I said, using the Xen API directly is really not very nice, and it's kind of hard to wrap your head around how things go together. Fortunately you don't have to: the way I implemented this system is using LibVMI, a hypervisor-agnostic C library that we have been working with and actually extending heavily. It's a wrapper API around Xen, KVM, or even raw memory dump files that you can do introspection on, and it supports all the paging modes out there; plus I recently added ARM support to it, so you can actually do introspection on Android devices, for example, but for now it's basically Windows and Linux. It has a Python interface, and the idea is that you write your code once using LibVMI, and if you switch the hypervisor underneath, it's not going to matter as long as the drivers are set up properly within LibVMI; your code is good to go. You can use it to read from and write into memory, and it has wrappers around Xen events, so it's actually intuitive how to do things like single stepping and setting up traps. And it's open source, LGPL, so you are free to use it with any project you want to implement. A little more detail about how the tracing we just saw actually happens: I injected a breakpoint into ExAllocatePoolWithTag, which is the heap allocation function. But of course, when that function is called, the memory is still not allocated; we need to catch when the function finishes. So we need to extract the return address from the stack and trap that as well, and when the return address is hit, we can extract the address where the memory got allocated.
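The entry-plus-return trapping just described can be sketched as event handlers. This is a simplified simulation, not LibVMI code: the integer thread id stands in for whatever uniquely identifies the in-flight call on real hardware, and the handler names are made up.

```python
# Sketch of trapping a function at entry and again at its return address:
# at entry we record what was requested; at the return trap we read the
# allocated address (in RAX on x86-64) and match it to the original caller.

pending = {}   # caller id -> pool tag being allocated

def on_entry(caller_id, tag):
    pending[caller_id] = tag          # breakpoint at ExAllocatePoolWithTag hit

def on_return(caller_id, rax):
    tag = pending.pop(caller_id)      # breakpoint at the return address hit
    return (tag, rax)                 # rax holds the freshly allocated address

on_entry(1, "File")
on_entry(2, "Proc")                   # a second caller enters before the first returns
```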
The trick here is that there can be a bunch of different threads calling this function, and while you are in the function, the thread can be context-switched, so you need to keep track of all the callers and which ones are active, so that when the function returns, you know that this was the structure you were actually waiting for, and then you know where that structure is allocated. But of course, at that point it's just a memory address; the structure isn't initialized yet. So now you have to watch that memory region as it's first zeroed out and then slowly updated as the operating system fills in the headers and the information that we really care about, for example the access type to the file and the file name. We just set up EPT traps to monitor that page as it's being updated, but of course that traps the entire page the structure is on, so you're going to get unrelated write events, so there is even more logic in there, and that adds overhead. But as you could see, it was quite responsive; you could move the mouse around and interact with the virtual machine, so it's not too bad. What's really cool about heap tracing, though, is that some basic kernel rootkit mechanisms can really be sidestepped.
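One such rootkit mechanism, hiding a structure by unlinking it from a kernel list, can be sketched as follows. All the structures and names here are simplified stand-ins: the real kernel list is a doubly linked list of process objects, and the "allocation log" is what heap tracing recorded at allocation time.

```python
# Toy illustration of why heap tracing beats list walking: a process hidden
# by unlinking itself from the kernel's process list is still known to the
# allocation log, so cross-validation exposes it.

class Proc:
    def __init__(self, name):
        self.name, self.prev, self.next = name, None, None

def link(procs):                      # build the doubly linked process list
    for a, b in zip(procs, procs[1:]):
        a.next, b.prev = b, a

def walk(head):                       # what a list-walking tool sees
    seen, p = [], head
    while p:
        seen.append(p.name)
        p = p.next
    return seen

def unlink(p):                        # rootkit-style unhooking of p
    p.prev.next, p.next.prev = p.next, p.prev

procs = [Proc(n) for n in ("System", "evil.exe", "explorer.exe")]
link(procs)
alloc_log = {id(p): p.name for p in procs}   # recorded by heap tracing

unlink(procs[1])                      # hide evil.exe from the list
visible = set(walk(procs[0]))
hidden = [n for n in alloc_log.values() if n not in visible]
```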
For example, direct kernel object manipulation, which has been around for ten years, is the idea that you can break the integrity of kernel data structures without actually affecting the execution of the system. You would have, for example, a process, in this case in the middle, that doesn't want to be found by Task Manager or by the user, and what it does is unhook itself from the process list that I showed you in the beginning when I listed the running processes. That's just a linked list, so if you switch the pointers out, the structure will still be in memory, but you won't be able to find it through the linked list. Of course, with heap tracing we know exactly where every structure was allocated without having to walk the linked list, so who cares if it's unhooked? I know exactly where in memory that process structure was allocated, and that increases the trust in the data. Furthermore, I can do some cross-validation: okay, I know this structure got allocated at this address, but it's not showing up in the linked list; well, that's probably a rootkit. So let's look at another demo. Here are all the running processes, and of course I have Paint running. I cut down the output so we can actually see what's happening here, and what I'm trying to show you is that you can really catch any event that you want. This time I wanted to catch when a file gets deleted, but before it actually got deleted, and what I did is fire up Volatility and dump that file from memory before the operating system was able to erase it. So now it's in the temp folder, and there you go, it's actually extracted into the control domain. You really have full access to that virtual machine; you can reconstruct everything that's happening within it and even extract files that were, in this example, closed and deleted. I was able to extract it from memory because Windows doesn't actually delete it right away, and this is
very handy when you're dealing with malware, because a lot of the time you have write caching enabled on the disk. When you say "save this file", Windows doesn't actually save it to disk right away; it queues the write up and waits a while for writes to be buffered, and then writes to disk. So even if I looked at the disk after saving the file, I might not find it, because it's still in memory. For malware this is usually the case, because you have temporary files that the malware dropper extracts and loads into memory and then cleans up after itself, and you really need to catch the delete events, because otherwise that memory region can get recycled. For example, what I was doing at first was running some malware samples and pausing the VM, and I saw some files that were interesting, so I tried to go there with Volatility and dump the file, but of course the memory had already been recycled. There is really a very short time frame in which you can catch these files, so interposition is critical if you want to do malware analysis. Let's look at another demo; demos are fun. This time I have a Windows 7 64-bit VM, there is some debug information about it, but it's 64-bit Windows 7, and there are all the processes running, and as you can see there is Task Manager, PID 2364. What I'm going to show you in this demo is how you can actually take full control of that virtual machine, not just extract files and passively monitor what's happening. For example, what I'm going to do here is hijack Task Manager to start up Calculator for me within the virtual machine, which is quite great, because I didn't have to install any custom software within that virtual machine to take control of it. I just need any process that is executing within that virtual machine, and I can do whatever I want. No passwords asked, no username; you really have full control, right? That's what being at a more privileged level of the system
actually means. And yes, you can fire up cmd.exe and pass whatever arguments you want to it, and as you can see, I actually get the return value as well, so I know the PID of the process that got created. So if you want to execute some function within that virtual machine, you can extract its output and pipe things together, so you effectively have an external shell for that virtual machine. Now what I'm going to do is fire up Internet Explorer and send it to a website of my choice, and there we go, the virtual machine just happily does that, pretty much instantaneously. So I really can control what that virtual machine is doing. If you're a sysadmin, this is great: you can install software within that virtual machine, close processes, really whatever you want. All the demos that I showed you are part of a malware analysis system that I built for my PhD, which is, as I said, built on Xen, LibVMI, Volatility and Rekall, and it's released for free, so all of these demos and tools are now yours to play with. Again, the important thing here is that for malware analysis you really don't want to have anything identifiable within the virtual machine you are running, because that's what malware usually looks for. If you have no in-guest agents, you don't have any in-guest artifacts that malware can look for, and then you have a more stealthy environment to do your analysis in, and of course you can extract all the temporary files that you would otherwise miss. And OpenOffice just crashed; it doesn't really like embedded demos. Another tool that I want to bring to your attention, which is really just fresh out of the oven, is debugger integration. As I said, all of these features built into Xen were really designed for debugging, so why not use some of your favorite debuggers, in this case GDB? This should be online by now; I haven't actually had a chance to test it, but it has just been released, so
this allows for stealthy debugging using the hypervisor, and you can really debug the operating system. If you are developing a kernel driver or whatever, this is really handy, and of course WinDbg integration can be added, which is coming next, so go check this one out too. Let's go back to isolation real quick. What we actually did is move a lot of the security stack out of the virtual machine, and we can do a lot with that, but what it really achieves is just moving the target. Exploitation is getting harder because you have hypervisor-based isolation, but of course now you're running in a more privileged part of the system, so if an attacker actually manages to break out, using some vulnerability potentially in the security tool itself, the reward is bigger. And it's not like that doesn't happen: just a couple of weeks, a month ago, there was a local privilege escalation in OSSEC, which is a host-based intrusion detection system. So it's kind of naive to think that our system wouldn't have any vulnerabilities. Why not separate the security stack into deprivileged virtual machines as well? Xen has features to achieve that, and that's the Xen Security Modules (XSM). What that allows is to really disaggregate the trusted computing base that you use with Xen, so you don't have to put the security stack within dom0, as I did in the demos here; you can create a virtual machine that has control over a set of domains that you want to protect, or maybe just a single domain, without affecting anything else on the system. The way it works is that it creates a wrapper around hypercalls and has a policy to define how the interaction between domains can happen. This is actually a piece of software that has been contributed to Xen and is maintained by the NSA; they have a bad reputation, but in this sense they actually do some positive work as well, because without their work it would not be possible to do this. As I said, what it does is
actually wrap the hypercalls, and then you have the Flask policy engine, where you define which virtual machine can issue which hypercall and what the target of that hypercall is. In that sense it's very similar to SELinux; you can use the same tools to define and check your policy. It's disabled by default, but you can recompile Xen and have it; really, though, it's only usable from Xen 4.3 and Linux 3.8. That was actually my first patch to Linux, in 3.8, when we were testing an early version of this system that was not yet merged into Xen, and we discovered that the Linux kernel actually did some access control checks itself to see whether it was dom0 or not, and if it wasn't dom0, it would deny issuing the hypercalls that we needed to do this type of security. Of course, if you have XSM defining what is allowed and what isn't, you don't need the kernel to tell you that; furthermore, that's really an arbitrary check, because if you have the rights to insert kernel modules into your kernel, a security check there is not going to get you much. So my patch was really just removing that surplus check. Now, with these tools, we can actually start thinking about cloud security, because we have a mechanism to have different security policies for different users of the cloud. The idea would be that we start monitoring a virtual machine before it goes live, so we have some sort of baseline of integrity: okay, this is my baseline while the VM hasn't been exposed on the net yet, so I start monitoring, and we can see if some critical data structures get hijacked, or maybe even the code, if there are any inline hooks being injected; we can detect that. And we can really just limit what we trust in the data to what is bound by the hardware, because if malware changes those structures, the most it can achieve is DoSing the system, so it probably won't touch them if it has some better use for that machine. But we are
But then we are back at what data we can really trust in the system. For example, with the events that I showed you, with EPT violations, there are already some limitations in what the hardware tells us, and there are corner cases. Take read-modify-write instructions, which in one instruction read from memory and then write back to memory, and are usually used for mutexes and concurrency: the Intel manual says it's really implementation-specific whether the read bit is set. These instructions will always set the write bit, but whether the read bit gets set, well, we don't know what happens, and that's not really cool. We actually patched that in Xen 4.5, which will be released next month, but there is a lot of ambiguity like that in the Intel manual, so we actually wrote a paper collecting all of these, and these are really just some of the limitations.

A bigger limitation is the tagged translation lookaside buffer, which was introduced in 2008 by both Intel and AMD, and which essentially caches the translation of virtual to physical addresses in a cache that you cannot query; it's really just for the hardware, to speed the translation up. Now if you have a tagged TLB, that means the page tables don't necessarily represent the translations the guest actually uses. What we do with VMI is look at the page tables in memory and emulate what the hardware does, but we don't have access to this cache, which is a problem, because with a tagged TLB the cached entries survive a VM exit and VM entry. These are actually persistent TLB entries, and that means a rootkit could potentially muck with the page tables in the guest without crashing the guest, and we would have no idea what it's doing, because it can set up the page tables to point into some benign code region, and when we try to see what code is running, we see that, oh, it's calculator, no problem.
But what the machine is actually executing might be something totally different. Of course there are some limitations to that, depending on which hypervisor and guest operating system you're running. For example, Xen always assigns a new tag whenever the guest schedules a new process, so you would have to redo the malicious modifications to the page tables essentially every time a new process is scheduled, so okay, we might be able to detect that. And Windows 7, surprisingly, is actually pretty good against this, because it flushes the global pages very regularly. But if you're running Linux on KVM, well, this is a more realistic problem.

If you think about cloud security for a moment, though, there is really no need to move everything out of the virtual machine. With malware analysis there was a reason: you don't want any artifacts within the guest that the malware can detect and then shut down really quickly. But with cloud security, the malware can stop executing as fast as it wants; we don't want it to stay alive anyway. So maybe it's enough to have some sort of secure in-guest agent that we protect from the hypervisor level, but which has better performance and better visibility into the system, because it's running in the same context as the virtual machine, and the tagged TLB problem doesn't exist for in-guest agents. So we can potentially do some sort of hybrid approach, and this is actually where the hardware is heading. In the upcoming Intel CPUs there is going to be an extension called Intel #VE, which stands for virtualization exceptions, where you can actually trap the EPT violations within the guest, so you don't have to trap out into the hypervisor all the time. You would get much better performance, and then you can do some sort of protection of the code that's running within the system, and as I showed you, you can really control what that system is doing, not just the code but also the data, so you really can achieve secure in-guest agents. Another approach for cloud
security would be to really just reduce the size of the guest operating system. There is really no reason you need a full Linux stack in an operating system that just serves Apache. There is some work in that direction: just at this Congress we saw MirageOS, but there are also NetBSD rump kernels and OSv, which really try to reduce a virtual machine to a process and then just use the hypervisor as your scheduler. It's kind of ironic if you think about it, because processes, back in the day when they were introduced, were called virtual machines, and now we would have virtual machines becoming processes again, so we are kind of going in a loop here.

But we could also try to secure the in-guest kernel, because what we have actually been discussing thus far was the blacklist approach: we look at what malicious changes happen to the system and we try to deny them. That of course places the burden on the defender to enumerate all the possible things that could go wrong; well, good luck trying to do that. So now I'm going to hand over to Tom to talk about the whitelist approach, which might be a better alternative.

Yeah, so if you want to verify the integrity of the system, which we have to in order to run our in-guest agents, we have to see what kernel is in the system and whether it has been changed by malware or not. What we propose is a whitelist approach that would allow verified changes to the kernel within the system; for that we need to validate and see all the changes in the system that we want to allow. Code integrity is basically assumed to be an easy thing: you take the code from your binary, from your kernel, you hash it and you compare the hashes, and if they match, the kernel's integrity is intact. But actually Linux employs runtime patching, runtime self-patching of its code, for performance optimizations within the system, so if you run a Linux kernel you now have to differentiate between legitimate and malicious changes to your software.
There are two kinds of changes done by the Linux kernel. On the one hand there are the easy load-time patching things. The easiest are relocations, or alternative instructions, which are architecture-specific: depending on the CPU the system is running on, or on the hypervisor the system is running on, different instructions are patched into the code of the system, so depending on the hypervisor, for example, some function can be implemented one way or another. You can say, well, load-time patching can be handled by loading the kernel in a secure environment, on the same architecture maybe, and creating a hash there, and we still have no problem. On the other hand, we also have runtime patching employed by the kernel, for example for hardware hot-plugging, and this has to be validated and verified continuously while the system is running.

I want to show you two examples where this is actually applied in current Linux kernels. SMP locks are one of those mechanisms. Currently, if you have virtualization on a system and you need scalability, the number of CPUs that are allocated to the virtual machine may change during runtime. This is a problem, because if you have a single-CPU operating system and you have locks, then you don't have to ensure atomicity of those locks; you have only one CPU anyway. This changes if you have multiple CPUs, because then you really need atomic operations. For performance reasons, if only one CPU is present in the system, the Linux kernel does not use those atomic operations; instead, when another CPU is added to the system, it automatically patches all these locations in its code to the atomic operations, which are slower but required in that case. This mechanism can also be used to replace entire functions within the Linux kernel; it's currently not used that way, but the mechanism is basically there.
Another place where runtime patching occurs in the Linux kernel is a mechanism called jump labels, which exists equally for performance reasons. You have some checks in the Linux kernel that will only rarely be taken, so rather than constantly checking, is it the unlikely case now, is it the unlikely case now, you just patch the jump to that code snippet out of there. Once the functionality is enabled, for example by a user or by some hardware mechanism or whatever, the Linux kernel patches the jump back into the code, pointing to the function that should be executed. With that we have the additional problem that there are not only two possibilities, the jump is there or the jump is not there: the location the jump points to also has to be consistent with the entire system state. Here you have the perfect primitive for something like return-oriented programming, where you just need an arbitrary jump within your code. So these are the mechanisms that we have to defend.

To verify them, we already heard about simple approaches like locking the kernel. The easiest thing with the hash-based approaches, as I said, is to deny all changes to the kernel code at runtime, but then we completely disable all of these legitimate patching mechanisms. On the other hand you can also say, well, most of the code is static and just a couple of locations might change, but there you have an equal problem: the number of hashes you have to maintain, for every combination of locations that might change, is very large, so you would have to maintain very many hashes. And the Linux kernel in its current form has the additional problem that for some code pages, both code and data reside on the same executable page: the kernel uses large pages for its code, which are basically two-megabyte pages, but on the last page there is still some spare memory left over.
That memory would either be wasted, or the Linux kernel gives it to user-space applications if they allocate some memory. So here we also have the problem that we don't really know, is this code, is this data, what is there to verify, what should be on that page; this is also a problem for hashing the kernel.

What we propose is a trap-and-validate approach using VMI. We know patching only happens at predefined locations, so from the binary we can derive which patching mechanisms exist, at which offsets in the binary the code will be patched, with which values, and under which circumstances. With that we can retrace those patching mechanisms, understand what they really mean, and also check that the system state is consistent. This also fixes another problem, because code patching is not an atomic operation: if you patch entire functions, for example, that cannot be done at once, so between the two good states there is always an intermediate state, which is still legitimate because it leads from one good state to the other. So you have to have a system that is aware of all those intermediate states and can handle them appropriately. We propose to trap write events to kernel code, and when the kernel code is changed, validate that the change is not malicious; that provides the integrity of the running Linux kernel.

This was about the integrity of the code, and with that I want to conclude. As a summary: VMI supports a wide spectrum of applications, from malware detection to cloud environments, and VMI gives us the three properties isolation, interpretation and interposition. But depending on what you want to do with VMI, it matters which of those three features you need most. Pure VMI, as in full isolation, is not required for all of those use cases, so you have to decide whether you want an in-guest agent that you can secure, or very intrusive mechanisms.
Those mechanisms can make the execution of the virtual machine slow, so you trade a more in-depth view against performance. But as we said, the hardware support for all of these mechanisms is continuously improving and getting better. The tools that we showed you today are open source; you can look at the websites libvmi.com and drakvuf.com, and you can find us in #libvmi on freenode, and via our contacts respectively. With that I just want to say thank you to all of the names here on this slide, without whom it wouldn't have been possible, conclude our talk, and be ready for all your questions.

So if you have questions, please line up at the microphones, and if you absolutely must leave the room, do it quietly. Number one.

I have a question on the runtime patching: does your code work with the function tracer, in terms of dynamic ftrace, and also with the upcoming kernel features kpatch or kGraft?

With the ftrace feature it currently works, with the function tracer; with function replacement it does not currently work, I have not implemented that yet. Also, while the tools are open source, that last part is still to be published, but this will happen in the near future.

Okay, thanks. Number three... four. Four is me. Okay, one question: you showed this tool for tracing the memory states, and you mentioned that it's only supported on x86 right now. Are there plans to support ARM Xen as well at some point in time?

Yes, I have actually been working on that. I was hoping to get it into Xen 4.5, but unfortunately the freeze window closed just too early, so it's going to be available in Xen 4.6.

Okay, because there is this octa-core ARM platform coming up and it would be really handy to have it on ARM as well. So in the next release I can expect it to have these features as well?

Yes, we are working on it.

Wonderful. A question from the internet? Yes, we have one question from IRC: what is the easiest way of installing a hypervisor on a general Linux OS, and what are the minimum hardware requirements for embedded systems?
With current distros, I have been compiling Xen from source myself, but I guess you can apt-get install xen, something to that effect. KVM is of course built into the kernel, so that's usually a lot easier if your hardware supports it; usually your distribution has some built-in support for that.

Microphone number four. Okay, so first of all, I was trying to get into the same area as the first guy: what's the actual purpose of those tools? I mean, those tools are really cool if you want to analyze a system: you have a contained environment, you know that it was compromised, and then you want to figure out what actually went on, right? So you can analyze it dynamically and trace things; that's pretty nice. But for an actual live system, with everything that's coming up with live patching, like kpatch, like kGraft, like the others, you can get a fix for a security vulnerability that's only going to happen in a year, right? So you simply cannot do a whitelist, because you are in time before the fingerprint would get created.

But that's where the whitelist approach sort of comes into play: you know what good things can happen to the kernel, so you only allow the good things that you want and know about, and everything else you can flag as potentially malicious.

That means you need to update your whitelist before the guest can be updated, so you have a delay, right?

If you deploy a very new kernel that the protection tool doesn't know about yet, then yes, that has to happen.

Okay, so imagine there's a zero-day coming up, and somebody like you gets a specially crafted kGraft kernel module that you load to fix that kernel. First off, it's probably tailored to you, because you might actually have another security fix in your system that is just only for you, so you might be running a kernel that's not being fingerprinted by you or by anybody else, or your host provider, your cloud provider, and then the cloud provider might not even know about
that fix yet, because, well, they might be in the loop after you.

Yeah, but that's a general thing. The kernel integrity part that I showed is very, very kernel-specific: you need to know exactly which kernel is running on the system, with everything; you have to have the binaries, and currently we extract all of the information for our verification, or validation, out of the binaries. So if you as the cloud provider don't know what's in that system, you can't do that anyway. But you're also not supposed to, because if you were supposed to do that, you would know about the patch that is in the system. And if you can say, this is the kernel code that we are running and these are the patches that we applied, from that we can build the ground truth and match it against what's in the system.

So my main point is, I don't think cloud is the right term in this case. This really is an awesome thing for embedded projects where you control the whole stack; as soon as you have different tenants doing different things, things fall apart, because not everybody has knowledge of everything left and right.

Also, this is not only for real malware detection from the very beginning. You have a system you think is clean, and while you work with that system you have the different tools with which you can look at it, and once one of the tools reports anything funny about the system, you definitely have to investigate further. So you don't have all your systems running with the most detailed view into them. And it's also not for entirely preventing malicious things; it goes in the right direction, I think, but it's not a tool that solves everything.

I'm just having a hard time figuring out where to use it. But the second thing I had was on the VMID flushing stuff: the first-generation VM-capable CPUs didn't have ASIDs or
anything of the like, and the performance difference between having ASIDs and not is like, what, five percent, two percent, something in that ballpark? You can just flush on every VM context switch and call it a day; just submit a patch to KVM to enable flushing always, right?

I mean, that's always a possibility, that you just disable tagging and your problem goes away, but then you sacrifice performance, and you always sacrifice performance if you trap on every heap allocation, right?

So performance really is not an argument here; if you lose two percent, the two percent is definitely a lot less than what the trapping costs you anyway.

Well yeah, it depends on what your security application is doing; you might not need to hook all the fast-path functions to really protect stuff, so it really depends. As I said, you probably don't want to trap on everything, because if you have an in-guest agent then you don't need to trap; you just have something within the guest that you protect from the outside, and for a cloud application that's probably the way you want to go. But for malware analysis, doing something like this, where you really can't trust any kernel data, that's really essential.

Okay, number one: are any of the large cloud providers offering a malware detection service for their clients?

Not yet, but we are expecting that to happen soonish, I guess.

Number three: you mentioned something about Android; how far are we today from using this on ARM devices?

So the thing with ARM is that we have the two-stage paging, and I have the code to have that trapping mechanism working with Xen, but unfortunately that's only one part of the picture. For all of these things that I showed here today, you really require all four types of trapping, and for now it's only the memory part that is functional, so more research needs to be done on how to do single-stepping and efficient trapping. For example, there is no breakpoint trapping on ARM, so there
is some alternative that needs to be found yet, but it's still very early in the research phase. Interestingly, the people who are mostly looking into that are Samsung, so I expect them to have some sort of security product that they want to sell soon, so it might be improving as well.

And does Xen already support full virtualization on ARM?

Yes.

Thank you. So if we have no other questions, we will conclude the talk; thank you Tamas and Thomas for this very deep look.