Okay, my name is Jonathan Myers. My co-speaker is Andrew Gwynne. We're from the Johns Hopkins Applied Physics Laboratory, and this talk is about enforcing runtime integrity measurement with Maat. Maat is a framework for runtime integrity. To discuss this, we'd first like to get a few definitions out of the way. I imagine these are familiar to many of you, but for our purposes we're talking about measurement and attestation as tools for establishing system integrity. Measurement is the collection of evidence, usually, in this case, evidence that a system is in a given state and that it is a trustworthy system. Attestation is providing evidence for a claim, such as the claim to be a certain computer with a certain identity, that that computer is running the software it claims to be running, et cetera.

Now, trusted boot and the TPM provide measurement and attestation, and they're probably very familiar to everyone here, but only at boot time. These tools will tell you that a system booted up with a given firmware, boot loader, and kernel. The kernel's IMA will provide load-time measurement of files by hashing them into the TPM. But if you load a binary into memory and then execute that binary for a long period of time, it could become infected with malware or corrupted in some way, and your load-time measurements would contain no evidence of that corruption. So Maat is an open source system for measurement and attestation that monitors runtime integrity throughout the lifecycle of a system and its software.

The goal of runtime integrity measurement is to detect compromise at any time during a system's lifecycle. That includes malware, including zero-day malware and exploits against a piece of software. It also includes misconfiguration of a system, deliberate or unintentional, that could cause the system to behave incorrectly. And it includes corruption, such as bit flips or other errors that could cause software to behave differently than you expect it to behave. Obviously, these are all of great concern when you have high-assurance systems or systems that are running very sensitive software.

The approach of Maat, and of runtime integrity measurement generally, is to take a measurement of a running system and then compare that measurement against a known-good baseline of what the system should look like. For a static object such as a file, that might be a file hash compared against an expected value. But we also support measurement of more dynamic objects: for instance, the text section of a kernel or a process, which might be self-modifying. The kernel self-modifies its code, so it is a dynamic data item. There are also in-memory data structures, such as data structures used in the kernel; if those become corrupted, that can change the behavior of the kernel. We also look at CPU registers, the syscall table, the GOT and PLT, all of these things used by kernels or processes that, if corrupted or modified, could change the behavior of that software.

Now, total assurance is generally not possible. You're going to run into the halting problem and Rice's theorem pretty quickly if you try to totally model everything a system does. But for a lot of system software we can actually get a pretty good degree of assurance, and the more things we check, the better our assurance becomes, because everything you check is one less place that malware can hide from you.
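To make that measure-and-compare pattern concrete, here is a minimal sketch, not Maat code, of measuring a static object and appraising it against a known-good baseline. The file path and the `golden_baseline` table are purely illustrative; a real deployment would generate baselines from a trusted image and extend the same idea to in-memory objects such as text sections and kernel data structures.

```python
import hashlib

# Known-good digests recorded while the system was in a trusted state.
# Illustrative values only; a real baseline would be generated, not hard-coded.
golden_baseline = {
    "/usr/sbin/sshd": "expected-sha256-digest-goes-here",
}

def measure(path: str) -> str:
    """Measurement: collect evidence about the object's current state."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def appraise(path: str) -> bool:
    """Appraisal: compare the evidence against the expected baseline."""
    expected = golden_baseline.get(path)
    return expected is not None and measure(path) == expected
```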
So more formally, we've divided the measurement and attestation process into three roles. There is a requester, which initiates a measurement request; that might be a system administrator, or it might be a piece of system software that wants a measurement taken of an attester system. The attester is the system under query, the system that is being measured. And the appraiser is the final role, which actually evaluates the measurement, compares it against some kind of known-good baseline, and determines whether or not that software or that system is in a trustworthy state. In many cases the requester and appraiser may be the same, but our design is broad enough to support the case where they are separate entities as well.

As a quick example of how runtime integrity measurement might look in practice, we have a requester, an attester, and an appraiser. Say the attester is a laptop at a conference and it boots up. You're connecting to Wi-Fi, and ideally you're using the TPM and trusted boot to show that the system is in a good state when it boots. It is allowed onto the network because it's running an up-to-date kernel; the system is in a good enough state that we're going to trust it with the Wi-Fi. Now say you have a web browser running and you connect to the internet. We often leave our web browsers running for a very long time. Say there's a JavaScript escape and malware gets into the browser process. Now there is malware inside of that running process, even though it was in a good state when it started up, and as long as that browser is open and on your network, it is able to do bad things on your network. Runtime integrity measurement allows a requester to investigate the attester periodically throughout its lifecycle. The requester talks to the appraiser and initiates the measurement request. The appraiser talks to the attester and requests a measurement. A measurement is formed on the attester system and sent back to the appraiser, and the appraiser investigates it. In this case, it will flag that there is malware on that system, and then appropriate action can be taken: we might put the laptop into a separate enclave for incident response, so that we get that malware off of our network and minimize the harm.

More formally, this process is performed through the contract flow shown on the left. We use contracts in order to negotiate what is being measured. This is done partly for privacy and partly to give us flexibility. You may be willing to share all kinds of information about your work laptop with your work network, especially if you're working in a sensitive environment, but you may not be willing to share a lot of sensitive information with just the Wi-Fi at Starbucks. So in the contract flow shown on the left, the requester initiates a contract and sends a request to the appraiser. The appraiser determines what kind of measurement is appropriate and sends it to the attester. The attester then gets a negotiation step where it can decide whether it will perform that measurement, and it sends a modified contract back. The appraiser evaluates that again to determine whether it's sufficient for bringing the system onto the network or granting the privileged operations being requested. Then a final contract is sent back, and if accepted, the measurement is performed and returned in a measurement contract that the appraiser can investigate.
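As a rough schematic of that negotiation, and not Maat's actual contract format or wire protocol, the exchange looks something like the following; the phase names and fields are illustrative only.

```python
from dataclasses import dataclass, field

@dataclass
class Contract:
    phase: str                          # e.g. "initial", "modified", "execute"
    measurements: list = field(default_factory=list)

def negotiate(requested: list, attester_policy: set) -> Contract:
    """Appraiser proposes measurements; the attester strikes what its policy forbids;
    the appraiser decides whether what remains is still sufficient."""
    initial = Contract("initial", list(requested))
    # Attester's negotiation step: keep only measurements it is willing to perform.
    agreed = [m for m in initial.measurements if m in attester_policy]
    modified = Contract("modified", agreed)
    # Appraiser's check: is the reduced set still enough for the decision at hand?
    if not modified.measurements:
        raise RuntimeError("negotiation failed: no mutually acceptable measurement")
    return Contract("execute", modified.measurements)

# Example: the appraiser wants kernel and userspace evidence, but the attester's
# policy on an untrusted network only allows a userspace measurement.
final = negotiate(["kernel", "userspace"], {"userspace"})
```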
And there's been quite a bit of academic work on this system and design. There is a language called Copland, which is a formal language for specifying measurements. The two papers I've noted here are good papers if you want to learn about Copland; it's been investigated pretty deeply in the academic literature, and there's even a Copland website hosted by the University of Kansas with more of these citations and more information about the Copland language if you are interested.

So that was an overview of how measurement works. Let's talk a little bit about what kinds of things we can measure. This work really began for us in 2007 with a paper called Linux Kernel Integrity Measurement Using Contextual Inspection, which described a tool called LKIM, the Linux Kernel Integrity Measurer. LKIM resides on a Xen system. As you're probably familiar, in Xen you have Dom0, the device driver domain, which is a specially privileged domain. We imagine that we have DomT, the target domain, which executes some kind of sensitive or high-value application, and then there's an additional domain called DomM which hosts LKIM. LKIM has special privileges within Xen which allow it to measure the kernel of the sensitive domain and evaluate whether that kernel is in a good state. There's quite a bit of detail in the paper; LKIM is actually quite thorough, it's been around for quite a long time, and it has seen a lot of real-world use.

We also have a tool called XIM, the Xen hypervisor integrity measurer. This is a tool for measuring Xen, in case the hypervisor itself becomes corrupted or becomes a host to malware. XIM lives inside of a special area called the STM Protected Environment (STM-PE), which lives inside System Management Mode on Intel. This is an Intel-only technology; System Management Mode is a highly privileged execution environment that is set up and locked down during early boot. There is an open source implementation of the STM-PE that is now part of coreboot, and that is something you can just go get off the internet for any compatible system.

And finally we have this paper, Runtime Detection of User Space Implants, which describes the process of measuring user space processes: things like GOT and PLT measurement, process text sections, all of the places where malware would generally hide on a Linux-based system. So between these three we have the ability to measure the hypervisor, the ability to measure the kernel, and the ability to measure user space, and Maat is really the tool that provides the glue for connecting all of these in an intelligent way and getting the maximum assurance out of all of these tools put together. There's also a paper called Principles of Remote Attestation which covers a lot of the theory behind these measurement and attestation processes, and which I would recommend if you have an interest in learning more.

So in order to make Maat useful, we have a variety of design requirements. First of all, we need some confidence that the measurements we're getting out are good measurements; we have to be able to trust our implementation.
We need to be able to function in a contested environment, so we need resilience: the ability to function in an environment where there might be malware running on the system. Flexibility and extensibility are critical, because we need to support a lot of different systems; Maat is designed to support many different use cases and deployed systems. And finally, we need to provide security in the sense that we can operate in a contested environment without adding new attack vectors for bad guys. To do this, we generally try to uphold the principle of least privilege and minimize the attack surface we've added to the system by adding Maat.

In order to fulfill these design goals, we've arrived at the following design. We have three or four components, depending on how you count them, that are part of Maat. These components are compartmentalized, with enforcement through SELinux to minimize what each component can do. That gives us the principle of least privilege for the components and hopefully minimizes any added attack surface.

The first component is really just the UI for the user. We provide something called the test client in our package; this initiates the measurement request and can be used in scripts and so on. It talks to the attestation manager on a system. So we have an attestation manager, or AM, that lives on the attester machine and one that lives on the appraiser machine. These components listen on the network, so they are a really ripe target for attack, and as a result the AMs are very restricted in what they can do. They can talk on the network, they perform the negotiation of contracts, and once a contract is agreed upon, they fork an attestation protocol block, or APB. So all the AM can really do is receive network communication, perform negotiation, and spawn APB processes, and that minimizes the amount of damage that could be done if it were to become a host for malware.

Once we've spawned off the attestation protocol blocks, there's usually going to be an attester APB and an appraiser APB, at least one per system. These are the components which actually carry out the protocol: they bundle the measurements on the attester side and evaluate them on the appraiser side. The actual measurement is done by a third kind of component called the attestation service provider, or ASP. These tend to have special privileges, because they're actually performing measurements, so they're able to look at other processes, or look at the kernel, or even look at the hypervisor. And so these are as isolated as possible: they are not allowed to talk to the network. They can be spawned by the APBs, but the APBs cannot do that kind of work themselves. So we've tried to design this in a way that keeps the most sensitive components as far away from the network as possible, to make attacks against the system itself as difficult as possible. Of course, this is a relatively complex system that we have created, but I would challenge anyone to come up with a better design that fulfills all of the goals we need; I really think this is about as good as you can do.

So hopefully our system now provides trustworthiness, because we have TPM integration within Maat, so we have a root of trust in hardware, and we can perform layered measurement of the hypervisor, OS kernel, and user space. This gives us some assurance that the measurements we're getting out are not being obscured by a component deeper in the system.
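As a loose illustration of that least-privilege split, and emphatically not the actual Maat implementation (which is written in C and confined with SELinux policy), the idea is that the network-facing role only negotiates and hands off, the protocol role only orchestrates, and each measurement helper does exactly one privileged thing:

```python
import subprocess

def attestation_manager(agreed_measurements: list) -> bytes:
    """AM role: after negotiation, do nothing itself except hand off to a protocol block."""
    return attestation_protocol_block(agreed_measurements)

def attestation_protocol_block(measurements: list) -> bytes:
    """APB role: orchestrate the measurement, spawning one narrow helper per item."""
    evidence = []
    for target in measurements:
        # ASP role: a separate, narrowly scoped process that performs one measurement
        # (here just hashing a file) and has no business talking to the network.
        result = subprocess.run(["sha256sum", target],
                                capture_output=True, check=True)
        evidence.append(result.stdout)
    return b"".join(evidence)

# Example: measure two static objects after a (hypothetical) successful negotiation.
bundle = attestation_manager(["/bin/ls", "/etc/os-release"])
```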
For resilience, we have component separation, and layered attestation can start in the STM-PE, down in System Management Mode. There's TPM integration, which lets us get attestation of the identity of the system. And as I mentioned, we're using SELinux to give minimal privileges to each component, which minimizes the damage any given component could do if it were corrupted or subverted.

So let's do a quick walkthrough of what user space measurement in Maat actually looks like; I'm going to cover all of the individual components and how they communicate. Say an admin uses the test client to talk to an appraiser AM and says, I want a measurement of this given attester. The appraiser AM talks to the attester AM over the network; they negotiate a contract and decide what's going to be measured. In this case we're performing a user space measurement only, so they fork off APBs for user space measurement. The attester-side APB spawns off a number of ASPs which perform the actual measurement. We do things like measuring the packages that are on the system, looking at all the processes on the system, and looking at all kinds of little details like the GOT and PLT and the text sections of those processes, along with a variety of other things described in the user space integrity measurement paper. All of those measurements are collected by the APB, which is allowed to talk to the network. It forks off a final ASP for signing, so that we can attest the identity of the system we're talking to, and the measurement with the signature is bundled up and sent over to the appraiser side. The appraiser verifies that the signature is correct, so that it knows what system it is talking to, then breaks out the individual components and forks off ASPs which evaluate each individual measurement component. And then a final result is provided back up to the AM, which finds its way back to the person who made the initial request.
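Condensing that walkthrough into pseudocode, and using an HMAC as a stand-in for the TPM-backed signature and a flat dictionary as a stand-in for Maat's actual evidence format, the attester side bundles and signs, and the appraiser side verifies the signature before appraising each component:

```python
import hashlib, hmac, json

SIGNING_KEY = b"stand-in for a TPM-protected key"   # illustrative only

def attester_bundle(evidence: dict) -> dict:
    """Attester APB: collect per-ASP evidence, then have the bundle signed."""
    blob = json.dumps(evidence, sort_keys=True).encode()
    signature = hmac.new(SIGNING_KEY, blob, hashlib.sha256).hexdigest()
    return {"evidence": evidence, "signature": signature}

def appraiser_evaluate(bundle: dict, baselines: dict) -> bool:
    """Appraiser APB: check who sent it first, then appraise each piece of evidence."""
    blob = json.dumps(bundle["evidence"], sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, blob, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, bundle["signature"]):
        return False   # unknown or tampered-with attester: nothing else can be trusted
    return all(baselines.get(name) == value
               for name, value in bundle["evidence"].items())
```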
So that was a user space measurement, which is something you might want to do pretty commonly. But if you would like to do a deeper measurement and look at the whole system, that is supported through Maat, and we call it a layered attestation. In this case that is a measurement of the hypervisor, the kernel, and the user space on a system. Timing matters, and this has been investigated in the academic literature: you generally want to measure the hypervisor first, then any kernels, then any domain user spaces. That way, unless the adversary is able to move very quickly through the system, there's not a good place for them to hide. Obviously if the hypervisor is corrupt, you can't trust the measurement of the kernel, because we're trusting the hypervisor to perform that measurement. So you want to start with the hypervisor, then the kernel, and move upwards.

In this example, we have Dom0, DomT (the target domain that is actually running the user software), and DomM, which runs LKIM. We start with the test client making a request to the appraiser AM on behalf of the user, which negotiates a layered attestation with the DomM AM. It starts by requesting a measurement of Xen from XIM; from System Management Mode, the STM-PE is able to measure Xen and provide a measurement back to DomM. Then we measure the DomM user space, to assure that that system is actually in a good state. Then we begin to measure DomT, the target domain, starting with DomT's kernel. That is sent back to DomM, where it is bundled and signed, and then the bundle is sent back to the appraiser AM, or the appraiser APB actually, where it is evaluated through the ASPs and the result is sent back to the test client. This is a demo that we've actually performed; it's something Maat can actually do and be configured to do, and it is included in the open source release.

So we have open sourced Maat. The repository linked here contains Maat as well as the user space measurement and appraisal APBs and ASPs. It also includes demos for measuring using TrustZone and on smaller IoT devices and microcontrollers. Also included is the code for performing a demo of the layered attestation, but LKIM and XIM have not been open sourced and are not currently available; you can ask me some questions afterwards if you have any interest. We would love to see the system actually deployed out in real life, so we are looking for more partnerships and places to help develop and mature this code.

So in summary, Maat is an open source framework for runtime measurement and attestation. Runtime measurement and attestation can detect previously unseen zero-day malware and implants, or other security breaches and errors. This includes detection of implants inside of a running process, or inside a kernel or hypervisor, and it provides assurance throughout the lifecycle of these long-running processes. And finally, Maat supports multiple deployment scenarios. I showed the user space measurement example and the layered attestation example, but it can actually do quite a bit more, and we're hoping to add support for mutual attestation, in which two mutually untrusting entities measure each other and gradually release more information. You can find more reading in the references here; there's documentation in our repository that is fairly thorough, and there's the Copland website provided by the University of Kansas, which is one of our collaborators. So that concludes the talk. Do we have any questions?

How closely tied to, say, LKIM is this? Can another runtime integrity system utilize Maat? Sorry, so your question is how close the integration is with, say, LKIM, and whether something else could be plugged in? Yes, something else could be plugged in. That's one of the nice things about having the APB and ASP kind of architecture: if you want some sort of new measurement, you just write an ASP for it and integrate it into existing measurements represented by an APB, or write a new APB that incorporates the new measurements you want to have.

Okay. And also, is there any chance that LKIM would be made open source? Steve Smalley behind you is smiling, because he has some answers; I've been asking for a couple of decades. So no, it won't be open sourced, but if you're interested in it you can talk to us about it, just not in an open source form. So the user space measurement is open source? The user space measurement is open source, and it's provided in the repository that I linked.
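Going back to the layered attestation walkthrough for a moment: the rule it follows is that a layer's measurement is only worth appraising if every layer beneath it has already appraised clean, because the lower layer is what performs or mediates the measurement. A generic sketch of that ordering, with stand-in measurers rather than the real XIM, LKIM, and user space ASPs:

```python
def layered_appraisal(layers, appraise):
    """layers: ordered (name, measure_fn) pairs, lowest layer first.
    appraise: callable(name, evidence) -> bool."""
    results = {}
    for name, measure in layers:
        results[name] = appraise(name, measure())
        if not results[name]:
            # A corrupt lower layer could lie about everything above it,
            # so stop instead of reporting untrustworthy "passes" for higher layers.
            break
    return results

# Stand-in measurers; in the demo these correspond to XIM (hypervisor),
# LKIM (kernel), and the user space measurement ASPs.
layers = [
    ("hypervisor", lambda: "hypervisor-evidence"),
    ("kernel", lambda: "kernel-evidence"),
    ("userspace", lambda: "userspace-evidence"),
]
```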
How smart is it in measuring pages, for example? Does it go and hash the whole text segment, or does it only pick the pages that are in memory?

So the details of this are covered in one of the papers that I linked, and there's quite a bit of detail there, but yes, we do look at process text sections, and we also look at the GOT and PLT. Andrew, you actually know more about this than I do. I think that's basically the high-level overview in terms of introspecting processes specifically. I know one of the things it can do, if you map executable code into your process, is compare what's on disk versus what's actually loaded into the process via hashes, just to check that it hasn't been tampered with and that what's loaded is what's expected.

So one of the challenges we faced when we tried to implement something like this is that on a memory-constrained system we cannot afford to go and read the whole text segment, because that brings it from disk into memory, and we usually have a compressed file system, so only a small subset is in the working set. If at some moment you try to read the whole text segment, it has to be unpacked and brought into memory, and that just brings the system down. So we were looking at a smarter mechanism, where you can see that a page is not mapped and so you don't need to touch it in memory; you can just check it on disk, knowing that if it is ever touched, the same content is what will come into memory. Does Maat implement some kind of optimization strategy like this?

I'd have to defer to the paper on that one. What I will say, in a more general sense, is that Maat is not supposed to be one specific measurement taken for everyone. Maat is configurable, in that you use whatever measurements are relevant for your use case. On a resource-constrained system, for example, we have demos for smaller IoT devices, and the types of measurements you'd take on that kind of system would be completely different from what you would take in a more powerful server-type context where you have the resources to do these sorts of measurements. You would also use different measurements depending on the level of trust you want to have. In the conference Wi-Fi example that we brought up, you might only want basic information, but in other contexts you might want to dive deep, and those deep measurements have the trade-off of being more expensive. So you might only take those measurements in contexts where you particularly need that level of trust.

I don't recall everything that was in the user space integrity measurement paper, although its author is, I think, sitting next to you. The main challenge in that paper was trying to figure out how much could be measured and how thorough we could be, so the measurement code that we provide in the repository for user space measurement is as thorough as possible and measures as much as possible. But you could write ASPs or APBs that do a more limited measurement, or only look at a subset of processes, or only look at processes that are currently swapped in. And, I'm not sure if I actually linked it, but there was a paper published on IoT assurance that was based around Maat, and that did perform measurement on very small microcontrollers and devices like that. I don't think I've actually encountered the problem you mention, of trying to measure a large number of processes on a very resource-constrained system, but Maat could support that; you'd probably just have to modify the APBs and ASPs to throttle what's actually being measured.
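On the disk-versus-memory comparison Andrew mentioned, a very simplified sketch of the idea is below. It only handles the main executable's read-execute mapping of the current (or a ptrace-able) process, ignores relocations, partially backed final pages, and paging behavior, and is an illustration of the concept rather than the measurement the user space ASPs actually perform.

```python
import hashlib, os, sys

def exe_text_mapping(pid: int, exe_path: str):
    """Locate the executable (r-x) mapping of the main binary in /proc/<pid>/maps."""
    with open(f"/proc/{pid}/maps") as maps:
        for line in maps:
            fields = line.split()
            if len(fields) >= 6 and fields[1].startswith("r-x") and fields[5] == exe_path:
                start, end = (int(x, 16) for x in fields[0].split("-"))
                return start, end, int(fields[2], 16)  # start addr, end addr, file offset
    raise LookupError("no executable mapping found for " + exe_path)

def text_matches_disk(pid: int) -> bool:
    exe_path = os.readlink(f"/proc/{pid}/exe")
    start, end, offset = exe_text_mapping(pid, exe_path)
    # Bytes of the text segment as they exist in the binary on disk...
    with open(exe_path, "rb") as f:
        f.seek(offset)
        on_disk = f.read(end - start)
    # ...versus what is actually mapped into the process right now.
    # (Reading another process's mem requires ptrace permission; self always works.)
    with open(f"/proc/{pid}/mem", "rb") as mem:
        mem.seek(start)
        in_memory = mem.read(len(on_disk))
    return hashlib.sha256(on_disk).digest() == hashlib.sha256(in_memory).digest()

if __name__ == "__main__":
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    print("text section matches disk:", text_matches_disk(pid))
```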
Thanks for the presentation. My question is, are you using the TPM also to record suspicious activity, to be able to detect after the fact whether the system was compromised?

So we do perform TPM integration. The TPM is used to attest the identity of the system, and it can also be used to sign measurements. The TPM is not directly measuring, say, the user space or really any runtime processes; the TPM provides the lowest level of attestation of the identity of the system. It is also used to show, in the case that you're using XIM, that you actually booted coreboot with the STM-PE enabled and a version of Xen that supports measurement, and that you are running a real version of XIM. So it's used at boot time to show that the measurement agents themselves have not been modified, and from that level on you can use those measurement agents with some confidence that they are uncorrupted and that you can trust the measurements that come out of them.

So basically it's a measurement of the current state, not of a past state? For example, if a process was corrupted and then somehow the memory was replaced with a corrected version, you would only be able to detect it if you measured in the middle, while the process was still corrupted and had not yet been reverted to the good state?

If I understand your question correctly, then yes. The goal of Maat, or of runtime integrity measurement, is that if a process or a kernel becomes corrupted after boot, after it's been measured by the TPM or after the kernel's IMA has measured the binary before loading it, and that corruption happens later, we are actually able to find those corruptions, implants, or malware and flag them. That is the purpose of Maat and runtime integrity measurement. Okay, thank you. Thank you.

I think you all ended up answering this a couple of questions ago, but in the case of, let's say, a server machine with a whole lot of virtual machines, could Maat be configured to have contracts for doing only one or two virtual machines at a time, to reduce the delay before getting those results?

Yes. Maat was designed with a lot of extensibility and flexibility in mind, so those are use cases that we have thought about and that are supported. Configuring Maat can be somewhat laborious at this time, because that flexibility obviously makes configuration more complex. But if you wanted to measure only one or two VMs on a system, or all of the VMs on a system, those would all be supported use cases. Okay, thanks.

I have a virtual question coming in: how do you detect implants in a running process? So there are a variety of ways you could put an implant into a process, and the answer is going to vary depending on how the implant was made. For instance, if there's a buffer overflow that's used to overwrite some of the text section, then you can detect that by hashing the text section and seeing that it's been modified. If you have a process whose text section has a known expected state, and your implant has modified that text section, then you're able to flag that as a deviation from the expected state.
There's a component called a baseliner which figures out how a process is supposed to look throughout its run, and the measurement then looks at the actual state of the process and compares it to that baseline. That comparison is done by the appraiser, and it will find those deviations. Some of the things we look at include, for instance, hooking in the PLT; that is something done by various malware such as Symbiote. We know what the procedure linkage table of a process is supposed to look like, and if malware or an implant modifies it through PLT hooking, we can detect it. We also look at a variety of other data structures, not kernel data structures, but data structures used by the loading process and the dynamic linker, all of those things that are used by malware to hide. So I guess the short answer is that we are able to detect an implant by looking at as many things as possible, putting that data into a measurement, and then comparing it with a baseline of how the process is supposed to look. The ability to detect is really only as good as your baselining ability and your measurement ability. We go into quite a bit more detail in the user space integrity measurement paper on all of the things we know how to check and how to detect them. I hope that answers the question; it was a broad one.

I'm just wondering how accurate you find this to be in real life. If I'm running Firefox, since you mentioned the browser, and I download a new plugin and start it, am I going to immediately get a false positive?

I don't think we've actually looked very closely at web browsers specifically, but that is the kind of thing you would probably flag, and that is going to be configurable. If you want users on your network to be able to install add-ons or extensions into their Firefox, you might configure the appraisal step to just ignore what the extensions and add-ons are. And then of course there's the caveat that if you're not looking at the extensions and add-ons on a system, those become attack vectors that you're not measuring. So there's a trade-off in terms of how much work you're willing to do and how much security you're willing to get out of an approach like this. But do you have results in one of the papers? Yes, the user space integrity measurement paper, I believe, contains some experimental results for things like that; I believe we looked at some real-world Linux malware.

And then a completely different question: you mentioned Xen. What about KVM? That is on our to-do list, I think, and I believe there's actually a subsequent paper coming, or a talk coming up, about measuring the Linux kernel through KVM. Cool, thanks.

No more questions? Okay, thank you for your time and your attention.