Good afternoon. I'm Dave Safford from GE, and Monty Wiseman and I will be talking about a canonical event log structure for IMA. It's been kind of interesting over today and yesterday: I've heard lots of comments about attestation, and I've heard it described as difficult, scary, and terrifying. So perhaps I should retitle this "Adventures in Attestation."

So what are we doing with attestation at General Electric? GE makes a lot of critical infrastructure: transportation, renewables, power generation (something like 50 percent of the electricity in North America is generated on GE gas turbines), oil and gas, energy storage, aviation jet engines. These are very, very large critical infrastructure businesses. And there has been a very large business drive to connect these devices, the embedded controllers for these systems, into analytics systems, because there's a tremendous amount of money to be made by optimizing them. We have controllers running everything from $35 devices all the way up to $35,000 flight control systems, and it's imperative for us to combine the existing real-time control system with an analytics-capable edge OS, basically a Yocto distribution of Linux running containers, to communicate with the cloud and sometimes do local optimization and analytics. The numbers at the bottom are what we've already been seeing with some of our applications: 10%, 5%, 3% savings, 17% cost reduction, 1% in aviation. These represent a tremendous amount of money; we're talking billions of dollars to be saved.

This all sounds well and good until you start thinking about the security aspects. Mike was saying that attestation terrifies him. What terrifies me is that currently I don't know what software is running on the controllers that run this critical infrastructure. Attestation is a critical piece of our security architecture for answering that question: is what's running on these systems still what we intend it to be, or has it been compromised? Attestation is one thing that can really help us there.

This has been part of a multi-year effort to create the entire security chain of trust working up from the hardware: CPU selection, the boards, putting TPMs on the boards, firmware. We're looking at all aspects of secure boot: protected boot to protect the actual flash, verified boot, measured boot, and DRTM, the Dynamic Root of Trust for Measurement. We think all of these play an important part in the architecture. At the operating system level, the Integrity Measurement Architecture collects measurements and signatures so that we can actually do the attestation. This year we finally got all of the lower layers done, so we've moved up to the attestation layer, and that's been our primary thrust of work this year.

The issues we're running into, the things we'd like to enhance in attestation, fall into three main areas. The first is scalability. We've hit issues in some of our very small, and surprisingly even some of our very large, controllers where the measurement list being kernel-memory-resident (over long-term operation it's effectively a kernel memory leak) can actually cause problems.
So we're looking at ideas for moving these measurement lists out of the kernel into user space and managing them there. In some sense there's really no need to keep them in the kernel: the data itself is protected because it's anchored in a PCR in the TPM, so regardless of where it's stored, we can authenticate it and validate it in a strong way.

The second issue is completeness. If you look at the chart of what's currently there, IMA can do local appraisal of the data in a file, and it can also do remote attestation of the data in a file. EVM is about local protection of the metadata of the file. What we'd like to do is extend this to include remote attestation of metadata. With the current scheme, remote attestation tells us the data of the file is correct, which is good, but we also need to know: is the security label on it still correct? Is the mode still correct? Has somebody made it setuid root? These are the sorts of things we need as part of our attestation.

The third aspect is standards compliance. One of the big issues in taking on the massive task of fielding client and server attestation systems is that it gets really hard if everybody does things a little bit differently, so standardization is a very important part of this. Monty has been leading the TCG standardization effort there, and between the two of us we're trying to build a common design, a canonical event log based on type-length-value (TLV) triples. I've been doing a proof of concept of that on top of IMA to actually test the design and make sure it's feasible and usable.

Some additional bonus things we're looking at: sequence numbers for all events, so we can keep things synchronized wherever the data goes, from kernel to user space to the attestation server; timestamps, and this is not the timestamp on the file, but a timestamp on when we actually measured and verified the file (was it today, or a year ago?); and flexible, dynamic selection of the included fields. There are other benefits too: if we move the measurement list out of the kernel, that actually makes some other operations, like kexec, easier.

So that gives an overview of the different things we're looking at in this proof-of-concept work. This is very much an RFC right now: the standard is in development and not final, and the code is very early code, so it's a very good time to interact and tell us where you'd like to see it go. With that, I'll turn it over to Monty to talk about the standardization effort. Thank you.

I want to reiterate, since somewhere we lost the disclaimer, that this is a draft specification, which TCG has given approval to present because it's still work in progress. But after several months of boiling this down, we think this is the format we'd like to go with. As David said, we had a real problem: today we have PC Client measurements, or server measurements, of firmware, but we also have measurements from IMA, and we want to convey all of these out to a verifier. If you think about it, what's the point of all this attestation if you don't do anything with it? It goes to a verifier, and the verifier wants to look at it, compare it against known measurements, for example, and make some sort of trust decision.
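To make that verifier step concrete: because each record's digest was extended into a PCR, the verifier can replay the log and compare the result against the PCR value the TPM reports (and, in a real attestation, signs), which is also why the list can move out of the kernel without weakening it. Here is a minimal verifier-side sketch, assuming a single SHA-1 bank, with record parsing elided; OpenSSL provides the hashing:

```c
/* Verifier replay sketch: fold each record's digest into a simulated
 * PCR (reset value all zeros) and compare with the PCR 10 value the
 * TPM reported. Assumes a SHA-1 bank; parsing the records out of the
 * log is elided. Build with -lcrypto. */
#include <stdint.h>
#include <string.h>
#include <openssl/sha.h>

int verify_log(const uint8_t digests[][SHA_DIGEST_LENGTH], size_t n,
               const uint8_t quoted_pcr[SHA_DIGEST_LENGTH])
{
    uint8_t pcr[SHA_DIGEST_LENGTH] = { 0 };    /* PCR resets to zero */
    uint8_t buf[2 * SHA_DIGEST_LENGTH];

    for (size_t i = 0; i < n; i++) {
        memcpy(buf, pcr, SHA_DIGEST_LENGTH);
        memcpy(buf + SHA_DIGEST_LENGTH, digests[i], SHA_DIGEST_LENGTH);
        SHA1(buf, sizeof(buf), pcr);           /* PCR = H(PCR || d)  */
    }
    return memcmp(pcr, quoted_pcr, SHA_DIGEST_LENGTH) == 0;
}
```

A real verifier would also recompute each record's digest from its content field before folding it in, which is exactly what the dump tool in David's demo does later in the talk.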
So we started looking at this and thinking: we've got these PC Client structures, defined back in 2001, and we have the IMA structures. How do we encapsulate all of that? What we've created is basically an encapsulation, and you can put different sets of things inside it, as I'll describe in just a second.

The overall format you're seeing here is four fields inside each record, and each field is itself a TLV. We start with a record number, and this solves the problem David articulated: if you pull this stuff out of IMA, for example, how do you know the sequence? The sequence is incredibly important, because that's how you verify the log against the PCR, PCR 10 in IMA's case. The same applies for PC Client. If you've ever looked at the PC Client spec and how it keeps track of measurements, the requirement is simply that they are sequential in memory, and that's how they're maintained, because nobody wanted to consume the extra space for a sequence number; the assumption is that when you go get them out, they are, by requirement, in sequence. But when you stick this thing on a wire, or start putting it in a database, you're really going to want a record number. The other reason for the record number: maybe the server is disconnected for a while, then it reconnects and you want to ship off 10,000 records, or whatever it is. You want to know where to start when it gets reconnected; you want to be able to put all the pieces back together again. So the very first thing we added to this wrapper is a record number.

Within each one of these fields is a tag, and my position right now is that this is an octet, 8 bits, because we're going to be very, very stingy about how we allocate these; I think there are very few we need to create. That's followed by a length, obviously, and then a value. For the record number, the value would simply be an unsigned integer holding the sequence number. We wanted to keep the TLV encoding consistent across all fields. We really could have just put a raw UINT32 in there, and we debated it back and forth, but we wanted to make things very consistent for the parsers as they walk through this. So every one of these fields is a TLV, even though technically it doesn't need to be. The same goes for the PCR field: there aren't that many PCRs, not even that many possible in the TCG spec, but it's a TLV regardless. That's followed by a digest, and I've got another slide that walks through exactly how the digest works, and then the actual event content.

What we will do, we being TCG at this point, is define this layer of tags architecturally. That's the distinction between what's TCG-defined, which I've arbitrarily called CEL here, for Canonical Event Log, and what's content-defined; TCG will define what the CEL tags are. Here's the breakdown for right now. We decided to start with zero, because everything starts with zero anyway, but it's also really convenient that every record starts with the value zero, so for debugging it's easy to distinguish between records.
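As an illustration of that framing (one-octet tag, then length, then value, with every record opening on the tag-zero record number), here is a minimal sketch serializing the first two fields of a record. The four-octet little-endian length is an assumption for the example; the draft spec governs the real encoding:

```c
/* Sketch of the CEL field framing: 1-octet tag, 4-octet length
 * (little-endian here by assumption), then the value. A record
 * always begins with the record-number field, tag 0. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CEL_TAG_RECNUM 0x00   /* every record starts with this */
#define CEL_TAG_PCR    0x01

static size_t tlv_put(uint8_t *out, uint8_t tag,
                      const void *val, uint32_t len)
{
    out[0] = tag;
    out[1] = (uint8_t)len;            /* length bytes, LSB first */
    out[2] = (uint8_t)(len >> 8);
    out[3] = (uint8_t)(len >> 16);
    out[4] = (uint8_t)(len >> 24);
    memcpy(out + 5, val, len);
    return 5 + (size_t)len;
}

int main(void)
{
    uint8_t buf[64];
    uint8_t recnum[8] = { 0x54, 0x05, 0, 0, 0, 0, 0, 0 }; /* 1364 LE */
    uint8_t pcr = 10;                 /* the IMA PCR */

    size_t n = tlv_put(buf, CEL_TAG_RECNUM, recnum, sizeof(recnum));
    n += tlv_put(buf + n, CEL_TAG_PCR, &pcr, sizeof(pcr));

    for (size_t i = 0; i < n; i++)
        printf("%02x ", buf[i]);
    printf("\n");
    return 0;
}
```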
Then we allocate a tag of one for the PCR, two for the TCG digest field (I'll get into why I put "TCG" in front of that), and then three on up defines the content type. In the table below I'm describing some of the content types. One is context management, which I'll get to in a later slide. Four and five, for example, encapsulate the firmware measurements: the stuff that comes out of the PC Client firmware, the TCG_PCR_EVENT2 structure, for example, is what that content would be. I've defined two different types there; I don't have slides on those, because we want to save time to focus on IMA. Then I've allocated six for IMA (these numbers are just made up for now, because the spec is still in development) as a content type for IMA legacy, which is the IMA we all know and love today. And then we're proposing, and David has some code, a new format for IMA, which we're calling IMA-TLV. Each one of these describes the event content field on the very right.

So let me get into how the digest works. I don't know if everyone's familiar, but TPM 2.0 is what's called algorithm agile. You don't just have a single bank of PCRs; PCRs are actually a two-dimensional array, if you will. Every PCR number can be associated with multiple hash types: PCR 1 might have a SHA-1 bank, and it might also have a SHA-256 bank; it might have a SHA-3 bank; it might even have an SM3 bank, which we've allocated for the Chinese set of algorithms. So when you address any PCR, depending on how the BIOS or the owner of the machine has provisioned it, you will likely have one or more banks associated with one particular PCR index. We've accommodated that here, because when you do an extend operation on a TPM 2.0, you actually hand it a hash value for each bank you want to extend, and we're reflecting that in the digest field. The TCG digest field says: this contains an array of digests, with a length for the whole thing. So if your TPM was provisioned with two banks, SHA-1 and SHA-256, you would put a SHA-1 hash and a SHA-256 hash in there. The digest ID identifies the algorithm, and this is why I call it the TCG digest: the IDs are taken from the TCG algorithm registry. If you download the registry, it has a set of enumerated values identifying SHA-1, SHA-256, SM3, and so on.

Lastly, then, is the content type. Again, TCG will define the high-level content types, and I enumerated a couple before, but the important thing is that once you're inside the V portion, the data portion of a content type, how it's defined is entirely up to the content owner. For example, the two PC Client types I've allocated I'm handing over to the PC Client group, saying: you define how you want your event log to be transported inside this. And we've got one set up for IMA legacy, which I'll describe, and for IMA-TLV; those will define exactly how their data and their algorithms are expressed inside the content type. There's a minor typo on this slide: I had 80 through 8F, thinking that set the upper bit, but that was actually intended to be 03 through FF.

So, to kind of wrap this up: in the ideal world, with new content types going forward, all you would really do to create the value that gets extended into the TPM is hash the entire content field, extend that into the TPM, and stick that same value into the V of the digest field. So the value in this V is the exact same value you passed to the TPM extend operation itself.
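To pin that rule down with an example: hash the serialized content field once per provisioned bank, hand those digests to the extend operation, and record the identical bytes in the digest field. The content bytes below are placeholders, while the algorithm IDs are real TCG registry values; OpenSSL does the hashing:

```c
/* Sketch of the extend rule for new content types: the value extended
 * into the TPM is the hash of the entire content field, one digest per
 * provisioned bank, and the same bytes land in the digest field.
 * Content bytes are placeholders. Build with -lcrypto. */
#include <stdint.h>
#include <stdio.h>
#include <openssl/sha.h>

#define TPM_ALG_SHA1   0x0004   /* TCG algorithm registry IDs */
#define TPM_ALG_SHA256 0x000B

int main(void)
{
    /* Placeholder serialized content TLV (tag, length, value). */
    uint8_t content[] = { 0x07, 0x03, 0x00, 0x00, 0x00, 'f', 'o', 'o' };

    uint8_t d1[SHA_DIGEST_LENGTH], d256[SHA256_DIGEST_LENGTH];
    SHA1(content, sizeof(content), d1);       /* for the SHA-1 bank   */
    SHA256(content, sizeof(content), d256);   /* for the SHA-256 bank */

    /* d1 and d256 are what the TPM2 extend is handed for this PCR,
     * and exactly what is recorded in the digest field, labeled with
     * TPM_ALG_SHA1 and TPM_ALG_SHA256 respectively. */
    for (int i = 0; i < SHA256_DIGEST_LENGTH; i++)
        printf("%02x", d256[i]);
    printf("\n");
    return 0;
}
```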
Let me touch on CEL management, which David talked about a little. We thought there might be some pretty interesting things here. Unfortunately we lost some of the bullets, so this may be a little difficult to read, but we thought it would be a good idea to provide some metadata around the event log itself. For example, as David mentioned, a timestamp: you've got a bunch of events that happen in IMA, or maybe at the transition between the firmware and the first IMA measurement, and you might want to know exactly what time that happened. You would be able to put a timestamp CEL management event right at that point. There are other concepts, like EV_SEPARATOR events, and we're still deciding; we think all of them are security sensitive, but there might be some that are informative. For example, a version number. I think it's on the next slide, where I give an example of what the stream of these TLVs would look like. I think it's a good idea to first send off a version, especially as we rev this spec: a management type that simply says, this is the version of the spec this machine's stream is built to. We'd probably start with something like that. Then there would obviously be an array of PC Client measurements that came out of the firmware before the OS even started. Then you might have something like an EV_SEPARATOR: okay, the firmware is ending at this point, and we are now starting the set of IMA event logs. And then I thought we probably need something for systems that go into sleep or hibernate, right? These things go into a suspended state, and that somewhat changes their security properties, so it would be a good idea to log a measurement as the machine goes into sleep or comes out of it. These are examples of the kinds of things we can do with the management events.

So let me walk through the workflow a little bit, really starting from the bottom. The first thing, when I shopped this around the rest of the TCG working group, was that all of the PC OEM vendors panicked: you're going to make me change my firmware. No, we're not going to make you change your firmware. They've got their TCG_PCR_EVENT structures; don't change them. There's going to be a utility, and I'm going to work on that, actually, to convert the log into this new format when it needs to be transported off the machine. So we can map the existing stuff into this new canonical event format. The same thing goes one more level up.
If you've got the existing IMA, you might not want to change your kernel, but you do need to transport this information, and you want it to go to a verifier that understands this format, so we would write a utility for that too. And if you have a new module, like David is going to talk about in a few minutes, where the kernel itself produces these TLV records, there would be no translation at all; it would just transfer directly.

So let me show you an example of our proposal for a new, very extensible way to represent IMA integrity measurements. Inside the content field you can have an array of TLVs representing that particular event: it might be the mode, it might be the path, it might be the actual hash of the data. David and Mimi will end up owning this and defining what goes in here, and you can have an array of these things. Even within them, we've architected in the ability to be hash agile within the IMA structure itself. For example, if you wanted to create a SHA-1 hash of a file, because the server you're sending this to is legacy and wants SHA-1, and at the same time produce a SHA-256, you'd be able to simply append both, just as in the digest field, and that would just be part of the content-type record. Then, when you're done with all of your measurement of the file, or the data, or whatever you're measuring, you hash these, and what I'm depicting here is the actual structure that's sent to the TPM: a TPML_DIGEST_VALUES. You'd create one entry for the SHA-1 and one for the SHA-256, do the extend operation, and then simply copy those up into the digest field, and now the whole record is complete.

That was actually a lot. Before I go on to the example of legacy IMA, are there any questions, or did I completely lose everybody? Okay.

So how do we deal with IMA legacy? As an experiment, and again this is a utility I'd like to write, you've got the templates depicted on the left-hand side; how do we represent them? On the right side there's the concept of an IMA content tag, a length, and then the value, for the various templates. This is an exception to what I said about hashing the entire content field, because current IMA doesn't have that concept today. If you were doing the "d|n" template, for example, you would take the hash of the file content and make a TLV record out of it, identified here as an IMA template. The d field represents the hash of the file content; the n field represents the file name, the string of the file name. What IMA legacy does is concatenate those and hash them, and then extend that. So in this case you would hash these two together, just as IMA does today, put that into the V field, and that value is what gets extended into the TPM; or rather, it was already extended, which is why I have a slide on it.
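As a hedged sketch of that legacy translation (the path is hypothetical, and the fixed-width padding the real legacy template applies to these fields is elided), recomputing the extended value looks like this:

```c
/* Legacy "d|n" translation sketch: V = SHA1(d || n), where d is the
 * file-data hash and n is the file name, matching what legacy IMA
 * extended into PCR 10. Real template padding/width details elided;
 * the path is a made-up example. Build with -lcrypto. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <openssl/sha.h>

int main(void)
{
    uint8_t d[SHA_DIGEST_LENGTH] = { 0 };    /* hash of file content */
    const char *n = "/usr/bin/example";      /* hypothetical name    */

    uint8_t buf[512];
    size_t len = 0;
    memcpy(buf, d, sizeof(d));        len += sizeof(d);
    memcpy(buf + len, n, strlen(n));  len += strlen(n);

    uint8_t v[SHA_DIGEST_LENGTH];
    SHA1(buf, len, v);                /* the value that was extended */

    for (int i = 0; i < SHA_DIGEST_LENGTH; i++)
        printf("%02x", v[i]);
    printf("\n");
    return 0;
}
```

On the verifier side, redoing exactly this computation from the d and n TLVs is what lets the records match the PCR again.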
So this is how you would write this utility: as you read the IMA securityfs and pull this information out, this is how you would translate it, so that on the other end someone can use this standard format to redo the calculations, and they would just match. So anyway, any questions on the overall design? The actual deck is about 45 slides, so this was a summary. Okay, so David has some real-life examples here.

Oh, I'm sorry, yes? So, are you worrying about how to export this across machine boundaries, where you have to worry about encoding issues such as byte order, et cetera, or is that handled somewhere else, by some other layer?

Yes, that's going to be handled. I've got a colleague who's working on a CBOR implementation, so the next phase is starting to think about those kinds of representations as the data leaves the machine. Obviously David's got this little ASCII dump, but that's not what we're going to send. We're actually going to transport this on top of a new protocol being developed by TCG called PTS, Platform Trust Services. Like I said, my colleague is working on CBOR today, but we expect a JSON and a CBOR encoding. This is as much an information model as it is a data representation, so you could take this information model and map it to lots of other formats for transport. Yeah, thanks. Questions? Okay.

So, it's all well and good to have slides and a nice design, but the proof of the pudding is actually writing code and seeing if it works. So we've done a proof of concept of this, in three parts. The first part: while most of the template code was nicely isolated in ima_template and ima_template_lib, there were aspects of templates spread through some of the other files. So the first step was a refactoring to pull the template pieces out into additional files, an ima_fs_template and an ima_queue_template, and to pull out a separate ima_template.h, without changing any functionality; just refactoring. The second part was to write the TLV code, and I think the current proof-of-concept code demonstrates nicely how simple this can be when you do everything in TLV. The total is 480 lines of code, compared to the 1,700 lines in the existing template code, out of about 4,000 lines in the basic part of IMA. So it becomes very, very simple to do this, and that's kind of compelling. The way I've done it in the proof of concept is as a Kconfig option you can select: the IMA measurement log format is either template or TLV. I've run regression tests against the original template version to make sure it didn't break anything, and separately I've been testing the TLV side.

As for how easy it is to add a field: it's only a few lines of code. You define a content type; once you have a content type, you have some code that figures out how long that field is going to be; then another couple of lines of code actually fill in the field. And that's it; all the rest of the infrastructure handles it for you. There are helper functions, like an ima_tlv_buf routine, that do the serialization and handle endianness.
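Here is a sketch of what those few lines amount to; the names are illustrative, not the actual proof-of-concept symbols, and the canonical little-endian conversion is the part an ima_tlv_buf-style serializer centralizes:

```c
/* Sketch of adding one new field in the PoC design: a content tag,
 * a length function, and a fill function; the shared serializer then
 * emits tag + length + value. All names here are illustrative. */
#include <stdint.h>

#define IMA_TLV_OWNER 0x04            /* hypothetical new field tag */

/* Step 1: report how long the value will be. */
static uint32_t owner_field_len(void)
{
    return sizeof(uint32_t);          /* uid as a 32-bit value */
}

/* Step 2: fill in the value in canonical little-endian form. */
static void owner_field_fill(uint8_t *out, uint32_t uid)
{
    out[0] = (uint8_t)uid;
    out[1] = (uint8_t)(uid >> 8);
    out[2] = (uint8_t)(uid >> 16);
    out[3] = (uint8_t)(uid >> 24);
}

/* Step 3: nothing; the generic TLV infrastructure does the rest. */
```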
Back to the endianness question: we're doing a canonical format for the IMA content, which is little endian, and everything is converted. I haven't actually had a chance to test this on a big-endian machine, but hopefully the code works correctly there and produces the canonical format.

So what does this look like actually running? The first thing is that the existing binary_runtime_measurements and ascii_runtime_measurements are persistent data; you can keep using them. In this model, when you cat the IMA TLV runtime measurements file under /sys/kernel/security, it actually reads the data out one time and deletes it from the kernel, so we eliminate the memory-leak issue. What you do then is have a utility take the data and save it somewhere, and that can be as simple as cat with a redirect to a file like bindata. Then you can take that binary data and run programs on it to do the attestation, or to do analysis locally, now that you've preserved the data in user space.

This shows an example with just two fields; it's an ASCII dump of the binary, but what actually gets transported is binary. You have a sequence number (this was the 1,364th entry), PCR number 10, and the TCG digest, which in this case is a SHA-1: that's the digest of the entire record. The record includes the PATH, and I ran the TLV dump program on it; the data hash for the file is a SHA-256, which is that value. The dump program also does verification on the list. In this case it says the digest matches the content, which is good; it means this particular record has not been tampered with. And after all the records are analyzed, having kept track of what the PCR 10 value should be after all of the extends, it says the final PCR value should be this; and in fact, if you do a TPM2 PCR list, you can see that it comes up with valid answers. So we've been through two different things: one side creating the data, and another side analyzing it. It would have been better if two different people had done that, to really stress-test the implementation, but at least we've gotten to this level of testing.

So before I go into a summary, let me actually show this running. This is running on my laptop here; I've been doing kernel compiles on it, and it hasn't panicked in at least three days, so the code must be ready to ship. Here's the example of reading whatever new measurements are in the measurement list and redirecting them to a file, and here's the TLV dump program running on that; I'm just going to look at the last two entries. This shows some of the extended fields. Again we have the sequence number, PCR number, digest for the record, the data hash for the file, and the file name, and now the owner. This is showing some of the metadata fields that we're testing. Owner is zero, group is zero, and the mode... well, look at this: it actually turns out to be a setuid program. So these are the sorts of things you can now see remotely, getting visibility into what's running on your system: not just the data, but metadata about it. I'm sorry I didn't show an example including the SELinux label, but that would obviously be the next fun thing to show. And the digest matches the content, so, drum roll please, let's see if this actually worked. If we go back and look... okay, I get paid.
PCR 10 should be 8A through EB, and that is in fact what we see here. So the measurement list has not been tampered with, each of the entries is valid, and the list is valid. That's basically what happens in attestation, except we add the additional step that the TPM signs this value, as a challenge-response signature from the attestation server to prove freshness, and then the server gets the list and validates it. But that's showing the basic functionality.

So, going back to my summary slide: at a proof-of-concept level, this demonstrates being able to pull the measurement list out of the kernel; it shows attesting to the metadata on a file; it shows how relatively simple the measurement list is to write and to parse; and it demonstrates a validation of the draft standard, a sanity check on the standard itself. And at this point, like I say, this is all draft, all proof of concept, which is the perfect time to get feedback. We have lots of questions of our own, and feel free to give suggestions and comments as well.

One question: in the current code it's either template or TLV, so you can compile the kernel to be backwards compatible for user space, or you can select TLV. Is there a use case where we might need both? I can't think of one, so I didn't implement it, but maybe somebody can think of a use case. The other memory leak I took out was the in-kernel hash table, and Mimi already warned me that I shouldn't do that. But the question is: if you have all the data outside, we can certainly regenerate a hash table in user space, so is there still a need for an in-kernel hash table like before? If there is, it's not hard to add back, but we're trying this as a proof of concept to see whether it works without the hash table. Long term, is this something we want to move forward with, deprecating the existing template code? One of the convenient things with TLV is that it's pretty easy to have forward compatibility as these things change: even if the sender and receiver aren't exactly in sync on versions, the fact that everything is TLV-formatted means we can still parse it. We may not understand all of the fields, but we can still parse it and handle the fields we do understand, on both sides. So it makes upgrades easy.
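That forward-compatibility property is easy to see in a parser sketch: walk tag, length, value, handle the tags you know, and skip the rest. The framing assumptions match the earlier sketches (one-octet tag, four-octet little-endian length):

```c
/* Skip-unknown TLV walk: a receiver that doesn't recognize a tag just
 * advances past its value, so sender and receiver need not agree on
 * the complete field set. Framing assumptions as in earlier sketches. */
#include <stdint.h>
#include <stdio.h>
#include <stddef.h>

static void walk_tlv(const uint8_t *p, size_t total)
{
    size_t off = 0;
    while (off + 5 <= total) {
        uint8_t  tag = p[off];
        uint32_t len = (uint32_t)p[off + 1]       |
                       (uint32_t)p[off + 2] << 8  |
                       (uint32_t)p[off + 3] << 16 |
                       (uint32_t)p[off + 4] << 24;
        if (len > total - off - 5)
            break;                    /* truncated field: stop */
        switch (tag) {
        case 0x00: printf("record number\n"); break;
        case 0x01: printf("PCR\n");           break;
        default:   printf("unknown tag %u, skipped\n", tag);
        }
        off += 5 + len;               /* advance either way */
    }
}
```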
Other fields? Are there other aspects of data or metadata that would be interesting to include, that we could consider adding? Any other comments or suggestions? There's still a lot of work to do on this in the standards committee, obviously, and there's a lot of work to do if we actually want to upstream this: refactor it, clean it up, incorporate any suggestions you have, and work with the community on that. So feel free to give us feedback. With that, questions? I think we stunned them into silence. Really bad, or really good?

So, timestamps. I'm sure there are situations where you can get decent enough time to be sure of timestamps, but I'm sure there are also situations where you can't. Putting it in as an option means that people will use it in places they absolutely shouldn't, which is always a danger. And there's also the problem that if you're taking it from multiple systems and doing the remote verification, you need ways of managing that. Just a comment, really: obviously the numbering makes a lot of sense, but be wary of the timestamping, I think.

Absolutely. Our use case for timestamps is an embedded controller that's been running for a year, and the convenient thing is to take the timestamp that's there. What we're actually taking is just the seconds from the epoch, and even that, like you say, is probably ridiculously precise considering what we have, but it seems like a reasonable thing to do. The TPM actually has a time capability, and in theory, though we haven't worked on it and it's literally just a placeholder timestamp right now, one way I've thought about is using the TPM's capability of creating a timestamp and putting that in the record. So among the options, I think that's one of them.

Uh-oh, it's time to run and hide. Yes?

Okay, so you're getting rid of the measurement list. Let's say we're in the kexec case: you're kexec-ing something, and let's say you're not doing appraisal, just measurement. How are you going to know that you didn't kexec something whose measurement wasn't included, and that the next boot is actually doing measurements? That kernel might not have a policy to do measurements. So you've now introduced a vulnerability: you booted something, it's not going to show up in the measurement list, and there's no way of showing it; the TPM won't be extended, because it's not included in the policy.

I think there are a couple of different issues there. The first is that, admittedly, we're pushing the complexity of handling kexec up into user space, but I think that's really, in some sense, the right place to do it. What kexec, as the application, would need to do is make sure it has flushed all measurements out of the existing list before it actually triggers the kexec. As long as it does that, there are no measurements to be lost: the new system comes up and starts measuring from that point. So there should not be anything committed to the TPM that was not already exported to the existing list before the kexec.

But there's nothing there that's going to show that you did a kexec.

Why wouldn't we measure the kexec event?

There might be a gap, because you might have multiple kexecs.

Well, we certainly would get the measurement of kexec itself, and anything that would be triggered by that. But it's a good point; we do need to look at that. In essence we're pushing that complexity up into user space, but I like taking complexity out of the kernel and putting it up into user space. Thank you.