Thank you for joining me today. My name is Harris. I'm a software developer at National Instruments, or simply NI as we're now called. I work on embedded Linux firmware for various industrial products at NI. Today I want to talk about TPM chips, a fairly common peripheral found on many computers these days, and some of the benefits and problems of using them to improve security on Linux systems, which is something that we've recently experimented with for some of our more security-minded customers. So a disclaimer: I'm not a security researcher. Neither I nor NI are part of the TCG, nor are we a TPM vendor. I'm coming to you today as a user of the technology to share some of the capabilities and problems that we've explored, and I hope you find this useful in your work. So TPM stands for Trusted Platform Module, sometimes called a security chip or a crypto chip. It is a kind of hardware security module somewhere on your board or inside your CPU. They're sometimes confused with crypto accelerators, a different kind of hardware security module. I can tell you these chips generally accelerate nothing. They're usually fairly slow devices. They come in various flavors. So the chip variant, sometimes called a discrete TPM, is an independent peripheral connected via LPC, or sometimes via SPI or I2C, to your CPU. Sometimes they're soldered onto your board. Sometimes they're little removable modules plugged in somewhere. More common these days, though, especially on consumer hardware, is some kind of firmware TPM like Intel's PTT, which runs on top of their Management Engine, or AMD's TPM application on their Secure Technology subsystem. There are also a variety of user mode simulators out there which can be used in VMs or for prototype and test applications. So the TPM is primarily a key manager. It can hide secrets, sometimes called protected objects, and use them conditionally based on some authorization policy defined at creation.
So these authorizations can be simple passwords supplied with a command, or one of many other complicated schemes defined by the object's creator. The other major function is a kind of logging capability. So the TPM can track hashes of running binaries and other system configuration contributed by the software running on the CPU. And this state can be used as one of those authorizations for a protected object. So for example, you can create a key that can only be used or retrieved from the TPM if the system is in a certain predefined state. And this can be done in addition to other authorization constraints like a password. So these two functions are primarily what we're going to be exploring today. They enable a kind of secure boot mechanism for the OS and can also extend to user mode applications for key management functions to some degree. Now, the scribing functionality can also extend to external actors. So for example, a TPM can produce a signed attestation object to prove the state of the local system to some external actor, like a network system, for example. This has some interesting dynamics. So it can be used, for example, as part of an intrusion detection system, a kind of state auditing system, or a credentialing mechanism to bind a user credential to a specific machine. It can also do some bad things too. Like it can be incorporated into DRM schemes to deny users access to services when they try to run modified software on their systems. Security technology often cuts both ways. It can help users or it can harm them. And my hope is you'll appreciate some of the benefits of the technology today. So we'll be focusing more on the local use cases, protecting the local data on systems, today. Beyond that, it's just a slow crypto engine with some persistent storage, used internally but also exposed to the user, and a random number generator for generating keys independently of the CPU.
Now, this is also a good point to bring up the differences between TPM1 and TPM2. So TPMs have been around for a while. TPM2 was introduced a few years ago. It's a large rework of the wire protocol used to communicate with the chip. These changes were introduced primarily to modernize the crypto offered by the chips themselves. So TPM1, for example, required SHA-1 hashing and RSA-2048 for authentication operations and HMAC operations. TPM2 increased that to SHA-256 and later SHA-384, and added a couple of elliptic curves. But more significantly, TPM2 also made the crypto engine much more flexible. So the individual vendors of the chip can add other algorithms as the default ones age or become otherwise untrusted. But in the process, much of the API has also changed. And this is what's most significant to software developers besides the crypto offering, because it necessitates a different software stack to drive one chip type versus the other. So the TPM software stack, usually called the TSS for short, is specified by the Trusted Computing Group, or TCG, which is a coalition of manufacturers and software vendors that define the behaviors of these chips. You'll find that they also love their acronym jargon, so some confusion is natural here, but I'll hopefully demystify some of that today. In a nutshell, the TSS is a collection of C APIs for interacting with the TPM. It exists largely because correct use of the TPM, in a way that makes some security guarantees, usually involves complex message formatting and validation on the CPU side. So it's more involved than a simple command-response device, and the TSS tries to abstract away some of this complexity. So the API is layered into different OS abstractions. The Feature and Enhanced System APIs provide nice things like heap-allocated objects and file-based handle storage, like for PCR policies or key handles or what have you. They can bind to a crypto library on the CPU side for computing HMACs or doing parameter encryption.
They're basically kind of nice, easy-to-use application interfaces built on top of a libc. The System API provides more primitive message formatting operations. So there's no heap, no file IO or crypto libraries. You sort of bring all of that yourself, and it's more amenable to embedding. Now in user mode, the System API and the ones above it are built on top of the TPM Command Transmission Interface, or TCTI for short, which is this communication layer for targeting different TPMs on your system. It can, for example, talk to a local broker service, or via network socket to a process on your system or even a remote system, to talk to, like, a remote TPM chip. Locally, though, there's usually going to be a resource manager, which sort of marshals the commands and responses from different tasks to the actual hardware. So in Linux 4.12 and higher, this is the /dev/tpmrm0 device that the TCTI will talk to by default. Otherwise, on older Linuxes, you can talk directly to /dev/tpm0. And if you're using an older version of the chip, you can also set up a broker service, a daemon, and talk to it over a domain socket. There are a variety of TSS implementations to choose from, from various software vendors like IBM, Microsoft, Intel, Google and others. I'll be using the first one today, the tpm2-software stack, in my examples, mostly because it provides a relatively nice CLI interface from Bash. If you have a TPM1 in your box, which are still around in many systems, you'll need the old TrouSerS TSS instead. So the tools in this case will be different, but the concepts you see here are generally the same, so you can largely remap the workflows I'll be sharing today onto the other tool. Another good application to have is a simulator like swtpm. This one in particular is compatible with QEMU, and it's great for prototyping in a VM. You can also run it standalone and talk to it over a socket.
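To make that TCTI selection concrete, here's a rough sketch of pointing the tpm2-tools CLI at each of those transports. The exact TCTI names and the simulator port are illustrative and can vary by tpm2-tss version:

```shell
# Default on Linux 4.12+: the in-kernel resource manager
tpm2_getrandom --hex 8 -T device:/dev/tpmrm0

# Direct device access on older kernels (no in-kernel resource manager)
tpm2_getrandom --hex 8 -T device:/dev/tpm0

# A swtpm simulator listening on a local TCP socket
tpm2_getrandom --hex 8 -T swtpm:port=2321
```

The same `-T` option (or the `TPM2TOOLS_TCTI` environment variable) works across the rest of the tools shown today.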
So now that we've looked at a high-level view of the hardware and the driver stacks, I want to use the rest of the presentation to explore a few interesting use cases to illustrate what can be done with all of this. So one popular application of TPMs is measured boot. This can be used in conjunction with block or file system encryption to realize a kind of secure boot scheme which can obfuscate data when something in the system's boot path is altered. So this is sometimes conflated with UEFI Secure Boot, which can be incorporated into this scheme, but is somewhat orthogonal. UEFI Secure Boot is a bootloader signing mechanism that checks the signature of the bootloader file, the EFI program that's run, against some certificate database in the firmware before letting it continue. Boot measurements, on the other hand, are a more passive mechanism. They leverage the scribing functionality of the TPM to track changes as the system boots up and then defer enforcement to the operating system or the user, which makes it a little more flexible. So for Windows users, if you're familiar with BitLocker's passwordless disk encryption, this is what it's based on, and you can do the same thing in Linux as well, which is what we'll explore here. So that scribing functionality is realized by Platform Configuration Registers, or PCRs for short, which are a collection of hash banks in the TPM. So each bank is a checksum of some subset of software and configuration running on your box. So on the right is kind of a suggested breakdown from the TCG spec, and they typically alternate between code and configuration. So PCR0 contains the EFI firmware binaries, PCR1 is the firmware configuration, the hardware configuration that you're booting into. PCR2 is the option ROM code, then the option ROM config, and then the EFI bootloader binary and then the partition table that it came from, and so on and so forth.
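If you want to look at these banks on your own machine, something like the following will dump them with tpm2-tools, assuming a SHA-256 bank is enabled on your chip:

```shell
# Dump the static-chain PCRs from the SHA-256 bank
tpm2_pcrread sha256:0,1,2,3,4,5,6,7
```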
Past PCR7 is operating system specific, so it's based on what's installed on your system. So in this example, we're using Grub, a modified version of Grub, that measures the running commands into PCR8 and hashes all the files that it reads into PCR9. And PCR11 is used by our init system. And I don't recall why we skipped PCR10, but we did. So all of these measurements are made by the CPU, by whatever software is running on it. And the idea is that each program can measure the next thing to run before it yields control. And then, if everything is measured, a change can be observed later on in the boot flow to enforce something. So this model relies heavily on the irreversibility of these PCR hashes. So the first 16 banks can only be extended. They are rolling hashes, right? So the CPU can only concatenate new data onto the existing hash and update the PCR as kind of an atomic operation, and it can't reset it. It only resets to zero on startup. And so these first 16 PCRs measure what's sometimes called the static chain of trust, the chain of software involved in system bring-up. The registers above that, 17 to 22, are wholly managed by the operating system or the hypervisor on the CPU. So these measure what's sometimes called the dynamic chain of trust, which can go into different processes or different VMs. And so these banks are resettable, on a context switch for example, and can extend a single TPM's functions deeper into the software stack, like into user mode applications. So most UEFI firmwares will automatically make these static measurements throughout the normal course of booting. So at early boot, the pre-EFI phase will initialize the TPM with zero PCR values, followed by measurements of all the loaded software like drivers or option ROMs and the installed bootloader. And then from there, the bootloader can measure the kernel, the init system, other configuration files involved in boot, so on and so forth.
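The extend-only semantics can be sketched in plain shell. This isn't talking to a TPM at all, just imitating the rolling-hash rule (new PCR = SHA-256 of the old PCR concatenated with the measurement digest); the "measurements" here are stand-in strings:

```shell
# A PCR starts at all zeros and can only be extended, never set:
#   new_pcr = SHA256(old_pcr || SHA256(measured_data))
pcr=$(printf '%064d' 0)

extend() {
    local digest
    digest=$(printf '%s' "$1" | sha256sum | awk '{print $1}')
    # Concatenate the old PCR value and the new digest as raw bytes,
    # then hash the result to produce the new PCR value.
    pcr=$(printf '%s%s' "$pcr" "$digest" | xxd -r -p | sha256sum | awk '{print $1}')
}

extend "bootloader.efi contents"
extend "kernel image contents"
echo "$pcr"
```

Because each value folds in everything measured before it, reproducing a given PCR state means replaying the exact same measurements in the exact same order.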
You can continue this indefinitely, up until some point where there's an enforcement action, where the software, which hopefully has been measured at this point, requests some secret to be released from the TPM to continue booting. So for example, this can be the release of a key for an encrypted file system. And at this point, you can do just CPU-side file system encryption onto that partition. So this depicts kind of a software-only approach to whole-disk encryption. Another popular approach is the use of self-encrypting drives, which are governed largely by the TCG's OPAL standard. So this is where the key is released to a drive's firmware. And in this kind of system, you can transparently encrypt and decrypt the entire disk, independent of the OS, so long as your firmware knows how to drive it correctly. But it largely works the same way. I don't personally use them. I can see where it's appealing on systems with slower CPUs, but I've found that the software encryption is usually pretty good and pretty fast. Now, the security guarantees of PCRs largely hinge on where they're initialized. Because from a zero value, you can mimic any PCR state and therefore authorize any action that's based solely on PCR values. So this makes the very early phases of UEFI boot, the so-called security and pre-EFI phases, very critical. Ideally, you want this code to run from ROM, where it'd be very difficult to change without intrusive physical access to the system. And how secure this is is really up to your hardware vendor. This also means that plugging a TPM into a system whose firmware doesn't understand it can't really give you this kind of secure boot. So for example, using a TPM add-on board for Raspberry Pis is not really going to get you that much boot security anyway. So PCRs are only as trustworthy as the code initializing the TPM itself.
And there have been security challenges in this area in the past, and there are still some today, and we will definitely look at this a little later on in the presentation. But for now, let's stay in our idealistic world for a moment and just look at an example of this simple key management scheme with the TPM tools. So in the first block, we'll set up our TPM by creating a primary key. This is a TPM-resident object used to protect other secrets later on, like our disk key. So that second command, evictcontrol, will persist this key into the TPM's NV RAM. Persisting is totally optional. This is merely done for performance. The TPM derives primaries deterministically from a fixed-seed random number generator, from a combination of that seed and the parameters passed to the create function. So multiple runs of the same create will yield the same key. So you can basically remake it after boot. And this also allows you to make more primaries than you can necessarily store in NV RAM at any one time. So you can sort of make a key, use it, and flush it out. So the RNG seed is referenced by that hierarchy parameter in the create primary command. We're creating this one under the owner hierarchy, which is what the 'o' argument stands for. And this is for the operator of the system. So that seed will remain constant so long as the owner doesn't reset the hierarchy, at which point it randomly changes, invalidating all of the objects underneath it. This is what's called taking ownership of the TPM, or taking ownership of the hierarchy. And there are several hierarchies in the TPM that operate independently of each other for different slices of the system. You can read about these in the TCG spec. These examples will just use the owner hierarchy. So in the next block we'll generate a secret. Hopefully you'll use something a little better than that, but we'll go with it for now.
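That first block looks roughly like this with tpm2-tools; the persistent handle 0x81000001 and the file names are just illustrative choices:

```shell
# Create a primary key under the owner hierarchy (-C o). The key is
# derived deterministically from the hierarchy seed plus these
# parameters, so re-running this recreates the same key.
tpm2_createprimary -C o -g sha256 -G rsa -c primary.ctx

# Optionally persist it into NV RAM at a handle of our choosing,
# purely as a performance optimization (it can always be re-derived).
tpm2_evictcontrol -C o -c primary.ctx 0x81000001
```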
And then we'll seal it under our primary key and authorize it by the UEFI firmware binaries, so PCRs 0, 2, and 4. Sealing here just means encrypting the key, the private data, along with a hash of the PCR values, under the primary that we just made. And the result of this is a set of data blobs that we need to stash somewhere for use on the next boot. So s.pub contains the metadata about the sealed object, and s.priv is the actual encrypted key. And after boot, we can reverse this process by loading that object back into the TPM and then running the unseal function, which will check the policy, and if everything matches, the TPM will release the key back to you. Now, I'm showing you the TPM commands as an example to illustrate some of the chip's operations, but there are better ways to do this now. So Linux's key management subsystem has these trusted key types, which can do basically exactly this kind of object sealing on your behalf in the kernel. And that code will probably age better than anything you homebrew. Now, software updates are generally considered good practice, for security reasons and other reasons. And you'll find this scheme works really well until something changes. Supporting updates usually necessitates some kind of resealing operation to accommodate changes. There are various strategies to go about this. So one solution is to do simple offline backups of the raw key and then manually reseal when changes happen on the system. This works really well on personal systems where there's a user involved, like a laptop. This is also just generally a good idea, because if the software changes in a way that you can't restore back, then you really have no way of extracting the key from the TPM. So this is just generally good practice. For headless systems, though, you have to get a little crafty. And so we've experimented with a few schemes at NI.
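Spelled out with tpm2-tools, the seal-and-unseal round trip looks something like the following. File names are illustrative, and flags can differ slightly between tpm2-tools versions:

```shell
# Build a policy digest over the current values of PCRs 0, 2 and 4
tpm2_createpolicy --policy-pcr -l sha256:0,2,4 -L policy.digest

# Generate a 32-byte secret and seal it under the primary key,
# gated by the PCR policy
dd if=/dev/urandom of=secret.bin bs=32 count=1
tpm2_create -C primary.ctx -i secret.bin -L policy.digest \
            -u s.pub -r s.priv

# On the next boot: load the blobs back and unseal, satisfying
# the policy with the live PCR values
tpm2_load -C primary.ctx -u s.pub -r s.priv -c seal.ctx
tpm2_unseal -c seal.ctx -p pcr:sha256:0,2,4
```

If the measured PCR values have drifted from what the policy was built over, the unseal fails, which is exactly the enforcement action described above.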
So one is this kind of boot-time resealing, where the plain key is temporarily stashed in the TPM's NVRAM after a successful update operation. And then the init system reseals it on the next boot to the new PCR values. And this can be integrated nicely into a package manager for various kinds of kernel or firmware updates on the system. But the downside is it obviously creates a momentary lapse in the key's confidentiality across one boot. The good thing is the user can know about it and can hopefully do that safely. And it's fairly easy to implement and generically applies across many different systems. But a far better way of doing it is to pre-compute PCRs and reseal to future values on an update. So the createpolicy tool can actually accept arbitrary PCR values, so you can seal to the next state before you actually enter it. And this can be tricky for different reasons, particularly if you have many different hardware models to deal with, with model-specific firmware, or if the user can customize different software on the system. It can kind of lead to a combinatorial explosion. Not to mention that if you compute the wrong values and you have no way of restoring the old ones, that obviously leads to some problems. Now, user applications can also use the TPM to manage their keys beyond initial boot. So I'm going to explore some of that in this section. User applications are typically built on software crypto systems like OpenSSL, which usually don't know anything about TPMs, but there are some clever ways around this. So this is where PKCS#11's cryptographic token interface, or Cryptoki, if I'm pronouncing that right, can help. So this is a relatively standard interface to removable hardware security modules, like the smart cards where it originated, or more modern flavors like the USB YubiKey. So TPMs are not generally removable, but they're largely similar in terms of functions.
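As a sketch of the pre-computation approach, assuming a tpm2-tools version whose `tpm2_createpolicy` can take a file of expected PCR values instead of reading the live ones (file names here are illustrative):

```shell
# predicted.pcrs holds the PCR digests we expect *after* the update,
# computed offline from the new firmware and kernel images
tpm2_createpolicy --policy-pcr -l sha256:0,2,4 \
                  -f predicted.pcrs -L future.digest

# Reseal the key under the future policy before rebooting into it
tpm2_create -C primary.ctx -i secret.bin -L future.digest \
            -u s.pub -r s.priv
```

Get the predicted digests wrong, and the sealed key is unrecoverable on the next boot, which is why keeping an offline backup of the raw key matters so much here.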
They're also a kind of crypto coprocessor plugged into the computer. And the really nice thing about PKCS#11 is that its API is more widely adopted than the TSS, so it can bridge TPM functions into other crypto libraries, which can then in turn be bridged into applications. And so the tpm2-pkcs11 project implements this kind of Cryptoki API using the TPM as the backing token. So continuing our example from before, let's see how to set up this library for our TPM chip. So the PKCS#11 API is rooted in the smart card world, which exposes this notion of tokens and slots to the caller. So a token is a crypto device that is plugged into a slot, like a smart card plugged into a smart card reader. And so these APIs are kind of slightly fudged to accommodate TPMs. So this example is basically constructing that adaptation layer. So the first block uses the ptool to create a virtual slot. This is simply a mapping of slot ID 1 to a persistent TPM primary key, and we get that persistent handle from the previous example. The next thing we do is create a token, which in our TPM world is just a sealed random number, which will be used to authorize the keys that we're about to create. And so this token is really a kind of key ring, right? The user has to unlock the token first and then use its value to authorize the use of the actual keys. And so this sort of simulates plugging a reader into your computer and then plugging a token into the slot, if you will. And then finally, in the last block, we create a key pair on our virtual token. So this would be a TPM-resident key capable of encrypt/decrypt functions through the chip. And once we do all this, we can now treat the TPM like a smart card. And from this point on, we have many options. But for this example, let's use OpenSSL. So the OpenSC project provides the libp11 library, which is an OpenSSL engine for PKCS#11-style smart cards.
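The three blocks just described map onto the tpm2_ptool utility roughly as follows; the store path, labels and PINs are all illustrative placeholders:

```shell
# Block 1: map slot ID 1 to our persistent primary key from earlier
tpm2_ptool init --primary-handle=0x81000001 --path=/etc/tpm2_pkcs11

# Block 2: create a token in that slot; under the hood this is a
# sealed random number that acts as the key ring
tpm2_ptool addtoken --pid=1 --label=mytoken \
                    --sopin=sopin123 --userpin=userpin123 \
                    --path=/etc/tpm2_pkcs11

# Block 3: create a TPM-resident RSA key pair on the token
tpm2_ptool addkey --label=mytoken --key-label=mykey \
                  --userpin=userpin123 --algorithm=rsa2048 \
                  --path=/etc/tpm2_pkcs11
```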
And so we can do something like create an X.509 cert and load it into Apache or some other application. So we'll call openssl req to generate a self-signed X.509 certificate, and we use that engine parameter to direct it to the PKCS#11 engine. And then this will use the private key referenced by that URL to create a self-signed cert, which will prompt you for a bunch of different parameters too. And then we can configure Apache to use it. So we point the public cert at this new file, and then again reuse the same PKCS#11 key file URL to redirect Apache to the TPM as well. And for convenience, you can also add the certificate file into the token and just reference both using the same URL. Now, why do this? Aside from creating a slow web server, this places a requirement on Apache to have continuous access to the TPM in order to service new clients, in order to establish new TLS connections. And the idea is that it makes it more difficult to duplicate the identity of the server without continuous access to the one TPM holding the private key. It can also provide a single point of disabling or revocation in the event that some security policy is violated on the system. So you can shut down the TPM and then no clients can connect to the server anymore. And so depending on the value of the data going over the wire, this performance trade-off might be useful to you. Although I'll admit there are definitely some diminishing returns involved as you add more layers of security on top of things like file system encryption. But nevertheless, it can make managing network identities easier in certain situations. And the tpm2-pkcs11 documentation also has numerous READMEs on integrating this into various applications like SSH, VPNs, RADIUS, Wi-Fi authentication, and others. I'll leave those to you to explore. They all basically follow the same pattern, just with different tools. I mentioned earlier that TPMs are no magic bullet, and indeed no security technology really is.
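Put together, the certificate generation and the Apache hookup look roughly like this. The PKCS#11 URL, paths and labels are illustrative (they'd match however you set up the token), and the pkcs11 engine has to be registered in your OpenSSL configuration first:

```shell
# Generate a self-signed cert whose private key never leaves the TPM
openssl req -new -x509 -days 365 -engine pkcs11 -keyform engine \
    -key "pkcs11:token=mytoken;object=mykey" \
    -out /etc/ssl/certs/tpm-server.pem

# Then, in the Apache SSL config, point the cert at the file and the
# key at the same PKCS#11 URL, along these lines:
#   SSLCertificateFile    /etc/ssl/certs/tpm-server.pem
#   SSLCertificateKeyFile "pkcs11:token=mytoken;object=mykey"
```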
So in this section, let's explore how to break everything we just talked about, and then talk about some ways to improve the aforementioned examples, in some circumstances anyway. So the TCG specifies the TPM's security model in terms of API behaviors and some internal state. It doesn't really mandate any particular hardware security. That's largely left up to the vendors and integrators putting these chips into boards or into CPUs. So one common attack point, particularly with the discrete TPMs, is the bus. They're usually connected to relatively simple buses that can be decoded with cheap tools and therefore exploited with man-in-the-middle style attacks. And indeed, passwordless disk encryption schemes like the ones I described have been broken by bus-sniffing attacks before. For example, there are some pretty simple tools to sniff BitLocker's master key during power-up with a $50 FPGA board. And that project also includes software to decrypt the disk, plus quite a pleasant video demonstrating how it all works in about 10 minutes. And this can be easily adapted to Linux as well. The only difference really is just metadata. There are tools for the older TPM 1.2 as well that can do exactly the same thing. Now, even with this shortcoming, the TPM is not entirely useless. It makes transient attacks more difficult. So for example, somebody with momentary access can't boot malware from USB to steal confidential data or manipulate something without notice. But somebody with prolonged access to your system will not be stopped by a TPM, perhaps deterred at best. But we can improve this situation with some trade-offs. So we can mitigate offline attacks by adding another authorization requirement to releasing or using secrets within the TPM. So almost every layer of the object hierarchy can be configured with an access password, including the hierarchy itself, that seed which derives the primaries and objects underneath it.
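Setting that hierarchy password is a one-liner with tpm2-tools; the password values here are obviously placeholders:

```shell
# Set an authorization value on the owner hierarchy
tpm2_changeauth -c owner ownerpass123

# From now on, commands that use the owner hierarchy must supply it,
# e.g. re-creating our primary key:
tpm2_createprimary -C o -P ownerpass123 -g sha256 -G rsa -c primary.ctx
```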
So to use any element of the hierarchy, the password needs to be supplied with the command. So we can amend the examples from before to include an externally supplied password saved by the user. Of course, this is not very amenable to headless systems, but if there is a user or other device involved in these systems, this can mitigate many of the shortcomings we discussed earlier. So the TPM also provides several ways to authenticate, to prove you have the password. The simplest method is to transmit it in the clear. This is what's happening on the left with the TPM tools. And this is vulnerable to online man-in-the-middle attacks, so somebody monitoring the bus in real time as you enter your password. But there are more sophisticated options that can prevent this as well. So for example, the TPM will accept a nonced HMAC of the command messages, keyed to the password, instead of sending the password in the clear. And then it will also reply with an HMAC, similarly keyed to the same password that was agreed upon earlier, so you can verify the TPM's response. So basically the user and the TPM can use this scheme to mutually authenticate each other without sending the actual token across the wire. There's an even more complicated auth session approach, which can actually chain multiple factors of authentication together to authorize something. You can connect your fingerprint reader and your retinal scanner and whatever other security device you want into a complex policy to authorize some action. You can also do encryption of in-flight data with certain commands to prevent eavesdropping on the bus. And there are some limitations with this. But this is what the PKCS#11 library on the right is using. And I believe the Linux keyring uses the HMAC option. Now, one challenge with multi-factor, particularly in the context of secure boot, is: where do you enter the password?
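Amending the earlier sealing example with an object password might look like this. Note that as written, this is the plaintext case: the password crosses the bus in the clear on the unseal, and the HMAC and encrypted-session variants need extra session setup that I'm omitting here:

```shell
# Seal the secret with a password set on the object itself (-p)
tpm2_create -C primary.ctx -i secret.bin -p objpass123 \
            -u s.pub -r s.priv
tpm2_load -C primary.ctx -u s.pub -r s.priv -c seal.ctx

# Plaintext auth: an on-bus observer watching in real time sees
# the password go by, which is the online MITM exposure above
tpm2_unseal -c seal.ctx -p objpass123
```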
So if you can't trust the UI until you talk to the TPM, and you can't talk to the TPM until you enter the password, where do you enter the password? So you can do this from a second trusted authentication device communicating through the system to the TPM. That would work, but it's kind of clunky, and there's a human involved at least at one point during boot. There is a software solution. So tpm2-totp is a clever little tool that can generate those six-digit one-time password codes inside the TPM, using an HMAC key authorized by a PCR policy. So you can use this, for example, to give the human operator an opportunity to visually inspect the hardware for alterations, and then use their phone to inspect the software, so to speak, before entering credentials to continue. This also enables other security mechanisms that are sort of orthogonal to the TPM. So for example, you can use the TPM to verify the firmware and the init system with a one-time password, and then use a simple password for the file system, completely orthogonal to the TPM thereafter. So this is a great approach for personal laptops, obviously a little more difficult on headless systems. It's also not without its own limitations. An attacker, for example, could precalculate OTP values into the future with one of the offline bus attacks and then present those fake values to a user as part of an online attack to get their password. It's just another layer that makes things more difficult. And this kind of brings me to my final thought. Secrets inside the TPM are just blobs of data encrypted by keys derived from some seed stored on flash, underneath some plastic and silicon. And on many TPMs, this can be exposed with a good laser or some potent acid. Some manufacturers have decap countermeasures, but these aren't mandated by the spec. And certainly no countermeasure is ever going to be perfect.
So if you break the root of trust on the TPM, you can then use that information to slowly break things above it, whether it's faking OTP values or faking PCR measurements to do something else. How hard this is to exploit largely depends on how well the other security systems are designed around the TPM. No security technology is perfect. The TPM's job is just to make the attack more costly, in a relatively standard way that's easier for us to implement than a homebrewed solution. And I hope you can see some of these benefits for your systems after today's presentation. So once again, thank you for attending. And I will yield the remainder of my time for your questions. Okay. Hello everyone. So this will be the Q&A portion of the session. So feel free to ask some questions in the Q&A window. I see that some of you already have. So I'll just go down the list kind of in order. So Thomas asked if there are any affordable TPMs available for Raspberry Pi. So Infineon actually produces a few boards, in like the $50 to $60 range, that connect to the Pi. There's also something similar from another vendor that will just plug into the IO pins on the device. But I'll caution you about those add-on TPM boards on things like the Raspberry Pi, because the firmware doesn't understand the TPM and won't do the boot measuring. So if you're hoping to use it for, like, whole-disk encryption, it's not really going to be as effective as it would be on a board with the TPM built in. Oh, and someone also asked if there will be a video recording available. My slide deck is already posted on the abstract page, and the video will be available after the show, as well as posted to YouTube, I think, in a few months. So someone asked whether, considering bus attacks, a firmware TPM implementation that relies on SoC-integrated blocks should be preferred to a discrete TPM. Yes, in certain cases. Really, I should say there are some trade-offs with it.
The problem with firmware TPMs is oftentimes they rely on memory isolation. So, like, the Intel PTT runs in a memory enclave on the chip, and so it can be vulnerable to, like, the Spectre class of attacks to extract secrets from it. On the flip side of it, there are no leads that you can probe with a hardware attack. The AMD Secure Technology, I think on some of their chips, is actually a coprocessor embedded inside the SoC. So that actually provides a little more of a security guarantee. You sort of get the benefit of a discrete TPM without some of the limitations of it running within the CPU and memory. So it sort of depends. Thomas also mentioned that the Mandos server is an example of user-free two-factor auth. So I've never heard of that, but it's interesting. Certainly a legitimate strategy for headless systems with a TPM is to have a remote system provide the authorization value to release the disk encryption key. So that's definitely one way to go about it. I think that the most common implementation is either user-interactive or just kind of one-factor, in this case, like BitLocker. Let's see. Oh, yes. Somebody just responded that there's a link to a TPM for Raspberry Pi from a company called LetsTrust. Okay. Well, if there are no more questions, you can feel free to email me after the show. I will also be on the Linux track Slack, so we'll be in chat after this. So thank you all again for attending the session.