Hello, everyone. My name is Tushar Sugandhi. I'm a software development engineer at Microsoft. For the last few months, I have been working as a Linux kernel developer on technologies like device mapper and the Integrity Measurement Architecture, also known as IMA. I'm also working on various aspects of remote attestation in general, which we'll talk about in detail as part of this presentation.

And I'm Alasdair Kergon, working in the Linux kernel storage group at Red Hat. I've been involved with device mapper continuously from its beginnings at Sistina 20 years ago, and now I've been helping Tushar to get the project we're going to be talking about upstream.

Today we are going to talk about remote attestation in general, how various device mapper modules and extensions measure their data using IMA's runtime measurement capabilities, and how we can use that measurement data for remote attestation.

Remote attestation is all about establishing trust between two systems which are part of some network. Before one system trusts a remote peer, it must ensure that the remote peer meets certain security requirements. IMA provides the necessary framework from within the Linux kernel to measure and report the necessary configuration of that remote peer for attestation in a tamper-resistant and trustworthy way.

Let's work through an example for this. Here we have Alice, who wants to delegate some work to Bob. But first Alice wants to verify that Bob meets a certain security bar. Alice doesn't want to get into the business of verification herself. She may not have the right expertise or resources to do so. So she relies on Charlie for verification. Charlie is trusted by Alice. Now Bob sends his information to Charlie, and Charlie sends Bob's verified report to Alice. Here Alice is the relying party, Bob is the attestee, and Charlie is the trusted verifier.
When we translate this analogy to a network of systems, we have one system which is the relying party and wants a remote service or system to perform certain tasks. The relying party wants the remote system to get attested first, before entrusting it with business-critical data or tasks, and it relies on the verifier to do the attestation. Here the attestee sends its set of claims to the verifier, and the verifier verifies the claims and sends the verification report to the relying party to indicate whether the attestation has succeeded or not.

Now let's zoom into the details of the attestee. Here we have IMA, which is part of the Linux kernel. It can take measurements from various kernel components, like the SELinux policy hash present in kernel memory or the kernel boot command line parameters, put that information in the IMA log, and extend the TPM PCRs. Then the IMA log, along with the TPM PCR quotes and some other metadata, is sent to the verifier. This information is sufficient for the verifier to establish that the data indeed came from this TPM on this attestee system and that the information present in the IMA log has not been tampered with.

Now let's dive into the scenarios that can take advantage of remote attestation. We are going to talk about these three example scenarios.

In a network endpoint assessment scenario, the relying party is some network resource and the attestee is a client that needs network access. Whenever a new client wants to get access to the network resources, the relying party would want to validate its storage configuration before granting it access. Notice that here the relying party and the end user are two different entities. While the relying parties are the systems on the network talking to the new client, the end users are the actual network administrators in charge of protecting the networks.
The advantage of remote attestation in this scenario is that it prevents non-compliant clients from getting access to the network and potentially harming other systems on the network.

Confidential data retrieval is the scenario where one system holds some business-critical data, such as access tokens, digital certificates, application source code, customer data, and so on, and the attestees are the systems that want to retrieve that data or gain access to it. Here the end users would want an assurance that their confidential and sensitive data is only available to systems that meet a certain security bar. This helps protect the data from vulnerable or potentially compromised machines.

The last example scenario we have is critical infrastructure control. Here we have some sensitive infrastructure, like a mail server or a web server, as the relying party, and the attestee is a system that is used to remotely manage the critical infrastructure. Here the end users would want to ensure that if a client wants to control the critical devices in the environment, the client itself meets a certain security bar. This is to ensure that the clients that send the commands to manage the critical devices are not compromised and are trustworthy.

Device mapper is a framework available in the Linux kernel for creating new virtual block devices out of existing ones. LVM2, the user space logical volume manager, is built on it. You can configure these devices to pass their I/O through specialized modules that can do all sorts of other things. They can provide resilience by placing copies of your data on more than one disk, or they can create snapshots that track changes so you can retain records of how the data looked at earlier times. They can cache data to speed up access. They can compress data to save disk space, using a compression module called VDO which is currently being fully integrated into LVM2 and user space.
If you need to use several of these features simultaneously, then many of the modules can be stacked up to combine their effects. As the diagram shows, mounted file systems normally lie above device mapper. The exception is if you use loopback to convert a file into a block device. This talk focuses just on the device mapper layer. Next slide.

There are now over 20 device mapper modules available. For this project we worked through the list of upstream modules and divided them into three categories. Firstly, the ones that are the most important for remote attestation, because they provide features that people who want to use remote attestation are most likely to be using. Secondly, the modules that we think some people will be using but which aren't crucial. And thirdly, everything else, including niche modules and ones designed specifically for testing. The first category includes modules key to the integrity of the programs and data on the operating system. If modules are stacked together, everything in the stack has to be recorded to provide a complete record of the relevant state. Next slide.

As its name suggests, the crypt module uses encryption to protect data at rest. It hooks into the kernel's cryptographic API to encrypt data on a device block by block. The location of each block gets factored into the algorithms people normally use, so that two blocks that have the same content in different locations produce different ciphertexts. Unlike the earliest implementation, best practice nowadays means device mapper avoids handling the encryption key directly and instead accesses it by reference, so the key material can be stored securely. Next slide.

dm-integrity is designed to emulate a setup where each block of data can have an extra checksum tag attached to it so that various types of corruption can be detected. Each tag is handled atomically alongside the data it protects using a journal.
It can generate the tags itself, or they can be passed into it by other layers of the stack. When the data is read back, the tags can be validated and an error can be returned instead of the data if silent corruption is detected on the disk or in the I/O path. dm-crypt can generate tags that are cryptographic checksums designed to detect attempts to tamper with the data. The internal journal can also be encrypted to avoid exposing information about what is written. Next slide.

dm-verity provides an efficient way to verify the integrity of a read-only block device. It's used by systems that want to verify their boot. A tree of cryptographic checksums for device blocks is generated in advance and stored, all traceable back to a single secure key. When any block gets accessed for the first time, the checksum is validated back to the root of the tree. If the verification fails, an error is produced instead of supplying the data. There's also an option for some error correction. Next slide.

Attestation can only be as strong as its weakest link, and so we need to track the state of all the modules that can touch the I/O in any particular device stack, as well as tracking the composition of the stack itself. We expanded the list to include the most commonly used DM modules, including the ones on this slide and a few more. For completeness we might also support the remaining modules eventually and require support as standard for any new device mapper module.

Thanks, Alasdair, for describing the motivation for including device mapper measurements for remote attestation. As we saw in the previous slides, IMA supports measuring critical data from various kernel components at runtime. Until now device mapper was not part of the list of those kernel components, but we do need to measure various attributes from device mapper to help with the example scenarios I mentioned earlier.
In the case of network endpoint assessment, the end user or relying party would want to know the answers to questions like these from the attestee system: is the OS partition protected with dm-verity, are the network block devices encrypted with dm-crypt, and so on. For the confidential data retrieval scenario, they would want to know whether the block devices where the confidential data is stored are encrypted with dm-crypt plus dm-integrity. This is to ensure only authorized processes have access to that sensitive data. Since the data in this case may be very large, several devices might be stacked together, and other device mapper modules like dm-linear, dm-multipath, dm-raid, and dm-cache might also be used for redundancy, reliability, faster data retrieval, and so on. Similar questions can be asked for critical infrastructure control as well.

So far we have talked about the basics of remote attestation: the role IMA plays from within the Linux kernel to measure the data necessary for attestation, backed by the hardware root of trust provided by the TPM. We have also talked about the basics of device mapper and the functionality of its different modules like crypt, verity, integrity, raid, multipath, etc. And lastly we have talked about various example remote attestation scenarios and how measuring device mapper attributes can help with strengthening the remote attestation story for those scenarios.

Now let's start talking about the actual solution that we implemented and how you can build on top of it. So far we have identified the important configurations relating to the relevant device mapper modules, and we have written the code to measure them. In our implementation we ensure all the runtime changes to the configuration of the devices are captured. And of course we leverage IMA to ensure the data is tamper-resistant and verifiable by the remote peers.
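To make "tamper-resistant and verifiable" a little more concrete, here is a minimal Python sketch of the general extend-and-replay idea behind a measurement log backed by a TPM PCR. It is an illustration under stated assumptions, not the IMA or TPM implementation: it assumes a SHA-256 PCR bank that starts at all zeros, the event payloads are made up, and a real verifier would also check the signature on the TPM quote and recompute each digest according to the IMA template format. The function names are hypothetical.

```python
import hashlib

PCR_SIZE = 32  # assumption: a SHA-256 PCR bank (32-byte digests)


def extend(pcr: bytes, digest: bytes) -> bytes:
    """TPM-style extend: new PCR value = SHA-256(old PCR value || digest)."""
    return hashlib.sha256(pcr + digest).digest()


def replay(digests) -> bytes:
    """Replay every logged digest in order, starting from an all-zero PCR."""
    pcr = bytes(PCR_SIZE)
    for digest in digests:
        pcr = extend(pcr, digest)
    return pcr


def log_matches_pcr(digests, quoted_pcr: bytes) -> bool:
    """True if the log, replayed in order, reproduces the quoted PCR value."""
    return replay(digests) == quoted_pcr


# Hypothetical event payloads, hashed here only to obtain stand-in digests.
events = [b"boot_aggregate", b"dm_table_load example", b"dm_device_resume example"]
digests = [hashlib.sha256(e).digest() for e in events]
quoted_pcr = replay(digests)  # stands in for the PCR value reported in a TPM quote

# Changing any single log entry changes the replayed value.
tampered = list(digests)
tampered[1] = hashlib.sha256(b"dm_table_load altered").digest()

print(log_matches_pcr(digests, quoted_pcr))
print(log_matches_pcr(tampered, quoted_pcr))
```

Because each PCR value depends on every digest extended before it, a verifier that trusts the quoted PCR can detect an edited, dropped, or reordered log entry simply by replaying the log.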
For IMA to measure the device mapper specific critical data on a given system, the IMA policy on the system needs to be updated to have the following line, and the system needs to be restarted for the measurements to take effect. Please note that the label and the template here are optional. If you just say measure func=CRITICAL_DATA, it will measure the data coming from all the supported components. And if you don't specify the template, IMA will just pick up the default template. The measurements will be reflected in the ascii_runtime_measurements and binary_runtime_measurements logs.

Before going through the different device mapper states, I first want to clarify the terminology: what "device" means. When I say a device, I don't mean a laptop or a desktop or a server, etc. For that I use the word "system". Here "device" means a block device. It can be a hard drive, an SSD, a USB stick, a CD, or a network drive, etc. When you insert a USB drive in your laptop, device mapper is essentially creating a device, and there is a table of configurations associated with that device. You can load that table or clear the table.
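The policy line itself was shown on a slide and is not captured in this transcript. As a rough sketch, and assuming the rule takes the form given in the kernel's device mapper IMA documentation (`measure func=CRITICAL_DATA label=device-mapper template=ima-buf`), the following Python checks whether a policy text would cause device mapper data to be measured. The helper names are hypothetical, and treating `label` as a `|`-separated list is an assumption about the policy syntax.

```python
def parse_rule(line: str):
    """Split one policy line into (action, {key: value}); None for blanks/comments."""
    tokens = line.split()
    if not tokens or tokens[0].startswith("#"):
        return None
    fields = {}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return tokens[0], fields


def measures_device_mapper(policy_text: str) -> bool:
    """True if some rule measures critical data for the device-mapper label."""
    for line in policy_text.splitlines():
        rule = parse_rule(line)
        if rule is None:
            continue
        action, fields = rule
        if action != "measure" or fields.get("func") != "CRITICAL_DATA":
            continue
        label = fields.get("label")
        # No label means data from all supported components is measured
        # (as stated in the talk). Assumption: labels may be '|'-separated.
        if label is None or "device-mapper" in label.split("|"):
            return True
    return False


# Assumed form of the rule, taken from the kernel documentation, not the slide.
policy = "measure func=CRITICAL_DATA label=device-mapper template=ima-buf\n"
print(measures_device_mapper(policy))
```

On a running system the loaded policy would typically be read from the IMA policy file under securityfs, if the kernel is configured to allow reading it; the sketch only works on text passed in.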
Loading a new table prepares the device for a change, with all the detailed configuration info. Then you can resume the device, which is essentially applying the changes prepared by the load table command. You can remove the device, just like you remove the USB from the system. You can also suspend the device without actually removing it from the system. And of course you can clear the table, which would abort the changes waiting to be applied. Lastly, you can rename the device.

We measure the table load, device resume, device remove, table clear, and device rename, because these state transitions meaningfully change the system state in the context of device mapper and remote attestation. The other states, like device create or device suspend, don't really impact the device configuration being measured through IMA, so we don't measure them.

Now let's go through an example of each state and what the measurements look like in the IMA log. For these state measurement examples we are using device mapper's linear module, which is very simple to use. When you create a linear device using dmsetup, it involves creating a device, loading a table, and resuming the device. In this case the table being loaded has just one row, and it maps to just a single physical device, /dev/sd1, and the IMA log gets the following entry for the table load. Here you can see the PCR which is being extended, the template hash, the event hash, and the event data. In the device mapper world a module is also called a target, so you can see all the detailed attributes for that target. If there are multiple rows in the table, you will see multiple rows here in the event as well, with the target index going from 0 to n minus 1, where n is the number of rows in the table.

Once the table is loaded, device mapper resumes the device, or you can explicitly resume the device if it is suspended. I want to talk a little bit more about this active table hash. In this simple case there is only a
single row in the table, but in more complex scenarios there could be thousands of rows in the table. All this information does not change once the table is loaded and the device is resumed, so if we keep logging the same information during the other device mapper stages like resume, suspend, remove, and so on, we will be unnecessarily bloating the IMA log. So we only log the full information of the table during the table load, and log the hash of the load event during subsequent events like resume and rename. And this is what the active table hash is: the hash of the load event that we saw on the previous slide. Since all this information is present in the same IMA log instance, it is ordered chronologically, and the IMA log contains both the clear text from the load event and the event digest from the resume, it is easy to map the load event to its corresponding resume event on the attestation server side.

And this is what gets logged when a device is removed from the system. Notice that there are two sets of data here, active and inactive. A given device can have one table that is actively loaded, whose configuration is active and applied, and it can also have an inactive table waiting to be loaded when the device goes to suspend and resumes again in the future. If the device has such an inactive table, we log its metadata as well as the inactive table hash when the device mapper state change is logged.

And this is the example of table clear. For the inactive table that we discussed on the previous slide, device mapper provides the functionality to clear that inactive table before it is loaded, and we capture that event in the IMA log as well.

A device mapper device's name can be changed. In order to keep track of which device we are measuring and tracking the data for, we need to log this device rename event in the IMA log, with the mapping of the current device name and UUID to the new name and new UUID in a single IMA event.

OK, so far we have
talked about measuring device mapper states. We only used the linear module for this illustration, but there are about 20 different modules present in device mapper which provide various functionalities. Out of those 20-odd modules we identified 10 which are relevant in the context of remote attestation and security, and hence as of today we measure the data coming from these 10 modules. The feature that we checked into Linux kernel 5.15 a few weeks back has support for these 10 modules. In this presentation we can't go through the details of all the modules, but let's go through some important ones which are very relevant from the security point of view.

For instance, when you encrypt a disk using dm-crypt, we measure the following attributes in the IMA log, which are important for determining how secure that device is. For example, allowing discards on encrypted devices may lead to the leak of information about the ciphertext if the discarded blocks can be located easily on the device later. Or whether the device is set up to perform encryption using the same CPU that the I/O was submitted on. And of course attributes like the encryption algorithm used to encrypt this disk, or the key strength, are also important to determine if the system meets the security bar. Only if it does will the attestation succeed.

For the integrity module, we measure attributes like the mode of writing the integrity tags: is it direct write or journaled write, is it written in bitmap mode, without any synchronization, or in recovery mode. We also log fix_hmac, which improves the security of the internal hash and the journal MAC, because in this case the section number is mixed into the MAC so that an attacker can't copy sectors from one journal section to another journal section.

The verity target, as you know, provides transparent integrity checking of block devices using a cryptographic digest provided by the kernel crypto API. This target, or module, is read-only and is mainly used for secure boot
scenarios. Here we measure what the data device is, what the hash device is, what the hashing algorithm is, etc., and of course the most important attribute, the root digest, which is the hexadecimal encoding of the cryptographic hash of the root hash block and the salt. This hash should be trusted, as there is no other authenticity beyond this point.

So here is the demo of device mapper module attribute measurements using IMA. For this we are going to use the crypt module. I have a USB disk which is encrypted using cryptsetup. I am going to insert this USB disk into the laptop and demonstrate how it triggers the measurement of dm-crypt attributes using IMA.

But first let's see the current state of the system. As you can see, there is nothing reported by dmsetup, no devices. The IMA measurement log contains the boot aggregate, but it does not contain any other information. There is no USB disk per se, and as you can see there is no USB disk here as well.

So now I am going to insert the USB disk. The USB disk is detected, and since it was encrypted using cryptsetup, it is asking me for a passphrase. So I enter the passphrase, the disk is unlocked, and I can access the content. What is happening behind the scenes is that cryptsetup calls device mapper, and device mapper creates the crypt device and loads it. So if you look at dmsetup table now, the device is there, and if you look at the ascii_runtime_measurements log, you can see there is a DM table load and there is also a DM device resume. These are the events that are logged into IMA. And of course you can see this information in binary_runtime_measurements as well, which gives you the actual text, but let us stick with ascii_runtime_measurements for the time being.

If you convert this information from hex to actual string values, you will see information like: this is the device UUID and name, this is the crypt target, the target version is this, and there are a bunch of attributes which are
important for determining if this device meets the security bar. The attestation service can make that decision: if the device meets the security bar, then the attestation service can say that, hey, the attestation succeeded, and my system can go on with its work. It also contains some important information like the algorithm used for encryption, the key strength, the key parts, and so on and so forth, and that information is used by the attestation service, as I said earlier, to determine if my system meets the security bar.

So this is during the table load. Of course we also have the device resume operation, and that information is also reported in the IMA log. As you can see, the IMA log says that, hey, this device has been resumed and now it is functioning properly.

One more thing that I want to call out here is that this is the simple case. In the simple case the device mapper table has a single row, but in more complicated scenarios there could be thousands of rows in a given table. All this information does not change once the table is loaded. If we keep logging the same information during other device mapper stages like device resume, suspend, or remove, we will be unnecessarily bloating the IMA log. So we only log the full information during table load, and log the SHA-256 hash of the load event during the subsequent events. For example, this is the load event, and if I want to find out its SHA-256 hash, this is the SHA-256 hash of the load event. And as you can see, the active table hash reported in the device resume event is exactly the same as the hash of the load event.

And lastly, when I actually remove the device, that event is also recorded in the IMA log. As you can see, dmsetup table says that no devices are found, and when you look at ascii_runtime_measurements you will see that there is a device remove event that was logged into the IMA log, and you can see that information in the IMA log as well: the
same hash of the table, and this says that this device was removed from the system. And this concludes the demo of device mapper module attribute measurement using IMA.

Now to talk about some future ideas. There are many more related things that we could do here. Firstly, though, I should say something we should have mentioned more clearly earlier, which is that we added a new command to dmsetup upstream last week called measure, which is mostly there to help with debugging. Internally it works in a similar way to table or status, if you know those commands, except it shows you the content that device mapper would send across to the IMA subsystem if a measurement got performed at the time it runs. But it doesn't actually cause a measurement to happen itself.

Next, the actual IMA data visible in sysfs, which we've quoted from earlier, isn't very user friendly, in the way it mixes up ASCII text and binary for example. So we'd like better tools and libraries for displaying, manipulating, and extracting that content. And then we need a good way to define policies that reference it and be able to take actions based on what is reported, such as notifying people, or stopping or disconnecting services. A likely way to achieve some of these things might be to plug into the existing Keylime project, so one of our next steps is to investigate what our options there might be. Next slide, please.

So far we've talked about remote attestation, which detects what happened after the fact. But we could also make attempts at enforcement, for example by allowing the measurement hooks to determine whether a state is acceptable or not according to policy, and if not, to actually block the operation from happening. Or we could link device mapper directly into SELinux, defining SELinux types according to device mapper objects and allowing policies to determine permitted configurations. Next slide, please.

We also need to repeat the exercise we've gone through for device mapper with some other components of the system, for example to record
the details when we're using other types of block devices like MD, or if we're mounting a file system. But with containers and namespaces, where mounts depend on which process you look at, this work soon starts to drag in additional complicated context if it's going to be comprehensive, and we probably need to define some clear boundaries to keep it usable and maintainable.

Then there's fs-verity to consider, if people are using that. It's like dm-verity but for individual files, where you don't need to verify the entire device but just some specified read-only files on it, and you still want to do this efficiently, block by block, and not have to wait to validate a complete file before you can start accessing any of it.

So all in all we're really only just beginning this work, and if you're interested in helping with any of it in any way, please do get in touch with us.

Thanks, Alasdair, for explaining the future work in this area. There are many people who helped me in implementing this Linux kernel feature. First I would like to call out Mike, who is also a device mapper maintainer along with Alasdair. He helped me understand the details of device mapper. Then Mimi, who is the maintainer of IMA, for her constant support and guidance around IMA. My colleague Lakshmi, for code reviewing my patches and helping with the upstreaming process. And lastly Thar Samar, for testing the patches and providing feedback.

Here are a few references if you want to dig deeper in this area. They also contain the link to my patch series, which is now integrated into Linux kernel 5.15. We also have very detailed documentation, which is also checked into the Linux kernel documentation. The documentation talks in detail about the IMA log format, the expected data format of the various device mapper events, the supported targets, and the measurement data specific to those targets. I believe this documentation is sufficient if you want to implement the attestation client and service pieces to support device mapper
IMA measurements.

Thank you, everyone, for attending this talk. We will use the remaining time to answer any more questions on this topic. You can reach out to Alasdair or me after this talk if you have any follow-up questions. Thank you.
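As a footnote to the demo, the check done by hand there (hash the load event, then compare it with the active table hash in the resume event) can be sketched in Python. This is a minimal sketch under assumptions, not the attestation service's implementation: it assumes the event data from the ASCII log is hex-encoded, that the resume event carries a field of the form `active_table_hash=sha256:<hex>`, and that the hash covers the whole load event data. The sample event strings are made up, and the kernel documentation remains the authority on the exact format.

```python
import hashlib
import re


def decode_hex_field(hex_text: str) -> bytes:
    """Turn the hex-encoded event data from the ASCII log back into raw bytes."""
    return bytes.fromhex(hex_text)


def table_hash(load_event_data: bytes) -> str:
    """SHA-256 of the load event data, in the assumed 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(load_event_data).hexdigest()


def active_table_hash(event_data: str):
    """Extract the active_table_hash field from a later event, if present."""
    match = re.search(r"active_table_hash=(sha256:[0-9a-f]{64})", event_data)
    return match.group(1) if match else None


def resume_matches_load(load_event_data: bytes, resume_event_data: str) -> bool:
    """True if the resume event refers to exactly this load event."""
    return active_table_hash(resume_event_data) == table_hash(load_event_data)


# Hypothetical, heavily shortened event data; real events carry more fields.
load_data = b"name=example,uuid=;target_index=0,target_name=linear;"
resume_data = "name=example,uuid=;active_table_hash=" + table_hash(load_data) + ";"

print(decode_hex_field(load_data.hex()) == load_data)
print(resume_matches_load(load_data, resume_data))
print(resume_matches_load(b"some other table", resume_data))
```

An attestation service could run this kind of comparison for every resume, rename, or remove event to tie it back to the full table that was logged in clear text at load time.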