So, hello, I'm Nick, and I'm going to talk about what's going on in the Trusted Execution Environment working group at the RISC-V Foundation. First of all, what is the working group? The Foundation has several committees. There is the technical committee, where all the working groups, or task groups, do the technical work, and there are other committees like the marketing committee and the outreach committee. There is also the security standing committee, which deals with security from a broader perspective. Security is not only a technical thing; they also try to create policies, for example, or guidelines. On the technical side we have the Trusted Execution Environment group, and there is also the Cryptographic Extensions group, which works on extensions for accelerating or implementing cryptographic and hashing algorithms, you know, the well-known primitives. There are these two for now; we might have more in the future. These two groups, which are part of the technical committee, interact with the security standing committee, and, for example, we usually have some tutorials going on. It's a process for creating not only technical solutions but also policy solutions.
So, what's the task group? Each task group has a workspace, and this is one of the largest task groups in the Foundation right now. Initially the Cryptographic Extensions and the Trusted Execution Environment groups were one group, and it was the largest group, so they split it. Security seems to be a big thing; a lot of people participate in these working groups. We have 112 registered members, which is a lot of people. We usually work through conference calls, once or twice a month or maybe more often, and we also have a mailing list.
So, the mission of this working group is to define an architecture specification for trusted execution environments on RISC-V processors. Let's start with that. You know, ARM has TrustZone, Intel has SGX. We want something similar for RISC-V. And we want not only to provide a mechanism for a trusted execution environment to be initialized, but also to discuss other aspects, like how to protect control flow, the execution flow. And because we're talking about a standard, we're talking about creating APIs or specs. Our goal is not to provide one solution, one implementation for everybody, but to give a set of guidelines and specifications on how to implement things, how things are done in a secure way. We also want to provide some reference implementations, like an SDK or something, so that people can use that as a base to create their own stuff. Something like OpenSBI; we want something similar for a secure monitor.
So, what are we working on right now? We have some proposals on the hardware front, some things we want to change there that will help in our mission. First of all, we have some modifications to the physical memory protection mechanism. I'm going to explain what this is. The physical memory protection mechanism helps isolate processes, execution environments, let's say contexts, from each other. But it's not virtual memory, where you have an address space identifier and when you do a context switch you change the address space. Here we are talking about physical memory protection, so it happens independently of the memory translation. So it also works for machine mode. Machine mode doesn't have virtual address translation; it uses physical addresses. So we also want this mechanism to be able to protect regions when you are running in machine mode. This mechanism is pretty strong, okay? Because it happens on physical memory.
It's like, you probably know what ARM does for TrustZone. It has a secure area of your memory that the supervisor cannot touch. Similarly here, you can have a physical region that applications cannot touch unless there is a specific register setting that allows that. So basically, what you can do is protect the physical memory region when the application you want to protect is not running, so that anyone who tries to touch this physical memory region, either through memory translation or directly, will get a fault. And when you want to run your application, you have a secure monitor that goes and removes this protection. So your application runs now, it can see its own memory, and then you revert back. That's one of the uses of the PMP.
So, okay, we have a mechanism that protects applications or execution environments from each other, but we also want a mechanism for isolating devices from each other. Now, this is not part of the core. Our job here is to talk about the ISA, and this is not something we would expect to see in the ISA, because it doesn't have to do with one hart or one RISC-V core. But we want to have a proposal for vendors that implement RISC-V systems-on-chip on isolating devices from each other. So, the same way you isolate processes and execution environments, we want to protect the memory from other devices, because there are attack vectors where you can use another device to access the memory. Let's say your CPU has PMP there and cannot touch the memory, but your graphics card can, or another DMA engine can do that for you. You can use a third-party device to bypass this mechanism. So we also have to provide isolation between devices.
The last thing: I told you about protecting the flow of an application. Now, let's say there is a bug in your code and someone can do a buffer overflow and overwrite your return address.
This is a common scenario in exploit development, and it is what messes with your control flow. So what we want is to discuss a control flow integrity extension, so that if someone tries to overwrite your return address, he or she won't be able to make it. ARM has something similar, and I think Intel has implemented something similar. I'll explain a bit about it later. This is on the hardware side; I'm going to talk about these things a bit more later on.
On the software side, again, this is not part of the ISA. The SBI is also not part of the ISA, but we want to provide an architecture for how a secure monitor will be implemented. The same way you have calls from the supervisor to the firmware through the SBI, for example to trigger a timer or to do a remote fence or an IPI, we would want to have an API for calls that come from the supervisor or the hypervisor or from an application to the firmware, to change the physical memory protection registers, to set up isolation and modify isolation settings, or to do something in firmware that you don't want to do in a, let's say, less trustworthy space. Think of TrustZone on ARM: you have some services running in firmware that, for example, can access a crypto accelerator or access private keys, because they can read eFuses, because they are at this high privilege level, the equivalent of our machine mode. So, for example, let's say someone wants to ask the firmware to encrypt something using these secret keys that no one else can see, because firmware runs in machine mode. We want to have an API for that, or an API for writing programs like that. And, of course, together with the API goes the whole design. There are lots of things there; I'm going to give you an idea later on.
So let's talk a bit about physical memory protection, what we have right now. It's part of the machine ISA, so your core needs to implement the privileged spec for this to be there. The privileged spec says that you have up to 16 regions to protect, but vendors are free to implement more or fewer. For example, the SiFive board has eight, which is enough. Other vendors might implement more regions there. The idea, the mechanism, is the same. And also, by the way, the privileged spec allows vendors to implement another mechanism for physical memory protection; this is the standard one.
So think of it as iptables, but instead of IP addresses you have memory addresses. When someone goes to write or read or execute something in memory, it passes through this firewall. And this firewall works on 34-bit physical addresses on 32-bit RISC-V and on 56-bit physical addresses on RV64. You don't get the full 64 bits: even on a 64-bit CPU you get 56 bits of physical address space.
There are four address-matching modes. Here are some examples. It's how you describe a range, because you want to describe a range of addresses, not one specific address. You can describe it in one register by using a naturally aligned power-of-two region, for example. Or you might want to give a start and an end, so you need two registers to describe the address range. And these are the bits you have in each of the configuration registers: what permissions you can assign when an address matches. You can assign read, write and execute permission. And there are also these other bits here; I'll talk about the lock bit later on. But you get the idea.
This is a logical diagram of how PMP works. It seems a bit complicated. I have it here for your reference, because I understand that this is not readable.
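To make the address-matching modes concrete, here is a minimal Python sketch of how a range ends up in the PMP registers. The mode names (OFF, TOR, NA4, NAPOT) and the bit positions follow the RISC-V privileged spec; the helper names such as `napot_encode` and `matches` are made up for illustration and are not part of any API. It models only the matching step, not the permission check.

```python
# Illustrative model of PMP address matching; helper names are hypothetical.
OFF, TOR, NA4, NAPOT = 0, 1, 2, 3            # values of the A field in pmpcfg

PMP_R, PMP_W, PMP_X, PMP_L = 1 << 0, 1 << 1, 1 << 2, 1 << 7


def napot_encode(base, size):
    """pmpaddr value for a naturally aligned power-of-two region (>= 8 bytes)."""
    if size < 8 or size & (size - 1):
        raise ValueError("NAPOT size must be a power of two of at least 8 bytes")
    if base % size:
        raise ValueError("base must be aligned to size")
    # pmpaddr holds address bits [..:2]; the low bits encode the size as 0111...1
    return (base >> 2) | ((size >> 3) - 1)


def napot_decode(pmpaddr):
    """Return (base, size) in bytes described by a NAPOT pmpaddr value."""
    trailing_ones = 0
    while (pmpaddr >> trailing_ones) & 1:
        trailing_ones += 1
    size = 1 << (trailing_ones + 3)
    base = (pmpaddr & ~((1 << (trailing_ones + 1)) - 1)) << 2
    return base, size


def matches(mode, pmpaddr, prev_pmpaddr, addr):
    """True if byte address `addr` falls inside the region of one PMP entry."""
    if mode == OFF:
        return False
    if mode == TOR:                            # top of range: two registers
        return prev_pmpaddr <= (addr >> 2) < pmpaddr
    if mode == NA4:                            # one aligned 4-byte word
        return (addr >> 2) == pmpaddr
    base, size = napot_decode(pmpaddr)         # NAPOT: one register
    return base <= addr < base + size


def cfg_byte(mode, perms, locked=False):
    """Pack permissions, matching mode and the lock bit into a pmpcfg byte."""
    return perms | (mode << 3) | (PMP_L if locked else 0)
```

For example, a 4 KiB region at 0x80000000 fits in a single NAPOT register, while an arbitrary start and end need two registers in TOR mode.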
But what I want you to get from this is that the PMP behaves differently when you are in machine mode than when you are in supervisor or user mode. I'm describing the current spec, because I want to tell you why we want to modify it. In the current spec, when you get a request for an address, it goes through this firewall, and then the PMP mechanism checks whether you are in machine mode or not. If you are in machine mode and there is no match, the access is allowed. So let's say I want to access an address range for which there is no rule. When you're in machine mode, this will succeed. When you're not in machine mode, this will fail by default. Which makes sense. But the other thing is that if there is a rule for that range and you are in machine mode, you will still succeed. So basically, machine mode always succeeds. The only way right now to have a rule that prevents machine mode from doing something is for the rule to be locked. So right now we don't have the option of having temporary rules. A locked rule is a rule with the L bit here set, and it means that this rule stays there permanently. You cannot touch the registers that describe this rule; you need a hardware reset of the hardware thread. PMP, by the way, is per hart, so you have a different set of registers for every hart you have, hardware thread or core if you like. So right now, when you're in machine mode, which is these blocks here, unless a rule is locked permanently, the access will always succeed. You will bypass the rule. So we want to be able to have rules in machine mode that are enforced and that can be removed, that can be temporary. That's one thing. For the other thing we want for machine mode, I'll tell you about virtual memory protection first, to give you an idea of the mechanism we want to mimic.
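The decision flow just described for the current spec can be sketched in a few lines of Python. This is a simplified model, not the spec's wording: rules are given as plain start/end ranges instead of encoded registers, and the `Rule` class and `pmp_allows` function are names invented for this illustration.

```python
from dataclasses import dataclass

M, S, U = "M", "S", "U"          # privilege modes


@dataclass
class Rule:
    start: int                   # first byte covered
    end: int                     # one past the last byte covered
    perms: str                   # subset of "rwx"
    locked: bool = False         # the L bit


def pmp_allows(rules, mode, addr, access):
    """Model of the current PMP check for one access ("r", "w" or "x")."""
    for rule in rules:           # the lowest-numbered matching rule decides
        if rule.start <= addr < rule.end:
            if mode == M and not rule.locked:
                return True      # machine mode bypasses unlocked rules
            return access in rule.perms
    # No rule matched: machine mode succeeds, the other modes fail by default.
    return mode == M


rules = [
    Rule(0x1000, 0x2000, "r"),               # ordinary, removable rule
    Rule(0x3000, 0x4000, "", locked=True),   # locked rule: binds machine mode too
]
```

With these two rules, supervisor mode can read but not write 0x1000, machine mode can do both, and nobody, machine mode included, can touch 0x3000 until the hart is reset.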
Virtual memory protection happens in supervisor mode. It's an extension, part of the supervisor ISA. With PMP we talk about physical memory, after the address has been translated, when we're trying to access physical memory. Here the supervisor or user mode uses virtual memory addresses. So again we have 32-bit virtual addresses, and 39- or 48-bit virtual addresses on RV64, and when RV128 comes we'll have larger than that. And we have the usual page table. We have 4-kilobyte pages by default, and we also have huge pages, from 4 megabytes up to 512 gigabytes. Each page table entry again has read, write and execute permissions as usual. And here we also have the U bit, the U permission, on the page table entry. The U permission means that the memory region of this page table entry is allowed to be accessed by user mode. If this is not set, then a user mode application cannot touch it. So, for example, kernel memory will have this U bit at zero by default, so user space will not be able to access the kernel's memory. That's okay. But what happens if the kernel tries to access the user's memory? Is that okay? Lots of exploits use a technique like this: if you find a bug in the kernel and you can put your code somewhere, you can make the kernel execute your code with kernel privileges. That means that in supervisor mode, in our case, an attacker will try to execute code that lies in a memory range that belongs to a user application. Now, on RISC-V this is denied by default. The supervisor cannot execute user memory, which is really good. But we don't only want to protect user memory from execution by the kernel; we might also want to protect this memory from reads and writes. We don't want the kernel to read and write user memory. So this is where the SUM bit comes in.
We have a bit in the status register that says whether the supervisor is allowed to read and write user space memory. If this bit is not set, then the supervisor not only cannot execute memory that belongs to a user space application, it also cannot read or write it. And there is also another bit there. I mention it here for completeness, for the people who will read the slides in their free time. The MXR bit says that if a page is marked as executable, then with this bit set it's also treated as readable. Now, this is only for virtual addresses, okay? It doesn't deal with PMP settings. At least from the way it's written in the standard, if this bit is set it doesn't mean that you can read regions that are protected by PMP, because PMP belongs to machine mode; it's one layer above. So if a memory region is protected by PMP, setting this bit in supervisor mode will not override it, okay? If you protect the memory from reading with PMP, then even when this bit is set you won't be able to read it, because the PMP, which is one layer above, will stop you.
So, what do we want to change in the PMP? First, I talked about this thing that we cannot have temporary rules for machine mode. The only rules that are now enforced on machine mode are the rules that are locked. So we want to be able to have temporary rules that are enforced on machine mode and that we can switch on and off. This would allow us to have isolation between things that are running in machine mode. In machine mode we don't run only one application; we might want to run multiple applications in machine mode.
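As a reference for the page-table side of this discussion, the interplay of the U bit, SUM and MXR can be sketched as a small Python function. It is a simplified model that looks at a single page table entry and ignores everything else (the valid, accessed and dirty bits, and the PMP check that comes afterwards); the function name and parameters are invented for this illustration.

```python
def page_access_ok(mode, access, pte_perms, pte_user, sum_bit=False, mxr=False):
    """Decide one virtual-memory access against one page table entry.

    mode      -- "S" (supervisor) or "U" (user)
    access    -- "r", "w" or "x"
    pte_perms -- subset of "rwx" set on the page table entry
    pte_user  -- True if the U bit is set on the entry
    """
    if mode == "U" and not pte_user:
        return False             # user mode never touches supervisor pages
    if mode == "S" and pte_user:
        if access == "x":
            return False         # supervisor never executes user pages
        if not sum_bit:
            return False         # reads and writes need SUM to be set
    if access == "r" and mxr and "x" in pte_perms:
        return True              # MXR: executable pages are also readable
    return access in pte_perms
```

So a kernel that normally runs with SUM cleared has to set it on purpose around the few places where it copies data to or from user space, and even then it cannot jump into user code.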
This is something to be considered trusted from the perspective of, for example, a trusted execution environment that wants higher privileges. One scenario is that you have multiple services that run in the firmware and you want to switch from one service to another, and they all want a privilege higher than the supervisor, to speak to some secure devices, for example, or to do DRM or whatever. Another way to see this is if you are an embedded developer and you only have machine mode, or machine and user mode, and you want to run something in machine mode that is not one application, something more complicated. So you want to be able to have rules that are applied to machine mode and that you are able to switch on and off. That's one thing.
The other thing: remember the SUM bit I talked about in supervisor mode, where the supervisor cannot access the user's memory. We also want something similar for machine mode. Right now someone might be able to find a vulnerability in the secure monitor or the firmware, make the firmware change a pointer or something, and make the firmware execute some user memory or some supervisor memory. We have a guarantee in virtual memory that the supervisor is not allowed to execute user memory, but in machine mode we don't have any guarantee that machine mode will not try to read, write or execute memory that belongs to the supervisor or the user. So if someone finds a bug in the firmware, he or she might be able to make the firmware execute this piece of code with the highest privilege we have. So we want to have something like the SUM bit I mentioned, one that also gets rid of the execute permission in this case, because all permissions are allowed in machine mode; it can do anything right now. So we want to have a global bit, like SUM, to prevent machine mode from accessing supervisor and user memory, with any kind of access.
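One possible reading of such a global bit, for machine-mode accesses only, is sketched below in Python. This is an assumption-laden illustration and not the proposal's actual truth table: the `isolate` flag, the `Rule` class and the choice to leave unmatched addresses open to machine mode are all hypothetical. Locked rules keep their current meaning, and unlocked rules, which describe supervisor and user memory, become off limits to machine mode while the flag is set.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    start: int                   # first byte covered
    end: int                     # one past the last byte covered
    perms: str                   # subset of "rwx"
    locked: bool = False         # the L bit


def m_mode_allows(rules, addr, access, isolate=False):
    """Machine-mode access check with a hypothetical global isolation flag."""
    for rule in rules:
        if rule.start <= addr < rule.end:
            if rule.locked:
                return access in rule.perms   # locked rules bind machine mode
            # Unlocked rule: bypassed today, denied outright when isolating.
            return not isolate
    return True                  # assumption: no match still succeeds


rules = [
    Rule(0x0000, 0x1000, "x", locked=True),   # firmware code: execute only
    Rule(0x8000, 0x9000, "rwx"),              # supervisor/user memory
]
```

With the flag clear, a corrupted firmware pointer into 0x8000 would be followed; with it set, the same access is denied, which is the accidental-access protection the talk is after.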
So that's why we will add another bit to the mstatus register, which we will call the machine mode isolation bit, MMI. This proposal is almost finalized, so we are looking at privileged spec 1.12; we hope it will get merged there. And this is the truth table. Basically, we want to be backwards compatible, obviously. So when the L bit is set, we want the rule to be enforced on machine mode as well. But when the MMI bit is set, this rule is enforced on machine mode but not on other modes. Now, why did we do that? Think of a rule that does not deny access but allows access. Maybe we want to allow something for machine mode that we don't want to be allowed for the others. For example, we want machine mode to be able to execute the firmware code but not read it or write to it. This is the scenario we are trying to cover here. And this is the scenario I mentioned before: right now, if the lock bit is zero, the rule will not be enforced on machine mode, but when the MMI bit is set the rule will be enforced, and the machine mode access will always fail. So basically, what happens with this bit is that when you do something in firmware legitimately and you exit from firmware, you set this bit to one, and now the firmware cannot accidentally mess with user or supervisor memory, because the PMP will deny access to those physical regions while you are running in this mode. This won't solve a security issue, obviously: if someone hacks the firmware, he can also remove the entries there and do bad stuff. But we are trying to protect against buggy implementations or accidental reads and writes. It's very similar to what the SUM bit does: if someone owns your kernel, they can obviously modify the page table, or create another mapping to the same region, and bypass this protection. So the idea of this bit is not to protect the
lower privilege levels from the levels above; it is to be proactive, because if you manage to take control of the higher privilege levels then you own the lower privilege levels.
About the IOPMP block: I told you this is a block we are proposing. So you have the CPU, and inside each hart you have the physical memory protection mechanism. Then you have the physical memory attribute checker, which is kind of a PMP that's permanent and is for all cores. I mentioned that PMP is per core; so there is also a mechanism that acts as a firewall for accessing memory that is for all cores and is persistent. For example, what this guy does is, let's say you are touching a ROM: it will not allow you to issue a write operation. If you are trying to reach a device that doesn't support writing, it will prevent you from writing. We can probably add more stuff here; the standard allows the PMA checker to do more, and it's not very specific about what the PMA checker is limited to. So we have the CPU protecting itself, let's say, protecting one hardware thread from another, but we also want protection for devices, and this is where the IOPMP block comes in. This block only handles protection; it's only a firewall for physical addresses. So if you have an IOMMU that presents a virtual address space to your device, this block will sit after the IOMMU, after the translation, because it works on physical addresses. So it will sit after the IOMMU, or between your device and the system bus. The idea is that it will isolate devices from each other and from the CPU. Now, there is a catch here. The IOPMP works when the device does an access, because the addresses come from the masters on the bus; it does not work when you try to reach the device. So you only have a firewall for the outgoing traffic, let's say. So if you have an
incoming access here, if one hart tries to modify a register here, then this block will not handle that, because this block only handles outgoing traffic. It's the same with the PMP, but because that is on every core it does the job. So we have to discuss how we will treat MMIO accesses, how to protect the devices not only from reaching others but from being reached. This is work in progress right now.
Another thing is control flow integrity. I told you that there is this type of exploit where someone messes with your binary in RAM and overwrites the return address, so when your code returns, instead of returning to the proper function, it jumps to some malware. So how do you protect against that? Remember that I told you that in virtual memory you have 39 bits on RV64. The entry is 64 bits, but you can only use 39 or 48 bits, so you have some bits left there, and you can do stuff with the rest of the bits of the virtual address. Let's say we are using some bits that are higher than 48. These bits are not addressable; these addresses are not valid for the core right now. So what we can do is have addresses outside of the allowed range, by using the higher-order bits: virtual addresses that only the call and return operations can write to. If you try to do a load or store on these virtual addresses, you'll get a fault. But when call and return run, the call will go to this range and store the return address, and when the return comes, it will go there and remove it. So basically you have something like a shadow stack that's there at all times, and you cannot mess with it, because only these two operations can write and read there; load and store will not work. And if someone messes with your code and overwrites the return address, then when the return happens it will go to the shadow stack and it will not find the proper return
address, so it will block you. Of course, this is on virtual addresses. When you work in machine mode, for example, where you are using physical addresses, we cannot have something like that. And it doesn't protect against all cases, but it's a decent mitigation. So again, it's still work in progress.
And now about the secure monitor architecture. Here there is a lot of work to be done. We have some implementations of secure monitors. We have one from Hex Five that's called MultiZone, which basically acts like a hypervisor. It works like this: the whole memory is protected by default, and then you tell the hypervisor to give specific permissions to specific memory regions when your applications or your environment run. Theoretically, in any of these environments you can run a full operating system, but this basically focuses on embedded devices, where you want applications to be isolated from each other. The cool thing with MultiZone is that it's very, very small. It has a very small overhead; it's written in assembly. They have open-sourced most of it. They have a library there, but it's basically the code they have in assembly anyway, and they have open-sourced some tools as well, so you can check their work out. The thing with MultiZone is that when you create the firmware you have to predefine these regions, so you cannot change them at runtime. If you want a new zone, you cannot ask the supervisor or the secure monitor to create a new environment for you; you have static environments, defined when you compile the thing.
Then there is Keystone. Keystone aims to implement something like SGX, something that's more feature-rich, more powerful. It aims for scenarios where you want dynamically created trusted execution environments. The idea is that I want to create a secure environment for running some process and then destroy it, and then ask for another one and destroy it. So the
MultiZone thing will not work there, because you have a specific set of environments. What you want here is to be able to scale: you might want to have a lot of trusted execution environments, you might want them to be resizable, you might want lots of features. It comes down to how you manage your trusted execution environments, and that's where Keystone comes into play. It's fully open source, by the way, and it's maintained at Berkeley by some people there. It's based on another work from MIT called Sanctum, which also included some hardware work on mitigating side-channel attacks. It's pretty interesting work; you can go to their site and read about it. So Keystone is more feature-rich. So we have something that looks like a hypervisor and something that looks more like a secure monitor, and both these teams are members of the working group, so we are having discussions with them to see where this goes.
What we want to do is define some common APIs that these approaches, and other approaches in the future, can use, and come up with a set of policies. As I said, we create a spec; we do not create just an implementation or an SDK. So there are lots of areas that need discussion. Not only the APIs: we will need some interaction with the SBI group, for example, because we will want to be able to send some commands to the firmware through the new interface. We want to define a memory isolation scheme using PMP, that is, how you should use PMP to do memory isolation. We have a draft on that for trusted execution environments: how to protect one execution environment from another using PMP. We want a memory isolation scheme for the IOPMP, because the secure monitor will also configure the isolation between devices, so we also have to provide a scheme for how to do this properly. And then there is more serious stuff, like how you do this properly between harts. For example, an attack scenario could be that one hart would
issue a command to reset another hart. When the other hart is reset, the whole PMP state goes away, because the registers will be zero, and you will be able to do something on the other hart without PMP, for example. There is also multithreading in general. When you are in a trusted execution environment, how do you do multithreading in there? Because the PMP mechanism is per hart, when you send some job to another hart you need to have the same PMP settings there as well, or the application won't be able to access its memory, or even worse, it will be able to access someone else's memory. That's where interrupts come in as well. We need a scheme for how to handle interrupts to the trusted execution environments, or what happens when an interrupt comes while you are doing something secure. And these environments, how are we going to express them? Are we going to have an image format? Are we going to have only the binaries, so it will just be an ELF binary? We want to discuss having a common description of a trusted execution environment. And of course we have to write the code for it. We are very lucky that we already have some implementations that we can use and be inspired by. We would also want to provide an SDK for people to use, the same way that ARM and Intel have provided their SDKs for TrustZone and SGX. Yeah, so that's all. Thank you.
So, sure. It's not part of the CPU. Well, that's an implementation matter. If you put this in the spec, people will be forced to do it that way, and vendors might want to use different ways of storing stuff. One vendor may want to have a secure ROM, another vendor might want to have eFuses, another vendor might want to have some encrypted stuff at the beginning of their storage, like a partition. We don't want to limit that. And secure storage, like encryption, is not to be handled here; it's software, it's something that can be
done from the software side. But think about it: if the firmware does it for you, then for every transaction you'll have to go through the firmware.
Yes, that's for secure boot. Secure boot is part of our to-do list, yes, sorry about that, but we are not there yet. With secure boot there's a lot of discussion about how to do it. For example, it's one approach when you're trying to secure boot on the same CPU, and there's another approach where some other unit does the verification for you and handles the booting process. So again, we want to be flexible, because RISC-V is an open ecosystem and we don't want to limit people. And we want to have a specific scope. If we try to say things about the storage or the memory, for example, you may have Rowhammer attacks, or someone might tap into your memory; protecting you from that is an implementation-specific thing. Secure boot, though, is part of what we are working on, so you may want to subscribe to the list. We just don't have a proposal yet for secure boot. We have a lot of discussion, but we don't have one approach.
I agree. First of all, to be fair here, if you read about SGX, they said that they don't protect you from side-channel attacks; they said it from the beginning. You may read the Sanctum paper; their work was on side-channel attack mitigations. And side channels, again, have to do with design. We can design the software, for example, to be secure against timing attacks, but the other types of side-channel attacks are hardware specific, which in our case is an implementation bug, let's say. So we are trying to make the spec properly secure, and to have some guidelines for the implementations.
Oh no, you get a trap. You get a trap, an illegal access denial. In this case, not really, because you have trap vectors in other modes as well. You may have traps that are handled in supervisor
mode or in U-mode, so that's not always the case. But in our case, yes, it gets trapped and it gets handled by machine mode as well. OK, so we can talk afterwards.
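Going back to the control flow integrity part of the talk, the shadow stack idea can be illustrated with a toy Python model. It is only a sketch of the concept: the `Machine` class and its methods are invented for this illustration, and a Python list stands in for the address range that, in the scheme described, only call and return would be able to reach.

```python
class ControlFlowViolation(Exception):
    """Raised when a return address does not match the shadow stack."""


class Machine:
    def __init__(self):
        self.stack = []      # ordinary stack: reachable by loads and stores
        self.shadow = []     # shadow stack: touched only by call() and ret()

    def call(self, return_addr):
        """Model of a call: the return address is saved in both places."""
        self.stack.append(return_addr)
        self.shadow.append(return_addr)

    def store(self, slot, value):
        """Model of a plain store; it can only reach the ordinary stack."""
        self.stack[slot] = value

    def ret(self):
        """Model of a return: refuse to jump if the two copies differ."""
        addr = self.stack.pop()
        expected = self.shadow.pop()
        if addr != expected:
            raise ControlFlowViolation(
                f"return to {addr:#x}, expected {expected:#x}")
        return addr
```

A buffer overflow is modelled here by `store()` overwriting a slot on the ordinary stack; the matching `ret()` then raises instead of jumping to the attacker's address.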