Welcome to my talk; it will be about confidential computing and SUSE's role in this area. I hope that over the course of the talk I can make you excited about this topic and actually make you want it for your future deployments. Just briefly about myself: my name is Vojtěch Pavlík, and I am the general manager of SUSE's business-critical Linux group, heading much of SUSE's Linux engineering and business. Being a general manager means that I will mostly be presenting the work of others, but I am glad that I can, because it is pretty awesome. To introduce you to confidential computing: we did a little research, asking people whether they would be worried if a cloud provider had access to the data in the instances they run in the cloud. The overwhelming answer, 82%, was yes, we would be worried. And this is hardly surprising, because the cloud provider really does have that access, so people have a reason to be worried. The cloud provider obviously has no reason to access the data, and most of them build significant internal infrastructure and regulations to prevent unauthorized access to customer data, but technically the possibility is still there. This is one of the things that confidential computing solves. Also, 94% of customers are willing to invest in the security of their data, in the cloud and elsewhere, beyond what regulations require, which is great, because it clearly indicates that our customers are willing to look into new security technologies, and confidential computing is one of those. The willingness to invest is there, but so is the expectation that this will be easy, and that is where we are going with confidential computing: we are trying to make it really easy to consume.
So what is it about? It is all about data in use. Most people today would not give a second thought to encrypting data on the network. In the early days of the internet, everything went over the wire in plain text, almost human-readable; even major protocols like SMTP or HTTP were designed to be human-readable, so people would just send packets openly over the network. Nobody does that today; even suggesting that somebody use HTTP instead of HTTPS is considered almost insane, so everybody encrypts data on the network. The same thing is happening with data in storage: if you look at your own phone, you probably know that some years ago the internal flash of phones was unencrypted, but today every single phone manufacturer encrypts the data on the phone by default, so that when it is lost, the data is not easily retrievable by whoever finds it. The remaining state of data that we have not looked at is data in use: data residing in the operating memory of computers or phones, the RAM, the DIMMs themselves. That is still a potential attack vector, where the contents of the RAM are accessible in clear text, either by snooping the memory bus or, a somewhat more advanced technique, by dunking the computer in liquid nitrogen and then pulling the memory out, because when dynamic RAM is that cold it does not lose its contents and you can read them off afterwards. So that is what we will be talking about: figuring out how to encrypt the data in memory. That is not an easy task, because obviously you cannot operate on data while it stays encrypted; there are some theoretical studies on how to do that, but they are more on the crazy fringes of mathematics. There are two implementations that are interesting for us in this respect, and those are AMD SEV and Intel TDX. How does it work? It is actually pretty simple: there is an encryption
engine that sits between the memory controller on the CPU and the actual physical memory, using AES, the industry-standard block cipher, to encrypt the data as it moves between the memory and the CPU. The CPU caches are not encrypted, but the memory is, and you can already see what this means: inside the CPU the data is unencrypted, but the moment it tries to leave the CPU, it gets encrypted in memory. That certainly prevents any attack that relies on physically monitoring the bus or physically getting at the memory contents. There is also, quite obviously, a performance hit, because every time data goes from or to the main system memory, there is an additional latency of a couple of nanoseconds for the encryption and decryption; fortunately, the system-level impact is not huge, because compared to the CPU the memory is already very slow, and most of the immediate data interactions happen in the caches. Chips are available today from Intel, ARM, AMD, and even IBM, but not all of them are in the mass market at the moment; AMD has been there for quite a while, and Intel is coming to the mass market this fall with the Emerald Rapids chips. Now, people have been talking about confidential computing for ages, and that is also why a lot of companies are tired of it and do not pay much attention. The first iteration was actually in 2008, when Intel introduced loadable applets in the CSME, the management engine of the CPU. It is a small processor embedded in the CPU that handles management, and it has been at the center of a couple of controversies, as you may have seen in the news. Intel implemented a technology where you could run your security-critical code on that management CPU after it was verified by a proprietary algorithm, which was pretty secure but also not very useful, because only very small snippets of code could run, maybe something for a banking application. There was
a complementary technology called Intel Secure Display that allowed a secure way to put content on the screen, for example for PIN entry or similar, but given that the use cases were so limited, this never really caught on. Later we see Intel trying again with SGX in 2015, and that is a lot cooler: it actually does encrypt memory, and it does allow a large application, possibly even a complete container, to run encrypted, but it has significant limitations. Again, it is useful for very critical data processing that must not be seen by other processes, applications, or containers, but the main limitation is this: although two applications cannot see each other's memory, even if they bypass some of the restrictions the CPU puts on that access, they still share a common kernel, and that kernel is what tries to enforce the separation. So if one application somehow manages to get control of the kernel, because it is actually a hacker running an exploit, then the encryption offers no protection to the other application, because once you own the kernel, you get everything. That is where the latest generation of technologies, AMD SEV-SNP, Intel TDX, or Arm CCA, actually differs: they implement the separation at the virtual machine level, and the main difference is that this separation is enforced fully by the CPU; it does not require the hypervisor to do it. So the hypervisor itself, and that is the beauty of it, is outside the trust zone: the hypervisor, even when compromised, is not able to access the contents of the VMs. That gets us to the analysis of the possible attacks on a VM that this is designed to protect against. The first one we see here is a physical attack: somebody gets their hands on a server that is running a critical workload in a virtual machine and tries to get at the secrets inside, so
they employ, say, a bus analyzer, a high-speed device, usually quite expensive, that samples what is on the bus, or they use the liquid nitrogen method that I explained earlier to get the contents of the memory. This technology certainly protects against that. One of the things that comes into play here is that we are trying to keep the plain-text data limited to just the CPU die, because the die, the silicon itself, is really hard to examine. First, on modern CPUs it is actually flipped upside down, so all the active circuitry is glued onto the substrate; really getting at it requires disassembling the whole CPU, extracting the die, and then connecting to it, which usually just damages it beyond repair. You would also need something like an atomic force microscope to observe the data that is live on the CPU, so implementing that kind of attack would require a significant research lab. So we are trying to limit the plain-text data to the die. This technology helps us encrypt the data when it goes to memory, but data also needs to get off the chip to the disk and to the network, and that encryption happens in software on the CPU itself. So through software encryption plus this hardware-assisted encryption, we can make sure that any data that exists in clear text exists only on the CPU. The typical application for this is when our machines really are in a hostile environment where something like this can be expected. There is one way to get through, however, and it has been documented in a few research papers, though it is not really all that practical, and new CPUs are fixing it; the new silicon revisions take this attack into account. It is voltage glitching. Most processors, most electronic components, do not react very happily to disturbances in their power supply: if a CPU is executing an instruction and
suddenly the voltage that is powering the CPU drops, then it may interpret a zero as a one, or vice versa, in that calculation, simply because it is comparing the voltages in its internal memory to the voltage of the power supply; if they suddenly flip the other way around, it executes instructions wrongly. This is an attack that can be applied to the internal security engine inside a modern processor, which then allows extracting and revealing the internal keys, and basically the nut is cracked: the secrets are spilled and there is nothing more to do. The fix is actually pretty easy. Most embedded CPUs have something called brownout detection: a circuit that monitors the voltage at very high speed, and as soon as it sees a glitch, the whole CPU is reset to its default state, essentially rebooted, so the attack cannot progress. A CPU requires a stable voltage, and as soon as the voltage is unstable, it really should just give up. That is one way; the other is including embedded capacitors and diodes that make sure enough energy is stored inside the CPU to ride over such glitches, so the secrets can no longer be siphoned out this way. This attack is currently possible with some existing CPUs, but mainly desktop CPUs, and it will be rectified in the future. And even then, the picture on the left shows how to do it: it requires opening up the machine and then spending several hours tuning the attack so that the voltage glitch happens at exactly the right instruction in the security processor's firmware, hitting exactly the condition that skips the security checks. So it is not very practical for, let's say, attacking something quickly or remotely; you need full physical access, and it will be rectified. But there are many more interesting attacks that do not require physical access, and the number one is
an attack by a compromised hypervisor, because a hypervisor can be compromised remotely; this is something much more likely to happen. When a hypervisor has a vulnerability where input from a VM allows taking it over, this technology will prevent the hypervisor from reaching back down into the other VMs, because the hypervisor does not have access to their memory. It could overwrite it, for sure: the hypervisor has full access to the machine, so it could just write into their memory. But because the VM expects the memory to be encrypted, a block of zeros or whatever data the hypervisor puts in would be decrypted into garbage, and the VM would just crash. So it is not a protection of the VM against crashing, but it is certainly a protection against infiltrating the VM or exfiltrating any kind of data from it. One interesting part is that the caches cannot be encrypted, because that would be too massive a slowdown; caches need to be accessed within a few CPU cycles. But each cache line is tagged with an encryption ID: we can have multiple VMs running on a system, each with its own encryption key for its memory, and the cache lines in the CPU are tagged with that ID, such that access to a cache line from a VM that did not cause that line to be filled results in a cache miss, so the VM does not see that cache line. The same goes for the hypervisor: if it tries to read something that is cached under the wrong ID, it just sees a cache miss, fetches the data from main memory, gets it encrypted, and does not see the content. This way we can ensure that the hypervisor cannot get into our VM; we may still lose the VM's contents, but we know that the contents will not be tampered with and that the attacker will not get what is inside.
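The cache-line tagging just described can be sketched as a toy model. The class and names below are purely illustrative, not any real hardware interface, and a Python dictionary stands in for the physical cache:

```python
class TaggedCache:
    """Toy model of CPU cache lines tagged with an encryption ID (ASID).

    A lookup with the wrong ASID behaves exactly like a cache miss, so a
    hypervisor or sibling VM never sees another guest's cached plaintext
    and falls through to main memory, where the data is encrypted.
    """

    def __init__(self):
        self.lines = {}  # address -> (owner ASID, plaintext data)

    def fill(self, asid, addr, data):
        # The VM whose access caused the fill becomes the owner of the line.
        self.lines[addr] = (asid, data)

    def lookup(self, asid, addr):
        entry = self.lines.get(addr)
        if entry is None or entry[0] != asid:
            return None  # miss: caller must fetch the (encrypted) DRAM contents
        return entry[1]


cache = TaggedCache()
cache.fill(asid=7, addr=0x2000, data=b"guest secret")
print(cache.lookup(7, 0x2000))  # owning VM hits its own line -> b'guest secret'
print(cache.lookup(1, 0x2000))  # any other ASID (e.g. the hypervisor) -> None
```

The point of the model is only the last two lines: the data is physically present in the cache, but the tag check makes it invisible to anyone but the owning VM.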
Now, the even more interesting use case is protection against a rogue hypervisor administrator. You may be running your workloads in the cloud, or at an MSP, or at a data center where a 24x7 service is monitoring your machines, and an administrator may, say, be bribed into accessing your data. This is the same case as the last one; there is really not much difference, except that this is always an option: somebody who has rights to the hypervisor today has all rights to all the VMs running under that hypervisor. So the technology that prevents a compromised hypervisor from accessing the data in the VM protects just as well against the administrator of that hypervisor, and this is actually quite beautiful, because it means you can put your data into the cloud and trust that nobody can see it. Now, just encrypting the data is not enough. You could upload your VM image into the cloud and ask the cloud to run it, but the cloud could either modify your image before putting it into this secured, confidential, encrypted state, or just emulate the CPU; if you have an emulated, fake CPU that does not really implement the encryption but just tells you it is encrypting, then all your protections would again be in vain. So we need a proof from the CPU that the image is running encrypted in the confidential computing mode and that it is running unmodified. The question is how we do that, and the answer is remote attestation. How does remote attestation work? This is really the magic of the whole technology. You upload an image; the hypervisor allocates memory for it and puts the image in that memory; then the image starts, and it asks the security processor inside the CPU: dear security processor, please attest me. What the security processor does is look at the memory contents and calculate a hash, a cryptographically
secure hash of the memory; it looks at the register contents and calculates a hash of those as well; and then it signs those hashes with its own embedded key, put there by the manufacturer of the CPU, as an asymmetric signature. These signed hashes are called the attestation report, and that is then sent to you, the owner of the machine, the administrator who wants to run that VM. You can then check: this hash matches my image; this register hash matches the expected state of the registers at the start of the image, or at the actual moment it asked for the attestation. And because it is signed, and the key that signed it is itself signed by AMD or Intel or whichever manufacturer of the CPU, this tells me that exactly the image that should be running is running, that it is running encrypted, that the hypervisor did not modify the image, did properly enable the encrypted mode, and could not have access to the contents. From now on, I know the VM is the right VM and is running in a mode where the hypervisor no longer has access. That is cool, but it is also somewhat impractical, and that brings us to the next slide: current software does not support this mode. Calculating the hash over the whole VM is impractical, and the VM may be changing over time: if it is stateful, if it actually has some configuration, the hash may be different every time it is started. The easiest way to take it from here to an actually practical state is trusted boot. Trusted boot has been around for quite a while; it uses a TPM to measure the individual stages of boot. So if we could just have a virtual TPM running inside our VM, providing the services that trusted boot needs, then we would have a solution to the problem; we would have something that connects the
confidential computing technologies, SEV-SNP and TDX, to our old, well-known trusted boot. If we can get that piece of glue, then we have a full solution that works with existing software and provides what we were asking for, and that is confidential computing. There is such a piece of software, and SUSE has actually implemented it. It is called COCONUT: COCO for confidential computing, SVSM for Secure VM Service Module. It was written by Jörg Rödel, big thanks to him; he is the reason I can give this talk, because he has implemented not just this but also a number of other components for confidential computing in the Linux kernel. It is now the default and standard part of the solution, taking over even other implementations. What does it do? It is written in Rust. It attests itself using the security processor, then creates a virtual Trusted Platform Module, an fTPM or firmware TPM, and provides those services to the operating system. It runs at VMPL 0, the most privileged level, and runs the operating system at VMPL 1, so the operating system is isolated from it. The trusted part is thus limited to a fairly small amount of very carefully tested and reviewed code, and the operating system, even if it gets compromised, cannot compromise our COCONUT; it is just another level of security in there. One more component that we need to complete the whole picture is the remote attestation server: a machine that can talk to COCONUT and ask, who are you, give me your attestation report, and I will validate it. That is Keylime, a solution that was developed primarily for standard trusted boot but can be applied just as well in this case, because again, we have converted the case of confidential computing into a
trusted boot scenario using COCONUT. Keylime is actually more extensive than just that validation; it also does runtime monitoring using IMA. Of course we would also be using encrypted disks, full-disk encryption via the TPM, which is now available in SUSE products as well, and it has a full revocation framework for keys, and so on. This is what completes the recipe: with this we can have a fully attested, fully confidential machine running anywhere, and this is what I call on-premise-equivalent privacy: your VM is as secure as if it were running, well, in your basement. If you want to try any of this, SUSE is regularly releasing, every three months, a new release of our next-generation Linux product. It is available both in its open-source, openSUSE community variant and currently also freely as the enterprise version. The latest prototype, released in March, is called Piz Bernina, and we will be releasing another one soonish; it includes all these technologies, so if you are looking at testing them yourself, download Piz Bernina and feel free to play with it. SUSE will continue dropping more code and more complete implementations of this, and we will be very happy for any feedback. Now, there is much to do; this is not a complete stack yet. Number one, the CPUs are still missing. AMD has been on it for a while: since 2021 there are CPUs on the market from AMD that support this, but they are still not all that widely deployed. When it comes to Intel, the current Sapphire Rapids CPUs do have experimental microcode that enables this functionality, but I doubt that will become widely available; only Emerald Rapids, coming this fall, will have it. So we are really talking about a software implementation that is ahead of the actual availability of the hardware. Nevertheless, even on the software end there are a number of things that still need to get
done, and one of them is that KVM is still missing support for SEV-SNP: we support SEV-SNP as a guest in the Linux kernel, but not yet as a host. That depends on specific patches which, and I will not go into the details here, enable specific aspects of memory management. This is also what is currently blocking Intel TDX support in the Linux kernel, but I am quite sure that this will be resolved sooner rather than later and that both technologies will get fully implemented in mainline; currently they exist as out-of-tree patches. There is also somewhat missing support by some of the major cloud providers. AWS and Google Cloud are using KVM, or sort of KVM: what AWS is using is today called Nitro, but it is still originally based on KVM, so they are hit by this missing support. Microsoft Azure is actually using their own little paravisor, and that currently has full support for SEV-SNP, so that would be the place to test this technology. If you want to test on AWS, there are dedicated instances with AMD EPYC processors, somewhat more expensive than the regular instances, but at least the Amazon Nitro hypervisor is not blocking anybody there from trying this. We will of course continue working with all those cloud providers, making sure the technology is fully available and enabled in our products so that it can be properly used, looking toward an end-of-year time frame to put all the pieces together. Now, I was talking quite a bit about how this protects the user from the cloud provider, as if the cloud providers were evil; of course they are not. So the question is: should a cloud provider care about confidential computing? Well, yes, because when I say that the technology protects a user from the cloud provider, that also means that this is a benefit for the
cloud provider: they have an additional technology that protects their customers from anything that can go wrong on their end, again a rogue employee, and even legally they are relieved of liability for what is running on the customer's system if they have no access to it. That is actually quite a big win: the cloud provider no longer has to care what the customer has on the system, because they will never see it. What is also important is that this brings potential additional customers to the cloud provider or the MSP: customers like banks, or any enterprise managing personal data under the European GDPR or its equivalents, where it is not possible to transfer the data to other countries, other legislations, which limits the ability to process it in the cloud; the banking industry, trade secrets, card processing, what else; any regulated market actually has trouble moving into the cloud, and this enables that. Even a customer like SUSE would be enabled by this to move to the cloud: SUSE currently cannot process our code in the cloud when we are compiling the distribution, because we have a certification, Common Criteria EAL 4+, that prevents us from doing so, since somebody along the way could modify the code, introduce back doors, or find out about vulnerabilities, and so we have to process it on premise. Given that with this technology we would be able to prove that at any point in time the code was always encrypted and never accessible to any third party, it would allow us to do that; so it could be a major business enabler. What does it mean for the edge? That is back from the hypervisor to physical security. If you are a telco and have your servers at the curb of a street, streaming videos to the houses in the neighborhood, or you are a
restaurant chain with a small Kubernetes cluster in every restaurant, those places can be considered hostile environments, where somebody may be interested in tampering with your system, making sure they get all the video streams for free, or whatever. This is actually a very nice way to address that, because you can take off-the-shelf hardware and run something on it that you know is exactly what you want to run, untampered with, even though there is no specific hardware modification to enable extra security. Now, given that we are at the Open Source Summit, I need to ask myself the question of Tivoization. As probably everybody knows, GPLv3 was created exactly to combat the situation where somebody, using cryptography or additional security measures, prevents modifications to software that runs on a device, is based on open source, and yet is completely locked down, because the vendor does not want that device to be tampered with. The typical example, and the first example, which actually caused this to be called Tivoization, was the TiVo video recorder, which, although its firmware was fully based on open source, would not allow users to modify that firmware and potentially fix bugs or improve it: if you modified the firmware, it would just break itself, and the device would be a paperweight. The nice thing here is that confidential computing solves this exact problem. I am delivering a service, and I absolutely need to make sure that this service is being delivered by my business without somebody being able to tamper with it; it could be, say, a device in a car that needs to pass all the certifications so that it is safe and does not kill people, or something similar. So I need to deliver that service, and I need this level of protection, and confidential computing will give it to me; but at the same time, it will allow a user to
actually replace my VM with something completely different, and the device would still continue functioning. Given that the protection is only given to an ephemeral, temporary VM in the system, the user can use the hardware for anything else once it is no longer being used for my service. So from an open-source ethos, this is actually a much, much better solution than locking down the bootloader and the firmware. I hope that you have enjoyed the talk; I think I am mostly on time, and I will just close with the question: why would we be using this? And the answer is really: why wouldn't you? The same way that everybody encrypts the network today and everybody encrypts their drives today, eventually it is quite likely that everybody will be encrypting the memory, just because it gives an additional layer of protection with almost no drawbacks. So you will want to have your memory encrypted, and many people will want compliance with regulations. We are striving to make it really zero effort, such that the operating system may potentially even be completely agnostic of this, as long as it supports trusted boot. So I hope my talk made you interested in this technology and that you will consider it in the future. Thanks.

Right, so it would be a multiplicative effect: if you lose a certain percentage because of IMA and a certain percentage extra from this; no, not at the moment. Plus, you actually do not require IMA to run this: if you run IMA, it gives you an additional layer of protection at runtime, but confidential computing is useful even without IMA. There is one more source of potential performance impact that I did not mention so far, and that is that, given you want all your disk encryption and all your network encryption to happen on the CPU
die, you cannot offload it to a network card or to an I/O card, because then you would have the data going over the PCI Express bus completely unencrypted. So we are watching the space, but of course we are not majorly involved in coding up anything there yet. There are two avenues to solve this: one is to actually have encryption on the PCI Express bus and into the card; the other is having encryption acceleration on the CPU. Both are being developed in the hardware industry, so I guess eventually we will support both. Yes, yes, because if you were combining confidential computing with that today, you would not have a solution. True, true, but still, many of the models benefit hugely from running on a GPU; but then you would actually be sending plain-text data over the PCI Express bus, which is not great, certainly not for this threat model. We have Keylime as a part of the distribution, and I believe we have made some contributions already, but not a huge amount yet, because from our point of view that is one of the more functionally complete parts of this whole puzzle, and some of the other parts needed a lot more attention. Do you want to answer? Well, actually, I was talking to AWS just last week in Berlin, and Nitro does not really get in the way of confidential computing, let's say it that way, so they will be able to enable it. That's the point: when you are doing the remote attestation, it is the VM asking; actually, it is the COCONUT component in the VM, running at VMPL 0, asking through the hypervisor to the security processor: please attest me. Yes, yes, that is the idea, and then of course the system can use that to talk to the attestation server, validate itself, and establish HTTPS communication, SSH keys, and so on, based on this exchange, and then be securely identified by anybody
connecting to it, as being exactly what we wanted to have. It is also provided the keys to unlock the storage during the boot process, so that it does not have to contain them, and so on. Fine. Again, thank you all.
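The flow from these last answers, attest first and only then receive the keys that unlock the storage, can be sketched as a toy model. All names here are illustrative, and an HMAC with a stand-in key takes the place of the CPU's real asymmetric, vendor-signed attestation key:

```python
import hashlib
import hmac

# Stand-in for the key fused into the CPU by the manufacturer; real
# hardware uses an asymmetric key whose certificate chains to the vendor.
CPU_KEY = b"embedded-manufacturer-key (illustrative)"


def attestation_report(memory: bytes, registers: bytes) -> dict:
    """Toy security processor: hash the launch state, then sign the hashes."""
    hashes = hashlib.sha256(memory).digest() + hashlib.sha256(registers).digest()
    return {"hashes": hashes, "signature": hmac.digest(CPU_KEY, hashes, "sha256")}


def verify_and_release_key(report: dict, expected_memory: bytes,
                           expected_registers: bytes, disk_key: bytes):
    """Toy guest owner / attestation server: check the signature, compare
    the hashes against the image we uploaded, and only then hand over the
    key that unlocks the VM's encrypted storage."""
    if not hmac.compare_digest(
            hmac.digest(CPU_KEY, report["hashes"], "sha256"),
            report["signature"]):
        return None  # not signed by the (stand-in) CPU key: emulated CPU?
    expected = (hashlib.sha256(expected_memory).digest()
                + hashlib.sha256(expected_registers).digest())
    if not hmac.compare_digest(report["hashes"], expected):
        return None  # image or register state was modified before launch
    return disk_key


DISK_KEY = b"luks-passphrase"
good = attestation_report(b"my VM image", b"initial vCPU state")
print(verify_and_release_key(good, b"my VM image", b"initial vCPU state",
                             DISK_KEY))  # -> b'luks-passphrase'

bad = attestation_report(b"image modified by hypervisor", b"initial vCPU state")
print(verify_and_release_key(bad, b"my VM image", b"initial vCPU state",
                             DISK_KEY))  # -> None
```

The shape is what matters: the VM never stores the disk key; it only ever receives it after the verifier has checked a report the hypervisor could not have forged.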