Hey, good morning, and welcome to the Level Up Hour, where we talk about all things containers, Kubernetes, and OpenShift. I am joined today by my co-host, Mr. Scott McBrien. Scott, how are you doing?

Doing great, Randy. Looking forward to talking to our guests this morning.

Indeed. So, you know, we have a really interesting show today, because it starts with a premise: how can we maintain trust in the cloud and ensure that we're building secure systems? I think that's an open question, but maybe not completely open. We're going to talk a little bit about the Keylime project, and before we dig into the details of Keylime, maybe help understand what some of the main issues are. So today we are joined by Lucy Kerner, who is Red Hat's director of security global strategy and evangelism; Michael Peters, principal emerging technologies developer; and Lily Sturmann, who is our senior software engineer. So greetings, all, and thank you for joining us today. Before we get started, I just want to remind everybody: please like, subscribe, and share to let everyone know we're on the air. And let's get into it. So, Lucy, how can we actually make sure that we're building secure systems and applications in the cloud? Because there's an implied trust there. We're going into the great beyond, and how do we do that?

All right, and it's not a simple answer, right? Security's not like, oh, I buy this one product or I do this one thing and I have the silver bullet solution. It's about tackling it across people, process, and technology. It's about shared responsibility between the public cloud providers and yourself. It's using a layered security defense strategy. Again, there's no such thing as 100% security. There is always going to be a trade-off. You can't fix everything. You're not going to have enough people, time, resources, right?
That's always going to be a challenge. For example, you have a strategy for data security across the hybrid cloud. Whether you're talking about data in the cloud or on-premise, the CIA triad, you know, confidentiality, integrity, availability, is still true, right? You still need to know: is the data appropriately accessible when needed? Who has access to the data? And then make sure that you tackle data security not only for the cloud platform, but for the cloud image and the cloud hosts, right? The number of workloads in the cloud is growing all the time. And with developers, you're seeing them have much more control over the application pipeline and lifecycle. They're downloading things directly off the internet, whether it's open source packages or libraries, that kind of thing. And you're seeing these increased breaches in the software supply chain, like the SolarWinds breach. So you want to have a strategy for software security. You want to make sure that you have a consistent, repeatable framework, like an SDLC, in-house to develop applications not only quickly but also securely, with security built in early in that pipeline. When you're consuming open source technologies, if you're doing it yourself, right, if you're consuming it directly yourself and not using enterprise open source from vendors such as ourselves, you want to create your own product security team.
You know, find out about new vulnerabilities in all the open source packages you're using. But first you have to know all of the open source technologies and packages you're using. Assess the security impact of the software that you have. And of course, creating or getting the security fixes, and not only getting them but also isolating and doing backported fixes as well. As I mentioned earlier, it's not just a technology problem. There's a huge people and process aspect to it as well. You see a lot of research and surveys out there saying there are just not enough security people out there, right? So you can't just throw bodies at the problem. That's not going to be sufficient, especially with so few bodies available, right? And also, the good ones, you know, there's a very short life span for CISOs. They don't stick around forever, right? A lot of them leave within a year, maybe two years. And really good security people are in high demand, especially anyone who's doing cloud security. So that's another challenge, right? And especially with containers and Kubernetes and these newer cloud technologies, you really need to have that cross-collaboration. You can't have silos, and that's one of the huge challenges and why these technologies are not getting adopted very quickly, right? Because you need to have app dev, operations, security, and compliance, all of those teams, working together and understanding: how do you tackle security differently in this new world?
You know, the tools may be different, and the way you think about security is different. And so you want to make sure all of these teams are educated. You can't just throw security over the wall anymore, doing last-minute, eleventh-hour panic pen testing and security scans, those kinds of things. Those are highly prone to human error, right? And misconfigurations, those kinds of things, are one of the top reasons for the security breaches out there. Make it teamwork, right? So that's another big one. Go ahead, sorry.

So Lucy, you started talking about some of the challenges, like technologies and a variety of other top-level issues in the security space right now. But what are the biggest hurdles in the IT space in regards to containers and Kubernetes, in your opinion?

Yeah, so I think one of them is, as I said, a lot of teams are not used to this new way of thinking. I'll give you an example. I was at a customer about two years ago, you know, before COVID, and I was sitting with the security team, and they were trying to manage their containers and Kubernetes environment using some of their legacy tooling. That's what they're familiar with, right? They were trying to track the IP addresses of all of these containers, and they were like, why can't I keep track of them? My tool is crashing all over the place. They didn't realize that this is an ephemeral environment. This is an environment where containers are coming up and down and pods are coming up and down. You can't be tracking it with things like Nagios or the standard monitoring tools that they're used to. But that's what they're familiar with, and not only are they familiar with it, they're comfortable with it.
They built their careers around these tools. So it's very hard for them to hear, oh, well, you have to use this tool you've never heard of, that kind of thing. And it is also this different way of thinking, right? They'll be saying things like, well, I need to be installing my antivirus or my third-party scanner on CoreOS, an immutable OS. They're not used to hearing that that's not what you're supposed to be doing, right? So there's this kind of education session. And not only that, but the compliance teams as well. If you look at the auditors, a lot of them are very used to this defend-the-castle, perimeter security type of approach, where you're not used to this kind of immutable infrastructure. So a lot of the compliance rules out there, the security controls and the security policies, are written for that defend-the-castle, perimeter security approach, where you're not dealing with this kind of immutable infrastructure. So that's another challenge: security teams talking to the compliance teams and the auditors to try to make them understand that installing third-party antivirus or agents doesn't make sense in this environment, those kinds of things. That's another big challenge as well. And also, on the containers and Kubernetes side, that's another layered defense that teams have to take on, starting with using trusted content. That's a big challenge: developers just downloading stuff from the internet, not using things like what we offer, for example UBI container images that we've put in the Red Hat Container Catalog, OperatorHub, those kinds of things, right?
And also using an enterprise registry. Those kinds of practices should be in place to control application security, protect the platform itself, and then also detect and respond at runtime. So, protecting the platform itself: if you're not going to use, let's say, OpenShift, then you're going to have to figure out a strategy for how you're going to do all of the deployment things, right? Like if I'm not using operators and enabling RBAC by default, or protecting the platform data, which we do by default in OpenShift. And then there are all the policy-based deployments we're doing with things like Advanced Cluster Management and Advanced Cluster Security that we provide, and all the runtime-related security that we provide, also with things like ACS.

So you'd mentioned auditors and security and tooling around controlling containers. There's a newish member of the portfolio around auditing containers, right?

When you say newish member, are you talking about a feature that we're bringing into OpenShift?

Well, we've integrated it into OpenShift, but we had acquired a company. Yeah, I am mentioning StackRox.

Yes, but we don't call it StackRox anymore. We're calling it something else, right?

All right, Red Hat Advanced Cluster Security, right?

For Kubernetes, yeah. So that's our product that is focused on securing the workloads running on top of Kubernetes, right? And at cluster scale.
So this comes with a lot of built-in technologies that allow you to secure Kubernetes at scale, and also, of course, the workloads running on top.

Well, if we could step back for just a minute. There are a lot of things about the way that you're working when we're in this world of containers and Kubernetes and so on that are very different. You talked about defend the castle, defend the perimeter, and sort of a presumption almost of that physical data center and all that goes with that. Now we're in this environment built on trust, where you're relying on artifacts and files and images and all these other things. And it's also very dynamic. One of the premises here is that it's not something that's locked in for 20 years; it's something that is going to evolve and improve and change a lot. I think that's part of the issue. But stepping back even further, open source is sometimes believed to not be secure, but in fact it is actually the more secure choice because it's transparent. But you touched on, Lucy, a fundamental issue there: on the one hand, it's open. You can look in and see where the vulnerabilities lie. But you have to do one of two things, right? You have to either go and look yourself, or you have to have somebody that you trust to do that, and presumably we are one of the parties that you can look to to trust. Can you talk a little bit about how we approach looking at security things, in terms of offloading that effort from the end consumer and making it something that they can trust?

Right.
So yeah, as you said, the benefits of open source are obviously things like taking advantage of all the innovation that's donated to the upstream, whether it's Kubernetes from Google and all the lessons learned there, including security technologies, right? Also, many of the container security isolation technologies, like SELinux, cgroups, all that stuff, are contributed to the upstream as well. However, you really do need to put security gates in place in terms of using these open source technologies securely. As I mentioned, if you're not transferring that risk to a vendor that provides enterprise open source, such as ourselves, make sure that you yourselves consume those open source packages securely, right? Make sure that you find out where all those vulnerabilities are, and find out where all those open source packages that you're using are. We also don't just take the upstream Linux and make RHEL, right? We put it through our own software supply chain process. We're very picky about what packages make it into RHEL. We do security scanning on the source code itself, we do things at compile time, and we also have extensive quality engineering and quality assurance testing that we do on all of our software. We have a secure way of distributing our packages. And of course we have our own global product security team, right?
So if you are not doing this, you need to have something like this in place when you are consuming upstream directly, if you do that yourself.

Right. Well, I think that's where we fit in, or one dimension of how we fit in, because I think for most organizations the idea of being able to do all of that simply to be able to stand up applications is probably not realistic. So I think we have a really important role to play there, and part of our role also is participating in projects that help improve things like security, particularly when we look into this sort of brave new and uncertain world of cloud deployment and some of the new kinds of risks that come in with that, which are very different from the old defend-the-castle world. One of those projects that we are involved in, and that I think shows a lot of promise, is Keylime. And that's how we happen to have Michael and Lily with us today. Keylime is fundamentally about answering the question: how do you secure the platform? How do you get to that level of comfort that you might have with the old defend-the-castle procedures and routines and so on, when you are in this world where it's someone else's castle that you're running your mission-critical apps in?

And it's also interesting because that root of trust is being redefined for the cloud era. How do you think about root of trust differently in the cloud? That's also what's interesting about this as well.

Right. So with that, Mike, maybe could you tell us what Keylime is, a little bit more than just the hint that I've alluded to here?

Sure. The problem that Keylime is trying to solve, as you were talking about, is this:
How do you secure something when you don't control the underlying hardware? This is a scenario we see in the cloud, right, where you're running VMs on somebody else's hardware. It also comes up in edge and IoT cases, where it may not be running in a VM, but it's running on some hardware that may not be under your physical control. And so Keylime tries to come in there and give you this root of trust, hardware-based when it can, software-based when it can't. It's using something called a TPM. A TPM is a hardware chip, or can be simulated in software, that has a lot of really nice cryptographic properties, such as built-in random number generators and cryptographic hash functions. It can do encryption, seal data, and store cryptographic keys. Also, one of the cool things about it: it has these registers, kind of like a normal CPU, but specifically for security, that can't be written to, only extended. So when an application is writing to the TPM, it doesn't overwrite what's there. It instead extends the hash. The hash of the first value might be in there, and then a new value is added and it gets hashed together with that other hash. So everything that's measured and stored in the TPM, or what we call extended into the TPM registers, is a sort of history of every action that's come before it. That gives them some really nice properties that are used for a lot of things like measured boot and remote attestation, and that's Keylime's core focus, those two things, measured boot and remote attestation.

Well, so what we want to do, basically, by using the functionality of the TPM in this way, is to make sure that at every stage you know the state that the system is in, the state of the machine. So that's what it's tracking with those hashes.

Okay, so let's unpack a little bit of that. We talked about TPMs, trusted platform modules. There's also something called a TEE. How are these things different, and when might we care about one versus the other?
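As a rough illustration of the "extend, never overwrite" behavior described above, here is a minimal Python sketch. It is not Keylime or TPM driver code: the function name `extend`, the sample event names, and the assumption that the register starts at all zeros with a SHA-256 bank are illustrative choices made for this example.

```python
import hashlib

PCR_SIZE = hashlib.sha256().digest_size  # 32 bytes for a SHA-256 register bank


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """Return the new register value: SHA-256(old value || SHA-256(measurement)).

    The old value is an input to the new one, so the register can only be
    extended, never set directly to a chosen value.
    """
    digest = hashlib.sha256(measurement).digest()
    return hashlib.sha256(pcr + digest).digest()


# Assumption for this sketch: the register is all zeros after reset.
pcr = bytes(PCR_SIZE)
for event in (b"firmware", b"bootloader", b"kernel"):  # hypothetical boot stages
    pcr = extend(pcr, event)

print(pcr.hex())
```

Because each value is folded into the previous one, the final register value depends on every measurement and on the order in which they were made, which is the "history of every action" property mentioned above.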
Sure. So Michael did a good job of explaining what the TPM does. It has more of a specific functionality; for example, in measured boot it's very useful, and it gives this property of integrity, so you know that the system is still in the state that you expect. TEEs are similar in that they have this hardware root of trust, which is important because hardware is basically much more difficult to manipulate than software. You have a lot of different software attacks. We want to have a hardware root of trust for maximum security, because it is much more tamper-evident. And the TEE uses this hardware root of trust in a very different way. It is not a specific functionality; it gives you the ability to do general computation in a secure area of memory on a, I would say, regular CPU. Only some CPUs are manufactured with TEEs, but they're generally available now. And so the way that you would use that is, let's say that you have some code you want to run that is very sensitive, or some data that's very sensitive. You can run just that application in this secure area of memory, and that's the property that the TEE gives you. The keyword there is confidentiality. It has confidentiality properties, and many of them also have integrity. What that gives you is that the underlying host system or the hypervisor will not be able to introspect into your data. It can't read it. That's the confidentiality property. And then with the integrity property, it won't be able to tamper with it. So you get confidentiality and integrity and general computational ability with the TEE, but only for a specific sensitive application, whereas with the TPM you are doing integrity measurement of your entire machine. Does that sound right, Michael?
Yeah, one way I like to think about it is that the TPM helps protect the files on the system and things like your firmware, your kernel, those kinds of things, while a TEE is a runtime protection. So as your programs are running, they can't be altered by the underlying hypervisor or by another program messing with the memory.

Yes.

So this would be interesting for folks using hyperscalers, where they know that they're in an environment where the hypervisors are being shared across multiple organizations, but they don't know who the other tenants are. Is that where we're going?

Yes. So on AWS, when you spin up an EC2 instance, you can ask it to give you an enclave. That's what a TEE is, a secure enclave. They're exposing that TEE to your virtual machine, which you can then use, and there are ways to do attestation against it to make sure that it's the TEE that you think it is and that nothing has altered the memory of your runtime system. Now, TPMs are physically more available, I think, than TEEs in general. TPMs appear in everything from your phone to your laptop to routers. TPMs are everywhere, but they aren't as exposed on cloud systems. So it would be nice if, next to that checkbox that gives you an enclave when you spin up an EC2 instance, Amazon would also give you a checkbox that says, give me access to the TPM as well. They don't do that currently, so in some situations we have to use software emulation for the TPM. Even though you don't get a literal hardware root of trust when you're not using a hardware TPM, there are scenarios in the cloud and other places like that where a software TPM can still give you some good guarantees about your system.

And a TPM would be useful in situations where you're concerned about the addition of hardware to your machine that may not have come from one of your trusted suppliers. Am I reading that right?
Yes, kind of, in the future. Actually, it's interesting that you bring that up. Microsoft has a patch out currently for the Linux kernel, in the part called the Integrity Measurement Architecture, that would do that with the device manager. So when new devices are added or removed, it would trigger the Integrity Measurement Architecture, which then extends those attributes into the TPM, which could be verified by something like Keylime. That brings me to another point. The Linux Integrity Measurement Architecture, or IMA for short, is a key part of this. It's a part of the kernel that can intercept certain system calls. Like I mentioned, Microsoft is trying to extend it to do device management, but currently, anytime a file is read, it could potentially be intercepted by IMA and measured. A common configuration is to have IMA set up to measure every executable that's run by root, and that would be the executable and the libraries that are linked to that executable. So IMA, right before those files are accessed, actually measures them, does a SHA-256 hash on them, stores that internally in the kernel, but also extends that into the TPM. And so those measurements can be validated externally, and that's where Keylime comes in. Keylime has an agent running on your target systems, which takes these measurement quotes from the TPM and sends them back to Keylime. And because of the security guarantees of the TPM, and the key exchanges and lots of things like that that happen between Keylime and the chip, we can guarantee that the files are exactly what you want them to be, and if anybody tampers with them, even temporarily, they cannot escape the footprint that's left in the hardware.

So you used a phrase that I think is actually very important for what Keylime brings to the table here, and for this broader issue of securing the platform, and that is remote attestation. You
know, Lily or Mike, can you maybe give us a little bit of clarity about what remote attestation refers to, and maybe draw some parallels to an earlier world? What would it be the parallel to, right?

Sure. Yeah, so I think a lot of people are familiar with what we call an IDS, or an intrusion detection system. This is like AIDE on Linux. It's an open source system. You set it up to monitor the files on your system, or certain files, and if they get changed without you knowing it, it can send alerts about that. So that's sort of the old way of doing it, basically software monitoring software. Remote attestation takes it a step further and puts that hardware root of trust in there. So it's not a separate program monitoring things. It's the Linux kernel itself: when it accesses a file, it measures that file and then records that entry into the TPM. And so each new file that's accessed gets a new measurement added to the TPM. The Keylime agent runs on the host, and the Keylime verifier (we'll talk about the architecture a little bit later, but it's basically the Keylime server side) asks the agent: give me a quote from the TPM about its current state, and give me the list of files and their hashes that the IMA subsystem has accessed so far. The agent sends it back with the signatures from the TPM, so that we can verify this came from this TPM. And then we can walk that list of measurements and validate that we get the same resulting hash at the end that the TPM quote comes back with. So it's a totally separate system.
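The "walk that list of measurements" step can be sketched roughly as follows. This is a simplified illustration, not Keylime's verifier: real IMA extends a template hash in its own log format, and a real verifier first checks the TPM's signature over the quote, both of which are left out here. The function names `replay` and `quote_matches` and the entry layout (path plus hex file digest) are assumptions made for this example.

```python
import hashlib
import hmac


def replay(measurement_list):
    """Recompute the register value implied by a list of (path, file_digest_hex)."""
    pcr = bytes(32)  # assumed starting value for this sketch
    for path, file_digest_hex in measurement_list:
        # Simplified stand-in for the per-entry hash that gets extended.
        entry = hashlib.sha256(bytes.fromhex(file_digest_hex) + path.encode()).digest()
        pcr = hashlib.sha256(pcr + entry).digest()
    return pcr


def quote_matches(measurement_list, quoted_pcr: bytes) -> bool:
    """True if replaying the reported list reproduces the value in the quote."""
    return hmac.compare_digest(replay(measurement_list), quoted_pcr)


# Hypothetical data: what the agent reported and what the TPM quote attested to.
reported = [
    ("/usr/bin/python3", hashlib.sha256(b"python binary").hexdigest()),
    ("/usr/lib/libc.so.6", hashlib.sha256(b"libc").hexdigest()),
]
quoted = replay(reported)
print(quote_matches(reported, quoted))
```

If an entry is dropped from the list, reordered, or altered, the replayed value no longer matches the quoted one, so a host cannot hide a measurement without the verifier noticing.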
Running separately, it can verify the integrity of that machine, which is nice. But it gets even cooler when you think about doing this across a fleet. I can have a Keylime server monitoring hundreds or thousands of servers or other nodes and protecting their integrity, rather than having just an IDS on each individual one and not having a centralized view of that. So remote attestation gives you that centralized view of a large cluster, and also that hardware root of trust. Like Lily was saying, hardware attacks are much harder than software attacks. If somebody roots a machine, how can I guarantee that my intrusion detection system is still going to work? Because it's software monitoring software, and that could be compromised. But they can't do that here without physically getting to that chip and altering it, and TPMs are designed so that tampering with them is evident. So it provides some nice security all the way down the stack.

Are there equivalent solutions in the third-party security vendor space that you've seen, or evolutions that you're seeing already happening in the vendor space? I haven't really seen it, so I'm curious if you have eyes on it.

I haven't really seen it in the vendor space. I mean, that's why Red Hat kind of got into Keylime. My colleague Luke Hinds was looking around at the vendor space and at the open source space and didn't see anything. This was back in, like, 2016, so a couple of years ago. But he came across this paper from these MIT researchers that talked about this theoretical way to do this trust in the cloud with TPMs, and then started working with them to build a prototype. Keylime launched as a community project back in 2018. Last year we became part of the CNCF, the Cloud Native Computing Foundation, as a sandbox project. And yeah, I don't know of anything equivalent in the commercial space.
I do know of some other larger companies, the Googles and Microsofts, that are exploring their own attestation solutions internally for their own systems, but nothing that I've seen that other people can adopt.

All right. Well, one thing about remote attestation, just quickly, that is really cool is that, because there is such a chain of trust that's cryptographically verifiable, it gives people the option to actually audit that and check over it themselves if they want to. They don't have to, but it's not necessarily offloaded to a third party. So you're not just moving the problem around. It actually gives you this property where you can check it and verify it.

And so not only are you maybe trusting the third party, but you're also given the option to actually go through and check yourself periodically to make sure that it's actually doing what you were hoping and presuming that it's doing.

Exactly. I think when TPMs first came on the scene, one of their first uses was to do what's called trusted boot, which is different from what we call measured boot. Trusted boot basically allowed the vendor to lock down the machine or the piece of hardware and not allow the user to tamper with it or alter it, whereas measured boot is the opposite. So trusted boot restricts the freedom of the owner of the device, and measured boot increases the freedom, because you can control what software runs on it, and you can measure it and verify it yourself. I think those terms can be a little confusing and people can conflate them, but measured boot gives the user and the owner of the hardware the freedom and control, versus trusted boot, where the vendor restricted it.
Yeah, so if I was to try to restate that, measured boot is really more a concept of visibility and clarity and transparency about what's happening, as opposed to saying, here's a class of things that we're not going to permit to run or to do or to act, right? And so it's really more about security through visibility and clarity and less about prohibition, right?

Well, it doesn't prescribe the policy. You can have policies that do prohibit things as part of measured boot, right? If you don't want somebody tampering with your kernel, you can prevent that from happening. That's a good example. But you as the owner of the system get to decide, with that policy, what that is.

Right, right. Okay, well, I think we are going to throw caution to the wind today and undertake a demo, which is always a good way to make these kinds of concepts that we're talking about a little bit more tangible. So who's going to drive the demo for us today?

I've drawn the short straw on that.

All right. Well, strap in. Let's see how it goes.

So just to get a little visual on this, the Keylime architecture. We mentioned this before, but we have the Keylime agents running on some untrusted hardware somewhere, talking to a TPM or a virtual TPM. And then over the network, which can be trusted or untrusted, it doesn't matter, it talks to the Keylime server side. The Keylime server side is basically split into two parts, the verifier and the registrar. So when the agent comes up, it contacts the registrar to say, hey, I'm ready to go. Here are the details of my TPM.
Here are the certificates for my TPM. Let me know what you want me to do. And then, after a node is registered, somebody can tell the verifier, okay, these are the details of how I want this machine to be attested. And so then the verifier starts talking with the agent, getting quotes, and it just does this periodic attestation, and at any point, if it fails, it can do revocation actions. And so in this demo, we have a setup on AWS where we have a Keylime server with the registrar and the verifier, which is talking to several different web applications. It's just a generic web application; it doesn't really matter what it is. But it's all running behind a load balancer. And so we have the Keylime agent on each of these web apps talking to a software TPM, because again, AWS doesn't expose a hardware TPM yet, and sending the quotes periodically to the verifier. And then we're going to fail a node by running some evil script, and the verifier is going to kick off a revocation action to remove that node from the load balancer. So this is a real-world scenario where you have a normal web application behind a load balancer. If any of those nodes gets compromised, you would like to remove it from the load balancer, and this will do that automatically. So let's see if this works.

All right, so I have some terminals here and I'll explain what's going on. At the top here, I have the output from the Keylime registrar. Right now it doesn't do anything until new nodes come on. At the bottom left, I have the first application; this is the output from the Keylime agent running in that application. And this is the second one. So I'm going to start the Keylime agent here on the third application, and we can see it starting up. It talks to the registrar, and everything returned correctly.
And so now it's just waiting, and you can see here, waiting for revocation messages. So then we can go to the other screen. At the top here I have the Keylime verifier running, and it's checking these two nodes, just continually checking. These are the EC2 IDs of those nodes in Amazon. And so now I'm going to tell the verifier about this new node, this third node, and what it should do and how it should verify it. I do that with this Keylime tenant command. I'll walk through these options real quick just so you can understand what's happening. So with the Keylime tenant, we're saying that the verifier is at this IP address. The new host that we're monitoring is at this IP address. This is its ID, which corresponds to its EC2 ID. It could be anything: if it's an EC2 instance, it's the EC2 ID; if it's just random hardware, you can use hardware-specific keys or IDs. Whatever you want; it could even be a randomly generated UUID, so it doesn't really matter. It's just to track this machine over time. And then we give it an allow list. An allow list is a list of all the files and hashes that we are allowing to run on this system. We can say what we're going to exclude from monitoring; it's very common to exclude things like log files and anything in a database directory, things that are going to be changing all the time. Right, where you can't know ahead of time what they're going to be. These are just the CA certificates that are used to communicate. This is a payload that's sent over when attestation passes, and this payload will also include the revocation actions. And so we'll look at what all of these look like. So at the top here, this is what a sample allow list looks like. It's basically just a list of hashes and what those files should be, so /etc/passwd should be this, or anything else. So it's a list of every file and their
expected hashes. And then our excludes are here, and this is pattern matching, so we can exclude anything. And in fact, see, we have /etc in our allow list. We have a lot of other things too; the allow list is pretty long, you know, 64,000 entries. But we're just going to ignore everything in /etc in this scenario. It can change; we're not guaranteeing those files, and something might be pushing out changes. In a more mature, evolved situation, you might actually want to monitor the files in /etc and say they can only change after going through our CI/CD pipeline, and our CI/CD pipeline also updates the allow list. Just out of curiosity, does it calculate allow or exclude first? So the allow list is definitive, so you can't have something that's not in the allow list. And then if there are things that fail to be in the allow list and don't match the excludes list, those cause problems. So the allow list is the definitive one, and then the exclude list kind of layers on top of it. Okay. I'm excluding stuff in /home, and I'm also excluding stuff in this /root/keylime directory, which is where I'm running this from source currently. And then we have our revocation actions, and in this case it's just a Python script that's run by the revocation actions. So if the action is a revocation, we're going to run it, and what we're going to do is make a call to the AWS command line, the elastic load balancers; we're going to deregister our target. This is the group, which is just here.
That's our elastic load balancer group, and the target is whatever agent was just revoked. And so we can see here... So, Michael, if I can interrupt you for just a second? Yeah. So, being familiar with AIDE, it tracks a whole bunch of properties about files, and then the downside is that you actually have to run a check against your database to know that anything has changed. So here we're looking at less metadata about files, just looking at the content, but dynamically monitoring those files so that as soon as the content changes we can take action, right? Yes and no. So in this scenario I'm showing how to use Keylime to just do the measurements of the file, so the contents of the file, as you said. There's another way to use it with IMA as well, so that each file is signed by a key. So for instance, in upcoming versions of Red Hat, each file that's installed by an RPM through Red Hat, like an official Red Hat file, will also have signatures of that file signed by Red Hat keys, and that signature includes the contents of the file and also the attributes of that file. So the attributes can also be measured. You just get two different things, and you can use either one with Keylime and IMA, or you can use both together. There are just different reasons you might want to use one over the other, but you can get the integrity of just the contents, or of the contents and the properties. So there are certain, let's call them environments, where maybe the data integrity is one concern, which we solved here, but then access to that data may also be a supplementary concern. So if someone was able to get read access on a file that they did not have read access on before, that could be a problem. So do we have a way of addressing concerns like that?
Yeah, if you're using the IMA signatures versus the IMA measurements, then the signature, when it's signed by the vendor, or in this case when you're signing the file (it could be Red Hat specific files, or if you want to do your own files, then you'd have to manage the key and do the signing and all that yourself), you sign it with the contents of the file and the attributes of that file. So yeah, if somebody changes it to give themselves read access when they didn't have it before, it would then fail that integrity check. Excellent, thank you. Yeah, but I think the simpler way to visualize what's happening is just to look at the file content. And then as you can see from here, I have an elastic load balancer which has these three nodes, and these are the three nodes that are being monitored. So now let's go back. So we added this node and it's now waiting on the revocation messages. So now let's do the Keylime tenant command to tell the verifier exactly how to monitor that node. All right, so I've got the quote validated from that new system, and you can see here the verifier is now validating the i058 machine. So now, I have an evil script in my home directory. I'm going to copy it to /usr/bin/evil.sh, and then I'm going to run it. And so now we see here: not found, this file was not found in the allow list, some entries couldn't be validated, and then, sending revocation to listening nodes. Right, so it triggered the event on this remote attestation system.
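To make the check that just failed a bit more concrete, here is a rough Python sketch of how an allow list and an exclude list could be evaluated together, based only on how it was described in the demo. This is not Keylime's actual code: the function name check_measurements, the data shapes, the short placeholder hashes and the file /usr/bin/web-app are all made up for illustration. Excluded paths are skipped, and anything else must be in the allow list with a matching hash.

```python
import re


def check_measurements(measurements, allowlist, exclude_patterns):
    """Return a list of (path, reason) failures; an empty list means the check passed.

    measurements: iterable of (path, digest) pairs reported for the node (assumed shape)
    allowlist: dict mapping path -> expected digest (assumed shape)
    exclude_patterns: regular expressions for paths to ignore entirely
    """
    excludes = [re.compile(pattern) for pattern in exclude_patterns]
    failures = []
    for path, digest in measurements:
        # The exclude list layers on top of the allow list: excluded paths are ignored.
        if any(rx.match(path) for rx in excludes):
            continue
        expected = allowlist.get(path)
        if expected is None:
            # The allow list is definitive: anything not listed is a failure.
            failures.append((path, "not found in allow list"))
        elif digest != expected:
            failures.append((path, "hash mismatch"))
    return failures


if __name__ == "__main__":
    # Placeholder hashes, not real digests.
    allowlist = {"/usr/bin/web-app": "aa11", "/etc/passwd": "bb22"}
    excludes = [r"/etc/.*", r"/home/.*"]
    measurements = [
        ("/usr/bin/web-app", "aa11"),     # listed and matching
        ("/etc/passwd", "changed"),       # listed, but excluded, so ignored
        ("/usr/bin/evil.sh", "ff00"),     # not listed and not excluded
    ]
    print(check_measurements(measurements, allowlist, excludes))
```

In this sketch only the evil script is reported, as not found in the allow list, which mirrors the message seen in the demo; the changed file under /etc is ignored because of the exclude pattern.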
So I just ran this thing that shouldn't be allowed to run, and now I get this. So now we come back here, and I've noticed there's a little bit of a lag, but now we can see that the node is draining. So it's been deregistered from the target group in the elastic load balancer, it's draining any connections that were there to it, and it will not receive any new requests to that node. And there we go. And so /usr/bin/evil.sh was there when we registered the node, and we had the content hash in the allow list, and so when you overwrote it and changed its hash, that's what kicked off Keylime issuing the revocation? Well, actually the script was there from before, but it still wasn't in the allow list. What triggered the attestation failure was the script being executed, because I have IMA set up to only monitor files that are executed by root. You can have it set up to do lots of different things: to monitor every file that's accessed, read or written to, or any file that's executed. In this case, it was executed as root, and so that's what triggered the attestation failure. And the nice thing is, if I go and remove that file and then try to restart attestation, it will fail again, because the history, the trace that that file was run, is now in that TPM. It can't be removed. So even just removing the file won't restart attestation. So I can't just compromise the system, run a script real quick, delete the script and get out; the trace that it ran is still there. So, being an operations person who may or may not have made a mistake one or two times in my history of doing things, let's say that I accidentally did something that caused a revocation to occur, and it's not as simple as just approving the node and reattaching it. How do we do that?
You have to reset the TPM, which in most cases just means rebooting the machine. So you reboot the machine, it goes back through measured boot and gets to a state where it should be, and then you can restart attestation. But yeah, that's something to watch with Keylime when you don't have a very mature or very disciplined environment, where any change that goes through has to be automated and go through your pipeline. If it's changing or altering files that are being monitored, you have to create a new allow list to go with that and update Keylime the same way. So if you're manually SSHing into these systems and manually changing files, you have to be very careful with something like Keylime, because you probably want to put it more in an advisory state. So the way Keylime does revocation right now, there are different revocation actions it can send to be executed. In an upcoming release we'll also have webhook support so that it can call out to a webhook.
When you're first starting out, you might want it to just alert you: send a notification to Slack, this node failed attestation, it needs to be restarted, or something like that, so that you can make sure that your process of how you update these machines is very mature and you don't fat-finger DoS yourself. So it does take a lot of discipline, but it's for these very specific cases where you do not control the hardware and you want to control what's running on it. Yeah, and I think discipline is a great description. Over the years we've moved from kind of the gilded age of system administration, where I lovingly handcrafted four systems and that's what my job was, to, you know, I have to manage hundreds or thousands, which brings that need for greater discipline and not just SSHing in and running commands by hand, because it would be impossible to manage it that way. Exactly. In fact, it structurally makes that something that you probably are not going to want to do, because you're going to bump into exactly the kinds of situations that were described. Right, and I think it leads into another scenario of, how do you manage these allow lists? How do you craft them? How do you know what software is approved and not? And that's where a vendor like Red Hat comes in. We have future plans to make this an integral part of our system, to give you pre-approved, pre-signed allow lists from our Red Hat products that you can then incorporate into Keylime or any other system, and that you can combine with your own allow list that you generate through your CI/CD pipeline. There are lots of different ways to do this, but in a vendor-supported way, where we are helping you solve these management headaches of how do I manage my allow list. So Michael, we did get a question from the chat, from one of our YouTube viewers. So we talked about how you can get notified when events happen.
What are some of the mechanisms by which those notifications are viewable or available to operators Yeah, so, um, there is a very simple web app that comes with key lime, although it's it's very simple But there's a rest apis to all the different pieces of key lime. So when you're integrating key lime into something else Because specifically I think most organizations want a single patent class. I don't want something else They want to look into I have to see what the status of the nodes are so the rest apis for the register on the verifier can let Another system know the status of individual nodes at any time and these revocation actions Can either be like I said, um these python scripts that are executed either on the nodes themselves like if a web app node one node fails all the rest of the nodes can execute some script to Coordinate off that node or or in some way, um, but you can do a lot of like custom actions that way Like I showed here. We did it elastic load balancer Removing a deregistering a node you can do Like a kubernetes cordon and drain where the node is like the worker node is actually pulled out and all the The work on it is migrated to another node And with the webhook you can integrate with things like slack or other messaging systems Or or you can just send it have a single webhook that sends back the notification And then you do a whole bunch of chain of stuff with it. However custom you want to do that on your side So it's pretty flexible in those revocation actions Very interesting stuff indeed. 
So let me just kind of redirect us a little bit out of Keylime, which we'll come back to in a moment. Are there some other projects that are sort of in the same space and trying to accomplish some of the kinds of things that we're trying to accomplish with Keylime? I know there's sigstore, for example. Yeah, well, let me talk a little bit, or I should turn some time over to Lily to talk about the future of Keylime. Okay, well, yeah, let's talk about what's coming up. Yeah, that'll help us, and it also helps us bridge into what you're talking about, sigstore and the other supply chain projects. Yeah, so we have some exciting future work coming up with Keylime. I can definitely talk about the first one, the Keylime Rust agent, which is what I've mostly been working on. There is a small team of us porting the agent, which is the part of Keylime running on the node that's proving its state, over to Rust, which is a safe systems programming language. Rust is a very cool language, but the important thing here is that this makes it more appropriate for minimal and immutable operating systems. So cases on the edge, or cases like Red Hat CoreOS or Fedora CoreOS: we don't necessarily want to pull a bunch of dependencies and Python into that. So this Rust agent will let us bundle a self-contained binary and easily place it in those environments, which will make it more widely accessible. So that's pretty cool. And then, Michael, I don't know if you want to comment on some of these other ones. I could read them, but we're moving towards some containerization, which is kind of the exciting stuff. We are able to run this vTPM, the software TPM, in VMs right now, but we definitely also want to be able to do it in containers.
So some of the future work is moving in that direction. Right. So as the demo showed, we were monitoring a VM, because that's where IMA is focused, the Linux IMA subsystem. It's not namespaced, and if you're talking about containers, the file system and network and PID namespaces, all those things are what allow the container to exist kind of separately. So there's work being done to create an IMA namespace that will let containers have their own view of the IMA measurements, and then you could run a Keylime agent inside of there and verify the contents of the containers at runtime. There are other things with the runtime of containers, like with runc, allowing the container to talk to the TPM, bringing on operators, and all kinds of stuff that is coming up in the future. But yeah. Yeah, please go ahead. Now, you mentioned sigstore. Sigstore's an interesting project that's also coming out of our group here at Emerging Technologies in Red Hat, and it allows the software supply chain to publicly sign artifacts in a way that wasn't available before, and not only publicly sign but be publicly verified by everybody in what's called a transparency log. So this is very similar to the Certificate Transparency project out of Google, which tracks every SSL certificate that's ever issued, so that browsers can then verify and look at certificates as they're issued or revoked. And so it takes those concepts, combined with the Let's Encrypt idea of democratizing SSL certs.
So sigstore's democratizing artifact signing, essentially. So you can prove the provenance of any artifact you want in your system. And as we build these into lots of different open source projects, and lots of different open source projects are signing their stuff, all these signatures can then be verified all the way down. And then the integrations with Keylime will be really nice, because your allow list could also be a signed artifact. For instance, if Red Hat starts publishing allow lists for a certain version of our operating system, then we can publish that allow list and the signature to sigstore. Keylime can automatically find out where it is in sigstore, pull it down, and verify it, because it's publicly signed in this public transparency log. And you get all these nice guarantees about your whole system and everything moving through the system. So there's a lot of really cool opportunities there. Here in emerging tech, we're looking at one to two years out, and this is going to be an exciting space to keep watching for integration. Well, it sounds like it. There's a lot there. And just to sort of wrap things up here before we get to our sweet, sweet internet points: if somebody's interested in learning more about Keylime, I guess they can go to keylime.dev, and I think we posted that in the chat. But I know that we've actually also got some security workshops coming up. Can anybody tell us a little bit about some of those? Yeah, so, I don't know if we put it in the chat, but it's red.ht security workshops. So we have the virtual security workshop coming up in August... October 21st. It's the various hands-on exercises we have, focused on OpenShift security and Advanced Cluster Security.
So if you want to look at different use cases around these technologies, I highly encourage you to sign up. It's virtual and it's also free to attend. All right, and we also have a Red Hat Security Symposium that will be on demand? Yeah, so the Security Symposium was back in July, but it is on demand, so you can go and listen to all the different sessions we had around container security, Kubernetes security, data security, compliance. We had a lot of different sessions around this, so I encourage you to go listen to those as well. All right. Well, coming up at KubeCon, we should have a booth for Keylime and sigstore as well, so feel free to come by if you're at KubeCon in person. And there might actually be some virtual Q&A booth times as well. I don't know the details of those, but be on the lookout if you're curious and you're attending KubeCon virtually or in person, and come by and say hi to us. Yeah, good mention. There will be a solid Red Hat presence there, including myself and my colleague Henry Main, to talk about some of the certification things that we do around Kubernetes and OpenShift. And so with that, thank you very much, all of you, for joining us today. This has been incredibly informative. I feel like my head's about to explode, actually, but it'll pass. I think this is a fascinating subject and it's also a very fast-moving one, so we might have to have you back to give us an update on where things are in a matter of six months, maybe less, because that's how fast things move sometimes.
So anyway, thank you again. It is that time now for us to talk about sweet, sweet internet points. And you know, we actually have a bit of drama this week, Scott. Look what we got. Narenda, I mean, who seemed, you know, unbeatable, basically. Yeah, so all those people out there who are like, oh, I'm not going to bother, because there are so many points that I don't have: clearly, if you just keep coming, you'll get on the leaderboard eventually. Right. So keep coming, keep collecting the sweet, sweet internet points. You do that by going to one of the URLs here at the bottom of this slide, which we will also post to the chat. Enter your code, collect your points. Remember, Narenda looked beyond reach and yet is now neck and neck with an lhacl. So collect your sweet, sweet internet points and keep watching, and again, please like, subscribe and share. And thank you, everyone, for joining the Level Up Hour this week.