And welcome to another OpenShift Commons briefing. As we like to do on Mondays, we have an AMA with one of the many upstream projects. Today it's Clair, which has been associated with Quay and Project Quay as well, and we're going to get an update on that project from three of the key folks from that team: Alex, Hank Donnay, and Louis De Los Santos, who will take us through what's going on in the Clair community and give us an update on the latest release. So ask your questions in the chat; we'll have live Q&A at the end, and we'll try to answer as we go in the chat as well. Feel free to ask and post questions there. So without any further ado, Louis, take it away, introduce your cohorts, and tell us all about what's been going on in the Clair community.

Hi, my name is Louis. Today Hank, Alex, and I will be presenting the recent work we've been doing to rejuvenate the Clair application and its community. I am a principal engineer on Clair along with Hank, and Alex works on EXD Cloud in the PnT organization. In this talk we're going to go over what Clair is, why you should care about Clair, how Clair works internally, how to move from Clair v2 to Clair v4, where Clair is being used today, the v4.1 roadmap, and how to contribute, and then we'll end with an ask-me-anything.

So what is Clair? Clair is a set of scalable services for container security. It can be used by both developers and operations to understand any vulnerabilities that might affect your container builds. It's open source and community developed.

So why does Clair matter? As most of you know, we're moving to distributing applications in containers. Before this, it's more likely that you would deploy applications onto servers, with some kind of configuration management putting your application there. If those deployments had vulnerable dependencies, the splash damage was confined.
They were confined to the servers you deployed onto. Now that we're shipping containers as the default way to package and deploy applications, we're pushing those vulnerabilities to anywhere those containers run. At this point it should be apparent that your organization's posture around container security is pretty important. There's also something to be said about recent supply chain attacks like the SolarWinds hack: when you move to a centralized repository where your artifacts go for deployment, that repository can itself become subject to attack, as you're probably aware if you've been following any of that.

So, what's new in Clair v4? Previously, in Clair v2, the API was layer based: you were responsible for handling the parent-child relationship of layers, and that became a little cumbersome. So we moved to a manifest-focused API. As you can imagine, the manifest structure itself expresses the parent-child relationship of the layers, so the client no longer needs to. It's much more intuitive to deal with that schema than to handle the layer pushes yourself.

Clair v4 treats content addressability as a first-class citizen. I put a little star there because it plays a key role in Clair v4 and is a recurring concept in the rest of this talk. I was introduced to content addressability in the storage domain. What it is is a way to have a unique identifier always associated with a unique resource. To put that in concrete terms, it's a checksum of some binary data: as long as the binary bits don't change, you can be sure the checksum identifies the data. Clair utilizes this, and you'll learn more about it in a couple of slides.

We have also completely updated all the security advisory data sources we go out to, and we've moved away from doing any kind of ad hoc database parsing where we can avoid it.
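The content-addressability idea Louis describes, a checksum that uniquely identifies a blob of bytes, can be sketched in a few lines of Go. This is a toy illustration of the concept, not Clair's actual code:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// digest returns a content-addressable identifier for a blob:
// as long as the bytes don't change, neither does the identifier,
// so a layer (or a whole manifest) can be used as a cache key.
func digest(blob []byte) string {
	return fmt.Sprintf("sha256:%x", sha256.Sum256(blob))
}

func main() {
	layer := []byte("some layer contents")
	fmt.Println(digest(layer))
	// Identical bytes always produce an identical identifier.
	fmt.Println(digest(layer) == digest([]byte("some layer contents"))) // true
}
```

This is the same scheme OCI registries use for layer and manifest digests, which is why Clair can trust that a previously seen hash means previously seen content.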
Instead of JSON or YAML dumps, we now favor OVAL as the standard, and we were able to move most of the security data sources to OVAL, which is nice because we can build tooling around OVAL that makes parsing pretty quick on our end. We now have a native CLI tool; we no longer depend on external repositories to provide a CLI tool for Clair. We've completely rewritten the notification subsystem. We've also baked language support into the data model from the start, and we currently support Python. And we've completely redesigned Clair v4 as a microservice architecture with an emphasis on scale and performance. This allows operators to asymmetrically scale Clair: whether it's a push-heavy day or a read-heavy day, the functional pieces of Clair can be scaled independently to conform to your performance characteristics and trends.

So, how Clair works. This is the 30,000-foot view of what Clair does. I'll use "container" and "manifest" somewhat interchangeably; a manifest is just our JSON schema representing a container. Clair is broken up into three services: the indexer, the matcher, and the notifier. The indexer is responsible for taking a manifest, a container, and parsing out its contents, typical things like packages, repositories, and which distribution the container represents. It places those contents into a report we dubbed the index report. The matcher, on the other hand, is responsible for generating reports of flagged content. When the matcher gets a request saying "hey, does this manifest have any vulnerabilities affecting it?", it goes to the indexer and asks: "do you even have an index report for this manifest? If you do, can I have it?" When it receives the index report, it goes to its own database and checks whether any of that content (packages, distributions, repositories) is flagged.
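The two reports mentioned here, the index report and the vulnerability report, might look roughly like this in Go. The field names below are illustrative approximations for the sake of the explanation, not claircore's exact schema:

```go
package main

import "fmt"

type Package struct{ Name, Version string }
type Distribution struct{ Name, Version string }
type Vulnerability struct{ Name, Severity, FixedInVersion string }

// IndexReport is what the indexer produces for a manifest: the
// packages, distributions, and repositories found in its layers,
// keyed by the content-addressable manifest hash.
type IndexReport struct {
	ManifestHash  string
	Packages      map[string]Package
	Distributions map[string]Distribution
	Success       bool
}

// VulnerabilityReport is what the matcher returns: the indexed
// contents plus the vulnerabilities matched against them.
type VulnerabilityReport struct {
	ManifestHash    string
	Packages        map[string]Package
	Vulnerabilities map[string]Vulnerability
}

func main() {
	ir := IndexReport{
		ManifestHash: "sha256:deadbeef",
		Packages:     map[string]Package{"openssl": {"openssl", "1.1.1"}},
		Success:      true,
	}
	fmt.Println(ir.ManifestHash, ir.Success)
}
```

The key design point survives the simplification: both reports hang off the same manifest hash, so the matcher can always ask the indexer for contents by hash alone.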
It takes all that flagged content, places it into a final vulnerability report, and returns that to the client. The notifier just hangs out down there: it watches the matcher and the indexer, identifies new security updates coming into the system, and asks the indexer, "hey, these vulnerabilities just entered the system; do they affect any manifests I care about?" If they do, it sends a notification to any subscribed clients.

So let's dig into indexing a little deeper. Indexing is the term we use for extracting the contents of a container. A quick explanation of what a container is in the context of Clair v4: a container can be viewed as a content-addressable hash with layers inside, and those layers are content addressable as well. The hash here represents the top-level manifest, and the subsequent layers you see are also content addressable. The order of the layers matters, and again, that ordering is expressed in our new manifest API.

So how does indexing actually work? The first thing that happens is Clair needs to decide whether the manifest should be scanned at all. When a manifest is submitted, it checks: "have I ever seen this content-addressable identifier, this manifest hash, before?" In this case it hasn't, so it says, OK, yes, this should be scanned; on to the next phase. Next it determines which layers it actually needs to scan. We show a common case here, where the base layer might have been seen a million times by Clair. So it says, "I don't need to scan this one, I've already seen it," but the subsequent layers it has never seen before, so it marks those: "yes, we need to scan these; I have no content available for them." To perform the scan, the base layer's contents are simply retrieved from the database; no work is performed, it's just a get from the database.
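That scan-or-skip decision can be sketched like this. The helper names and in-memory map are hypothetical stand-ins; real Clair persists layer results to its database:

```go
package main

import "fmt"

// seenLayers stands in for Clair's database of already-scanned layers,
// keyed by content-addressable layer hash.
var seenLayers = map[string][]string{
	"sha256:base": {"bash", "coreutils"}, // a base layer scanned before
}

// scanLayer stands in for the expensive fetch/decompress/scan step.
func scanLayer(hash string) []string {
	return []string{"pkg-from-" + hash}
}

// index walks a manifest's layers, reusing stored results for layers
// already seen and scanning only the new ones, then coalesces
// everything into one result for the manifest.
func index(layers []string) []string {
	var contents []string
	for _, l := range layers {
		if pkgs, ok := seenLayers[l]; ok {
			// Cache hit: just a get from the database, no work done.
			contents = append(contents, pkgs...)
			continue
		}
		// Cache miss: do the real scan, then store the result.
		pkgs := scanLayer(l)
		seenLayers[l] = pkgs
		contents = append(contents, pkgs...)
	}
	return contents
}

func main() {
	fmt.Println(index([]string{"sha256:base", "sha256:new1"}))
}
```

Because the manifest hash changes whenever any layer changes, a repeated manifest hash is enough to prove every layer under it has already been processed.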
The subsequent layers are actually scanned. That's the process of fetching the layers, decompressing them, looking for package databases, and identifying the characteristics of the individual layers. Finally, Clair coalesces all those results into a final index report and saves it. So the next time around, if it sees that 604ac... manifest hash again, it can say, "yes, I have this data, we already scanned this." By looking at the manifest hash alone it knows it already scanned all those layers, because if the layers had changed, the manifest hash would have changed.

Cool. For the matching part I'm going to hand it over to Hank, who will explain this further. Just give me a small cue when you want me to change the slides. Will do. OK, so now that we've got this index of the contents, and it's content addressed, we obviously don't need to redo that work, because it won't change. But what does change over time is the vulnerabilities: new vulnerabilities are discovered all the time. So the matching process handles this more rapidly changing data. Next slide.

The matching bit has two parts to it: the matchers and the updaters. The updaters feed data into the system, turning it into our common format, and the matchers use the data the updaters provide. The matchers take the index report, and all of them, in parallel, examine it for the sorts of data they know about. In this example we have bits of code all running at the same time: one looking for RHEL packages, another looking for Ubuntu packages, another looking for Alpine packages. Most of them won't find anything, and the ones that do return their results. We combine all of them together and return the report. Doing it this way means we can just additively add bits that look at new kinds of content.
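A minimal sketch of that fan-out, using toy matcher types rather than claircore's real interfaces:

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// A Matcher knows about one ecosystem (RHEL, Ubuntu, Alpine, or a
// language like Python). Interested reports whether this matcher
// cares about a given package.
type Matcher struct {
	Name       string
	Interested func(pkg string) bool
}

// match fans the same index-report contents out to every matcher in
// parallel and combines whatever each one returns. In the common case
// only one matcher finds anything; the rest return nothing.
func match(matchers []Matcher, pkgs []string) []string {
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		findings []string
	)
	for _, m := range matchers {
		wg.Add(1)
		go func(m Matcher) {
			defer wg.Done()
			for _, p := range pkgs {
				if m.Interested(p) {
					mu.Lock()
					findings = append(findings, m.Name+":"+p)
					mu.Unlock()
				}
			}
		}(m)
	}
	wg.Wait()
	sort.Strings(findings) // deterministic order for the combined report
	return findings
}

func main() {
	ms := []Matcher{
		{"ubuntu", func(p string) bool { return p == "openssl" }},
		{"alpine", func(p string) bool { return false }},
	}
	fmt.Println(match(ms, []string{"openssl", "bash"}))
}
```

The additive property Hank mentions falls out of the structure: adding language support is just appending another Matcher to the slice, and every OS gains it for free.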
So, for example, our language support is implemented as a box on this second row here, and any OS automatically gains that support because of doing it this way. These results are all then kept under the same hash as the index report, so we gain the same content-addressability benefits there. Next slide, please.

And updaters: on whatever interval an operator has configured, an updater checks whether there's a new version of the data source it cares about, and if there is, it fetches it, parses it, and loads it into the database for you. Any new matching requests will then use that version of the data. We know internally whether a match has been run against the latest version in the database or a previous version, and if it hasn't been run against the latest one, it'll run again and save the results for you. So even if the updaters run on a staggered interval, you'll always have the latest version of the results, and you won't redo work you've already done. Next slide: on to notifications.

Like I was saying, because we keep track of the versions of the security databases as we update them, we can internally figure out what's been added or removed, and then work backwards to figure out which manifests we've seen were affected by what's been added. So the notifier effectively diffs these two versions of the database, discovers the new vulnerabilities, looks up the affected manifests, and then takes some configurable action to issue notifications, whether that's firing a webhook or pushing to a message broker like AMQP or STOMP. Next slide, I believe. Oh, sorry, I skipped over a whole bunch of topics without cueing slide changes. Here we go: moving from Clair v2 to Clair v4. Clair v2 is the most widely used version out in the wild; it's what powers quay.io.
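Before moving on to the migration story, the diff step the notifier performs between two database versions can be sketched like this. It's an illustrative reduction, not Clair's actual notifier code:

```go
package main

import (
	"fmt"
	"sort"
)

// diff returns the vulnerabilities present in the new version of a
// security database but absent from the previous one. The notifier
// works backwards from this set to the affected manifests it has
// seen, then issues notifications to subscribed clients.
func diff(prev, next map[string]bool) []string {
	var added []string
	for name := range next {
		if !prev[name] {
			added = append(added, name)
		}
	}
	sort.Strings(added) // stable order for downstream consumers
	return added
}

func main() {
	prev := map[string]bool{"CVE-2021-0001": true}
	next := map[string]bool{"CVE-2021-0001": true, "CVE-2021-0002": true}
	fmt.Println(diff(prev, next)) // only the newly added vulnerability
}
```

Keeping both versions around until the diff is taken is what lets the updaters run on staggered intervals without notifications being missed or duplicated.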
In the process of building the new version, we ended up rewriting just about everything. The API is completely different (again, it's manifest driven, not layer stitching), and all of our internal data is completely different, so you do have to resubmit images to Clair. But it should be faster: a little initial pain and a win on the back end. And now I'm going to hand it off to Alex.

OK, hello everyone. My name is Alex, and I'll tell you something about how we use and release Clair at Red Hat. Next slide, please. As you might know, and as you've already heard, Clair is an open source project for container security scanning. But it's also part of the Red Hat Quay product. Red Hat Quay is a container registry for the enterprise that provides a solution for storing and building container images, and since security is an essential part of the container workflow, Quay has an integration with Clair. It provides information about vulnerable content in both the user interface and the API. So if you either build or push a container into the Quay registry, Clair indexes the image in the background and produces the index and vulnerability reports described earlier. Those reports are visible in the user interface, and users can check whether their container is vulnerable and whether they need to rebuild it to pick up the latest versions. Quay used Clair v2 up until Quay 3.3, where Clair v4 became available as a tech preview, and it becomes the default container security tool moving forward: Clair v4 is released as GA in Quay 3.4, and Clair v2 is deprecated. Next slide, please. Clair is also available in the public instance of Quay known as quay.io, where Clair v4 will become the default scanner at the end of March 2021. Next slide, please.
And Red Hat not only releases Clair and offers it to customers; we also use it internally, integrating it into our container release pipeline. This means that every image produced by Red Hat or an ISV partner is scanned using Clair. The release process is driven by the results: if we discover a vulnerability in the report, we don't ship the container; instead we rebuild it, because we don't want to ship vulnerable content to customers. Our container security information is also publicly available in the Red Hat Container Ecosystem Catalog, represented by a so-called freshness grade, which is a simplified, graphical representation of a container's security score on a scale from A to F.

Previously at Red Hat we used a custom solution for container scanning, but since it was completely internal, our customers didn't have a way to reproduce its grades. In 2019 we replaced the custom solution with an unofficial version, Clair v3, and last year, in October 2020, we upgraded to the current latest version, Clair v4. Our instance of Clair is deployed in an OpenShift cluster at production scale as three separate services, one for the indexer, one for the matcher, and one for the notifier, which allows us to easily scale those services up or down based on load. In the past three years Red Hat has built over one million images, and currently our production Clair v4 database has indexed over 400,000 of them. OK, now let's hand it back over to Louis.

So let's go over what we expect from Clair v4.1. Clair v4 hit GA a couple of weeks ago, so this is a roadmap for the next minor version. We'd like to expand language support quite a bit; I believe there are already some plans for Java, which is going to make its way in via another collaboration with CodeReady.
We'd like to address Go, and definitely get some JavaScript, the Node/npm ecosystem, in there. We're currently plugging away vigorously on a Kubernetes operator, which should make deploying Clair, in the various architectures you can deploy it in, pretty trivial to set up. We're going to put an emphasis on performance and scale analysis: we really want to know how well Clair scales in the event of excess, trendy, and bursty traffic. So we'll be putting a lot of effort into that, plus general reliability efforts: things like making sure database connections don't flood the downstream database, and making sure we surface the right errors so clients can react to the various states of Clair.

As for contributions: we're going to begin community development meetings at a bi-weekly cadence starting in March. I'll have a link to those meetings in a subsequent email along with a mailing list blast and the other ways we disseminate that information. There's a quick link to our mailing list; I'd go follow it if you're interested in community development, and further OpenShift briefing meetings will definitely be posted there. We'll also be placing a lot of information in the repository, which is quay/clair; in its Discussions area we'll be posting a lot about how to interact with the community moving forward. There are a couple of contacts here if you need to reach out to us: our emails, plus a general Quay email. So that kind of wraps it up; I hope that was informative and filled you in on the work we've been doing on Clair v4.

Well, thank you very much, guys, for joining today. I'm wondering, Hank, if there's a repo for the Kubernetes operator, whether there's an ETA for it, and whether you need people from the community to test it. Not yet; it's still in the half-broken, building phase.
Well, get ready for that, and let me know: there are a lot of folks we can reach out to and get to test that for you, on different platforms and different setups. Yeah, we will. Our ultimate goal is to make it fold seamlessly into the Quay operator: the Quay operator pulls it in transparently behind the scenes, and you get all the scalability benefits of something that understands Clair's workload, but it just comes along for free with your container registry.

And Louis, you were talking about performance and scale analysis and trying to get a better grip on how Clair is doing out there in the wild. If there are folks who want to work with you on that or give you feedback, what's the best place to reach out; is it that Clair mailing list? Yeah, definitely. That would be really great if it piques anyone's interest. They can contact us in a couple of places: posting on the mailing list will get our attention, GitHub issues or GitHub Discussions will get our attention pretty immediately, or just shoot us an email and we'll take a look. There are a couple of options right now, and all of them should be pretty viable.

On the in-house use: Alex referenced the million-plus images built inside Red Hat. I think we've created more than that; personally I think I might have made a whole bunch of messy images that should be scanned. But that's growing into other organizations that can utilize performance and analysis metrics and give us feedback. Is there any telemetry being added into Clair to collect that from external users? Yeah, I think Clair is capable of sending metrics to Jaeger.
It's a built-in feature, and of course there are things like making logs available in a form you can easily build a dashboard out of. For example, in our use case we use the ELK stack together with Kafka, so we're able to see some metrics directly in a Kibana dashboard.

And I'm going to read out one of the questions; I know you answered it in the chat, but I'll read it out for people who might be watching the video later. Will the expanded language support for Java, etc. include sources outside of Red Hat, i.e. Maven Central, or will it be primarily focused on libraries that Red Hat supports? And Louis had a nice answer to that. Yeah, definitely. Clair takes an open source approach to begin with: all the data we collect is independently managed upstream, and the reason we do that is that the most accurate data is the source of the data. So we go out to Red Hat's own security data for Red Hat content, and to the Ubuntu security tracker for Ubuntu data. We're pretty adamant about not using, quote-unquote, closed source or aggregated data; we really want to keep consistency and a level of accuracy with the sources. So, the long-winded answer to your question: definitely, we will go out to Maven Central or any of the official sources we can find, as long as we can grab the data in an open source way. Any kind of closed source data source would open up more of a discussion around how that could work, but that hasn't happened yet. For me it's always annoying to find a piece of software that looks like it does exactly what I want, only to find out there's some hurdle in the way and you need to go get a key, plug it in, and use it. We don't want to do that; we want to make sure everything we ship works by default.
All the data is there: every checkbox in our README, you can go do without having to sign up for another service.

One more question in the chat that's probably top of mind for a few folks: how would you define the relationship between Clair and StackRox? Are they mutually beneficial, or is one a superset or subset of the other? We just had the KubeLinter folks on last week, I think, so it's probably topical. Yeah, I can talk about what I know so far. I know there are no official talks about how the teams will work together, but they utilize Clair v2, so we've already started discussing at a high level what Clair v4 can offer them. I imagine there will be a pretty tight collaboration in the future; it's a little above my pay grade to determine those things, but as far as I can tell, they utilize Clair in their product and they'd be interested in Clair v4. So we should see more of a merging than any kind of separation or schism between the products; I imagine they'd fall in line pretty well, since the underlying technology they use for at least the scanning portion of their product is the same. So it works out pretty well. There's another relationship where we can get more performance and scalability feedback from people using it, another whole chunk of the ecosystem, plus more engineering resources hopefully coming our way to work on this, and that's always a good thing. That's the most exciting part. That's always the most exciting part.

One other earlier question I just wanted to read out for anyone watching this who wasn't in the chat: Murphy's question on whether partners can build custom matchers using Python. You had a nice answer for that as well. Yeah, definitely.
We support remote matching, which means that in the process of examining the data the matcher service received from an index report (not to get too technical with terminology), it can take a look at those contents and, instead of consulting only its own internal database, go out to an API. So anyone can write that API, in any language you'd like. If you want native matching support, there are obviously some advantages there, but it has to be written in Go at this time, and it will need to reside in the Clair repository. We are working on possibly allowing out-of-tree matchers, matchers that live in other repositories. But if you're interested in cross-platform work, looking into our remote matching might be applicable to you.

I don't see any other questions in the chat. You said you might have a little demo to give us; if you're up for that, that would be great for those of you who haven't played with Clair. Let's try that; give me half an iota of a second. There we go. OK, cool. So what I have here is... pause for one more second, there's one more question: will Clair v4 be pushed to the Container Security Operator, and if so, when? As I understand it, the Container Security Operator is written against Quay's API, so it depends on having Quay, and the latest version of Quay ships with Clair v4 as its security back end. Thanks. All right, let's get the demo done.

Cool. So what I have here is a local Clair environment. You can do this today if you want: pull the Clair repository and run this command, make local-dev-up; just make sure you have the necessary dependencies. We won't go into that right now, it involves Compose, but you can look at the repository for the details. What it does is set up Quay and Clair locally on your machine, with Quay living at localhost. So what we can do is go over here. I've already logged in; I created an admin account.
And I created this organization, clair-v4-org. When you're using the local environment, you can use podman to log in and push an image to the local registry. You'll want to specify --tls-verify=false because it's a local environment; we just didn't bother to wire in SSL, it only gets in the way. So we do a login, and that's at port 8080; this logs into the Quay instance running here. And I have a container, it's just ubuntu:latest, and we can go ahead and push it. You'll notice that you have to tag images with localhost:8080, and then we can push to the local Quay instance. OK, it just barked at me for a bit, but it looks like the push is successful now. So we'll go over here, and we have a new repository called "testing". If you go to this tags area, you'll see that we have successfully scanned (let me make this bigger, I'm sure it's quite small right now) the image we just pushed; Clair was able to scan it, and you can dig into the security information via this GUI.

Yeah, if you'd like, while Louis pulls this up, we can give you a quick overview of what Clair actually did. Again, one of the cool things is that because everything is driven off of content addressability, and Clair already did the work of scanning this base image once, for any new container that shows up only the new layers need to get analyzed, which makes it a bit snappier. So this thing is pretty quick. If we just scan through the logs right here (you have to be acquainted with the application a little to follow them), the indexer is really going: it's doing an analysis on the layer, looking for an os-release file. In this case a given scanner couldn't identify one; the scanner for the AWS distribution says "I can't find an os-release file, I'm just going to continue."
So yeah, the indexer is doing this itself: it's grabbing the layers, identifying any items it can, and then storing the index report. And then the Clair matcher is really just going ahead and issuing these matches. As you can see, it's an Ubuntu image, so the Ubuntu matcher is the one that's interested. This corresponds to what Hank mentioned in the talk: most of the matchers aren't going to care, because it's not like you're pushing some kind of hybrid between an Oracle and a Photon container. So it's nice, because even though we're fanning out, we're not really doing a lot of work; only one of the matchers actually winds up doing work in the common case. So this is the matcher evaluating the pushed content; this process returns a vulnerability report, and that vulnerability report is pretty much just parsed by Quay and presented here.

So yeah, you can do all this right now if you want: go pull the Clair repository and play around with it. You'd basically just run this, and it will deploy a Quay instance at localhost:8080. If you have any trouble with that, we've been doing a new thing where, if you have quick support questions, you drop them on Discussions, and that seems to work pretty well; we can mark things as "yes, this is the correct answer," and it's been a nice place to put anything support related that isn't a bug. If it's a bug, you should file an issue; either way we'll triage it and work with you to understand why your environment might not be working. But yeah, that's Clair v4. Hope you liked the demo.

Loved the demo. How about you throw up the resources slide one last time, so we end on that and people know where to find you again. And I don't know if anyone else on the call, Daniel or Bill or anyone else, thinks there's anything we missed that we should have covered.
I think you did a great job here, guys. I'm looking to see if there are any other questions coming in from chat, but I think we're good. And hopefully there are a bunch of folks out there in the universe listening to this who will come to the community development meetings starting soon. We will post this video up there along with the slides, if you share them with me, and tweet it out. As always, great stuff from the Clair team, and we're really grateful for all the work that you guys do. So thanks for coming today and sharing all of this with us. Thanks a lot. All right, take care, guys.