March 23rd, it's the Clair community development meeting. I'm going to share my screen now to show the current agenda. You guys see it all right? Yep. All right, cool.

So a couple of participants today: myself as always, Hank probably as always. Then we have Ivan and Arun. Ivan, you want to introduce yourself real quick? Yeah, sure. My name is Ivan, I'm a Quay/Clair support engineer working from EMEA, currently in Ireland. Hopefully that will change sometime in the future. Thanks for inviting me to this meeting. No problem. And then Arun. Hey, hello, guys. My name is Arun, based out of Bangalore, India. I'm mainly working on the Red Hat DevTools analytics team. We're building a software composition analysis platform, so mostly you'll see me working on various integrations, integrating our SCA platform with tools like the VS Code extension, IntelliJ, and Clair. That's what I do. Cool.

So a couple of things on the agenda today. I'm going to go over the enrichment specification, moving to non-blocking Clair initialization, and better integration testing. Then a question that I wanted to bring Ivan into, because it's more of what I'd think of as an operations concern, so I wanted to get your opinion on that, Ivan; that's one of the reasons I wanted to pull you in. Then Hank's going to go over some changes to the notifier and JSON work, Ivan wants to talk a little bit about CentOS, and Arun's going to introduce the remote matching concept he added to Clair.

So first things first, the enrichment specification. We have this repository, the Quay/Clair enrichment spec, and this is where we're housing the specification for enrichments. In the initial design of Clair we spent more time making sure the matching was accurate, and we made a somewhat conscious decision to ignore metadata. As you know, there are some valuable use cases around metadata: data from NVD, and extending the concept further, Red Hat container grades and other scoring mechanisms for understanding not just how severe particular vulnerabilities are, but how vulnerable a container is as a whole, what its grade is. So we want to hit those two concepts; we want to bring them back into the equation.

So we've been working on this specification, the Clair enrichment specification. I'm not going to dig through the entire thing, but I'll try to give you a quick overview of how it's going to work. At the last step here, we're going to add an enrichments field to the vulnerability report. That's basically the schema you get back that expresses which packages were found in the container and which vulnerabilities affect them. There will now be an enrichments field: a map from a string key to an array of raw JSON objects. When your client tooling deserializes this, it can handle arbitrary schemas, because we're just giving it raw JSON, basically a blob of text. Now, in the string key of that map we encode some information, so that when the client sees the key it can use it as a hint to understand exactly what schema the metadata is in. So if I'm a client looking at the vulnerability report, I'm going to go look at the enrichments field.
I'll look at the key, and the key is going to give me, if it can, if the schema data is available somewhere, a hint about that. This gives us a middle ground where Clair doesn't really have to care about the schema of the metadata, which I know, Hank, was one of your big goals, but it can still inform the client of exactly how to interpret this metadata.

The next obvious question your brains are probably asking is, okay, how does that key work? Inside this topic, the MIME type usage, we define exactly what that key looks like, the one the client's going to find on the enrichment data map. I won't dig too much into it because it's all here; you can read the specification. It provides a MIME type, well, I guess it is a MIME type, we follow the same structure. It expresses the container format, which I'll get to in a second; the enricher, which is the updater-like component inside Clair that actually put this data into the database; and a schema. Now this is the important part: when the client is looking through the enrichment data, if it finds a schema, that's great, it can go right out to the internet, grab the JSON schema file, and understand the metadata. We have a mechanism here for when the schema is not hosted anywhere, when there's no place to retrieve it from: you can place a well-defined type there instead, which really just means you've created a type, you've documented it in our documentation, and the client can go to the ClairCore docs and look up that type to understand how it works. And if no schema data is placed there whatsoever, then the client is probably something like a dynamic language, Python, saying, hey, I don't even care about the shape of this data, I'm just going to forward it to a GUI or something like that.

Now you might be wondering about this portion right here. This is a concept where we define container MIME types, and what it allows is basically wrapping the NVD data with an associated vulnerability ID. We do this because most of the time, when you're looking at enrichment data, you want to map it to a vulnerability, so this is just that glue work. If I'm a client and I see this MIME type, I know the data is going to be wrapped in a map, and I'll be able to say vulnerability 18 has this metadata associated with it. That's just how we link it together.

The rest of the spec is mostly plumbing. It basically piggybacks on all the already implemented updater work, the updater business logic and whatnot. We already do this with vulnerability data, so now we're just extending the concept to enrichment data. We call it enrichment because we're enriching the vulnerability report with auxiliary data. So that's the specification. Inside here I started working on the implementation; this is the nitty gritty. It will probably get parsed out into actual tickets in our ticketing system, and it can be read to understand literally the code changes we're going to make to make this specification work. This is still in a bit of review, but I think we're getting it into a pretty good place.
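To make the shape of that data concrete, here's a minimal sketch of what the enrichments field and a client reading it might look like. This is an illustration only: the field names, the example MIME-type-style key, and the cvss type are assumptions for the sake of the example, not the exact types from the spec or from ClairCore.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// VulnerabilityReport is a pared-down sketch of a report carrying enrichments.
// The enrichments field maps a MIME-type-style key to raw JSON blobs, so Clair
// itself never has to understand the metadata's schema.
type VulnerabilityReport struct {
	Vulnerabilities map[string]json.RawMessage   `json:"vulnerabilities"`
	Enrichments     map[string][]json.RawMessage `json:"enrichments"`
}

// cvss is a hypothetical client-side type for one kind of enrichment payload.
type cvss struct {
	BaseScore float64 `json:"base_score"`
	Vector    string  `json:"vector"`
}

func main() {
	// Example report body; the key is a made-up MIME-type-style hint that names
	// the container format, the enricher, and where to find the schema.
	body := []byte(`{
		"vulnerabilities": {"18": {}},
		"enrichments": {
			"message/vnd.clair.map.vulnerability; enricher=example.nvd schema=https://example.com/cvss.schema.json": [
				{"18": [{"base_score": 7.5, "vector": "AV:N/AC:L"}]}
			]
		}
	}`)

	var vr VulnerabilityReport
	if err := json.Unmarshal(body, &vr); err != nil {
		panic(err)
	}

	// A client that recognizes the key can decode the blob; the container MIME
	// type tells it the payload is wrapped in a map keyed by vulnerability ID.
	for key, blobs := range vr.Enrichments {
		fmt.Println("enrichment key hint:", key)
		for _, blob := range blobs {
			var byVuln map[string][]cvss
			if err := json.Unmarshal(blob, &byVuln); err != nil {
				continue // unknown schema: forward it along untouched
			}
			fmt.Printf("vulnerability 18 metadata: %+v\n", byVuln["18"])
		}
	}
}
```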
But I do suggest that anyone who's interested in the current situation, where we have missing severities for particular vulnerabilities, and in why that is and how we're going to fix it, should go take a look at this Clair enrichment spec. This is slated for Quay 3.6; I'm assuming it'll land upstream before then. So that's the enrichment spec. Keep in mind it's currently being developed, and the whole purpose is to bring metadata back into Clair's results.

Cool, so let me get back to the agenda: the move to non-blocking Clair initialization. Currently in Clair v4, the stable release, the releases we ship with Quay, the matcher completely blocks until everything is updated, or at least until we've run a full update interval. Obviously there are some downsides to that: we blocked everything, we didn't even return health checks, the entire thing locked up. So we're changing that. After spinning on it and some group discussion, we're moving to the stance where Clair does not block, but Clair will return a particular HTTP status code if it's not initialized.

The history here, and the way we even got here, is that there have been a lot of tickets around bad responses in Clair v2: responses where the database isn't initialized yet, so it says, hey, everything's okay, but it wasn't, there was just no data in the database yet. That was one of the first tickets we got, and Ivan, I'm sure you're aware of that issue too, because it's an ongoing question of how to do this correctly. So where we landed is that it won't block, so you can actually use the service, and we'll return an HTTP status code such that a client can keep asking for vulnerability reports until it gets a 200. Now, if your client doesn't really care that much about accuracy, it can take what it gets, because while we're updating we'll still return vulnerability reports, they're just not complete data. So now you have the option: you can either take what you get right away, knowing you might want to request it again a little later, or you can have the client sit there and wait for a 200 and then take the vulnerability report. That's how we're approaching the problem, all in the effort of informing the client that, hey, these results might not be completely correct, we haven't finished initializing the database yet.

So Hank, you worked a little bit on that. Did I get that mostly correct? Mostly. We still won't return partial results, so that part's not correct, but we will start serving API requests right away. We'll just sort of swallow your request and return a non-200 response. And then I think as part of this effort, for the next release, we'll probably add an explicit readiness probe at some point. Cool, yeah. So we won't give partial vulnerability reports, we'll only return that status. What is that HTTP status? I set it to be 202 Accepted. Gotcha. Yeah, which is like, hey, you made a request and I know it's a good request, I'm just not going to do anything with it yet. Okay, all right, that's cool, that's what's happening there. Gotcha.

I do have a question. Yeah? So what's the upside of this? I mean, the client still will not get any vulnerability data back, so functionality-wise it doesn't make any difference between what's going on now and what will be going on in the future.
The upside is that any sort of monitoring system that cares about whether the API port is up and accepting traffic can now do that, because we'll start serving traffic immediately and not wait for an entire update loop to run. Yeah, I think the premise really came from the fact that right now we have the TNG operator; correct me if I'm wrong, Hank, but the TNG operator just blocks until Clair is available, and Clair doesn't become available until we've run what could be a rather long update interval. So we wanted to skate around that problem but also be able to tell clients, hey, this request isn't valid yet, we haven't completely initialized everything. This is especially a problem when you're running in combo mode, right? Because it would block the other services from starting as well. Is that correct, Hank? Like you wouldn't even be able to index. Yeah, it wouldn't even start the HTTP server until everything was good. This makes it so you can actually do at least some useful work with it as soon as possible.

Yeah, the TNG operator actually blocks because the validation won't go through until Clair responds with something; that's why it blocks the whole deployment. That can be circumvented by just saying, hey, don't validate. And I'm also thinking that maybe health checks should not be tied to whether Clair does or doesn't serve traffic; maybe check health, pod health, in a different way. Yeah, I mean, I think that's just an artifact of the way the actual Kubernetes manifest is written right now, because we do serve a health check in a different way, and that comes up immediately, it's just not being looked at. Yeah, but even so, before your changes, that introspection server was blocked, right? No. Oh, I thought it was. Okay, never mind. No, that spun up immediately. But it's more that, I don't know, not to talk shit, but I told Alec what the differences were and then he said, yeah, I want to look at serving API responses.

But I'm also thinking that if we have a health check, the health port 8089 for overall Clair health, we can implement something similar to what Quay does. Quay returns a JSON document that says these components are alive, these components are not alive. If the health endpoint returned something like that, it could be interpreted, and then Quay, and any other registry that's hooked up to Clair, could know when to start sending data across, and that wouldn't block the validator either, because we'd see that Clair is up, that's fine, so Quay can continue bootstrapping, but we won't send any manifests to it until the health endpoint says, hey, we are now available because the database is updated. Yeah, I definitely want to get something like the Kubernetes API health endpoints, something that does that kind of thing, but we just haven't gotten around to it or needed it yet. We have all the plumbing set up to inject a health check of arbitrary complexity; we just haven't gotten around to actually writing that health check of arbitrary complexity. So it's all there, it can be applied. But I'm not sure the concept we have of a non-blocking start and your health check concerns are mutually exclusive; I think they basically live together. Yeah, they're definitely not mutually exclusive. This is like a first step towards having it work that way.
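As a concrete illustration of what the non-blocking behavior means for a caller, here's a minimal sketch of a client that polls the matcher until it stops getting 202 Accepted. The URL path, port, and retry policy are assumptions for the example, not something prescribed by Clair.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

// fetchVulnerabilityReport keeps asking for a report until Clair reports it is
// fully initialized (HTTP 200) or the retry budget runs out. While the updaters
// are still running, the new behavior is to answer 202 Accepted instead of
// blocking the whole process at startup.
func fetchVulnerabilityReport(baseURL, manifestDigest string) ([]byte, error) {
	url := baseURL + "/matcher/api/v1/vulnerability_report/" + manifestDigest // assumed path
	for attempt := 0; attempt < 30; attempt++ {
		resp, err := http.Get(url)
		if err != nil {
			return nil, err
		}
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			return nil, err
		}
		switch resp.StatusCode {
		case http.StatusOK:
			return body, nil // database initialized, report is complete
		case http.StatusAccepted:
			// Request understood, but Clair hasn't finished its first update yet.
			time.Sleep(10 * time.Second)
			continue
		default:
			return nil, fmt.Errorf("unexpected status %d: %s", resp.StatusCode, body)
		}
	}
	return nil, fmt.Errorf("clair still initializing after retries")
}

func main() {
	report, err := fetchVulnerabilityReport("http://localhost:6060", "sha256:deadbeef")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("got %d bytes of report\n", len(report))
}
```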
But yeah, I think we all, we definitely have plans to get there. I do know Quay's health check is actually pretty nice, so I think we'll use that as inspiration. But yeah, I totally agree with you, it'll get more granular as we move on. Cool. Well, that's just a heads up to watch for that, because it does change the behavior of Clair a little bit. If you have any mechanisms sitting there assuming it will block, it no longer will, so you'll have to actually check.

Okay, better integration testing. Right now we have a pretty poor testing story. We build and run tests, but we don't really do much in terms of verification, and especially not comparing against previous builds. So I was thinking about ways to attack this problem, and the first thing that comes to mind, the simplest thing, is that we have a local development environment, and GitHub Actions allows for Docker Compose, I think; I have to confirm that. But it's possible we just run the local development environment inside GitHub Actions and then have some kind of comparison logic. Hank, we talked a little bit about creating a testing harness around this, right? A little bit. So that would basically just be an executable that knows the local development environment is up, right? More or less, yeah. And then we'd have to see; that's the direction I'm thinking of going. Yeah, I think the actual machinery to spin up everything we support and configure it to talk to each other isn't, I don't know, super interesting. It's more about what it looks like, what the actual tests are that we want to run, because right now it's pretty fuzzy what a successful run looks like. It's very easy for a human to classify whether a run is okay or not, and not very easy for a computer.

So one of the things I was thinking about: say we started the testing system tomorrow, it has no data, just run with me here, conceptualize this with me. It generates vulnerability reports and index reports and caches those reports somewhere. The next time we run it, it checks against the last build to confirm things look the same. The onus is on us to make sure that first run is correct, or at least as correct as it can be; there could still be bugs, but that's just how it is, we'd have to identify those another way, pure comparison isn't going to be enough. I'm trying to scope this small to begin with: just, hey, this thing looks different from the old one, is something wrong? So we create this timeline where we start tests on one day, we seed an index report and a vulnerability report, and then, every, maybe we'll have to figure out when it actually runs, on merges or releases or whatever, it looks at the previous one and does a comparison. And if it doesn't look right, we need some kind of tooling in GitHub that says either it's not right on purpose, or it's not right, fail this. Which I haven't really conceptualized either, because I'm not sure if there are user input steps in GitHub Actions yet. Yeah. But does that sound like a general first step towards better testing? Yeah, I think that would be okay. We'll probably need to come up with, I guess our test harness will have the comparison function. Yeah, exactly.
I don't think we'll be able to just directly compare things. Absolutely. Yeah. So we'll have to write whatever that equality function is (there's a rough sketch of what that might look like below), but that sounds good. I think we can have that run. We might want to split it in half and have one part run against a fixed set of images, a set of containers that we've pushed up to our own Quay repository, to handle regressions, and then some that pull live containers to handle changes that are actually happening, if that makes sense. Definitely, live containers to catch any new bugs basically. Yeah, I get what you're saying, because we'll know exactly the differences in the managed containers, but whenever we go with live containers there's a little bit of concern; it is what it is. I mean, if we pick containers that don't shuffle their tags around too much, ones that are pretty dependable, it should work fine. We can also set up cron jobs to pull things locally so that it doesn't break CI. Yeah, that makes sense.

Okay. So I think with this strategy there's a lot of room for testing Clair, especially because Clair's a little complex to test, since it defers so much work. That becomes the pain, because as soon as you index something once, you're caching. Yeah, we might want to implement all of the cache-busting flags that we keep talking about every so often on a tangent. Well, not so much a tangent, but I was thinking, what if we just literally instrument an API that shows you what's been indexed and also has a delete, just make it a little simpler? Yeah, we could do that. I think this is the second time this has come up, and it makes me a little wary that someone might want that to be behind an authz system. Can we punt on that, though? I mean, we have JWT built in, so we have something. That's just authentication; it's not saying, hey, you can talk to the API at all, versus you can talk to the API and you're allowed or disallowed from doing certain things. I mean, we could start checking claims instead of just verifying the JWTs, if we wanted to go down that route. But then we'd need to be able to specify multiple ones, multiple ones with different power levels, and I don't know. I think if we do this, we should just put it on the debugging side, on the introspection port, to start with. Okay. Because that way you have to do shenanigans on the control plane to be able to talk to it at all, which is easy enough for development. Okay. All right.

Yeah. So we're splintering a little bit, but I'll make that another topic soon, which is: how do we start busting the indexer caches so we can actually do things repeatedly? Because it's just a pain in the ass to test right now without that; I just dump the database and start the database, dump the database and start the database. Sometimes I've run truncate commands and queries that I then lose because I don't keep them around. So okay, we'll make that an agenda topic coming up. But this is pretty good. I'll start at least conceptualizing how to go about those comparisons.
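For reference, here's a minimal sketch of the kind of comparison function the test harness discussion above is pointing at: load the report cached from the previous run, take the fresh one, and flag differences in the vulnerability sets per package. The report shape, file layout, and comparison rules here are assumptions for illustration, not the actual harness design.

```go
package harness

import (
	"encoding/json"
	"fmt"
	"os"
	"sort"
)

// report is a pared-down view of a vulnerability report: package name mapped to
// the set of vulnerability IDs found for it. A real harness would work with the
// full report types.
type report map[string][]string

// load reads a cached report from a previous run.
func load(path string) (report, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var r report
	if err := json.Unmarshal(b, &r); err != nil {
		return nil, err
	}
	return r, nil
}

// diff compares the previous run's cached report with the current one and
// returns a human-readable list of differences. An empty slice means the two
// builds agree for this image.
func diff(prev, cur report) []string {
	var out []string
	for pkg, prevVulns := range prev {
		curVulns, ok := cur[pkg]
		if !ok {
			out = append(out, fmt.Sprintf("package %q disappeared from report", pkg))
			continue
		}
		sort.Strings(prevVulns)
		sort.Strings(curVulns)
		if fmt.Sprint(prevVulns) != fmt.Sprint(curVulns) {
			out = append(out, fmt.Sprintf("package %q: vulnerabilities changed %v -> %v", pkg, prevVulns, curVulns))
		}
	}
	for pkg := range cur {
		if _, ok := prev[pkg]; !ok {
			out = append(out, fmt.Sprintf("package %q is new in report", pkg))
		}
	}
	return out
}
```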
Cool. So now, a question that I wanted to pull Ivan in on as well, one that came up on a PR between me and Hank where we were both like, I don't know: should Clair fail startup if introspection fails? If we can't connect to sinks for event data, or for some reason we can't set up Prometheus, which is probably not going to happen since it's all local, but whatever might happen, should Clair fail entirely? Or should it continue running with no metrics? I'm not exactly sure, because running in a production environment without metrics is probably something we want to avoid, I would assume, but I'm not sure. Ivan, you might have a little more experience just dealing with running systems.

Well, I'm not exactly sure. I think the primary reason someone runs Clair is to get their containers scanned, and those scans should be complete, or at least as complete as they can be. So if introspection fails in the sense that sources are not available and updaters cannot function, then Clair should definitely report this in one way or another. I'm not so sure about Prometheus. We haven't had any questions about the Prometheus metrics exposed by Clair yet, because the thing is rather new, so there's no data I can share. Yeah, for context, when we talk about introspection, we're talking about the second HTTP server Clair spins up that serves profiling information, health checks, and metrics. Well, we definitely need health checks; if we're using them, they definitely should be there. So yeah, in that case I guess Clair should not keep functioning.

Yeah, my thinking is that if you're actually paying attention to the health checks, you'll be unable to read them and the container will get torn down anyway. Because, well, currently it crashes; we just pull everything down, right? In some cases, for some reasons. Oh yeah, this is the Jaeger stuff. But you do make a good point. One, yes, if you can't get the health checks, the system is going to pull the container down anyway; but two, there's not a clean boolean of introspection being on or off. I mean, there is, no, there's not, because we can configure individual aspects of introspection. Yeah, and for historical curiosity, the reason I implemented it as "everything keeps chugging along if this doesn't come up" is that I was running a bunch of these locally, was too lazy to change two port numbers in a bunch of configs, and didn't care. So they just tried to open the socket, failed, and kept running, because that was easy. Gotcha, gotcha.

I mean, this whole discussion about introspection connects to the second point we had, the discussion about non-blocking Clair initialization. If the health checker returns JSON or something similar, an object that says these components of Clair are functioning and these components are not yet functioning, and we go toward that approach, then without introspection being online we cannot know which other components are actually alive. So we must assume something went terribly wrong, and we should just drop everything and restart, in my opinion. Yeah, that's fair. I guess this is sort of a question of how much misconfiguration we want to tolerate, and if you frame it like that, my answer is less. Okay, no, I think that's a good point too.
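Just to make that design question concrete, here's a minimal sketch of the two behaviors being weighed when the introspection server's listen socket can't be opened: keep running without it (roughly today's behavior, as described above) versus treating it as fatal misconfiguration. The function and flag names are made up for illustration; this isn't Clair's actual startup code.

```go
package main

import (
	"log"
	"net"
	"net/http"
)

// startIntrospection tries to bring up the second HTTP server (metrics,
// profiling, health). failFast controls the policy under discussion: abort
// startup on misconfiguration, or log and keep the main API running.
func startIntrospection(addr string, handler http.Handler, failFast bool) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		if failFast {
			// Treat a broken introspection setup as fatal misconfiguration.
			log.Fatalf("introspection server failed to start: %v", err)
		}
		// Current behavior: note the failure and keep the scanner running
		// without metrics or health endpoints.
		log.Printf("introspection disabled, could not listen on %s: %v", addr, err)
		return
	}
	go func() {
		if err := http.Serve(ln, handler); err != nil {
			log.Printf("introspection server stopped: %v", err)
		}
	}()
}

func main() {
	startIntrospection(":8089", http.NewServeMux(), false)
	// ... the main API server would start here ...
	select {}
}
```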
It seems like, and I've made a note here, that an emphasis on a good health check is going to clear the fog on a lot of these questions, because it will provide granular detail about what's actually working and what's not. At that point we can make a more educated decision about whether the client pulls it down for a given reason; it maybe moves the responsibility for deciding what's acceptable or not off of us. In my opinion, we could tolerate Prometheus going down; I don't see it as a highly critical component, metrics can be restored at any point. But if introspection itself, the health checker, is down, then no, Clair needs to be restarted. Yeah, and to Hank's point, it's again a bit of an ambiguous statement, because if you're watching the health checks and they don't come up, the infrastructure is going to tear it down anyway. So do we tear it down ourselves, or do we just let the infrastructure do what it should do with health checks? Yeah, because we're talking about the thing that serves the health check; if that fails to come up, should we just keep plugging ahead or not? If it fails to come up, then the infrastructure that's checking the health check should tear the thing down, but do we rely on that? Well, it depends on how frequently you actually check the health check. If it's every minute, you can get a lot of stuff done in a minute. And it also has a threshold, so if the threshold is, for example, five consecutive errors, and you have a flaky instance or a flaky service that's going up and down constantly, then you might miss things. I don't know. Yeah, yeah, it's getting a little tight on time here.

So Hank, you want to do the notifier JSON? Oh, sure. So at one point, a week ago, two weeks ago now, I was working on the notifier. We think we need some structural changes to the notifier, because the way it works now is it sees an update, takes that one update, processes everything in one node's memory, and then sends it off to be delivered. Because of the way the Red Hat databases are structured, when they show up for the first time they might be quite large, and they can show up at any time; it's not like we can just whitelist new ones. So we need to split that into, I don't know, a checkpointing work model that gets spread across everything. I started on the design of that a little bit, but before that I did some efficiency work, which included reworking how we handle JSON. So now we should be doing streaming JSON serialization everywhere in Clair, which just uses less memory and should generally be strictly better.

So when I'm working in the HTTP layer now, are there changes I need to consider? Do I have to use the codec package you added? Yeah, there's just an internal package that has the functions for you; use those and they'll do the right thing, and pool everything, and it'll be nice, less memory usage. Cool. So basically just look at the functions in the codec package, that's all I need to care about when I'm munging JSON. Yeah, when you're reading and writing JSON, just use those. The PR that pulled them in changed all the handlers to use those packages, so just read a handler and make yours look like that. Yeah, yeah, basically.
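To illustrate the general idea behind streaming serialization in the HTTP handlers, here's a generic sketch using only the standard library; it is not the actual internal codec package or its API, which (per the discussion above) also pools encoders and uses a third-party streaming JSON package. The difference being illustrated is encoding straight to the response writer instead of building the whole document in memory first.

```go
package handlers

import (
	"encoding/json"
	"net/http"
)

// vulnerabilityReport is a stand-in type for the sake of the example.
type vulnerabilityReport struct {
	ManifestHash string              `json:"manifest_hash"`
	Packages     map[string]struct{} `json:"packages"`
}

// buffered builds the entire JSON document in memory before writing it out.
func buffered(w http.ResponseWriter, vr *vulnerabilityReport) {
	b, err := json.Marshal(vr) // whole report held as one extra []byte
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	w.Write(b)
}

// streaming writes JSON directly to the ResponseWriter as it is produced,
// which avoids holding a second full copy of a potentially large report.
func streaming(w http.ResponseWriter, vr *vulnerabilityReport) {
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(vr); err != nil {
		// Headers are already sent; all we can do here is drop the connection.
		return
	}
}
```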
It will get all the benefits. There's some movement in the Go standard library to implement a streaming JSON scheme, but that's a ways off, so we're going to continue using this third-party package that does it for now. Cool, that's good to know, because I knew there were some changes there, and it sounds like there are plenty of examples. So that's good.

Okay, Ivan. Yeah, so this is quite a huge issue for a bunch of our clients who are still using CentOS images. It's not only a problem for images they're building on CentOS; there are also a bunch of other images, open source projects, that used and are using CentOS as base images. Currently Clair v4 does not scan CentOS images, or images based on CentOS, which is, as I said, a problem. It also breaks functionality, because Clair v2 did have this capability and Clair v4 doesn't. I do understand that Clair v4 has a completely different infrastructure, is built differently, scans differently, and uses different sources, and I understand that because of that the functionality of the new Clair v4 is different from the old Clair v2. But I really think we should do something about enabling CentOS scanning, at least as long as CentOS is alive, which it still is. And if we say CentOS cannot be scanned, then we should move it to the completely unsupported list of operating systems. Currently the problem is that when you push an image based on CentOS, either 7 or 8, to Quay, and it gets scanned, the result Clair v4 sends back is a pass, showing no vulnerabilities. We had a case where a client pushed the same image to Quay.io and to their local Quay, and Quay.io returned a bunch of vulnerabilities while their local Quay did not, and there was a question of why these results were so different.

Yes, so my thoughts on this right now: I would love to support CentOS. Whether we can do that reliably needs a research spike. I know that in Clair v2 there were quite a few issues with package alignment around matching; I personally have not done any research into that, so I need to do that research, or talk to somebody who knows about CentOS packaging relative to RHEL and whether they're completely compatible. If I'm searching through an RPM database on CentOS, will those package names and versions match up directly with vulnerabilities in the RHEL ecosystem? I don't know that. I don't know, Aless, you've joined too, I'm not sure if you know this or have details about that. No, I only focused on the Red Hat side, so CentOS was completely out of my scope. Yeah, that's totally cool. But as far as I know, that's not quite the case. It's usually the same, given how CentOS, Classic I guess, 8 is a downstream of RHEL, but not always, which is part of the ambiguity we wanted to avoid by using security databases provided by the distribution publishers. And I don't think this is going to be resolved; I think it's only going to get worse with CentOS Stream, which is now an upstream of RHEL, so the RHEL data is even less relevant to the CentOS packages. And if they don't maintain a security database for a stable distro, I highly doubt they'll maintain one for a rolling-release distro.

Yeah. Now to your other point as well: I do think we need some kind of mechanism that says this container is not supported, especially with Quay.
To do that, we're going to need to look at where the appropriate place is to put that business logic. I could wing some ideas out there, but we're short on time, so we can definitely make it a conversation point for the next community development meeting, and if it's a priority for you, we'll make it a conversation point right now in the GitHub discussions. But I do agree with you: bare minimum, we should be telling the client, hey, we don't support this container, you'll have to proceed with caution. We shouldn't just... I mean, in a sense we do. We don't return a thing that says, hey, we know this is a CentOS image and we didn't find anything; we return "we don't know what to make of this", as in we return nothing. Yeah, but nothing is also what we return when there are no issues. No, no, no, we still return, hey, we discovered this distribution. True. I don't know how Quay does it with Fedora and every other distribution that isn't supported, but if you upload a Fedora image or an Arch Linux image, it will show as unsupported. No, you make a great point, Hank. We might be able to shortcut most of the work and just say, hey, if we didn't find a distribution, this container is not supported. We'll have to play around with the idea, but you're right, we didn't detect anything. Yeah, I mean, that's a distribution-layer or client-layer thing. So that's interesting; we can definitely play around with that. It would be a tiny Quay PR that's just, hey, if you don't see a distribution, say this container is not supported. The only question is whether that runs into issues where the user knows it's, say, a Fedora container, but we just didn't identify it correctly, like the os-release file is missing; or is that okay? If the os-release file is missing currently, it will still show as unsupported; that would be the case if we went that route, and that's how Quay deals with it today. Okay, so that would be feature parity. Yeah, that's a good point. Okay, so at a bare minimum we could take a look at that and maybe approach the problem with a quick Quay PR to at least address it, but we'll have to do a little more research on the state of CentOS.

Yeah, I shared a link to Aqua Security's Trivy, which is used by Harbor; Harbor uses it now because Clair is being deprecated by Harbor. It supports CentOS completely, and it also supports distroless containers, so we might check that out as well, because we've had questions about distroless containers. Yeah, it's a small blip, but distroless is on our radar. It's been brought up, and I did some early analysis and it seemed possible; I don't think it's a big hurdle, we'll just have to take a real look at it. But I do agree, that's a hot topic, and I don't think it would really be that hard for us to support at this point. So we'll put that on the radar.

Okay, let me, I'll type up some notes. Arun, do you want to go over the remote matcher? Ivan, are you good with all that? Yeah. Okay. I'll type up some notes; Arun, you can just start. Yeah, sure. Shall I take the screen? Yeah, definitely, I'll stop sharing. Okay. Hope you guys can see my screen. Yep. Okay. So yeah, I'll give a brief introduction to the remote matcher. Anyway, it's a fairly small feature, not a big deal. This is the high-level architecture of Clair; it's not really for you guys.
It's more for whoever ends up watching the recording. Basically, Clair mainly consists of two parts, libindex and libvuln. libindex is responsible for extracting packages and versions from the container layers, and it produces the index report. The index report is fed into libvuln, which basically consists of two major parts, the matcher and the updaters. The updaters fetch advisories from various publicly known sources and populate the database, and the matcher matches the index report against the database and produces a vulnerability report.

So yeah, this is what Clair looks like with the remote matcher. Basically, the remote matcher combines the functionality of the matcher and the updaters. It's a kind of parallel implementation: it bypasses the actual matcher and updaters in libvuln. It doesn't actually replace them; it's an add-on to the existing matcher infrastructure. The purpose of the remote matcher is to talk to an external service where the vulnerability matching gets done. The main use case is to leverage, for example, a security vendor API, where you may not be able to get the complete database to populate into your local database, but you can use the security vendor's APIs to do the matching. You could also use it for cases where your org has a set of packages, an allow list, and you want to check containers against that allow list because you don't want to ship anything that's not on it. So you can use the remote matcher for all those kinds of use cases.

Okay, so why do we want to do this, specifically for the work we're trying to do? Before getting into the nitty-gritty, I want to give some introduction to our platform at Red Hat, as part of the DevTools team. The name is Red Hat CodeReady Dependency Analytics. We build a software composition analysis platform, focusing mainly on security analysis, dependency analysis, and license analysis. It's a hosted platform, hosted on OpenShift OSD, and it exposes a set of RESTful endpoints to perform all the analyses listed here. We also have various integrations in place, like VS Code and IntelliJ, where you get an in-IDE security analysis experience, so you can do all this security analysis without leaving your IDE. And currently we're focusing on integrating our platform with Clair so you can do the same with container scanning as well. Right now our platform supports four ecosystems, Python, Node, Maven, and Go, and we support vulnerability analysis for all of those. The main point here is that our vulnerability data partner is Snyk. Most of you will already have heard of Snyk; they provide a very reliable, good vulnerability database. This is a high-level overview of the platform, so I can skip it.

Okay, so the main reason for building this remote matcher implementation is that our security data partner does not allow us to ship the database to Clair. They want the data to be served through our layer, through our hosted layer. That's why we built this remote matcher concept, with the help of Louis. Yeah, so the next one: if you're a VS Code fan and you want to see what we're really doing, you can just download this extension and give it a try.
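To sketch what a remote matcher boils down to in code: instead of matching the index report against a locally populated database, it sends the discovered packages to an external service and gets vulnerability matches back. This is a minimal, hypothetical illustration; the type names, request shape, and endpoint are made up and are not the actual ClairCore or Dependency Analytics APIs.

```go
package remotematch

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
)

// Package is a simplified view of what libindex discovers in a layer.
type Package struct {
	Name      string `json:"name"`
	Version   string `json:"version"`
	Ecosystem string `json:"ecosystem"` // e.g. "pypi", "npm", "maven", "golang"
}

// Vulnerability is a simplified match returned by the hosted service.
type Vulnerability struct {
	ID       string `json:"id"`
	Severity string `json:"severity"`
}

// RemoteMatcher asks an external, vendor-hosted API to do the matching instead
// of querying the local vulnerability database populated by the updaters.
type RemoteMatcher struct {
	Endpoint string // hypothetical hosted matching endpoint
	Client   *http.Client
}

// Match sends the packages from an index report and returns matches keyed by
// package name. Note that notifications are an open question in this model,
// since there is no local database update event to hang them off of.
func (m *RemoteMatcher) Match(ctx context.Context, pkgs []Package) (map[string][]Vulnerability, error) {
	body, err := json.Marshal(pkgs)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, m.Endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := m.Client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("remote matcher returned status %d", resp.StatusCode)
	}
	out := make(map[string][]Vulnerability)
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	return out, nil
}
```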
It's gotten a pretty good number of downloads. So yeah, this is what we're finally trying to realize. As I said, we want to make use of the hosted API that we expose, from Clair. Basically, we want to propagate all this information to the OpenShift Dev Console through the various layers: from the remote matcher it goes to Clair, from Clair it goes to Quay, and from Quay to the CSO, the Container Security Operator, through a CRD, until it reaches the OpenShift Dev Console, where developers can see all the vulnerabilities associated with their container images deployed into the cluster. It's a pretty long path, but that's okay. That's what we're really trying to achieve with the remote matcher concept. Yep, that's it, folks.

Very cool. So will we eventually see language support in Clair extended to that supported-language list you showed? Is that the overall goal? Right now I know you have something in flight for Java, but is your expectation to continue down the same path to get Python, Node, and Go into the remote matching facilities in Clair? Yep, definitely, Louis. As you said, the Maven support is currently in flight; the indexer part for Maven is done, and it's in the testing phase. The remote matcher out of the box supports all four ecosystems mentioned here; the only thing we need to take care of is the indexing part. Very cool. Yeah, that'll be a great addition. There are a few caveats with remote matching and notifications. I don't think we have a great way to bridge them together, because notifications require us to know when the database has been updated, and because the database is remote here, we don't have that concept. But we might want to spend some time in the future circling around whether we can get that data somehow; maybe we can bridge the systems together. Obviously, when we first designed notifications we didn't have remote security databases in mind; we assumed we would always be holding the data and would know when the updaters go and grab new vulnerability databases. But given enough brainpower we might be able to bridge the notification system into the remote matching concept; somebody just needs to sit down with it. In any case, it's great to see that, in some way, shape, or form, we're going to have pretty robust language support moving forward.

Yep, sure. And another caveat is that currently the integration work I did only supports the connected environment. For air-gapped environments we don't have a working solution yet; we're looking at that as well. Yeah, and as far as I understand, there isn't a way to do that for air-gap, because your partnership with Snyk means you can't maintain this data yourselves, right? No, actually, the contract is something like: we can serve the data through our layer, we just can't deliver the data as a whole, but we can serve partial data through our layer. So we can probably think about having some component that can go into the disconnected environment and act like a remote matcher. Very cool. Okay. All right. Well, that's an awesome presentation, I appreciate that.

Cool. So we're about at the end of the agenda. Brad, I see that you've joined. If you guys don't mind waiting another couple of minutes, Brad, do you want to introduce yourself? Good morning. Yeah, I don't know if my video is working. There we go. So yeah, I'm Brad from the AWS ECR team.
For our image scanning solution we're currently using Clair v2, and we're looking at migrating to Clair v4, so I'm just trying to get a feel for the roadmap, what's going on, and what's coming down the pipeline. Very cool. Yeah, we recorded this session, so you'll be able to play back anything; we're actually at the end of it now. But did you have anything you wanted to bring up specifically, or would you rather wait for the next agenda? I am curious about picking your brain, because I don't know much about how AWS is using Clair in your back end, and I think there are some really good conversation points around that, especially with Clair v4. But I don't have anything in particular; I'm just curious, with your experience so far, do you have any comments or concerns? Sorry, no, I think my video cut out there. Nothing in particular at the moment; like I said, just trying to get a feel, and I'm sure things will come up, and I'll bring them up when they do. Awesome. Well, it's great to meet you, and I look forward to hearing more from you. So yeah, we'll wrap this one up. I'll drop the video in the agenda so you can catch up on anything you missed. I appreciate it, it was a great presentation. Bye, everyone. Thanks.