Hi, should we go ahead and get started? Sounds good. So I think the first item on the agenda is KubeCon + CloudNativeCon in 10 days: how we want to use the intro session to pull new people into this effort and inform them of what we're up to, and then how to use the longer time we have for the deep dive.

OK. So we did an intro at Copenhagen. Do we just want to rerun that script, or do we have a different idea this time? Well, we have the material from Shanghai as well. I was wondering whether we just wanted to use that.

OK. So what type of people showed up for those sessions? Is it something where you want to explain to users why they should care about conformance, or something for contributors about why they should help with the effort? I think it's important to make that clear, both for us and in the event description, to make sure the right people show up.

So I feel like we had that content divvied up across both sessions. The intro session was a combination of me giving a high-level overview of the what and why of conformance, and then Kevin gave a brief overview of how to participate: how to run conformance and how to contribute results. And the deep dive was three of us giving an overview of how to contribute conformance tests back, plus Kevin presenting some of the lessons learned about air gapping and some of the difficulties you might run into in certain scenarios when running conformance. So we could shift those sessions so that one is entirely contributor-focused and one is entirely provider-focused; I don't know how other folks feel about that. The attendance was about six to ten people for each session. I found we got more questions in the deep dive session than in the intro session. It was kind of tough for me to tell how well it was all received because of the language barrier. Anybody else who was there want to summarize?

Well, I have a quick question.
So Dan, do we have an intro and a deep dive as well as the brainstorming session, or just the intro and brainstorming session? The latter.

OK. So I think, at least the way I see it, the intro is kind of us teaching: whoever we identify as the audience, we're broadcasting information to them. Whereas the working session is really just a face-to-face version of this meeting, hopefully extremely productive. So to me, if it's contributors that we're asking questions of, or hoping will contribute more, that content makes sense in the intro session. I think if they came to the working group session, they'd probably get overwhelmed.

I feel like an intro session aimed mostly at people who want to certify their cluster as conformant, or understand what conformance is, with some additional details on the requirements that specify the what and why of conformance, and maybe a light overview of what progress we've made in improving conformance and its coverage, might facilitate dialogue for those who are more interested in coming to a working group session. Yeah, just my two cents.

So would someone volunteer to rewrite the intro text? I mean, I do think we can address both of the audiences that Brian mentioned, at least briefly. I'm happy to take the lead on the intro session; it doesn't conflict with any other sessions I have, looking at the schedule now. I would love to talk to you about it. Yeah, I'm also available, Aaron, so we can just chat on Slack about it. Okay, that sounds good. And if you can use me in some way, I'd be happy to help too. Thanks, William.

Are we just having one speaker for the track? I know at Copenhagen I think we had a couple. Well, I'm happy to have Dan maybe give more of an overview of the what and why of conformance from a CNCF perspective, and I could get a little more into the details of what the specific requirements for conformance are, and so on.
All right, does that sound good? Yeah, I'd prefer to just do the first five minutes as the intro, and then I'm happy to hand it off. And I don't know if Doug or Brad or anyone else would like to do a portion of it. I guess the piece that we had in Shanghai in the deep dive that maybe someone would like to show is just running Sonobuoy and demonstrating that step.

How much time do we have, Dan? Is it 35 minutes like Shanghai, or is it longer? It's just 30. Okay. Dan, in that case, I'd prefer to just have the two of you, because I think three speakers will get too much back and forth. So I think just you and then one other person would be sufficient. Yeah, I mean, whoever Srini wants is fine, whatever makes it less stressful and easy. Yeah, if there's a need I can help with that, with Aaron.

Can we do the Sonobuoy demo in the intro? I don't see why not; we had a bunch of extra time for that in the intro session in Shanghai. Okay. Yeah, I'm seeing 35 minutes. So to me it depends: if I was coming and I wanted an introduction to conformance, then seeing the Sonobuoy demo there makes sense. Whereas the other session isn't even a deep dive, right? It's a working session. If people come to that, we're going to get into the weeds, and the demo may not be relevant to them.

Yes. So I'd prefer if somebody else could run a Sonobuoy demo, but I'm happy to do all of the talking; I can sort that out offline. If somebody gives me something pre-baked to run, or we have a video to run, that's just as fine as well. Yeah, I can record a video to give to you, a very simple, summarized version, because otherwise it's going to take too long for any standard human to run the full test suite. Sounds great. And I'm happy to do the cooking-show thing, you know, pull the finished one out of the oven that you baked earlier. All right.
It doesn't look like it conflicts with any of my current things, so I can do this, I can talk. Okay, so we move on to Hippie Hacker. I think we're just going to get an APISnoop update, if he's on the call.

One quick thing before we go on. I think we've mostly just covered the intro; are we all set for the working session? Do we need to do any planning now, or do we just turn up and start working? I think for the most part it's different from the intro session, and I think we're going to come back to it later on the agenda. I wanted to review where I think the consensus was left a year ago, and I don't want to assume that that's where we have to go forward from, but we'll hit that after the APISnoop piece. Okay.

Can you see and hear me? Yes. One of the things that I wanted to quickly go through, from a where-we-are standpoint, is what we accomplished on that first link. I don't know if it's a 404; Clinton said one of the links was a 404, but I'll walk through it and update that later. What you can see now is the front page for APISnoop, where we're bringing in all of these different buckets from GCS. And you'll notice, between the first and second views, where we're doing e2e-only, I'm kind of going back and forth: you can see we've added darker colors for when something is actually conformance and when it's not. With e2e-only, we're able to go through and say, for these tests, for example, these are the endpoints they hit when you mouse over them, and that's where I set the link if you want to mouse over this section here. If you're part of any of these SIGs, or are interested in suggesting that these particular tests that hit these endpoints be promoted, this might be a great way to upgrade tests to conformance if we want to increase our coverage very simply. I will show you how we start to generate all this. We're exploring generating data from multiple sources at this point.
So we've set up a Binder on CNCF CI for bringing data in. You can put in the URL of the GitHub repo for CNCF APISnoop; Binder is the branch we're working on that in. And when you click on that, and I think I've put a link to an already-set-up Binder, we can go through it, and the running Binder includes the data that gets generated. So we can look at our conformance data and the JSON and SQLite information. It's all driven from our sources.yml, which allows us to base things off of our conformance buckets and specific jobs. There's also support for sharing the results: the idea is that we can pull together our generated data from multiple sources and share it with other folks to do research across teams. One thing I've put a link to here is our sources.yml.

Beyond this, we want to be able to prioritize which Kubernetes applications we're looking at. We did get granted access to the Helm charts bucket, and there's only BigQuery access to it. We could use some help trying to grok pulling out the most heavily used Kubernetes charts, so that we can prioritize bringing those in for comparison, because once we have those, we can compare which applications are heavily used, which API endpoints they use heavily, and which of those are not tested.

Most of the updates are what we've had for a while. The user agent now includes e2e test names. We have some UI improvements we'd like feedback on. We are pushing the projects directly in to feed this, obviously from our sources.yml. And that's where we're hoping to use the high-usage Kubernetes charts to identify common use cases and help guide writing some tests. I think those are the main updates. Any questions?

Yeah, I had a quick question. Apologies, I'm very late to the party here, so I'm kind of catching up.
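The SQLite export mentioned above lends itself to exactly the kind of query this discussion is after: which endpoints are exercised only by non-conformance tests. As a rough sketch, with a table and column layout that is an assumption for illustration, not APISnoop's actual schema:

```python
import sqlite3

# Tiny in-memory stand-in for an APISnoop-style SQLite export.
# The schema (endpoint, test, is_conformance) is assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE api_hits (endpoint TEXT, test TEXT, is_conformance INTEGER)")
conn.executemany(
    "INSERT INTO api_hits VALUES (?, ?, ?)",
    [
        ("createCoreV1NamespacedPod", "[sig-node] Pods should run", 1),
        ("readCoreV1NamespacedPodLog", "[sig-node] Pods logs", 0),
        ("deleteAppsV1NamespacedDeployment", "[sig-apps] Deployment delete", 0),
    ],
)

# Endpoints hit only by non-conformance tests are promotion candidates.
rows = conn.execute(
    """
    SELECT endpoint, GROUP_CONCAT(test) AS tests
    FROM api_hits
    GROUP BY endpoint
    HAVING MAX(is_conformance) = 0
    ORDER BY endpoint
    """
).fetchall()

for endpoint, tests in rows:
    print(endpoint, "->", tests)
```

The `HAVING MAX(is_conformance) = 0` clause keeps only endpoints where no covering test is tagged conformance, which matches the "suggest these tests for promotion" workflow described in the meeting.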
Do we yet have a sense, for some set of real-world Kubernetes applications, of what percentage of their API usage is covered by conformance? It sounds like that's the aim of the stuff you just demonstrated. Do we have a sense of how big the delta is there yet?

We do not. And part of that is that I was trying to choose the applications that are heavily used, and I just need a little bit of help with the BigQuery queries for the Kubernetes charts. I can help with that.

But when you say "for some subset of Kubernetes applications": is it this group's consensus that taking the most commonly downloaded charts, running them against Kubernetes, and seeing what API calls result is sufficient for determining what we think the appropriate group of apps to cover is? Or is there some other determination you had in mind? That's a good question for the group, or a poll or something.

My knee-jerk answer to that: in theory, we could look at the Kubernetes API, go through it, and say we think real-world applications would probably need these things, but that would be a guess. The data is a much better way of getting at least an approximation. I'm not sure the data will necessarily give us a perfect picture either, but it's data.

Right. I tried to cover a good chunk of this in my talk on conformance, where we asked: is API coverage an acceptable measure? We're not really sure. Is line coverage an acceptable measure? We don't really think so. We ultimately need to look at what set of behaviors we feel are appropriate. But in terms of API coverage, the stake we put in the ground was all of the stable stuff. So in this picture that you're seeing right here, we want that inner green ring to have nothing but solid colors covering it, right? And you can see that the first thing clockwise is all of the core APIs.
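The chart-driven gap analysis the group is converging on boils down to ranking endpoints by how many popular applications depend on them, then dropping the ones already covered. A toy sketch (chart names, endpoint usage, and the covered set are all invented for illustration):

```python
from collections import Counter

# Hypothetical inputs: endpoints each popular chart was observed to call,
# and the set of endpoints currently covered by conformance tests.
chart_usage = {
    "wordpress": ["createCoreV1NamespacedPod", "createCoreV1NamespacedService"],
    "redis": ["createCoreV1NamespacedPod", "createAppsV1NamespacedStatefulSet"],
    "nginx-ingress": ["createCoreV1NamespacedService", "createNetworkingV1Ingress"],
}
conformance_covered = {"createCoreV1NamespacedPod"}

# Rank uncovered endpoints by how many popular charts depend on them.
demand = Counter(ep for eps in chart_usage.values() for ep in eps)
gaps = [(ep, n) for ep, n in demand.most_common() if ep not in conformance_covered]
for ep, n in gaps:
    print(f"{ep}: used by {n} chart(s), not conformance-covered")
```

The highest-demand uncovered endpoints become the test-writing priorities, which is the prioritization function the transcript keeps circling back to.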
So things like namespaces, pods, services, stuff like that, and that seems 50% covered. The next one after that is apps, and that's most of the workload-style controllers. Those seem pretty important to cover first. So as a swag, without more data, I would say we consider everything in stable as needing to be covered. But again, that's a little fuzzy, and there's still a gap: a reasonable user of Kubernetes needs more than what is currently stable in Kubernetes. And the focus of the community needs to be to bring those APIs that reasonable users of Kubernetes require to stable.

So there are examples of things that are necessary but not stable. Correct. And one part of bringing those to stable would be to make sure they're sufficiently covered by tests that meet the conformance requirements. But I'm sure there are many other requirements that need to be enumerated by SIG Architecture, or some appropriate group, to say: here's the checklist of requirements you must meet to promote your feature from beta to stable.

Yeah, and I'm not entirely sure what we're trying to achieve with this analysis. We still have a lot of work to do that's somewhat more bottoms-up: take the basic features and make sure they're covered. I think it's pretty straightforward: we look through the API, we identify features that we believe should be stable and portable, and we make sure those get covered. It's not exactly line coverage, and API endpoint coverage is not sufficient either; we just need to do the hard work of actually making sure those things are tested. And we have so few end-to-end tests right now that it's pretty easy to figure out which things are not tested. Maybe that will change in the future, but right now that's the case.

I would push back on that and say I'm not sure it's actually phenomenally easy to say which things are or are not tested.
And I'm also not sure it's really easy to enumerate what full coverage means from a functionality perspective. For example, we have liveness probes and readiness probes covered, but I saw there was some email traffic on confusion about the interaction between the two. So it's not just looking at individual API features or individual features, but how everything interacts with everything else from a user perspective.

Well, whatever "covered" means. Exercising features accidentally I don't count as covered, which is why I don't believe this kind of coverage measurement is useful. If we actually want to test the behaviors of the features that matter, we need to make sure that we have tests that actually do that. And if the behavior is not clear, it needs to be better documented, and then we can make sure that the tests exercise it in the right way.

So one of the things that was added to APISnoop and Kubernetes in 1.12 was the ability to set the user agent to the test that was actually exercising the API. So we can start to see which APIs are actually being tested by which test. And it turns out... Well, exercised, not necessarily tested. Sure, that's valid, but it definitely gives us a list of which APIs all tests hit and which APIs are only hit by a few specific tests.

Oh, by the way, can we just eliminate all reads from this coverage measurement? Reads are useless, because all controllers just try to read all the resources they care about. That's why we're looking at API coverage from the end-to-end test perspective as well, but we can chart that out by individual client.

So I'd like to put a point on that. This is a tool that is useful, but what was confusing to me in some of these conversations is that we have a backlog, but we don't have a prioritization and an execution model, similar to other SIGs, right? Ideally as a working group, or even as other working groups, right?
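The user-agent tagging just described makes audit records attributable per test, and the "eliminate all reads" suggestion is a simple verb filter on top of that. A sketch, where the log records and the `" -- "` user-agent convention are invented stand-ins for real audit entries:

```python
# Invented audit-log records for illustration; real entries carry more fields.
audit_entries = [
    {"verb": "create", "endpoint": "createCoreV1NamespacedPod",
     "user_agent": "e2e.test -- [sig-node] Pods should be restarted"},
    {"verb": "get", "endpoint": "readCoreV1NamespacedPod",
     "user_agent": "e2e.test -- [sig-node] Pods should be restarted"},
    {"verb": "delete", "endpoint": "deleteCoreV1NamespacedPod",
     "user_agent": "kube-controller-manager"},
]

READ_VERBS = {"get", "list", "watch"}

def endpoints_by_test(entries):
    """Map each e2e test (parsed from the user agent) to the write
    endpoints it exercised, ignoring reads and non-test clients."""
    hits = {}
    for e in entries:
        if e["verb"] in READ_VERBS:
            continue  # reads tell us little: controllers read everything
        agent = e["user_agent"]
        if " -- " not in agent:
            continue  # not an e2e test client
        test = agent.split(" -- ", 1)[1]
        hits.setdefault(test, set()).add(e["endpoint"])
    return hits

print(endpoints_by_test(audit_entries))
```

Grouping "by individual client" as mentioned in the discussion is the same loop keyed on the raw user agent instead of the parsed test name.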
Ideally as a working group, I would like us to focus on dovetailing the execution of this with the other work items we actually have a backlog for, because that backlog has no priority, no one's been assigned, and people aren't really executing on all those pieces, right? So this is nice, it's good to get a status readout, but as Brian points out too, there are much higher priority items that we should be focusing on addressing. Such as... The coverage, just general coverage of some of the basic things we already know about.

Yeah, we do have a rough prioritization, and Aaron has been plugging away on that. Do you want to talk about that, Aaron?

Well, largely I've been trying to take pod behavior and pod functionality, so I've been focused on the node conformance tests as an area to look at first and foremost, but now I'm looking at any of the core namespaces that revolve around pods, and I've been using line coverage to see which of the various packages related to pod functionality are or are not covered. I'm trying to come up with some test cases from that, and I found a way to walk through the different API fields, field by field or feature by feature, to enumerate all of the test cases and then cross-reference that against all of the existing end-to-end tests we have, or the files, and see if there's sufficient coverage of all the different cases. So I agree with you: I'm sure we have really important things to focus on, and I agree these are tools to help us, not the end-all, be-all answer. But I'm not sure what this well-known backlog is that Tim is talking about, other than pod functionality, which we all agree on.

Pod functionality is what I was referring to. I think that's... So, liveness probes and readiness probes are kind of pod functionality, or are they not? They are. Right.
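The field-by-field enumeration Aaron describes can be sketched mechanically: flatten the pod API fields, then check which leaf names appear nowhere in the existing e2e test sources. Everything here, the field list, the file names, and the grep-style matching, is an invented simplification of that workflow:

```python
# Hypothetical flattened pod spec fields; the real list would be derived
# from the API types, not hand-written.
pod_fields = [
    "spec.containers.livenessProbe.tcpSocket",
    "spec.containers.livenessProbe.httpGet",
    "spec.containers.readinessProbe.httpGet",
    "spec.terminationGracePeriodSeconds",
]

# Stand-ins for the e2e test sources we would scan through.
e2e_sources = {
    "probe_test.go": "httpGet livenessProbe readinessProbe restart",
    "lifecycle_test.go": "terminationGracePeriodSeconds preStop",
}

def uncovered(fields, sources):
    """A field counts as (possibly) covered if its leaf name appears
    anywhere in the test sources; crude, but a starting point."""
    blob = " ".join(sources.values())
    return [f for f in fields if f.rsplit(".", 1)[-1] not in blob]

print(uncovered(pod_fields, e2e_sources))
```

Note that with this toy data the TCP probe field surfaces as uncovered, which mirrors the finding reported later in the meeting; a mention in a test file still says nothing about whether the behavior is meaningfully asserted, which is exactly the "exercised is not tested" caveat raised above.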
And so how much should we be walking through the interaction between liveness probes and readiness probes as it pertains to services hooking up to them, as it pertains to containers being restarted, TCP versus HTTP versus exec probes, and the interactions between those? That's just one specific area, and we can drill in on it, but I'm not quite sure how to federate out the drilling-in and then reduce it through a prioritization function.

Because currently we have a single query, which is area/conformance in the main repo, and we have 21 open items, of which only a couple actually have priority and assignment. If we are going to make traction on this effort, then enumerating the state space, opening up issues, giving an overall priority for execution, and federating that across a wider group, which is this group, is I think highly beneficial to talk about, because that will allow us to actually get stuff done. Because currently, who are the people executing on changing some of these things that we're reviewing and making some of these changes happen? There's a very small subset of us.

Yeah, I've seen a few Googlers here and there focus on storage-based things as they pertain to hooking up to pods, and I have the two contractors who are working on sundry pod-related things, and that's about it as far as people actually banging bits together to improve conformance coverage. And to your point about enumerating the state space: that's what I view these tools as things to help with, because I'm not sure how else to go about enumerating our state space. I don't know how to quantify that.

Someone in this meeting should take the lead and say: we have this backlog, and this is the most important thing, or we agree to disagree at some point and say we are going to work on this one, it's the highest-priority thing to work on. And we enumerate the priority of the issues we have logged.
This is exactly how I operate inside of SIGs, so I'm happy to start doing that based on what I see here. I'm not sure if I'm the right subject matter expert to really go through everything line by line. I can try, and we can see if there's too much back and forth and discover whether somebody else needs to be doing this. That's why I've been pushing to have these tools and use these tools: I need to figure out how to get the right subject matter experts involved in the areas. I'm done.

So honestly, Tim says we have this well-known backlog, but I can't look at the issues tagged area/conformance and understand what the known state space is as far as tests. Brian?

I thought Tim was saying that we need a backlog, not that we have a backlog. Both. We have both. Yeah. Prioritization across our backlog.

So in terms of priorities, my high-level priority, and you can try to contradict it: since the goal is portability across multiple Kubernetes implementations, we need to prioritize areas of the system that people obviously use, but also that are highly pluggable or likely to have multiple implementations. Parts of the system that are very unlikely to have multiple implementations should be lower priority. This is why we're focusing on pods: it is the most used part of the system, and it has a very wide surface area. Yes, and the kubelet is highly pluggable and there are even multiple implementations. Yes, but Brian, I need to understand, in documented form, what you mean when you say that very little of pod functionality is covered.

I think this group is in broad agreement that anything that is pluggable should be covered and pods are the place to start. What comes next?
So what comes next is we should get someone from, I mean, I could do it except that I don't have time, but we could get SIG Node to help put together a list of the features. You can literally walk through the pod API, look at what features are there, cross out the ones that are not portable, figure out whether the remaining ones have tests, and then maybe consult with deeper experts about what behaviors would be interesting to test. There are some things that won't be discovered that way, like networking between pods, which is not strictly kubelet functionality. Storage volumes other than local ones are very tricky because they're all non-portable, so we haven't figured that out yet. But, you know, readiness probes... Yeah, we have some tests for readiness probes and liveness probes. They may not be good enough in terms of exercising all the different corner cases. I would say we should probably prioritize covering more corner cases lower than just making sure that all the really critical functionality gets covered. But if we think that would be a good area for someone to dig into, we can file an issue and say: hey, we need to improve coverage; could we get someone to go dig into that really deeply and figure it out?

Okay. My gut tells me "dig in really deeply" is really big, and people are going to come back asking what you mean by that in terms of federating, but I hear you. And for what it's worth, I did do that enumeration, and in concert with line coverage, that's what led me to identify, for example, that TCP probes are not covered by any of our end-to-end tests whatsoever, while broadly, many of the other fields seem to be covered by different tests. I'm not sure to what level of detail they should be covered. So is it just a matter of finding an owner for that?
To make it less airy-fairy, do we just need to find someone to own it and assign it to them? Yeah. I think our fundamental problem is that we need people actually doing the work: people who understand the domain and can put in the time to actually determine whether we have adequate coverage. The fact that some fields are exercised by some tests is not the same as testing whether those fields actually do anything, because I think there's been cargo culting and copying of configs in the past. Maybe now that some of the node conformance tests have been converted to conformance, we have much better pod coverage and we should move on. That would be great to know.

Really, the right owner would be the person who actually owns that feature, right, rather than just finding someone. Well, many of the features were developed by people who have moved on from the project, or to other areas of the project. So, roughly speaking, there are SIGs; in this case, SIG Node owns all the pod features, so they should be able to find some people within the SIG to help out with this. And how can we make that happen? Well, we could go to SIG Node and make the argument that this needs to be a high priority. Okay. And I see you have your hand up.

Yeah, I just wanted to check whether we have a common agreement on what the answer is when we're finished. Presumably there's an initial goal, which is that if an application came along and wanted to decide whether it would run on a given cluster, it could look at the conformance profile of that cluster and say: all the stuff I need is there. So is that definitely a goal? Well, that's a different topic; that's profiles. Right now we're at a much more basic thing, which is making sure basic functionality is covered. Sure.
That's definitely the goal of profiles. So profiles is, I think, a mechanism by which to proportionally achieve that goal. But is the goal that applications can determine whether they can run on clusters, and that some proportion of applications are covered by that? And maybe initially that proportion is like five percent, and then we push that number up so that ultimately 60% of applications can figure out whether they can run on a cluster. Is that the goal here?

Well, I think there are questions about whether applications should be able to automatically figure it out, which is an API discovery problem, versus whether people should be able to figure out what they want and what they need. I think the other aspect of that is how many profiles you want. One thing that needs to be done is we need to go through the functionality of the system and the tests, even some of the tests that currently exist, and categorize them by features that may not be a hundred percent portable or ubiquitous. Once we have a sense of that, we can also look at providers in terms of which features they provide, and try to find the sweet spot: a profile that covers features that are both commonly needed and commonly provided, so we don't have a cross product of 40 different profiles that results in no actual portability in practice. We have a bunch of optional features, like load-balanced services, dynamically provisioned volumes, Ingress, or even RBAC, for example. So we have a lot of things, but I suspect that actually many of them are provided by most providers. So we could have a profile that just bundles up all of those things, and that's the common suite of cloud provider features of the system, or something like that.
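The "sweet spot" described above is essentially set intersection over provider capability lists: bundle the features that applications commonly need and that every provider offers. A toy illustration, with provider names and feature sets entirely invented:

```python
# Invented provider capability lists for illustration.
providers = {
    "provider-a": {"load-balancer", "dynamic-volumes", "ingress", "rbac"},
    "provider-b": {"load-balancer", "dynamic-volumes", "rbac"},
    "provider-c": {"load-balancer", "dynamic-volumes", "ingress", "rbac", "gpu"},
}
commonly_needed = {"load-balancer", "dynamic-volumes", "ingress", "rbac"}

# Features every provider offers.
commonly_provided = set.intersection(*providers.values())

# Candidate bundle: needed by applications AND offered everywhere.
candidate_profile = commonly_needed & commonly_provided
print(sorted(candidate_profile))
```

Here Ingress drops out because one provider lacks it, which is exactly the kind of negotiation the discussion anticipates: either that provider adds the feature, or the feature stays out of the common bundle.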
And I love the way you're thinking about that, because that's when you get to the more interesting aspects of this, where you start asking the interesting questions: can everybody support this? And if there's some feature that can't be supported, and somebody wants to do a one-off profile, you have the hard discussion and say: wait a minute, go figure out what it takes for your provider to be able to support that function. Because anytime you can tug people along, just like we did when we first started this, and you tug people along to a consensus, the end result is typically better. So I think you're hitting all the right points, and we'll get to the more interesting and valuable conversations by what you were going through, Brian.

Yeah. Just as one example, even for the base profile, I went and looked through a number of the current providers when I was considering whether multi-node should be a requirement for the base profile. From a user perspective, unless the provider has the ability to create clusters with multiple nodes, it's probably not super useful: users couldn't create a high-availability application. And in practice, all the providers looked like they did support multiple nodes, as far as I could tell at a glance. So I'd make that part of the base profile, even though it makes me sad that minikube would not be covered. And people say, well, minikube would be more useful if it supported multiple nodes too, and people have ideas about how to do that. So, you know, whether it's in the base profile or in sort of a mega-bundle profile of common functionality. Yeah, I agree with that approach; we should look hard at not fragmenting the profile space.

Okay, but just on that: that is actually the end goal, and this is a question, not a statement.
The end goal is for applications to be able to look at the badge on a cluster, or a combination of badges, whatever the end result is, and say: yes, I think my application should run there. Yeah, and even more from a user perspective: if I'm shopping around and I want to take application A and consider providers B, C, and D, I should ideally be able to do that at a very high level. Yeah. From an application point of view, I'm hoping that we can just use API discovery to do it: we can disable APIs that don't actually work in clusters, and then tooling can just scrape the API discovery information to determine whether the application is going to run.

And again, the goal should be to try and tug everybody towards very common ground. That's when the users win, and when the users win, the ecosystem grows. And I know nobody said this, but since I saw this happen in previous open infrastructure efforts: what you don't want is a race where we're going to provide these eight profiles, and we think we've got a better cloud platform because we've provided all these profiles. That becomes analogous to, in the old days, providers shipping their own schedulers or something: oh, it's better, but now your stuff only runs here. Hopefully we're all working together to make it more of a common infrastructure, because that's not where we're competing; otherwise you're going to run great here and break over there, and that won't help the ecosystem grow. If that makes any sense. I know nobody said that, but I just felt like saying it. Yeah. I don't think the goal is to differentiate using components. That would be the anti-goal. Thank you.

So on that note, I'd like to remind folks where I think the consensus was a year ago on profiles, which is that we were hoping to minimize them.
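The API-discovery idea above reduces to a subset check: does the cluster advertise every API the application needs? A sketch, assuming (hypothetically) a tool that has already scraped the cluster's discovery output into group/version/resource strings:

```python
# Hypothetical discovery output scraped from a cluster's API server.
cluster_discovery = {
    "v1/pods",
    "v1/services",
    "apps/v1/deployments",
    "networking.k8s.io/v1/ingresses",
}

# Resources an application declares that it needs.
app_requires = {"v1/pods", "apps/v1/deployments", "batch/v1/cronjobs"}

missing = sorted(app_requires - cluster_discovery)
if missing:
    print("application may not run; missing APIs:", missing)
else:
    print("all required APIs are discoverable on this cluster")
```

This only works if providers actually remove non-functional APIs from discovery, which is the "disable APIs that don't actually work" precondition stated in the discussion.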
And I think folks have been pretty happy. That we have 77 certified distributions, hosted platforms, and installers is a pretty extraordinary accomplishment one year into this program, and we've been able to get everybody to pass the entire conformance test suite. So I think the topic for our conversation at the deep dive is about making Windows the first optional profile. But I did want to throw out there that the approach we're discussing here, or the default approach to consider, is that it is an optional extra, an additional thing that conformant clusters can provide; we're not offering an option for Windows-only clusters that can't run Linux workloads. And we're not saying: here's the next 10 profiles that we want to approve as well. But obviously the move from zero to one profiles is the most profound one, so we should assume that there will be ones that follow afterwards.

Yeah, I think there are other candidate features or functionalities that we can think about in the same context, and we can think about whether profiles are the best or only way to handle them. Windows is certainly one, and there may even be multiple Windows versions that have to be considered. GPUs and other particular hardware devices are another area where not all providers may offer the same capabilities. Something coming up is distinguishing privileged operations from unprivileged ones: we could see more PaaS-like environments where the average application developer or operator doesn't have the ability to perform privileged operations, for various reasons. OpenShift is an example of this today, and we probably need a test suite at least, if not a profile, that can be constrained within the set of behaviors and features that are allowed. So, as certain new use cases and new capabilities become important, we'll need to think about the best way to incorporate those.
But all the examples you just gave there are potentially additive on top of the base conformance profile. Well, so the application operator. Sorry, I didn't hear you, Brian. The application operator use case is subtractive. And one approach we're talking about for the subtractive case, yes, is to actually carve those out of the base one. Would it make sense that the preferred route would be additive, and everybody accepts it being added to core? Because what I'd love to see is people go through the different features, and you at least have a check that says, well, is there any way to push this into the core, and could everybody do it? Because, let's say there's 20 of them; if for 15 of them you could do that, I think you've done everybody a great service. I agree. I mean, if 98% of providers would have no problem adding that feature, then that's probably a very good sign that it should be in the core. In the base profile. In the base profile. But I think, you know, obviously Windows is a great example here, and not everyone's going to support Windows. So I think that that's why it's an excellent first profile to develop. And Brian, on your subtractive example, sorry, is it actually the case that nobody has the root password on the cluster and can pass full conformance on it? Or is it just that most users don't have that capability? In the OpenShift case, it's the latter, but I am seeing use cases coming up that are the former. Do you want to come up with the policy for that in Seattle in two weeks? Or is it possible for us to say that so far we only have a strategy for additive, and we'll reconsider if we need to look at subtractive? I think profiles should only be additive. I think that's simpler. What I would do, if that more constrained use case becomes important, is carve that out and make that the base. Almost all of the important operations for applications would fit into that base. Otherwise it's going to be very infeasible. It's kind of like a stock split.
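The "profiles are only additive" position discussed above can be modeled very simply: every optional profile's requirement is the base conformance suite plus some extra tests, so passing any profile implies passing base, and nothing is ever subtracted. The sketch below illustrates that rule; the test names and profile contents are invented for the example, not the real conformance suite.

```python
# Sketch of an additive-only profile model: a profile's requirement is
# always a superset of the base conformance tests. Names are illustrative.

BASE = {"pods-lifecycle", "services-clusterip", "configmaps"}

EXTRAS = {
    "base": set(),
    "windows": {"windows-pods", "windows-dns"},  # added on top of BASE
}

def required_tests(profile):
    # Additive rule: base tests are always required, never carved out.
    return BASE | EXTRAS[profile]

def certified(passed, profile):
    """A cluster is certified for a profile iff it passed every
    required test, which always includes all of BASE."""
    return required_tests(profile) <= passed

linux_only = BASE  # passes base but none of the Windows extras
print(certified(linux_only, "base"))     # True
print(certified(linux_only, "windows"))  # False
```

This also shows why the "stock split" move works: if a smaller base were carved out, `BASE` shrinks and everyone currently certified remains certified, because their passing set is still a superset of the new, smaller requirement.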
Anyone currently conformant would then just automatically be conformant to that profile. Right. And then there'll be a bunch of new people, I guess, that would then only be conformant to the base. Yeah, that's a good analogy. So one of the things to be wary of with those subtractive type things, and I guess my mind gravitates mostly towards security-related things like admission controllers, is I'm not sure we want to allow a scenario where some cluster operator is allowed to disable all these awesome security measures just so their cluster passes conformance. Then they turn those security measures back on, they put their certified conformance stamp on their cluster, and then users show up and are surprised that their application doesn't work, because certain features that they use are now disabled or don't work correctly with security measures enabled. Yeah, only people with superuser powers can pass conformance effectively in that scenario. And there are already cases that are like that. That exists today. I mean, that's the reason why we have the suite publicly available, so that people can actually report back. We already have a policy, too, for remediation of those situations. So I mean, I don't think that's a problem. I think we've expected this. We kind of built it into the whole policy. What's the policy for remediation? I guess I'm taking the view that if this is about end users not being surprised and expecting their workload to be portable everywhere that a stamp exists, this seems like a scenario where that wouldn't be possible. So I mean, users can report. Go ahead. Sorry, Dan. Users can report, like if they get an unexpected result. So that would probably be the remediation there, but it sounds like we just need to have that profile so we can capture that use case properly. But yeah, there is a remediation for this, effectively, to be clear. Yeah.
There are multiple appeals processes, but we can and would take away someone's certification if the directions that they posted on how a user can replicate their test results were no longer valid. Yeah. Okay. Well, and even within most large organizations, I would say, there will be users with different levels of privilege. So given that some cases require cluster root, some require network privilege, and some require node privilege, there are going to be different sets of credentials in the cluster, and not all of them will be able to pass. So that's an interesting case: we have different users that have a different kind of view of the conformance. But then that would be interesting. Like, if the user doesn't have the admin access, they can just pick those applications that don't require it. Right. Those sound like three different new profiles to me on top of that. Well, yeah, it may end up being that they all are just a privileged profile. At least that's the case that I've seen so far: the people with superpowers have all the superpowers, and the other people have no superpowers. So I just wanted to bring this back real quick to the enumeration of the state space discussion, if I can, just to review some of the charts that I showed at Shanghai, to talk about our progress and what I thought the next steps were, to see if that's in line with where we were all headed. So this is a chart of how API coverage has improved. The top three are the conformance jobs; the bottom three are release-blocking jobs that prevent a release from going out. The intent of this chart is to show that if you focus only on stable core, or stable, the API coverage doesn't really change all that much between the conformance and the release jobs. That's indicating that our end-to-end coverage isn't really great if you look at API coverage solely through them, and conformance is relatively decent. And we've always been going up and to the right.
If you look at API coverage purely by whether any single test hits an endpoint or not: this is API coverage, copy-pasted from a bunch of Google spreadsheet things. Green means at least one test hit it; red means it didn't. This is only for stable endpoints. But if you look at it this way, we really don't cover a lot of our API. It's kind of unclear how much of this relates to conformance. APISnoop shows this a little more prettily. Wait, were those just endpoints, or verbs too? Those were just endpoints. And so, but some of these, maybe a lot of them, are proxy endpoints where certain verbs aren't used. The proxy test only uses HTTP GET; it doesn't verify whether PUT, POST, PATCH, DELETE, things like that, are used. Some of them are RBAC, which is an optional thing right now. And some of them are workload-related tests, which haven't transitioned over. I think also resource quotas aren't really exercised, because I'm not sure whether that's an optional feature or not. If we look at the API endpoints by how many tests exercise a given API endpoint, here's where you can find the API discovery client doing its work. And if you chop that out, it's really only these: most tests end up creating pods and deleting pods. So I feel like there's a lot of incidental coverage there, though I can't necessarily look at their fields. But there are a number of endpoints that are only exercised by one or two tests, which tells me we probably could use a lot more test cases on these things. So for example, I dug deeply into one of them and discovered that, yeah, the proxy for pods, proxy for services, and proxy for nodes only test GET and ignore these other verbs. So we could do a lot more there. We can also proxy subpaths as well as regular paths. I looked at line coverage. Line coverage, as we all know, isn't really super realistic. Here I'm comparing line coverage with conformance on the left with release-blocking tests.
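The proxy example above illustrates why endpoint-level coverage can look deceptively good: a path counts as "hit" even if only one of its verbs was ever exercised. A small sketch of the difference between counting endpoints and counting verbs, using made-up data that mirrors the proxy case where only GET is tested:

```python
# Sketch: endpoint-level coverage (did any test touch the path?) versus
# verb-level coverage (which verbs were actually exercised). Data is
# illustrative, not pulled from real conformance results.

SERVED = {
    "/api/v1/namespaces/{ns}/pods/{name}/proxy":
        {"GET", "PUT", "POST", "PATCH", "DELETE"},
}
EXERCISED = {
    "/api/v1/namespaces/{ns}/pods/{name}/proxy": {"GET"},
}

def endpoint_coverage(served, exercised):
    hit = sum(1 for path in served if exercised.get(path))
    return hit / len(served)

def verb_coverage(served, exercised):
    total = sum(len(verbs) for verbs in served.values())
    hit = sum(len(verbs & exercised.get(path, set()))
              for path, verbs in served.items())
    return hit / total

print(endpoint_coverage(SERVED, EXERCISED))  # 1.0: looks fully covered
print(verb_coverage(SERVED, EXERCISED))      # 0.2: only GET of five verbs
```

The same gap widens further once you consider fields and subpaths, which is the point made in the discussion: a green cell per endpoint hides how thin the coverage underneath really is.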
Sorry, conformance coverage on the right, release-blocking tests on the left. The release-blocking tests do ultimately cover more, but it's all relatively red, and it doesn't particularly change over time. If I look purely at, say, how the kubelet interacts with the Docker shim layer, it looks like it's a little bit better and more in the green. It could be, perhaps, that when we talk about exercising more pod behavior, we're looking at having more of this level of stuff exercised. And then finally, here's an example of how line coverage caught that TCP probes were not exercised in any way, shape, or form. I used this in concert with walking through all the API fields. Perhaps a less noisy way of representing this could be using something like Codecov's treemap graph, although this doesn't let you drill down. This could also potentially help us identify areas in the code base that we think should be covered but absolutely aren't. And so I have next steps for the tools, and the next steps for the coverage are really filling in the gaps that I just described to you. So what I can do is start chipping away and just filing a bunch of issues like this. I'm not sure if this is going to create too much noise, or if there's another owner who should be doing this level of enumeration. But this is how I would start enumerating things. I guess I would be looking for somebody else within this group to say, yes, that's important, or no, we don't care about that right now. How about we file them, and then we can talk about them as a group and give them priority? Sounds great. Aaron, I would think that going through from the user point of view and identifying the specific behaviors is really the starting place. And what these tools can provide is just sort of a check on: do we already have a test hitting that behavior? We'd still have to go look at that test to make sure that it's actually testing the behavior.
I would see this more as just not the starting point. It seems like something we should be doing sort of as a follow-on, and really the starting point has to be that. I completely agree. Like I've always said when pushing us towards this stuff: we should use these to measure our progress, but they do not necessarily chart our end state or our end goal. So that's where I feel like helping identify what charts are most downloaded, as a proxy for what charts are most used, as a proxy for what applications are most used, as a proxy for what API endpoints most user applications would hit, could be another step forward in terms of looking at, from a user's perspective, what functionality we believe to be exercised the most. It seems like, I agree, we should be enumerating from a user's perspective. This is just helping me identify gaps. I'm just wondering if filing issues against things that are results of those tests is the most productive way forward, or if we should be, as I think was discussed earlier, at least as I understood it, looking at, say, Pod and the high-level functionality, breaking it into pieces, and then assigning those pieces for somebody to enumerate the specific behaviors of that high-level piece. That's one issue: somebody goes in and gets assigned, they need to analyze what all the different API options are, like probes, that we want to be a part of conformance, and enumerate those test cases as: these are the behaviors we expect out of this. And then somebody has to build the tests. It's a lot of work, but that's what we need to do. Yeah, I agree wholeheartedly.
As for a mechanical process to get there, is it feasible? I mean, I think these automated processes, which draw graphs and give us an illustration of what we have right now, are useful, but I can't help thinking of a fairly brute force approach: enumerate the API, and for each piece of the API apply a subjective ranking to it, to say, you know, if I cannot create a pod, I clearly can't do anything, so that's a one; and there are other things without which I can definitely still build a useful application, so those are a 10 or whatever. And literally just go through that list, apply a number to each one, and say, let's start at the number ones and make sure that we have coverage of those, and then go to the number twos and the number threes. I feel like we're running close to time, and I think these are all good ideas. I will try to take away some of this. I really do think I'm going to just start filing issues and see if they get swatted down. I do think we should think about how we could do this style of enumeration at the working group session, and how we could do this in a way that it can be federated out, so that people can do the mechanical work and then bring it back, and have it done in such a way that it can be merged and sorted in the same place. So that's all I've got there. William, did you have anything you wanted to add? I mean, it sounds like we're in violent agreement that we need to file bugs and prioritize them. And have more people, actually. I guess that's what we should do. Yeah, but it sounds like some people think that filing bugs might not be the most useful medium for this; maybe we should collaborate in something else, like a document or a spreadsheet or something. I'm going to start with filing bugs and see what happens, because thus far I don't think we've really done that. I like bugs. We can always export that into a spreadsheet, or a project, or something.
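The brute-force triage proposed earlier in this exchange, rank every API area subjectively and work through the tiers in order, could be sketched as below. The ranks and coverage flags here are invented purely to show the mechanism; the real list would come from humans enumerating the API.

```python
# Sketch of tiered triage: each API area gets a subjective rank
# (1 = nothing works without it), plus a flag for whether it already
# has conformance coverage. Work the lowest-ranked gaps first.
# All entries below are illustrative placeholders.

ranked = [
    ("create pod",     1, True),
    ("delete pod",     1, True),
    ("pod exec",       2, False),
    ("resource quota", 3, False),
]

def next_gaps(endpoints):
    """Return (tier, names) for the lowest rank that still has an
    uncovered endpoint, or (None, []) if everything is covered."""
    for tier in sorted({rank for _, rank, _ in endpoints}):
        gaps = [name for name, rank, covered in endpoints
                if rank == tier and not covered]
        if gaps:
            return tier, gaps
    return None, []

print(next_gaps(ranked))  # (2, ['pod exec'])
```

The appeal of this shape is that it federates cleanly: different people can rank and fill in different slices of the list, and merging their work back is just concatenating rows and re-sorting.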
But is one of those higher priority issues to produce this list? I think it is; all of these issues are this list. Okay, all right. But yeah, I mean, I was thinking that doing it from just looking at what's red and green in the charts may not be it. That's what I meant by it's not so much filing issues: what's red and green in the charts is kind of spraying. Okay, we hit this API endpoint, but there's 20 different features in that API endpoint. It's a guide. It's not even a map, let alone the territory. These are not purely mechanical. These are just tools to assist us, but they do not enumerate the state space. That requires a human to actually interpret. Right, so I mean, we certainly need an issue that says we need to enumerate the state space, and start peeling it off from there. Okay. All right. We'll see you all in Seattle. So, one quick question, Dan. I think originally, on the note you sent out the other day talking about these two sessions that we have, you were asking for the moderators or speakers of the sessions. We have the intro stuff done; we needed a moderator for the brainstorming and working session. I was going to volunteer for that unless someone else really wanted it. I think it's a guiding-through role, which is to make sure we're staying on track more than anything else. Okay. And Brian, are you able to make that session? I'm going to. Okay. Excellent. So I want to make sure we get some speakers. Cool. We were worried about the day of the schedule and people being able to make that day. Okay. Cool. Thanks, guys. Thanks, everyone. Thank you.