Hello folks. Hello Taylor. Morning. Morning. Good morning, Victor. Howdy folks. Please add your name and any agenda items to the meeting notes. We'll get started at five after. The link to the Google Doc with the meeting notes is in the Zoom chat. It's also in the calendar event, I think, and in CNCF Slack. We'll get started in about five minutes. I'm sorry, we'll get started at five after, in about one minute.
All right, we'll get started. This meeting is being recorded. This is Monday. So you know, it'll be published on the CNCF YouTube channel. Please add your name and any agenda items to the meeting notes. If you can't reach the notes and agenda, you can drop a message in the Zoom chat right now, or speak up. Does anyone have anything they'd like to add?
Sure. My screen. Can folks hear me? Yes. All right. Well, you can add your name to the attendees whenever.
Let's see. So, other than the upcoming events, the TAG. We have a new event on Monday. We have KubeCon and ONES in October. The CFP is closed for that. If someone has something interesting at one of those, if you've been accepted, then maybe we can add that to the list that folks are recommended to see. I'll drop the pull request at the end of this, and then we can take a look at whatever is there. And then Tal, you're up for the CNF operators. I'm going to let you share your screen because I know it's a pretty big doc.
Sure. That's a good idea. Let me do that. Here we go. Did that work? I can see it. Okay, good. Thank you. Yes, I've been promising this for a while, and it's finally ready, and I've started to get feedback from colleagues. Of course I would very much appreciate feedback from this group. It's a very long document. As I said, it's 15 pages in length. It's not in a format that could immediately be usable within our working group, but
it could be part of it, or it could be the starting point for best practices. It is not written as best practices. It is written more as a discussion of the problem field and solutions. It's divided, I guess, into two. First of all, there's a discussion of the operator pattern more generally. And then there's another section here (I hope you can see the table of contents here on the left), which is practical considerations. Here I think we're getting more into things that could be best practices, in terms of implementations having to do with Kubernetes specifically. Those of you who have been following the Slack conversations know that Ian raised a very good point about CRDs requiring administrator permission, so I go at quite some length into alternatives to that, which I discuss. Again, I'm not going to read this whole document right now. I'm just giving you an overview of how it's structured and what the purpose is. There are the uses for the operator pattern, where again I go into some CNF-specific aspects, but some aspects are not specific to CNFs. They really apply to any cloud workload, or any Kubernetes workload. So again, it goes into detail, and of course I would be very happy to hear feedback if there are uses that I've missed, or uses that I could go into more detail on, or of course ones that I got wrong. Any of you can comment on this, as I noted in Slack. Just be aware that this document is publicly, globally viewable and commentable. The reason I did that is, well, that's the only way I really have to share a Google Doc with this group. I'm not able to share it from my Red Hat account. Another point to mention: when I discuss the operator pattern more generally, I do have to get a little bit into what I call a terminological soup. What are called operators in Kubernetes actually have nothing specifically to do with the operator pattern.
And even in Kubernetes, and in a lot of presentations, people talk about the operator pattern in Kubernetes and they usually mean, well, CRDs (custom resource definitions) and a custom controller built around them. That doesn't necessarily apply the operator pattern in the classical sense in which it's been used for more than 20 years now, computationally. So I go to some length to actually separate these two. Maybe it's a little bit pedantic, but that's the kind of person I am. I feel like these two things should be understood. So despite the terminological soup, I think we can get through those problems of names, and really most of the document is discussing the meat of the issue. So yes, there's a lot here to go over. I'm not going to, as I said, read it all right now, but I would be happy and honored if people in this group would read through it and provide comments. You can add comments directly in the Google Doc. Oh, and I should also mention, I realize not everybody has access to Google. So just please ask me and I can provide a PDF too, although that will be a snapshot. This is a living document. I'm constantly editing it. So I could provide a PDF of a snapshot right now. And that's pretty much it. I wonder if there are any general questions or comments. Well, if not, then what I could suggest as homework, for whoever is interested, is to read this, and maybe next meeting we can circle back and discuss: what do we do with this? What can we do with this, if anything? And that's it for me.
I just have a question. I haven't read the document yet, so probably it's there, but do you have any conclusion at the end of the document, or any final thoughts?
That's a good point. I probably should add a section like that. I tend to think of those things as being obvious. Yes, this is written in a very opinionated way. I guess the conclusion is: yes, operators are very important and useful. And there are some practical challenges, numerous ones that I
discuss briefly, but they're probably worth going into in more detail on our end. As I said, things like the CRD and what to do with it. Things having to do with namespaces. I see Ian already wrote a comment. And the next item on the agenda that we're going to discuss is the aspect of privileges: how do we deal with containers that require privilege? That's true for operators as well. Now I would say that you need to remember that, at least for operators that do work as Kubernetes workloads, right? Because they might also require things like special host access, et cetera. That's actually quite common for networking solutions. Right. And the other aspect I talk about as well is that it's very hard to write good operators, as I see sometimes, and there's really no simple way around this. They're just complicated because of the way Kubernetes is designed. And my conclusion is that I strongly believe this is one of the killer features of Kubernetes. I would love to see our working group take operators seriously as a general concept and then think about how they can work with a lot of the best practices that we suggest. So, thank you for the question. I think I might actually make that explicit and add a section at the end, which is kind of the bottom line.
Okay, thank you.
All right. Well, thanks. We can continue with the agenda.
I just have one more point. I've just written a note in there, and I think it's worth mentioning because it seems important. One of the things we're saying in the document, and let's be honest, I haven't read the entirety of the document, but one of the things you mention here is that you could be providing the operator from the system to many containers. And that, I think, is the obvious one. Well, if you're doing that, you need to sort of define what you're providing.
And it seems to me, if you're doing that, you're actually providing something a bit more as a service, a little more like, to take a bad example, EKS. You're providing something that delivers a service. You're not providing an orchestrator that will orchestrate container images; that's the CNF. You're not likely to be providing something that orchestrates container images the CNF brings along with it. So it's got a use, or there are certainly potential uses for that, but we might want to draw our boundaries so that the thing you deliver is more testable. It's very hard to test something that orchestrates container images that don't live with it. But it's quite reasonable to deliver something where the end result is a service, a microservice that might form part of the application, where you can detail what that microservice is good for.
That's a very good point. Originally, when I started the document, I had a section exactly about that: how do we differentiate between what's part of the platform, what you're calling here "as a service", and what's part of your workload? I ended up breaking up that discussion so that it's woven through the document, but you're making me think right now that it's worth highlighting as its own section. Here's my thinking on this: I think to an extent that's a practical consideration, because in the end, whether it's provided as part of the platform, who's providing it, whether it's a third-party operator that you're using or it's the CNF vendor who actually developed and packaged it as something specialized, there's an extent to which that doesn't matter. It ends up being an implementation detail, right, whether you need administrative rights or not to install it. But yeah, I think you're right. I think it is worth addressing more generally, and I agree with that.
I wouldn't necessarily change this document. I think this document actually fills people's heads with the background. I think it's worth throwing in a comment about what I just said, but what you're then saying is: given the implications of this document, what approaches could we take? And we don't have to wall off approaches and say this is never going to work, which frankly is not a given anyway. But as an example, one approach is that you can bring just the operator, not from the CNF; it's provided independently of the CNF. Or you can bring the operator and the software it's operating on, so ultimately you're delivering an as-a-service API supported independently of the CNF. And then, obviously, if you bring it as part of the CNF, you're bringing both the operator and the software package together; they come as part of the CNF and they belong to that CNF. No other CNF will use it. Those are three patterns. I would personally discard the operator-alone one out of hand, because I don't think you can easily define what an operator is doing if you don't describe what software package, the container image it's operating on, and it doesn't travel together with it, because again I don't think you can test it. But for the other two we can say: if you're going to do A, then you do it like this; if you're going to do B, then you do it like this. So we can take possible implementations, measure them against the information you've provided here, and say each has strengths and weaknesses and in general we would or would not recommend it.
Yeah, that's I think where we would go within the working group. I'll add that the general topic here is CNF requirements extending beyond what the platform provides, right? Part of the problem of Kubernetes.
What I'd consider the problem, or maybe a challenge, is that no two distributions are equal, and even within a telco they could create clusters with certain base requirements. So when you're targeting a CNF you have to list, well, what are the requirements for the CNF, and they can be things like the SDN environment, or the top-of-rack switches. There are a lot of requirements that go beyond just what Kubernetes provides as a platform in itself. You might need some plugins installed. So an operator becomes another aspect of those requirements. And this is where it leads exactly into, I think, the next item on the agenda, which is privileges, right? Privileges are also a kind of requirement: I require privileged access to the host.
I didn't hide among privilege, but it's more a matter of, and I'm using air quotes here, if I'm going to "right-size" the cloud. If I'm going to build a cloud that will specifically run this kind of network function from this provider, then I'm going to have to meet its requirements for the cloud, because there's not one Kubernetes, one configuration. There's a separate topic to both of them where I would want to say: these are the features I'm going to come looking for, and they'd better be present. That could include "I need an operator that ultimately, when I run it, provides a function that does something." And that would go hand in hand with: I need Multus, and I need the SR-IOV plugin, and I need NUMA awareness, and I need whatever else it is that I need out of the cloud. So I think that's another one we could write, and again we cross the line between simply describing a best practice and standardizing it, which we might want to do sort of as a sideline rather than with the authority of the CNF working group.
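As a rough illustration of the requirements list described here, the following Python sketch compares what a CNF says it needs from the cloud against what a particular cluster offers. The `CnfRequirements` type, the function name, and the feature and operator strings are all hypothetical, invented for this example; they are not a format the working group has defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CnfRequirements:
    """Hypothetical declaration of what a CNF needs from the cloud."""
    name: str
    platform_features: frozenset = frozenset()  # e.g. "multus", "sriov", "numa"
    operators: frozenset = frozenset()          # operators the platform must supply


def missing_requirements(reqs, cluster_features, cluster_operators):
    """Return the requirements that this cluster does not satisfy."""
    return {
        "platform_features": sorted(reqs.platform_features - set(cluster_features)),
        "operators": sorted(reqs.operators - set(cluster_operators)),
    }


if __name__ == "__main__":
    # Illustrative values only.
    reqs = CnfRequirements(
        name="example-cnf",
        platform_features=frozenset({"multus", "sriov", "numa"}),
        operators=frozenset({"example-db-operator"}),
    )
    print(missing_requirements(reqs, {"multus", "numa"}, set()))
```

A check like this only shows that a named feature is absent; it says nothing about versions or configuration, which the discussion suggests vary widely between distributions.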
You said it right: if you're going to provide a CNF, it's got to come with a manifest that details its requirements from the cloud, and it might require operators. Then, if we go into "the system will provide operators", those things knit together at that point.
Right, and this is again where I see the big benefit of operators, because it lets you encapsulate those requirements in a single item on the list: I need this operator. Now, that operator might itself have all kinds of requirements, but you're not passing the buck here. You're really saying that this is another system that you need, so the CNF itself becomes more minimal. There are fewer moving parts in the CNF itself, and those moving parts are moved elsewhere, to areas where it can even be a third party, another vendor.
I mean, or not, as the case may be. If it's supplied by the CNF vendor, then the point is that it wouldn't necessarily become a responsibility of the CNF-supporting ops team; it might become a responsibility of the platform-supporting ops team. And privileges are less of a problem, because everything the platform ops team does implicitly has all the privileges in the world, so that one might solve itself if we approached it that way.
It's true. I can't give all the details, but I'm working right now on a few projects with vendors, and the reality is very messy. There are a lot of things that kind of break the lovely cloud separation, right? I think we can forget about ever having CNFs running on public clouds, at least not the way public clouds are currently designed, and it ends up being very custom. That's my point: what you have is a CNF plus a very tweaked Kubernetes cluster that is designed to work together with hardware accelerators and other things that sometimes come with their own operators.
And for the complete solution, you have a bill of materials for everything, going from the hardware to the software, and all these parts are really designed and certified to work together. So in the end, making these kinds of clean separations between which team works on what, I hope we can reach that point. I think that's something we would like to see; that's one of the benefits of the cloud and what it can give us. We have a ways to go to reach that maturity, both in the platform and in our development practices, which is part of what this group, I hope, will move forward.
Yeah, the question there is going to be, for a CNF, given what it's got to do: can the whole group of people involved supply a CNF with more moving parts with less effort than they can supply it with fewer, bigger moving parts, even if that includes a little duplication? And, at least with the level of technology and, frankly, people skills that we have around at the moment, I think duplication might make life easier. But it's a topic open for discussion. We shouldn't block off the option for the future; we might not necessarily take the option on day one.
Right. I think our attitude so far, which I agree with very strongly, is that we keep going with the method of "if you want to do this, then this is your best practice". It's very hard for us to state an opinion that you shouldn't do some things. There's a reality out there in terms of what telcos need and what providers can provide, and that relationship, that business relationship, is not something we can change. What we can do is look at how those relationships manifest in practice and, given those manifestations, say here's what we're recommending. That's the right attitude for us to take.
Well, thanks. Good discussion. We'll continue it.
Ian and I have been working on some material related to the principle of least privilege. There will probably be multiple best practices that end up coming out of this, as well as maybe a couple of use cases. But Ian, would you like to go? I just put in some of the bullet points based on what we've been doing.
Thanks for that. Yeah, so let's start from the beginning. The principle of least privilege is not a use case, and it's not a best practice either. It's a principle. The reason it's not a use case is that I need to be doing it for a reason, and the reason isn't just "because I say it's a principle". So I've got to justify it. The reason it's not a best practice is that it's not measurable. I can't say "you shall have least privilege", because someone's going to say, "well, this is the least privilege; it just happens that I need root access to the whole system to get things working." So it's an unmeasurable statement as it stands. To break it down into use cases and best practices, I need to delve into it and see what comes out. Taylor and I spent a long while discussing this, and we've got notes, and those notes, well, I can share them if you want, but they're not terribly human readable, and they might be rude as well; I'm not actually sure what we wrote at this point. But the way we found this was leading was that the principle of least privilege is effectively a consequence of wanting your applications not to affect each other, if you've got more than one of them, and wanting your application to stand separate from the platform, so the application can't break the platform that it's running on top of, which it absolutely can with lots of the privileges you can give it.
And if you start with that as your reasoning, then the principle of least privilege sort of falls out of it: the more privilege you get, the more you can do things that you should under no circumstances be allowed to do. If I have access to host networking, I can basically break my worker off from the rest of the cluster, because I can break host networking. I can probably get into the management network, because I'm on host networking. These are the sorts of things that start to look dangerous. So from there you end up dropping out things like: this is a thing I should not ever do. And another thing, because we're in the real world here, not in the world that we might wish to be in: we know well that pretty much every CNF that exists in the world is not sticking to the principle of least privilege. It lays its hands on all the privileges available to it. CAP_NET_ADMIN is a particular favorite, to get the forms of networking they want in the absence of something that would allow them to do that without grabbing it. So this one is a procedural point, but I think what we have to work out as we come to best practice baselines is how somebody would document their compliance with that baseline. They would probably say: this is what I'm doing, and it's not compliant with all of the best practices, and I have an exception, and this is the reason for it. So when they have a non-compliance, they would want to document why they've got a non-compliance and maybe what they're going to do in the future to remove it. We should be thinking in those terms as we build best practices. We can't limit ourselves to best practices that everybody already follows. We're going to come up with best practices that people don't follow, and so we're going to have to give them a way of saying: I'm not compliant, this is why I'm not compliant, and this is how I'm going to become compliant.
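The "not compliant, why, and how I'll become compliant" idea could be recorded in a structure like the one below. This is only a sketch: the `ComplianceException` fields, the practice identifiers, and the three status labels are assumptions made up for illustration, not an agreed working-group format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ComplianceException:
    """Hypothetical record of a documented deviation from a best practice."""
    practice_id: str
    justification: str                      # why the deviation is acceptable
    remediation_plan: Optional[str] = None  # how compliance will be reached, if planned


def compliance_report(practices, followed, exceptions):
    """Classify each practice as compliant, a documented exception, or undocumented."""
    by_id = {e.practice_id: e for e in exceptions}
    report = {}
    for practice in practices:
        if practice in followed:
            report[practice] = "compliant"
        elif practice in by_id and by_id[practice].justification.strip():
            report[practice] = "exception"
        else:
            report[practice] = "undocumented"
    return report


if __name__ == "__main__":
    # Illustrative practice identifiers only.
    practices = ["no-host-network", "no-net-admin", "no-privileged"]
    exceptions = [
        ComplianceException(
            practice_id="no-net-admin",
            justification="Data plane component configures its own interfaces.",
            remediation_plan="Move interface setup to a platform-provided plugin.",
        )
    ]
    print(compliance_report(practices, {"no-host-network"}, exceptions))
```

The useful distinction is between a deviation that has been explained to the operator and one that has not; the sketch deliberately does not judge whether a justification is good enough.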
Anyway, the privileges we looked at, and Taylor's been kind of making notes, right? He basically said: I want my platform to have integrity independent of the apps running, and I want applications not to be... Again, if we're talking in the world of vendors, where I bought my applications, I don't want one vendor to point to another vendor as the reason why their application is not working. Yeah, so, what was I saying? So we went through some of the standard ways of using least privilege. One is, again, as I say, documenting your exceptions where you can't basically run with zero privilege, but we should explain what we do with that privilege. That would potentially give us ways of documenting best practices: if you're going to use CAP_NET_ADMIN, these are the rules you must follow, and this is how we will check that you are following those rules. As an example; a bad example, but it will do. When it comes to a few other things, I know we discussed root in containers. I know Tal has a specific interest in root in containers, because it tends to deny you the use of the root user in containers. Theoretically the root user in containers does not endanger other applications or the platform at all, so it's got a slightly different use case. The reason you don't want root in containers is that avoiding it limits the damage a compromised app can do to itself, or that a broken, as in flawed, application can do to itself. Unrestricted access to Kubernetes APIs is obviously dangerous. There are Kubernetes APIs that you probably wouldn't want an application to access. We might want to set some rules around this. The obvious one, to my mind, is when we talk about cluster-wide resource types; I think network attachments are probably the most dangerous. As an application, I don't have the context of how the cloud is connected to the wider network. So I can't just say "use the VLAN on this port" without really having that consideration.
There are certain VLANs on certain ports that are probably outside the scope of what I'm about to do. And there are certain attachment points that are probably designed for other CNFs to be using, which I absolutely shouldn't be fiddling with. So we need to figure out, for an application, which Kubernetes APIs it should be allowed access to, which it shouldn't be allowed access to, and what I would want to delegate on a case-by-case basis. For instance, again, access to an existing network attachment, so that you can attach a network function to it, would be an obvious thing to give to a network function: I want your input to be here, I want your output to be over there. I don't let you decide which VLAN that is; that's one I actually selected, a little like Kubernetes and like Neutron's networks. Once the network is made you can use it, but you can't make the network in the first place. We talked about CRDs a lot already.
A quick question about that, if I may. I feel like it's a separate aspect here. You're talking about privileges, but you're also talking about rights and service account RBAC roles. A privileged container doesn't have unrestricted access to Kubernetes APIs; that's unrelated, right?
But we're not talking about containers with what Kubernetes calls privilege. We're talking about the privileges in general that software has, which among other things include access to the Kubernetes API at all, and access to items on it. So I want to keep that to the smallest set that's meaningful and that allows me to do the job, and say: you shouldn't be doing this, because this is more privilege than you absolutely need.
Whether it's a privileged container, quote unquote, or not, we're just saying: whatever the application is, however it's running, try to limit the scope it has for affecting other apps as well as itself and the system. So, yeah.
Okay. All right.
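One way to picture "you can use an existing network attachment but you can't make one" is a check over RBAC rules. The sketch below takes rules in the shape of Kubernetes RBAC `PolicyRule` entries (`apiGroups`, `resources`, `verbs`) as plain dictionaries and flags write access to resource types the platform would want to keep for itself. The `FORBIDDEN` table is a hypothetical policy chosen for illustration, not a recommendation from the discussion, and the function does not talk to a cluster.

```python
WRITE_VERBS = {"create", "update", "patch", "delete", "deletecollection"}

# Hypothetical policy: (API group, resource) pairs an application may read but not modify.
FORBIDDEN = {
    ("k8s.cni.cncf.io", "network-attachment-definitions"): WRITE_VERBS,
    ("apiextensions.k8s.io", "customresourcedefinitions"): WRITE_VERBS,
}


def excessive_grants(rules, forbidden=FORBIDDEN):
    """Return (group, resource, verb) triples that exceed the policy."""
    findings = []
    for rule in rules:
        groups = rule.get("apiGroups", [])
        resources = rule.get("resources", [])
        verbs = set(rule.get("verbs", []))
        for (group, resource), bad_verbs in forbidden.items():
            if group not in groups and "*" not in groups:
                continue
            if resource not in resources and "*" not in resources:
                continue
            # A wildcard verb grants everything, including the forbidden verbs.
            granted = bad_verbs if "*" in verbs else verbs & bad_verbs
            findings.extend((group, resource, verb) for verb in sorted(granted))
    return findings


if __name__ == "__main__":
    read_only = [{
        "apiGroups": ["k8s.cni.cncf.io"],
        "resources": ["network-attachment-definitions"],
        "verbs": ["get", "list"],
    }]
    print(excessive_grants(read_only))  # no findings expected for read-only access
```

This only inspects the rules handed to it; aggregated roles and bindings at different scopes would need more work than a sketch like this covers.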
Because root in containers is a very specific aspect of privileged containers. There are other alternatives for providing capabilities to containers rather than outright privilege: you could use Linux capabilities. So again, that's least privilege instead of going completely privileged. Instead of setting privileged to true, you can ask for specific capabilities.
Absolutely. So, privilege. There are two parts to this. One is that the principle of least privilege literally means you have only the privilege required to get your job done and no more. But again, the use cases we might write up here are things like: privileges that exceed a certain threshold, that endanger the stability of the platform, for instance, should not be given to applications. I could turn that around: you only get the privileges you need to do the job, but you are explicitly not allowed to have privileges that stop other people doing their job.
Okay. The thing that makes me a little bit uncomfortable on this particular path is that the understanding in terms of least privilege is that the application says, these are my minimum requirements. So if you end up with a pod that says "I need capability NET_ADMIN", the amount of damage I could do with that, despite the fact that it's not root, it's not SYS_ADMIN, is absolutely immense, and it gets very close to having root access to the system. So I think from a guidance perspective, we cannot say "no, you can't have this"; we're not in a position to say we're going to block you from having it. But the guidelines would say: if you really need CAP_NET_ADMIN, the best practice should be to separate that component from the rest of your application and put it into an isolated area, so that the attack surface is significantly smaller.
So that when your system is compromised, not if but when, they don't get all these additional privileges simply because they happened to break in through the front door.
Very good point. I'll add that, if we could make suggestions upstream, this is a problem in Kubernetes: it's very, very coarse. If you open the gate, you open the gate for a lot of things. Even if you just want to enable ping from a container, you're going to have to have privileges that you can do a lot of damage with. There are solutions, things like Cilium and others, that let you have much more control over security in your container, but Kubernetes out of the box doesn't care.
Yeah, CAP_NET_RAW you can use to respond to ARP requests, to do ARP poisoning. You can do DNS poisoning, depending on your infrastructure, because it allows you to craft raw packets and listen promiscuously. So these are privileges that look small up front, but they actually have huge implications for an attacker. That doesn't mean that if you need CAP_NET_RAW you shouldn't have it, but if you do need it, you want to try to isolate it into a different location: this is my control part, and this is where most of my application exists. That's the dangerous part, right there, that I've isolated so I can better defend it.
And I think the point I was making about rules and exceptions is important here. I remember doing this for coding standards, of all things. Years ago I was in safety-critical software, and we had a bunch of coding standards, and you followed the coding standards unless you couldn't follow the coding standards, and then you wrote up an exception for the specific place where you weren't following the coding standard, and why, and why it was okay to do that in this one instance, and that was all perfectly acceptable. We can do the same thing here.
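The earlier points about asking for specific capabilities instead of `privileged: true`, and about isolating the component that needs something like NET_ADMIN or NET_RAW, can be sketched as a check over a pod spec. The code below works on a plain dictionary in the shape of a Kubernetes pod `spec` and does not contact a cluster. The `HIGH_RISK` set is an assumption for illustration, taken from the capabilities named in this discussion rather than from any official list.

```python
# Capabilities called out in the discussion; treating exactly these as high risk is an assumption.
HIGH_RISK = {"NET_ADMIN", "NET_RAW", "SYS_ADMIN"}


def risky_containers(pod_spec):
    """Map container name to the reasons it holds elevated privilege."""
    findings = {}
    containers = pod_spec.get("containers", []) + pod_spec.get("initContainers", [])
    for container in containers:
        security = container.get("securityContext") or {}
        reasons = []
        if security.get("privileged"):
            reasons.append("privileged")
        added = (security.get("capabilities") or {}).get("add") or []
        for cap in added:
            # Pod specs name capabilities without the CAP_ prefix; tolerate both spellings.
            name = cap.upper().removeprefix("CAP_")
            if name in HIGH_RISK or name == "ALL":
                reasons.append(name)
        if reasons:
            findings[container["name"]] = reasons
    return findings


if __name__ == "__main__":
    # Illustrative pod: most of the application runs without extra privilege,
    # and one small, separate container holds the dangerous capability.
    spec = {
        "containers": [
            {"name": "app"},
            {
                "name": "net-agent",
                "securityContext": {"capabilities": {"add": ["NET_ADMIN"]}},
            },
        ]
    }
    print(risky_containers(spec))
```

A report like this fits the exception process discussed above: it does not forbid the capability, it shows which component holds it so that the use can be justified and the component defended separately.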
I don't think it's possible to write, for instance, a best practice of the form "if you're going to use CAP_NET_ADMIN, this is how you use it". The only way it is possible to write a best practice is "don't use CAP_NET_ADMIN", and then an exception process that says: you get to explain to the person who's going to operate this network function why your use of NET_ADMIN in this one case is acceptable and not dangerous, and so on and so forth, using any of the suggestions that Frederick was making there, or any others; it's up to you. You aren't following the best practice, but you're at least explaining why.
So, just another point. I think Frederick pulled us maybe too far here. We're not talking about security, right? Attack vectors, things like that. The vendor is not going to attack anything. I think the principles here are about the words we mentioned: integrity, damage. I don't think security is the topic here.
It's a defense-in-depth thing, so it can be security as well, but it goes hand in hand with the fact that, frankly, software never does what you expect it to do. So if I hand a network function a bunch of privileges, I don't know what it's going to do with them. Either way, whether it's doing it because its code is broken, or whether it's doing it because someone's invaded it and is starting to try to dig for an escalation of the foothold they've already got, doesn't really matter, because the point is that the wall is effectively the same thing. I don't give you more rights than you need to do the job, which means that, were you to be compromised, there isn't anything you can do that is dangerous to other things I might be running.
Okay, I would like to point out that integrity and availability are straight up in the security domain as well. If you look at any security literature on what security is, I always point to the CIA triad, which is confidentiality, integrity, and availability.
And so these are very cross-cutting concerns, and of course you can have integrity and availability without focusing on a security aspect. In terms of that particular path, even just looking at the stability of the system, how robust the system is: if the system has the capability to use the guns on itself and its infrastructure, you're opening yourself up to more risk. So it's a principle where you have good cohesion and you also have good loose coupling, where you're coupled with something that has those privileges, and that thing that has high privileges can be better audited.
Right, absolutely. I'll just say, let's take into account that we're not discussing security in general here. Security is a much bigger topic than just this. As you said, it's part of the domain, but we're not really dealing here with attacks, right? Actual attacks are something we should discuss: what are the best practices for security in Kubernetes? That's a very, very big topic, and a lot of companies are working on it outside of telco.
We're specifically talking about one sub-area of security, the principle of least privilege. We're not taking on all of it. It's fine if other things come up in comments; there are a lot of other best practice ideas that we've noted while working on this one. But the focus for this is primarily on least privilege.
Correct. I'm just not sure I would even call it security, but okay.
Yeah. I mean, from the surface, again, a CNF that's gone insane and a CNF that's been compromised are very different things, obviously. But from underneath, a CNF that has gone insane and a CNF that's been compromised are going to try to do the same things they could do.
They will use whatever privilege they have in unpredictable ways, and so making sure that they don't have any more privilege than they need is a means to make sure that they don't start doing damage, or that the damage doesn't spread. Yeah, and typically when you start to get to the largest organizations, when they talk about security, what they really mean is risk management. So: what is the risk this thing is presenting to me? Is this a risk that I can accept based upon the business requirements, or do I need to do something else, like mitigate or eliminate or transfer the risk somewhere? So with code that is acting up, even if it's just a developer who makes a mistake and accidentally starts deleting databases, that is seen as part of the security domain within many organizations, or most organizations at scale, because it's considered to be part of that risk. And so, in the end, I think the security aspect does matter here. And we can tie this into how we scope the part that we're looking at, and say, hey, we're only looking at the principle of least privilege for this. And if it makes sense to slowly expand it out, we certainly can. Sure, so I'll add another point then: privileges having to do with writing to disk, which every container has. This is a topic that's very dear to my colleague Sean's heart right now: logging. In Kubernetes, it is not throttled in any way. You can think of it as a denial-of-service attack if that's what somebody wants to do, but a wild, out-of-control container that just logs too much can bring the whole system down. Yeah, I think that's an interesting category of isolation, perhaps, rather than principle of least privilege, but I absolutely agree there are certain things where Kubernetes does not constrain resources. We talked about etcd, which can probably be overloaded if you really put your mind to it.
You're absolutely right on the logging thing. I'm sure they're only two of them; I'm sure we'd come to others if we actually went looking. But yeah, where a resource is shared, but it's not shared with any degree of enforcement, you have a problem. For true isolation you need virtual machines; basically, containers were not designed for it. Kubernetes is designed from the point of view of trusting your applications, right? Well, whether it's designed that way or not, it's definitely used in the sense that a single application is running on it. So if the application breaks because Kubernetes is being abused, then it's still the application team's problem, because Kubernetes is a component that they're supporting. But that isn't the world we're expecting to move to here, because the application and the Kubernetes platform come from potentially two different sources. That's a point you've made a few times, and I really wonder how true it is that these teams should be separate. Well, so would you like to sell OpenShift to service providers? Me personally, sure, let's sell it. Anybody would buy it. Are you planning on writing the entirety of a mobile packet core? Because if you're not, then you can't sell the whole solution. You can only sell OpenShift. Well, I'm not talking about who's writing the code for the platform. It's who's actually managing it, right? But again, they'll come to you because you sold them OpenShift with support, and OpenShift broke because an application is doing something stupid. Why should that be on you? I don't really understand your point. Sorry. The point I'm making is that, as a vendor, taking one example of this relationship, and it's a perfectly good one: if I'm selling you a piece of software, I want it to break if I did something wrong. And I want it to break when I can fix it, because a support contract is a gamble. It's the gamble that you won't make so much work for me that I don't make a profit.
And in order for that to be true, the consequences need to be consequences I brought upon myself by writing bad code in the first place. So I have some control over the amount of support that you will ask from me. If we're saying that I can't support the platform without also knowing everything about the way the application is running on it, then I don't understand how we can ever make this work in the service provider space. So this is one of the ways in which we constrain platform teams to basically dig their own hole and not have other people dig holes for them. Well, we're not talking about PaaS, platform as a service, here. We're talking about ownership of the network. Even the title here at this point is platform integrity, but we're talking about CNF best practices to maintain platform integrity. Everybody's responsible here. You're responsible for the end-to-end of the network. So which parts of the software you call platform and which you don't, you know, it's software that you're using. It's all integrated. Yeah, I'm a service provider. I'm trying to run a mobile network and it breaks. Who do I call? Probably a lot of people, right? But the thing is that make wasn't true. I mean, in an ideal world, I know who to call. It's one team who's not living up to the responsibilities I set for them, which is why it's useful. Again, it may not be perfect, but it's useful to put boundaries between components so you can say: this is not doing its job, it's your problem, you fix it as soon as possible. The best resolution to what could potentially be an expensive network outage is that I point to the right person. The right person makes a one-line change in whatever they're responsible for, and the whole thing starts working again. The worst resolution is that I can't honestly tell which of the many teams that I'm working with is responsible for the problem.
And I have to call them up at 3am on a Sunday, and then they argue among themselves, trying to point the finger at each other. I get this sorted in a week, while I'm not making any money because my mobile network's down and the government's on my back. Right. There are certain expectations for what this particular platform provides, say, versus OpenStack, right? You just don't have the isolation. Containers don't really isolate. So it's a different kind of environment, I think, a different kind of platform. Yeah, and maybe we're not expecting it to be as robust and resilient as virtual machines are. But on the other hand, we get benefits: it's smaller, it's lighter weight, it's more affordable. And where we can set boundaries on this, we can improve the whole thing, right? It's not about perfect. It's about better than you might end up with if you didn't. Right. I know, sorry, I feel like we're really getting off track here. But my point is, I'm happy with where this started, where we talked about a principle of least privilege. It's a good principle. It has a lot of advantages. As for moving it into the general issue of security: security is a very important topic. I think we need to discuss it separately from this principle. There are a lot of issues having to do with that. They're not necessarily related to what the CNF developer can do. They're just problems, or challenges, I would say, with the technology itself. Well, all right, let's turn this around. The thing I said to begin with was that the principle of least privilege can't be a use case, and it can't be a best practice by itself. So you need a use case that justifies the things that the principle of least privilege applies to. What use case would you use? No, I think you're right. It's a general principle. I mean, there are use cases where this would come up maybe more than others, right?
Those use cases where you need privileges, right? Yeah, but I mean, it's not necessarily use cases where you need privileges. If you need privileges and there are no consequences to certain categories of privilege, because, fine, you can break the platform, but that's fine, then that's a use case where it actually doesn't matter if you have that privilege. But yeah, I don't want you to answer this here, but I want you to go away thinking about it. What best justifies using least privilege? What makes it the most advantageous approach? Well, I won't answer, but I'll ask the opposite question: cases where you do need privileges. Well, what then? Well, that's your least privilege, isn't it? I mean, that's the point. Right. Well, okay, least privilege, but there are still privileges, and they represent challenges. So the next step is really thinking: okay, what do we do with privileges, and how do we use those privileges responsibly? And my argument is that, in Kubernetes, you need to do everything responsibly, because the least privilege principle doesn't really protect you from that much, if you're thinking of it in terms of security. We're starting with the lowest level that we can agree on. That's the point. So if you said that you want to limit the privileges of an app to when it needs them, then you could talk about single-app integrity. If you have multiple containers, then maybe something's compromised, but the one that needed the NET_ADMIN privileges is completely separate. If you don't immediately get access, then it's not going to cause problems for the rest of that app. It also wouldn't cause problems for the other applications or the platform. Yeah, that doesn't fix other things. It won't fix all the other potential issues that are happening. We're not trying to do that.
We're trying to work through and say: if you follow this, then it will help in these situations. Yeah, and I think we should be careful not to propose or pretend that these are silver bullets. The principle of least privilege is a principle that should be followed for a variety of reasons, one of which is security related, with the goal of frustrating the attacker: where they gain access, they have to take yet another step to escalate or to transfer their access horizontally to another location, which increases the risk to the attacker. But it's not the only reason why we want to have least privilege. It's generally a good principle to have, but it needs to be applied across the board, while also understanding that there need to be other things. And this is the reason why, in OpenShift environments, SELinux is strongly recommended to remain on, so that when people get frustrated with SELinux they don't just say, "Oh, let's just go disable it," or disable it entirely through a security control policy or the SCC on that specific pod.
I think it's important that, even if they have privileges that are escalated, they get enumerated out: these are what the privileges are. Because in that scenario it also allows an administrator, and by administrator I don't necessarily mean the Kubernetes administrator, but more the administrative control and leadership apparatus within the company, to also work out: what systems do we need to spend more time auditing? Where do we need to spend more time looking? What systems may have issues? Things like disk IOPS and similar, yes, they can be negated through careful application of eBPF, or there are kernel parameters that can help tune that, but you ultimately need to have good observability somewhere. There are ways to bypass the kernel for IOPS, depending on how it's configured, where you might have a direct mode. So we have to be very careful in how we approach these guidelines, to say they're not perfect. Let me add to that, building on what Frederick mentioned last time and what Taylor just hinted at: the principle is not only least privilege, but also isolated privilege. Okay, so let's say we have the minimum amount of privilege required, right? It's not going to be zero. Some aspect might require privilege. We want that component as separate as possible, exactly to allow that to be a single point of failure, with observability in one location, etc. So that, to me, is the fuller expression of what this principle is about. It makes sense to call it something like the principle of least and isolated privilege, something like that. Well, don't join them together. Put them as two separate things: the principle of least privilege and the principle of isolation.
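As an illustration of the two ideas raised here, enumerating escalated privileges so they can be audited and keeping them in one isolated component, a minimal Python sketch might look like the following. It was not presented in the meeting. It assumes the pod spec is available as a plain dictionary using the standard Kubernetes fields `securityContext`, `capabilities.add`, and `privileged`; the names `ALLOWED_CAPABILITIES`, `elevated_privileges`, `is_isolated`, and the example pod are hypothetical.

```python
"""Sketch: enumerate elevated privileges in a pod spec held as plain dicts."""

# Assumption: capabilities that need no justification. Empty here, so every
# added capability is reported and has to be explained to the operator.
ALLOWED_CAPABILITIES = frozenset()


def elevated_privileges(pod_spec):
    """Return {container name: sorted list of elevated privileges}."""
    findings = {}
    for container in pod_spec.get("containers", []):
        ctx = container.get("securityContext") or {}
        added = (ctx.get("capabilities") or {}).get("add") or []
        elevated = sorted(set(added) - ALLOWED_CAPABILITIES)
        if ctx.get("privileged"):
            elevated.append("privileged")
        if elevated:
            findings[container["name"]] = elevated
    return findings


def is_isolated(findings):
    """True when at most one container carries elevated privilege."""
    return len(findings) <= 1


# Hypothetical CNF pod: only the dataplane container asks for NET_ADMIN.
pod = {
    "containers": [
        {"name": "dataplane",
         "securityContext": {"capabilities": {"add": ["NET_ADMIN"]}}},
        {"name": "api",
         "securityContext": {"capabilities": {"drop": ["ALL"]}}},
        {"name": "metrics"},
    ]
}

report = elevated_privileges(pod)
print(report)               # which containers need an exception and why
print(is_isolated(report))  # whether the privilege sits in one place
```

A report like this only lists what was requested; it does not replace the exception process or the observability discussed above.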
Again, I just want to point out that the principle of least privilege was the jumping-off point. It can't be a best practice, and it also can't be a use case. Use cases are likely to be related to it, so we worked this problem in lots of different ways: one is what reducing privilege would help with, from actual use cases, and the other is what typically applied rules derived from the principle of least privilege we see in applications. And then we were trying to join them together in the middle. So, you know, security is clearly tied to the principle of least privilege, and it's a use case, right? Somebody's invaded one bit of one CNF. What would you want to happen at that point? What could you do about that? You could make sure that that CNF doesn't have the power to break other things in its turn. Yeah, and this breaking apart of things makes a lot of sense as well. In enterprise systems you often see things like this: policy will say all data at rest must be encrypted. The standard will say we use AES. The procedure is that we use BitLocker on Windows systems to implement AES. And guidance will be: don't leave your laptop in the car. So we're finding the right level of what we want to call these things. It's like: principle, the principle of least privilege, yes; but how do we get to procedures? Again, I would also make it clear, in case it hasn't occurred to people, that a best practice isn't necessarily driven by a single use case or user story or whatever, right? Security dictates that we probably want to restrict privilege; stability dictates that we probably want to restrict privilege as well. So the best practice of how we restrict privilege is one thing. The reasons why we do it are another, and they could be many. It's perfectly fine to have that, right? One best practice for two reasons. Anyway, we will keep going with this.
We do rather hope to have a few outputs, in terms of user stories, use cases, and best practices, this week. We will try, based on prior experience, to make sure that these commits each live in a branch of their own, independent of each other, so that we can commit the ones that people like and continue to discuss the ones that people have issues with. Otherwise, we get into a logjam where we've got lots of ideas but can't get any of them into our documentation. But we will keep going with this, and your feedback's welcome, by the way; this actually helps a great deal. Thank you for your comment about auditing, Victor. I think we do have to weave that in somehow. I'm not quite sure how yet, but it's a perfectly valid thing to consider. Taylor, back to you. That sounds good. And we're out of time for anything else, at the top of the hour. The pull requests: we still have some open, including the glossary. I haven't checked, Tal, if you've gone through and responded to some. I think you were going to accept some and had some comments, but if folks can take a look at that. The other pull request is a use case for onboarding a CNF to a platform. Vook is going to be out for the next couple of weeks, but I think most of the stuff was pretty agreeable, so if we can get some plus-ones, then we can get it merged, potentially before he's back, unless there's anything that we want to update. So check those two pull requests out: the onboarding CNF one and the glossary from Tal. Thanks, everyone. See you next week.