Hello everyone, and thank you for being here. This is the security working group update at Kubeflow Summit / KubeCon 2024. I'm Diana Atanasova, software engineer at VMware by Broadcom, and today I'm here with Julius von Kohout, who is a freelancer and a DHL employee and the leader of two Kubeflow working groups, security and manifests.

For the last two releases, the Kubeflow project has had a dedicated security working group whose main goals include defining clear policies and procedures for reporting and disclosing vulnerabilities and, of course, enforcing security best practices across all working groups. We welcome new contributors, we offer mentorships, and here on this slide you can find useful links to our Slack channel and meeting minutes, so feel free to reach out to us. For more technical details you can check out our previous KubeCon and Kubeflow Summit presentations.

Now, regarding security requirements: there are the official CNCF guidelines, and even the CNCF graduation criteria contain a lot of security-related requirements. We are still examining them and going through the checklist with the CNCF, so that's work in progress. But in the meantime, let's talk about architectural issues, because there's one major one: multi-tenancy, especially within Kubeflow Pipelines. The case is that we have zero multi-tenancy for object storage as well as zero multi-tenancy for metadata storage, which means these storage systems within Kubeflow Pipelines are simply not isolated per user, and that is of course a severe problem for our enterprise users. There are some downstream solutions, but it's not yet properly implemented upstream. And last but not least, we also have to improve our automatic CVE scanning and updating of vulnerable dependencies, because at the moment it's still a manual process.
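As a rough illustration of what automating that scanning step might look like, here is a minimal sketch that aggregates per-severity CVE counts from a Trivy JSON report. The field names (`Results`, `Vulnerabilities`, `Severity`) follow Trivy's JSON output format; the helper name and the sample report are invented for illustration and are not part of the Kubeflow tooling.

```python
from collections import Counter

def summarize_trivy_report(report: dict) -> Counter:
    """Count vulnerabilities by severity in one Trivy JSON report.

    Assumes Trivy's JSON schema: a top-level "Results" list whose
    entries carry a "Vulnerabilities" list with a "Severity" field.
    """
    counts: Counter = Counter()
    for result in report.get("Results", []):
        # Trivy emits null instead of an empty list for clean targets.
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts

# Tiny hand-made report, standing in for real `trivy image -f json` output.
sample = {
    "Results": [
        {"Vulnerabilities": [
            {"VulnerabilityID": "CVE-2024-0001", "Severity": "CRITICAL"},
            {"VulnerabilityID": "CVE-2024-0002", "Severity": "HIGH"},
        ]},
        {"Vulnerabilities": None},
    ]
}
print(dict(summarize_trivy_report(sample)))  # → {'CRITICAL': 1, 'HIGH': 1}
```

Running such a summary per image, per working group would produce exactly the kind of severity table shown on the next slide.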
Now, talking about CVE image scanning: this table shows the results from our last CVE scan. It contains the number of images per working group, with their CVEs broken down by severity. We've already made a great improvement in lowering these numbers, but as you see there is still more to do. Most of these CVEs actually come from our external dependencies or from the underlying operating system and can't be addressed by upgrading to a newer version or rebasing the image.

There's one major change coming, and it's about replacing the OIDC AuthService with oauth2-proxy, so exchanging our authentication system within Kubeflow. It's finally going to happen in Kubeflow 1.9, especially with proper token-based machine-to-machine authentication, so no ugly authentication hacks are needed anymore. Then we also have network policies as a second layer of defense in front of Istio. If the KSC, the Kubeflow Steering Committee, or the Technical Oversight Committee agrees, we might also enable them by default as a second layer of defense. The third major area is rootless Kubeflow, because by default we still use root containers, especially in user-controlled namespaces, and this not only violates Kubernetes best practices, it also makes exploitation very easy. There is therefore an optional Istio CNI proof of concept available in the manifests repository; please try it out and provide feedback, because it might allow us to get Kubeflow 99% rootless in the future. As soon as this is implemented, we will go to the next step: enabling Pod Security Standards, first with warnings and later with enforcement, to block dangerous containers by default. So as you can see on the left-hand side we've already achieved quite a lot over the last few releases, but today I want to focus on the right-hand side, where you can see that rootless containers, automatic CVE scanning, and some KFP UI issues are actively being worked on, while the KFP denial-of-service attack is finally solved.
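As a sketch of the network-policy idea described above: a default-deny ingress policy in a user namespace that only admits traffic arriving via Istio. The namespace name is a placeholder, and this is an illustrative example, not the actual policy shipped in the manifests repository.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-with-istio-ingress
  namespace: kubeflow-user-example-com   # placeholder profile namespace
spec:
  podSelector: {}          # applies to every pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              # only allow traffic that enters through the mesh
              kubernetes.io/metadata.name: istio-system
```

Because network policies are enforced by the CNI rather than by the Istio sidecar, they still hold even if the mesh layer is misconfigured, which is what makes them a genuine second layer of defense.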
That's quite important for large enterprises which want to scale to tens or hundreds of millions of runs in the KFP database, and that is finally possible. But the three major problems regarding multi-tenancy (ML metadata, artifact storage, and namespace-sharing multi-tenancy) are still pending, so we're still working on them and looking for volunteers. So thank you very much for this high-level overview; there are more technical talks available as well. Please rate our talk and reach out to the security working group.

Are there any questions for Julius and Diana about security or anything related to the security working group?

Thank you for your update. I have a question on the, sorry, NetPol enforcement you were mentioning. Is that targeting the Kubeflow control-plane components, so the pipeline namespaces and all of that, or is it targeting the end-user namespaces? Because it would be interesting to understand if you are providing a baseline of controls which gets applied to all your different user namespaces, regardless of which workloads they're running.

Well, it's definitely possible to do both. At the moment, what I have upstream in the manifests repository under /contrib (network policies) is just for the Kubeflow core control plane, but you can also easily extend them and create them by default in every customer or user namespace to protect the user namespaces as well.

And a very quick follow-up on this: with regard to the PSS enforcement, which you mentioned will come down the line, is this also going to apply to the core namespaces only, by default, or is it going to apply to the user-managed namespaces as well?

Are you talking about rootless containers?

Well, no, I'm talking about Pod Security Standards.
Ah, Pod Security, yeah, okay. Pod Security Standards: of course we can easily enforce them in the Kubeflow main namespaces, that's already possible now, but we also have some PRs open to make it possible to easily specify the Pod Security Standards (baseline, restricted, and so on) per user namespace as well. Those PRs are still pending.

Thank you.

And we're also working with Solo.io, the company behind Istio, on ambient mesh, because this would be the next step after Istio CNI.

I have a question specifically about KServe. It seemed that that one was pretty hot as far as the CVEs, and I wonder how much of the security issues that we saw on the graph regarding KServe is because KServe is by nature serving models out in the open, and how much of it is core to KServe itself. One of the biggest things we get asked about a lot by security is KServe specifically, probably because we're serving models. So I want to know the breakdown between the serving part versus issues that are not related to incoming requests and that sort of thing. Does that make sense?

Can you rephrase it a little?

So how much of the security issues are because you're serving models over HTTP, which comes with some risk, and how much of it is related to things unrelated to that, to the way KServe is architected itself?

I would have to look into a list of CVEs to provide you exact details. There's a script within the manifests repository that can help you extract the images per working group; then you can run the images through a security scanner such as Trivy, you get the list of CVEs, and then you can check yourself.

Okay, yeah. Keep in mind that while KServe is a core component of Kubeflow, KServe itself does not belong to the
Kubeflow project; KServe belongs to the Linux Foundation and their AI group. So while it is a core component, we don't have direct control over that piece of it. Does that also help explain some of that?

We do have a break now, a 15-minute break, but if anybody wants to continue with some hallway-track talks or anything, we'll be in here, and then we'll start back again at 15 after. So please come back, or continue the conversation with us while we're in here.
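Coming back to the Pod Security Standards discussion from the Q&A: per-namespace PSS is configured with standard Kubernetes namespace labels. The sketch below uses a placeholder profile namespace and follows the warn-first, enforce-later rollout described in the talk; it is an illustration, not the pending upstream implementation.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: kubeflow-user-example-com   # placeholder profile namespace
  labels:
    # start in warn/audit mode so existing workloads are not broken
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
    # once the workloads run rootless, tighten to enforcement:
    # pod-security.kubernetes.io/enforce: restricted
```

In warn mode the API server only reports violations on `kubectl apply`; flipping the commented `enforce` label on is what actually blocks non-compliant pods, which is why rootless containers have to land first.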