Okay, welcome everybody, thanks for attending this talk. My name is Carlos Panato. I work at Chainguard, I'm a tech lead on SIG Release in Kubernetes, and I maintain other projects in the cloud-native community as well.

Hey, welcome everyone, my name is Ricardo. I work for VMware, I am one of the ingress-nginx maintainers, you can find me around SIG Network sometimes, and yeah, you can find me around messing with code and breaking things.

In today's talk we are going to try to understand the feature lifecycle in Kubernetes and how people can propose features and take them through the whole lifecycle. Before we start, we have a few questions. First: who here is attending for the first time and doesn't know anything about KEPs and the feature lifecycle? Just out of curiosity. Okay. Who is here with feature ideas and wants to know how to push them through the lifecycle? Okay. And the final one: who has already opened one or more feature ideas that got stuck somewhere in between, and is here because they think we are jerks? Yeah. Okay, thank you, everybody.

We are going to go over some examples and then explain how the KEP process works in real life. The first example is a feature idea for an easier way to roll out ConfigMaps. The issue has been open since 2016 and there is still no conclusion: the feature is open and not implemented. More than a thousand people have said this would be a good thing to have, but there is no concrete outcome from the idea.

Another one is simple startup ordering for containers, open since 2018. There are a lot of questions and a lot of comments: should we use a readiness gate? Some new ordering field? People keep giving ideas.
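Since the ConfigMap-rollout feature was never implemented, the workaround most teams use is to hash the config into a pod-template annotation, so that a config change alters the template and triggers a normal rolling update. A minimal sketch (names and hash value are hypothetical; Helm and Kustomize can automate this pattern):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        # Changing the ConfigMap changes this hash, which changes the pod
        # template, which in turn triggers a rolling update.
        checksum/config: "3e25960a79dbc69b674cd4ec67a72c62"
    spec:
      containers:
      - name: app
        image: registry.example.com/my-app:1.0
        envFrom:
        - configMapRef:
            name: my-app-config
```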
There are a lot of discussions and a lot of closed issues around these ideas. Another one, dual-stack IPv4/IPv6 support, was opened in 2018 and closed last year. That took, in quotes, "just three years": just three years for the full story to roll out with everybody agreeing. Sometimes it takes less time, sometimes it takes more, and sometimes it never ends, right? For example, Ricardo opened an idea for printing the HTTP liveness check, and it didn't go through; they just closed the idea. They closed my issue in my face, the maintainers.

Okay, there are lots of other examples; you can go to the enhancements repo and take a look. But why are some features refused, or why do they stop moving forward? If you want to open a feature idea, ask these questions first. Does it solve a wide problem, or a very specific use case, maybe only for your company? Does it raise or solve a security concern, fixing a security issue or closing the door on possible issues in the future? Does it fix a performance concern, some performance issue in Kubernetes itself that this feature would fix or at least improve a little? Will this feature be a breaking change for something in the future? Remember that no breaking change is allowed once you move to GA. And was this discussed before? What were the conclusions, are people already talking about these things, or is it a brand-new idea?

Okay, then let's meet the KEP. If you don't know him, this is Stephen Augustus. He was one of the people behind the implementation of the KEP process itself, and we like to make jokes and memes with Stephen as well. Okay, what does KEP mean?
KEP means Kubernetes Enhancement Proposal. It's a way to propose, communicate, and coordinate new efforts in the Kubernetes project. KEPs are themselves still in beta, but they are mandatory for all enhancements since release 1.14. You can scan the QR code to go to the page in the GitHub repo that has the whole explanation of what a KEP is and what the mandatory fields are. We will try to cover a little of that in these 30 minutes.

Okay, that's it for me? Yeah. Thank you. So first of all, we do discussions everywhere. In Kubernetes we have what we call a SIG, a Special Interest Group, and when you want to add something to Kubernetes you need to go to some SIG meeting and say: hey, I want to add this new feature. The SIG may say "yeah, we agree", or you can get stuck for six months bringing the discussion to every meeting, saying "hey, I want to implement this new admin network policy" (hey, Andrew) and staying stuck in the discussions.

So there are some things you need to understand when you want to bring something to the community. Is this thing an enhancement? Does it need a blog post? Does it require more SIGs: is it just something for SIG Network, or also network security, and maybe the API, or whatever? Does it redesign something? Does it need a big effort, does it impact user experience, and so on? And mostly: can people complain about it or notice it? If someone can complain about it or notice it, it's probably a KEP, not a bug fix or something like that.

Then we discuss, and we discuss a lot. Here is an example I brought because it's something I was close to for some time: the cluster-scoped admin network policy, before it was even called AdminNetworkPolicy. Who here has ever wondered why we don't have a network policy that the admin manages, instead of one per namespace? Okay, we have one too.
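The admin-managed policy we ended up with is the AdminNetworkPolicy API from the network-policy-api project. A rough sketch of one, based on the v1alpha1 schema (resource and label names here are hypothetical, and since the API is alpha the field shapes may still change; check the current spec):

```yaml
apiVersion: policy.networking.k8s.io/v1alpha1
kind: AdminNetworkPolicy
metadata:
  name: cluster-allow-dns
spec:
  priority: 10            # lower number = higher precedence
  subject:
    namespaces: {}        # an empty selector targets pods in every namespace
  egress:
  - name: allow-egress-to-kube-dns
    action: Allow         # admin rules can Allow, Deny, or Pass
    to:
    - pods:
        namespaceSelector:
          matchLabels:
            kubernetes.io/metadata.name: kube-system
        podSelector:
          matchLabels:
            k8s-app: kube-dns
    ports:
    - portNumber:
        protocol: UDP
        port: 53
```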
Okay, so the idea is: me, as an admin, I want to create firewall rules that, for example, allow my pods to reach DNS, and so on. It took just 670 comments to get this thing merged. Pretty fast, right? It took about a year and a half of people discussing how this could or could not be created. Then you get to the alpha implementation, then you discuss again, and you get to the beta implementation. And before reaching GA a feature can always be rejected, so it happens that you say "yeah, we want to implement this", and then: you know what, we don't want this anymore, we think it's going to have some impact. It's kind of frustrating, but yeah, it happens. And then we discuss a lot more, right? So the conclusion is: yeah, it takes time, because we discuss a lot. Thank you folks. I mean, it's part of the process.

Just one quick comment on when the discussion happens. As you can see in this diagram, before the alpha implementation there are a lot of discussions, and once people agree ("let's move forward, let's implement this feature") it goes to the alpha implementation. After that gets merged it rolls out in a release, and then it runs in clusters for a few releases. Then people start discussing how to move it from alpha to beta, and there are a lot of other discussions involved: will this require any change in the API, will it require any breaking change? From alpha to beta a breaking change is allowed; that's not a big problem. When you move from beta to GA there is another conversation, and a lot of other process around it, to say: okay, we need to stabilize.
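While a feature is alpha, it sits behind a feature gate that is off by default, so running it means opting in explicitly on the control plane. A sketch of opting in on a throwaway kind cluster (the gate name is hypothetical):

```yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
  # Alpha gates default to false; kind forwards this setting to the
  # --feature-gates flags of the cluster components.
  "MyAlphaFeature": true
nodes:
- role: control-plane
```

Create the cluster with `kind create cluster --config cluster.yaml`; on a non-kind cluster the equivalent is passing `--feature-gates=MyAlphaFeature=true` to the relevant components.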
In beta we stabilize the API and all the rest. Most likely in beta you are not going to have a breaking change, except maybe when moving from beta1 to beta2, but most of the time people try to avoid API changes there. Then, after a lot of other discussions, you move to GA, and it still takes a few releases to get out of the whole process. You write five lines of code and spend at least three months discussing something. Three months is me being really nice. Seems like the job.

So how can we get a feature into Kubernetes? This is something we see a lot: some folks just say "hey, I need this thing", open an issue, and keep poking the developers every month: "hey, is this ready? My boss is asking, is this ready?" Then you have folks who say "maybe we can fork the project", and if you fork the project you are going to have problems, because maintaining Kubernetes is not easy; the code base is huge. The real way to get there is discussion with the community. I decided to bring some real-life examples so you can see how this actually goes and what can go wrong, because it is not just me and Carlos saying "it takes time because we discuss a lot, you people at Kubernetes like to talk". That's not the case.
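Before the cases, it helps to see what a KEP physically is: a directory in the kubernetes/enhancements repo holding the design doc (README.md) plus a kep.yaml metadata file that the release team tracks. A rough sketch with hypothetical values:

```yaml
title: My New Feature
kep-number: 0000              # assigned when the tracking issue is opened
authors:
  - "@example-user"
owning-sig: sig-network
participating-sigs:
  - sig-api-machinery
status: provisional           # provisional | implementable | implemented | ...
reviewers:
  - "@a-reviewer"
approvers:
  - "@an-approver"
stage: alpha                  # alpha | beta | stable
latest-milestone: "v1.26"
```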
So we have the first case, which is: you figure out a way of implementing some feature and say, okay, whatever, I'm going to implement it this way, this seems like the right way of doing it, and then you figure out during the discussions that it may work, but probably not in the right way. The first KEP I implemented in Kubernetes was what we call the network policy port range. It was just a new field; everybody was saying "can you just add this new field to the API? You have the starting port, and you could have a new field going from start to end", right? It's a really common case. So the first idea was, in the NetworkPolicy specification, to add a new port-range field where you can say "from" and "to", plus exceptions. Pretty cool, right? Easy.

The thing is, while we were discussing it and saying "yeah, we can go that way", SIG Network asked: so, if my network plugin doesn't know about that field, are you just going to open all of the firewall rules for everyone? From the API's point of view it looks fine ("my network policy covers the whole range of ports, so it's working the way I expect"), but the plugin would be failing open. Based on that discussion we reached the conclusion that we should start simple and add just one new field, endPort. That way, if a network plugin doesn't implement it, it simply won't interpret the rule as a range at all. So it was maybe the right solution, but the original one was going to have some problems.
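The accepted design looks roughly like this: a single endPort field next to the existing port in the NetworkPolicy v1 API (policy and label names here are hypothetical). A plugin that predates endPort simply sees a rule for port 32000 alone, rather than opening everything:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-port-range
spec:
  podSelector:
    matchLabels:
      app: my-app
  policyTypes:
  - Ingress
  ingress:
  - ports:
    - protocol: TCP
      port: 32000
      endPort: 32768   # ignored by plugins that don't implement the field
```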
The second case is when you have a great idea in 2019 (I was like the guy on the left side of the slide when I started implementing stuff; I'm still pretty, but a bit heavier now) and in 2022 you are still discussing admin network policies or whatever other new feature. Some features take time, right? One of the examples was adding colors to kubectl. And I want you all to say it with me: that's "kubectl". Tim Hockin is not here, right? So that's "kubectl". Thank you. There has been an issue about this since 2018, and the question was: can we add colors to kubectl output, like describe and get pods? Sometimes you run kubectl get pods and you want to see in red that something is in CrashLoopBackOff and in green that something is Running, right? Cool. There are already some workarounds for this, but I don't want to install a bunch of workarounds like a kubectl plugin, I want it built in.

And then you start having discussions with the community. Can the colors impact scripts that rely on kubectl output? Imagine you are using kubectl in GitHub Actions, and the red color escape codes get interpreted as "delete everything in your cluster", right? It can happen. Can someone pick themes for the output? I like Dracula, probably someone else likes another theme. And what about colorblind people, who cannot tell the difference between red and green? You need to think about the accessibility of this, right? And finally, the most difficult one in Kubernetes today: is a maintainer really in it to maintain the solution long term? Sometimes someone just jumps into the repo and says "hey, here's the new feature, I can build it", and okay, the feature is fine, but when it has bugs, who is going to maintain it? So it becomes a discussion about whether it's worth having at all. There is an example of that: there is a work in progress (Eddie is taking care of it) for a new kuberc file. The question was "is this the best solution for that problem?" No, but it actually generated a new discussion on how we can make kubectl configurable for users, right? So, cool.

Okay, the third case is when something actually gets shipped: after four years of discussing, you are thinking "after all of those discussions I don't think this is going to work", but it works. Who uses distroless images in their cluster? Who doesn't know what distroless is? Okay, cool. Distroless is a container image that has nothing in it other than your program, whatever you want to run, right? It was made that way on purpose: you don't have curl, you don't have wget, you don't have bash, you don't have anything there. So it's kind of hard to debug. You try kubectl exec to get into it and it says: hey, I don't have a shell, I don't have a shell interpreter, nothing. So since around 2016 there has been a discussion on how we can rely on these really secure images, with nothing in them, and still debug this stuff, right? I still need to see if the network is working.

This is a really hard feature, because beyond saying "can we create a new kubectl debug command" there is a lot of machinery under the hood: you need a way for the kubelet to know it has to attach a new container into that pod, join its network, and so on. But it got merged. Let me see, where is it... yeah. It started around 1.16, in 2019, and it got stable in 1.25. That's nine releases, and each release is three or four months; you do the math, I'm bad at that. So that's three years, maybe four.

Okay, the last one, and I like this one. Who here installs Red Hat things? I mean Red Hat-based distributions, not necessarily Red Hat itself. Who has ever installed a Red Hat-based
distribution of Linux? And who does this as the first step? Me too, yeah. So Red Hat-based Linux distributions (I'm not talking about Red Hat specifically, but the whole family) have something called SELinux, which was developed by the NSA, is really, really hard to manage, and is enabled by default. The idea is that you get some security out of it, right? And the first thing people go and do is: "hey, I'm just going to disable this thing, because I don't want to manage security in my installation, whatever."

In Kubernetes we did something similar, which was Pod Security Policy. It is not enabled by default, but some admins enable it, and when they do, the easiest thing for a user to do is to disable it in their own namespace. A lot of people I know do that; you don't need to tell me. I can see my manager here... oh, he's there. Okay, sorry, I don't do that in our production clusters.

Kubernetes is not secure by default, and I want you to repeat that to yourself: Kubernetes is not secure by default, right? So the problem was: we need to establish policies that block pods from running with insecure configurations. I don't want any pod running with host networking, running as root, and so on. But PSP is really hard to configure on a workload; it depends on a bunch of knowledge, and it depends on the user. As a user, you probably don't want to know all of the permissions you need; you just want to know whether you can run in a secure way or not, right? At least that's what I expect: if I try to do something I'm not supposed to do (okay, I'm not supposed to run this as root), just drop that configuration. And still, with Pod Security Policy, you as a user can change the permissions and grant yourself something better. I don't think it's a good idea, from a security perspective, that I can change my own permissions. So it ends up being easier to just bypass it, like SELinux.

So Pod Security Policy, even after a lot of discussion, was deprecated, in 1.21 I guess, and removed in 1.25, right? It never reached GA, which is really important to note, because we cannot deprecate anything that reached GA unless there is another GA feature compatible with the previous one. It was beta, and even in beta the community decided: yeah, this is too hard, let's replace it with something else. So even after all of those discussions and all of those cycles in beta, we said: okay, let's drop this. There is a new feature called Pod Security Admission. It is configured on a namespace basis (you label the namespace with the level you want, for example pod-security.kubernetes.io/enforce: baseline) and it applies to all of the workloads there. The community decided to make it simpler, right? You cannot configure a lot on it, but at least you get some security baselines. So after all of those discussions we decided to go this way: not leaving the user a lot of choices, but giving them something better than we had before, and allowing them to be a bit more secure.

Okay, do you want to finish, or shall we do it together? How much time do we have? Okay. That was pretty fast, sorry. So, some conclusions. First of all, features are hard, right? I've been in your spot with Kubernetes as well, thinking "hey, this is easy, I don't know how to code, but it's probably just three lines of code, stop being jerks". Come on, folks: they are hard, we need to discuss a lot, and the Kubernetes code base is huge. That's something really big, and we cannot test everything in Kubernetes today. A new feature means testing all of the scenarios, checking whether something new breaks something old, and with all the combinations you may end up with a test matrix running for maybe a week. So it's hard to test everything, and if something breaks, people are going to be mad at us, right? "Hey, why did you ship this change that broke my
workload? It was working fine, and now I cannot migrate anymore." So we need to be extra careful when changing anything. And the last one: sometimes we just don't have people. We need to implement and follow the whole process, and yeah, it's tiring. It pays off, but it is tiring. I can see some Kubernetes maintainers and members here, and I can speak for myself, and I know the others feel it as well: sometimes it's just tiring. You start implementing, you need to discuss a lot, you have your day job, and not all of us are paid to work on upstream. People need to understand that. And if you are willing to have some new feature, please join us: we need people to help us in the discussions. And it's fun, actually; the discussions are great, we make a lot of jokes during the meetings.

Some takeaways. Don't give up: Kubernetes needs ideas. We are here to tell people why it takes time, but we still cannot know everything that everybody needs, and maybe we are missing some really required feature. The execution is sometimes hard, but it pays off. Take a look into past enhancements; we left some QR codes, and we are going to upload the newer version of this presentation. The old one is already deprecated, it didn't reach GA. Maybe this one is GA; I'll find out from the questions afterwards. Explore the features and track new ones that are in alpha: we need feedback even while they are alpha. If you want to jump in, say "hey, I saw this in alpha, I have a test cluster, I can run it there, I don't need to keep an SLA, so I can just add this to my cluster and my users are not going to complain". You can do that. Production clusters too; just don't tell your boss. Don't be shy, bring any ideas; there is no right or wrong. Maybe you have a new idea and you think "yeah, this is stupid, I'm not bringing it up". You can ping me on Slack, you can ping Carlos on Slack, you can ping a lot of us on Slack if you don't want to speak publicly, but please bring the ideas. If you think it's worth implementing, raise the issue and bring it to the SIGs as well. And understand the reasoning if someone says "no, we are not going to implement that". That's not because we are jerks; it's probably because we had some discussion in the past and saw that this would bring more problems than solutions to the Kubernetes code. But we are nice people. 99% of us. Not Carlos.

And Kubernetes is not made only of code: you can propose improvements in the docs, and give feedback on your own experience as a contributor as well. If you start contributing and you think it's too hard to contribute, we need feedback on all of those things too. So feel free to reach out to us on Slack. We sometimes take a while to answer, but at least we try to answer everybody.

I want to mention one more example, one that doesn't touch the Kubernetes code base itself. As part of the Inclusive Naming working group, we are proposing to change the kubernetes/kubernetes GitHub repository from the master branch to the main branch: not calling it master anymore, but main. Just to give you an idea about something that doesn't touch the code base: we have been discussing this since last year and we haven't implemented it yet, because it will affect not only the entire Kubernetes organization but also downstream users, the people who consume Kubernetes and build Kubernetes on their own. We sent out a survey last year and got replies from a few people at different organizations; we decided not to move forward at that time because we got only about 10 answers, and I think there are many more organizations that build Kubernetes on their own infrastructure. So what looks like a simple rename is not that simple, and inside the Kubernetes organization we need to scan the entire
repositories, jobs, Prow, everything, to know what relies on the master branch name before we change it. We did a lot of work. This would be just a normal rename elsewhere, but in the Kubernetes organization it's a huge thing; it's not that easy. In GitHub it's just a button? Yeah, right. It doesn't work that way. Okay folks, I think that's all for now. Thank you everyone, and we have time for questions. Karen is around with the microphone if someone wants to ask a question. Andrew cannot ask questions.

Thanks. I was just wondering what your thoughts would be about having either a product manager, or product-management principles, included in this type of process, and whether you think that would work, wouldn't work, or how it might best be utilized.

Let me see if I understood the question: it's about including project managers... no, product managers? Yeah. So, we do have some people involved in the community who are program and product managers. For example, in SIG Release we have Lauri, and she is working to create a vision and a roadmap, and we defined all of that at least for SIG Release itself, but it impacts the other SIGs as well. I would like to see more product managers and similar roles running around and helping the other SIGs; I think that's a nice thing to have. But again, we need those people to jump in, show up, and participate. So the best way is just to jump into a SIG and see what you can do from there. Great, sounds good, thank you.

Great talk, guys, that was really awesome. I have a question. You had a slide with the flow chart that said discuss, discuss, discuss, discuss, right, in a constant feedback loop. Do you think there is a way, or a better way, to update that process to make it more exciting for contributors who want to get involved? We're contributors, right? We want to write code, we don't want to sit and write docs. Do you foresee some
optimizations to that process, the KEP process, that could help?

Do you want to answer? Okay. So, I am someone who... it's not that I don't like the KEP process, but I got burned by it as well, as you did, Andrew, right? We have been dealing with a lot of KEPs, and sometimes you just get tired. I think one thing that could actually make KEPs more exciting is moving the discussions away from being quite that formal, and creating more opportunities for contributors to get together, build whatever they think is right, and then show the result. But that's not about the process; it's more about how people deal with it, right? I don't want to keep writing documentation and discussing for a year and a half; I just want to prove my point. So I think it's more about contributors changing their own mindset, going ahead, and proposing to the tech leads: "hey, we did this". We have seen a lot of this happening in SIG Network recently: people just say "we built this; if you don't want to use it, we are going to make it our own project". Maybe that's a way to go, right? Totally. Great, thank you.

Is the KEP program, if there is a program, measuring the actual tech time versus cycle time versus interactions? Because at some point it doesn't look like it's going to take less time to get features in; it's going to take more time. In the manufacturing industry this was figured out: they measured lead time and tech time, and when you do process analysis like that you can understand where the time is actually being spent. Is it just waiting? Waiting for someone to comment, waiting for someone to write code, the writing of the code itself, waiting for someone to approve, right? So I think, folks, if this were gathered more deterministically, to identify where the actual wait time is in each of those processes... because it might be three years, but it might be three months of work and everything else is waiting, right? Do you see anything being done in that sense? Did I make my point?

The answer is no. It's fine. Yeah, we need to remember that a lot of the people who are tech leads or are involved in the community are doing this in their free time, right? And sometimes the only free time I have to review a KEP, take a look, or make a comment is going to be in two or three weeks, on Sundays. Yeah, on my Sunday, and I don't want to do that on my Sunday, I want to do the next thing. I think we should track more and have better metrics on those things, and then maybe, as we said, have some product managers who can track and push things. But I think we need to remember that most of the time people are doing this in their free time, not on a company basis; there are a lot of people whose work here is not backed by a company. Anyone else? Thank you folks. Thanks so much.