There are three stages in the contribution ladder. Members are individuals who have been contributing to the Kubernetes project and its code base for a while. Reviewers are experienced members of the community who have demonstrated a deep understanding of the code and have been actively reviewing and approving code changes to ensure they align with the goals of the project. Approvers are the most experienced members of the community. Through their consistent contribution and effort, they have earned the trust of the community and the right to approve code changes; to get your code merged, you have to convince them, and make sure they are happy with the direction that particular change is taking. In order to merge a particular PR, you have to go through a code review process. The PR must pass all the automated tests: unit tests, integration tests, and end-to-end tests. These tests ensure that the code is functioning and is not introducing any regressions. In addition, you have to get the approved and lgtm labels. This ensures that multiple SIG stakeholders take a look at the code and provide their seal of approval. The PR must be reviewed by at least one member of the community to ensure that the code is well written and meets the project's guidelines and standards. Once a reviewer has indicated that the PR looks good to them, the PR must be approved by at least one Kubernetes approver. This approver has the authority to approve the code changes, and, as I said previously, you have to convince them that the changes align with the project's overall direction as well as the SIG's. It's important not to assume that reviewers and approvers are always available. They are often very busy members of the community, and it is prudent to coordinate with them over Slack or SIG meetings.
We just discussed the merge process of a PR, but that's overly simplified. It is useful to know that when you're making small changes like bug fixes, changes to end-to-end tests, or cleanups, starting with a PR itself makes sense. However, if you're contributing a feature, directly submitting a PR is not the right starting point. It's important to familiarize yourself with the contribution process and make sure you understand it as a whole, because it's a structured process designed to facilitate the proposal, development, testing, and final release of a feature into the Kubernetes platform. This process starts from an unstable alpha, goes through the feature-complete beta state, and finally reaches the stable GA state. To simplify, there are three different repos where you have to make changes. The first one is the enhancements repo. This is the repo where you submit a particular proposal. Then you have the set of PRs you submit to the Kubernetes repo, often referred to as k/k. And you finish up by submitting your changes to the Kubernetes website repo, which holds all the documentation-related changes and feeds the Kubernetes website. An enhancement issue is used to track the progress of the entire feature. This issue is referred to by the release team, and it's important to keep it up to date, as that's how they track and make sure that the feature becomes part of a particular release. This diagram shows a summary of the contribution process throughout its entire life cycle. The enhancement issue should be updated at every stage, so that from the first time the feature was introduced, every update made to it is reflected in the enhancement issue. So what happens at the pre-alpha stage? At pre-alpha, you should determine whether the change you're making is significant or small.
If the change is small enough, you can go ahead and submit a PR, perhaps after a discussion with the community via Slack or a SIG meeting. But if the change is significant, it means you have to write something called a KEP. It is important to check with the SIG whether writing a KEP makes sense or whether the change can be made without one. So what is a KEP? KEP stands for Kubernetes Enhancement Proposal. It's a design document that captures the proposed change and the proposed solution for it. It includes a detailed description of the change, the rationale behind it, the design and implementation plan, as well as the production readiness of the feature. It has to be reviewed by the sponsoring SIG and, at times, another SIG that is collaborating on that particular feature. What are good KEP practices? It is highly encouraged that you get involved in the community early. This can be done by clearly articulating the problem and the proposed solution in a Google document, sharing it via the SIG mailing list, or presenting it at one of the community meetings. This is a great way to identify other contributors who are interested in the same problem or use case and are potentially interested in collaborating on the contribution. Once there is general consensus, we move towards writing a KEP. The Kubernetes community provides a template that can be followed. It has a set of questions that you have to answer to ensure that you meet the criteria a particular enhancement proposal requires. An important thing I'd like to highlight here is the production readiness review. This is a part of the KEP template that we have to fill in to capture the impact of the feature: its impact on the scalability, reliability, and performance of the system, and on the overall user experience. This will be covered in more detail in the subsequent slides.
As a quick tip, I'd like to mention that it's important to gather feedback early and at every stage. This is what helps drive consensus within the community. Scoping a KEP can often be very tricky. There's no one-size-fits-all; KEPs can be of different sizes. It's important to ensure that the scope of the KEP is well defined and feasible, and that it addresses a real-world problem. This is what helps increase the chances of the proposal getting accepted and adopted by the Kubernetes community. So what should you expect out of the KEP review process? It is a collaborative process designed to ensure that the proposed changes, including API changes, are well designed and align with the overall goals of the project. It is important to be responsive to feedback and be willing to make changes. That could mean changing the scope of the KEP, or postponing some components of the proposal to a later stage, perhaps to a later KEP. It is all an iterative process. So how did we do it? One of the first features we contributed to Kubernetes was the CPU Manager extension to reject non-SMT-aligned workloads. We started by articulating the problem statement and proposing a potential solution, which was to create a new policy in CPU Manager. Previously, we had the CPU Manager policies none and static. When we had a discussion with the community and spoke about our potential solution, we got input that there are other potential use cases and scenarios where we would want to modify CPU Manager behavior as well. That's when we realized that we had to come up with a better construct for capturing other potential changes. So we came up with CPU Manager policy options, which paved the way for many such policy options in the future. Another such proposal was related to the topology-aware scheduling use case we were working on. We wanted to determine a way to get allocatable resources from the Kubelet.
In order to do that, we thought we'd introduce a new endpoint to capture this information. But after discussion with the community, we realized that there was already an existing endpoint that could be leveraged for this with some minor modifications. So what are the alpha essentials? At this stage, we've managed to get our KEP merged, and we need to work on the implementation details. Identify the key audience members. When I say audience members, I mean those who are going to be reviewing your changes and your PRs: developers, reviewers, and expert users who have feedback on that particular feature. Again, iterating on the solution is extremely important. Coming up with a minimum viable product as early as possible, with some tests, always helps. As we go through the contribution process, we keep filling the gaps depending on the feedback we get from the reviewers as well as the approvers. In terms of alpha stage preparation, it's important to leave time for API reviews. API reviews can often be a very grueling process and require careful consideration and addressing of feedback. In addition, keep in mind that we might have to circle back to the KEP, because the implementation might have slightly diverged from what was captured in the KEP. So how did we do it? After we had reached consensus and it was clear that we had to introduce CPU Manager policy options, we went ahead and wrote a minimum viable product with end-to-end tests, went through the review process, and made changes. With the addition of end-to-end tests, there was enough confidence in the feature, and it made it into alpha. So once we have the proposal merged and the relevant PRs merged, the feature has made it into the alpha stage, and it's time to celebrate.
The next step is to keep iterating on the feature and making sure it moves through to a more stable stage. My colleague here, Francesco, is going to walk us through how the feature goes from beta all the way to GA. Thank you, Swati. So we made it: our feature is alpha, and now it's time to plan for the next step and consolidate it. The next logical step is going to beta. So what should we expect from beta? One of the key factors I would like to highlight is that features in Kubernetes are guarded by feature gates, which basically control whether they are enabled and in which cases. In the alpha stage, features are disabled by default, so people interested in trying them out or experimenting have to opt in. When a feature is promoted to beta, that gate is turned on by default, so everyone is exposed to it. You still have the option to turn off a non-stable feature in your cluster, or, if you're running into issues, you can disable it. But you need to be aware that the feature gate is enabled by default and your code is now exposed to a much broader audience. This, however, changed in the 1.24 cycle for REST APIs: beta REST APIs exposed by the API server are no longer enabled by default. So if your change involves REST APIs, you may want to check out the changes introduced in the KEP linked here and make sure you are prepared for them. Another very important thing to highlight when we plan the promotion of our feature to beta is that we'll need to make sure the API review is addressed, because it becomes more detailed as the feature matures, and that the production readiness review, which we have mentioned already a few times, has been performed and all the criteria are met. And when we go beta, we really need to have coverage from end-to-end tests, to make sure that the behavior we're introducing or changing is tested and doesn't break in subsequent releases.
At each stage at which we move our feature forward, we mature it. We need to make changes to the KEP, which documents the state of the feature, and make sure that state is updated. At the very least, we need to update the latest version in which we worked on the feature. But we also need to address a bunch of questions regarding production readiness, which boil down to the robustness of the feature and how it fits into the larger project ecosystem. And since we are changing the KEP anyway, it's a very good chance, and we actually encourage you, to re-evaluate the KEP, making fixes and addressing the feedback gathered so far. The thing I really want to highlight is that when we promote to beta, the feature is recognized for its maturity, scope, and relevance, and so the feature gate is enabled by default. This is a recognition of the maturity the feature has gained. But it is also a responsibility, because now everyone is potentially exposed to the feature, and we need to make sure the stability and quality criteria are met. And of course, we build on those criteria and those items to make sure we end up with a good GA feature. Making sure that those criteria are met and that the stability and quality of the feature are up to expectations is what the production readiness review is for. So it's time now to really spend a few words on the production readiness review. What is this about? The production readiness review is basically a way to make sure that our feature, the work we are doing, conforms to the standards we all expect of a Kubernetes feature. That review is done by a different set of people than the reviewers coming from the SIG that is sponsoring and assisting you in making a change and introducing your feature. And this different set of reviewers looks for a different set of things.
All of this goes side by side with the growth of the feature from the SIG's perspective: those concerns are addressed while the feature grows in the direction set by the SIG. And, as you would expect, the more the feature matures, the higher the requirements become. What are those requirements about? They are about the things we want from our code. For example: are there any upgrade constraints, so that you need to meet some criteria to upgrade? Are any extra dependencies added? How do you roll back, should you need to? Are there any constraints or issues you can run into? How does our feature scale? Does it impose extra requirements on existing scalability? Was scalability addressed when we designed and implemented the feature? How can we monitor it? How can we make sure it works at all, or is working for us? Are we introducing new dependencies on external components, or between existing components of Kubernetes? All of those things and more are concerns of the production readiness review. The very helpful thing is that all the production readiness questions and concerns are part of the KEP template, so we can review them ahead of time, plan for them, know in advance what's expected from our feature and our code, and incorporate them into our planning and our work ahead of time to make sure we are well prepared. In our case, when we graduated the CPU Manager extension we are talking about to beta, one thing we want to highlight is that promoting those extra knobs, those extra options to the CPU Manager component of the Kubelet, in the existing graduation flow used to mean that each of those fine-grained knobs should be guarded by its own feature gate.
But during the production readiness review, engaging with the reviewers, we figured out that while this follows the process, it is too fine-grained. So we thought about a different approach for this specific use case. We basically figured out that we can turn the tables and say: we have a set of feature gates that grade the maturity of the knobs, and each knob moves through those stages, instead of the other way around. This serves this use case better, and it emerged during the review. This novel approach was a very useful by-product of the production readiness review. Another thing I want to highlight is that we leveraged the fact that we had added end-to-end tests as early as possible to gain confidence in the soundness and robustness of the feature, and during the beta graduation we improved that end-to-end test support and built on top of it. In the other use case we are describing, the pod resources endpoint used to gain insight into how the Kubelet assigns resources to workloads, the beta stage was about gathering feedback, addressing implementation issues, and iterating on the existing test suite, growing it as much as we could to make sure all the use cases were covered. So, kind of straightforward in that case. When we meet all the criteria, our feature moves to beta, and we can be happy and celebrate a bit, but not for long. Afterwards, we start planning to move it to GA and make it stable and available for everyone. What should we expect from GA? Basically what you would expect from any mature software component, in the Kubernetes-specific case. We want to highlight that a GA feature is always on for everyone. Of course, you expect a GA feature to be stable and to have no ill effects, and you expect to easily detect: how is this working? Are there any errors? How do I know this is working on my cluster?
How do I know this workload is consuming my feature, if it's needed? You expect a smooth path for upgrade, with no issues, or at the very least with issues well documented, and you expect that if the feature is deprecated for whatever reason, there is a path and a way forward to cover your use case. And of course, we also expect guarantees about the behavior, and the end-to-end tests we needed to add in beta are more and more important now, because they check the behavior of the feature; we're going to double-click on them in a few slides. Again, when we change the maturity level of a feature, we update the KEP to reflect the latest version in which we changed our code and to address the latest batch of production readiness questions, to make sure this is really stable and really meets expectations. And again, if there is any feedback to be incorporated or any changes to make to align with what happened so far, it's a great chance to make them. We mentioned end-to-end tests a few times, so let's spend a few words on them. A quick primer: end-to-end tests basically test the observed behavior of the system, kind of like a black box. You have your system under test, you send it input very similar to what a user would send, and you observe the output in terms of the observable behavior of the system. These are the closest to the user flows, and this is why they are so important and required to actually graduate, and to have a really good pulse on how the feature is doing. But because they test the full system, you may need a full cluster to test your feature, and in CI and test environments you may need to bring up a full cluster with all the components, so that is expensive and can be fragile in a CI environment.
So the trade-off here is that we need those tests, we want those tests, to make sure the feature is behaving as per spec, but they are costly. So there is a trade-off to be made about how early you introduce them (from our perspective, I'd say the earlier the better) and how many of them we want, because, again, they are costly; they sit at the very top of the test pyramid, although the top can be a flat top, let's say. End-to-end tests are so important, needed for beta and even more so for GA, and since we depend on them, the worst thing that can happen is for them to be flaky. What's a flake in the context of an end-to-end test? A flake is an intermittent failure: you run your test in CI, red; you run it again, no changes, and it becomes green again. So what happens with what we call a flake is that all of a sudden you don't know anymore: was the test environment somehow broken or badly set up for whatever reason? Is there a bug in the test? Is there a bug in the code? You don't know. Flakes add uncertainty, and this is why they're bad for the signal about the quality of a feature, and why it's so important to invest in keeping these tests healthy, making sure we don't have flakes and can trust them. It's a challenge, of course, everyone knows that, but it also means we need to plan for that investment, stick around, and maintain those tests to keep the confidence we have gained in our feature. It's also very good practice to keep watching the CI signal: if there are too many flakes, for some definition of too many, we need to check them, go back, and fix them to make sure our code works as intended. So in our case, we contributed to the graduation of some long-time beta features in the context of the Kubelet.
So we have these resource managers, which assign exclusive resources to workloads, for example exclusive CPUs, memory, or devices, and they were beta for a long, long time, up until very recently. That was also fine in a way, because the code was stable, okay, some bugs from time to time, but mostly stable, and people started to depend on it. But still, those features were beta, and that's what we call perma-beta: when a feature tends to stick to permanently being beta. That is bad because, again, the code is there, the feature is there, it works, but there is no full commitment yet. This is why SIG Node started the initiative to fix the perma-betas, which we contributed to. The key takeaway, the challenge here, was first of all bringing the KEPs up to date, since they were often using old templates, because those features had been beta for a long time. So: bringing them up to speed with the recent KEP standards, in some cases addressing a bunch of extra production readiness review questions, or even addressing the production readiness review retroactively, because the feature went beta so long ago that production readiness wasn't really a thing back then; it was more informal, let's say. Another interesting case study for us is the pod resources API GA, the API used to peek into the Kubelet's resource assignment, which we tried to graduate in 1.27 and which didn't make it. This is very interesting because of the reason it didn't make it: we had to address, fix, and meet requirements in some areas which weren't covered because the KEP was kind of old. So it's good that the review highlighted those missing requirements, and it's good that we addressed them. The fact that the feature didn't make GA in 1.27 is actually good, because now we meet those criteria, and among those criteria I want to highlight two key examples.
The first example is multi-platform support, for Windows, because at least I sometimes tend to believe, hey, it's all Linux, but Windows is also supported by the platform, so we want to have that support. The only challenge is that it was added a bit later than the proper process would have expected, but still we made it. During the production readiness review, we also identified a room-for-improvement area, which is limiting access to that API endpoint, rate limiting it, to make sure consumers don't consume too much, let's say. This again came up during the production readiness review, and the fact that those points were highlighted greatly increased the robustness and quality of the feature. When we have done all of that, we reach GA. So we are happy and done, because our work is done and everything we planned back then is now delivered. Well, we are done if that's actually it. In some cases it's not, and there is follow-up work. At the very least, despite all the huge amount of effort we put in and all the guidance and help, software has bugs, so we may want to stick around to make sure that now that the GA feature is exposed to everyone, everywhere, there are no bugs: we stick around, look for them, fix them, and monitor the status for some time. But we may also very much want to build on top of the good feature, the good work we integrated. How do we do that? Well, it's simple: we start a new KEP process, a new contribution process from the beginning, that builds on all the good work we did so far. So, are there some traits that cut across this whole process, kind of guidelines throughout? Yes, there are a few, and the ones I think are possibly among the most important are on the communication side of things. We already mentioned that at the beginning, and it's worth mentioning again now. The Kubernetes project invests a lot, and I mean a lot, of energy and resources to make
sure it is welcoming, well documented, and open. Let's take a simple example: when you submit a change using the GitHub process, there is a bot which welcomes you and guides you: hey, this is what to expect, this is what you could do, this is the documentation you can review, these are the people you can get in contact with. All of that is a huge help. But to make things even more straightforward, and actually faster, for your process: join the community, say hi, listen, and interact. Being an active community member helps a lot and makes things faster and smoother for you as well. The other thing is that this is a very detailed, multi-step process which can span across months and sometimes years, with everything that happens in between and the varying availability of people. So being persistent, pushing through, and trusting the community is something that helps a lot to make sure things happen for good in the long term. As a parting thought, I just want to mention that it's really important to take initiative. Kubernetes is where it is today because of its contributing members and all the people who have been putting in hard work. If you feel that you're not sure how to contribute, we already went through the communication channels and ways to contribute in the community; every way that you contribute is helpful. There are plenty of opportunities: you can start by triaging issues, writing cleanup PRs, contributing tests, as well as helping deflake some end-to-end tests. Contributions in every shape and form are always welcome, and we're constantly looking for new members of the community. If you have questions, you can reach out to us on the SIG Node Slack channel, and we'd be happy to answer any of your questions. I think we've run out of time, so we'd be happy to take questions after the talk; we'll be around for 10-15 minutes. Thank you for joining us today, and enjoy the rest of KubeCon!