day. Welcome, this time live from KubeCon Amsterdam. I'm so happy to be with you all. If you've ever heard me say those same words, that means you've been either on SIG Apps' bi-weekly meetings or on SIG CLI, which I'm also helping to run. So, on to things. Like I said, my name is Maciej, and I'm one of the chairs for SIG Apps. The other two chairs and technical leads are Janet and Ken, who unfortunately were not able to join us today in this beautiful Amsterdam. So if you have any questions about how SIG Apps operates, or any particular technical questions related to what SIG Apps does and what you want to see being worked on, those are the three people you want to start talking with. How do you best reach us, aside from today? We try to be present at all of the KubeCons, tell you a little bit about ourselves, and give some updates on what we're working on. You can also find us at our bi-weekly meetings, held on Mondays; the next one is planned for May 1st. The time might not suit everyone, depending on which time zone you're in, so if the meeting just does not work for you, there's always the option to jump on the Kubernetes Slack. There's a #sig-apps channel where you can ping one of us directly or just leave a question; we try to go through all the questions and suggestions and respond to them. Last but not least, there's also an email group. If that's your preference, you can write an email, ask questions, and propose suggestions, and we'll be talking about suggestions in a moment. So what is SIG Apps responsible for? In short, we are responsible for deploying, running, and operating applications on Kubernetes. There is a link where you can read the entire charter, which describes our mission, but that's roughly the short description. If you're interested in presenting to SIG Apps, whether it's your own solution, it doesn't have to be specifically about changing how Kubernetes works, or if you are deploying an application and struggling with the implementation or with running it on Kubernetes, this is the place where you can always ping us. We can either help you with your application, or, if you have suggestions or would like to present your ideas, we are more than happy to see you join one of our sessions or bring the topic to our attention. I've also linked the annual report. The annual report should be merged within a day or two; if by any chance you check the link before that, there is still a PR open with the annual report for 2022. So don't freak out, the annual report is empty for only one or two more days. It's already approved by the steering committee; it just didn't merge in time. I did mention that we are doing these presentations roughly every six months, so every KubeCon. Since the last KubeCon in Valencia, we have released three versions of Kubernetes: 1.25, 1.26, and, just last week, 1.27. So I would like to cover the features that we promoted through alpha and beta and finally to stable. Let's start with the stable features that landed in the past three releases. The first one: we finally stabilized maxSurge for DaemonSets. Let me step back. The feature basically allows you to set an additional number of nodes that can run more than a single pod of a DaemonSet.
Normally we'd only be running one pod of a DaemonSet on each node, but during a rolling upgrade, if you want to minimize the downtime of your DaemonSet, you would like to progress at a slightly faster pace and allow more pods of your DaemonSet to run during the upgrade. Of course, it has downsides you have to be aware of if your DaemonSet pods are heavy consumers of resources: at some point in time there will be two pods consuming double the usual resources. You can steer that number either as a direct number of pods or as a percentage of the number of pods your DaemonSet is running. The other feature that we promoted to stable is minReadySeconds for StatefulSets. Previously, the StatefulSet controller had a fixed timeout for how long it waited for an application to be fully ready, but with StatefulSets it might take a little bit longer for the application to become fully available. The most frequently used example was: I want to ensure that the caches for my application are warm enough so that I can already serve traffic at my full potential. There are probably more use cases, but that's what setting and extending the timeout for how long it takes a pod to be considered fully ready is for. It is now possible for a user to define that value. Let's switch tracks a little bit. The two features I was just talking about cover the apps area, so basically the long-running workloads. The other side of the SIG Apps purview is running workloads to completion, so basically everything related to batch Jobs, CronJobs, and so forth. The two major advancements that we made over the past couple of releases: the first is that we changed how we track the pods for a Job. When Eric and I wrote the original Job controller in the very early days of Kubernetes, we assumed that keeping the pods of a completed Job around, so that we can calculate its status, was reasonable, and it worked fine to a certain degree. But if you start scaling your Job to hundreds or thousands of pods, keeping those pods around is literally wasting resources, unfortunately. So we had to change how this works. Not to mention the fact that this also conflicts with pod garbage collection, which is responsible for removing pods that have completed. So suddenly you are in a situation where the pod garbage collector is removing the pods it finds completed, while the Job controller at the same time requires those pods to still be around so that it can calculate that a particular Job has completed. Over the past three or four releases, even more because we ran into multiple issues, we changed how we calculate completions: we set a finalizer on every pod that is created from a Job, and upon completion of a particular pod we remove the finalizer only after we have recorded that, yes, this pod has already been included in the status of the Job. That opens up the field for really heavy jobs that could run for multiple days or weeks. The other one, literally fresh from the oven because it was stabilized in 1.27, and the work itself wasn't that hard, is adding the time zone to CronJob.
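Before going into the time zone details, here is a minimal sketch of the two stable fields just mentioned, maxSurge on a DaemonSet and minReadySeconds on a StatefulSet. The names, images, and values are placeholders I made up for illustration; only the two fields called out in the comments are the point.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: example-agent              # placeholder name
spec:
  selector:
    matchLabels:
      app: example-agent
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 10%                # allow extra surge pods per node during a rolling update
      maxUnavailable: 0            # surge instead of taking the old pod down first
  template:
    metadata:
      labels:
        app: example-agent
    spec:
      containers:
      - name: agent
        image: example.com/agent:1.0   # placeholder image
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example-db                 # placeholder name
spec:
  serviceName: example-db
  replicas: 3
  minReadySeconds: 60              # pod must stay Ready this long before it counts as available
  selector:
    matchLabels:
      app: example-db
  template:
    metadata:
      labels:
        app: example-db
    spec:
      containers:
      - name: db
        image: example.com/db:1.0      # placeholder image
```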
It just so happens that a couple of releases back we upgraded the library responsible for parsing the CronJob schedule, and one of the users discovered that within this library there is the ability to set a time zone through a TZ prefix that you can prepend to the schedule. This is not supported, and you should not be doing that if you are; I'll tell you why in a moment. That was the additional push where we realized that, yes, we have to finally add the time zone, because a time zone was requested very early in the Kubernetes lifecycle, shortly after we created the resource. Back then it was even called ScheduledJob; if you haven't heard of that, it's because only a couple of releases later we renamed it to CronJob. That was well before 1.10, if I remember correctly. So yes, we've added the time zone to CronJob, but we still had that TZ issue that a lot of people had suddenly started relying on. We've put warnings in place, so whenever you try to create or update a CronJob with that TZ or CRON_TZ prefix, in both cases it will warn you, through the HTTP warnings that the API server exposes, that you are using an unsupported mechanism. Now, in 1.27 we went even a step further. Because the time zone feature is fully supported, we will prevent you from creating a CronJob with the TZ prefix in it. After you upgrade to 1.27 you will still be able to update your existing CronJobs, but you will not be able to create new ones. In the next release, 1.28, we will even prevent you from updating CronJobs that use the TZ prefix. So if you are relying on that TZ prefix in the schedule, please switch over to the supported timeZone field. One thing that I did not mention, if you were wondering which time zone we previously used: it was always the time zone of the kube-controller-manager process running on the host. The time zone of the host where the kube-controller-manager was running was what was always used for scheduling new jobs. Okay, so moving on to beta features, and we will stay in the batch area; I'll cover the apps area in a little bit. We are working very closely with the batch working group, and I'll talk about the batch working group in a moment, but one of the two very important items that came from that collaboration is retriable and non-retriable failures for Jobs. Currently, if you're looking at a Job, or you've worked with a Job, you are probably aware that the Job has a very limited ability to specify when it will retry. When we originally designed Jobs, based on our experience, the goal was to make sure a Job always reaches a specific number of completions. Alternatively, there was another option where we figured we want a guard pod which, when it completes, means the entire Job is finished. So we implemented only those, and we always retried; the only thing you could configure was the number of backoffs, so it would retry a couple of times and then give up.
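Before going further into the retry story, here is a rough sketch of the time zone change on a CronJob. The name, schedule, and image are placeholders; the point is the timeZone field versus the unsupported prefix in the schedule string.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report             # placeholder name
spec:
  timeZone: "Europe/Amsterdam"     # supported: the dedicated field, stable in 1.27
  schedule: "30 2 * * *"           # placeholder schedule
  # Not supported, do not do this:
  #   schedule: "TZ=Europe/Amsterdam 30 2 * * *"
  #   schedule: "CRON_TZ=Europe/Amsterdam 30 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: report
            image: example.com/report:1.0   # placeholder image
```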
When we started talking within the batch working group, it turned out that if you build on top of Kubernetes batch Jobs and start building AI/ML workloads or any kind of HPC-related workloads, you run into problems, because there are cases where you are certain that the job should not be retried, and there are other cases where, yes, I'm perfectly aware that it failed, but it's something that can safely be retried. So the folks from the batch working group put together a proposal to extend the Job API with a retriability policy, podFailurePolicy, sorry, that's the field name. In it you can specify either exit codes, based on which we will retry, or pod conditions, and they are still working hard to expand the surface of that API. So if you have your own ideas about what should be added and which cases should be covered there, I'm pretty sure that popping into either SIG Apps or the batch working group and sharing your experiences and requirements with us would be very helpful. The other topic that I wanted to cover, also coming from the batch working group, was elastic Indexed Jobs. I did mention that one of the use cases for a Job was being able to have a guard pod which, when completed, means the job is finalized. You do that by not specifying the completion count: either the Job runs to a specific number of completions, or, if you don't specify one, the first pod that completes marks the Job as completed. When we initially started working on the Indexed Job, we reused the same patterns. In an Indexed Job, basically every single pod has an assigned index, and we ensure that if a particular pod fails we retry it with that specific index hard-assigned to it; this allows you to work with work queues or divide your work across specific pods. But it turned out that in an Indexed Job those patterns are not necessarily useful, and we discussed a couple of times how we should implement this, because we would like to be able to change the number of completions for an Indexed Job. The other thing that I did not mention is that you were never able to change the number of completions: it was set up front upon Job creation, and the only parameter you were allowed to change was parallelism. Depending on the traffic, depending on your time, whatever the workload of your cluster is, you could modify how many concurrent pods are running your Job, but you were never able to modify the completions. With the Indexed Job, what we found is that it would be reasonable to be able to modify the completions. So we discussed this a couple of times, and we figured out that we cannot break the Indexed Job, because the Indexed Job was already released and fully GA, and we could not allow ourselves to break the API promises that both the project and our SIG make. So we figured we would name it differently, and we would ensure that we do not break users by allowing both completions and parallelism to be modified, with the requirement that they can only be modified if both of them are changed at the same time. This allowed us to ensure that we do not break users, because breaking users is the worst thing that can happen, both to all of you and to the maintainers. Okay. So that covers the batch side of things; going back to the apps side of things again.
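To make both of those batch features a bit more concrete, here is a hedged sketch of an Indexed Job with a podFailurePolicy. The name, exit code, counts, and image are invented for the example; double-check the exact rule semantics against the docs for your Kubernetes version.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: training-shards            # placeholder name
spec:
  completionMode: Indexed          # every pod gets a stable completion index
  completions: 10                  # for elastic Indexed Jobs, completions and parallelism
  parallelism: 10                  # can later be changed together to a new matching value
  backoffLimit: 6
  podFailurePolicy:
    rules:
    - action: FailJob              # non-retriable: exit code 42 is a made-up "config error" code
      onExitCodes:
        containerName: worker
        operator: In
        values: [42]
    - action: Ignore               # retriable: pod disruptions don't count against backoffLimit
      onPodConditions:
      - type: DisruptionTarget
        status: "True"
  template:
    spec:
      restartPolicy: Never         # required when podFailurePolicy is set
      containers:
      - name: worker
        image: example.com/worker:1.0   # placeholder image
```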
So the first topic I want to talk about a little bit is the unhealthy pod eviction policy for PodDisruptionBudgets. If you have used a PodDisruptionBudget before, you know that it basically allows you to set a specific number of pods of your application that always have to be running. This is especially important during upgrades, when you are rolling your application or rolling your cluster upgrade and you want to ensure that your application will always serve traffic for your users; that's what creating a PDB wrapping your application gives you. The problem we figured out at some point, unfortunately a little bit too late, is that pods which are running but not necessarily healthy are also counted in the PDB budget, which in some edge cases will block the upgrade: the application cannot progress because something is broken in the app, but the broken pod cannot be removed because it is counted as running. We went back and forth several times, because on one hand it's a bug, but on the other hand there will be users who strictly rely on exactly the behavior it currently guarantees. So at the end of the day we decided to add another field that lets you define the policy for unhealthy pods. The current behavior stays as it was up until this point, and the new policy allows you to specify that unhealthy pods, the ones that are running but not healthy, are not counted during the PDB calculation and can be evicted safely; this lets you unstick your upgrades. At least in our case, in OpenShift, which I've been working with very heavily, we had multiple occasions where we got bitten by this. The other two topics cover StatefulSets. One of the main objectives when we designed StatefulSets was to ensure the stability of the data backing the StatefulSet. To ensure that, we decided never to remove the PVCs backing a particular StatefulSet. That worked for a pretty extended time, but it eventually turned out that there are some cases where it is safe to remove the data behind a particular StatefulSet. So someone opened a proposal that they would like to be able to remove the PVCs along with the StatefulSet. It's a very simple opt-in; not much code was actually required, but obviously we didn't make it the default, because that would break too many users. We did add an option so you can specify something like that, and we're currently at beta, on a happy path towards stabilizing this feature around 1.29. The last topic for StatefulSets is ordinal numbering. When you're running a StatefulSet on a single cluster, everything is just fine. But if you start discussing migrating StatefulSets between different clusters, that's when you start running into issues, because during the migration you still want to maintain the ability to run your particular StatefulSet, and the lack of control over how the StatefulSet controller numbers the pods in your StatefulSet did not, until now, allow you to spread your StatefulSet between two clusters. Basically that was the goal. So we've introduced something called a start ordinal, which basically says: this is the starting number of my StatefulSet. Normally a StatefulSet would number its pods from zero to n minus one.
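Before finishing the ordinals story, here is a minimal sketch of the two beta knobs just described: the unhealthy pod eviction policy on a PDB and the PVC retention policy on a StatefulSet. Names, sizes, and images are placeholders, and depending on your cluster version these beta fields may still sit behind their feature gates.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-db-pdb             # placeholder name
spec:
  minAvailable: 2
  unhealthyPodEvictionPolicy: AlwaysAllow   # running-but-not-ready pods may be evicted, so upgrades don't get stuck
  selector:
    matchLabels:
      app: example-db
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example-db                 # placeholder name
spec:
  serviceName: example-db
  replicas: 3
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Delete            # opt in: remove the PVCs when the StatefulSet is deleted
    whenScaled: Retain             # keep PVCs around on scale-down
  selector:
    matchLabels:
      app: example-db
  template:
    metadata:
      labels:
        app: example-db
    spec:
      containers:
      - name: db
        image: example.com/db:1.0      # placeholder image
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi             # placeholder size
```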
With ordinals you can shift the numbering to start from a given number and go up from there. The initial pods are basically covered by a regular StatefulSet in the other cluster, whose replica count matches the start ordinal you set in the new one. Okay, so that covers the topics we've been working on for the past three releases. 1.28 hasn't officially started yet; I was literally checking the dates just to make sure I didn't miss something, and there are no official dates for 1.28 yet, although the master branch has already opened and there is a bunch of PRs landing in Kubernetes. I know there are at least three or four topics that are slowly starting and will be getting to alpha, such as all of the work on consolidating controller statuses. We have been discussing that topic for, I would say, a good couple of years by now, and we're still trying to figure out how best to design the API to cover it. There are also a couple of topics, like the ones currently in beta, which we will be promoting to stable in the next releases. If there is a topic that you are interested in, that you want to help with, or if there is something new, speak up. Like I said, May 1st will be the next SIG Apps meeting, or send an email to the mailing list or reach out to us on Slack. Let us know what you would like to see added to the core controllers. I did mention the batch working group; I want to thank everyone in it. They are doing an amazing job progressing our batch API, so both Jobs and CronJobs; the work has been invaluable. They are even helping us improve some of the workloads, the apps controllers that I discussed earlier today. They also have bi-weekly meetings, actually on Thursdays. There's also an email group and a Slack channel that you can join if you have any ideas about workflows or any kind of AI/ML batch workloads that you are running in your environment, or if you want to learn how to run these workloads on Kubernetes clusters. This is the place where you can listen to people's stories, what kind of struggles people have in the ecosystem, and what solutions we are slowly building. Lastly, over the second half of 2022, I ran mentoring sessions for SIG Apps to grow new reviewers and approvers. We have a small number of people who actually work on the core controllers, and I'm working tirelessly to raise an additional group of people who will be able to help us. It's very challenging work; there's a lot of up-front knowledge required for the controllers that we own. It's not an easy task, not as easy as, for example, SIG CLI, and I have a comparison because I also help mentor folks for SIG CLI. For kubectl it's much simpler: the risk of breaking something is slightly smaller, because in the worst case you'll break a single command, which is not that bad. If you break a controller, that's a little bit more scary, especially if the controller has data behind it; that's super scary. If you're interested, reach out and let us know. We're more than happy to help you grow in the community and help us all. I think that's basically all the stuff that I wanted to share with you. Thank you very much for today. If you have any particular questions, I can take them here on the recording.
If you don't feel comfortable asking them on the recording, I'll be standing here for a while and you can just come say hi or throw some questions or tough issues at me. Thank you. There's a question from the audience. Yeah, I know what you're asking about. So the batch working group put together a proposal for JobSet. Just last week or two weeks ago, we opened a request to create a Kubernetes-sponsored repository: in the kubernetes-sigs org we have a JobSet repo, so we are not starting in core. We will be starting as an additional add-on, an additional controller. This will allow us to progress and experiment a little bit faster and gather a lot of feedback within the batch working group, because, like I mentioned, that group gives us a lot of feedback about how people want to run this kind of workload. There was a proposal sent to both SIG Apps and the batch working group. If you haven't seen it, just check the archives of either group, or ping me and I'll be happy to share it with you. And I'm pretty sure that there will be further discussions and further development. If it's something you are interested in, make sure to sync with the batch working group and provide feedback so that it goes in the direction that matches your expectations, basically. Okay, thank you very much all.