Hello everybody, and welcome to the SIG Windows maintainer track talk at KubeCon + CloudNativeCon North America 2023. We'll start with some introductions. I'm Mark Rossetti, a software engineer at Microsoft, and I'm a co-chair of SIG Windows. I also do a lot of work on the Kubernetes release team and pop into various other efforts to interact with them from a Windows perspective. Professionally, I work in Azure on a team that mainly contributes to open source projects in Kubernetes and the CNCF.

Hey folks, my name is Aravindh. I'm a staff engineer working on OpenShift at Red Hat, and I'm the co-chair of SIG Windows along with Mark. So this is our agenda for the day. We're going to start off with some Kubernetes updates that are specific to Windows, then move to ecosystem updates around Windows projects that are connected to Kubernetes. After that we're going to shine a spotlight on our contributors, the folks who keep the SIG going, and hopefully we'll leave you all some time for Q&A at the end.

Yeah, so first we're going to talk a little bit about some of the updates happening for SIG Windows inside the Kubernetes project. For people who aren't familiar with how code flows in the Kubernetes project: getting larger features into the code base follows the KEP process, the Kubernetes Enhancement Proposal process. A KEP is generally required for anything that's user facing or interacts with multiple components in the Kubernetes ecosystem, and anything that goes through that process progresses through the feature stages: alpha, beta, and then stable. Most of the updates we'll talk about in this section are progressions of the KEPs SIG Windows has in flight. For each of the KEPs, there's a quick link here to the full enhancement proposal, where you can see the discussions, comment on it if you're interested, check its status, and everything else. And these slides are up on Sched, so don't worry about any of the links; you can just download them later.

The first enhancement we'll talk about is support for image pulls per runtime class. This one is maybe a little bit technical, but it enables a pretty big user-facing feature that we've been asked to support: Hyper-V isolated Windows containers in Kubernetes clusters. Today in containerd, the primary container runtime we use for Windows, Hyper-V isolated Windows containers are supported, but there's a lot of work needed in Kubernetes to close all of the feature gaps. A big one is that the container runtime behaves quite differently when pulling, snapshotting, and generally managing the lifecycle of container images depending on whether you pull an image to run it in process-isolated mode or in Hyper-V isolated mode, and also on which operating system versions you're targeting on a node. So if you want to run a Windows Server 2019 container and a Windows Server 2022 container side by side on the same node, the container runtime needs to know which of those base images to keep around.
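To make the runtime class piece concrete, here's a minimal sketch of what opting a pod into Hyper-V isolation through a runtime class can look like. The handler name `runhcs-wcow-hypervisor` follows containerd's documented Windows runtime handlers, but treat it and the RuntimeClass name, pod name, and image here as placeholder assumptions; check which handlers are actually registered in your containerd configuration.

```yaml
# A RuntimeClass mapping to a Hyper-V isolated Windows handler in containerd.
# Handler availability depends on your containerd configuration.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: windows-hyperv          # placeholder name
handler: runhcs-wcow-hypervisor
scheduling:
  nodeSelector:
    kubernetes.io/os: windows
---
# A pod opting into Hyper-V isolation via that runtime class. The image pull
# per runtime class work is what lets the kubelet and runtime pull and track
# images appropriately for the isolation mode the runtime class implies.
apiVersion: v1
kind: Pod
metadata:
  name: hyperv-sample           # placeholder name
spec:
  runtimeClassName: windows-hyperv
  containers:
  - name: sample
    image: mcr.microsoft.com/windows/nanoserver:ltsc2022
```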
So we started some work here, and a feature went in: it was introduced as alpha in this release, 1.29. The KEP was approved and the implementation PRs went through to progress it to an alpha state, so after this release it should be available for testing with some setup, which will be documented. This KEP required work first in the CRI API, then in containerd, the kubelet, and a couple of other parts of Kubernetes. We're excited this went in as alpha so quickly, and we hope to progress it quickly, because it's straightforward and it enables a pretty big user scenario that we're targeting.

The next one I'll give an update on is a KEP that was driven through SIG Node; SIG Windows is helping to progress it and make sure Windows is supported. This is cAdvisor-less container and pod stats. There are a number of different metrics and stats that get exposed in various Kubernetes components: some come from the container runtime, you can get node stats from node exporter, some come from the kubelet, and some get cached and exposed through the Metrics APIs. We've seen issues where, for Windows containers, what you see when you look at the node stats and what you see from the Metrics APIs, which is what your autoscaling algorithms are based on, can get out of sync. This KEP is aimed at consolidating all of this so there's a single source of truth for which metrics are reported and where all the different higher-level components get them from. Windows support for this was added in 1.28, as part of the last release. It's still an alpha feature, so you have to turn it on, but we're hoping it fixes a lot of the inconsistencies that have been reported over time around node metrics and autoscaling workloads. It's targeting GA in 1.30, where it will be widely available; I know some cloud providers tend not to enable features until they reach stable.

The last one I want to talk about isn't quite at the point where there's a KEP around it yet, but it's important, and we've been seeing more instances of a particular issue, I won't say a ton, but more, as people adopt and ramp up their Windows workloads in larger Kubernetes clusters: commit memory. To go a little bit technical here: when you schedule a Windows container in Kubernetes, you can specify a resource limit in your pod spec, for example "I want 500 megabytes of memory." Under the hood, when the Windows container starts up, all of the processes under it get grouped into a job object, and the Windows APIs for job objects allow limits to be enforced on that job object; the memory limit is only enforced on the amount of commit memory those processes are consuming. But because Kubernetes was mostly Linux-first and Linux-focused, pretty much all of the stats and metrics being reported for your containers are based on working set bytes.
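For reference, here's a minimal sketch of the kind of pod spec memory limit being described; the names and image are placeholders. On a Windows node, this limit ends up enforced against commit memory through the job object, while most Kubernetes metrics report working set bytes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-limited-app      # placeholder name
spec:
  nodeSelector:
    kubernetes.io/os: windows
  containers:
  - name: app
    image: mcr.microsoft.com/windows/servercore:ltsc2022
    resources:
      requests:
        memory: 500Mi
      limits:
        memory: 500Mi           # enforced on the job object's commit memory on Windows
```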
Now, under a lot of circumstances your commit memory usage and your working set usage are going to be pretty even, but there are a number of allocation or usage patterns, depending on what type of memory you're allocating, where you can have disparities between the two. Most notably, in workloads that use a lot of commit memory while the working set memory lags, we've seen cases where people set up autoscaling expecting that when memory usage reaches a certain threshold, the horizontal pod autoscaler will come divide out that work and things will just continue. But since the higher-level Kubernetes components aren't aware of commit memory, the autoscaling hasn't kicked in yet, the commit memory keeps going up, and you start to run into out-of-memory scenarios in your Windows containers, which is definitely not what we want.

There have been a number of discussions between people in the Kubernetes community to figure out the best way to reconcile this. Where we're at today: pull requests into the CRI API and into containerd have been completed and wired up, so both your commit memory and your working set memory are reported up to the kubelet. The next step is to drive a discussion about how we want to handle this in the higher-level Kubernetes components, like all of the autoscaling mechanisms; this is one of those cross-component features that we're probably going to have to drive through a KEP. Hopefully, since all the groundwork has been done, that KEP will be started in the next release. If anybody's interested in this, please stay tuned, check the SIG Windows Slack (we'll give resources on that later) for more information, and give input on how you think this should work.

The next feature we'd like to talk about is in-place pod vertical scaling. This is a feature that's going alpha with Windows support in 1.29. As you folks can see, it's been a long journey, around four years from inception to the implementation being committed and checked in. The KEP was initially driven through SIG Node, with a lot of support from SIG Windows in getting the Windows pieces going. The gist of this feature is that you can now change the CPU and memory resources of Windows pods, and the changes take effect without the pod being restarted. This is really useful because there's less chance of your pod getting evicted from the node and landing on another node. Support for this has been added to the containerd runtime, and we have recently added Windows end-to-end tests for it. So we really encourage folks to try it out, and if you have any issues, open bugs and we'll get them fixed.
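Here's a minimal sketch of what trying the feature can look like, assuming a 1.29 cluster with the `InPlacePodVerticalScaling` feature gate enabled; the pod name, image, and resource values are placeholders.

```yaml
# A pod declaring that CPU and memory can be resized without a restart.
apiVersion: v1
kind: Pod
metadata:
  name: resizable-app           # placeholder name
spec:
  nodeSelector:
    kubernetes.io/os: windows
  containers:
  - name: app
    image: mcr.microsoft.com/windows/nanoserver:ltsc2022
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired
    - resourceName: memory
      restartPolicy: NotRequired
    resources:
      limits:
        cpu: "1"
        memory: 500Mi
```

With the feature gate on, patching the pod's resources applies the change in place, without the container restarting:

```sh
kubectl patch pod resizable-app --patch \
  '{"spec":{"containers":[{"name":"app","resources":{"limits":{"cpu":"2","memory":"1Gi"}}}]}}'
```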
Next, we have a feature that is near and dear to my heart, because this is something I added as alpha in 1.27. This one also had a fairly long journey; it took a couple of years to go from getting the KEP merged to having the implementation done. Node log query is basically a kubelet API: the logs are served by the kubelet itself, and you access them through the node proxy endpoint, so you don't need separate access to the node. To enable the feature you need to do two things: enable a feature gate called NodeLogQuery, and set a kubelet option called enableSystemLogQuery. The reason for this is that the feature is a little bit sensitive: it gives you access to the logs of the services running on both your Linux and Windows nodes. Because of that, we want users to be really aware that they're enabling this feature, for all of their nodes, for debugging purposes and for collecting information when something is going wrong on a node.

It's pretty powerful. I've pointed folks at an example here where I go to a node called node-1.example and grab the kubelet logs; this command returns all the kubelet logs on node-1.example. The query supports more options: you can say "give me all the logs from a certain date to a certain time" to get subsets, and you can even do queries like "give me all the kubelet logs that contain the term error," and it will return just the log lines with the word error in them. I'd really like folks to try this out and give us feedback.

I'm also working on a kubectl plugin to make this feature a little more accessible, and I think the plugin will make it a little more powerful: you could say "give me the kubelet logs from all my Linux nodes, or all my Linux workers, or all my Windows workers," so you could do more selective log collection for any services you run. And this isn't restricted to just kubelet logs: on the Windows side you can get containerd logs, and I know there are some services that run for gMSA; you could grab logs for those services too. It's a pretty powerful utility, especially when it comes to debugging issues happening on the node. Say your CNI doesn't come up: you can look at your kubelet logs to figure out what's going on without having to figure out SSH access to your nodes; you can just hit this endpoint.
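Here's a minimal sketch of the calls being described, using the node name from the example; it assumes the NodeLogQuery feature gate and the enableSystemLogQuery kubelet option are enabled. The query parameters follow the documented feature, but double-check them against the docs for your Kubernetes version.

```sh
# Return the kubelet's service logs from node-1.example,
# proxied through the API server's node proxy endpoint.
kubectl get --raw "/api/v1/nodes/node-1.example/proxy/logs/?query=kubelet"

# Narrow the results to a time window and to entries containing "error".
kubectl get --raw \
  "/api/v1/nodes/node-1.example/proxy/logs/?query=kubelet&sinceTime=2023-11-01T00:00:00.000000000Z&pattern=error"
```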
Next up, we have ecosystem updates. We just talked about updates in the Kubernetes project itself; now we want to give some updates on the Windows container ecosystem, because we definitely wouldn't have Kubernetes without all these different components.

The first one I'm excited about is Windows support for the K3s agent. Pretty recently the K3s project has, I shouldn't say finally, but the feature request came in around 2019 to support Windows nodes, so it's been a long journey to get here. You can now configure a Windows VM, run the K3s agent on it, and join it to the cluster. This went in pretty recently, but almost immediately backport PRs were opened, so if you're running K3s for any scenarios, starting with 1.26 you can join Windows nodes to the cluster. It's pretty simple to do: you just make a little YAML config file like the definition here, pointed at the control plane for your K3s cluster, and run the agent. This has been tested with Flannel; we know there are many other CNI solutions, and I think anybody in the project or in SIG Windows would really appreciate it if people wanted to test and make sure other CNI solutions work with it too. So it's also a good opportunity to contribute.

The next topic I want to talk about is BuildKit support for Windows. I don't know how familiar people are with BuildKit, but if you've used docker build in the past couple of versions of Docker, you're actually using the BuildKit build engine under the hood, whether you're building Linux containers on a Linux machine or building Linux containers in WSL with Docker Desktop. But if you're building Windows containers in any of those environments, you're using an older version of the Docker build engine that was built specifically to build Windows containers. Over the past couple of years there have been a lot of improvements in BuildKit that optimize build execution for your container images: parallelized graph queries and execution, much better image caching, the ability to push directly to image stores, and all sorts of other benefits. So this is another issue that was opened, I think around 2020: what would it take to get BuildKit support for building Windows containers on any of these platforms? There were discussions for a couple of years, and recently, I'd say in the past six or eight months, there's been a lot of movement to make this happen. I'll be clear, this isn't ready yet; I think you can cobble together a bunch of different components to get it to work, but it's not quite at the point of "just install Docker and use it." This is an example of a pretty complicated problem, with lots of PRs from lots of different individuals all making progress toward getting it to work. In this particular case it started with the need for an executor to run your RUN commands, which was implemented in containerd, and then there were a number of changes in various other components to support that, plus a lot of changes in various Moby repositories to support Windows paths and make sure everything lines up. A lot of people have been asking for this because of all the benefits BuildKit brings, and we hope it will be available to everybody soon.

Next up we have Calico. Calico is a CNI available for Kubernetes, and recently, in version 3.27, they added general availability for running Calico using Windows host process containers. If anybody here isn't aware of what a host process container is: it's the equivalent of Linux privileged containers, for Windows (there's a sketch of one just after this update). There's a great presentation given by Mark and James at the previous KubeCon; please look that up and it'll give you all the details you need about host process containers. What's happening now with the Calico project is that their CNI plugin and node container images are built within the Calico project, so you can pull those images from there. They've also added Windows support to the Tigera operator, the operator that manages their pods and any containers they run on the different nodes. They have really good documentation: if you go to the Calico website, there are detailed notes on how you can use Calico with Windows nodes now. So reach out to Project Calico for more information.
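As promised, here's a minimal sketch of what a host process container pod looks like; the pod name and command are placeholders, and the image shown is the common Kubernetes host-process base image, so verify the repository and tag before relying on it.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hostprocess-example     # placeholder name
spec:
  nodeSelector:
    kubernetes.io/os: windows
  hostNetwork: true             # host process containers share the host's network
  securityContext:
    windowsOptions:
      hostProcess: true         # the Windows analog of a Linux privileged container
      runAsUserName: "NT AUTHORITY\\SYSTEM"
  containers:
  - name: hostprocess
    image: mcr.microsoft.com/oss/kubernetes/windows-host-process-containers-base-image:v1.0.0
    command: ["powershell.exe", "-Command", "Start-Sleep -Seconds 3600"]
```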
Next up we have contributor spotlights. Nothing is going to stay around if there aren't enough contributors, and that's the reason we'd like to shine a spotlight on ours. We're going to focus mainly on folks who have joined us recently and have added new features and helped us fix bugs.

First up we have Kirtana Ashok. She's a software engineer who works at Microsoft, and she's the person who started the work on the image pull per runtime class feature that Mark was talking about earlier. She was a first-time contributor to Kubernetes, and she went through the whole journey: starting a KEP and discussing with multiple SIGs about how this feature should work. Not only did she have to talk to SIG Windows, she had to work with SIG Node, and externally she also had to talk to some of the CRI folks, in particular the containerd project, because this feature actually spans multiple GitHub repos; she had to contribute in multiple places to get it going. This also shows how you can enter the Kubernetes community through SIG Windows and get broad exposure to different projects out there. I've pointed at some of the KEPs and PRs that she opened and got merged to get this feature going. Really great work from her.

Next up we have Mansi Kulkarni, who is here in the crowd with us. She's a software engineer at Red Hat working on Windows containers. She recently worked on implementing Windows support for the pod and container stats from CRI, which was needed to complete the cAdvisor-less feature that Mark was talking about. This is another example of how you can get involved: instead of having to work on a feature from end to end, you can also show up, contribute, and drive something to completion, again getting exposure to the community and more knowledge about Kubernetes in general. So that's another way you can start working on and contributing to Kubernetes. Mansi has also been the CI Signal shadow for the 1.28 and 1.29 releases, and she's been involved in test infra for the project.

The other couple of spotlights we want to shine are on Tatenda and Kulwant. They both work at AWS and worked on Windows operational readiness on EKS. If you're wondering what Windows operational readiness is, it's basically a set of standards and properties that, once your Windows node meets them, states that the node is ready to accept production workloads. Great work from both of them; we're happy to have them on board. And now I'll pass it back to Mark.

Now we'll talk a little bit about ways that other people can contribute. As Aravindh mentioned, no SIG can really keep going without its contributors. SIG Windows has maintainers here, but we could definitely use more contributors, and it's always a better use of the maintainers' time to mentor and bring on new contributors so that we can scale and continue to grow. So if anybody's interested, we'd like to highlight some of the ways you can help. Each SIG in the Kubernetes community has a landing page in the Kubernetes community repository that lists the SIG charter and links to meeting notes, meeting times, and everything else, so visit the SIG Windows page if you're interested in more information.
We do have a contributing guide for SIG Windows, specifically for the Kubernetes project: how to run additional test validation on any of your Windows changes, plus a lot of the nuances around how Windows containers work as they relate to contributing to Kubernetes. There's also a lot of information on the Kubernetes website about the differences between what is and isn't enabled for Windows workloads in your cluster.

We also have a weekly community meeting every Tuesday at 9:30 am Pacific, my local time zone, which is open for anybody to join. And if there's a desire, we often have pairing sessions after that too. So if anybody's interested in coming and looking at a problem, whether that's test failures or unique behaviors, many of us are always willing to hang around, go deep, and help people with that. We maintain a list of issues and bugs that anybody can look at, which we groom in periodic bug triage meetings.

One of the interesting things about SIG Windows is that we're a more horizontal SIG. Our charter is supporting Windows workloads in Kubernetes, but that often crosses a lot of SIG boundaries: we interact with SIG Security, SIG Node, SIG Network, SIG Autoscaling, and others. So don't think you need Windows expertise to contribute to SIG Windows. There are plenty of opportunities to go deep technically and focus on one area, but there are also a lot of opportunities to gain exposure to various aspects of the Kubernetes project. Specifically, lately, if anybody has Windows networking knowledge, that's an area where I think we're always short on help, and we'd definitely value any contributions there. We do a lot of Windows testing and are trying to make sure everything stays stable; that could be a good opportunity if anybody's interested in learning how the larger Kubernetes test infrastructure works. And we maintain the documentation on kubernetes.io for Windows workloads, so if anybody is interested in non-coding contributions, we always value additional documentation there. I personally find that having people go through the documentation and surface its shortcomings as things evolve is a much more rewarding feedback cycle for improving the docs than us just saying, "here's what you need to change." There are also multiple sub-projects in the Kubernetes project with opportunities for sub-project owners, if anybody wants to contribute consistently.

One other thing I'll call out: recently there was a Windows CVE, and I just want to note that these things do happen, and they're another opportunity to contribute. The Kubernetes project has a Security Response Committee, and there are guidelines for how to disclose information. If you're interested in trying to break things or doing some penetration testing, that's always a welcome contribution too, but please make sure you disclose responsibly. There's a link to the website there if you find a security vulnerability in any of the Kubernetes components.
And there are also bug bounties for finding bugs in Kubernetes, so you can maybe get rewarded for that too. Lastly, here's just more information on the SIG; this is mostly for reference if you're downloading the slides. We've got a Slack channel in the Kubernetes Slack, a mailing list, a YouTube playlist with all of the community meetings, links to the various SIG Windows meetings, and the contact information for all of the leads on Slack and on GitHub. So now we'll leave some time for Q&A, if anybody has any questions. There's a microphone there, so it's on the recording.

Hello, is this on? I'll just speak very loud. The node log viewer: you showed the API there, but is that going to be in kubectl as well, for customers to be able to grep through logs on the node?

So it's not going to get native support in kubectl; you can make a raw call to the API itself. That's a decision we made after a bunch of architectural discussions. This is why I'm going to introduce the kubectl plugin that you can then use, so it'll be more natural. I plan on calling it node logs, so it'll be something like kubectl node logs and then whatever options I'm going to expose there. That is, I think, the next step for that project.

Thank you. Any other questions, folks? All right, going once, going twice. Thank you for coming, folks!