Hello, welcome. Welcome to the session on SIG Windows. We're going to do both an introduction and a deep dive into the Special Interest Group for Windows in Kubernetes. I'll start by allowing Pengfei to introduce himself.

Yeah. Hello, everyone. I'm Pengfei Ni from Microsoft. I mainly work on improving the Kubernetes experience on Azure. So I do some internal work to improve the Azure container services, and I also make contributions to Kubernetes, such as the cloud provider, the container runtime, Windows, and so on. Thank you.

Thank you. And I'm Craig Peters. I'm a PM on the Azure Container Compute team, and I'm responsible for the container infrastructure upstream contributions that we provide so that Kubernetes and other container workloads work great on Azure. We're going to start by talking about how we got where we are today. Hopefully you've got our contact information, so feel free to reach out to us. We're going to give the talk in English, but we're going to upload the Chinese version of the slides this afternoon into the schedule app, so by tomorrow you'll be able to download those if you want them. Also, feel free to reach out to either one of us via Slack, Skype, email, or WeChat, however you want. We haven't shown our WeChat codes here, but we both have WeChat, so you can ask us for the QR code. I apologize that I won't be able to respond in Chinese.

So how did we get where we are today? In 2014, the Windows container work began: Docker and Microsoft together made it possible to run containers on Windows. Very soon thereafter, as expected, people wanted some higher-level way to orchestrate those containers, so Docker Swarm, Mesos, and Kubernetes all emerged. And as we've seen, Kubernetes has become the dominant way to orchestrate containers across multiple hosts. Very quickly, we worked in the community to make that possible on Windows, in an alpha release toward the end of 2016 in Kubernetes 1.5, and then worked to solve a bunch of problems that existed, in networking in particular, and to support CNI. In that initial release, the key thing is that the kubelet and kube-proxy ran on Windows. That's the anchor that lets the workers in your Kubernetes cluster run on Windows, while the master components still run on Linux nodes. In 1.9, many improvements were made; fundamentally, we added CNI support to the kubelet and kube-proxy, so now you can use pluggable network components for Windows. The initial networking support had been kubenet. And then people started using it more and more. In March of 2019, we were able to reach the stable label in Kubernetes. We think of this as GA: this is something we now recommend that anybody use, with community support, and as we'll get into later, many cloud providers are also providing commercial support for Windows in their Kubernetes offerings. In 1.14, the major thing was that we made it easy to use Windows Server 2019; that's the server host that is now supported in stable Kubernetes, which means that going forward we will support Windows Server 2019 and higher. And then in 1.15, we made a lot of quality and usability improvements. So let's dig a little bit more into what we actually released in 1.14.
So if you're not aware, 1.14 was dubbed "Caternetes"; that was the release logo for 1.14. Aaron, of SIG Beard fame, if you're familiar with him, dubbed it so. And it's very appropriate, since he loves cats, and releasing Kubernetes as a community project with all these inputs is very much like herding cats. One of the biggest cats to herd for the release team was actually the support for Windows containers in 1.14. And for the general availability of Windows support, we also sent a lot of t-shirts to all the contributors who worked on this feature. Yeah, we send a huge thank-you out to members of the community. We had a lot of participation from people at VMware, at Docker, at Google, some participation from AWS, and a bunch of other companies all contributed to making this a very stable release.

The key pieces were that we really improved the end-to-end testing scenarios. That was actually the number one thing that had to happen in order to call it stable, so we now have very robust end-to-end tests, which we'll look at quickly. And we did a lot of end-user documentation, so that it wasn't a mystery how you go about doing this. Upstream we have high-quality documentation that, if you use it, I hope gives you a great experience. We'd love to hear about your experience with it and get your feedback.

In 1.15, we have done some bug fixes, and there was an alpha release of group Managed Service Account (gMSA) support in Kubernetes, which was primarily contributed by Docker and is supported through some integrations in the dockershim. That has continued to evolve and mature, and it's moving toward a beta label in upcoming releases. This allows your pods to authenticate to an Active Directory, so that your host doesn't have to be joined to your domain in order to have authenticated services running in your container.

Yeah, so for the networking part: since the beta of Windows support in Kubernetes, we have gotten the whole community involved, from both the CNI side and the network providers such as Calico and Flannel, and we have enabled many plug-ins to support Windows Server containers. Basically, Windows Server supports three kinds of network topologies: overlay, underlay, and the transparent network.

For the overlay network, it is based on VXLAN, and you have two choices to set it up. One is the win-overlay plug-in; this is part of the official CNI plug-ins, so you can find the source code and the releases there. The other is Flannel's VXLAN support. So if your existing cluster already uses Flannel, then when you join your Windows nodes into the cluster you can still use Flannel.

Another network topology is the underlay. For underlay we have three kinds of network plug-ins: L2 bridge, L2 tunnel, and Flannel. Both the L2 bridge and L2 tunnel plug-ins are also part of the official CNI plug-ins. The bridge plug-in actually works much the same as the bridge plug-in on a Linux node. As for Flannel, its host-gateway (host-gw) mode is supported, so the same as with the overlay setup: if your Linux nodes already use Flannel, then you can choose Flannel host-gw as your Windows CNI plug-in.

And the last one is the transparent network. This is based on a Hyper-V virtual switch extension.
So then your container can be connected to your physical network, and allocated IP addresses and other configuration from your physical network by whatever manages it. For this model, I think Open vSwitch is the only supported plug-in, right? And I want to emphasize that we have also had a lot of contribution from Cloudbase on the OVN support as well. It's a network plug-in based on CNI, the standard for configuring pod networking in Kubernetes.

And the next thing is network policy. For network policy, we have both Calico and OVN supporting it. Calico actually provides two kinds of features: one is the CNI plug-in, and the other is network policy. But for Windows, Calico can only run in policy-only mode today, so you use Calico to manage your network policy and use a different network plug-in to configure your network. OVN is actually a turnkey solution, so it supports both the CNI plug-in and network policy. OK.

Yeah, so once you have set up your cluster, chosen your network plug-in, set up your nodes, and joined them to your cluster, then before you run your Windows application there are actually many things to consider. So hold on, check and read the documentation, and take care of a couple of things before you run your application.

The first thing is: how do you decide where your container will run? Will it run on a Windows node or on a Linux node? If you need a Windows Server node, then you should probably use a node selector in your pod spec. However, your cluster may already have some deployments running on Linux nodes, and those applications may not have any node selector set. So sometimes your Linux workload could be scheduled onto your Windows node, and those containers won't run there, right? We have two options to make this work. The preferred one is to add a taint to your Windows nodes; then for your Windows applications, you add a toleration for this taint, and also a node selector, so that they can only be deployed to Windows nodes. And for existing Linux applications, you don't have to make any changes this way. The other option, of course, is to make changes to your existing Linux deployments: you add a node selector to all of them, adding it to your YAML files, so that they can only be deployed to Linux nodes. This way requires a lot of changes if you have already deployed some Linux applications; that's why option one is preferred. A minimal manifest sketch of option one follows at the end of this section.

The next thing to consider is that, as you know, Windows containers actually run very differently from Linux containers, because each Windows container requires some system services running inside it. Those require resources such as CPU, memory, and also disk space. So if you set resource limits, such as CPU and memory, you should probably set a bigger value compared to an equivalent Linux application.

The last one is actually the most important: there is a kernel and user-mode version compatibility issue for Windows containers. The general idea is that the Windows kernel version of the node should match your Windows container image's version; that is, an application image built against one Windows Server version can only run on a node with that same version. But if you have some applications that were built on Windows Server 2016, then you can also enable Hyper-V isolation.
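To make the scheduling advice above concrete before moving on to isolation, here is a minimal sketch of option one, assuming the standard `kubernetes.io/os` node label (older clusters may use `beta.kubernetes.io/os`); the taint key `os=windows`, the node name, and the pod name are all hypothetical, and the resource values just illustrate budgeting more than you would for an equivalent Linux container:

```yaml
# First taint the Windows node so existing Linux workloads are repelled
# (node name and taint key are hypothetical):
#   kubectl taint nodes win-node-1 os=windows:NoSchedule
apiVersion: v1
kind: Pod
metadata:
  name: win-webserver
spec:
  nodeSelector:
    kubernetes.io/os: windows   # only schedule onto Windows nodes
  tolerations:
  - key: "os"                   # tolerate the taint added above
    operator: "Equal"
    value: "windows"
    effect: "NoSchedule"
  containers:
  - name: servercore
    image: mcr.microsoft.com/windows/servercore:ltsc2019
    resources:
      limits:
        cpu: "1"       # Windows containers run extra system services,
        memory: 2Gi    # so budget more than a comparable Linux container
```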
So for Windows containers, there are basically two kinds of isolation. One is process isolation, which is analogous to Linux namespace-based isolation. The other is Hyper-V isolation. If you enable Hyper-V isolation, then you can run older applications on a newer Windows Server node; for example, you can run your Windows Server 2016 application on your Windows Server 2019 node. It's important to note that this is currently in alpha, and there's a lot of work to be done to make it a smooth experience. Today, Hyper-V isolation requires quite a bit of extra labor, and we're actively working on improving that; we'll go into it in a minute.

Yeah, and another thing I didn't add to this slide: in addition to the taint and node-selector options for selecting nodes, the community has also added a new feature, RuntimeClass. You can use RuntimeClass to select your runtime, because your Windows runtime may be different from your Linux runtime. And your Windows nodes may have different versions in the future: the Windows GA only supports Windows Server 2019, right? In the future you may have more Windows Server versions, so you may choose a different container runtime handler for different nodes. That is another option, going forward, that you can use to select nodes; a sketch of a RuntimeClass follows at the end of this section. OK.

So one of the major things that we've spent the last six months really investing in, in order to build trust around Windows containers and the long-term investment that we are making in Windows containers in Kubernetes, is building a very robust set of tests. There are many, many hundreds of tests that run on a daily basis, and it's all on Testgrid, along with the rest of the testing in the community. We've worked very closely with Google and VMware to have this testing run on their platforms as well. We're expecting AWS to join, and we invite any other cloud providers to please join the testing effort, contribute resources, and run the tests on your platforms as well.

One of the challenges we face is that the Kubernetes community as a whole naturally grew up assuming Linux as the underlying host, and adding in the notion that you have a different host type, or even a different platform, has been a challenge. Remember what it was like with Kubernetes when you were trying to run on ARM; then imagine the challenge of adding a different host type. The tests, and the conformance tests in particular, are a great example of this. An active area of work right now is to go through all of the end-to-end tests and the conformance suite to get to a shared community definition of what it means to be conformant across different host types. Today, the way we're doing it for Windows is to exclude a set of tests that, together with SIG Testing, we have described as Linux-only tests. That's really a stopgap; it's not the end solution that we need. What we need is to come to a shared understanding of what conformance means across different platforms and different hosts. As the Kubernetes runtime spreads from the data center to the edge, there is going to be a very wide variety of runtime environments over time, so that's an active area of development right now. We've built a bunch of Windows-specific tests that we've added into the end-to-end suite, and we're testing 1.14 and 1.15 on an ongoing basis; obviously subsequent releases will be added to the suite as well. That's all available, and we invite anybody who cares about this to take a deep look and ask us any questions about it.
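Pointing back to the RuntimeClass discussion above, here is a minimal sketch of how a RuntimeClass can steer pods to a matching runtime. The class name `windows-2019` and the handler name `example-windows-handler` are hypothetical; the handler must match whatever name your CRI runtime on the node is actually configured with:

```yaml
# RuntimeClass (beta API since Kubernetes 1.14) maps a name to a CRI
# runtime handler configured on the node.
apiVersion: node.k8s.io/v1beta1
kind: RuntimeClass
metadata:
  name: windows-2019
handler: example-windows-handler   # hypothetical handler name
---
# A pod opts into that runtime via runtimeClassName.
apiVersion: v1
kind: Pod
metadata:
  name: iis-example
spec:
  runtimeClassName: windows-2019
  nodeSelector:
    kubernetes.io/os: windows
  containers:
  - name: iis
    image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019
```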
So the next point that I want to make is that there's been a worry in the community that Windows containers in Kubernetes are a toy, something that's just experimental and going to go away. I want to emphasize that this is not the case. There's a huge demand in the customer ecosystem for running Windows workloads orchestrated by Kubernetes, and a testament to that is how many different service providers are already providing solutions based on this very new, stable support for Windows containers in Kubernetes. You can see everybody from Amazon, Docker, Google, Huawei, Microsoft obviously, Rancher, Red Hat, VMware, and Pivotal. And I'm sure this list is incomplete; these are the ones I know about from my interactions in the community, and we collaborate with all of these people to make it possible. When I upload the slides, you'll have the links to all of these things. And if your name is not on the list and you want it there, please come talk to me.

If you are running your Windows containers on Azure, we also have some open-source projects available so that you can set up the environment and try it out. For example, you can use AKS Engine to provision a cluster with both Windows and Linux nodes, and you can read through the documentation and try out how it works. Absolutely; it's a great way to get your feet wet and also to operate clusters in production.

So let's transition from how we got to where we are, and where we are today, into where we're going in the future. I think Pengfei alluded to some of that when he started talking about Hyper-V and the alpha support for Hyper-V isolation. I'll give a quick overview of the big bullets, dig a little bit into what we're doing in containerd and how we're working with SIG Node, and then we'll take questions.

Container runtimes are kind of an esoteric topic, and not everybody is interested. Our goal as platform providers is to make it so that developers, and even operators, don't have to care about the runtime; it should just be transparent. In the Kubernetes community, there's CRI-O, there's the dockershim and the Docker runtime, and a huge ecosystem of other runtimes. We're seeing a lot of consolidation around containerd, via CRI, as a way to enable open plumbing of underlying operating system capabilities into the runtime for Kubernetes, and we're putting our weight behind that as well. The primary goal for us is to make the Windows container experience as analogous to the Linux container experience in Kubernetes as possible. There are major differences in the way the Windows operating system works compared to Linux that mean your experience today is fundamentally different, and there are limitations in the way the Docker runtime works that we can't fix; the abstractions aren't quite right. Containerd via CRI allows us to fix that.

What's happening is that the dockershim layer will be deprecated, and that is coming. So if you're depending on things that happen in the dockershim, we want to understand that and work with you to make sure that everybody is ready for that deprecation when it comes. An example of that is the gMSA support. We've worked very closely with Docker to understand the way in which they're implementing gMSA and customers' dependency on it, and we're working to make sure that support is built into the containerd runtime as well, so that there's a seamless transition for those capabilities as we go forward.
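For a sense of what the gMSA support looks like from the workload side, here is a hedged sketch of a pod referencing a gMSA credential spec, using the field names from the later beta-shaped API (the alpha release discussed in this talk wired this up through pod annotations instead); the credential-spec name is hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gmsa-webapp
spec:
  securityContext:
    windowsOptions:
      # References a GMSACredentialSpec cluster resource created by the
      # cluster admin; "webapp-credspec" is a hypothetical name.
      gmsaCredentialSpecName: webapp-credspec
  nodeSelector:
    kubernetes.io/os: windows
  containers:
  - name: app
    image: mcr.microsoft.com/windows/servercore:ltsc2019
```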
There may be other examples that I'm not aware of yet, and I ask you to please bring those forward. Another thing this will do is reduce the footprint. Frankly, one of the weaknesses of Windows containers is the size of the images and the sheer size of the node overhead that you need to run, and containerd is going to help us shrink that. That's just another step in optimizing everything along the way. It will also enable Hyper-V isolation, which is a key capability for reducing the user-experience friction of taking advantage of Windows containers; we talked about the need to have the same base image layer as your host operating system when you're using the process isolation mechanism. A limitation of the current Docker runtime is that if you enable Hyper-V isolation, your pod can't run more than one container. But with containerd we can support that, because containerd is built on very new features from Windows Server. This is exactly the kind of plumbing that containerd opens up for us. We'll have much better storage support, we're enabling new Windows features to be exposed through containerd, and with Hyper-V isolation we're enabling new abilities in Windows to control CPU and memory that are not possible today with the Docker runtime. So this is the biggest thing we are working on right now, and we really want feedback. There's an open proposal that I'm happy to share with you; actually, the link is here, to the Kubernetes Enhancement Proposal, if you're not aware of that process. It's a shared community way to get feedback on major changes to Kubernetes, and it's all documented at the link that will be available.

So, Windows support in Kubernetes moved to GA in 1.14, and we have made some progress in the latest release, 1.15. But if you check the test results we publish on Testgrid, you may find that some test cases are not stable enough; there are some test failures from time to time. So we would welcome contributions from all users. From the community side, we have also set out steps for how you can contribute. For example, you can use and try the features and report your issues and your use cases on GitHub, on the Slack channel, and on the Google Groups. You can also join the SIG Windows meetings, and you can send your feedback in all of these places. Also, if you are a developer and you would like to contribute some changes, or maybe you have some ideas to propose that would benefit all of us in the whole Kubernetes community, you can also make contributions. For example, you can send changes to our documentation, add missing steps in our step-by-step guides, or fix some bugs. We would welcome that kind of contribution. And lastly, if you have contributed a lot of features in this area and you would like to get more deeply involved, you can take on a bigger role in SIG Windows: you can mentor someone else, and you can help review and improve changes from others. So finally, together we can make Windows support more stable and bring it closer to parity with Linux containers.

Another thing, maybe we can mention on the next slide: if you have decided to join the SIG Windows meeting, I think it's not very easy today, right? Yeah, today it's very difficult.
The meeting time is very late for participants from this region. So one of the things that I'm working with the SIG leadership to do is to shift the meeting times so that alternate sessions happen in a Europe-friendly time zone and then an Asia-friendly time zone. Every other meeting would alternate, which will make it easier to have participation from around the world. The challenge is that currently the two co-chairs, whose contact information we'll see in a minute, are both in North America; that's just the way things have happened so far, but we're working to change that. Very soon I'm going to put out a proposal in Slack and via the groups about the time zones, so please, if you want to participate, let your opinion be heard there.

Also, here in the presentation are our links to the recorded meetings. If you're not able to participate live, you can always go back and understand how the conversations are evolving in the community meetings; the link here is to the recordings, so we have a list of all of the meetings. We also very actively maintain, and go through in those meetings, the GitHub project board. The project board is another great way to get involved: it has a list of all of the issues that we're currently tracking and a prioritized backlog for each of the releases. We very actively maintain who's assigned, and we apply labels like good first issue, bug, documentation, and examples. So it's a great place to go to understand the dynamics of what's happening, start grabbing issues and contributing to them, commenting, and doing reviews on other people's work. That's a great way to participate. And really, this is an ask from me as a product-management kind of guy: I want to talk to you and understand your use cases. Why are you here? What is it that you want to get out of Windows in the Kubernetes ecosystem? The more information we have about this, the better a job we can all do together to solve the problems.

In addition, Pengfei is actually a tech lead on SIG Azure, and in SIG Windows he's a significant contributor. I have no official capacity in SIG Windows except that I play PM, run around, try to help, and close gaps. But here are the links to all of the ways in which you can reach the leadership and the community; the links will be active in the uploaded presentation.

So finally, we've just got a couple of minutes left. I want to open up the floor for questions. I'm also happy to show a demo later if anybody wants. Are there any questions? Here, Louie, with the mic.

Hi. What's the current state of service mesh on Windows?

Excellent question. I don't have a complete answer for you, because I actually don't know all of it. Currently, there is some work to get Envoy working great on Windows; that is a work in progress. I think we have early builds, and it's an area of active development. Once we have Envoy, that opens up possibilities for lots of others. There is one service mesh proxy that is known to work on Windows, and I can't remember it off the top of my head right now. We're also working closely with Buoyant and Linkerd to bring Linkerd support to Windows; I don't know the timeframe for that. But basically, the fundamental answer is: nascent, right? I'd love to hear your use case for it and what you're looking for, because service mesh means a lot of things, right? Are you looking for mutual TLS?
Are you looking for ingress? And actually, one thing to mention: you know that many service mesh solutions are based on sidecars, right? But sidecars actually sometimes don't work on Windows today, so we need some time for Windows containers to evolve and support the needed features there. Right; so actually, the containerd and Envoy support are kind of prerequisites for a lot of the sidecar models. Right, we need to move to containerd to develop more features there. Does that answer your question? Great. Other questions?

I'll use Chinese to ask. OK, OK. [Question asked in Chinese: can Windows containers support privileged containers? I don't think it's supported now.]

So the question is whether privileged containers are supported on Windows. The answer is that today, privileged containers are not. If you go to the Slack channel, Patrick Lang, who is one of the co-chairs, has posted a proposal with three different options, which we're exploring together with the rest of the community, for how to support privileged containers. This is a significant challenge in the Windows world.

I have another question. What about the patch-management strategy for Windows nodes in a hybrid cluster?

Awesome, an excellent question. So patching of Windows nodes, as you know, works very differently than it does for Linux. Right now, we've focused on making that as flexible as possible. We are defining a strategy for how we do that in AKS, the Azure Kubernetes Service, today; right now, it's left to the consumer to patch their nodes, but look for developments there soon. Essentially, you really need to focus on treating the nodes not as things that you maintain for a long period of time. You need to build your practices so that the nodes can go away, which then allows you to kill nodes and instantiate new nodes with whatever KB and version of the operating system you want. I know that doesn't work for every workload, and we understand that, so we're working on mechanisms for doing patching in place. Do you have any thoughts on that? I do, but we're out of time now. Yeah, yeah.

Great, so we'll be here for the rest of the conference. Feel free to reach out and ask any additional questions, or please bring your use cases. Thank you very much. Thank you.