Hello all. Welcome to our talk. Today we're going to be talking about Envoy support on Windows. My name is Sunjay Bhatia. I'm a software engineer. Previously I contributed to Windows container support in Cloud Foundry Application Runtime and to the Diego container scheduler in Cloud Foundry. Right now I'm a full-time contributor to the Envoy on Windows work. And presenting with me today is David. Hi everyone. I'm a program manager at Microsoft. I've been a member of the Windows Core Networking team for a few years now. The main areas I focus on are container networking and software-defined networking, particularly in the Kubernetes on Windows space. So what's on the agenda today? We'll start out looking at the motivations behind porting Envoy to Windows and the history of how we got to our first alpha release. What does the alpha release actually mean? How do you get started with it? We'll show a demo of Envoy on Windows in action, and after the demo we'll go over future plans, the roadmap, and how to get involved with the community, and we'll wrap up with a Q&A session after that. So, Envoy on Windows: why? Envoy was developed for Linux first, and it works great. But the reality is that in industry today many organizations have a mixture of OS platforms in their environments. Essentially, without Envoy on Windows, those organizations and users are faced with two choices: use a different proxy across Windows and Linux, or rewrite all Windows applications and migrate them to Linux, both of which are costly and expensive undertakings that take time. Our message today is: if you are using Envoy Proxy for Linux services to solve a particular problem, you should now be able to begin initial prototyping work on deriving those same benefits and utility for Windows-based applications and services, as well as infrastructure.
Also of note is that the Envoy project was founded with the belief that the network should be transparent to applications, and that when network and application problems do occur, it should be easy to determine the source of the problem. So porting Envoy to Windows is another step in line with the Envoy project's mission of making the network transparent to any application, regardless of language, architecture, or operating system. Being able to reuse and leverage existing investments in Envoy on Linux, and transfer that knowledge and training to Envoy for Windows, makes the overall system and architecture simpler and easier to manage for organizations. Essentially, you don't need to train personnel on multiple solutions that are contingent on which operating system your apps run on. Also of note is, of course, the rich feature set: getting Envoy to compile and run natively on Windows makes it possible for the project to deliver value through its rich features on Windows as well. Envoy is also becoming an industry standard thanks to its extensibility, its pluggable architecture with clear concepts, and its usage in service mesh. Support for Envoy on another operating system is only going to strengthen that position further. As for project growth, the benefits are twofold, I would say. Not only will this effort grow the number of proxies available in the Windows ecosystem, but porting Envoy to another OS also lowers the barrier to entry for contribution and usage, causing the Envoy project itself to grow. So let's take a look at the history and how we got to the alpha stage. This effort started in July 2017, with meetings between VMware and Microsoft to plan and look at what it would take to bring Envoy to Windows and compile it there. After that, in 2018, Bazel was identified as the unified build system.
It was a little bit of a setback that Bazel had issues compiling on Windows, so some patches were needed to get that working. This took a little bit of time, but in March 2020 Windows was integrated into the Envoy CI build pipeline, building envoy.exe, which was a major milestone. Since then there have been a number of performance improvements, and we enabled additional testing that gave us the confidence needed to sign off on an alpha release for Envoy on Windows. So now that we're here, we have an alpha for Windows. What does that actually mean? It means that Envoy is natively supported on Windows; it's not targeting the Windows Subsystem for Linux or anything similar. It is running on Windows itself. There are no prebuilt Envoy executables for Windows yet, however. We're working on that. So currently the expectation is that users need to build the project from master. The alpha also means that we're still soliciting feedback from our users. We expect that users will run into build or usage issues, and we will be triaging GitHub frequently and looking at the top pain points being reported by our users. That being said, even though we are triaging issues, there's no formal agreement that Envoy will be supported for production usage in its current alpha state. So how do you get started? There are a few requirements. The minimum Windows version is Windows Server 2019 or above. You need to set up a build environment using Bazel, and you need some familiarity with the command shell you're using to build Envoy. Also of note is that there are some features and extensions that are disabled since we're still in alpha. We expect to enable the relevant ones before a beta or a GA release. For example, signal handling and process lifecycle control are things we will be looking at, as well as the tracing extensions.
However, there might be some features that we find need to be disabled permanently, such as hot restart. So we will be assessing those and trying to enable as many of them as possible as we iterate through beta and eventually a GA release. In the slide details there are a few aka.ms links, which point to our alpha announcement blog post, a sample tutorial, and an onboarding guide. To get started, as I mentioned before, you need to build Envoy from the master branch. This is actually considered release quality at all times, because there's a lot of testing that goes on before code gets into that branch. To get started, there are build instructions on the Envoy proxy site; we've added Windows instructions next to the Linux build instructions. There's also a Docker image available if you want to build Envoy in that. So follow those URLs. One last gotcha: be careful with the shell when invoking Bazel on Windows. We have specifically tested the MSYS2 bash, but there are other shells like PowerShell and Command Prompt. There are different ways to compile Envoy on Windows, and we're looking for feedback from the community to see how they behave and whether there are any quirks, although we do expect that all of them should work. So, demo time, the exciting part. Here's a diagram of our demo of Envoy running in a common cloud-native deployment scenario that you might see in something like Kubernetes. We have a front-end Envoy running in a Docker container, set up with Docker Compose. It is configured with a listener serving TLS, and two upstream clusters that represent two different versions of an application. One of them is an app that serves an image of a dog; the other is an app that serves an image of a cat.
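A Docker Compose file for a topology like this might look roughly like the sketch below. The service names, image names, and paths here are illustrative assumptions, not the exact files from the demo.

```yaml
# Hypothetical docker-compose.yml for the demo topology.
version: "3.7"
services:
  front-envoy:
    image: envoy-windows:dev            # assumed locally built Envoy image
    ports:
      - "443:443"                       # TLS listener exposed to the host
      - "9901:9901"                     # Envoy admin interface
    volumes:
      - ./envoy-config:C:/envoy-config  # bind-mounted config directory
  dog-service:
    image: demo-app:dog                 # upstream app serving a dog image
  cat-service:
    image: demo-app:cat                 # upstream app serving a cat image
```

On Windows hosts, the container-side volume target is a Windows path, which is why the mount destination above uses a drive-letter path rather than a Linux-style one.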
We've configured this scenario to have mTLS between the front-end Envoy and the back-end services, and all of the configuration of the listeners and clusters is set up to be dynamically updateable, using Envoy's ability to watch file events and update configuration accordingly. So this might be something that you could see in a cloud-native deployment scenario: you have a sidecar Envoy sitting next to your application process, proxying traffic, serving TLS, and allowing you to encrypt all the traffic in flight in your network. Of course, you would not have this on a single host; you would run this across many hosts in a Kubernetes cluster, across multiple worker nodes. So let's dive into it. Let's see if it's actually working. For this demo, we have a Windows Server 2019 instance in Google Cloud. We have a VS Code session connected to that over SSH, and we're editing some files on that server. And here we have the Docker Compose setup for our demo. You can see the service configuration for the front-end Envoy proxy. We've got some ports that we're exposing to the Windows host, as well as some bind-mounted directories that will be inside that service container. Similarly, we have the same kind of setup for the dog service and the cat service, which will be our two upstream services. To access the demo from our local workstation, we've got some forwarded ports set up in our VS Code SSH session. So let's bring up the demo. We can do a docker-compose up to get our services running, and just check that they're running: docker-compose ps, everything's up and running and happy. And let's take a look at our service. We've got a picture of a dog, and when we refresh, sometimes we get a picture of a cat. So we've got traffic routing between these two services. And we can also take a look at the admin endpoint for this front-end Envoy proxy. This is what you would typically expect from Envoy. We're looking at the admin API.
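A bootstrap config for this kind of file-watched dynamic setup might look roughly like this sketch. The node names and file paths are illustrative assumptions; the key idea is that cluster and listener definitions live in files on disk, which Envoy reloads when they change.

```yaml
# Hypothetical bootstrap for the front Envoy: clusters and listeners are
# loaded from files and reloaded when those files are updated.
node:
  id: front-envoy
  cluster: demo
dynamic_resources:
  cds_config:
    path: C:/envoy-config/cds.yaml    # cluster definitions
  lds_config:
    path: C:/envoy-config/lds.yaml    # listener definitions
admin:
  access_log_path: C:/envoy-config/admin.log
  address:
    socket_address: { address: 0.0.0.0, port_value: 9901 }
```

Note that Envoy's filesystem watch reacts to a file being moved into place, which is why the demo edits a temporary copy and then does a move rather than editing the watched file directly.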
We can look at the certs that we've configured on this front Envoy proxy, we can look at the status of the clusters, everything that you would typically expect from the Envoy admin interface. Looking at the stats, we can see that we've got a few requests completed for each of our upstream services. So everything looks good. This is what you would typically expect, and nothing is different on Windows from what you would see with an Envoy instance on Linux or other platforms. So let's show an example of Envoy's dynamic config updating at work. We can take a look at the cluster configuration for our front Envoy, and we can see here that we have a transport socket configured with a TLS context. So this Envoy is going to present this cert to any upstreams that it contacts, and it's also going to do some validation of the cert that the upstream presents; similarly for the other cluster configuration. This is how we start to get mTLS between our front Envoy and our back-end services. And similarly, in the upstream Envoy configuration for the services, we have the same thing, this time in the downstream TLS context on the listener: we're presenting the certs and validating the CA of the certs presented to the upstream Envoy. So, just to show an example of dynamic configuration and cert updating: what happens when we update the upstream Envoy with an invalid cert? Are we actually doing any validation of the certs that are being presented here? Let's make a copy of this listeners file so that we can update it. We copy the Envoy listeners config to a temporary file, and instead of this cert we're going to pass in an invalid cert, and then do a move operation to get Envoy to notice the config update. All right, that update should get picked up by Envoy.
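The cluster-side TLS configuration being described could look roughly like this sketch, using the Envoy v3 API; the cluster name, hostnames, and certificate paths are illustrative assumptions.

```yaml
# Hypothetical upstream cluster with mTLS: Envoy presents a client cert
# and validates the server cert, optionally pinning its SAN.
clusters:
- name: dog_service
  connect_timeout: 1s
  type: STRICT_DNS
  load_assignment:
    cluster_name: dog_service
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address: { address: dog-service, port_value: 8443 }
  transport_socket:
    name: envoy.transport_sockets.tls
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
      common_tls_context:
        tls_certificates:
        - certificate_chain: { filename: C:/certs/front-envoy.crt }
          private_key: { filename: C:/certs/front-envoy.key }
        validation_context:
          trusted_ca: { filename: C:/certs/ca.crt }
          match_subject_alt_names:      # optional stricter SAN check
          - exact: "dog-service.example.com"
```

Swapping in a cert signed by an untrusted CA, or tightening the SAN match to a name the peer's cert does not carry, is exactly the kind of change the demo uses to force a verification failure.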
And when we go to query our services, eventually, once the cache is cleared out, we get a TLS error from the front Envoy saying that the certificate verification failed. So we can see that that path of the mutual TLS handshake is working: we're not trusting the CA that the certificates were generated from. That's great. Let's update back to the original state and fix our configuration here, so we'll be able to get our dogs and cats back. And now let's demonstrate the other side of the mTLS handshake and make the upstream Envoy do some stricter validation of the cert that the front proxy Envoy is presenting. Here we'll match on subject alternative names: we'll add an exact match on a name that the cert the downstream Envoy presents does not actually contain. So let's again do a move to trigger the config update, and we'll see a different error, a certificate unknown error coming from the upstream, since it is not able to verify the cert. So mTLS is an important thing for securing identity and encryption, and an important feature that Windows apps and Windows app deployments will be able to transparently take advantage of, if their Envoy mesh is configured this way, to verify the identity of upstream and downstream services and secure traffic. For this next demo, we'll be showing Envoy on a basic Windows VM. We'll be running Envoy as an edge or front proxy on a Windows Server 2019 machine and showing some basic Envoy concepts in action using two super simple demo apps. So what I have here is my Windows Server machine, and I have two apps, very creatively called App 1 and App 2, listening on ports 8080 and 8090. Just to show them, I'll click browse website. This is the first app, which just says hello from App 1, and the other app says hello from App 2.
What I have here now is a VS Code window that's connected remotely to my Windows Server machine, and I have a very basic envoy.yaml configuration file with some logs being saved and collected, and dynamic resources defined for clusters and listeners. Let's take a look at the YAML that I have right now. For the cds.yaml, I have a very basic round-robin load balancing policy that's balancing incoming traffic across the two instances of my app: App 1 listening on port 8080 and App 2 listening on port 8090. And for the listeners, I have a very basic HTTP listener, starting Envoy on port 80 and accepting and routing all requests to the backend cluster. A very, super simple demonstration of Envoy on Windows Server. Let's show this. We'll start Envoy, applying the envoy.yaml that I created. We can see here in the logs being printed that we have added one cluster, our backend cluster, and we have added a listener called listener_http. So now we should be able to connect and see this load balancing policy in action. Let me refresh my window here. We can see hello from App 1. Refresh the window: hello from App 2. We are seeing the load balancing policy at work, since we're load balancing between the two instances of the application, App 1 and App 2. One thing that is missing here, however, is that our connection is unencrypted. We're using HTTP/1.1, and this is something that Envoy can also help us out with very easily. So let's switch back to VS Code to update the configuration so we have proper encryption. I'll be updating my listeners here, copying and pasting them. So we have our new listener here; call it HTTPS. We update the port value from 80 to 443. I also have a cert that I generated earlier, and I'll be pointing Envoy to the cert that I configured for the website, essentially. So let's paste that in. You can see transport sockets enabling TLS, pointing to the certificate chain and private key, and we're also enabling H2. Save this and update the configuration.
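The new HTTPS listener being pasted in might look roughly like the sketch below, as a file-based LDS resource in the Envoy v3 API; the listener name, cluster name, and cert paths are illustrative assumptions.

```yaml
# Hypothetical lds.yaml entry for the HTTPS listener: TLS termination
# with ALPN advertising HTTP/2.
resources:
- "@type": type.googleapis.com/envoy.config.listener.v3.Listener
  name: listener_https
  address:
    socket_address: { address: 0.0.0.0, port_value: 443 }
  filter_chains:
  - filters:
    - name: envoy.filters.network.http_connection_manager
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
        stat_prefix: ingress_https
        route_config:
          virtual_hosts:
          - name: backend
            domains: ["*"]
            routes:
            - match: { prefix: "/" }
              route: { cluster: backend }   # round-robin cluster from cds.yaml
        http_filters:
        - name: envoy.filters.http.router
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
        common_tls_context:
          alpn_protocols: ["h2", "http/1.1"]   # enables H2 negotiation over TLS
          tls_certificates:
          - certificate_chain: { filename: C:/certs/site.crt }
            private_key: { filename: C:/certs/site.key }
```

Because the listener is a dynamic resource, saving this file and moving it into place is enough for Envoy to add the new HTTPS listener without restarting or touching the apps.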
And we can see in the logs that our update was accepted by Envoy, and it has added a new listener, HTTPS. Going back to our application, we should be able to go to HTTPS. So without any changes to our app, and thanks to Envoy's dynamic configuration update, we now have an encrypted connection using TLS 1.3, and we have also enabled H2 for our connection. This shows how easy it is to configure and add security and TLS to an existing website, without having to make any changes to the underlying apps, and to do it in a dynamic and convenient way. So what's next for Envoy on Windows? We're planning on providing beta release binaries for Envoy on Windows in the 1.16 release timeframe; you can follow the GitHub milestone that's linked there. Some specific planned improvements we have are to enable some of the features and extensions that are currently disabled, act on user feedback and fix any rough edges that get reported, improve event loop performance, and improve process lifecycle control, so some of the integration with the Windows Service Control Manager. In addition, we're planning on making a generally available release in the Envoy 1.17 release timeframe. That should be in quarter one or quarter two of next year. And farther in the future, some Windows features that will be available in the release of Windows Server coming in the first half of 2021 will enable some new things with Envoy on Windows. Routing egress traffic from your application through Envoy for use in a service mesh, similar to how service mesh implementations use iptables on Linux, should be possible with that release, as well as improved Windows SDKs with new socket APIs that support edge triggering for file system events. So we would like to lean on the community to help us get to a GA release. How can you contribute? As you start using Envoy on Windows, you may run into some known issues.
The area/windows tag on the GitHub issue tracker for Envoy is a great place to start: look at known issues and report new issues. And of course PRs are always welcome, for documentation improvements as well as fixes and feature improvements. If you want more one-to-one contact with the current crop of contributors on Windows, reach out to the Envoy Windows dev channel on the Envoy Slack workspace. In addition, we have a weekly community meeting specifically for Windows development where we coordinate work and set the roadmap, so if you want to get involved, definitely come to that meeting. Finally, we'd like to extend a huge shout out and a special thank you to the contributors from Microsoft, VMware, and the Envoy maintainers who worked tirelessly to make Envoy on Windows a reality. And with that, thank you all for attending our presentation. I hope it was informative, and we'd like to open it up for questions. Hey y'all, if anyone has any questions, please ask us now. We have a few minutes to answer any questions you have. Well, it looks like we started a few minutes late, so we've already passed 12:01 now. Please let us know any questions you have in the chat, and you can reach us on Slack as well. We'll relinquish the space so that the next session can start. Thank you everyone.