I'm Matthew Horan. I'm a manager in CF R&D at Pivotal. And I'm Natalie Ariano. I'm an engineer on the BOSH Windows team. So just a quick overview of why we thought Windows on Cloud Foundry should be a thing. With Spring and Java and that world, we had a lot of validation that developers had a positive experience with cf push. And we knew that there was a large market out there that was using .NET. There's a huge opportunity for us to engage with maybe 50% of developers in the enterprise space, the ones on .NET, that we were missing before. And so, as we embarked on Diego and our refactor of the container runtime for Cloud Foundry, we had an opportunity to also investigate adding Windows support to the platform in a first-class way. We actually had Windows support in a sort of second-class way with Iron Foundry, which was a fork of Cloud Foundry in the early days. We wanted to make that first-class support. And so we started that off with an MVP. Before spending too much time and investing in the tooling for that, we wanted to do a quick validation. So today we are going to tell the story of the journey of our team: the early days of bringing up Windows cells with MSI installers, the road that we traveled in porting the BOSH agent over to Windows, some choices that we made and some learnings from that journey, and finally the benefits that we found in moving to BOSH. So the initial MVP was this MSI workflow that we came up with. MSIs are installers for Windows; they're just packages of bits that you can install on a system. Windows administrators are pretty familiar with MSIs. BOSH is totally not a thing they've heard of. And so as we were trying out an MVP, it made a lot of sense for us to just say, here's an installer, why don't you try that out and see if you like this Cloud Foundry thing.
But that initial MVP had these Windows cells kind of off on their own on the side, while your Linux VMs were all nicely deployed and managed with BOSH. If you used Pivotal Cloud Foundry, you had the ERT deploying all that for you. And then you had to manage this thing outside of that. But the great thing is the developer experience was cf push with a -s flag for stack, and you could say, I want this to run on Windows. So we really wanted to do that early validation for the developer to make sure that this made sense. We realized that the operator experience was lacking here, but we wanted to validate that developers even want the cf push experience before we spent too much time developing a platform. Our operators were forced to manually spin up VMs. This is a relatively heavyweight process at some of our customer sites. Sometimes it can take six months plus to get a VM, and then you have to go get the validation to install these MSIs on there. But we already had all these components: the Diego rep, Consul, Metron. Those are all written in Go. They were easily ported to Windows. And all we had to do was get that running on Windows cells. We also had to write the Garden server for Windows. That was fun, but that's a different talk. So going into a little bit more detail on how we get the VMs up and running. On AWS you could specify a CloudFormation template and provide properties like the subnet that you're working in and the security groups that should be applied. This automation was not available on other IaaSes like vSphere. We also needed a way to get the right configuration on the VMs, so we needed to install Windows features, configure DNS settings, etc. And this is again something that was automated a little bit on AWS, but was another manual step and possible point of failure elsewhere. So let's look at what is actually running on the cells. We have Garden Windows, which is the containerization technology. This was written and maintained by the Greenhouse team.
We also have Diego Windows, which comprises three jobs: Consul, rep, and Metron. This was not written or maintained by our team. And so we didn't have a lot of visibility into what could be going on with those jobs and how they might be changing. And finally we have Hakim, which is not necessary for the cell to function in Diego, but is something that we wrote to help with troubleshooting and more easily pinpointing customer issues. So we needed to be able to start all those jobs with the right configuration so that they could work with the rest of Cloud Foundry. That meant we needed to pass a bunch of parameters to the MSI installers. An example would be something like the SSL certificate for the rep to communicate with the BBS. So in order to figure out what these values should be, we needed a way to reach out to the BOSH director, grab those values, and pass them to the MSI. This is what our install script generator was for. This turned into a bit of a pain point for our team to maintain, because as more and more of these properties got added, it got more and more complicated. And because we didn't write the Diego jobs, we were often playing catch-up to fix things when there were new additions or changes. Another thing that we encountered is that our Windows operators already have a preferred method for maintaining configuration on their servers. Typically this involves domain joining the servers together and maybe applying some group policies. This is another possible point of failure, because those policies could interfere with some of the settings that we needed for our technology to function properly. An example might be enabling interactive logon for IIS users. So we discouraged this practice. We consider domain join to be inconsistent with twelve-factor principles, so we consider it an anti-pattern for CF.
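The install script generator described above, which pulled values from the BOSH deployment and handed them to the MSI, can be sketched roughly as follows. This is a minimal illustration, not the team's actual tool: the property names, the certificate path, and the `DiegoWindows.msi` file name are hypothetical, and the values are hard-coded rather than fetched from a director. The only real interface used is the standard `msiexec` convention of passing public properties as `NAME=value` after `/i`.

```python
def msi_install_command(msi_path, properties):
    """Build an msiexec command line that passes each property as NAME=value."""
    # /i installs the package; /passive and /norestart are standard msiexec switches.
    command = ["msiexec", "/i", msi_path, "/passive", "/norestart"]
    # Sort the names so the generated script is stable from run to run.
    for name in sorted(properties):
        command.append(f"{name}={properties[name]}")
    return command


# Hypothetical values standing in for what would be read from the BOSH director.
manifest_properties = {
    "BBS_CA_FILE": r"C:\certs\bbs_ca.crt",
    "CONSUL_IPS": "10.0.0.5",
}

command = msi_install_command("DiegoWindows.msi", manifest_properties)
print(" ".join(command))
```

Every property added to the deployment means another entry in that mapping, which is the maintenance burden the speakers describe.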
Right. So lots of limitations there with that approach. The manual steps led to lots of problems. If BOSH properties were changed in a manifest when you were doing a BOSH deployment, inevitably you either forgot to re-run the install script generator and then roll your cell, or something broke during that deployment process. And our Linux operators are so used to this seamless canary deployment and rolling upgrade process, which they weren't getting with Windows, that what we were hearing was: okay, everything's working great for the CF developer, but the Windows operator experience is terrible. So we were getting that feedback and we heard it loud and clear. One of the real challenges with the rolling upgrades is that you basically had to set up a brand new cell, and remember, if it takes you six months to get a VM, somehow this math isn't going to work for upgrading your Cloud Foundry. Then you have to drain an old cell, wait for the new one to get those applications, and then what do you do with the old VM? Do you try to upgrade it or destroy it? Inevitably it was complicated and fragile. The group policies were always a huge problem, but operators love group policies because they could do things like distribute CA certificates to all their servers really easily from a central place. So we heard that, but we knew that this was a thing that BOSH might be able to solve for us. Another challenge was that the hostnames of your servers must be unique. One thing that operators of Windows servers would do, particularly with vSphere or OpenStack, is set up one template VM, clone it, and then just start it up again. Oftentimes that meant that the VM had the same hostname over and over again, which meant that Consul would just register the same VM name over and over again. So you'd actually have one cell, even though you had like 10. That really didn't work too well. But these manual processes just led to problems over and over again.
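The cloned-hostname problem is easy to see in a toy model. The sketch below is not Consul's API; it only assumes, as the talk describes, that registration is keyed by hostname, so a later registration under the same name replaces the earlier one. The function name and the hostnames are made up for illustration.

```python
def register_cells(hostnames):
    """Register each VM under its hostname; a duplicate name overwrites the earlier entry."""
    registry = {}
    for index, hostname in enumerate(hostnames):
        registry[hostname] = f"vm-{index}"
    return registry


# Ten VMs cloned from one template all keep the template's hostname.
cloned = register_cells(["WIN-TEMPLATE"] * 10)

# Ten VMs that were each given a unique hostname.
unique = register_cells([f"win-cell-{i}" for i in range(10)])

print(len(cloned), len(unique))
```

With cloned hostnames only one cell is visible, while unique hostnames give all ten.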
Oftentimes people wouldn't run setup.ps1, which would set up the default firewall rules, and then application security groups wouldn't work. These are all things that we can automate with BOSH. One of my favorite tweets from a couple of years ago is this picture of a presentation talking about how no CEO is ever proud of you for configuring servers. And maybe you noticed in the background this is Mordor. So the outcome of this process was that operators were understandably underwhelmed, but developers were really happy. This was the validation of the MVP that we were looking for before investing in BOSH Windows. Great. So our mission here was to deploy Windows cells just like the rest of Cloud Foundry. We didn't want these snowflake cells that inevitably ended up being deployed. So great, you got that Windows server. It worked really well. Copy it. Change a little thing about it. That's terrible. Really, really bad idea. We wanted everything to be easy to rebuild and automatable. And this is the thing that BOSH just does for you. Ultimately, we just wanted to distribute the releases and give you a manifest and a couple of stemcells, and there'd be no more manual deployment. So we needed a stemcell running the BOSH agent. Let's look at the work that was involved to make the agent work on Windows. The agent is written in Go, which is great, because we can compile it for Linux or Windows. We compile all of the code for each OS, but we invoke different code paths depending on which OS we're running on. So there is a lot of shared functionality between Linux and Windows in the agent, but some things needed to be implemented differently on Windows. The work of our team was to go through and figure out what needed to be implemented differently on Windows, and to do that one function at a time, starting with the platform interface.
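The idea of shared code that calls into an OS-specific platform implementation can be sketched as below. This is an illustration in Python under stated assumptions, not the agent's Go code: the class names, the `job_supervisor` method, and the `platform_for` helper are hypothetical, and the supervisor names simply echo what the talk says each OS uses.

```python
import platform
from abc import ABC, abstractmethod


class Platform(ABC):
    """The operations the shared agent code needs from the host OS."""

    @abstractmethod
    def job_supervisor(self) -> str:
        """Name of the mechanism used to start and stop job processes."""


class LinuxPlatform(Platform):
    def job_supervisor(self) -> str:
        return "Monit"


class WindowsPlatform(Platform):
    def job_supervisor(self) -> str:
        return "Windows service wrapper"


def platform_for(os_name=None):
    """Pick the implementation for the given OS name, defaulting to the current host."""
    os_name = os_name or platform.system()
    if os_name == "Windows":
        return WindowsPlatform()
    return LinuxPlatform()


print(platform_for("Windows").job_supervisor())
print(platform_for("Linux").job_supervisor())
```

Porting one function at a time then amounts to filling in one method of the Windows implementation while the callers stay unchanged.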
And part of our work in testing this was to reverse engineer the director protocol, so essentially to write a fake director that sends commands to our agent and makes assertions on the messages being passed back and forth. So this is an illustration of one of the major differences between Windows and Linux in the agent. On Linux, we use Monit to start and stop processes, but on Windows, we decided to use the Windows Service Wrapper, or WinSW, which is a third-party library that allows us to start our processes as Windows services using the Windows Service API. And so this means that our monit file looks different from what you might be used to on Linux. On the left is Linux. On the right is Windows. On Windows, it's just a JSON file, and you specify the process to run, any arguments, and environment variables, and this all gets packaged up as an XML file that we send to WinSW. So what else is different? Our implementation with WinSW means that it's not easy for us to send a Ctrl-C to our running process, so we just SIGKILL everything. And this means that any cleanup that would be required to halt your process gracefully should be done in a drain script, because right now stop scripts aren't supported. We encountered some difficulties in tailing log files on Windows, so each individual job is actually responsible for forwarding its logs to a syslog endpoint. That means that in your release, you have to specify that endpoint separately for each job. And finally, our packaging script is a PowerShell script, which is significant because that won't work on a Linux machine. In compiling releases, we don't have a way to specify which OS should run the packaging step, so we needed a way to skip this on Linux. The way that we did it was not complicated, but a little bit of wizardry: sourcing a script called exiter.ps1 at the top of each packaging script.
And this leverages some different functionality between Windows and Linux. I should say first that exiter.ps1 is just one line of code, just exit 0. On Linux, this actually causes the calling script to exit as well. But on Windows, it is spun up in a sub-process and the sub-process dies, but the rest of the script continues to execute. So this accomplishes the differential behavior that we were looking for. It's still a bit of an MVP, right? So for Windows BOSH jobs, we basically decided, okay, we've had the MSIs from before, so let's figure out how to get rid of those. The jobs that we have are rep_windows, consul_agent_windows, metron_agent_windows; you notice a pattern here, _windows. In order to upstream these into other releases and handle the differences in both the packaging.ps1 and the actual monit file specification, we had to namespace everything. This has been a little bit of a point of contention for release authors because it leads to a little bit of duplication. We're definitely open to feedback on this interface, and we want to make it better. But ultimately the goal was that we didn't want, as the Windows team, to be maintaining other people's component releases anymore. And so this allowed us to upstream everything to the responsible teams. The great thing is we already did the hard work of figuring out how to cross-compile every other team's component for Windows, so we just gave them all that code, which is great. We also had BOSH now. Everyone's afraid of Windows, but no more excuses. We can just say, here's a Concourse worker release. You like Concourse, so just run that. And that's been an awesome step forward for our team in collaborating with everyone else on Windows. The only thing that is maintained by a Windows-specific team, except for the BOSH agent itself now, is Garden Windows. And this particular component is only responsible for containerization.
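The JSON-to-XML step mentioned earlier, where the Windows monit file is turned into a configuration for the service wrapper, might look roughly like this. It is a sketch under assumptions: the JSON layout (`processes`, `name`, `executable`, `args`, `env`), the example path, and the `winsw_config` helper are illustrative and are not taken from the agent's source. The XML element names (`service`, `id`, `name`, `description`, `executable`, `arguments`, `env`) are the ones WinSW's configuration file uses.

```python
import json
import xml.etree.ElementTree as ET


def winsw_config(process):
    """Render one process entry as a WinSW-style <service> XML document."""
    service = ET.Element("service")
    ET.SubElement(service, "id").text = process["name"]
    ET.SubElement(service, "name").text = process["name"]
    ET.SubElement(service, "description").text = "BOSH job process " + process["name"]
    ET.SubElement(service, "executable").text = process["executable"]
    args = process.get("args", [])
    if args:
        # Naive join for illustration; arguments containing spaces would need quoting.
        ET.SubElement(service, "arguments").text = " ".join(args)
    for key, value in sorted(process.get("env", {}).items()):
        ET.SubElement(service, "env", {"name": key, "value": value})
    return ET.tostring(service, encoding="unicode")


# Hypothetical monit JSON for a single job process.
monit_json = r"""
{
  "processes": [
    {
      "name": "rep_windows",
      "executable": "C:\\var\\vcap\\packages\\rep_windows\\rep.exe",
      "args": ["-config", "rep.json"],
      "env": {"LOG_LEVEL": "info"}
    }
  ]
}
"""

configs = [winsw_config(p) for p in json.loads(monit_json)["processes"]]
print(configs[0])
```

Each process in the JSON file yields one service definition, so a job with several processes would produce several Windows services.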
So the operator experience now is that there's a cf-deployment ops file. If you want a Windows cell, you just opt into that behavior. You have to upload the Windows stemcell as well. There's still no domain join. We're still figuring out our story there, but we'll have more on that soon. But as an operator you can finally focus on deploying applications and enabling that within your organization instead of managing servers, which is the whole CF story. There's even an antivirus add-on. Funny story: one of our customers really wanted antivirus for their Node.js app, and so they stood up Windows servers with an antivirus add-on to run Node.js. It works. So for stemcells, we build and publish stemcells for public IaaSes. If you're using Azure, GCP, or AWS, you can pull down a stemcell for that IaaS. For on-prem, for OpenStack or for vSphere, the process is currently a bit more complicated, and we'll talk a little bit about that later on. It's basically due to licensing constraints. Distribution, redistribution, Windows, Microsoft: it's complicated. We realize that this is one of the most difficult things about our current solution for BOSH-deployed Windows, and we are actively working with Microsoft on making that better. Great tweet from this morning. So we do have the Windows stemcells live on bosh.io for all the public IaaSes. If you'd like to check those out, you can go and grab those. These are light stemcells. We're not distributing Windows. So tell all your friends they can't get a free copy of Windows from this. We still have some limitations of the BOSH agent on Windows compared to Linux. We don't currently support all of the features that are supported on Linux. That includes stop scripts, as I mentioned, persistent disks, more than one ephemeral disk, and bosh ssh. This is something that we're working on. And also, as I mentioned, we don't currently have compiled releases for Windows. So what's next for our team?
We're working on implementing the missing features. Stop scripts should be ready very soon. We're working on building a stemcell for Windows Server 2016. We're hoping to be able to share a strategy for a better on-prem stemcell experience. And we're looking to forward the Windows event log to our syslog endpoint, so that should make debugging easier. Great. So if you'd like to get in touch, you can find us in the BOSH channel on the Cloud Foundry Slack. There's also William Martin, who's in the second row here, and Colin Jackson, who's back in New York holding down the fort. You can also find us on the cf-dev mailing list. So please get in touch. Any questions? Zach?
Why was Hakim called Hakim?
Why was Hakim called Hakim? There's a story there. I don't remember the entire story. Hakim means a thing in Arabic, and we liked what it meant. Basically we wanted the thing that checks the health of your Windows installation, and so that was a good name. We wanted a good name that you could remember, that was short, and that you could type easily. That's the story.
On Server 2016, are you going to start looking at integrations with Bash on Windows, or are you sticking with the PowerShell route?
Great question. So basically on Server 2016, there's a subsystem for Linux, and there's true Bash support. Will we investigate that? Maybe, but we really like PowerShell. So we'll see where that goes. Also, generally, Windows administrators are more comfortable with PowerShell than with Bash, so I guess we'll have to see who our target audience is for writing those scripts and figure out what makes sense. There's no reason that it couldn't work, and then we could support both, ultimately. Cool. Awesome.
So, the compiled release challenges. Basically we wrote all this stuff, we put together the agent, we shipped it out, and then we discovered during an internal release process for a PCF release that we were breaking the entire process for building ERT.
And so our quick way around that was this sourcing of exiter.ps1, because the folks building that compiled release didn't have a way to specify, say, skip the Windows jobs. So making a proper solution for this is a thing that we should definitely talk about.
For Ops Manager and other tiles: we have support in Ops Manager. If you upload a Windows stemcell, it'll be recognized. There's both the regular Linux ERT and a Windows runtime tile that you can upload as well. So you upload both of those, and that'll deploy Windows cells for you. So in addition to the open source route that we mentioned, you can have it Pivotal-deployed.
For log aggregation: Loggregator works just out of the box. That's the Metron agent, so that's already running on Windows. Natalie mentioned the syslog and the log forwarder, and that's a thing that we're working on right now. Evan?
I heard something around licensing there, but if you use those light stemcells, the billing for licensing comes out of your IaaS, right? So that's all taken care of.
So you're not getting Windows for free. I'll make that clear again. We are not giving Windows away for free. If you use a public IaaS, you will get billed for that.
So, 2016 stemcells, what?
So we built one. It works. But what do you want 2016 for? Come and find me after. I'll talk a lot with you about that. Yeah. We actually have been working with 2016 quite extensively. The Garden anchor and William are here in the second row. You can talk with them all about that, or me. But it's boring to talk to me and it's more interesting to talk to them.
So you just need separate jobs. Right. So you don't need a dedicated release. For instance, there's one CF release that has all the jobs in there. There's one Diego release that has all the jobs in there. But we use that little namespace hack of the _windows suffix to separate out the jobs. Anyone else? Yeah.
When do you think you'll have persistent store?
Well, yeah. When will we have persistent store? Maybe end of August. We actually did a spike, I think Natalie and I, on persistence. And we discovered it's way more complicated than you'd think it is, just based on things that the IaaS does with disks and the way that BOSH presents disks to VMs. So we thought it was going to be really easy. We spent a total of eight hours on it, discovered that it wasn't, and then focused on other things. Yeah, there are also IaaS subtleties. Great. Well, you can find us after. Ask any more questions and we'll be around.