Hi, everybody. Thank you to Paula for organizing this and giving me the opportunity to speak. This is the title of the topic Paula gave me: how to use Concourse CI to deliver BOSH releases. That's what I'll be talking about, nominally. I say nominally because this is BOSH Day, and this may be the one conference where people actually want to see tons of YAML. You're going to be disappointed. I will try to substantially talk about BOSH and Concourse, but I took some liberties to go a little bit meta to start with, and then to talk about why we arrived at some of the conclusions we have on some of our teams around how we deliver BOSH releases and how we use Concourse.

A little bit about me: I'm a product manager at Pivotal. Some of the products I manage are cf-release, etcd-release, consul-release, and BOSH Bootloader. Previously, I was an engineer working on Diego, Ops Manager, and BOSH.

So what are we going to talk about? Why software in the first place, going real meta on you. Complexity versus simplicity. Testable, discoverable contracts. Complexity in the BOSH ecosystem. And then some of the recommended practices around BOSH and Concourse that we've arrived at.

All right. So, without claiming to be comprehensive, what are some reasons why we do things with machines in the first place? One obvious reason is to automate physical labor. Using machines, we can do things faster, more consistently, at higher scale, and more reproducibly: things we just can't do physically, or that produce a lot of defects if you get a human to do them enough times. People throw out their backs. You can only build skyscrapers so high before you need some sort of machine.

We can also use machines, and especially software, to automate cognitive labor. I like this term that Martin Ford used in his CF Summit keynote, cognitive labor. I think it describes a lot of what we're trying to solve for when we build software. You can probably recognize that this is an xkcd comic. We try to solve for the same sorts of problems: speed, consistency, scale, and reproducibility. The crux is that complex cognitive tasks are hard to do well.

Let me dive into what I mean by some of those terms. Complex, meaning many pieces, unclear interactions, and requiring implicit knowledge. Cognitive tasks, such as doing taxes, deploying software, cf push. When my mom and dad ask me what I do, I say I build software for people who build software, because we have the same problems as the people we're building software for. Provisioning servers is hard, and getting it right, doing it the same way every time, is hard.

One of my favorite stories is the story of Knight Capital. If you don't know it, just Google it. It's a great story in human error and things that could have been automated, where a mistake in provisioning servers makes you lose $400 million in half an hour. If you think about provisioning a web server for your app and scaling it up, it's all these many pieces, unclear interactions, requiring implicit knowledge. How do you teach the next person on your team to do this? These are all the reasons why we love cf push and bosh deploy. And they're hard to do well, right? Just like I said before: doing it fast, consistently, at scale, reproducibly. There are many more reasons than this why we build software; solving for complexity is just one of them, and there are many different sources of complexity and many ways to deal with them.
But one approach, or one rule of thumb to keep in mind when building anything, is to have testable, discoverable contracts. When you have to do your taxes, there are all these forms, and nobody knows how to do taxes, let's be real. What's nice about something like TurboTax is that it provides a very clear interaction: you log in, you enter the things, and you hit next. It provides interactions that let you do what you're trying to do, and nothing else. It asks the consumer for the input required for those interactions, and nothing more. It's very clear: just fill out this line and this line; all those other blank spots in those forms, you don't have to fill out. Are you missing something? It'll tell you exactly what you're missing and where you can get it. And it provides the output the solution is supposed to provide, and nothing else. Don't confuse people with other noise. I say people, but I use the term consumers here, and I'll get to why in a second.

Everything you see up here, these first three bullets, is a contract: this is the interaction I'm going to provide, these are the inputs you need to give me, and this is the output and the side effects I'm going to create for you. What's key is the "and no more." Make it clean and simple. One risk with doing this kind of thing, though, is that maybe you've just shoved the complexity somewhere else; maybe you've simplified one thing by making something else more complex. The example I have here: we all know and love Concourse, which was originally an Alex Suraci side project. We're probably all familiar with some other Alex Suraci side projects, which solved some problems and made some things simpler, but I think a lot of people have had to fight with the additional complexity they created in just another place. So making these contracts testable and discoverable, making it clear how you're supposed to use something and how it might work, is key to not just shoving the complexity somewhere else and creating a different problem.

On the previous slide, I used the term consumers to keep it general, because this problem crops up not just in user-facing products and services, but also between the components of the services we care about, which are services deployed as distributed systems, right? This is BOSH Day. Between the system components themselves, you want to keep those contracts clean and simple. And going down a level, between the processes running on a particular component in your system, and even further down, to your modules, objects, and functions, right down to the level of your code.

This has quite direct parallels to the stuff we build in the BOSH ecosystem. Going back a slide: user-facing products and services, that's your BOSH deployment. The distributed system components, those are your BOSH instance groups. BOSH releases and jobs, those are your processes. And the business-logic source code, the src directory of your release, that's your source code. These design concerns happen at every layer. Whether you're a product manager worrying about the end user or a developer worrying about your BOSH releases or your source code, these contracts, these interfaces, this simplicity, they crop up at every level.
One thing I want to highlight here is that I made it a point to separate BOSH deployments and BOSH releases as separate concerns. That's going to be a theme I come back to a little later. In addition to these four things here, the test pipelines themselves can also suffer from complexity. If you're delivering something using Concourse with your own test pipelines, those pipelines can get quite complex, and if you have to ramp up a new team member who can't make heads or tails of how anything gets built in your system, that can be a big problem. Concourse goes a long way toward solving that. With older systems like Jenkins and GoCD, the MO for interacting with them was clicking around in a GUI, and after you've clicked around, that information is lost. You can try to export the XML and check it in somewhere, but that's kind of an afterthought. With Concourse, there's only one way to do it, and it's declarative, in your pipeline config. Still, those pipelines can get complex, and there can still be hidden assumptions, unknown moving parts, unseen dependencies, and that sort of thing. So I'm going to talk about some of the recommended practices we've arrived at for dealing with these things.

One is: no snowflake environments. Something that's typical to do when you have a BOSH deployment you're distributing is to actually deploy it somewhere in your CI. You might spin up an AWS environment, you might spin up a BOSH Lite environment, what have you. And you might have hacked it together by hand, clicking around in the AWS console, or you ran some script once and now you never run it again because nobody knows what happens if you do. That's how we've done things on Cloud Foundry historically. One of our main initial environments for integrating Cloud Foundry is called A1. That thing was set up years ago, before CloudFormation was really a thing, and it was all done by clicking around. Now nobody wants to touch it. Nobody knows how it got there, nobody knows what's in it, nobody knows whether you can remove stuff from it. It's a snowflake, and it leads to a lot of fear in the development process.

So one of the recommended practices is to not let any environment you use in your CI become a snowflake. Rather, automate the provisioning of your environments in your CI, and actually have CI run that build continuously. Prove that you can idempotently recreate your environment at any time. Worst case scenario, if anything goes wrong, blow it away and recreate it from scratch: click a button and bring it back. And separate out your per-environment config resources.

Let me show you one of our pipelines. This is our mega-ci; I won't get into the history of the name. The infrastructure team works on consul-release and etcd-release, and the environment that team's own Concourse is deployed to is deployed by this job. I'm not going to click it right now, but in theory I could, and it should just no-op: it should see that none of the AWS resources have changed, so it won't touch anything there, and nothing has changed in the BOSH configuration we're using to deploy that Concourse, so it should just end after a few seconds of no-opping.
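To make that concrete, here is a minimal sketch of what such a continuously run provisioning job could look like in a Concourse pipeline. This is an illustration, not our actual pipeline: the mega-ci and environment-config resource names and task paths are hypothetical, and the resource definitions are omitted for brevity.

```yaml
jobs:
- name: provision-ci-environment
  plan:
  - get: mega-ci             # hypothetical repo holding task configs and scripts
    trigger: true
  - get: environment-config  # hypothetical per-environment repo: templates, params, credentials
    trigger: true
  - task: update-cloudformation-stack
    file: mega-ci/tasks/update-cloudformation-stack.yml
  - task: deploy-concourse
    file: mega-ci/tasks/deploy-concourse.yml
```

Because each task converges on declared state instead of imperatively creating resources, rerunning the job against an unchanged environment is a fast no-op, which is exactly what makes it safe to keep wired up and running continuously in CI.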
So being able to reproducibly create your environments, having no snowflakes, was one of the points I mentioned, and the other was separating out your environment config. We've come upon a pattern we like quite a bit, which is to have a separate repo for every environment. By environment I mean, if you think about AWS, a VPC with whatever subnets and load balancers and other things you need. For example, if you're testing a release or deployment and you want to test it on AWS, vSphere, and OpenStack, you might have an OpenStack environment, a vSphere environment, and an AWS environment, and your Concourse itself may live in a separate environment, because you want the things you're actually testing and deploying to have a different lifecycle from Concourse itself. So we've taken any private credentials, SSH keys, and configuration parameters that are specific to an environment and extracted them into their own resource. If a credential gets leaked, or if we ever need to rotate, repave, and repair an environment, it's totally encapsulated in one thing, and it's not going to leak into anything else. We, as a team, can comfortably blow it away without affecting any other teams.

This also came out of the history we had on Cloud Foundry of keeping all our credentials and all our environments in one big repo. A lot of these problems come from the fact that Cloud Foundry has grown so much over the last three or four years. Complexity is a human problem, and a lot of what we've had to solve is how to scale this thing up to a foundation with so many teams and so many contributing organizations and members who want to iterate independently. It used to be easy to just have lots of stuff in one repo, but that doesn't scale anymore. Separating these things out is the pattern we've come upon here.

A couple of the other recommended practices. Test your task scripts. By the way, it's probably a bit late to say this, but I'm going to assume everybody knows what a BOSH job is, what a Concourse task is, et cetera. If not, sorry. In your Concourse pipelines, a pipeline has many jobs, and your jobs have many puts and gets and also some tasks. For each task you have a task YAML, which tells Concourse the basic setup of the task: what Docker image it should run on top of, what parameters it needs as input, and what resources, or rather what things on the file system, it needs as input. And then you have the actual task script itself. If it's simple, it can be bash, but if it's complicated, don't write it in bash. Whatever you do, though, if it's at all complex, test it. People usually don't think to test their task scripts, but you should. If a task script is sufficiently complex, think of it as a tool you're using, and if you're building a tool, you should test your tool. I'll show you the pipeline in a second. So: test your task scripts, and build your own task images.
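For reference, here is a minimal sketch of the shape of a Concourse task config just described: the image it runs in, the inputs it mounts, the params it expects, and the script it delegates to. The repository, input, and script names are hypothetical.

```yaml
# run-unit-tests.yml: a hypothetical task config
platform: linux

image_resource:             # the Docker image the task container runs in
  type: docker-image
  source: {repository: example/deployment-image}

inputs:                     # directories Concourse mounts into the container
- name: release-repo

params:                     # environment variables the task script expects
  BOSH_TARGET:

run:
  path: release-repo/ci/tasks/run-unit-tests.sh   # the script doing the real work; test it
```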
Another shared thing we had for all the teams, which broke down once we started to scale the teams up quite a bit, was one massive, bloated Docker image called runtime-ci. It had multiple versions of Ruby in it, it had a bashrc in it, it had God knows what else in it. You don't need two versions of Ruby to git clone something and do some jq, right? And when you have no idea what your tasks actually depend on, that can put you in a scary place. So build your own task images. Don't rely on external dependencies for them. It's not that hard to build a task image for your bash tasks. Keep them clean and simple, and build them in Concourse: build them in CI.

Let me show you this one. This is the pipeline for our CI itself, for the tasks and the Docker images that CI itself needs. This pipeline builds those images and also tests its own tasks. We have this mega-ci unit tests job, and this tests, I guess, different sorts of manifest-generation things; I haven't looked at this in a while. But any code in our Concourse tasks that's sufficiently complex, we've written Ginkgo tests for. You can look at the repo for more details.

And then we have our Docker images. We wrestled for a while with the right level of granularity for our Docker images. Should you have a different Docker image for every single job? Should you have one monolithic Docker image? Or something in between? We landed on something in between, and I think it worked out really nicely. We have a really lightweight, minimal Docker image, which just has the bare bones, like curl and wget and git and stuff like that. Then we have a Golang Docker image that builds on top of that, because we do a lot of Go, so we need to be able to build and test Go things. Then we have a deployment Docker image, and it's only here that we actually have some Ruby: Ruby, the BOSH CLI, bosh-init, those sorts of things. And then, because we test some stuff in BOSH Lite, we have a Vagrant Docker image. Otherwise, there's no reason to have Vagrant in all your Docker images.

Okay, moving on: manage BOSH jobs with real programs. What I mean by that is... questions?

[Audience] Yeah, so the VM itself...?

The VM itself? Are you talking about when building the Docker image? No, no, we just use Vagrant with the AWS provider to bring up a BOSH Lite elsewhere. Actually, I was going to mention something here. I've never used it myself, so I didn't want to list it as a recommended practice, but I know Dmitriy's been working on essentially a BOSH Lite Docker image. Rather than having an image with Vagrant in it, which you use to spin up an AWS VM somewhere else so that you get a BOSH Lite you can deploy to, you'd get a Docker image that is itself running BOSH Lite and has the BOSH CLI in it. We might actually start to switch to that: to replicate our "can we deploy this thing to BOSH Lite" builds, we could just do it inside the Docker image. That would be pretty sweet. He can't hear us, but I can see him, so we can ask him at some point whether it's legit. If you've got a question for him, you can ask him.

[Audience] BOSH Lite as a BOSH deployment? Instead of using Vagrant, you'd have a BOSH Lite BOSH release?

That's cool. Many options in the BOSH ecosystem.
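Before the next practice, here is a minimal sketch of what building those layered task images in CI can look like with Concourse's docker-image resource. Again, this is an illustration rather than our actual pipeline: the repo and image names are hypothetical, and registry credentials are omitted.

```yaml
resources:
- name: ci-repo
  type: git
  source: {uri: https://github.com/example/mega-ci.git}
- name: minimal-image
  type: docker-image
  source: {repository: example/minimal}
- name: golang-image
  type: docker-image
  source: {repository: example/golang}

jobs:
- name: build-minimal-image
  plan:
  - get: ci-repo
    trigger: true
  - put: minimal-image
    params: {build: ci-repo/docker/minimal}  # directory containing the Dockerfile
- name: build-golang-image
  plan:
  - get: ci-repo
    trigger: true
    passed: [build-minimal-image]            # the Go image layers on top of minimal
  - put: golang-image
    params: {build: ci-repo/docker/golang}
```

The passed constraint is what encodes the layering: the Go image is only built from a version of the repo that has already produced a fresh minimal image.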
So, manage BOSH jobs with real programs. This is the story of confab and consul-release, and we learned it the hard way. (I just realized I wasn't anywhere near the mic this whole time. Is that better? All right.)

Most people's BOSH jobs look like this: you have your spec, which defines all the properties you're going to use, and then you tend to have two types of things in your templates. You have executables, your ctl scripts, ctl.sh.erb; if it ends with .sh.erb, it's not a real program. And then you probably have some YAML ERB templates. If what you're building is sufficiently simple, you might get away with using that ERB to do a little bit of logic and dump the result into your YAML or into your bash, and maybe your bash is simple enough that it's not likely to cause you problems.

But if you're trying to manage a stateful service like Postgres, where you're worried about migrating data, or something like Consul, where you can't just bring up a bunch of Consul nodes and expect them to sync up and work correctly together, they have to be orchestrated quite gingerly, right? Trying to get all the logic right so that you can scale up, scale down, rotate credentials, shoot a node in the head and bring it back up... (I was waiting for the announcement to finish. All right, thank you.) Trying to get that right in bash ERB is a nightmare. So do yourself a favor and don't do that. There's no reason to.

You can write programs in real languages like Go, which you can test and compile, and use that program to orchestrate whatever logic you need around actually starting your underlying business-logic process. Everybody has business logic: you have Consul as a binary, or if you're writing Cloud Controller, you have the Cloud Controller app itself. And then you have this mysterious contract with BOSH, where there's a monit file, you get some stuff thrown into your ERB context, and you have some helpers. Rather than putting logic in there, where you're never going to test it because you can't test it, just dump the data BOSH gives you into a raw file and delegate to a program you can unit test: a program that takes all that stuff, unmarshals the data, and does all the ifs and elses you need, in a tested way.

So manage your jobs with real programs, and, like I just said, unit test and system test that logic. When you build that program (the one we have for Consul is called confab), unit test it and then system test it. If you claim you can scale up and scale down and continue to provide a service, you should have system tests for that.

I'm running short on time, yeah, all right. Keep releases small enough to be used with hand-editable example manifests, and validate those manifests in CI. If your release is so big that it's hard to do that, think of that release as a deployment, and think about delivering that deployment separately from delivering the smaller releases that should compose it.
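To show what that delegation looks like at the job level, here is a hedged sketch of a BOSH job spec in this style. It mirrors the confab pattern from consul-release, but the names and properties are illustrative, not the actual spec.

```yaml
# jobs/consul_agent/spec: a sketch of the delegation pattern
name: consul_agent

templates:
  # thin ERB wrapper whose only job is to exec the compiled program
  agent_ctl.sh.erb: bin/agent_ctl
  # dumps the raw BOSH property data to a file the program unmarshals
  confab.json.erb: config/confab.json

packages:
- confab   # the unit-tested Go binary that orchestrates the consul process
- consul   # the business-logic binary itself

properties:
  consul.agent.mode:
    description: "Mode to run the agent in (server or client)"
    default: client
```

The payoff is that everything interesting, like orchestrating cluster joins and writing the PID file, lives in the compiled program, where it can be covered by unit and system tests, while the untestable ERB surface shrinks to a couple of trivial templates.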
In the interest of time: there's a pipeline for validating those manifests, and you can click on it in the slides and ask me for more details later. I'm going to move on to some of the other practices.

Cut your final releases in CI. If you want to do acceptance on a release before you cut it, or you need input from several members of the foundation before you cut it, you probably want to do that process manually. But if it's something small that can move fast, like consul-release or etcd-release, just cut the final release at the end of your pipeline.

No snowflake buckets. Creating a final release means uploading final release assets to a bucket, and that bucket shouldn't be a snowflake either. Have CI idempotently create that bucket, with the right IAM users.

Separate pipeline configs from params. This is a really nice pattern we've come upon: all our pipeline YAML lives in public repositories, using the {{mustache}} templating that fly supports, and all the private credentials live in a separate location. You could put them in Vault, or LastPass, or just in a private repo. This makes how your pipelines work totally discoverable, even to the public, which is really nice. (There's a small sketch of this pattern at the very end.) Strive to make all your jobs public. This is something we really strive for on the cf-release pipelines, and you can see we actually test that all our jobs are publicly viewable.

Separate the process of creating the release from deploying it and testing it. If you're going to deploy your releases to multiple environments to test that they work on different IaaSes or in different configurations, build the release early so you're not wasting time building it over and over again. And separate your deploy from your test, so that if the test fails you can just rerun the test rather than the whole thing. Finally, leverage the BOSH-specific Concourse resources available to you: bosh.io stemcells, bosh.io releases, and BOSH deployments.

Here are some resources: our release repos, and our pipelines and pipeline configs so you can see how we've laid them out. If you want to get in touch with me, email is probably the best way. There's a Twitter handle. I'm pretty active on the mailing list, so you can ask questions there if you think they're appropriate for a general audience. I have a website and a couple of GitHub handles; that one's my work GitHub. There's Slack, too. Awesome, thank you very much.

[Host] I think we have time for one question while I work out where the next speaker is. Oh, he's dashing to the front. Any one single question, anyone?

[Audience question about where confab lives]

It's in consul-release. It's a separate package, and the source is there.

[Audience] Is it useful to anyone else in the Consul universe, or is it specific?

It's there specifically to glue together the contract between BOSH and running the actual Consul process, so it's not useful to anyone who's not in the BOSH-and-Consul universe. It renders all the config templates Consul needs, it determines when to actually start Consul and with what parameters, and then it exits and leaves the Consul process running. It writes the PID file when Consul is ready, then itself exits, and you get the PID file of the Consul process.

[Host] Great, thank you very much.

Thank you.
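Here is the sketch of the configs-versus-params separation referenced above. The public pipeline YAML contains only {{placeholder}} variables, with hypothetical names; the real values live in a private params file that gets interpolated at pipeline-set time, for example with `fly set-pipeline -p some-release -c pipeline.yml -l private-params.yml`.

```yaml
# pipeline.yml: public, checked into the release repo; no secrets in here
resources:
- name: environment-config
  type: git
  source:
    uri: {{environment-config-uri}}                  # filled in from the private params file
    private_key: {{environment-config-private-key}}
- name: release-repo
  type: git
  source: {uri: https://github.com/example/some-release.git}
```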