Hey, everyone. My name is Jason Imerman. I work for Zipcar. This is Derek Van Ash, who works for HS2 Solutions, and we're going to talk about how Zipcar has recently been using Concourse to manage our increased product and process complexity, or more specifically, how we use Concourse for all of the things, at all the times. So before we get started, a quick fire exit announcement. Cool. So as I said, my name is Jason Imerman. I work for Zipcar, and Zipcar provides on-demand cars: wheels when you want them. And as a result, we do a lot of things with scaling, fleet management, and Internet of Things as we pull data from our cars.

And I'm Derek Van Ash. I work for a company called HS2 Solutions. HS2 is a digital brand experience agency. We offer the full range of services, including development, management consulting, QA, analytics, and pretty much everything in between. HS2 and Zipcar have been partnering on different projects for over seven years, and currently I'm working with Jason and his team to build out their pipelines in Concourse, trying to optimize their development workflow.

Cool. So before we jump into Concourse, I'm going to give a little bit of context on where Zipcar is with our tech stack, and hopefully some of you saw this in the keynote with Andy and Holly. Right now, Zipcar is doing a major replatforming, going from a monolith to a set of microservices. We're using Cloud Foundry tools for our deployment, our orchestration, and a lot of our monitoring: things like BOSH, Diego, and Loggregator. And we built an abstraction layer that we call Savannah that really makes it easier for us to host and do all these things with a set of microservices. It's super cool, but that's not what this talk is about, so I'm not going to talk about it much, other than to mention two pertinent details. One is that throughout this talk I'm going to reference an app, a microservice, or a service. I'm talking about the same thing, but at Zipcar we have a very opinionated idea of what a service or a microservice is, and Savannah can host them all. The other piece is that we're going to reference deployment manifests. A deployment manifest is a file that contains the state of an environment. It's effectively an array of services: their names, their versions, the number of instances, et cetera. That will come up a few times.
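A deployment manifest itself isn't shown in the talk, but a minimal sketch of the structure just described (one entry per service, with a name, a version, and an instance count) might look like the following. The field names and sample services are illustrative assumptions, not Zipcar's actual schema.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ServiceEntry:
    """One service in a deployment manifest: its name, version, and instance count."""
    name: str
    version: str
    instances: int

@dataclass
class DeploymentManifest:
    """The state of one environment, expressed as an array of service entries."""
    environment: str
    services: List[ServiceEntry]

# A toy staging manifest with made-up services and versions.
staging = DeploymentManifest(
    environment="staging",
    services=[
        ServiceEntry(name="reservations", version="1.4.2", instances=3),
        ServiceEntry(name="fleet-telemetry", version="2.0.1", instances=5),
    ],
)
```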
So this replatforming has been super cool, but it's also increased the complexity of our system a lot. A few numbers: we now have over 80 services, and that number is only going to keep growing. We have over 25 production environments, because in addition to the Zipcar solution, we also have a white-labeled solution called Local Motion by Zipcar, and we're using the same technology to work with our parent company, Avis Budget Group, on some connected car stuff. Our teams can choose their own languages, frameworks, and runtimes, and as a result we have a lot of varied host configurations for the Docker images and parent Docker images that they use. And finally, our teams are widely distributed geographically and, as a result, across time zones. So in pictures, we're going from something that looks like this to something that looks more like this. You don't need to read that diagram. It's really just so there's more stuff.

And so the problem statement really becomes: how do we minimize human error while still being able to fully understand our environments and safely delegate ownership to the individual teams? That last part is really important to us. We want the teams to be able to own the code from development to production, to support it, to interact with users, and to get technical feedback. And we want to take away a lot of the stuff they would otherwise need to deal with in order to do those things.

Cool. So in looking for a solution, we came up with a few main tenets. One is communication. We want communication to be transparent, centralized, and consistent. As an individual engineer, I want to know what's going on with my builds, what's going on with my apps, and where they are in any particular environment. As a stakeholder, I want to know the same things, but from a slightly different perspective. Change management needs to be lightweight and auditable. On the lightweight piece specifically, we want to make sure that people can apply these solutions to a general set of technology, and that a lot of our artifacts, our publications, things like that, are built in a fairly consistent way. Cascading changes: if I make a change over here to a package, that package gets to where it needs to be over there, and ultimately the thing using that package is deployed to all the right environments, and then I can actually check that that happened. And finally, managed configuration. We don't want snowflakes; we want to be able to redeploy and rebuild and have it be the same thing. And isolation: if I build a bad version of an app, I want to be able to rebuild that same artifact without worrying that the bad version has somehow impacted the good version.

So we chose to Concourse all the things. In the next slide, I'm going to show a general solution, and then I'm going to pass it off to Derek to break down how we're going to go through this talk. The general solution I'm going to show does not contain all the ways in which we use Concourse, because we use it across our entire system, our infrastructure, everything. It's going to show a high-level idea, just to give a flavor. And then what we're going to do is walk through some scenarios of how we actually use Concourse, to show how you might be able to use it in your organization.

That solution looks a little bit like this. We have pipelines for things like building our Docker images and parent Docker images, for building all of these individual microservices, for deployments to all of our staging and production environments, and then a bunch of other things on the right side. But the most important piece of this slide is the center. In the center, all of our artifacts are built by Concourse. All of them are built in a consistent way. They're rebuildable, and no one is manually going in and building and pushing to our artifact stores. With Slack, all of the communication from Concourse about our pipelines and everything else goes to the same set of channels, and those channels follow patterns, so that as a stakeholder or an individual engineer, I know where to look, and I know that I can look there and feel comfortable that I'm getting the right information all the time. Cool, so now we'll hand it off to Derek.
All right, so to get started, one thing I want to reiterate that Jason mentioned earlier: Zipcar uses Concourse in many different places and in a lot of different ways. So many different ways, in fact, that we don't have time to dive into all of them in any meaningful detail in the time we're allowed. So what we decided to do was identify a few scenarios that might be relatable to a lot of different types of organizations, and break down the workflows involved and how Concourse enters the picture. As you can see here, these are the scenarios. And without further ado, let's go into the first one.

So let's say a developer wants to make a change to an existing application. What I'm showing here is a basic outline of a process that a lot of organizations might follow, maybe with slight variations, especially if they use Git as source control. First off, a developer would create a branch, make some code changes on it, and iterate on those changes as they test. Once they're ready, they would create a pull request and maybe iterate some more on their code changes. Once everyone is happy with the result, they would merge it to the master branch and deploy to staging, where some manual testing and some automated testing would occur. And once all stakeholders are happy with the result, we would deploy to production. What we're going to focus on today are the items in the dotted rectangle here. Before I go further: everything in the dotted rectangle has a Concourse pipeline associated with it. The items in green are automatically triggered, and anything in yellow, which in this case is the deployment to production, is manually triggered.

So first off, when it's time to create a pull request, a pipeline that looks like this gets executed. For those who are not familiar, this is just an example of a pipeline screen within Concourse. And for those not familiar with Concourse itself: within Concourse you have pipelines, and within pipelines you have resources and jobs. A resource could be any object that you might want to trigger action off of, or any object you might want to update. Some examples might include a Docker repository, a Git repository, a database, things of that nature. And jobs are the actions you actually take.

To start out, we have a dependencies test. It's a check that essentially looks at any internal service dependencies a particular application has, to make sure the version exists, that there are no circular dependencies, things of that nature. Moving on, we also run some integration tests: automated tests to ensure that the code will work in upper environments. We also run an internationalization job, where we essentially take a look at what languages are configured for a particular application and generate the proper translations. And then finally, we create a Docker image and publish it to all the Docker repositories we need to. One thing to note about the Docker versions we tag here: we can overwrite them, so we're not limited to just one. It's essentially a snapshot version of that particular application.
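To make the resources-and-jobs model concrete, here is a minimal, hypothetical sketch of a pull-request pipeline along the lines just described, built as a Python dict and dumped to the YAML that Concourse consumes (assumes PyYAML is installed). The resource names, repository URIs, and task files are made up for illustration; this is not Zipcar's actual pipeline.

```python
import yaml  # PyYAML

pipeline = {
    "resources": [
        {"name": "pr-branch",                 # trigger off commits to the PR branch
         "type": "git",
         "source": {"uri": "git@git.example.com:team/my-service.git",
                    "branch": "feature/my-change"}},
        {"name": "service-image",             # Docker repository we publish to
         "type": "docker-image",
         "source": {"repository": "registry.example.com/my-service"}},
    ],
    "jobs": [
        {"name": "integration-tests",
         "plan": [
             {"get": "pr-branch", "trigger": True},
             {"task": "run-tests", "file": "pr-branch/ci/integration-tests.yml"},
         ]},
        {"name": "build-and-publish",
         "plan": [
             # only runs against commits that already passed the tests
             {"get": "pr-branch", "trigger": True, "passed": ["integration-tests"]},
             # builds the image and pushes a re-taggable snapshot version
             {"put": "service-image", "params": {"build": "pr-branch"}},
         ]},
    ],
}

print(yaml.safe_dump(pipeline, sort_keys=False))
```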
So then it's time to merge the code to master, and we have a very similar looking pipeline, with a couple of differences. One is a master dependency analysis job, which conducts a security vulnerability check against the code and also makes sure that all API dependencies, the external dependencies, are up to date, and that there are no newer versions of those out there. Moving along, we have a Docker build step that again creates a Docker image and publishes it to the Docker repositories, but this time when we tag it, it's a lasting version. There can be exactly one image with that version; if you want to change something, you have to actually bump the version again. And finishing off here, we have a deployment manifest update, and this is a very key part of the process that Jason alluded to earlier. Essentially, the manifest is a per-environment configuration, and one of the things in there is the current version of every application in that environment. So at this point we update that version: once the code is merged to master, we have the hot-off-the-presses version in our manifest for the staging environment by default.

And that brings us to deployment time. Once that happens, we kick off a pretty simple pipeline that looks a lot like this. When we update that deployment manifest, by default it will actually kick off this pipeline, which checks to see whether the deployment manifest was updated and then deploys to our staging environment automatically, which is nice. And then once it's time to deploy to production, we have a very similar looking pipeline, but this time it's manually triggered. Essentially, it's the same mechanism: in order to deploy, we update a deployment manifest with the version of that application, as well as any other applications we want to push out to prod.
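As a rough sketch of that manifest-driven deployment step: compare the desired versions in the manifest against what's currently running and redeploy only what changed. The helpers and sample data here are hypothetical stand-ins; the real mechanics live in Zipcar's platform tooling, which the talk doesn't show.

```python
from typing import Dict

def services_to_deploy(desired: Dict[str, str],
                       running: Dict[str, str]) -> Dict[str, str]:
    """Return {service: version} for every service that is missing from the
    environment or running a different version than the manifest requests."""
    return {name: version
            for name, version in desired.items()
            if running.get(name) != version}

# Made-up manifest and environment state.
manifest = {"reservations": "1.4.3", "fleet-telemetry": "2.0.1"}
currently_running = {"reservations": "1.4.2", "fleet-telemetry": "2.0.1"}

for name, version in services_to_deploy(manifest, currently_running).items():
    # A real job would call the platform's deploy API here.
    print(f"deploying {name} {version} to staging")
```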
So with that, we'll move along to another scenario: let's say we want to create a new microservice. Again, the workflow looks very similar to the one for changing an existing microservice, but before we can get to that point, we need a little bit of setup. The developer might create a repository first and then create a Concourse pipeline. For our sake, we're again going to focus on the item in the dotted rectangle: creating the Concourse pipeline. We've actually chosen to automate this process, for a few different reasons. As Jason mentioned earlier, Zipcar has over 80 microservices, and that count grows every day. Not all developers at Zipcar are necessarily well versed in Concourse or know how to create pipelines and things of that nature. But they don't really have to be. We've created a nice little process that seamlessly integrates with an agile development workflow, so a nice, powerful tool like a Concourse pipeline is just a matter of seconds away. It's definitely not a barrier to have something spun up to help with continuous integration, which is a great thing to have. Another reason we chose to automate it is issue support. Jason's team is charged with supporting these pipelines, and with so many pipelines running, automation makes it very easy to identify issues and, more importantly, to fix them: when you fix one issue for one pipeline, a lot of the time the fix applies to many pipelines, and sometimes all pipelines, which is very powerful and efficient, which we all like. Another benefit is that it moves us toward stateless infrastructure. If we wanted to migrate to a different Concourse host for some reason, we could easily spin up all the pipelines that belong to that Concourse instance and have them back in a matter of minutes. Very nice there too.

So, going into the solution in a little more detail: we've created a command line application. It essentially allows the developer to configure a few different aspects they would like to see in their pipeline. They can customize the types of testing they want on it, which Concourse instance they would like to push their pipeline to, and what tech stack their service belongs to, which in a lot of cases customizes how the pipeline actually runs, as well as whether they want to update other deployment manifests for auto deployment. As I mentioned earlier, by default we deploy to our staging environment, but if you wanted to deploy to other environments automatically as part of this pipeline, you can easily do that here. To draw your attention to the diagram on the right a little bit: again, we have an application that basically takes a small configuration and generates a larger configuration file that Concourse can recognize, and we push it out there. One notable thing: we use Vault for the sensitive data within our pipelines.

Another thing to note is that we also interact with our Git repository in this pipeline process, and this is kind of interesting. By default, the Concourse Git resource that we used will poll our Git server, and we found that it produced a lot of load on that server, especially with all these microservices, since we have a lot of different repositories. So what we chose to do instead is trigger our pipelines based on a commit against a pull request branch, as we showed earlier, or a commit against the master branch. And just by looking at this, you may wonder: how does this magic occur? We have created an application called Concourse API, and it essentially acts as a traffic cop. Anytime a commit is pushed against a pull request branch or a master branch, it flows through this Concourse API application, and in turn the Concourse API application is smart enough to know which pipeline to kick off, and which resource to kick off to get the whole pipeline going. It's a very powerful thing.
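The talk doesn't show the Concourse API application's internals, so this is just one plausible shape for that traffic cop: a tiny webhook receiver that maps an incoming Git event to a pipeline and resource, then asks Concourse to check that resource. The per-resource webhook check endpoint used below does exist in Concourse (it works when the resource declares a webhook_token), but every name, token, and the payload shape here are assumptions.

```python
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

CONCOURSE = "https://concourse.example.com"
TEAM = "main"
WEBHOOK_TOKEN = "s3cret"  # must match the webhook_token on the resource

def trigger_check(pipeline: str, resource: str) -> None:
    """POST to Concourse so it checks the resource for new versions right now."""
    url = (f"{CONCOURSE}/api/v1/teams/{TEAM}/pipelines/{pipeline}"
           f"/resources/{resource}/check/webhook?webhook_token={WEBHOOK_TOKEN}")
    urllib.request.urlopen(urllib.request.Request(url, method="POST"))

class TrafficCop(BaseHTTPRequestHandler):
    def do_POST(self):
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        repo = body["repository"]["name"]   # payload shape varies by Git server
        branch = body["ref"].split("/")[-1]
        # Assume one pipeline per service; master commits check the master
        # resource, anything else checks the pull-request resource.
        resource = "master-branch" if branch == "master" else "pr-branch"
        trigger_check(pipeline=repo, resource=resource)
        self.send_response(202)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8080), TrafficCop).serve_forever()
```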
So, moving on to our next scenario, you might wonder how Zipcar does code vulnerability analysis. Going back to our first scenario, one of the steps in our workflow was merging code to master, and within that workflow step we have a job that actually conducts the vulnerability analysis I alluded to earlier. Again, it checks for outdated API dependencies, but more importantly, any security vulnerabilities that might be within that application. Going into a little more detail on what we actually do there: there's the National Vulnerability Database, which is a US government hosted database of known vulnerabilities in APIs, and we run the code's dependencies through this job against it. If there's any vulnerability or old API dependency, we post a notification to Slack, to the correct Slack channel, so any stakeholder or development team will be notified that, hey, there's a security vulnerability in your application. We also post it to this DevMetrics database; I'll talk about that in more detail in a minute. But before I do, one thing to note is that on a failure here, if there is a code vulnerability, for example, although the job will fail, the pipeline will continue to flow, so we allow Docker images to continue to be published and tagged for that particular version. That lets the different development teams prioritize how critical they consider these vulnerabilities, or how urgently they should get up to the latest version of an API dependency. So again, we put the control right into the developers' and development teams' hands, so they can make an educated decision there, and we don't have to block any kind of agile workflow.

Going into a little more detail on DevMetrics: as you can see, we have wrapped this DevMetrics database in a service, and we have it not only embedded in our pipelines, but we also have a scheduled DevMetrics run. For those cases where an application is not actively developed, let's say it's pretty stable and there are no code changes against it, we'll at least run this DevMetrics run, as we call it, once a week, and it'll essentially do the same thing: look for security vulnerabilities and updated API dependencies, post to the same Slack channels, and let the development teams prioritize accordingly. Another interesting thing to note here is that we have a Hubot application, which takes Slack commands, interprets them, queries our DevMetrics service, and posts the same results. So with that, I'll turn it back over to Jason, who will bring us home with a couple more scenarios.

Cool, thanks, Derek. So a lot of what Derek has shown so far is our pipelines within the context of a single microservice, a single application, but that's not the only way we use Concourse. It's certainly the primary way, the way that most developers or engineers interact with Concourse, but there are a bunch of things we do outside of that. So I'm going to demonstrate a few of those more cross-cutting applications.

The first one: say we need to upgrade a package that's shared across a lot of our services. In this case, let's say we're upgrading Java, more specifically the version of the JDK. What we've done is wrap the remote Oracle JDK repo in a Concourse resource, which means that Concourse can natively interact with it: when new versions are posted, Concourse will know about them.
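The talk doesn't show the resource's code, but Concourse's custom resource protocol is well defined: the check script reads JSON ({"source": ..., "version": ...}) on stdin and prints a JSON array of versions on stdout. A hedged sketch of a check script for a JDK-wrapping resource might look like this; list_remote_versions is a hypothetical stand-in for whatever actually lists the published Oracle builds.

```python
#!/usr/bin/env python3
# Sketch of /opt/resource/check for a custom resource wrapping a JDK download site.
import json
import sys

def list_remote_versions(major: str) -> list:
    """Hypothetical stand-in: return all published releases for a major JDK
    line, oldest first, e.g. ["8u171", "8u172", "8u181"]."""
    return ["8u171", "8u172", "8u181"]

payload = json.load(sys.stdin)
major = payload["source"]["major_version"]           # e.g. "8"
last_seen = (payload.get("version") or {}).get("release")

versions = list_remote_versions(major)
if last_seen is None:
    versions = versions[-1:]                         # first check: latest only
elif last_seen in versions:
    versions = versions[versions.index(last_seen):]  # everything since last seen

# Concourse expects a JSON array of version objects, oldest first.
print(json.dumps([{"release": v} for v in versions]))
```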
So for minor versions, we can automatically trigger the pipeline: if Oracle posts a new minor version of a JDK major version that we're using, we kick off this pipeline and automatically rebuild all the images. That's the base job image, the image that has Java as well as all the magic sauce needed to interact with Savannah, and the same images that are used to test within Concourse. And then, in the flow Derek showed earlier, a developer comes along and makes some code changes; unbeknownst to them, the JDK has been bumped. They push something up to Git, and the pipeline kicks off. Now, the magic here is that the pipeline that runs, both the tests and the artifact that's built and published, is using the new version of the JDK. Of course, the tests can fail, and in the Concourse logs you can easily see what happened and why, and move on from there. But ideally it's published, it works, it automatically goes to staging in the workflow Derek showed, the developer chooses when to push it out to production, and we've upgraded the JDK.

There's a pretty big gap here, though, which is that because we have over 80 microservices, and that number is going to grow, they don't all get developed very often. Some of them are constantly being developed; some won't get a commit for a little while. But we still need to get the new version of, in this case, the JDK up to those apps. So we have this thing that we're developing, currently in progress, called maintenance mode. Effectively, we choose some amount of time, let's say in this case a month, and on a schedule we check which pipelines haven't been run in that amount of time. If they haven't been run, we bump the patch version and then run the pipeline. Ideally the tests pass, we publish a new artifact, we push that to staging, and then we have the new version on staging with the new JDK, even though the developer didn't commit any code changes. But because the developer didn't commit any code changes, they're not expecting this new version, so they won't know to deploy it to production.

So we use Concourse again for a bunch of reports, one of which is this environment diff report. For any of our staging and production environments, we can diff production against staging and see what versions are in both and how old they are. We publish that to a very public channel once a week, and ideally the engineer notices and says, oh, there's a new version of my app, I need to go deal with that: test it, push it out to production. However, they might not notice, and this is where peer pressure comes in. We have this nudging mechanism where everyone's accountable and people are talking to each other; stakeholders may notice, peers may notice, and they inform the engineer and get it out to production. And a quick shout out: we also have that as a Hubot command. We have a lot of these things as Slack integrations.
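A minimal sketch of what that environment diff could boil down to: compare two deployment manifests, here reduced to simple name-to-version maps, and emit one line per service whose versions disagree, ready to post to a Slack channel. The sample data is made up, and the real report also tracks how old each version is.

```python
from typing import Dict, List

def diff_environments(staging: Dict[str, str],
                      production: Dict[str, str]) -> List[str]:
    """One human-readable line per service that differs between environments."""
    lines = []
    for name in sorted(set(staging) | set(production)):
        s = staging.get(name, "absent")
        p = production.get(name, "absent")
        if s != p:
            lines.append(f"{name}: staging={s}, production={p}")
    return lines

# Made-up manifests for the two environments.
staging = {"reservations": "1.4.3", "fleet-telemetry": "2.0.1"}
production = {"reservations": "1.4.2", "fleet-telemetry": "2.0.1"}

for line in diff_environments(staging, production):
    print(line)  # the real report posts these lines to a public Slack channel
```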
There's one more gap here. I said that ideally the tests pass, the thing pushes out to staging, and then the environment diff report picks it up. But what if the tests start failing for whatever reason, whether time has passed or the new JDK version caused them to fail? That's where another report comes in: the persistent pipeline failure report. Effectively, on some interval, we check whether a pipeline has been in a consistent failure state for X days, weeks, months, whatever. If so, we do the same thing: we publish that report to Slack, and through the same nudging mechanism we hope the engineer will get that information, make some code changes, and get the thing out to production.

So one last scenario here is continuous integration testing. Holly referenced this on the keynote stage, but we do this thing called journey testing, or user journey testing. These are the revenue-critical paths that are required for our system to work and for our users, both internal and external, to interact with the system. We have a set of very stable tests that run constantly against all of our staging and production environments to make sure these revenue-critical paths keep passing, because we have all of these versions of all of these apps constantly flowing into these environments, and we need to make sure they don't break things. If something fails, we post to Slack, and it's all hands on deck to get it fixed. One side effect here is that because we run these from Concourse, we're constantly ETLing the data from these tests into New Relic, which is an application performance monitoring tool. It's really cool if you haven't used it. And one really nice side effect is that we get constant uptime metrics on both our staging and production environments. So we can see how quality is affected not only in production but also in staging, because with CI/CD we can't just be constantly bringing down our staging environment; we need it to be usable for people to actually test things. And as a little bit of an aside, we take this data and build a New Relic dashboard, which we can make for any environment, that gives us uptime metrics and things like the historical duration of the tests, so we can see if we have load problems, things like that. So with that, I will hand it back to Derek to wrap up.

All right, thanks, Jason. As we've talked about, Zipcar uses Concourse in a lot of different ways. We covered some really fundamental building blocks, just to get your appetite whetted a little bit. Like we said earlier, it's used in a lot of different ways for a lot of different things: other development workflows, nuances to the workflows we've discussed, and a lot of infrastructure-related workflows as well. It's a really neat, powerful tool. So really, I just wanted to extend an invitation: if you see either Jason or me walking around, feel free to stop us. We'd love to talk more about Concourse and how Zipcar uses it, or if you're a user of Concourse, we'd love to chat as well. There are a lot of different ways to solve different problems, and it's interesting to relate different organizational challenges and try to come up with good common solutions. Before I close, I also wanted to give a big shout out to Stark & Wayne. They got Zipcar started with Concourse and through the initial implementation, so thanks for that. And with that, I'll close it out, and I think we have time for a few questions if anyone has any.

I guess I don't fully understand your question, when you say applications on the core versus the periphery. Sorry about that, everybody. So your question is that pulling apart a tightly integrated system is really hard, and what I think you're saying is that you can't get to the things in the core that are really tightly integrated right away; you have to take a real strategic approach to those. So to start out, you pick things that aren't quite as tightly integrated and start migrating them into the environment. Is that what you're saying? Do you want to ask the rest of your question? So, did we do it that way? What we ended up doing was starting off with completely new functionality, and we are in the process of migrating some of that more deeply integrated stuff. We do have a strategy in place for that, but I'm not sure I can talk about how we're going about it without divulging internal details. Does that help? Okay. Cool, anything else?

Sure. Yeah, so, exactly: instead of Concourse polling Git, basically anytime you commit to Git, and like I said, only to the master branch or a pull request branch, it flows through that Concourse API application and triggers the pipeline, instead of a polling mechanism. Yeah, webhooks. We use webhooks, and there's also a PRNFB servlet that handles the pull request side; it's basically a plug-in that you put right into, in our case, Stash.

I mean, I think it's pretty solid. So far we've found some issues sometimes with having to configure workers and with container counts getting maxed out, but it's pretty tweakable; you can easily scale that in a lot of different ways. I think we're still exploring the different ways you can do that, but so far it's not a huge problem. It's usually pretty attainable to handle all of that, and we've done a few things to mitigate it as well. I don't know if you noticed in an earlier slide, we have multiple instances of Concourse, so if we're having problems with one (and that was more of an issue in a much older version of Concourse), we can dynamically move between them, and because we can automatically generate the pipelines, that was not too painful. We also have a lot of monitoring set up on Concourse itself. There's a lot of documentation about how you can do this: Concourse is always emitting data that you can report on. And along the same lines, we built a lot of our own reporting mechanisms, so if jobs are failing with a stalled worker exception, we get notified and can go take care of it.
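As a flavor of that kind of self-monitoring, here is a hedged sketch that walks Concourse's HTTP API, finds jobs whose most recent build failed or errored (a stalled worker typically surfaces as an errored build), and reports them. The /api/v1 endpoints used below exist in Concourse, but authentication is omitted, and the host, team, and the Slack posting are stand-ins.

```python
import json
import urllib.request

CONCOURSE = "https://concourse.example.com"
TEAM = "main"

def get(path: str):
    """Fetch and decode a JSON payload from the Concourse API (auth omitted)."""
    with urllib.request.urlopen(f"{CONCOURSE}{path}") as resp:
        return json.load(resp)

def failing_jobs(pipeline: str):
    """Yield (job, status) for jobs whose latest finished build didn't succeed."""
    for job in get(f"/api/v1/teams/{TEAM}/pipelines/{pipeline}/jobs"):
        build = job.get("finished_build")
        if build and build["status"] in ("failed", "errored"):
            yield job["name"], build["status"]

for pipeline in get(f"/api/v1/teams/{TEAM}/pipelines"):
    for job, status in failing_jobs(pipeline["name"]):
        # The real mechanism posts to Slack; print stands in here.
        print(f"{pipeline['name']}/{job}: latest build {status}")
```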
Anything else? All right, well, again, approach Jason or me, and thanks for your attention.