Hey, how's everyone doing? Give me a thumbs up — it's a pretty chill talk, you don't have to yell or anything. I'm here to talk about Concourse. I'm Alex Suraci. I've been at — well, I joined VMware in 2011, and now I'm here at Pivotal, so it's been a while. In the interim we learned how to write a pretty good CI system, I think, just given the crazy demands that something as big as Cloud Foundry has: pretty much every team in the building has very different requirements, so it's a pretty good place to learn how to write a system like this.

So, Concourse is a thing that does things. If you ever have things that need doing, Concourse is for you. It does them continuously. Some people would call this a CI system, but that's sort of a specialization — it really just does whatever.

This is what it looks like. This is Concourse's own CI pipeline. Roughly, the colored columns are the various stages as artifacts progress through the black boxes, which probably aren't very visible, but I'll show them off in the demo later. Those are called resources — one would be, say, a git repo — and whenever one changes, the immediately downstream jobs trigger. So in this first column there are about five jobs that trigger, and when they all go green they feed into this release candidate build. Once we get past there, it's integration, then deploy, and then whenever we're ready we can just come in, press "ship it", and that's when we actually trigger a bunch of little worker bees that publish to GitHub releases and things like that.

So the obvious question is: why would we do this? There are already at least ten systems out there that can pretty much do whatever. Jenkins has been around for a long time. It's a computer.
You can do anything with a computer. Mainly, we just got tired of being pissed off. When you're working on Cloud Foundry there's a lot of CI you're dealing with, and pretty often you'll want to get to a failing build as quickly as possible so you can actually figure out what's going on. A lot of things don't really optimize for that: you have to click through and follow a chain of object models and so on, and then you finally get to your console output, and it looks like that. And you start to learn that these three question marks are actually a passing dot, and this `[32m` is actually supposed to be green. People literally started copying and pasting the entire page into their terminal just so it would render the colors. So that was the first thing we fixed — in Concourse it just looks like that. So that's one check mark for us.

So I'll do a quick little demo to get the full fancy effect here. This is the status of all the teams in Cloud Foundry switching over; we have a few that are already done and a lot that are in flight. Our methodology has been to pair with them when it makes sense and guide them through it, but a lot of people are just sinking their teeth in, in parallel, which is kind of frightening but pretty cool.

So I can quickly go through some of the pipelines here. This is Concourse's own, which I've already shown. This is Diego's — it's kind of interesting, they have unit and Inigo here just chilling, and whenever that goes green a minor release candidate gets bumped through, and then there's this giant thing that does a bunch of stuff I don't understand, but it's pretty cool once it loads up. It's also a demo of the Wi-Fi, I guess. There's a password prompt — I'm gonna back out of that. So that's Diego.
Here's BOSH Lite's. Whenever they want to, they just come in and kick off a new build, and that'll trigger the rest of these jobs: this will build the VirtualBox image, then run BATs against VirtualBox and AWS, deploy CF to it, then publish to Vagrant Cloud and push back to master in the repo. So there's BOSH Lite.

Here's bosh-init's pipeline. They have a fairly simple one: run the unit, integration, and acceptance suites, bump the version number, and then ship it, roughly.

Here's the main BOSH pipeline. They're still working on this, but it's a pretty nice example of basic fan-in and fan-out. You can see they're testing against Ruby 1.9 and 2.1, and then MySQL and Postgres, so there's a lot of stuff going on there. Here's Garden's — not too different from the others, it just looks different. And here's cf-release's, which is this giant hulking behemoth of automated stuff. It's pretty neat.

One thing you'll notice is that every single team's pipeline looks very different, and I think that's actually a really nice piece of information you can collect just from onboarding onto a team: you can see how fundamentally differently everyone's CI works, as opposed to joining and having this box that's either green or red, where you have to dig in and understand scripts. One of the things Concourse tries to optimize for is that you can actually see the propagation of artifacts through the pipeline, and how many stages it takes to get from changing code to publishing.

I can also show a little build here. This is fly, the Concourse CLI, and you can see that as soon as I trigger it, it'll be cloning the Concourse repo. Once that's all finished — which shouldn't take too long, but I'm gonna fill the time with a bunch of filler words.
There we go. Then we fan out to Linux, Darwin, and Windows to run roughly the same script — we switch to a .bat and such for Windows — and once this all goes green, the actual job is complete. There's not much to it other than that. You can also see resources, which are basically the locations of the source repositories: you can have a git resource or an S3 resource, and that basically identifies how to pull something down from some abstract location, or modify it, or push it up. And you can see we just have a bunch of SHAs and metadata here. I think I've shown pretty much all the pages of Concourse now, so let's go back to the slides.

One thing we're also striving for is for this to be conceptually scalable, meaning that as your needs as a team grow, you don't have this constant need for a full understanding of your pipeline as you change and add things to it. You can just piecemeal come in and say "I need a job that does this whenever this upstream logical dependency changes", add that in, and the pipeline forms as a result.

And as you're doing that, there are far fewer things to think about — there are really just three core concepts.

A resource, which I've already touched on a little, is the central focus point of any pipeline. It's what you're publishing to and pulling from, and it's anything that can be versioned — meaning I can identify some blob that lets me fetch it exactly as it was when I first saw it. For git that would be a git SHA; for S3 that might be a file name, assuming you're not clobbering the files in your bucket.

A task is just how to run something in a container. It either succeeds or fails, and it has a set of logical dependencies. I say *logical* dependencies because you should think of it as an abstraction layer: my task can work with, for example, any old BOSH stemcell, but
you might actually use it to run against an AWS stemcell or something like that. So in the task you tend to use very generic names for the things you need, and then you declare your other, concrete dependencies: which Docker image to run in, which script you're actually running, which arguments to pass, and so on.

A job is basically formed when you meld these together into what's called a build plan. A build plan expresses the various actions you can do — get a resource, run a task, then put a resource, for example. The job is what determines the inputs and outputs for each step in the pipeline, and it's what shows up as green or red, based on its status, in the main view.

And from that you get this nice UI. Each of these black boxes is a resource, and these green ones are jobs. You can see from the indicators here that this bosh-init thing starts through here, and it's dark gray here because no new versions will appear there — that's to convey that it's just threading the same version through the pipeline as it passes all these checks and balances.

You can have many pipelines per deployment, which is actually a kind of interesting newer thing. For example, someone could write a pipeline that figures out the nice semantics for doing a basic BOSH release, and then you just take that, parameterize it, and shove it up on your own Concourse deployment.

So yeah, as you're configuring these things, we want you to be able to think about pretty much the next thing above you and nothing more. We found that with other systems you often have to have the full topology of your CI pipeline in your head, because the pieces are all strictly related — for example, job A finishes and then triggers job B. Whereas our intent is that you configure a job and declare the things it actually needs and which checks and
balances that input should have made it through by this point, and then the pipeline forms from that.

Now that you have this all configured, it's nice to not worry too much about your CI just burning down. Stuff happens: a VM goes away, or all of your VMs go away, depending on your infrastructure. It's nice to be able to just say "I have this configuration — make it so, bring it back". That's a matter of making it so you're never configuring your workers: you're never hand-tweaking them, going in there and adding apt dependencies or anything like that. There's also no GUI for configuring pipelines; instead you have this one document that says "here's my pipeline", and you put it up there with your parameters. That's so that when you actually bring it back, you don't have to click through things for two hours just to restore your jobs and hope they're exactly how they were. We at least want to be minimally reproducible via BOSH; we'll add easier ways to do it in the future.
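To make that "one document" idea concrete, here's a rough sketch of what a minimal pipeline config could look like. The repo, job, and script names are all made up, and the exact schema has shifted between Concourse versions, so treat this as illustrative only:

```yaml
# Hypothetical minimal pipeline: one git resource feeding one job.
resources:
- name: my-repo                    # made-up name; any git repo works
  type: git
  source:
    uri: https://github.com/example/my-repo.git
    branch: master

jobs:
- name: unit
  plan:
  - get: my-repo
    trigger: true                  # run whenever a new commit shows up
  - task: run-tests
    config:                        # task config inlined for brevity
      platform: linux
      image: docker:///ubuntu      # older-style image reference
      inputs:
      - name: my-repo
      run:
        path: my-repo/ci/test.sh   # hypothetical script in the repo
```

The whole pipeline goes up in one shot as this document, rather than being clicked together in a GUI — which is exactly what makes "bring it all back" a one-step operation.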
We have Vagrant already, which works for a local thing. There's Vagrant with AWS, but there are no credentials, so I wouldn't recommend that. So yeah, our mode is very much: solve for BOSH, get everything reproducible and fitting our guidelines, and then we'll add UX on top of that to make it a lot nicer to bootstrap.

There's also the fact that because builds just run in a container, they should be much more reproducible than if you had state on some agent — one of twenty agents. You can take the same configuration you have for an individual build and just run it from the command line, which is great when something failed up there: let's reproduce the inputs, or try changing the code locally, and then run the exact same thing the CI system runs — but not as part of the pipeline, because I'm just trying to debug something.

We've also been intentionally hiding away various CI state, like the build number, from tasks. We've seen this pattern of using the build number for versioning, and what can happen is your master rolls over and then your versions start over at zero unless you do something about it — it's really awkward. So we pushed that out into what's called the semver resource, which externalizes all that state and makes it so that if anything really bad happens, you can come back and truly still be on version 2.0 instead of 0.0.1.

And I've touched on resources a little bit.
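For illustration, the semver resource just mentioned, backed by S3, could be declared something like this — the bucket and key names are hypothetical, and the exact source fields have varied between releases:

```yaml
resources:
- name: version
  type: semver
  source:
    bucket: my-release-artifacts        # hypothetical bucket
    key: current-version                # object whose contents are e.g. "2.0.1"
    access_key_id: {{s3-access-key}}    # credentials parameterized out
    secret_access_key: {{s3-secret-key}}

# A job's plan can then bump and persist the version, e.g.:
# - get: version
#   params: {bump: minor}
# - put: version
#   params: {file: version/number}
```

The point is that the version lives in the bucket, not in the CI server's build counter, so wiping and redeploying the CI doesn't reset your versioning.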
I mentioned some resources just now: git, S3. The main goal of these guys is to let you delete all the boring boilerplate crap in your pipeline — branch promotion, promoting an asset into a bucket. Instead there's this notion of a resource, which is truly a very abstract notion of a thing. One of the more interesting properties is that it's a thing you can linearize: with git, that would be the sequence of SHAs, which SHA I'm currently on, and which I can check from; for S3, it would be objects in a bucket with versions in their file names, so we can actually pull down the exact same one again. And a lot of things that are first-class in other systems, like timed triggers or cron things, we actually try to push into resources instead, so we can keep the core small and drive more genericness, I guess, into the resource interface, to sort of prove it out. And resources are really the only way you can extend Concourse, which I actually think is a good thing. I like being able to trust that there's just this one interface I have to deal with, and everything else is pretty much how I need it — I don't have to worry about how two plugins interact with each other or anything like that.

So what is a resource? Today, resources have three actions. There's *get*, which is: I have this version — say a git SHA — and I need it as an input to a build; that basically looks like "git clone the repo and check out that SHA". *Put* is how you create versions: you have some output — this would be arbitrary artifacts — and then you would
This would be like arbitrary artifacts and then you would Modify some external thing like put something in a bucket and then return a version and that version would to get you back What was put up there and there's check which is how we discovered these versions it starts with either nothing If it's a brand new thing and we've never seen before or the current version and that gives us the sequence of versions So yeah, here's all the examples of the get thing It's just clone sort of push and then pull and log so forget it's very straightforward for other things You have to work with version numbers or version numbers and file names in s3 for example But we found this to be very accessible And this was actually like the first mind-blowing thing that we ran it to this is like after a late night chatting with Dmitry on how to pull Get core stuff out of concourse and generalize this and as soon as we realized that it was like this is gonna be pretty freaking cool And as following on that These are all of today's resources. 
There are plans for a lot more coming in the future. There's the git resource, which is the very common one; S3 is actually really common too. You can use the Docker image resource to perform CI for Docker images, so you can pull them whenever they change, or create them whenever their source changes. The BOSH deployment resource is a great way to automate a BOSH deployment. There are a few subtle advantages to it: one, it's code you don't have to write; two, it's just parameters, so there's nothing you have to carry around; and the other nice thing is it guarantees that the inputs — in terms of the releases and the stemcells — are the exact versions that get deployed, so you don't have "latest" sitting around in your manifests not doing what you think it's doing. And there are various other odds and ends.

One of the not-so-nice things about running in containers, sorry, is that they can be pretty far away and hard to debug. I imagine a common workflow today, if you have a build that's hanging or failing in some awkward way, is to SSH into the machine and figure out what's going on there — but at least you can do that; when it's in a container, you have another sort of step in between.

So what we've done is add a way to run builds locally, so you can at least run them with your own inputs — say you're modifying code on disk and want to see if it actually breaks things the way you think it does. This runs with the exact same semantics as the pipeline would, which gives you the guarantee that it's doing what it probably will do when you push it up there.

There's also `fly hijack`, which lets you hop into the container running a build. This is great if you have something that's stuck and you want to actually figure out what's going on: you can just hijack in there, send SIGQUIT to the thing, get a stack dump, or whatever your system does. It's fully interactive; you can run whatever command. Typically
you just run bash and mess around in there.

`fly configure` is how you actually shove a pipeline configuration up there. You just give it the name of the pipeline, the YAML file, and another file you can use to provide secret credentials — that way you can make your pipelines public. And before it actually commits the change, it tells you what's changed so you can confirm it.

So that's pretty much all there is to Concourse itself, I think. The rest is nitty-gritty low-level details.

Every container runs via Garden. The main advantage of that is we can use the same code and work with Windows, Darwin, and Linux, like I mentioned before. The workers are entirely stateless, which means whenever you have to scale up, it's just: change a number in your BOSH deployment manifest and `bosh deploy`. You never have to worry about "do these new workers have the same state as the other ones — is my build going to randomly start failing if it runs on those?"
We randomly run build across all the workers So that's all you should really have to think about One other kind of interesting subtle detail is in a lot of systems to register Sort of esoteric workers like you say you have some Vester cluster in the office or you're running in some private network A lot of the time you have to make a VPN to actually reach those workers Which is a formula for a lot of pain from what I've seen the VPN can just sort of fall over and then your workers are orphaned It also means your CI needs access to your private network, which is a little dangerous So instead we reverse the flow of the tunnel and just have the workers SSH into the master that way The only thing that needs to be able to happen is your workers can reach your public CI server Which is much more likely and then that's just over a secure SSH session Every component except for Postgres if you count that excuse already, yes, I guess is highly available ATC's can be scaled up arbitrarily They do a pretty rough job of scaling the workload out, but it's pretty decent I guess it speeds up the web UI and it's mainly about like Being able to deploy itself without going down, which is an interesting venture Everything is written in go It's probably no big surprise there one of the nice side effects is it's theoretically very low footprint We haven't actually looked but I'm going to brag about it anyway The nice thing about the resources being pulled out is you can actually use whatever language makes sense for them You don't have to like write your plugins and go because the API is in go You don't have to write your thing in Java because Jenkins uses a Java They are just and set of binaries in a container so you can use Ruby or bash or go or whichever language you want to As long as you control your dependencies because they're just Docker images and that's fairly easy to reproduce There's nothing too shocking on the front end We use react.js for rendering the build which actually 
saves a lot of the performance cost and a lot of the complexity. The main UI is D3-ish — really we're just using it to draw; we actually determine everything statically. There are some fun problems, though, if you want to send pull requests: despite looking like this clean thing, the main UI is actually very tricky to lay out so that as few lines as possible cross. There's this one crossing which is killing me, but we never found a good way to fix it. I promise you this is a very hard problem. One of the constraints that makes it hard is that the lines have to come in on the same side that they go out — otherwise it's fairly simple. But given that, even if we were to fix these, it would flip those over and weird stuff would happen over here. It's crazy. Try to avoid that problem.

So — there we go. Everything is open source; it's at github.com/concourse. We have docs up at concourse.ci. They should be up to date. They might not be good, but they're up to date. The docs go over how to do a `vagrant up` and a BOSH deploy; they assume some familiarity with BOSH, but mainly because they're not good enough yet. We're in #concourse on Freenode if you want to come by — someone's usually in there at least 24 hours a day. Not all of us at the same time, but someone's in there.

So, I have to get to the caveats. We're not 1.0 yet. Occasionally we — I won't say break things, because we give you a path forward at least — but read those release notes, because you'll probably have to do something at some point to actually upgrade, sorry. There's no caching, so if you have a bunch of large artifacts, it will clone them every time — but at least it's not polluting or sharing state. We'll be doing caching pretty soon, probably in the coming weeks, and we'll make sure to keep it so no weird pollution is possible. And yeah, I mentioned it already: really read the release notes.
It's worth saying again. We're still thinking through some of the core principles; we're mostly there, but occasionally we have to sort of subtly change how things like build plans work. Everything should at least have a transition path.

So that's pretty much it. Any questions?

[Audience question.]

Sure. If you want a giant example, the Concourse repo has our pipeline config public — obviously with all the credentials stripped out — if I can find it. It should be in there. Oh, thank you. Oh boy, cross your fingers, make the Wi-Fi signal come on. If I open it in five tabs, usually one of them loads. You might have to trust me on this — oh wait, I have the code. Wait for it... there: ci, pipelines, concourse. That should be big enough.

So let's look at fly, which is the one I showed you earlier — the one that triggered all of Linux, Darwin, and Windows. The job config itself first off just says that it's public. This is the thing you do if you actually want your pipeline to be open source. You can also make the whole pipeline publicly viewable, but that just means you can view the pipeline; you can't drill into a job until it says `public: true`. That's just an extra safeguard, because it's very easy to accidentally leak credentials, especially in something like a BOSH deploy.

So every job just has a single plan. This plan says: pull down the concourse resource, which is defined further below with git stuff, and then do this aggregate step of running Linux, Darwin, and Windows. Each of these says "run this task using the config located in concourse/ci" — concourse being the thing we pulled down in the first step. I can bring those up, but I'll quickly jump to the resources as well, so I can show you how the concourse resource is defined. One of these — there it is. This just says: it's a git resource, it lives at this git URI, the branch is develop, and use this private key, which we parameterize in.

So you would define a
resource for really all your logical dependencies, which is why we have one for master and one for develop — but we also have this concourse-develop thing, which is pretty much the same thing. You basically duplicate config when you have a separate *logical* resource — meaning you use it differently — that just happens to live in the same repo. That's one of the first things we should probably update in the docs, because a lot of people trip up on it.

Here's all the configuration for things we put in S3. You configure S3 with the bucket the objects live in and a regex matching the file names; you use a capture group to pull the version number out of the file names in the bucket, so we can order them. And there's a semver resource, which currently lives in an S3 bucket, and that's how we manage our actual semantic version bumping and things like that.

To show the task config, there's fly-linux. This is the configuration for our Linux build, and it says: use this Docker image, concourse/atc-ci; I have one logical dependency, and that's concourse itself; I'm going to run this script; and my platform is Linux, as opposed to Darwin or Windows. Windows is pretty much the same — there's a .bat though, and it says platform: windows.

Concourse's own pipeline is kind of big, so it might be a bit intimidating. I think there are a few more coming out: BOSH is doing a big transition and they're making all their pipelines open source, so that's a great source of information on some of the simpler flows — like "I just have this CLI and need a binary that goes to S3" or something.

Any more questions? Oh, over there, sure.

Everything is a BOSH deployment — sorry, the question was: how do you scale your CI cluster? Given that we lean on BOSH heavily for managing each of our jobs, ideally it's just: change a number in your manifest. Not even ideally — pretty much practically,
that's all we've ever had to do. You can scale up workers pretty easily — we'll just randomly distribute across them. You can scale up ATCs; that's mainly for high availability. It would be pretty easy to deploy just the ATC, which is the web UI: all it needs is Postgres; there's no message bus or anything other than that. The tricky thing is the workers, because a lot of the time you need actual full privileges to run those. I did do one experiment where I deployed the workers to Lattice using privileged containers, had the workers all register with some other thing running in there, and pointed it at RDS. In principle, we could just have a big public stateless worker pool, turn off privileged mode, and then we'd have a sort of multi-tenant worker pool — and you could just `cf push` the web UI, which would at least be more convenient.

Any more? Back there — mm-hmm, sure.

[Question about build dependencies.]

What you would usually do is pre-build a Docker image for your builds and put all your actual static dependencies in there. If you have something that's more dynamic, you would probably pull it down at runtime, as long as it's not too slow. A lot of the time you end up just optimizing and putting it into the image, because once we fetch the image, it's pretty quick to boot up — it's all cached.

Next one? Not seeing any — I guess we're good. Cool, thank you.