My name is Zach Dunn, that's who I am. I'm the senior director of platform operations at a company called Optoro. That's a very long way of saying I care about blinky lights, and I care about them staying blinky, or solid, or maybe not going amber. There's a lot of different things I care about, but mostly what I care about are blinky lights. So what I'm gonna talk to you about today is our transition from two separate code repos and three different ways of doing CI to a consolidated platform on GitLab, using one consistent build pattern for our runners. That being said, oops, I'm gonna close your little thing so I can see my timer, which isn't started, so I get unlimited time. Ha ha ha ha!

So in 2012, what we did was go ahead and start putting our code in this place called GitHub. It was a thing, people did it for a while. This was good though, right? People started checking in code, we started getting along. Someone then had the genius idea that maybe we should actually test our code, and I guarantee most people can relate to what came next. The first way this was done was on someone's laptop in a screen session. Their goddamn laptop, okay? People couldn't test their code unless someone pulled it down, ran it, and then pushed back up the results. It was amazing. So they said, well, that's bullshit, we shouldn't do that. We'll stand up an artisanal AWS server, some random EC2 instance, I think it was an Ubuntu 12.04 image, and run CI in a screen session there. At least then we got a webhook that would get you that pretty little checkbox. This configuration was awful. It basically made it so that we were always broken. Ha ha ha ha! Right, and no one trusted CI because it was slow and it didn't work that well. Eventually we had to reprovision this, and we made it a little better by building it with Chef, but it still didn't work all that well.
Mostly because at any given point, someone could go into the UI and change everything about it. It was great. So then I show up, around 2015 in this story. What happens is I say, hey man, infrastructure is code, we should do the same thing here: we should test our code, let's write some tests. So we started writing tests, and again, it's a Chef pipeline, so it was a little better, things were controlled. We still managed to break it all the time, because again, it's XML and a GUI, and people can go and press buttons, and if you give them a button, they're going to press it. And we actually used this for CI and CD. For CD, Jenkins was the only authorized user against our Chef server, so if you uploaded a cookbook, it had to go through a process, and we pushed that all the way through. Like, we were doing things, right? We were even doing them right. It was pretty good. So this is kind of the state of the world. We ran this way until about, I don't know, 2017 or '18, I wanna say. I made up that number, I wasn't sure what it was, mostly because we got rid of it.

So we were super happy, though, with how we could do CD in this environment. We started using Terraform a lot, and when we were using Terraform, it was basically like, plan looks good, go ahead and deploy it, Bob. Literally, one of my engineers had bash aliases for Terraform. Has everyone used Terraform? Terraform, infrastructure as code, all that. The bash alias for terraform plan was "fuck me," and the one for terraform apply was "fuck you." Because that's basically how we broke our infrastructure several times. It would just be like, I'm gonna deploy some new instances, cool, yeah, plan looks good, let me go ahead and hit apply. What do you mean you did something to the state while I was doing something? No, no, no, no, no. That's a fun post-mortem. So we're like, hey, these pipeline things, they sound cool, let's do that.
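That state race is exactly what Terraform's remote state locking exists to prevent. A minimal sketch, assuming an S3 backend with a DynamoDB lock table (both names here are illustrative, not ours): a second apply waits on the lock instead of clobbering the first one.

```hcl
# Remote backend with state locking: concurrent applies block on the
# lock rather than racing each other. Bucket and table names are
# made up for illustration.
terraform {
  backend "s3" {
    bucket         = "example-terraform-state"
    key            = "infra/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "example-terraform-locks" # holds the state lock
  }
}
```

With locking in place, the second engineer gets an "Error acquiring the state lock" instead of a post-mortem, which is the same guarantee a pipeline gives you by serializing applies.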
We also had some interesting problems around GitHub's pricing models, so we had ended up with multiple organizations to get different pricing tiers based on users, with either unlimited repos or unlimited users. We were trying to be cheap here. So we ended up running GitLab on-prem, started using the GitLab pipelines, and it actually worked pretty damn well. We started adding infrastructure there; our CI/CD and all the infrastructure code started going there. This stuff worked well enough. We actually spent the time and wrote a piece of software called Git Glue, which is great. It was a little app, it sat there, it took a webhook from GitHub, and it would run CI in GitLab CI. So now we've got three different places your code could be tested, never any confusion on that. Some of them had both Jenkins and GitLab CI, we had two different places where you could keep your code as well, and then we had this glue bit in the middle, which was fun because it was just kind of like magic that no one actually quite understood. It was hilarious when GitLab released a product that did what this did, because we were getting ready to release this. We thought we were fucking clever, and then, yeah.

What ended up happening, though, is now we could get some developers using this for CI, because they were like, yeah, Jenkins sucks, I hate this. We were like, hey, we're gonna write a bunch of microservices, and we can have a whole long conversation about what the hell a microservice is, because I guarantee you, once you put a database into it, screw you. So now we have a bunch of services, I wanna say about 50 or so, sort of spread out everywhere. We even had a couple of, I call them feature devs, putting code into GitLab CI. So things worked, right? You could check your code in, it would run tests, and most of the time it would tell you whether or not it worked.
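The Git Glue idea is small enough to sketch. This is not our actual code, just a guess at the shape: take GitHub's push webhook, pull the branch out of the ref, and hit GitLab's pipeline trigger endpoint (`POST /api/v4/projects/:id/trigger/pipeline`). The host name and tokens here are hypothetical.

```python
import json
from urllib import parse, request

GITLAB = "https://gitlab.example.com"  # hypothetical on-prem instance


def build_trigger(project_id, ref, trigger_token):
    """Build the POST for GitLab's pipeline trigger API:
    POST /api/v4/projects/:id/trigger/pipeline
    """
    url = f"{GITLAB}/api/v4/projects/{project_id}/trigger/pipeline"
    body = parse.urlencode({"token": trigger_token, "ref": ref}).encode()
    return url, body


def handle_github_push(event_body, project_id, trigger_token):
    """Translate a GitHub push webhook into a GitLab pipeline trigger.

    GitHub push payloads carry the ref as e.g. "refs/heads/main";
    GitLab's trigger API wants just the branch name.
    """
    ref = json.loads(event_body)["ref"].rsplit("/", 1)[-1]
    url, body = build_trigger(project_id, ref, trigger_token)
    return request.Request(url, data=body, method="POST")
```

A tiny webhook receiver wrapping `handle_github_push` and a per-repo mapping of GitHub names to GitLab project IDs is basically the whole "magic" — which is also why GitLab's own CI/CD-for-external-repos feature made it redundant.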
I say most of the time because, again, at this point we probably had about 18 or 24 Jenkins executors, which were like eight-core machines with 32 gigs of RAM that were occasionally hand-crafted as well. So missing from that previous slide is all of the crap associated with the executors and runners. You've gotta imagine a big smattering of other crap kind of around the edges. Turns out it's hard to find an emoji that would let me spread it around the sides. So things did work, right? And that's good, right?

So we decided to grow up. And I put an asterisk here because when I say we, I mean a customer told us we had to get a SOC 2. I'm not sure if everyone knows what a SOC 2 is. It's a process and security audit: you hire an auditor, they come and audit all your processes, you have to show them a bunch of details, and it sucks. And you have to pay them for the privilege. So going back again: all of those systems are logical access groups, and they all have to be audited. And it turns out if you want SSO on GitHub, they want you to pay for the Enterprise edition, which basically just gets you SSO. So things worked, but we decided to grow up because the customer told us we had to, which is a good way to do it.

So how did we do that? Obviously, long story short, we did it; just gonna take the suspense out of the room right now. First, we started leveraging the GitLab mirror. We went in, chucked a token in there, and synced all those organizations into GitLab. So now we have all of our repos mirrored over there, and any changes made on GitHub, we're getting over in GitLab as well. That also let us start to work out the permissions models we wanted to use. We could start sending people there, and we could start using CI off of GitLab. Hosted GitLab? Cloud GitLab? What are we supposed to call it? Do we know? Because "hosted" sounds like I'm hosting it. GitLab.com.
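Setting mirrors up is mostly clicking in the UI, but with ~1,000 repos it can also be scripted. On GitLab's paid tiers, the project edit API accepts `mirror` and `import_url`, with the source credentials embedded in the URL — a sketch of building that request, with made-up project IDs, org name, and tokens:

```python
from urllib import parse

GITLAB_API = "https://gitlab.com/api/v4"  # GitLab.com, per the talk


def build_mirror_update(project_id, github_clone_url, token_user, token):
    """Build the PUT that flips a GitLab project into a pull mirror of
    its GitHub counterpart (Projects API: `mirror` and `import_url`
    on project edit; pull mirroring is a paid-tier feature).

    The token rides inside the import URL, which is how GitLab
    authenticates against the source repository.
    """
    authed_url = github_clone_url.replace(
        "https://", f"https://{token_user}:{token}@", 1)
    url = f"{GITLAB_API}/projects/{project_id}"
    body = parse.urlencode({
        "mirror": "true",
        "import_url": authed_url,
    }).encode()
    return url, body
```

Loop that over every repo from the GitHub org listing and you've "chucked a token in there" a thousand times without a thousand clicks.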
Okay, so we started leveraging GitLab mirrors. That worked. Second, we started consolidating all the CI. Like I said, we already had a lot of GitLab CI files in our GitHub repos, which confused the hell out of everyone. It was great. So we started running CI almost immediately in GitLab CI using our Kubernetes stacks; we also migrated to the Kubernetes executors. And then the last step was, honestly, to shoot the damn repos in the head. We went team by team: this is your list of repos, they're all mirrored, do you have your permissions set? Here's the checklist, go through it. You done? Great, all of those are getting archived on GitHub.

A thing I don't have on the slides here that's really fun: it turns out we broke a lot of shit doing that. There are so many places in your code base that you don't even realize are referencing things by Git repo that are gonna absolutely screw you. It's amazing. And the best part is everyone would say the same thing. Someone breaks CI and you'd be like, what the hell, how did we break, no one's touched CI today. And then all of a sudden you'd be like, oh yeah, Brad closed down that repo today. And every time we had the same reaction: screw it, just get it over with, change the goddamn thing.

The other thing is we did this in about three months. We set ourselves a date of June 1st and started around March, and we basically just went through this process. It was about 1,000 repos. In those three months we've run about 4,500 pipelines, because I've got the little charts now. Each one of our pipelines does something awful, like 16 different jobs, which are pods, and then the pods inside of them have multiple Docker containers. Again, I have a whole other talk about testing Ruby code and how we need to grow the fuck up. So where did we end up?
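Those "referencing things by Git repo" breakages are worth sweeping for before you archive anything. Something along these lines (the org name and file contents are illustrative) catches Gemfiles, submodules, Dockerfiles, and CI configs still pointed at the old host:

```python
import pathlib
import re

# Hard-coded references to the old GitHub org, in either HTTPS or SSH
# form (github.com/org/... or github.com:org/...). Org name is made up.
GITHUB_REF = re.compile(r"github\.com[:/]example-org/", re.IGNORECASE)


def find_github_refs(root):
    """Walk a checkout and list files that still point at github.com."""
    hits = []
    for path in pathlib.Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue  # unreadable file; skip it
        if GITHUB_REF.search(text):
            hits.append(str(path))
    return sorted(hits)
```

Run it across every repo on the team's checklist before Brad archives anything, and "no one's touched CI today" stops being a recurring mystery.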
We're using Kubernetes runners. We've got clean builds, which is great, because one of the big, big, big victories here is a template: a set of consistent best practices. Before, in the Jenkins world, we were just like, go and set up your CI job. Oh, you're doing nothing? You're not doing security testing? You're building a Docker image how? Whatever, don't care. Now I can just say: did you include the repo, did you include the template? Good, move on. You should have a nice little mustache; I call them CI mustaches, I think they're cuter that way. We have Kaniko Docker builds. There's only one single source of truth now, and I got the joy of turning off about a terabyte of crap. It was beautiful, and we had a party. Time.
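The "did you include the template?" check looks roughly like this in GitLab CI. The template project path is made up, and the Kaniko job follows the standard daemonless pattern from GitLab's documentation, not necessarily our exact config:

```yaml
# .gitlab-ci.yml in a service repo: nearly everything comes from the
# shared template, so the per-repo file stays tiny.
include:
  - project: platform/ci-templates   # hypothetical template repo
    file: /templates/service.yml

# Inside the template, the image build runs Kaniko, which needs no
# Docker daemon on the Kubernetes runners:
build-image:
  stage: build
  image:
    name: gcr.io/kaniko-project/executor:debug
    entrypoint: [""]
  script:
    - /kaniko/executor
      --context "$CI_PROJECT_DIR"
      --dockerfile "$CI_PROJECT_DIR/Dockerfile"
      --destination "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
```

Because every repo pulls the same template, adding a security scan or changing the build pattern is one merge request to the template project instead of a thousand Jenkins jobs.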