All right, welcome everyone. Hopefully you're all here for the right talk. This is going to be Introducing Probo CI, which is an open source project that I've been spending a lot of my time and energy on lately, trying to scratch an itch that we have where I work. And hopefully that's going to be one that you guys are looking to get scratched as well. A little bit about me. I'm Howard Tyson, probably better known as Tizzo on Twitter, Drupal.org, Tumblr, GitHub, anywhere else I can get it. And I am the vice president of engineering at a company called ZivTech. We do open source consulting. We specialize in building Drupal websites, but more and more people have been coming to us for DevOps-y stuff: setting up systems, doing automation, helping them with their build processes, and helping their teams get set up to work really efficiently. We have about 30 people, and we work on a large variety of projects. We do some hosting, and we do a lot of long-term support. One of the things that means for us is that we have a large number of sites that we might jump in and out of, which I think is part of what made some of the issues I'm going to talk about today particularly acute for us, and is part of why I started working on this tool. So the tool, ProboCI, is a continuous integration tool. There are a few other continuous integration tools. There's Travis, there's Circle, there's Drone.io. I was talking to my good friend Mark a little while ago, and I think the quote was, "I think we need another CI system like we need a hole in the head. It's like the one thing the world doesn't need more of." And I said, sir, you're wrong. I see an overcrowded marketplace, and I say, me too. So you've probably got some CI system, if you guys are interested in this DevOps talk. You've probably at least looked at one. It probably connects to your version control repository.
It probably sees when you push new code or when you create new feature branches or create pull requests. And it probably runs tests, and maybe a computer can look at some things and tell you, did this test pass or did this test not pass? If you've gotten that far, you are light years ahead of most Drupal folks. What your system probably doesn't help you do is quality assurance outside of what a computer can look at. And that's a big part of what we see as the missing link: a lot of Drupal is really difficult to test. It's really hard to get perfect testing. In fact, I'd go so far as to say that perfect automated testing is essentially impossible with Drupal. I said it, at a DrupalCon no less: Drupal's not perfect. There's too much code to test for any given site, because we've got this web application for building web applications. And that's really cool, because you can build a ridiculous amount of stuff while writing very little code, just by assembling these pre-built components together and configuring them. The catch is that nothing can do the integration testing for you for how all those pieces are going to fit with your custom code and your particular configuration. The permutation of all the possible configuration options is essentially infinite. And in the Drupal community, the way we approach everything is to say, let's not duplicate effort, let's pool all of our efforts into a small number of projects. The project gets huge. And we say, oh, you want to do something with dates? Well, rather than writing three lines of form code and including a couple of lines of JavaScript to pick a date and save it into a database table, we say, here's Date module. Enjoy 10,000 lines of PHP code that you're now responsible for. You're probably not going to execute all of that code, but you're going to have it.
It's going to be on your site, and it's going to be sitting there waiting for someone to press some button that turns it on and makes it start messing with stuff that you probably haven't tested. The other thing that makes it almost impossible to get perfect testing for Drupal: there's too much in the database. There's a huge amount of configuration in the database, and while it's really instructive and useful to do brand new blue-sky builds, where you say, let me take an install profile and build it from scratch, there's still a huge amount of stuff that can go into the database, that can change on production, that can break stuff. So I think the thing with Drupal is that it's really difficult to build some kind of automated test suite where a computer can really tell you, 100%, the stuff you care about isn't broken. Code review. Every time I give one of these talks, I ask, how many of you do code review as a built-in part of your process, all the time? Okay, this is a DevOps session, so that was a way better turnout than I've ever seen before, but it was still maybe two thirds of the audience. If you're not doing code review, start. This session isn't really about code review, but it's just such an important topic that I'm starting to put these slides into all of my talks. When you review your own code, you have a particular attitude toward it, where you wanna test it, but you kinda love it, you don't wanna push it too hard. So you give it a little poke, and then you just hug it and say, oh, my beautiful creation. However, when you're reviewing someone else's code, especially the code of someone who has given you a critical review, that goes a little differently. You really wanna hit it from all of the angles so it doesn't even know what happened. Push it to failure points it didn't even know it had. So code review is a critical part of any delivery process where you care about the end result.
So I think the traditional best-practice Drupal workflow is: first you create and assign a ticket. You'd better be using some ticket system to keep track of what needs to be done and what success looks like. If your ticket doesn't describe what needs to be different about the world to call this thing finished, fix that. You go off and you do some code. You commit that code, you push it up. If you're using Acquia or Pantheon or you've got your own Jenkins setup, that probably auto-deploys into some dev environment, and you go and see if it's broken. That's not bad; that's a lot better than what some folks are doing, going in cowboy style with an FTP account and just editing code live. Much better: a dev, staging, test environment setup. This is kind of the traditional model. A dev does something local. The dev pushes something to the dev environment, which is usually a shared integration development environment. Part of the reason that there's a separate dev environment from staging, which I'm about to explain, is because in dev, we usually reserve the right to break anything. And so we do our code, we commit it to our master branch, we push it, it deploys to dev. Acquia or Pantheon just do that for us, like magic, it's there, that's great. But if I push something and someone else pushes something and they collide, sometimes we see that dev branch break, or maybe what I pushed wasn't fully baked. At some point, we say we think the snapshot is good, everybody stop pushing crazy stuff into master, we're trying to stabilize, we push those changes to staging, and we tell the client, go into staging and let me know what you think, does that look like what you wanted us to build? And then if that looks good, we deploy it to production. So code moves up, while meanwhile, we're solving that problem I brought up before, about the database being too big and holding too much, by copying database snapshots down from production to those other environments.
So hopefully you do most of your work on local, test it on dev, and the client tests it on staging, because you really need to be able to tell the client, this is where I think it's not broken. If you tell them, look at this part but not that part, they have a tendency to come in and say, oh, I knew everything wasn't done, so I thought that obviously was gonna change. Sometimes it's not obvious to them what's right and wrong. So you want some environment where the client knows, this is where, if there's a problem, I should report it. All right, so that's a pretty good and pretty compelling process, but it's got some catches, one of which is people are pushing all this stuff into the dev branch, and the dev branch can get broken a lot, and you wanna push stuff into staging, and it's not always safe to just say, whatever's in dev, let's move it. So a lot of us have started to take a separate approach following sort of a Git Flow model, where we say: create and assign the ticket, the developer does the code, but the developer commits it to a feature branch, a standalone kind of sandbox branch where just that part's being worked on, and then we review the branch before merging it to dev, like Dries was talking about in the keynote. That allows us to take this approach where we can say, look, master is always shippable. We can always crank out the main trunk, because we review it first on the feature branch, and that means master is always ready to go. If the client looks at the dev environment, that should always be right. The client doesn't ever have to ask, in the dev environment, is this done or not? Because all of the stuff that's not done is off on a separate feature branch, and you can track the status separately. Ideally it's in a pull request, and you can see from the comments kind of where it's at, if this thing needs more work, et cetera.
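The feature-branch flow described above looks something like this at the command line. This is a generic, self-contained sketch (a throwaway local repository stands in for your project and for GitHub); the branch name, ticket number, and file are made up:

```shell
# Stand up a throwaway "origin" so the sketch is self-contained.
cd "$(mktemp -d)"
git init -q --bare origin.git
git clone -q origin.git site
cd site
git checkout -q -b master
git -c user.name=dev -c user.email=dev@example.com \
    commit -q --allow-empty -m "Initial commit"
git push -q origin master                # master is the always-shippable trunk

# The actual workflow: one ticket, one feature branch.
git checkout -q -b background-green      # sandbox branch for just this ticket
echo "body { background: green; }" > style.css
git add style.css                        # on real work, prefer: git add -p
git -c user.name=dev -c user.email=dev@example.com \
    commit -q -m "Issue #1234: turn the background green"
git push -q -u origin background-green   # reviewers and CI pick it up here;
                                         # open a pull request before merging
git branch -r                            # lists origin/background-green, origin/master
```

Nothing lands on master until the branch is reviewed, which is what keeps the trunk shippable.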
Now, even with this model, most people are trying to review the branch before they merge to dev, but they don't have a good way to look at the change before it's in dev, because they don't have their own environment for that. So you look at the code, you click the merge button, that auto-deploys, and now we go see if it's broken or not, and if it is broken, we have to clean it up, we've made a mess. So the best workflow should be something where you can actually review before merging to dev. So that was all kind of abstract. Let me walk you through sort of our old workflow, which is a feature branch workflow, probably not unlike what a lot of you are doing, I would guess. So we've got a lead developer, and she says, I've got a thing. Could you do that for me? And the devs are always happy to help. What can I do for you? So the dev goes off and he really thinks about what this thing should be and how to come up with an elegant and beautiful solution, and he's still kind of working on it in his own little world, but we have some project managers, and the project manager comes over and says, are you doing your work? Are you getting that finished? Where is this at? Is it almost ready? The dev's very confident, and he says, I promise, that is done. It is done as in done. Don't even worry about it. The PM goes, okay; now, project lead, I need you to review this. The lead goes, all right, all right, I'll check out the branch. She starts working on it. She reverts her features. She starts updating her database. It dropped a table. Corrupted things, I can't even fix it. Now, the dev at this point, he works in the same office. He's hearing some grumbling. He's getting a little worried. The grumbling is getting worse. The PM's starting to worry about the deadline. Doesn't really know where this thing's at. The dev is really regretting some major life decisions here. This whole software development thing kind of seems like maybe it was a bad idea.
The PM's hoping that this time we'll have it and we won't miss the deadline. He can't really tell, because it's in a feature branch, and he doesn't know what that giant diff of an exported Views view means. So finally the lead goes. She goes to see if it works. She merges the fix; all right, not bad. But here's the thing. The client never actually looked at the change. So she thought it was what the client wanted, but the client wants some fixes. The client is like, that is not what I asked you to build. Drupal has a lot of wonderful Lego building blocks, but that is not how I wanted you to put them together. So the thing is, by the time the client looked at it, all of that stuff was merged into one dev environment. To get it up there on Acquia where somebody could see it in dev, we merged all these pieces together, and maybe they happened in kind of a different order as different pieces got merged back in. So we've got commits for ticket five and ticket four and ticket one and ticket three. The client says, actually, ticket one, that's perfect, that's ready, ship it. Hey, Jason, can you just deploy those few things? Oh, God, what's he gonna do? Is he gonna cherry-pick, I guess? Make a new branch and then cherry-pick those individual ones in, but now he's got two different versions of the same commits on two different branches, and he's gonna have to merge them together later. Oh, God, what are we even gonna do? So even with this feature branch best-practice workflow, we still end up in these awful conflicts, or awful situations where we have to try to cherry-pick things out into partial releases, while the client says, yeah, okay, maybe that's even what I told you to build, but now that I see it, it was as dumb as you thought it was gonna be. You were right. Now can you please release the ones where I did get what I wanted, and the ones where we were obviously gonna need more help, keep working on those?
So this is what ProboCI is intended to solve. ProboCI is, first, an open source continuous integration tool. That means it does all the things that you'd expect: it can grab a copy of your site, it can run your tests, it can report back to you on whether they succeeded or failed. It's also a software-as-a-service platform for running tests, and right now we are in a rolling limited beta, so I'm gonna have a link up on the screen in a few minutes that you can use to get on our list, to make sure that we can scale everything and have a good experience. We're gonna be adding people slowly, but the software as a service is ready, and if you get on that list we'll get you in line for an invite. More on that in a minute. But first let me explain how this helps us work now. So our new workflow is: the lead assigns a ticket. She's still very hopeful this time. The dev, still happy to help, of course. Still going and thinking about his code, opening a pull request, feeling really cocky about it. The project manager sees it, and the project manager might see that it's wrong, and say, I talked to the client about what they wanted this thing to be. If that's the case, the project manager sends it back to the dev, the dev has to think and code some more, open another pull request, let the project manager check it, and this time the project manager says, hey, that is just what the client wanted, that's perfect. Now you might think here's where it would go to the lead for her to do her final code review, make sure there are no security vulnerabilities. But wait, why don't we send it to the client and ask the client: here's a link, can you just go look? Is this what you wanted the site to be like? Is this what you wanted the new section to look like? Is that how it should work?
Now the client might make that angry face from before, at which point it goes back to Jason to do some more work, or the client says it's good, and now the lead is responsible for merging and deploying that, so that she can deal with any Git conflicts, make sure that all of this stuff is going out in the same way, make sure there are no security vulnerabilities. It didn't have to bounce to her to go through that onerous process of restoring a database, reverting the features, running the database updates, seeing that it was screwed up, going to review the next ticket, importing another database, starting the whole process again. In my experience, with a big active site, that can be a 20-minute process or more. So on some of our projects we have eight or ten developers; you start multiplying that by a couple of pull requests per person per day, times 20 minutes per pull request review, and all of a sudden it starts to look like someone's full-time job just to be testing dev environments. And because you have to do the whole Drupal rigmarole of having a local environment and running database updates and reverting features, that's highly skilled labor. That's something you actually need a relatively senior developer to do, or you need to spend a lot of time training somebody. So with this, because we can go through and test everything as it happens, without a person having to change their own local environment or anything else, there's just this great sense of relief, there's just much rejoicing, and we're living in a better world. So without Probo CI, master always had a mix of stuff that was bad or not vetted by clients yet.
Testing feature branches was always hard; devs always just wanted to work on master, so if they were in a hurry they'd work on master, and you'd have to smack some sense into them and say, always make a feature branch, and they'd say, but I just wanna push it to dev and see if it's broken or not, I don't have this copy of the site locally. And only developers could actually see where a feature branch was in the process. A client couldn't see, a project manager couldn't see; if you have dedicated QA people in-house, or maybe you outsourced that, they couldn't see any of that stuff until it got merged into one of these dev branches. So with Probo CI you can always have (oh, I needed to update that one) you can always have master be ready to deploy, contrary to what that slide says there. And every team member and client can see where every single ticket is in the process. So here is what our new workflow looks like. A developer completes a feature request, it goes to the Probo environment, and anyone on the team can see it, see if they've got their deployment step right. That's a big problem we've seen, not just for us but for our clients and other consultants that we've worked with. Often people mess up the deployment step: they forget to export something in Features, they write a bad database update. They go to see the actual site, you go to test their feature branch, and you have to send it back saying, you forgot this Strongarm export, you forgot this, you forgot that. Here the developer can just push the feature branch, go see whether it looks right or not, and if it doesn't, they test their own work in a new environment and test that deploy step. If we see that it all went well, it can go through; the lead can merge it, just click that merge button, right into the master environment. That can do an auto-deploy to that dev environment, and you can make sure everything's working and roll a deployment to production any time you want.
So right now we've got GitHub and Stash integration. The GitHub integration is open source; the Stash integration is part of our platform as a service, and we're figuring out how to share that better. We're working on Bitbucket integration, and I'm sure you guys have another half dozen services that you'd want it to work with. Send us a message and we'll get on it. So at this point, since this is an open source project and this is the DevOps track, I wanna talk a little bit about how this is put together, so that if you wanna download it and run it yourself, you kind of understand what all the pieces are and how they fit together. So it's written in Node.js. I know, you might have thought Go, because it's a container-y CI-y thing, but my official answer is: Kalabox is written in Node.js, a lot of Drupal developers know how to write JavaScript, and it's gonna be more accessible for the people that might wanna use and extend this. My less official answer is I haven't learned Go yet. So let me talk a little bit more about the architecture. This is gonna be a controversial one. We're running fat containers, which is to say that we are running a Node.js process manager: there's a little Node process that starts up, and that starts up your LAMP stack. Linux, Apache, MySQL, Redis, and Selenium are what we start up by default. And it multiplexes the log output, so that we can get the output from each of the services and stream it and put it somewhere for you to do something interesting with. And that also allows us to easily treat any environment as a single unit. That's really useful for us, because it means we can archive your whole thing with one Docker command.
It means that the cgroups Docker sets up for you will apply to all of those processes, which means that it's easier to do resource isolation without having to layer something on top of Docker's cgroup handling, finding all those process IDs and assigning them to a group every time you do something with a container. Most of the normal arguments against running fat containers are about what happens when you need to scale one out, when you need to move one. And in our case, these are QA environments. They're only intended to be seen by one or two people at a time. We don't ever want to do that. So at the moment we run fat containers; we may change that eventually. It's just more work to do it right in a different way. And we have a microservice architecture. So I'm gonna walk through each of these units individually, but here is sort of the overall layout. You've got web requests coming in. You've got GitHub notifications coming in. Your GitHub notifications get sent from Nginx, which is our endpoint, to the GitHub handler, back to what we call the container manager, which sends messages to Loom and coordinates Docker. That's the high level. Let me get into some of the details. So we use Nginx to terminate SSL efficiently and to route to the appropriate microservice. Nginx starts up, listens on port 443, and you set up your SSL configuration in it. It's gonna be better at that than Node.js, from my limited experience with running Node.js SSL at scale. And it's gonna route to the appropriate microservice on the backend. The GitHub handler receives GitHub webhook calls and is responsible for sending the status and deployment updates back to GitHub. And it retrieves the .probo.yaml configuration. That's a configuration file very much like a .travis.yml, if you've seen that before. It just tells us how to build your site, so that you can build sort of whatever kind of LAMP site you want on our architecture. The container manager is where most of the magic happens.
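For flavor, a .probo.yaml might look something like this. Treat it as a hypothetical sketch rather than the real schema: the key names, the asset handling, and the drush commands here are all assumptions (at this point the steps are essentially bash, much like a .travis.yml), so check the Probo documentation before copying it:

```yaml
# Hypothetical sketch of a .probo.yaml -- key names are assumptions.
assets:
  - production.sql.gz            # a database snapshot previously uploaded
steps:
  - name: Import the database
    command: "zcat production.sql.gz | drush sql-cli"
  - name: Run the deploy steps
    command: "drush updb -y && drush fra -y"   # update hooks, revert features
  - name: Run the tests
    command: "./scripts/run-tests.sh"
```

The point is that the build recipe lives in the repository, so every branch carries its own instructions for standing up an environment.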
It drives Docker, so it has to connect to Docker's Unix socket or TCP socket; you can configure it to use either one. It starts and stops containers and gets information out of them. It runs the build steps for you: whatever you need to do to run your tests or provision your environment, it runs them. It sends messages to Loom, which is our service that I'm gonna talk about next. And it reports on containers running on the system, so you can call the container manager and say, what's there, what's running, stop something, start something, et cetera. Loom weaves multiple streams of output together. It receives any kind of arbitrary stream; we've tested it with streams of binary data, although probably other things are better at that. And it has an API that allows you to tail those streams. And then it stores some metadata about that as well. So what that means is that we can take your standard out and your standard error from whatever commands you're running, and we can pipe them to this thing called Loom. And Loom will take that and store it in a configurable backend. The one that we ship right now is RethinkDB, so it can pipe that output into RethinkDB, and we're working on simpler file store backends and some other things. Then the web proxy. How do you actually go and see this thing? Well, Docker's going to spin up this environment and it's gonna pick a port for you, right? But you probably don't wanna go to some IP address, colon, some port number that's dynamic and changes, right? So we wanna give you a link that you can hang your hat on. What the web proxy does is look at the request that's coming in, figure out what container address and port it's supposed to use, and then proxy the incoming connection back to the appropriate container on the backend.
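The stream-weaving idea is easy to sketch. This is not Loom's actual API, just a toy illustration (every name here is invented) of multiplexing labeled streams into a single ordered log that you can tail per stream:

```python
import itertools

class Loom:
    """Toy multiplexer: interleave labeled streams into one ordered log."""
    def __init__(self):
        self.log = []                 # (seq, stream_name, line) records
        self._seq = itertools.count()

    def write(self, stream_name, line):
        # Each record gets a global sequence number, so the merged log
        # preserves arrival order across all streams.
        self.log.append((next(self._seq), stream_name, line))

    def tail(self, stream_name=None):
        # Tail everything, or just one stream (e.g. only stderr).
        return [(s, l) for _, s, l in self.log
                if stream_name is None or s == stream_name]

loom = Loom()
loom.write("stdout", "Reverting features...")
loom.write("stderr", "WD php: notice")
loom.write("stdout", "Done.")
print(loom.tail("stdout"))   # stdout lines only, still in order
```

The real service does this over the network and persists records to a backend like RethinkDB, but the core trick is the same: tag each line with its stream and a global ordering.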
We actually put containers to sleep, sort of like Pantheon does, doing kind of a socket-activation thing. If you haven't looked at a container for a while (RAM's expensive, and we don't wanna use it for QA environments that are usually only gonna be viewed two or three times over a few-day period), what we do is just stop the container. When a request comes in, it only takes about two seconds, at absolute max, to start up all the services, so we can spin the services back up and proxy the connection to you. The first time you load a page on a site you haven't looked at in a while, it might take about two seconds, and then it'll be lightning fast, because we're using Packet.net hardware. Some of the folks that used to work at Advomatic now have a hosting company where you can get bare-metal servers on an hourly basis. So we're using RAID 0 NVMe drives, which makes this stuff super fast. Tests, reverting features, all of that stuff runs faster on Packet.net than it runs on any of our local machines or the bare-metal server in our office. It's not cheap to get that kind of hardware, but nothing makes Drupal perform better. It supports Domain Access pretty well. You can set it up with a prefix (a suffix, actually; we tack a suffix on) so that you can have Domain Access domains, and you can switch things in the URL just to switch access controls and see maybe different language versions, or a different theme, or a different sub-site with different sections of the content. And it supports multi-site, mostly. You have to get clever and it's not very well documented, but it is possible to set up multi-site environments on Probo. I'd still recommend against using multi-site ever, for any reason, but if you're going to ignore my advice, it is possible to make it work. So then we also have another component.
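The wake-on-request behavior can be sketched generically too. This is a toy model, not Probo's proxy code; the container handling is faked with a simple class, and all the names are invented:

```python
class Container:
    """Stand-in for a Docker container the proxy can stop and start."""
    def __init__(self, name, port):
        self.name, self.port, self.running = name, port, False

    def start(self):
        self.running = True   # real code would call the Docker API here

class SleepyProxy:
    """Route an incoming hostname to its container, waking it on demand."""
    def __init__(self, routes):
        self.routes = routes  # hostname -> Container

    def handle(self, hostname):
        container = self.routes[hostname]
        if not container.running:
            container.start()          # first hit pays the ~2s wake-up cost
        return ("127.0.0.1", container.port)  # backend to proxy to

proxy = SleepyProxy({"pr-42.probo.example": Container("pr-42", 49153)})
print(proxy.handle("pr-42.probo.example"))   # ('127.0.0.1', 49153)
```

Because the proxy is the only way in, stopped containers cost almost nothing until someone actually looks at them.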
Remember that thing I said about Drupal having all this stuff in the database, and that you don't really know if you've broken something unless you're testing the most recent code with the most recent database? That's why we built this thing called the asset receiver. What it lets you do is receive upload streams and put them in a bucket. So you can say, I'm gonna create a bucket for this project. And then when you stream a database snapshot to me, I will put it somewhere; I will use an access token, so that you can kind of have API keys to upload; and then I will compress it and then encrypt it. That order is important if you want the compression to actually do anything to the file. Then we allow those files to be downloaded, so that you can fetch them from your builds. Now, the asset receiver is currently protected basically just by a firewall. We set up our Nginx configuration so that you can post stuff to the asset receiver publicly, but you can't download stuff. So inside our infrastructure you can download stuff; outside it, you can only upload stuff. And you have to have a token to upload something. What that lets you do is just keep sending us new copies of your database. The second we fully receive a complete copy, we replace your old one with it, and now all your new builds will use that database. So in the Probo UI, which I'm gonna show in a second, you go and you can get a token. We generate one for you. So there's an example of a token, and up here is an example of how you could use our little probo-uploader CLI client. You just copy that token, and then you can pass it like this to probo-uploader, and you can upload either a stream of data or a file you had on disk. Using it is pretty simple. You just run sudo npm install -g probo-uploader, and the second you do that it installs all the dependencies and installs it as a global command. And then you can (the other video I edited like a cooking show, so it went real fast) there we go.
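That compress-then-encrypt ordering matters because well-encrypted output looks like random noise, and random noise doesn't compress. Here's a quick way to convince yourself with standard tools; the key handling is purely illustrative, not how Probo manages keys:

```shell
cd "$(mktemp -d)"

# A dummy "database dump": repetitive, so it compresses well.
printf 'INSERT INTO node VALUES (1);\n%.0s' $(seq 1000) > dump.sql

# Compress first, THEN encrypt: gzip still sees the repetitive SQL.
gzip -c dump.sql |
  openssl enc -aes-256-cbc -pbkdf2 -pass pass:demo-key -out dump.sql.gz.enc

# Round-trip it back to show nothing was lost.
openssl enc -d -aes-256-cbc -pbkdf2 -pass pass:demo-key -in dump.sql.gz.enc |
  gunzip > roundtrip.sql
cmp dump.sql roundtrip.sql && echo "round trip OK"

# Compare sizes: the encrypted archive is a fraction of the original.
wc -c dump.sql dump.sql.gz.enc
```

Reverse the order (encrypt first, compress second) and the gzip step would accomplish essentially nothing, because the ciphertext has no patterns left to exploit.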
You run probo-uploader, you give it the name of a file and that token, which you can copy off of the Probo site if you're using our hosted version, and you can upload your database and get it up into our system for us to do a build. So in terms of what you need to get started: all you need is a GitHub repo, and to send us your database, and now we can start doing real builds with your real production data. That's all you need. Now, everything I've talked about so far, except for the UI I just showed (I'll get into that in a second), is all open source. We've given it all away; it's on github.com/ProboCI. Now we also, like I said, have a software-as-a-service version. So you've got Nginx (well, that's not ours, obviously, but it's a service you need to run to set this all up), the proxy server and the GitHub handler, which are your windows to the outside world, the container manager, which runs your containers, and Loom, which collects all your logs. For the service, we add just a couple of other pieces, and you'll see that we use those same open source components; we don't have any plugins for them or anything, we use the same exact ones we've open sourced. Instead of pointing them at the container manager, we've built a little proxy that we call the coordinator, which is really just necessary to allow us to do multi-container-manager deployments. That part is proprietary, it's part of our SaaS service, and it enforces resource restrictions: how much are you paying us for storage? Don't let one loud person run up our hosting bill. And it enables deployments with more than one container manager, because otherwise, if you're pointing directly at the container manager, you only get one. So if you have a single project, install this yourself, don't pay us for anything, it'll be great, you'll be happy, and you can have all the workflow stuff that I talked about.
If you've got a ton of projects, it becomes nice to have that proprietary web UI that I showed, and I'm gonna show more in a minute, which basically provides a friendly UI for enabling repositories, creating upload tokens, and viewing the log results. We're gonna be working on an open source log viewer, but we don't have it yet. It also does the credit card collecting, integrating with resource limiting, et cetera, so that we can sort of say how much you're able to use. So, you're uploading your database to us. I said that we store it encrypted, but then we're gonna be running your stuff. What about security? Essentially our answer is the same as the Docker answer, because it's pretty much built on top of Docker. So we run your build as a non-root user inside of a container, and the isolation, the protection you have against someone else on the same container manager, is as good as Linux cgroups and namespaces; mostly just namespace support. There have been a couple of issues with namespace privilege escalation, which is why we're looking at rolling out kind of a more enterprise-y service where, if you pay us a little more, we'll set up a box and you'll have your own container manager that only your code will run on. Or conversely, if you're setting this up yourself, just keep in mind that the resource isolation is as good as the kernel's. Some people think that's really good. A lot of startups are launching that use that as their isolation model. Maybe it's good enough. If not, I just wanna be really transparent about what the guarantees are. We're sitting on top of a stack that's pretty new, and we're only as good as those kernel features that are responsible for isolating users from each other. So if you want us to run your stuff on your own box, or you wanna use the open source version to run stuff on your own box, great, do it. If not, it's probably okay. But maybe sanitize your database.
I don't know; we store the dumps encrypted, but whether somebody could get at your container while it's running is probably the main privilege escalation concern. Not sure. Should be fine. There have been exploits. That's the model.

So let me show you a quick demo. Here I'm inside a branch of our website, creating a new feature branch called background-green. That's a pretty descriptive title, but in case you're not sure: I'm going to turn the background green. You can see here I opened a CSS file and just changed orange to green. I saved that file and added it to git. You should always use git add -p so you don't commit anything you didn't mean to; at ZivTech, not using add -p is considered a fireable offense. I'm only sort of kidding about that. Descriptive commit message. We're going to push that live, or sorry, push that up to GitHub as a feature branch called background-green.

When we switch to GitHub, this couldn't be easier. GitHub will notice that we've pushed a new branch and will automatically pop up a thing in the interface saying, hey, you just pushed a new branch, are you maybe trying to create a pull request? I see you've got something called background-green; just click here and you can look at the diff. You can write a nice comment about what you were thinking, what you were doing, and why someone should merge this. Then you click create pull request, and immediately a message goes out to Probo and we start building your environment.

Once we build your environment, we parse your .probo.yaml file and figure out what you want to run, and you can see we create all the steps. We're working on higher-level steps, so that for most Drupal sites you could just answer: do you want me to revert the features, yes or no? Do you want me to update the database, yes or no? Is this an install profile or a full site? And you'd just have to give us four lines. At the moment, though, you just write bash, just like you would on Travis.
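The git portion of the demo above, sketched end to end as a runnable sequence. This uses a throwaway repository so it is safe to try; the file name, branch name, and messages are illustrative, not the actual ZivTech repo.

```shell
# Throwaway repo so the feature-branch workflow can be tried safely.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name "Demo User"

printf 'body { background: orange; }\n' > style.css
git add style.css
git commit -qm 'Initial commit'

# New feature branch, descriptively named after the change it makes.
git checkout -qb background-green
sed -i.bak 's/orange/green/' style.css && rm -f style.css.bak

# In real work, prefer `git add -p style.css` to review each hunk before staging.
git add style.css
git commit -qm 'Turn the background green'

# From here you would run `git push origin background-green` and open the
# pull request on GitHub, which is what triggers Probo to build the environment.
git branch --show-current
```

The only step omitted is the push itself, since it needs a real remote.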
Travis doesn't know about Drupal, but you can test Drupal on it; same thing here. So you can see the stuff is running. With the open source version, if I click the detail link, I go directly to the build itself, so I wouldn't see the logs; it takes me directly to the thing that was built: let me see my instance, let me see whether the thing's messed up or not. If you're using the SaaS version, we have this UI that does our software-as-a-service stuff, and you can see the results of all the different steps. And here's the killer feature: you click view build. I edited out the two-second delay. Oh, maybe I didn't. And after that delay we should see the page load here. There you go. There is the ZivTech site, and I just hit reload there; you saw how fast it reloaded, and now our background's green. So we just pushed that, and in less than a minute we got a build of the site. Actually, on this site it's probably almost a minute and a half for a fresh build of our relatively small database.

Then in the UI you can also click back and see your project. That's also where you get your upload tokens. You can see the branch names, and if a build is associated with a pull request you can see the pull request, and you can click on any of those to go view the build. You saw this one was called background-blue, and you can guess what I did on that one. As soon as that loads and we scroll down, you're going to see that, there we go, the background is blue on this one. We can look at these different parallel universes: if I accept this pull request, what's going to happen to my site? If I accept that pull request, what's going to happen to my site? And if we go back, I think I click on that pull request, just showing that that's a quick way to get to another pull request. You can see that all the checks have passed now, and you can see that build. Now, what happens with a pull request that keeps getting new commits pushed to it?
Well, you can see that you actually still retain the detail about each one of those pushes. This was the one we were using to demo this for people at the booth, so you can see that we've got all of these different builds, and you can click on any one of them to go see the results, or click through to view the site at that point in time. Here we'll see a version where it's orange, maybe. Yellow, yeah. And so this really enables you to move into that next-level workflow where you get to see everything about what's going to change before it lands in that shared master branch, where other people are going to be happy or unhappy with how all that stuff fits together. Here you can test all those things in isolation and then see them combine on master once they're fully baked. Just like Dries was saying in the keynote: get those feature branches totally ready so master can be shipped at any point in time.

So if you want an invite code to the software-as-a-service edition, go to probo.ci/bcn. Folks signing up from Barcelona, since we're sort of auditioning it here, get to the front of the list, and we're going to be inviting people on a rolling basis as we see what the real resource utilization is. Oh, I should also mention we do garbage collection. All those builds we were looking at were fresh: we have a configurable garbage collector where we can say how many builds on an open pull request we should keep around, and we usually leave it at one. For this demo we turned it up (you can turn it up on your own instances), but we usually leave it at one, so the most recent build will always be there for an open pull request, and when you close the pull request we destroy that container and reclaim those resources. We're also going to be adding a button in our UI so you can say rebuild and we'll build it again for you.
Okay, I think at this point I can take a few questions for the last 15 minutes. There's a lot more I could go into, but let me just say thanks a lot, and we'll move on to questions.

What's involved with the open source setup, if you want to set this up on your local machine and test it out?

Ah, I'm glad you asked that, because I actually have a resource I wanted to share with you. Here, let me turn on mirroring so that if I need to show anything else I can do it quickly without breaking my neck turning my head around. So ProboCI is again the GitHub organization. You can see here exactly the closed source pieces that I was telling you about: our stash, our puppet master, our coordinator, and our web service. That's everything that's not open. If you look at the main project, probo, this is going to grow into a meta-project that pieces together all of the different sub-projects, so that if you want to get a simple setup running you can do npm install probo on a machine with Docker and then just say probo monolith, and we'll start up all of the services for you so you can evaluate it. At the moment it's kind of a split: some of the processes, like the GitHub handler and the container manager, are in this monolithic project, and some have been split out on their own. We're in the process of splitting them into their own individual repos now that things are kind of stable, and then this project is going to group those together as an easy installer, sort of a packager. But in this repository you can see there's a quick start guide, and the quick start guide will give you all the steps you need for a complete setup. We retested all of the steps in here yesterday and made sure we got everything totally locked down, and by yesterday I mean yesterday East Coast time, so really this morning. So that should be totally up to date; it should totally work. You do need a recent version of Node.
The Probo uploader should work with any Node from the last couple of years, but for the rest we're using some of the cool new ECMAScript features, so you need to be running either Node 4 (the new version, now that io.js has been merged back in) or io.js version 3 or later, I think. I think we note that at the top. So yeah, if you want to get started, it's easy to remember: ProboCI/probo, and look for the markdown file with the quick start guide. We're also working on updating the ProboCI site, which has a documentation section that probably needs those updates merged in as well. That'll give some more information, and we're actively working on getting it updated so it's got the latest information too.

Yeah, thanks for the talk, it was delightful. Two questions, actually. Do you plan, or have you already developed, some kind of master merge? So if a feature branch has been open for a while and two other feature branches have already been merged, then before the environment is built, master is merged into the feature branch, so that you can test basically what it would look like if you clicked the merge button.

Yeah, we're actually working on that right now. Initially we just built your feature branch at that commit, which tells you what happens if you check that commit out, but not what happens if you click the merge button. TravisCI builds what happens when you click the merge button. GitHub even has some API features that can do some of that merging for you, and we're going to make it a configurable flag, so you can say: do I want the regular commit, or do I want the merged version? The default will be the merged version, because I think that's more useful. You want to know what happens when you click merge, not what happens if you check out that individual commit. Great.
And then I think the harder question is whether we want to do any kind of automated attempt at rebasing, which is a much scarier, bigger feature where you'd need a lot more UI outside of GitHub. That's something we've been talking about, but if GitHub's afraid of doing it, we probably should be too.

And the second question: do you plan to release an enterprise paid version for your own server, on-premise, not as a service? Because we have customers who would never allow us to use the software as a service, basically.

Yeah, I think: reach out to me after this and, if those users are comfortable using beta software, we can work something out and get something set up with the beta. At the moment we're not accepting credit cards, because we want to make sure this thing is going to be robust, and we don't want to start taking somebody's money before we've seen it working for people who aren't us and our close friends and family. It seems to be working very reliably, but we haven't had it open to the broader world, which I'm quite sure will reveal some edge cases about things people need to do for their builds. So, once we've established that, and we get a sense of what the hard costs of running it are going to be... we still don't have great data on that. We know what the profile looks like for our existing sites, but we don't know what the profile will look like for the broader community: how long will most people keep these containers up versus shut down, how big are databases on average, how long do builds normally last before the container gets destroyed. All of that is stuff we still want to gather more data on before we start setting rates that we don't want to have to renegotiate, basically.
So we're going to be letting people into a limited free beta period where you can try it out and use it at no cost, and our payment is that we learn how you use it. At some point we're going to try to have a free tier, but for more resources or more private stuff there will probably be tiered pricing somehow tied to resource utilization: maybe it's the number of builds, maybe it's disk space, maybe some combination, maybe the number of projects. We're still trying to figure that out. There will be an enterprise version that you're able to install. One thing that might look like, which we're talking about, is that you run the container manager inside your own infrastructure, but it ties out to our UI. So we never have any of your data, but you can use our UI to see the log history and that kind of stuff. Or, for enterprise, we can install the whole thing behind the firewall, depending.

So, I work at Yoast, and we build a lot of WordPress plugins; we're also doing other stuff. In WordPress you have this problem of multiple PHP versions, and since we're building plugins we want to test against multiple PHP versions, and also multiple different subsets of plugins installed together with ours. That means we might be interested in, like, ten build environments to QA-test one feature branch. Is that something that would be possible with your product?

We don't currently support matrix builds, which is what you'd be looking for, right? You'd want some set of criteria where you might say: I want to test against PHP 5.3, 5.4, 5.5, and 5.6, I guess. And then you might also say I want to test against the HipHop VM, and then maybe I want to test against both MySQL and MariaDB. Okay, well, now we don't just have a few builds, right? We have ten.
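For reference, this is roughly what that scenario looks like as a matrix build on a system that supports them, such as Travis CI. The version numbers mirror the ones just mentioned; the env-based database switch is a common Travis convention, not anything Probo-specific.

```yaml
# Travis-style matrix sketch: each php entry is crossed with each env entry,
# so this one file expands into 5 x 2 = 10 independent build jobs.
language: php
php:
  - "5.3"
  - "5.4"
  - "5.5"
  - "5.6"
  - hhvm
env:
  - DB=mysql
  - DB=mariadb
```

Each of those ten jobs runs the same build script, reading DB to decide which database to install and test against.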
And so the number of builds, and the number of resources we're using for them, can start to explode depending on what we let you do. Supporting matrix builds is in the backlog; at the moment we don't support them at all. The only way you'd be able to do it would be to hack around it: because we load the .probo.yaml file from the code we're testing, if you created separate pull requests that each tweaked that one setting, it would work, but that would be a pretty nasty workaround.

I should probably also mention that we whitelist which containers you can use. Right now we have one image; it's built on Ubuntu 14.04 and has the default versions of everything. We intend to add more: we have a 12.04 image that's just about stable, since a lot of people are still running that, and we're working on a newer image as well. So in your .probo.yaml you can specify which image, but it can't be just any image from the Docker registry; we filter it, and if it's not one of the images we built, we won't use it. That's largely for security reasons at the moment. One of the things people keep asking me is: can I run my own containers? Right now the answer is, if you need something we don't have, tell me and I'll try to include it; if you have a container you've built that you want me to include, tell me and we might whitelist it so you can use it. But at the moment, no, you can't just pick an arbitrary image to build and have that be what runs. Although there's quite a lot you can do with the scripts to change configuration and such from your build steps, just like on Travis. Thanks. Sure.

Hi, in your asset management, I forget the name, you showed that you import a database dump. Is it possible to use it for normal files, like Drupal's files directory or some other assets?

Absolutely. There are two components to that. The Probo uploader was the thing I showed installing and using in the demo video.
The asset receiver is the open source piece. You can see it's fully documented with curl commands, so it's not tied to Probo itself at all. Same with Loom, I should have said: neither of these directly integrates with the other services; the other services know how to integrate with them. They're general-purpose tools for storing uploaded files in a bucket, which you can use for whatever purpose you have.

As for how you end up using them, I can actually show you the .probo.yaml file for ZivTech. I probably should have put this in the slides; I'll have that for next time. We have this .probo.yaml file, which will look pretty familiar to anyone who's ever written a .travis.yml file, and you'll notice that in here I have an assets key with an array of files. Basically, when you upload a file to us, we keep track of the name, and then you just tell us which named file you want to use and we use the most recent version of it. The point of the CLI client, instead of the web UI, is that you can set up a Jenkins job with one of these access tokens that takes a SQL dump (sanitizing the database would be a good idea, and I would highly recommend it), then sends it to us over SSL, and we store it encrypted. When you want to use it, you just put the name in here. The container manager has a plugin that looks at any assets you've listed, connects to the configured asset receiver to download them into the container, and puts them in /assets; it also downloads the code. So by the time your steps run, they wake up in the code folder, /src, and can do whatever they want with those files. You can have multiple assets; you could upload a zip of your public files folder along with your SQL database.
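Putting that answer together, here is a hypothetical .probo.yaml sketch. The assets array and bash-style steps are what the talk describes; the step key names and the exact commands are my assumptions for illustration, so check the quick start guide in ProboCI/probo for the real schema.

```yaml
# Hypothetical sketch only. The "assets" key is as described in the talk;
# the step structure and commands are illustrative, not the verified schema.
assets:
  - dev.sql.gz                # a named upload sent earlier via the uploader CLI
steps:
  - name: Import the database
    command: 'gunzip -c /assets/dev.sql.gz | drush sql-cli'
  - name: Run update hooks and revert features
    command: |
      drush updb -y
      drush fra -y
```

The container manager downloads each listed asset before the steps run, so the steps can assume the dump is already sitting in /assets.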
Now, what we do is use Stage File Proxy, so we don't upload our files. We configure Stage File Proxy and, as you can see here, turn it on by running drush en stage_file_proxy. What that does is, when someone requests an image and it's missing, it goes and fetches it from the configured upstream server in real time. So when you go to load one of the images on our homepage, we actually connect to our real homepage, download it, and then it's cached. Were you to upload a new version of the same-named file, that would be uploaded to the local site, not up to the main site, and it would sort of override what was there. (In Drupal we usually don't overwrite a file; we write a new file and update the reference.) But that's the general idea with Stage File Proxy; it's a great module for saving space. If you have a big public files folder and we're charging you by the gigabyte, you won't want to pull that whole set into every single build. It's bad enough to import the whole database, but for the files you can usually test well enough with Stage File Proxy.

Another question: if you have a custom nginx with a very custom configuration, how does that fit into the diagram you showed? Will it be on the web proxy, or before the web proxy?
It depends on what you're trying to do. If your Drupal site requires a special nginx configuration, I think what you'd need to do is upload that as an asset and then move it into the appropriate configuration folder on the container. Right now our only image, like I said, is a LAMP image; we're working on a LEMP one, though.

In our case it's a custom-built nginx with custom modules and custom configuration. It's very large, with SSI and some Lua scripts, so it won't fit into the container like this.

Yes, so for that we'd need one of two things: either we coordinate and I set up an image for you, or whitelist your image so you'd be allowed to run it, or we'd have to open it up so you could specify any image on the registry, because what you need is a custom-built image.

Is it possible to provision it with Ansible?

It is not currently possible to provision it with Ansible. That's an interesting feature request, but I'll have to give it a little more thought. I think that's an edge case for most Drupal sites, and I wonder whether, to test the Drupal configuration, you really need the full nginx configuration. But it's definitely worth talking about and worth looking at. We want to let you run as much as possible; we just want to do it securely, and we're still nailing down exactly what that means.

Have you found any limitations in size with any projects? For example, if you have a database of 20 gigs or more, would that slow down launching a build?
Sure, so there are two levels to that. Another feature on our roadmap that we're going to start on soon is to exploit Docker's layered file system. Docker lets you create a layer of the file system, and it's a copy-on-write system, so you can have the base machine install, and then when you import the MySQL database, that could create a new layer. If you use InnoDB's file-per-table setting, you'd have a separate file for each of those tables, so if we built that as an image for your site, and only let you run off of it, then when we created an individual build for you we wouldn't need a new copy of your whole database. We'd only have to clone each file that you wrote to: if you created a node, we'd need a copy of the node table, but at least not the entire database, so anything your tests don't touch wouldn't blow up the disk usage.

But right now you're importing the database?

Currently we're importing the whole thing, so a database that size would make your build run really slowly. We're currently not enforcing any timeouts. That's actually one of the features we needed from other CI systems: a lot of them only let you run, say, 100 minutes. With Drupal, building a fresh page on a big site, even a pretty optimized one, it's really easy to burn at least half a second rebuilding a cache. So when you start saying we're going to do integration testing that clicks through all the features we care about, it's very easy for that test suite to start taking an hour and a half, two hours. And the only way to break that up without having tests step on each other's toes is to split it into multiple suites that can run concurrently on separate services; to do that on Travis, they say break it up and run a matrix build that runs some parts of your suite in each job, basically. And yeah, if you're importing your database, it takes however long it takes. So we want to leave that unconstrained, but if we're
charging for disk space, that's going to get expensive very quickly too: if you have a lot of pull requests being created, you'd basically be eating 20 gigs of disk for each one of the open builds.

Is the old version still usable if you create a new upload, until that database upload is complete?

Yep. The way it works is we start streaming the file, and when we hit the end of the file, that's when we write the record saying we have a new version of that file. And then garbage collection happens to retain only... actually, to be perfectly honest, the garbage collection for that isn't written yet, so right now we just store all of your files forever. But as that becomes a problem, we'll build in a garbage collection feature, saying only keep the most recent three; right now we only let you use the most recent one per name. But yeah, that happens at the end of the file: that's when we update the record, and that's when we point you at the new version.

All right, thanks a lot guys, really appreciate it. Oh, I should say: if you liked it, review the session. If you didn't, you can still review the session; I just wouldn't recommend it. I think it was fundamentally a little bit different.