All right, I think we can get started. Thanks everyone for showing up for the last talk of the day. I expected it to be half empty in here and everyone to be tired. So thanks for still coming. So if your career in software development resembles mine in any way at all, you've probably found one thing to be constant over everything else. There's always more work to do. No matter how much you try, no matter how long you work, there's always something else you can do today, tomorrow, next week. For me, like I said, this has definitely been the case. I work on the Google Cloud Client Libraries team, and we maintain hundreds of different packages. To make this problem worse, they aren't even all node packages. We maintain eight different languages. Now luckily, no one on the team is expected to know all eight of these languages, but almost all of us know more than one, and we still have to maintain all of these things. So that only helps the problem a little bit. What we started to realize is that we needed help, and that's when the bots came to save us. I say the bots came to save us, but I'm obviously speaking poetically. They didn't come to save us, and our bots aren't all that smart, so I wouldn't expect them to be our saviors. I mean, they're helpful, but they tend to be good at just doing a single thing very well. Before we get further into this talk, I wanted to discuss: what is a bot? What does that mean in terms of software development? When I think about robots, I immediately think of an automotive assembly line, and these big arms that replaced the jobs of humans from decades before. But we really don't mean this when we say bots in technology. We're not usually talking about a physical robot of any kind, and we're not even talking about a robots.txt. Robots.txt files are read by web crawlers, and those crawlers are fairly similar to the bots we're referring to, but they're a little different as well.
What I usually mean when I say the word bots is something that resembles this. I mean a script that runs as a result of an action, a trigger from something external. That might be a timer, that might be a pull request, and as a response to this event, some code runs. Bots aren't good at all things, of course. As I said, they're usually good at repetitive work, things that are scoped to one single thing, but they're very good at that one thing. Particularly, they're good at things that don't require intuition or any sort of debugging. So the sort of action that can be blindly followed with a strict process. And this isn't necessarily a bad thing. Because it turns out humans are really bad at being robots. Multiple studies show that when humans encounter repeated processes, we fatigue, we make mistakes, we miss steps. And so it's good if we can use something that's good at following rote rules instead of ourselves. I think it should be our goal to eliminate as many of those sorts of tasks as possible so we can focus on the work that requires human intuition and generally brings joy to developers. And while most of us use bots, I think it's true that the majority of software developers haven't written a bot. With any hope, by the end of this talk, a lot of the mystery around that will be removed. So to frame this, I want to talk a bit about levels of automation. The SAE, the Society of Automotive Engineers, has a standard for self-driving cars that separates out levels of automation, how much of the work the system does, so we can have an understanding of the advancement and the sort of risk involved. This will help us understand things, and it will allow us to start with simpler bots and get to more complex ones as I talk. So this is the chart they use. I'm not going to try to describe it exactly, but at level zero, it's your typical car, the ones that have been around forever. The human being does everything.
Eventually you get to level five where the machine does everything. In theory, you don't need a human being. It is entirely unscoped, meaning it doesn't just have a list of tasks it can do; it can do infinite tasks, anything at all. This is the fully self-driving car. So if we take that and apply it to bots, we get this chart. At level one, we have things that are automated a little bit, sort of at the level of a script or a tool. And eventually, again, we get to level five where the machine does everything. You can see at each level of this chart one more thing is taken over by the system. That's the bolded bit. I'm not going to try to describe this much further here. I think it'll be easier as we go into examples. So at level one, put simply, our goal is to automate portions of our workflow. Not necessarily make a bot do all the work, but take away the parts where it's easy to make small little mistakes. As a human being, you're going to discover the work. You're going to kick off a task, hit a button, run a script. But the work itself will be automated. So let's put up an example. We have a package and our goal is to release it. But releasing takes multiple actions. It might involve tagging a branch, updating a release number, publishing to npm, maybe deploying docs, et cetera. So how do we fix that? Well, we can write a script. The script can do all of those things, and we can click a button. This might not sound entirely like a bot yet. We don't tend to think of bots as being a thing where we just execute a script locally, but this is the most basic bot. The deployment environment is your machine. It does a task for you. The only thing that's a bit odd is because at level one we're still triggering and assisting the system, it doesn't feel all that automated. And then we move on to level two. The best way to describe this is that we can automate the discovery now as well as the work.
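A level-one release script like the one just described can be sketched in a few lines of TypeScript. This is a minimal illustration, not our actual tooling: the version-bump, tag, push, and publish commands are the typical npm steps, and the release() wrapper simply runs them in order when a human triggers it.

```typescript
// Minimal sketch of a "level one" release script: the human triggers it,
// the script performs the repetitive steps. The command list is illustrative.
import { execSync } from "child_process";

// Build the list of release steps for a given version. Kept pure so the
// sequence is easy to inspect (and unit test) without running anything.
export function releaseCommands(version: string): string[] {
  return [
    `npm version ${version} --no-git-tag-version`, // bump package.json
    `git commit -am "chore: release v${version}"`, // record the bump
    `git tag v${version}`,                         // tag the release
    "git push --follow-tags",                      // push commit and tag
    "npm publish",                                 // publish to npm
  ];
}

// Execute the steps in order; any failing command stops the release.
export function release(version: string): void {
  for (const cmd of releaseCommands(version)) {
    console.log(`> ${cmd}`);
    execSync(cmd, { stdio: "inherit" });
  }
}
```

The point is less the commands themselves than that the whole error-prone sequence collapses into one deliberate button press.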
But we're still, as human beings, going to be responsible for supervising the bot. We're not going to trust it to operate unmonitored. A representative task for that: the script we authored previously could be forgotten and never run, and it would be good to know if that happens. If we have some release ready to go, let's say we've updated the release number within package.json and we haven't published yet, it would be cool if we had some sort of monitoring to let us know about that, so we don't just let it sit stale forever. And when we get to level three, this is where I think you start to really see them as very useful bots. At this point, we let the bot start doing work for us. We have to supervise it a little bit, probably check in on it. But for the most part, it's fully starting to self-monitor. And we're not going to have to do a lot of intervention ourselves. An example of how that might manifest is we have issues in our repository that go stale. We all have repositories that we have to work in, and issues get assigned to developers on the team. But occasionally people on the team become overloaded, or that individual on the team maybe isn't the subject matter expert for that issue. So they're stalling out. So we can implement something that juggles these issues around and assigns them to a different team member to see if that would help us get traction. Another example you might see here is something like a CLA bot, where you can notice that a contributor hasn't signed the CLA and the bot can sort of walk them through that. It's going to require very limited monitoring. The monitoring at that point is mostly in the fact that nothing's going to get merged without a human. But the bot can still go through that entire interaction with a new contributor. And then we get to level four, when things get a bit more advanced. We don't really have to supervise any more at all. The system handles its own fallback.
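The decision logic for that issue-juggling bot can be sketched as a couple of pure functions. Everything here is an illustrative assumption, not the real bot: the seven-day staleness threshold, the simple rotation through the team list, and the minimal Issue shape.

```typescript
// Sketch of "juggling" logic for a level-three bot: if an assigned issue
// has gone stale, hand it to a different team member. All thresholds and
// shapes here are illustrative assumptions.
interface Issue {
  assignee: string | null;
  updatedAt: Date;
}

const STALE_DAYS = 7; // arbitrary threshold for this sketch

// An issue is stale if it has an assignee but hasn't moved in a while.
export function isStale(issue: Issue, now: Date): boolean {
  const ageMs = now.getTime() - issue.updatedAt.getTime();
  return issue.assignee !== null && ageMs > STALE_DAYS * 24 * 60 * 60 * 1000;
}

// Choose the next assignee, skipping whoever has it now.
export function reassign(issue: Issue, team: string[]): string | null {
  const candidates = team.filter((m) => m !== issue.assignee);
  if (candidates.length === 0) return null;
  // Simple rotation: index of the current assignee picks the replacement.
  const idx = team.indexOf(issue.assignee ?? "");
  return candidates[idx % candidates.length] ?? candidates[0];
}
```

In a real deployment these functions would sit behind a timer trigger that lists open issues via the GitHub API and posts the reassignment.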
When errors happen, it knows how to self-recover. Maybe if it has an error it can't recover from, it opens a bug for you. So what does an example like this start to look like? Sometimes it turns out that in our repositories we have branches that get created for PRs, and contributors forget to delete them. And this starts to make things get a bit bloated and hard to see what's going on. So maybe we could write a bot that deletes them. And I feel like this is the point to mention that as you go through the levels, risk starts to increase also. This is a rather risky thing to do. What if those branches are needed and you made a mistake in this bot? This is what I mean when I say it's not really supervised anymore. It starts to take actions that would be hard to recover from. So other bots in this category are things that look like merge-on-green to master, where maybe you reviewed and approved a PR earlier, but someone merged something else; then once CI passed, it gets merged in automatically. And that could be rather risky. There are a lot of bots in this category, and the example we'll use later falls into this level four. And finally, we get to level five. I think at level five, the easiest way to describe it is the robot starts to become its own boss. Because unlike the previous one, it's now unscoped. It no longer has one question and one solution. It just responds to all questions with all solutions. And I think you'll find science fiction has taught us that unscoping bots can be a rather dangerous thing. I don't know that I would really want this happening in my repository. I would say we avoid doing this altogether, if at all possible. And after all, human beings are good at these sorts of tasks anyway, and we usually want intuition. So this probably isn't where we want to look for bots. To sort of tie back to automotive, they're finding the same problem.
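Because branch deletion is hard to undo, most of the work in a bot like that is the safety filter. Here is one possible sketch of that filter; the rules (skip the default branch, skip protected branches, only delete branches whose PR actually merged) and the Branch shape are assumptions for illustration, not the real bot's policy.

```typescript
// Sketch of the safety check a branch-cleanup bot might apply before
// deleting anything. The rules here are illustrative, not exhaustive.
interface Branch {
  name: string;
  protected: boolean;
  mergedPr: boolean; // was this branch the head of a merged pull request?
}

// Return only the branch names that pass every safety rule.
export function deletableBranches(branches: Branch[], defaultBranch: string): string[] {
  return branches
    .filter((b) => b.name !== defaultBranch) // never touch the default branch
    .filter((b) => !b.protected)             // respect branch protection
    .filter((b) => b.mergedPr)               // only branches whose PR merged
    .map((b) => b.name);
}
```

Keeping the filter pure means you can test it exhaustively before ever wiring it to a delete call, which is exactly where the level-four risk lives.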
In automotive, with driverless cars, getting to level five is going to be very difficult because it means the automation needs to understand new scenarios it hasn't seen before. And that's a pretty giant jump from training on existing scenarios. So at this point, though, I think we should talk about writing bots and not me just sort of giving you quick examples. And you probably thought, maybe we could just write a bot to write the bots. Let's go back to the previous step. We probably don't want to do that, right? If we find a way to have bots write bots, we've probably reached sentient computing. And then you end up with this weird problem of who watches the watcher. Eventually you realize it's turtles all the way down, and you're probably just going to have to accept that we're doing this ourselves. So while we can't use a bot to do this for us, we can leverage a series of frameworks. And our team found one that we liked a lot called Probot. This is good because most of us don't want to spend our time authoring bots. The bots are a means to an end. They're not the solution itself. They're not our marketing. And so being able to leverage other open source products means we can get back to our product and not just being bot authors. Probot integrates really well with GitHub. It's authored by a GitHub engineer. And it allows us to trigger events in the form of small node apps based on a GitHub context, many different GitHub events. The nice thing too is they have a variety of samples we can use to sort of inspire ourselves and understand what to do. The documentation is fairly good. And so that's what we decided to go with. So the first thing we're going to want to do is think about the scenario we want to solve and try to scope that to a problem that is solvable by a bot and no longer requires a human.
So a simple question that we might ask ourselves is: could we have renovate PRs, PRs from the renovate bot, automatically run CI for us and not wait for an engineer on the team to go tell the CI system to run? The reason this is important is systems like Travis, Circle, and the internal CI we use restrict which contributors can kick off builds. And this is important. Most build systems have secrets. And if any random person on the internet can run a build, they can modify those files and they can expose secrets. So we don't want to allow that to just happen. It needs to be a trusted contributor of the repository, of the project. But renovate isn't really a contributor. It's a thing we use, it's a thing we trust, but it's not part of the GitHub org. And so we could probably write a bot to do this. And that seems like a small enough size and something direct. When renovate creates a PR and we detect that it's the author, we run CI. The next thing we need to figure out is what sort of events do we need to trigger this on? We could try to trigger on all possible events, but that's probably going to mean it runs too much. So we might want to trigger on the initial PR, maybe on updates to the PR, maybe on the creation of an issue. That last one's probably not relevant in this exact example, but it's a common one. There are dozens of different events you can trigger on, but for me these four are most often the ones you end up using. And the next decision you get to make is: are bots going to alert, or change your system? And this again goes back to risk. If a system only ever alerts you of a problem, it's generally not that risky. In the case of something like a CLA bot or maybe a linting bot, it's likely to just leave a comment on the PR. It's not going to merge your code, not gonna run your build system; relatively safe. On the other hand, if we make changes that run the build system, merge, publish, they become more risky. These are the sorts of things that cause incidents.
And so you need to decide how much risk you're willing to take on in this instance. So for the case of this bot, we're going to likely add a label to our repository that says it's safe to run CI. And so this looks a little more like a change. And so that's a little bit more risk, but we can't get the value without that. So let's get around to building it. Probot comes with this quick start we can run via npx, and it's a pretty reasonable place to start if you've never written a bot before. It's going to generate a node project for you that has most of the templating, after asking you some simple questions. We didn't use this exactly, because we found out that we wanted to do templating on top of Probot. And so we recently added our own bot generator that does basically the same thing. There are a few less questions because we can make a lot of assumptions. For instance, all the authors are Google, and so that makes that a little simpler. But this also allowed us to do things like template our READMEs, have consistent style across all of our samples, and have similar targets inside package.json. Like I said, at the end of the day, this is just another node package. All the bot system does is run a method when an event happens. It's a pretty straightforward application and looks like a lot of things you've used before. And as a very bare minimum set of dependencies, we're going to use something from Octokit to interact with GitHub. We're going to use Probot. In our case, we also use a package called gcf-utils, which we wrote because we leverage Google Cloud Functions for this. So diving a bit deeper into source, I wanted to look at the code that isn't just boilerplate. Most of this, you'll probably never edit. But inside of one of the TS files, you're going to find a function that takes an application and, on a list of events, does some action. In this case, we have a few different triggers that we're going to go on.
If a PR is opened, reopened or synchronized (that's a PR update), we want to run an event. These are all possible options for renovate when it comes through. But the bot we write is pretty simple. We check the pull request's user login, and if that user login matches renovate, we can add a label to the pull request. Bots aren't necessarily all that complicated. A lot of times they're simple actions, but the difficulty of this part of the bot doesn't really tell you the value. This one bot has saved hundreds of hours for our development team, from having to go into these issues and tell them to run CI. So how do you set up your environment to author these bots locally? This is pretty GitHub-centric in the way I describe it, but it's likely worth noting there's nothing GitHub-specific about what we're doing here. You could change those events to not be GitHub events. You could use Probot and send it webhooks from somewhere else; all possibilities. So the first thing we do to support local development is start a proxy. This is so that we can use our local development system as the target of the webhook that GitHub provides. And there's a service called smee we can use. All you have to do to use this particular proxy is go to smee.io, click a button, and it will give you a slug URL that you can then use to route your events from GitHub to. So we start by running a proxy once we get that slug URL, and this sets everything up for us. I should mention it's possible you don't even need to run this step, but we've found environments where, if you skip this, things might not properly configure. And so you run it once. You only ever run this the first time you set your machine up. And the next thing you do is run npm start. Like I said, it's basically just a regular node package once you use Probot, and it will direct you to go to port 3000 of your machine. And that is so we can go ahead and set up the GitHub app.
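The handler logic just described can be sketched like this. I've written it as a plain function over a minimal, Probot-style context so it runs and tests without the framework; the label name, the trusted-login list, and the context shape are illustrative assumptions, not the real bot's code.

```typescript
// Sketch of the renovate-detection logic, Probot-style. The label name,
// the trusted logins, and the minimal context shape are assumptions.
const TRUSTED_BOTS = ["renovate[bot]", "renovate-bot"];
const CI_LABEL = "ci:run"; // hypothetical label your CI system watches for

interface PullRequestContext {
  payload: { pull_request: { user: { login: string } } };
  addLabel(label: string): Promise<void>; // stand-in for octokit label calls
}

// The pure decision: is this PR from an author we trust to run CI?
export function isTrustedAuthor(login: string): boolean {
  return TRUSTED_BOTS.includes(login.toLowerCase());
}

// Handler wired to pull_request.opened / .reopened / .synchronize events.
// Returns true when it labeled the PR, so behavior is easy to assert on.
export async function onPullRequest(context: PullRequestContext): Promise<boolean> {
  const login = context.payload.pull_request.user.login;
  if (!isTrustedAuthor(login)) return false;
  await context.addLabel(CI_LABEL);
  return true;
}
```

In a real Probot app you would register onPullRequest with app.on for those three event names and make the addLabel call through the provided GitHub client.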
You'll be presented with a screen that looks like this, where you have to register a GitHub app. You'll go through the GitHub app creation process. We'll give it a name. We'll configure permissions. Once we get around to configuring permissions, things start to get a little bit harder, because we have to ask ourselves some real questions. What does this bot really have to change about a repository? In theory we could give it all access, but again, back to risk, we likely don't wanna do that. So we're going to have to start sort of thinking. In the case of this bot, we're going to need write access to the repository because we need to add labels. We're gonna give it permission to mess with pull requests only, and that will restrict it from doing anything too crazy. And once we've done that, we need to install the bot into the repository. There's an install app button on the screen we're on for permissions. And you'll be presented with a yellow box where you'll have to review your permissions. The permissions being reviewed here are the ones from the previous step. And I point this out because if you don't do this, nothing interesting will happen. You'll sit here a while; it's an easy mistake to make, because you think, well, I've made a bot and I've set permissions, but you have to do this step. You also might have to do this step again if you ever change the permissions. And that's a gotcha that's caught me. I hadn't given it all the permissions I needed the first time around. And if you forget to do that again, it won't trigger on those things until you come and do this. So we click on that link. We can say, all right, these permissions are safe, I'm happy with that. We can install it on any repository or all repositories in an org. Something that I do is I create a repository purely for testing. It's not important. I can make PRs against the branches, whatever. It'll be fine. And I target this repository with new bots.
That way I can sort of test drive them before putting them against anything live. So we set that permission up, and we can move along to running the bot. To do this we need to set a few environment variables. Every GitHub app comes with an application ID. It comes with a private key. And it comes with a secret for webhooks. These are pretty straightforward to get. The app ID will be at the top of the app page, and we can export it as APP_ID. The webhook secret is a string that you set; for demonstration purposes, in this case it's just "probot demo". And then we need to configure private keys. At the bottom of the page, there's an area where you can generate new keys. It'll automatically download a private key file for you. And as long as you provide the path to that, it'll be able to resolve this for you. So here we show ourselves running npm start. smee is now forwarding to localhost:3000, and we're starting to get these POST requests coming through. And the POST requests are all the result of me opening this PR. It's a test repo, so you can see that I've updated the README a bunch and opened and reopened the pull request. And then we can go over to smee and start to look into what events we're getting. If we expand one of those pull request events, we'll see the JSON payload and what it looks like on a live repository. You get a better idea of what information our bot's receiving and how to respond to it. So this is handy for live debugging and just ad hoc testing. But it's also useful for then taking these payloads and turning them into unit tests that are repeatable. And so that's where this tends to be most useful: you can ad hoc test how it works, and you can capture that and turn it into a test you can run over and over again and avoid regressing your bot. So what does the deployment stack for that look like? Now that you can run it locally, how do we get that into somewhere that's not running on our development machine? We use a variety of services.
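Turning a captured webhook payload into a repeatable test can be as simple as saving the JSON and asserting on what your decision logic does with it. A sketch, with assumptions throughout: the payload below is trimmed to the fields that matter, and shouldRunCi is a stand-in for whatever predicate your own handler applies.

```typescript
// Sketch: replay a captured webhook payload (trimmed to relevant fields)
// against the bot's decision logic so its behavior is locked in by a test.
// The predicate here is a stand-in, not a real bot's handler.
const capturedPayload = {
  action: "opened",
  pull_request: { user: { login: "renovate[bot]" } },
};

// Should this event cause CI to run? Event names match GitHub's
// pull_request actions; the trusted login is an assumption.
export function shouldRunCi(payload: typeof capturedPayload): boolean {
  return (
    ["opened", "reopened", "synchronize"].includes(payload.action) &&
    payload.pull_request.user.login === "renovate[bot]"
  );
}
```

The nice property is that the payload came from a real event you watched arrive through the proxy, so the test exercises exactly the shape production will send.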
Like I said, we put this on Google Cloud Functions. We ultimately use Cloud Storage and a thing called KMS, the Key Management Service. So let's talk about a bit of those components. The most important bit is Google Cloud Functions. Just as a call-out right away: at the time we started this, GitHub Actions didn't exist, and if we were starting this again, that might have been an approach we looked into. But we already started this on Google Cloud Functions, and so it doesn't really make much sense at this point for us to go back. Google Cloud Functions take a web event, any sort of HTTP trigger, and they can start executing. So they're these little on-demand actions, which is a really good fit for a bot. They tend to be a good fit for anything that doesn't have a lot of state management and that's not being called very frequently. And most bots aren't. They're intermittently called, maybe just during business hours, and so this is a good application for that. And there's an existing Google Cloud Function handler that Probot provides. If you'd like to use this, you can install it from them. We also wanted to use KMS, the Key Management Service. And so the idea here was that we didn't wanna store any of the secrets in the Cloud Function itself. There's a potential security risk in using environment variables in the bot, just like there would be for a CI system. And so instead of using environment variables, we can inject these things through the KMS system, which ultimately stores them on Google Cloud Storage, and then they're fetched as they're needed and immediately piped into the commands. So they're never sourced into an environment variable. It would be more difficult for someone to capture those things. And the utility that's released by Probot doesn't support this, so we ended up writing our own. It is a rather simple bootstrapper, so that wasn't too much work.
But if this sounds interesting to you, that you would wanna use Cloud Functions and some of these more advanced features, feel free to reach out to me or come visit the Google booth. I'd like to talk about it. We haven't yet released this to people. It's only in our repository, but if there was value, it's something we could consider open sourcing further. So this is the version we use. Like I said, it's a simple package in our repository. It is under active development, but if this interests you, please talk to me further. We also use a system called Google Cloud Build, and this is what deploys our bots. This allows us to use the secrets that we're storing as well as the deployment pipeline; no developer needs to manage the publishing, it just self-publishes as we need. So that's good for us. But let's step away from that a bit, get out of some of the Google specifics, and just talk about: if you were to publish a single bot to something like Cloud Functions, what would that look like? It's going to look, again, like a lot of node apps you've written before. We're gonna have a compilation step. All of our bots are TypeScript; most of our code base, in fact, is. We're going to make a target directory and copy some things over to it. And that's the build step. Technically this target part isn't necessary, but it is a bit of a safety. It means that when we go to deploy, we don't deploy anything unnecessary. We're only going to deploy the things we most care about for the bot, not random artifacts that happen to be in the repository. For publishing, we provide a function name, and Google Cloud comes with a tool called gcloud that we can use to upload a function once we provide a directory. Through here, we tell it to use KMS and the secrets for us. It's gonna go through and upload it for us through gcloud. And that can be done without Cloud Build.
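Put together, a single-bot build-and-deploy might look something like the fragment below. This is a sketch under assumptions, not our exact configuration: the function name, the compile script, the runtime version, and the copied files are all placeholders you'd adjust for your project.

```shell
# Sketch of a single-bot build and deploy, outside Cloud Build.
# Names, scripts, and flags below are placeholders.

npm run compile                     # e.g. tsc: TypeScript -> build/
mkdir -p target
cp -r build package.json target/    # copy only what the function needs

# Upload the function with gcloud; --trigger-http makes it a webhook target.
gcloud functions deploy my-bot \
  --source=target \
  --trigger-http \
  --runtime=nodejs10
```

The separate target directory is the safety mentioned above: only the files you copied in get uploaded, not whatever else happens to be in the repository.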
The reason we use Cloud Build ultimately is we have more than one bot, and so it's nice to have a centralized system for that and a pattern we can follow. So I hope this has helped you understand a bit of how we do bots, at least for Google Cloud client libraries, and has inspired you to embrace using bots to free your team from a lot of gardening and allow you to do more meaningful work. I would say that any task that's repeated often is a good candidate for bots, and virtually all projects can benefit from using them. I'd also like to mention that all of our bots are open sourced. They're on GitHub and can be looked at. This is the repository they exist at. There are a variety of instances of them, and almost all of the examples I talked about are bots that exist and that we are using today. And I also wanted to take a moment to thank the others that contributed to this. I am certainly not the only one that has worked on this project. A lot of people have. I just wanted to take a moment to thank them all for their efforts. So thank you all for having me.