Is this going to be about GovCMS? It's not just Drupal, it's an ecosystem of tools. Good morning. I'll start slowly so that people get the chance to come in. Yeah, I'm the guy that comes to a Drupal conference and doesn't talk about Drupal, because that's how I roll. So yeah, I'm Toby. I'm the GovCMS tech lead. What we're going to do is talk a little bit about all the things that go into running a platform for Drupal sites. GovCMS has always been known as a Drupal distribution. That's its roots, that's where it comes from. But in the last year we've gone through a bit of a journey: a new hosting platform, and a whole wide range of new opportunities for us. So we're now looking after two, soon to be three, distributions. There are scaffolding repositories, there are Docker images, there are Drupal modules. There are all sorts of helpers and tools and other add-on stuff that we do. So I'll talk about some of those, and we'll do questions and stuff.

So GovCMS is growing. It's something we don't talk about a lot, but here are the actual numbers behind that growth. We're getting 15 million SaaS page views a month. This time last year, we were at 10 million. This time two years ago, we were at 5 million. So we're adding 5 million monthly page views every year to GovCMS. 600 million hits a month across the whole platform, 50 terabytes of traffic. I remember the conversations when we got to five terabytes of traffic, and it's just growing. 300 sites, 50-plus in the works. It's not showing any signs of slowing down. So from a platform point of view, we've got to build a platform that can support this, and that's what we've been doing in the last year. We went to market about 18 months ago to replace the existing GovCMS platform. We selected Lagoon, provided by amazee.io and Salsa Digital, to be the platform. And the reason we chose it is because it's flexible. We can choose and scale and grow. If a tool works, we can use it. If a tool does a job better than one we've got, we can replace it. And that's the view we take with a lot of this. So what I'm going to cover is where we've got to on a lot of these things. The ability to shape some of these tools for our needs has been really, really useful. This is stuff that we use every day: testing, analysis, building, auditing, reporting, all of it.

Lagoon is the main platform that underpins GovCMS: an open source, Docker-based build and deploy system. Basically, it allows you to build a stack on your local machine, build it, test it, check it's okay, and it will convert it and magically put it up on the Internet. And it'll look exactly the same, and it'll work perfectly, and there won't be any slowdowns, problems, delays, or holdups en route.
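To give a flavour of how a Lagoon project hangs together, here's a minimal, illustrative sketch of the kind of `.lagoon.yml` that sits alongside a project's `docker-compose.yml`; the route and schedule are made up, not from a real GovCMS site.

```yaml
# Illustrative .lagoon.yml sketch; hypothetical values, not a real GovCMS project.
docker-compose-yaml: docker-compose.yml

environments:
  master:
    routes:
      - nginx:
          - www.coolgovsite.gov.au   # hypothetical route
    cronjobs:
      - name: drush cron
        schedule: '*/15 * * * *'
        command: drush cron
        service: cli
```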
GovCMS's use of Lagoon is pushing the boundaries of what Lagoon can do, so we've been working really, really closely with amazee.io to help Lagoon meet our needs, and amazee.io have been fantastic in allowing us that level of partnership. Together with Salsa Digital, we've put a lot of thought into things Lagoon could do to make our lives easier, and to make our agencies' and developers' lives easier. So we've collaborated on a few things, and we've been quite active in contributing to Lagoon. There's an awful lot of Carl in my presentation. Carl wrote a task for running cron. Chris from the Department of Human Services renamed a file to fix NGINX rewrites. I added a catch-all option for variables. Steve redid the MariaDB client configuration. These are all contributions that GovCMS and Salsa have put back into Lagoon, to say: we're using this platform, and these are things we think could be done better, or these are enhancements. And amazee.io have said, yep, that makes sense, that makes total sense. And if it doesn't make sense, they're not afraid to tell us. That's a good thing. They're scratching our itches, but they're itches that other people are experiencing too. That's the whole point of the open source thing: if we can fix something for ourselves, that's great; if we can fix it for everybody else who's using the platform, that's even better.

A couple of really big things. When we took Lagoon on, we acknowledged that the role-based access control wasn't as fine-grained as we needed it to be. We needed the concept of protecting prod environments, and of agencies being able to define permission levels for users. So we worked really closely with the team at amazee.io to build a role-based access control system that was released in August: Lagoon Episode I, The Permission Wars, because naming stuff is the most fun part of doing any job. We basically funded this role-based access control component for Lagoon, and it's now a standard part of the Lagoon product. Everybody running Lagoon elsewhere around the world now uses it, and it's integrated tightly into the way the API and the authentication subsystems work.

The Lagoon CLI is a tool that was originally built by Matt Glaman for his own purposes, for a Lagoon site he was running in the US, and then he got to the end of his project. Some of the amazee.io team had been using the CLI and saw potential in it, and GovCMS said, well, that scratches our itch too. We're administering a platform; a CLI that means we can add and remove projects makes our life easier. It'll make life easier for the Salsa Digital team, who do a lot of our frontline support. So: let's spend some time making this CLI fully fledged. And Ben's done a great job of building this tool and really thinking about how someone's going to interact with an application. We're really proud to be part of this, and to be involved in the evolution of a product in a way that you don't get when you buy something off the shelf. That's the whole crux of the open source thing for us.

We run a big GitLab installation inside our cluster, because for us security is paramount: if everything is inside the same VPC, it's an awful lot easier for us to secure. We use GitLab to provide the authentication base. Someone logs into their GitLab account, and that's the same account the role-based access control uses to control access to their projects and their environments and all their tools. We pay for GitLab because we use some features that don't come in the free edition, but it's an open source project; the code's all out there. One of the nice things it's got that we make use of is a container registry. Like Docker Hub for public images, we can put images into the container registry inside our cluster and do a bit of stuff with that. GitLab is also a CI/CD tool, and we've done a lot of work over the last few months in building out a high-quality CI/CD pipeline. This is where some of the proof-of-concept work is at, but the idea is that a job goes from being built on your local machine, where it's up and it works, to being committed to Git.
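As a rough illustration of the flow that kicks off from there, a `.gitlab-ci.yml` along these lines would express the stages I'm about to describe; the stage layout and script names here are assumptions for the sketch, not our actual configuration.

```yaml
# Hypothetical .gitlab-ci.yml sketch of the stage flow; illustrative only.
stages:
  - lint
  - unit
  - vet
  - preflight
  - deploy

lint:
  stage: lint
  script:
    - composer validate
    - phpcs --standard=Drupal web/modules/custom

vet:
  stage: vet
  script:
    - ./scripts/vet.sh             # hypothetical: check nothing protected was changed

behat:
  stage: preflight
  when: manual                     # optional functional tests, switched on per project
  script:
    - vendor/bin/behat

deploy:
  stage: deploy
  only:
    - branches
  script:
    - ./scripts/notify-lagoon.sh   # hypothetical: hand off to Lagoon to build and deploy
```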
Git picks up a workflow and goes through it: it runs lint tests, unit tests, and a vet step, which for us is the check that this is valid, that you've not changed something you shouldn't, that you've not broken something you shouldn't. It'll do preflight tests. If you want to do more integration tests, you can switch on auditing (Carl will be talking a bit about auditing this afternoon), you can switch on Behat tests, you can do functional tests. Get through all those steps and it triggers a deployment, and it goes up to Lagoon. We've not rolled this out to production, but we're looking at how we can leverage some of those steps to help agencies build better sites and deploy sites that work. There's a lot of flexibility in that.

One of the other things we've come across with GitLab is that there were a couple of things it didn't do that we really needed it to do. But, same as Lagoon, it's an open source project. Because we're using GitLab heavily for roles, there was a webhook that wasn't firing if you changed a role a certain way. We raised this with GitLab as a bug, and they said: look, it's a bug, we can probably get someone on it in the next nine to twelve months, unless you've got someone who can write Ruby. I think Carl was busy that weekend. So Brandon from amazee.io wrote the code for GitLab to fire a webhook on permission change. He learned Ruby, wrote the code, and pushed it up to GitLab; they approved the PR, and his code went live into GitLab and GitLab.com in version 12.3. They looked at the work we'd done and said, yeah, that's brilliant. Tom did work rewriting their charts for deployment into OpenShift and Kubernetes, fixing an issue we'd come across with that container registry. Again: submit the PR, normal workflow, and it's released in the next version of GitLab. We've got this direct access to the products we're using. And because we've engaged people who are smart, people who are contribution-savvy, we can get the changes we want into the products we're using really easily. Well, I'll say "really easily", but we've got an unprecedented level of access to the tools we use.

In terms of running on the cluster, we're very heavy users of the Elastic Stack: Elasticsearch, Kibana, Logstash. Again, we run them inside the cluster because it's nice and safe and secure within our install. But we've got options: we're in the process of working out whether Lagoon can handle a hosted Elasticsearch, or multiple Elasticsearches, and all the possibilities there, trying to work out the best, most sensible uses for them. We use Elasticsearch to collect all of our logs, which it does a good job at, mostly. And Kibana is really valuable for visualizing some of this stuff, to put logs back in people's hands, to put data back in people's hands. So we do quite a lot of analysis and insight and visualization. There's a chart of GovCMS sites by type, so we can see SaaS versus PaaS, Drupal 7 versus Drupal 8. There's the outcome of some of those audit tests: the red bars are things that everyone fails at, the green bars are generally good. It gives us really quick insight into what's working and what's not. And this horrific chart on the right-hand side is the top 50 sites' traffic over a 24-hour period: nice and quiet overnight, and then it just goes crazy. You can't tell much from it, other than whatever site that is did something wild about 10 a.m. We've got all that kind of information.
How do we use it? How do we process it? To get it in, we use Logstash and Fluentd. Log collection is a really exciting subject. What we've done here is build some custom pipelines that we use to go out and gather information. As well as collecting logs from sites, we're collecting information from GitLab, we're collecting information from Lagoon, and we're trying to work out the best way of using it. So: the token code slide. We do this to check whether sites are live and to collect the routes. If a site's got a route attached, we can check whether that route is actually on the internet or not, and whether it's actually on our platform or not, if there's a header present. We can go in and look at jobs: any time you submit a GitLab job to build in CI, we can track it and report on it, and we can see who's taking the longest time to run CI jobs, and keep a little list on the wall. It's not a contest, but there are obvious winners. We collect our logs from Simple Email Service out in Amazon; again, that comes back into Elasticsearch so we can visualize and analyze it. We do stuff in here to make sure sites are up. Putting it all in the same place gives you a richness of information that helps you see the state of the platform at any one time.

We use a tool called Grafana quite a bit for analytics and monitoring. Whereas Logstash, Elasticsearch, and Kibana are very much application-level (individual sites, individual projects, individual tools), Grafana is much more in the actual platform weeds. We can see the state of all the Elasticsearch indexes. We can see how much memory an individual site has used over the last three, five, six hours; that's a fairly standard one. These little blips here are tiny cron jobs that pop up every so often. There's a good example here of traffic to sites over a time period. Here we can see the total volume of cron jobs running on the platform; you can see it spikes at six o'clock and then goes back down. We've got uptime metrics for databases, and CPU utilization. We've got all that kind of information, and it's up to us to work out how we use it: what's good, what's bad, how do we measure, how do we report, how do we learn from this? It's a real journey for us. Having that much data, you've got that much more to process and that much more to consider. When we're trying to work out what's happening at any stage, we have all this stuff, and trying to find the most important things is a bit of a hit-and-miss game sometimes.

We do a lot of automation on the platform, because there are a lot of manual tasks involved in running a platform this size. Stuart and Alex gave a talk at DrupalSouth last year about using AWX for migrating 108 sites to GovCMS. They wrote a really cool pipeline that would go out to Acquia, grab a site, transform it, and transport it across onto Lagoon and make it go live, magically. We realised then that we could do a lot more with AWX and Ansible, and over the last year we've really built a lot of Ansible tools to automate the stuff we're doing. Whenever we build or deploy a new site (say an agency comes and says, I want to build a new site called coolgovsite.gov.au), we'll provision that site using Ansible, to make sure the projects are set up correctly, the permissions are set up correctly, the names are set correctly, the routes: all that stuff is done in a repeatable manner.
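As a sketch of that provisioning idea (the task names, variables, and API endpoints here are hypothetical, not the real GovCMS playbooks), an Ansible play might look like this. Something in this shape runs as an AWX job, which is what makes it repeatable.

```yaml
# Illustrative provisioning play; hypothetical names, not the real playbooks.
- name: Provision a new GovCMS project
  hosts: localhost
  gather_facts: false
  vars:
    project_name: coolgovsite                  # hypothetical
  tasks:
    - name: Create the Lagoon project via the GraphQL API
      uri:
        url: "{{ lagoon_api_url }}/graphql"
        method: POST
        headers:
          Authorization: "Bearer {{ lagoon_token }}"
        body_format: json
        body:
          query: "{{ lookup('template', 'add-project.graphql.j2') }}"

    - name: Create the matching GitLab project and permissions
      uri:
        url: "{{ gitlab_api_url }}/projects"
        method: POST
        headers:
          PRIVATE-TOKEN: "{{ gitlab_token }}"
        body_format: json
        body:
          name: "{{ project_name }}"
```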
A lot of the collectors we run in Logstash, feeding into Elasticsearch, come out of AWX processes, because they're repeatable. We use it to do some of the bulk platform work. If you've ever got an email from us saying we're doing scheduled maintenance, or that GovCMS Drupal 7 sites are being deployed on a Friday night or a Tuesday night or whatever, we'll use AWX for that, because its strength is automation: it can trigger a deployment, watch for that deployment to complete, and run five or eight or ten, or one or two, in parallel really easily. So we've written some really cool tools. We can do nightly audit runs, we can do database backups, we can do bulk command runs. If we want to find out who's running the latest version of Admin Views on a Drupal 7 site, we can run a Drush command against our whole inventory of Drupal 7 sites, and five minutes later it all comes back and says: here are your 160 sites, 157 of them are running 1.7, 10 of them are running 1.6. We now know which ones are out of sync. It'd be nice to make that automated and more collectable, but it's much better than having to SSH into every single site individually to find out what it's doing.

We can run scheduled jobs. We run edge and beta scheduled jobs overnight: we can push the next version of GovCMS out to sites automatically for testing purposes. It's something we do quite a lot internally. We've not released it externally to too many customers yet, but if someone's site builds on the next version of GovCMS repeatedly, night after night, we can be confident that the next version of GovCMS is good and isn't going to play havoc with people. The bulk redeploy of environments, that's the biggie, that's the one we run. We'll kick that off and say we want to do five sites at once, and it will just stroll through the whole of GovCMS, updating and redeploying sites.

To make this happen, we've done some really cool work in the background on inventories. I've learned a lot about Ansible working with people who actually know Ansible, but the concept of an inventory is basically a list of sites. What we've done is connect Ansible up to Elasticsearch, because everything reports into Elasticsearch. Ansible will go to Elasticsearch and say: hey, I want to know project names, I want to know project types, I want to know all that stuff. Then, when we're setting up a job, we can say: okay, I want SaaS, I want Drupal 8; give me those sites; go into the inventory and just show me the Drupal 8 sites. Because Elasticsearch knows what's what and where's where, it will create an inventory that's only got Drupal 8 sites in it, so when we run that job, we know we're only running it against Drupal 8 sites. We can run it against Drupal 7 and Drupal 8, against production or non-production, against SaaS versus PaaS. It makes the deployment jobs really easy. We run against an inventory, and we've got some pretty cool stuff going on with vaults to store secrets and passwords and things.
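Put together, the bulk-redeploy idea looks roughly like the play below. The dynamic-inventory group names and the wait step are hypothetical; the deploy command is along the lines of what the Lagoon CLI offers. That `serial: 5` is the "five sites at once" knob mentioned above.

```yaml
# Sketch of a bulk redeploy play; group names and the wait step are hypothetical.
- name: Bulk redeploy environments
  hosts: "drupal8:&saas"        # intersection of dynamic-inventory groups
  serial: 5                     # five sites at a time
  gather_facts: false
  tasks:
    - name: Trigger a deployment through the Lagoon CLI
      command: >
        lagoon deploy latest
        --project {{ inventory_hostname }}
        --environment production
      delegate_to: localhost

    - name: Wait for the deployment to complete
      command: ./scripts/wait-for-deploy.sh {{ inventory_hostname }}   # hypothetical
      delegate_to: localhost
```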
So our deployment script is like 20 lines long, because we're making use of Ansible's in-built features. The script hasn't always been 20 lines long; it's gone through many iterations, because we're running in production and it needs to work. If it works, we're happy. If we can do it better, we'll look to do it better, but that's not our first priority. So there have been iterations: the first iteration couldn't actually be paused once it started to roll out, it would just keep on going; later versions brought in pausing, and we can now control how much parallelism it's doing. We've made quite a lot of changes to this over time, and the team at Salsa have been really good at leading a lot of this automation and process-automation work, because there's a real interest in reducing the manual input required.

We've been experimenting with a tool called Quay, for anybody that's familiar with Docker: Docker images go on Docker Hub, and Quay is basically the same as Docker Hub, but it's got a few really nice features that we really like from a security point of view. Container image scanning, the audit logs, some of that stuff. We've not moved all of our images across there, just a few that we're building. So I've got an image here, an image we use for our CI pipelines: it's got six vulnerabilities in it, and each of those vulnerabilities has a version fix. So what we'll do is go back and look at the way we're building this tool, work out where those vulnerabilities have come from, fix them, push a new version up, and those will all magically go away. Because Quay tells us we have those vulnerabilities, and because we know about them, we're better prepared. It gives us a nice audit log, so we can see who's pushing to it and who's pulling it. Unfortunately, because most of the pulling is done from our OpenShift cluster, it all comes as an anonymous call from an IP address, but if someone was doing an authenticated login, we could track that and see it happening. And you've got build histories, and you can roll back inside Quay, you can roll back an image, whereas with Docker Hub you have to recreate the image and push it back yourself. So it's something we've been experimenting and playing with.

One of the things I'm most excited about is a tool called Tugboat. Now, we've got a really nice swanky platform, Lagoon, that does all the building and all the deploying, but we've also got a lot of public distribution repos where we want to do public stuff and encourage public contribution, and we don't want to over-complicate them by adding all the Docker stuff and all the Lagoon stuff in. We want to keep them as simple and plain as possible. I came across Tugboat, a tool built by Lullabot, and it's basically for deploying and testing PR environments.
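Tugboat is driven by a YAML file in the repo. A minimal, illustrative sketch might look like the following; the image tags and build commands are assumptions for the example, not our actual configuration.

```yaml
# .tugboat/config.yml; illustrative sketch only.
services:
  php:
    image: tugboatqa/php:7.3-apache
    default: true
    commands:
      build:
        - composer install --no-dev
        - drush site:install govcms -y    # hypothetical install step
  mysql:
    image: tugboatqa/mysql:5.7
```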
So on our GovCMS and GovCMS8 distributions, our testers can now use Tugboat to build an environment from a pull request. If we're doing an update to the Entity Reference Revisions module, we can build that pull request in Tugboat, and it builds an environment hosted somewhere else, not on our infrastructure. So if anything nefarious happens, or someone lodges bad PRs or whatever, it's all separated and air-gapped from us. From a tester's point of view, the tester can deploy it themselves: they click on preview and it fires up, they log in, and they say, okay, today I've got to do this, this, this, and this. They can run those tests, pass them, and shut the environment back down when they're finished. So our developers don't need to provision environments for them and hand-hold them. It's got a couple of really nice features: the logging all comes out to the UI, and it's got terminal access, so if they do something really weird, you can SSH in and see exactly what's happened. It's got a mail catcher, so if they're testing email, you can see the mail appear in there. It's pretty easy to set up.

It's also got a really nice visual regression testing tool. The GovCMS distribution comes with a default theme; we install that default theme, and every time it builds one of these environments, it runs a full set of tests on whatever URLs you tell it. In this case, it's picked up that this page has changed in this PR. Shock horror, what's happened? It turns out the date's different, so that's not the biggest issue in the world. If that page was completely blank or missing something, yeah, we'd have to dig in and find out exactly why. So it's a really good solution for building and throwing away those environments. Yes, the first iteration we had did do this inside Lagoon, but we had to do a lot of converting and build a lot of tooling to allow that to happen, whereas this way we can keep those distributions pure Drupal, so they're more reusable outside of our platform. And for the cost of a monthly subscription, against the amount of time we don't have to spend supporting the building of environments for testers, and because the test load is shared between GovCMS and Salsa, anybody can step in and test anything at any time. It just makes it that much easier.

Since day one, we've used a tool called CircleCI. We now use GitLab CI in the cluster and CircleCI outside the cluster. Again, we could do a lot of it inside the cluster, but having that separation between public projects and private repos is really important for us; it helps to clearly define what's production GovCMS and what's our workflow outside of GovCMS. So again: nicely integrated, cloud hosted. Every time you do a pull request or build a new branch, it runs a set of tests. In Drupal 7 we've got a whole range of Behat tests; in Drupal 8 we've got fewer tests, but it will run whatever you need it to run. It'll build the environment, run your set of tests, and collect whatever outputs you need. Some of the things we do are outputs like double-checking PHP versions, and a `drush pm-list`, so we can see at a glance: we were expecting this to update the Search API Attachments module, and we can look in the output of the CI and see, yep, the module has been updated, we're on fire.

One of the things we built in the last few months is a set of D9 tooling using the static Drupal 9 analysis tools, so that when someone submits a PR, it does a Drupal 9 readiness check on that PR. It analyzes GovCMS core and the GovCMS modules, but it also scans all of the contrib modules for the Drupal 9 flag, so we know where we are in relation to being able to do Drupal 9. It's really handy having that in the CI; it can be run every time.

We've used Drutiny for a long time. Drutiny is a fantastic tool: it's open source, developed by the team at Acquia.
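Drutiny checks are expressed as YAML policies. The sketch below is illustrative: the class name and field values are assumptions based on Drutiny's documented policy format, not one of our actual policies, so check the Drutiny docs before copying it.

```yaml
# Illustrative Drutiny policy sketch; not an actual GovCMS policy.
title: Shield module is enabled
name: GovCMS:ShieldEnabled                        # hypothetical policy name
class: \Drutiny\Plugin\Drupal8\Audit\ModuleEnabled # class path per the Drupal 8 plugin; verify
parameters:
  module:
    default: shield
success: The Shield module is enabled.
failure: The Shield module is not enabled.
remediation: Enable the Shield module on this site.
```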
We've got custom policies, profiles, and remediation in place. Go and see Carl's talk (someone's calling me); he'll be talking a little bit about what we do with it and how we use some of the outcomes.

One of the things we've done at a platform level is a proof of concept with a tool called Satis. Satis is a static Composer repository generator. For those that were in Blaze and Tom's talk yesterday morning: they went about this a different way, and there are many ways to skin a cat, but what we like about this is that it allows fine control of PHP package versions. From a GovCMS point of view, the distribution has sort of one state: GovCMS is tested on these versions of these modules, and we know it works for those. It may well work on other versions, but we know it works for these, so we've traditionally kept very tight control over that. The way we build our images currently has the module versions baked into them. The proof of concept we've done on Satis allows the site to exist like a normal Drupal site, and it goes off and tries to resolve the list of modules that make up GovCMS. Satis basically sits in front of the traditional Composer and Packagist workflow and says: you can only have this version. So instead of us having to hard-code all of the modules in, we let Composer do its stuff; it can say, give me all the stuff that makes up GovCMS, and because Satis only knows one way to build GovCMS, that's what it gives you. It allows really fine control, and we can add other stuff in. So you can build a project and, rather than pointing at Packagist, point at Satis; Satis itself points at Packagist, packages.drupal.org, and a couple of other GovCMS ones, and all it's doing is building GovCMS beta 11. It builds the scaffold tooling and the required dev, goes through, and generates the actual buildable set. That will only give you version 8.7.9 of Drupal, because that matches beta 11, and it'll only give you alpha 2 of the UI-Kit Starter theme, because that matches too. So the resolution is really fast, because it doesn't need to work out which version of which works with which and how it all comes together.

For our needs, there were a couple of things Satis didn't do. If we absolutely don't want a version of a module used, because we've done a very thorough review on it and said, look, I really, really don't like the 1.6 version of Entity Reference Revisions, we need more information, we need a patch, we can't let that on the platform, then we wanted to be able to say that under no circumstances should sites be installing that version. So we can use Satis to blacklist it: when it goes through and builds, it will come to Entity Reference Revisions and say, no, I can't use 1.6, I'm going to have to go to 1.5. We can do that for old versions too. If there's a security vulnerability in an old version of Ctools, we can blacklist that version of Ctools, and nobody will be able to install it, because Packagist and Satis won't let you. Satis didn't do that out of the box, so Cy wrote the blacklist component for Satis, and that's now part of Satis. Cy also wrote a "match best candidates" option: one of the ways Composer works is that it can come up with multiple candidates, and Satis would occasionally get polluted and have two or three versions. It also allowed us to restrict the dependencies to the bare minimum, so rather than a file being 27 MB, the resulting resolution file is 700 KB. Again, Cy contributed it, and it's in Satis now. It's just another example of the stuff we've built.
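For a flavour of the idea: Satis is configured with a `satis.json`. The real file is JSON, but since JSON maps one-to-one onto YAML, here's the shape in the same notation as the rest of these examples; package names, versions, and the blacklist key's exact shape are illustrative, not the real GovCMS manifest.

```yaml
# The real file is satis.json (JSON); shown YAML-style here. Values illustrative.
name: govcms/satis
homepage: https://satis.example.gov.au          # hypothetical URL
repositories:
  - { type: composer, url: https://packagist.org }
  - { type: composer, url: https://packages.drupal.org/8 }
require:
  drupal/core: 8.7.9
  govcms/govcms: 1.0.0-beta11
require-dependencies: true                      # pull in just what those need
blacklist:                                      # the contributed blacklist component
  drupal/entity_reference_revisions: '1.6'
```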
It's real-world experience that's now benefiting everyone else. And we've got so many more tools that we've built. We have a really cool mail relay that we use to get mail out of Drupal sites and into SES, which Carl made a couple of really good contributions to. Carl did a rewrite of amazee.io's pygmy tool, and it's a vast improvement on the original. I talked about that GovCMS CI image: we've got a Docker-in-Docker image that we use in our CI pipelines. We're trying to do all this stuff, it's made our lives easier, and we're trying to release it so that other people can benefit. Our ethos with a lot of this is: rather than fork it and use it internally, let's try to contribute to the upstream. And our experience dealing with upstream maintainers has generally been really good. We say, look, we've identified this: on the SES mail relay, we fixed a problem where it couldn't handle multiple recipients, and we enabled it to support AWS configuration sets. The maintainer even moved closer to Australia to be closer to us; that's what he told us, anyway. The configuration set name, the Docker image, the pygmy rewrite: it's all part of what we do. We're trying to use our experience to help and guide other people forwards.

A bit more of an extension to what Blaze and Tom talked about yesterday: we've been through a bit of an automated dependency management journey. I once had all three automated dependency managers running on my fork of GovCMS, and I'd get somewhere in the region of 200 emails a week, just of updates to things. We're not running any of these; well, we are technically running one, but we're trying to evaluate which is the best. I really like Violinist. I think Violinist is a really good dependency manager: it'll scan your GitHub repo and come back with updates, and it integrates really well with Drupal, so you get the proper Drupal changelog in there. It only does PHP. Dependabot is now part of GitHub, and they do more than just PHP: they do PHP, they do Node, they do Docker. We're running Dependabot in some of our GitLab CI, because Dependabot publish an open source version of their dependency resolver that we can run inside the cluster to check for updates to Docker images for some of the Logstash pipelines we do. So if someone pushes a new tag of the Logstash image, we can get a PR with that inside our cluster. For us, that's really important, because we want as little talking to the outside world from inside as we can get away with: all it's doing is pulling down a file, matching against that file, and doing the resolution inside. Dependencies.io, again, does PHP, does Node, does other things. They've all got their pros and cons. I think eventually we'll settle on one (or three) of them, but it's really good information, and it really takes away some of that manual work. If we get a pull request into our repo on a Thursday morning for the latest version of whatever module, it'll go through CircleCI, so it will build, it will test, it'll get a green light. A tester can open it in Tugboat and test it. And by the time we've sat down at our desks at 11 a.m., I mean 9.30 a.m., this can be a PR that's built, tested, and manually tested. It just streamlines that workflow so much more.
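For a flavour of what automated-update configuration looks like, here's a minimal sketch in the format GitHub's Dependabot uses today; it's illustrative only, and our in-cluster setup drives the open source resolver differently.

```yaml
# Illustrative dependency-update config in Dependabot's format; not our setup.
version: 2
updates:
  - package-ecosystem: composer      # PHP / Drupal modules
    directory: "/"
    schedule:
      interval: daily
  - package-ecosystem: docker        # e.g. watching a Logstash image tag
    directory: "/"
    schedule:
      interval: weekly
```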
It's a really big photo of me. So if you want to talk about any of this stuff, I'll take questions now, but I'll be wobbling around for the rest of the day. Happy to show and demo stuff, happy to answer questions and talk about our experience doing this. But I think it's really important to see that there's so much more to running Drupal than just Drupal. All this stuff, all of our experiences as platform owners: we're really keen to show you what we're doing and where we're at. So, anybody got any questions? Questions? That makes my life easier.