Before we get started, could everyone just put your hands together and thank the organizers of WordCamp Brisbane for putting on such a great show.

We were a Fairfax family, in that way that back in the 80s, people had to actually go to a milk bar to buy a paper so they could read the news over breakfast. Each morning, my old man would go down to Mrs. Dancy's, because you see, in the 80s, you knew the name of the local milk bar owner. Anyway, Dad would go to Mrs. Dancy's and he would bring back a copy of the paper, and that paper was The Age. The effect of this, predictably enough, was that when I started reading the news, I started reading The Age too. This is the earliest capture of The Age website on the Wayback Machine. By this time, I was already a regular user of the net, so fairly soon after it went online, I started reading the paper online for free. It was great. The quality of the website slowly got better over the years, until the invention of auto-playing videos and Flash ads making use of dark patterns meant that I switched to ABC News as my primary source.

Jump forward about 20 years after Fairfax started putting their news online, and Fairfax is in trouble. The years of cash from print have long dried up. All the money made from those auto-playing videos and dark patterns and Flash ads is definitely not paying the bills. The business has already gone through years of brutal cost cutting and is barely staying afloat. Fairfax engaged an agency to work out if it's even possible to run a sustainable digital newsroom, and their findings were grim. The report basically came back and said: you can't cut any more costs out of the existing business; if you want to succeed, you need to start again from scratch. Basically, greenfields or die. So the secret project to reimagine our publishing tech stack, to build it from the ground up, began. And that's when I started at Fairfax. So, very much: welcome, everything's on fire.
But before I get into how I met Pete and Human Made, let's spend a minute talking about architecture. Who loves architecture? Woo! Yes. Thank you. My people. I was a bit scared that no one would say yes.

Starting out, we knew what we wanted to avoid. We wanted to avoid building one of these: the thing where you author stuff and store stuff and display stuff, for both your web and app, all from a single application. And when your app does all the things, it gets pretty heavy. We didn't want to do that. We wanted to keep things loosely coupled, and we did that with those magical things called microservices. This is a super high-altitude diagram of our environment today. We have journalists down here authoring content, readers up there consuming the content, and in between, magic.

When I first saw this as a full-time WordPress developer, I was terrified. Excited, because I'd heard about headless WordPress and it was something I wanted to try, but also incredibly terrified. You may be sharing some of the thoughts that were going around in my head at the time, like: this looks really complicated, and why would you go to all the effort? Well, look, web and app delivery were our main concern when we started, but we also had to consider known unknowns. What if voice assistants really take off? What if the next Oculus goes mainstream? We wanted to make sure that whatever we built would be able to deal with those uncertainties and disruptions without having to throw it all out and start all over again. It didn't make sense to lock the display capabilities of our site to those of our CMS, so we knew that whatever we settled on, the display would be separate from the authoring system. When you look at the tech that we're using today, it shouldn't be much of a surprise that we landed on WordPress for authoring. Our service APIs are built with Golang, and we use whatever makes sense for the render layer, so React for web, for example.
We use GraphQL as a proxy for all of our service APIs. We needed to start building straight away, from the very start of this project. So Fairfax put together the teams for the APIs and the render layers, and we engaged Human Made to help get authoring off the ground. And everything was peachy from day one, right, Pete?

As I stand here next to a former client, I want to ask you all a question. Who here has had a conversation today about difficult clients? They're painful, they don't know your processes, they won't listen to what works. And who here has had a conversation about difficult suppliers today? Yes, thank you. They're painful, they don't know your process, and they don't listen to what works. So, that was us early on. We both arrived with ideas of how we wanted to run the project based on how we'd run projects in the past. The large projects I'd worked on still had web teams of under five or ten people, and we were using a pretty vanilla WordPress setup, which at work allows us to get away with using Chassis for our VMs and GitHub for our issue tracking, a system that works for us and continues to work to this day. However, the Fairfax team all came from an agile background and had plans for a much bigger team than five or ten people, which, for the first time for me at least, meant using Jira and switching from GitHub to Bitbucket. And as always, when you're using tools that you're unfamiliar with, it can lead to a level of frustration. And that's before I mention the meetings. Monday: stand-up, backlog grooming. Tuesday: stand-up, estimation. Wednesday: stand-up, sprint planning. Thursday: stand-up, town hall. Friday: stand-up, showcase and sprint retro. But it's agile, so they're not called meetings, they're called ceremonies. And the most important meeting, sorry, Ben, ceremony, was estimating each ticket and assigning a t-shirt size or a number of points based on the Fibonacci sequence.
Not once do you dare mention how long the thing might take. And for the first few months, we sucked at it. We'd pile in the points, not really knowing how many t-shirts we could fit in our suitcase, only to reach the end of the two weeks and miss our sprint goal. To be completely honest, and to be fair, though, I did not make this easy on Human Made at all. We made them switch from Chassis to Docker and from GitHub to Bitbucket. And also, I was dropping PRs like this at 2:40 a.m. on the day of the showcase to make things look nice, which I merged without the required approvals. This didn't come across too well for some reason, so I stopped doing that.

But moving away from Chassis was a really big deal. It's Human Made's in-house dev platform. It was built for purpose and had everything they needed. Our dev tools were nowhere near as complete, but they were our dev tools. And if I'm going to be super passionate and pushy about something, it's going to be about removing drift from dev environments. So I made this CLI tool called BlueStruck, and it was designed as a day-one install. It would make sure everyone had the same dev tools and config across the board. It would download all the repos and start everything with a single command. The entire stack: WordPress and the APIs and the render layers and everything. It was going to be awesome, except it wasn't. In the early days, BlueStruck was super opinionated and incredibly rude. It had a list of stuff to install and it would just do it. It did not care for your existing applications and configuration. It didn't support Xdebug, and Docker for Mac was just, well, volatile. And as we built more stuff, Docker ate more and more RAM, and it took us a while to get things right. On the outside though, once BlueStruck was stable, it was great. We could test everything from the CMS right through to the website without actually leaving our computer.
Does the shortcode populate the website correctly? Boom, we can open it up in the new Brisbane Times website, and that's a new Brisbane Times website running on our own computer. So it took us a while to find our rhythm, but it didn't take too long until we were building some pretty cool stuff. And today we're going to go through some of the bigger items.

Starting with the Media Library. It was one of the earlier features we started on, and the goal was basically to enhance it. Now, this is the default version from my site. The Fairfax site has far fewer of my holiday snaps from a trip to New York to see musicals. And in the default Media Library, to upload an image, I drag it into the window, I drop it, and the image is uploaded and added to the library. It's not an uncommon interaction by any means, but it's a really nice one all the same, and it's one Fairfax wanted to keep. What they didn't want to keep were some of the limitations of the WordPress library. Each image is limited to four crops, these crops are applied globally, and file names are not systematic. Switching over to the Fairfax library, it looks much the same, and this is quite deliberate, and it behaves the same. Now, this is the local version from when I had a development server, because the real thing has many thousands more photos, and they're not all from the Tony Award-winning musical Dear Evan Hansen. And while it looks generally the same, there's a lot more happening in the background. Let's start early on in the journey. A journo wants to upload an image to the library. The first thing we do is rename that image to a hash of its contents. Among other things, this allows us to discard duplicates immediately. It also provides the nice systematic file names that we were missing earlier. It's then that some interesting stuff begins to happen.
Remember this diagram: instead of storing the images in WordPress, we hit the Fairfax-built media API, and their media team deals with serving the images to the site. The moment the journalist releases their mouse on this screen, we take over from WordPress. The browser uploads the image to the server, and our code converts the data to the format required by the Fairfax API and sends it across to their server. As we no longer need the image in WordPress, we delete it immediately. So that's how we handled uploads.

Let's take a look at what happens when a journo wants to use an image in their article, in particular, what happens when they first open the media library. WordPress returns its data in a nice enough format, but it's nothing like the data that the Fairfax API returns. Because one of the nice things about systematic URLs is that they're, well, systematic, the Fairfax API only returns the most basic of information. We needed to convert it to something WordPress understands, but we wanted to write as little code as possible. WordPress expects these responses to include various sizes, but we've got dynamically generated URLs. The thumbnail is named slightly differently, and you can specify width and height to get a best-guess crop, or you can specify entirely custom crops. For the default crops, we define the sizes in the usual WordPress fashion, by registering them in code. This allows us to refer to the default crop sizes throughout the CMS. Unfortunately, we immediately hit a problem or two. What we discovered is that when you get an attachment via WordPress, the first thing it does, in a roundabout fashion, is validate the ID as an integer, which, in our case, it's very much not. And in no way is this filterable. So what we ended up doing was writing a whole bunch of functions with comments along the lines of: this matches the signature of the core function, modified for the API.
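The conversion trick described above is that, because the URLs are systematic, all the per-size data WordPress wants can be synthesised from almost nothing. A hedged sketch, where the size names, dimensions, and URL parameters are all illustrative rather than Fairfax's real ones:

```javascript
// Default crop sizes registered in code; names and dimensions here are
// placeholders, not the real registrations.
const SIZES = {
  thumbnail: { width: 150, height: 150 },
  landscape: { width: 1024, height: 576 },
};

// The media API returns only the basics: a systematic URL plus the
// original dimensions. Each "size" WordPress expects is then just
// width/height parameters appended to that URL.
function toAttachmentMeta(apiItem) {
  const sizes = {};
  for (const [name, { width, height }] of Object.entries(SIZES)) {
    sizes[name] = {
      width,
      height,
      file: `${apiItem.url}?width=${width}&height=${height}`,
    };
  }
  // Mirrors the core metadata shape: original dimensions plus resized sizes.
  return { width: apiItem.width, height: apiItem.height, sizes };
}
```

Doing this mapping server-side is what let the client-side Media Library components stay largely untouched.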
We wrote these functions to match the core signatures as often as possible, to be kind to the developers who came after us, which is often our future selves. I, and a lot of other WordPress types, know that core attachment metadata is an array of the original image dimensions and the resized image dimensions. Our modified function returns it in pretty much the same format. By getting WordPress to return data from our core-equivalent functions in the correct format, we could convert the data from the Fairfax media API to the format expected by the Media Library in WordPress. If we look back at the default Media Library, the one full of my holiday snaps, the client-side code is actually made up of dozens and dozens of components. Doing the work to convert the data on the server allowed us to make the changes to the client-side code a relatively light touch. Or, more to the point, it allowed us to focus our client-side changes on the more difficult elements, such as when it came time to replace the default insert selection with something much more useful for inserting an image into an article, which did require that we build some completely custom components. These custom components allow us to change the URL of an image and allow a journo to customize the crop. Fortunately, we didn't need to worry about saving these changes to the Fairfax media API, as they're modified on a per-article basis. That was a specific goal: for the crop to become part of the content, transmitted across as part of the article's content rather than the image's.

So, Pete's touched on some of the reasons why the default media library wasn't really suitable for the newsroom. I'm gonna come back to one of those for a moment: the global crop issue. One of the reasons we landed on Cloudinary as our image resizer is because I wanted to experiment to see if we could build something where the newsroom didn't actually have to crop anything at all.
Product came up with four different ratios that we needed to support so stories would display correctly across different devices. That's four separate crops for every article that's produced in the newsroom. And it doesn't sound like a lot, but when you're churning out a lot of articles every day, that's quite a bit of time for the newsroom to spend. If we used WordPress global crops on the example here, we'd probably end up with a centre crop just like this one. These are passable here, but only just. Have you tried a centre crop on red carpet images? That's no dice. You can't get away with that, forget about it. Cloudinary ships with content-aware cropping, and it's awesome. By adding a single parameter to the image URL, we get a much more interesting set of crops. And for a time-poor newsroom, this is a huge time saver. Even if the crops aren't perfect, it's a great starting point. Here's a practical example. First, I'm gonna switch to another lead image, and you'll notice the right-hand side rail is updated. Now I'm gonna override the landscape image, and also the portrait image. Now, the portrait auto crop didn't quite nail it, it's not perfect, but it's pretty easy to jump in and edit it, take it from where it left off, drag it around a little bit, just get it just right. That's looking good. Yeah, so there you go. So we basically just made the defaults a little bit more sensible, and made it so that if they do need to do crops, hopefully it's only one of the four.

Modifying the default publishing and workflow features was one of the biggest changes we made to turn the CMS into something that you could consider enterprise. Among other things, we took the default WordPress publish box and replaced it with what we modestly called the publish box of the future.
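That single Cloudinary parameter is `g_auto`, which switches the `c_fill` crop from centre gravity to content-aware gravity. A minimal sketch of building such a delivery URL (the cloud name and public ID are placeholders, not Fairfax's):

```javascript
// Build a Cloudinary delivery URL. Adding `g_auto` to the transformation
// turns on content-aware gravity, so `c_fill` crops toward the subject
// of the image rather than its centre.
function cloudinaryUrl(publicId, width, height, { auto = true } = {}) {
  const transform = ['c_fill', `w_${width}`, `h_${height}`];
  if (auto) {
    transform.push('g_auto');
  }
  return `https://res.cloudinary.com/demo/image/upload/${transform.join(',')}/${publicId}`;
}
```

The same source image can then serve all four product ratios just by varying the `w_` and `h_` values, with no one in the newsroom touching a crop tool unless the automatic result misses.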
And while not lacking in hubris, the name actually started out as a convenience while we were building out the UI, because it allowed us to distinguish between the one that worked and the one that would work in the future. If you ever find yourself in a position where you need to make anything more than minor changes to the WordPress publish box, may I suggest this as your approach: kill it. Kill it dead. It will get in the way, honestly. We replaced it with our own meta box and built it out in JavaScript. One of the goals was to avoid full-page refreshes, which means saving and publishing via the WordPress REST API. This involved customizing the post endpoint and adding a whole lot of our custom data to it. We needed to add a custom property for each item of metadata that we'd added to the edit screen. In our case, this was quite a lot. In fact, it was such a lot that we needed to split the screen into two tabs: one primarily for editors, and one primarily for the reporter writing the story. We ended up adding just over 60 custom properties, because it can take quite a bit of work to keep it just another day in the newsroom.

As part of adding these custom properties to the edit screen, we needed revisions to include a bunch of additional content stored as metadata. For this, we used Adam Silverstein's post meta revisions plugin, which was working while we were saving with the default publish box. But once we switched to the publish box of the future, we started experiencing an off-by-one error when saving revisions: updates to the custom properties would be saved against the following revision. The full-page refresh via the default publish box uses a single function to save the post content, the taxonomies, and the metadata, and after that, a revision is created. Saving via the WordPress REST API does things slightly differently. Content is saved in one function; taxonomy and metadata updates are each saved in separate functions.
This changes the order in which things happen: by default, revisions are created before the post metadata has finished updating. If revisions don't include post meta, that's fine. But once revisions start to include over 60 custom properties, the client kind of starts to notice. To manage this, on REST API requests we had to move the saving of revisions so that it ran much later. But regardless of how the data is saved, once a journalist clicks save, the data needs to be sent off to the content API. Unlike media and attachments, though, we did maintain a copy of the content in the WordPress database.

One of the features of the publish meta box of the future was the ability to set an on-time for articles. What makes this so different to core future publishing in WordPress? Well, we don't use the WordPress post status. I mean, we know it's there, and it's in the database, but we don't use it. When we publish something in our authoring system and it gets sent to the API, it has a status of published; it just has a future on-time. That makes things pretty complicated. The on-time seems like a tiny feature, but in every newsroom that I've worked in so far, it's super critical. If it doesn't work, you're gonna have a bad day. So let's fast forward to a few months ago. We've got a team of WordPress developers now at Fairfax, and Human Made are engaged with other clients. The Fairfax team has a pretty good handle on the code base, but then we start getting issues like this: the article's first-published time displayed on the live site predates the time at which the article was first published. Cool. Steps to reproduce: save an article with a future on-time, but don't publish. Publish the article after the future on-time has elapsed. Now notice the incorrect first-published time. Yeah, my head's hurting a little bit. Like most of you, I enjoy sleeping. I'm pretty keen on not having to stress about timestamps on articles.
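The core of why on-time is tricky is that "published" and "live" are two different questions once you stop using the WordPress post status. A minimal sketch of that split, with illustrative field names rather than the real Fairfax schema:

```javascript
// In the content API, an article can have status "published" while its
// on-time is still in the future. Whether it is actually live on the
// site is a separate check against the clock.
function isLive(article, now = new Date()) {
  return article.status === 'published' && article.onTime <= now;
}
```

It's exactly this kind of small, easy-to-get-backwards comparison, multiplied across first-published times, updated times, and scheduling, that makes timestamp bugs so easy to ship.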
I also wanted my team to feel more confident about making changes to this part of the code base. So we put together some simple functional tests to cover the most important stuff. This is a small portion of one of the Cypress tests. I'm not gonna deep dive; I just wanted to share it to demonstrate that functional tests, I've found, are both one, very handy, and two, not as scary as I thought they might be. So, starting off, we can see that the test is around scheduling. We fill in the mandatory fields and run through a few click events to edit the on-time. A little further down the code, we see that the schedule button is pressed, and we're gonna chill for a bit until we get a response from the API. And finally, let's check: are we sending the right info with our request, and did we get a valid response? That's it. That's just a snippet of one of the many, many tests that we wrote. And although it hurt a little, spending the time to put these tests together really increased our confidence around releases, and reduced the need for super in-depth regression testing every time we made changes to anything related to time.

Now, before I came down today, I checked how many tags work uses on their blog: about 90. My personal site is, predictably enough, less disciplined, with 118. One of the last things I did when I had access to a Fairfax dev server was run this simple query for Fairfax, and the number that came back is still a little startling. It seems like a lot, a little over 23,000, but there are a lot of places in the world, and a lot of people, you know. As the CMS and the site were separate, our tags were managed on a central system and published to a Fairfax tags API endpoint, and the CMS would import these periodically. Fortunately, WordPress has some functions built in for scheduling cron jobs for this purpose.
For each of these jobs, we'd go off, get 50 tags from the content API, add them to the WordPress database, and repeat several times until we had 23,000 tags. But I'm getting ahead of myself. We got a little bruised when it came time to import the tags into the database, because managing tags makes a lot of database queries, and when you're dealing with a large scale, you need to consider the database queries. Simplified, this is the code we needed to run each time we inserted or updated a tag: we insert the tag itself and then set some associated metadata. Now, let's run through what happens to the database when you run this code. First, WordPress checks whether the tag exists by its slug. It doesn't, so the tag is then queried by name. Strap in, okay? Then the tag's name is checked against the parent, and again. Then the slug is checked for uniqueness. Then the tag is actually inserted. We can celebrate this briefly. Then WordPress checks whether the relationship data, parent, taxonomy, that kind of thing, exists. When inserting tags, it doesn't, so let's keep going: actually inserting the relationship data. And finally, a logic check to ensure that a duplicate hasn't been created. And that's just to insert the tag. We have two types of metadata to insert, so we need to run three more queries, twice. And for hierarchical taxonomies, the hierarchy is recalculated: WordPress updates what grows to become a very large option every time a tag is inserted. But we knew how heavy on the database inserting tags was, and we were prepared for it. We were going to control for all these slow queries by slowing down the import significantly: one page every 10 minutes. We planned to test the imports locally with just a few tags, but one morning we woke to discover the world. Or, more to the point, the 14,919 points on the globe that Fairfax had added to the tags API endpoint, and that WordPress was importing in turn, which essentially turned all of our environments into this, but slower.
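The shape of one of those scheduled import jobs can be sketched roughly like this, where `fetchPage` and `upsertTag` stand in for the real API client and for the expensive insert/update cascade just described:

```javascript
// One scheduled import job: fetch a page of 50 tags and upsert each into
// WordPress. Every upsert triggers the chain of existence checks, inserts,
// and hierarchy updates walked through above.
async function importPage(fetchPage, upsertTag, page, perPage = 50) {
  const tags = await fetchPage(page, perPage);
  for (const tag of tags) {
    await upsertTag(tag);
  }
  // A full page means there is probably another page, so another job
  // gets scheduled; a short page means we have reached the end.
  return tags.length === perPage;
}
```

With the throttle of one page every 10 minutes, paging through the full data set this way is exactly where the arithmetic turns ugly.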
What happened was that we needed to immediately rewrite our code, and then tidy it up a second time. You see, standard WordPress doesn't run a real cron; it fakes it with HTTP requests. But once we hit a lot of pages of tags, which I make 320, the background HTTP requests would time out after 30 seconds. And that's what we'd forgotten: HTTP requests time out. But during the 30 seconds the request was running for, we'd already started storing cron jobs in the database. When the job failed, the same job would run again and add the same data to the database. We reverted the original commit and decided to add a plugin to allow cron to scale. And for this, we used an existing Human Made product called Cavalcade, because Cavalcade is a runner, and because of course it's a pun. In production, the code looked, if anything, less considered than the original. Instead of throttling the imports, we scheduled them to happen immediately, all 320 pages. Well, by the time we deployed the code, all 460 pages. In the first iteration, we were slowing down how often these jobs could run; in the second iteration, they couldn't run often enough. In the second version, the import jobs run much as before, but because we're using a process runner instead of an HTTP request, we don't have to deal with 30-second timeouts. The process runs as long as it needs to, and we could run four jobs in parallel. Using the original code, it was going to take us 53 hours to do the first import of tags. The new code allowed us to do it in around 30 minutes.

And I'm here to say that in the last few months, we threw all of that work out. Which is very harsh, but it's what happened. This is the truth. Human Made built what we asked them to, and it was absolutely the best outcome at the time. The tags API was built to be paginated through; that's just how the internal API worked. It didn't have a search endpoint. But we also started noticing some issues.
First, interactions with the WordPress database itself were just slow. When an editor started typing into a tag field, it was taking just under a second for the results to be returned. Again, for a time-poor newsroom, 700 milliseconds is not really acceptable. So not only was the search perceived as slow, but due to the size of our data set, the best results we could offer were based on a lossy phrase search. So, in this example, "Donald" returned a bunch of results. But "Donald Trump" is no good, because we're missing the "John". So unless the editors are paying attention while they're typing, or they know every person's exact full name, you can see how this would be a frustrating user experience. In the meantime, our internal APIs were evolving, and a bunch of other teams had started using GraphQL, and they were having a great time. So we made the switch. Instead of asking WordPress for our tags, we ask our GraphQL layer instead. GraphQL then uses a server-to-server method to search the tags API, which is basically a simple Postgres query. Once we have the tags, we pass them back through GraphQL, and they come back to the user in WordPress. Now, having done similar things in the past using REST, I was very skeptical that this would be faster than interfacing with our own internal MySQL database. But I was wrong, very, very wrong. Search is now consistently, insanely fast, like seven times faster. The other huge benefit is that we're interfacing with the API itself. So no more imports, no more duplication of data, and if a new tag's added, we don't have to wait for the next batch import; it's available to the newsroom within milliseconds. So not only was it seven times faster, but Old Mate Trump could be found by the newsroom with ease. Finally, as a quick demo, just to show you that I'm not making it all up: the left example is interfacing with the WordPress database, and the right is interfacing with GraphQL.
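For context, the request the CMS sends to the GraphQL layer might look something like this. The field and argument names are illustrative guesses; the real internal schema isn't public.

```javascript
// The query document sent to the GraphQL layer instead of hitting the
// CMS's own MySQL tables. `tags`, `search`, and `first` are assumed names.
const TAG_SEARCH = `
  query TagSearch($term: String!) {
    tags(search: $term, first: 10) {
      id
      name
    }
  }
`;

// Shape of the POST body for a standard GraphQL endpoint.
function buildTagSearch(term) {
  return { query: TAG_SEARCH, variables: { term } };
}
```

The GraphQL layer resolves this server-to-server against the tags API, so the CMS never stores or imports tag data at all.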
The WordPress search for North Korea fails, as it's actually the Democratic People's Republic of Korea. I'll play it again, and you can just watch how fast those results come back on the right-hand screen with GraphQL. It is very fast.

So, Ben and I have just spent a good chunk of time talking about the code we wrote on this project. On the Human Made side, this is who we are. When Ben and I talk about us and we, that encompasses a lot of people. It encompasses XWP, who helped build out our live article interface, and it encompasses Ben May, who helped at the very beginning of the project. And of course, it encompasses the teams at Fairfax who are working on the product today. In total, there are 37 people who have non-merge commits on the project, about two-thirds of whom, at one point or another, were doing full-time, dedicated work just on the CMS. As Human Made wrapped up on the project, between the companies that had worked on it, we'd made around 11,500 non-merge commits. Since February, this number has grown to just shy of 16,000, and we also have a few more tags. For many of those commits, most of them, in fact, my job was code review. It got to the point that, on more than one occasion when introducing myself, I made the joke that my job is code review, and that I'm fooling myself that this sprint I really will get to pick up a ticket. As the team grew, I ended up spending 40 hours a week reviewing code. I didn't write this code, the team did. And when you take merge commits into account, about 3% of total commits is just me clicking the merge button on a PR after it's been approved. It's very easy to think of code review as unproductive time. One of the biggest lessons I learned working on the Fairfax project was that code review is productivity. And that's my guilty secret: I enjoy code review. My guilty secret is that I liked Pete doing code review, because he's excellent at it.
But I also think that code review is a big part of what separates the work that I've done elsewhere, at other newsrooms, from the work that we've done at Fairfax. I think it's what makes the project feel enterprise. When you've got a big team, you're releasing to production daily, and you've got a few hundred people in the newsroom relying on your platform, spending time on code review is no longer a nice-to-have. It becomes a crucial part of your process.

August 27 last year was an exciting day for me. It was about 10 months after we started work on the CMS, and it was the night that we launched Brisbane Times. I was in the office sweating profusely. I was at home drinking a really nice glass of wine. Yep, which he let me know on Slack. Thanks, Pete. Our product and render teams had put countless hours into delivering a better experience for our readers. But that's not what I was really excited about. I was excited because this relaunch signalled that the Brisbane Times newsroom was using WordPress as their primary CMS. The immediate feedback that we got in the days following was that it was just another day in the newsroom. And our job was to give the newsroom the tools they needed to do what they do best, write award-winning quality content, so we'd done our job well. Thank you. Thank you.