Hi folks, my name is Alex Dergachev and this is my colleague Dave Vasilevsky. We work at Evolving Web, a Drupal shop in Montreal, and today we're here to talk to you about test-driven Drupal upgrades, so hopefully you're in the right place. Before we go further, I want to say thank you very much for coming to a five-to-six-o'clock session, the last session of the day. We figured that because this is our first DrupalCon presentation, they knew that and put us at the end, but we made up for it by preparing a lot more than usual, so hopefully you'll enjoy the talk. Yeah, and just in case, we have some Mountain Dew — if you need it, feel free to grab a bit. Okay, so the outline of our talk today: we'll spend a minute talking about us, give an intro to the subject at hand, and then we'll cover the basics of Drupal minor updates, major upgrades, and how testing fits into all of this. The tools we particularly want to showcase today are Behat for integration testing, CircleCI for continuous integration, and Docker for spinning up consistent development environments. After that we'll do a demo — a Drupal-and-Docker marriage — that combines them and shows a really nice workflow that we've evolved for ourselves. Then we'll go into a little case study of a project we did a year ago, upgrading a website for McGill University, that was quite hairy and really pushed our team's capabilities and forced us to come up with better practices than just naively doing it. That was a Drupal 6-to-7 upgrade. And after that we'll demo a tool we produced as a result of this and similar projects, called SiteDiff, which I hope will really be something you're interested in trying — it may save you a lot of time and effort. All right, so about Evolving Web: our company, like I mentioned, is based in Montreal, Quebec. We've been doing Drupal since 2007.
We're super involved in the Drupal community — I've been to half a dozen or a dozen of these conferences, and we've usually sponsored. Our devs have committed code to Drupal core and contrib modules. As for the kind of projects we take on: we're not shy about DevOps kind of stuff, sites with lots of users, lots of content, multilingual — that's a big deal in Montreal. The organizations we work with really seem to take an interest in search, so we've done a couple of interesting search UI projects. And if you have lots of data, you have to import that data, so we're not shy about doing migrations and content import and sync. Of course, custom theme development and responsive design is right up our alley. We've also built apps on top of Drupal, like an admissions app for McGill University — that's really treating Drupal as a development platform. And beyond all this, we also do quite a bit of training. (Hi folks, welcome if you're just coming in.) So we do quite a bit of training as well. As for the kind of clients we see, across both Canada and the United States: top left is the AllSeen Alliance for the Linux Foundation — we've done a couple of projects and we have a few bigger ones just coming up. Parks Canada is a Canadian federal government agency. There's also Buyandsell.gc.ca (Achatsetventes.gc.ca) for Public Works Canada — their RFP portal, from a few years ago. That was another Drupal 6-to-7 upgrade, actually. AllHotels was a big portal for Travelocity that was competing with Hotels.com. MyLifetime — I think every Drupal shop in New York has worked on this project at one point or another; it's one of the earliest big Drupal deployments for an entertainment brand. And other local things like McGill University and others.
So beyond that, we've done public training in Montreal, Ottawa, and Toronto very regularly, as well as at DrupalCon Munich, and in New Jersey, New York, and Boston, and we've done a bunch of private training as well. That's my partner, Suzanne, who leads our training program. We've done it for Health Canada, Parks Canada, Tourism Quebec, Trent, and other teams. And we actually do it for all kinds of Drupal development teams, whether a dev shop or an organization that has an internal team, and we've even done some remotely. So if you're interested in good-quality Drupal training, please don't hesitate to contact us. Aside from being involved in Evolving Web since I co-founded it seven years ago, I have other programming interests: I've recently written a Chrome extension for translating and learning French, and I've written some Redmine and Vagrant plugins that people use. Dave was actually one of our first devs when we started the company seven years ago, on a part-time basis, and he was around to show us how to use version control and Linux and the like, and beyond that. Yeah, I basically delve into any open source program that I find, so I've contributed patches to a wide range of programs, from KDE to Firefox. All right, so there's a chance that in the future you might see his credits somewhere. Great, so that brings us to the talk at hand, Drupal upgrades. Rhetorically, I felt obliged to include this: why are upgrades and updates important for security? Well, is everyone familiar with what's being called Drupalgeddon? Yeah — has anyone here not patched to Drupal 7.32 or above for one of your sites or one of your clients' sites? If so, we will pause the presentation and wait for you to update your Drupal core — or just tell us, and we'll play a prank on you if you don't want to do that.
Yeah, so it's a huge security vulnerability that affected every Drupal site, and within days of its announcement, script kiddies had automated break-ins that allowed them to take full control of your server and execute arbitrary code — which means any server that was running unpatched Drupal sites for more than a day is considered compromised and should be wiped. So that's a pretty important lesson: be able to quickly and regularly perform core and contrib module updates. But updates can also bring in new features. Drupal 7, for example, has many performance improvements, so if you upgrade from D6 to D7, you can run Varnish without maintaining the Pressflow fork, and obviously there's the better admin UI. If you upgrade a contrib module like Webform from 3 to 4, you get Token integration, so your little Webform emails can leverage everything the core Token module can do. And of course, after Drupal 8 is released — as of now I think it's three months after that — they're no longer promising to backport security releases to Drupal 6. So whenever Drupal 8 is released, whenever it's ready, if you have any Drupal 6 sites you're sitting on, you're going to have to upgrade them to 7, or preferably 8. Great. So many Drupal devs, ourselves included, don't like updating and upgrading. We do it when we have to. We do it when we see an important security announcement. But what's so hard about it? Why isn't it done all the time and regularly? Well, first, it's not entirely clear in every situation what to do. You have to sit down and figure it out, and there go a couple of hours that you weren't planning to spend that day. It takes time. And we're afraid of regressions, and we hate doing the manual testing, right? Because it's not actually fundamentally hard to update the code.
What's hard is to be sure that the updated code works, and works well — that your site doesn't get broken. It's very possible — I wouldn't say likely, but possible — that even during a minor update, you'll introduce a bug that wasn't there before. So you think you're getting an important security release, but maybe no one was really going to go after your site, or maybe the vulnerability wasn't even a factor for you. But now you've introduced a bug, and your site is broken. So people are afraid to just try it without dedicating a large amount of time and having a good reason to do it. That's what this talk is about: to walk through the steps so they're at the top of your head after you leave this room. You probably know many of them anyway, but we're going to introduce a few extra tools and processes that make it easier and faster. And a lot of that is going to be focused on the testing side of things — how to verify that your upgrade is actually successful. Because, you know, that's the manual, painful part, I suspect, for most of the people in this room, and for ourselves. Just a quick point of terminology: minor updates versus major upgrades. I guess everybody in this room should know — minor is 7.35 to 7.36. Typically it's only security updates and important bug fixes, both core and contrib. Major upgrades are a major version bump, and for both core and contrib that means backwards-incompatible API changes. It means that if you have custom code that extends it, it might not work; if you have other modules that are supposed to play together, they might not work. Hence minor updates versus major upgrades. So we have a couple of slides now about minor update basics. First, when to run a minor update? At the very least, when there's a security announcement — you're also subscribed to the security mailing list at drupal.org/security, I hope.
They will send you an email saying these contrib modules have a known vulnerability, or Drupal core has just come out with an important fix. So if you see that — at the very least, do it then. Or more often: try to stay on top of it even more regularly than that. Where to update? Well, don't ever do it on prod. I know that for smaller projects — maybe one you're still kind of maintaining from back when you just started Drupal development, as a favor to the organization or person — there is no dev site. You don't have a dev site anymore. You don't have a staging environment. But hopefully after today, you'll be able to set up your dev environment in such a way that even in the future, you'll be able to come back and spin one up really fast. So don't do it on prod. The other thing to keep in mind, for both minor and major upgrades, is that if you cheated and used a version of a module that you needed to patch — or hack, whatever terminology you want to use — that's going to complicate your maintenance and your updates, because you'll have to maintain the patch. If your colleague or somebody else who takes over after you doesn't know that that contrib or core module was patched or hacked, they will overwrite your patch and lose whatever benefits it provided. Okay, so how to actually do the update in place? Well, there are two Drush commands that are helpful: drush ups (drush pm-updatestatus), which lists all the available updates, and then drush up (drush pm-update), which actually performs all the updates in place. So this is really great — one Drush command and you're done, right? Almost. In a real-world situation, you're going to run this Drush command on dev manually. It's going to do what it needs, and then you're going to test and see if anything broke — however you test, whether manually or using the automated testing tools that we'll discuss, or something else.
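In practice, that loop might look something like this on a dev copy (a sketch, not a recipe — exact flags and output vary by Drush version, and the commit message is illustrative):

```shell
# On a dev copy, never on prod.
drush ups          # list available updates (alias for pm-updatestatus)
drush up -y        # download and apply them in place; -y skips the prompts,
                   # and a single project can be named, e.g. `drush up webform`
drush updb -y      # make sure all pending database schema updates have run

# Test (manually, or with the automated tools discussed later), then commit
# the code changes to your repository:
git add -A && git commit -m "Security updates for core and contrib modules"
```

Because drush commands here run against a live Drupal site, this is a workflow fragment rather than a standalone script.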
If nothing broke, then you're going to commit — updates mean code changes — so you commit to your Git or whatever repository. And then, because maybe there was a mistake somewhere in how you did things, don't apply the update on prod directly. If possible, apply it on a staging or test environment, and make sure that works. Only after the process has been applied to staging successfully, and tested successfully, can you, with confidence, apply it on prod. And this is why in Drupal's update.php you've seen the language saying: make a backup of your site now. It might be that something goes wrong and you'll have a corrupt database, and you'll need to hire Dries and webchick to fix it for you, because it's going to be in an unknown state. So, better safe than sorry in this regard. Beyond the code changes that Drush downloads, another thing to run is a specific Drush command called drush updb — or alternatively, you can trigger the same code by running update.php. What that does, I'm sure most of you know, is bring the database structure — your MySQL tables and so on — up to the format expected by the new code. As code changes over time, there might be a new column in the database, or somebody might rename a variable from one name to another. If your database doesn't get updated to the assumptions of the new code, the new code will actually be buggy and not work well — it might give you a white screen of death. So the first thing you need to do after you update your code is run drush updb or visit update.php. Great. So that was minor update basics. I ran through that pretty fast because we'll show all of it in an example. Now this brings us to major upgrade basics. Well, the first thing to say about major upgrades is that there's no such thing as a basic major upgrade.
And you can probably guess the second rule of major upgrades. Yeah — major upgrades can be harder than a site rebuild. Very often, if you do a major Drupal upgrade, you'll have incompatible API changes, and your custom code won't work anymore. Or contrib modules that assumed the old API won't work anymore. Or a contrib module you used in Drupal 6 might simply not exist in Drupal 7, so you'll have to find another one, or upgrade it yourself, or build a custom module that does something similar. So you should be aware that this is a major undertaking. And even given that Drupal 7 has been around for four years, you think: I have a simple site, surely by now everyone's upgraded every one of these modules. No — we found out multiple times in the last couple of years, three or four years after Drupal 7 came out, that it's far from trivial at every step, and there's always something different that breaks. Great — so we're also talking about upgrades from D6 to D7. We put that in the description for this talk, so hopefully you're not disappointed. But I actually don't know how to upgrade to Drupal 8. Does anyone in this room know how to upgrade from Drupal 6 to 8, or Drupal 7 to 8? All right — somebody. You should stick around for questions; people will have questions for you. I'll have questions for you. But from what I understand, the core developers are right now assuming you're doing a site rebuild in Drupal 8, and for now you're just going to use the Migrate module to import your content into the new site. Maybe that's not correct, and I know they're planning to do better than that in the future. There's supposed to be a way — in the future. Yeah, so it doesn't exist yet, as far as I've heard. But lots of people are going to be scratching their heads saying, what the heck do I do? So a way will probably emerge.
But if you're following that way in the next year, you can be sure you'll have to get your hands dirty. All right, so some more major upgrade basics. To convince you that this is not trivial, here's what you have to do before you even start going to D7 — while still on D6. First (from D6 to D7, that is), you're going to have to upgrade the core version, every single core module, and every single contrib module to their latest D6 versions. For contrib modules, that can mean dev versions. The upgrade path from D6 to D7 assumes that you've run all the updates to get to the latest point in D6, so you've got to do that first — and that may take you a day or longer if you have complications. Then you'll have to de-featurize; I'll talk about that in a bit. If you've been a good boy or a good girl and you've been putting every single bit of configuration into Features in D6, you're saying: oh great, all my configuration is going to go to D7, right? No. Features are PHP code, and there is no upgrade path for Features. You'll have to figure out a way to de-featurize your site and put everything back into the DB to have an upgrade path. Then you're going to have to clean up and fix bugs, because upgrading to D7 will create lots of bugs, and you want to know: is this a bug that I just introduced, or is this a bug that was there to begin with? Then you're going to have to disable all optional contrib modules, because it's safer that way. And then you're going to try to do the upgrade — but you should know that some things will fail in the upgrade path, and then you're going to have to go and uninstall those modules. That means deleting them from your site, and deleting all their data from your site. Hopefully they don't have much.
If they do have tables or data in your database that you don't want to lose, you're going to need to write a migration — like a Drupal-to-Drupal migration — later, to capture that data back. And then you're going to disable your theme. You can't keep your main theme enabled, so you have to switch to a core theme like Garland. OK, so you've done all this. Then you can perform the actual upgrade, which involves upgrading the code of core and the remaining modules to the highest D7 versions. Then you're going to run drush updb, just like we did for the minor update, but for the major one. That may take a while, and it may have bugs, like I mentioned. Then you're going to run Content Migrate, because what you're going to figure out is that the core and contrib upgrade paths that just ran don't pick up the CCK-to-Fields upgrade — the developers left that as a contrib module called Content Migrate, and it has its own separate Drush command that you're going to need to run. And that works fairly well. OK, so then you're going to enable — one at a time, or two at a time, or three at a time — all the contrib modules that you disabled and upgraded, and then hope for the best. You're just going to iterate on seeing if they work, and then maybe patching them if they need to be patched, or replacing them if they need to be replaced. Like in Drupal 6, we used Popups API for a couple of projects for a better admin UI experience, and in Drupal 7 nobody bothered porting that, because it was so hacky and tied to Drupal 6. But in Drupal 7 there's References Dialog, which tries to do something similar, and lots of people use it. Great. So then you have to re-enable and test your custom modules. That's going to be a lot harder, because you're actually going to have to rewrite a good chunk of the code that you initially wrote, and retest all that code. Hopefully you've got good tests; I'll talk about that later.
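Put together, the Drush side of the sequence just described might be sketched roughly like this (illustrative only — the module names are hypothetical, and on a real site you'd intersperse backups, testing, and manual steps between every one of these):

```shell
# Still on D6: disable optional contrib modules and fall back to a core theme.
drush dis views_ui popups -y       # pm-disable; hypothetical module list
drush vset theme_default garland

# After swapping the codebase for D7 core plus the D7 versions of the
# remaining modules, run all pending database updates -- the big one.
drush updb -y

# CCK-to-Fields data is handled separately by the contrib Content Migrate
# module; enable it and run its migration (it ships its own Drush command --
# check `drush help` for the exact name, or use its admin UI).
drush en content_migrate -y

# Re-enable upgraded contrib modules a few at a time, testing as you go.
drush en views_ui -y
```

Again a workflow fragment against a live site, not a script to paste in blindly.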
And you should know that there's a Coder Upgrade module that does a code analysis of your custom modules and upgrades them sort of mechanically, leaving you lots of comments to guide you along the way. So you don't have to know all the API changes — it knows all the common changes you need to make. Then, once you wrap up your custom modules, you're going to restore and refactor your project-specific theme. And then you're going to iterate, iterate, iterate on all of this: testing the site, finding omissions, finding bugs, and generally getting it to a better state than when it started. Great — so that's major upgrades in a nutshell. I'll hand it over to my colleague Dave to see how testing fits into all of this. Hi. So we're going to start with some testing basics. There are a bunch of different kinds of testing that we talk about when we talk about testing in Drupal. The most common one is unit testing — a lot of modules in Drupal will just have a test that uses a fixture for input: some basic data structure as input, and some basic data structure as known output. It tests just one function or one class. That's basic unit testing. Then there's integration testing, which tests how different modules fit together, or how the site as a whole fits together. There's also UI testing, which tests how a user with a real browser would experience your site; one of the good tools that we found for that is called Behat, and we'll talk more about it later. And finally, we'll talk about continuous integration, which runs your tests in an automated manner, all the time, to make sure they're always being run. So first, let's talk about unit testing. Unit tests are great because they're fast. They don't have to touch the Drupal database. They don't have to load every single module in the site. They can just test one single function or one method. And as I mentioned, they use fixtures.
You can't use most other Drupal APIs in a unit test, so you have to put together a structure as your input and a structure as your expected output. Say you have a function that adds two numbers: you'd have the two numbers as input and the sum as the output, and that's all your test would check. It wouldn't be able to test how the function integrates with anything else. In Drupal 7, you usually use SimpleTest for this, by making your test classes inherit from DrupalUnitTestCase. In Drupal 8, they're using PHPUnit, which is a more standardized PHP way of doing unit testing. So unit tests are great, but they don't help you that much for updates — at least, the contrib modules' unit tests don't. Each contrib module has already written tests for itself; they run them every time they do a release, and the test bot runs them too. So you can be pretty satisfied that when you update Views, Views has already been tested in isolation. What you're worried about when you do an update is how it will integrate with the rest of your site. If they've added a method in Views, or changed the parameters to a method in Views, is that going to work with your site, or is something going to break? However, unit tests are really good for custom modules when it comes to updates. When you're doing a major update of Drupal, for example, you're going to have to change a lot of code in your own custom modules, and when you do that, you have to make sure they still have the same behavior as they did before. Unit tests are a great way to do that. Integration testing is more important for updates. We still have SimpleTest in Drupal for integration tests, using DrupalWebTestCase as a base class. And SimpleTest is great for this because it has a lot of Drupal integration: you can do things like enable modules, create content, and add users — and indeed, you have to do that in order to use SimpleTest correctly.
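The add-two-numbers example might look something like this as a Drupal 7 SimpleTest unit test (a sketch — the `mymodule_add()` helper and class name are made up for illustration, and the usual `.test` file registration in the module's `.info` file is omitted):

```php
<?php

/**
 * Hypothetical helper in a custom module: pure logic, no Drupal APIs.
 */
function mymodule_add($a, $b) {
  return $a + $b;
}

/**
 * Unit test: fixture in, known output out. No database setup, so it's fast.
 */
class MyModuleAddTestCase extends DrupalUnitTestCase {

  public static function getInfo() {
    return array(
      'name' => 'Addition unit test',
      'description' => 'Checks mymodule_add() against known fixtures.',
      'group' => 'My Module',
    );
  }

  public function testAdd() {
    $this->assertEqual(5, mymodule_add(2, 3), 'Adds two positive numbers.');
    $this->assertEqual(-1, mymodule_add(2, -3), 'Handles a negative number.');
  }

}
```

If the D7 upgrade of the custom module changes the function's internals, this test is what tells you the behavior stayed the same.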
The problem is that, by default, SimpleTest tests your module in isolation. It only tests your module and the modules your module depends on; it won't test how it integrates with other, seemingly unrelated modules. If you add, let's say, a new field type, it won't necessarily test how that works with Views, for example. The other option is to have your SimpleTest tests pull in all kinds of other modules and really do a whole-site test, even with SimpleTest. But that creates tight coupling, because your tests now depend on the structure of your entire site, and that's hard to maintain. Integration tests with SimpleTest are also really slow, because every time you run a test, Drupal basically does a site install: it copies the entire database and installs all the modules that are necessary. Finally, these integration tests are troublesome because you can't test things that PHP doesn't see — things like JavaScript, CSS, and certain forms. UI testing is what we found to be the most useful for this. It basically tests your site by running a real browser and controlling that browser to visit your site and perform operations. It's very powerful and thorough, because it can exercise JavaScript and all kinds of other complicated web technologies. There are a lot of examples of such tools, like Selenium and CasperJS, but Behat is the one we've chosen, because it has great Drupal integration. We find that UI testing is especially useful when you've got a legacy site where all you were doing before is manual testing. If you tried to add unit tests to that site, it would take a long time before you were able to test every bit of custom code. But with UI testing, you can cover at least a little bit of each piece of custom code in a short amount of time. So let's talk about Behat. Behat is a tool for behavior-driven development, and it has a Drupal extension that provides a lot of Drupal integration.
Behavior-driven development is basically the idea that you should write your tests first, using a human-readable language. When you write them at first, they'll all fail, because you haven't implemented the functionality yet; then you implement the functionality and make them pass. It's a great way of handing off work to other people, because if the tests pass, they've done an OK job. And if you use behavior-driven development — if you write your tests in advance — then you know your site will be ready for upgrades, because you have your tests in place. Moving on. So here's how you write tests in Behat. Here's an example where we have a scenario. We have a site with articles on it — nodes — and when you hover over a certain part of a page, you should see the author pop up in a hover. We express this using an English-like language. We describe the scenario: when you hover, it should show the author. Then we have a Given line that sets up the example. It says: given that I have certain content — and here I describe the content that I'm given: a certain title, a certain author, a certain body. Then it describes what you do: when I hover over the author region — and the browser will be controlled to do that. And finally, it has an assertion that says I should see something, and if I don't, there's a problem. In this case, I should see the text "bob". Now, the Given line, this bit, comes from the Drupal integration of Behat — specifically the Drupal API driver of the Drupal Extension. The third part, "then I should see the text bob", is a built-in part of Behat; it just knows how to do that. And the second part is a part that we built ourselves, just to show you how easy it is to add functionality to Behat using PHP. In this case, we created a method in what Behat calls a context.
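Reconstructed from that description, the scenario on the slide would look roughly like this in Gherkin (field names and values here are illustrative, not copied from the actual slide):

```gherkin
Feature: Article author hover

  Scenario: Hovering over the author region shows the author
    Given "article" content:
      | title      | author | body               |
      | My article | bob    | Some example text. |
    When I hover over the "author" region
    Then I should see the text "bob"
```

The `When I hover over …` step is the custom one, backed by the context method; the Given and Then steps come from the Drupal Extension and Behat itself.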
The method has an annotation at the top that tells Behat what kind of text should trigger it — in this case, "when I hover over" followed by the name of a region — and then it provides a function body that says what to do when that text is seen. Here, it looks up the region and tells the browser to mouse over it. Behat has a pretty complicated stack, and it can be useful to understand how it all works. There's the Gherkin language — that interesting human-readable text we saw before — which is implemented in Behat. Then Behat relies on two separate parts. One part integrates with Drupal: the Drupal Extension's drivers. This lets you do things through a side channel, without going through the web interface — you can tell Drupal to create a node, or create a user, or log in, without having to click all over the user interface. The second part, which I listed first here, is the part that controls the browser. It uses a library called Mink, which knows how to control many different types of browsers and browser emulators. Through Mink, it talks to Selenium, which knows how to control real browsers like Chrome, Safari, or Firefox. And finally, Selenium talks to the real browser and tells it what to do. Next, we're going to talk about continuous integration. Tests are great, and lots of people write them. But tests can be really slow, and developers get tired of that and just stop running them. Also, you don't always remember. Sometimes you make a really quick change, you think it looks good, so you just commit and push and think you're happy — and you might not be. There might be breakage there. So we like to use continuous integration. That's basically a system where every time you commit — every time you push to GitHub, say, or whatever Git service you use — it will automatically run your tests. You often use a build server for this.
That way, even if a developer forgets to run the tests on their machine, the build server will run them and report the results to the entire team, so nobody's individual mistake can just mess everything up. We find that for upgrades this works really well: if you have test-driven development — if your tests are written ahead of time — then this will find basically any important bug. For our continuous integration work, we use something called CircleCI. We really like it because it's a hosted service that integrates with GitHub, which we use for a lot of projects. Every time we create a branch or a pull request, it just picks it up and runs all our tests. It notifies us when something breaks, and it catches a lot of really unusual bugs that we didn't expect at all. We use it to run not just the tests on the site itself, but the entire deployment scripts as well. So if something weird happens — like a server disappearing that's not really important, but that the deployment scripts rely on — we find out. If we hadn't run continuous integration for that project, at some point in the future the site would have gone down, we would not have been able to redeploy it, and we wouldn't know why. And finally, we like CircleCI because it lets us use Docker, which we use for a lot of our projects. That lets us be sure that when we run a test in CircleCI, it's exactly the same environment as when we run it in development or in prod — so CI never finds problems that don't really exist in our site, and we never have problems that really exist in our site that CI just doesn't find. So let's talk about Docker a little. Docker is a tool that basically lets you run lightweight virtual machines, which we call containers. It makes it really, really easy to just run a copy of your site — an exact copy that works exactly the same.
You never have to worry about having a different version of something installed, or somebody having a slightly different file system layout. Everything's exactly the same. That means that if something breaks, you can just spin up the site again; if you run an update and things don't look good, you can just go back to where you were before. It also really helps because you have a consistent environment across all your setups. If one developer finds a bug, they know every other developer is going to be able to reproduce it too. You never get situations where somebody says, "hey, this isn't working," and somebody else says, "it's working for me." Everybody can find it. So here's how we integrate CircleCI and Docker. CircleCI is configured with a file they call circle.yml, written in YAML, which is a very simple human-readable alternative to JSON. We specify at the top that we want Docker to be running in CircleCI's virtual environment. Then we specify what commands we want it to run to set up our site — in this case, we just build our project with Docker; we'll talk more about that later. We run the project, and we tell it to map certain ports so we can ssh into the site. Then finally, we have the test stage, which tells it what to run on our site: we ssh into the site, and then we run, in this case, a SimpleTest run via drush test-run. Once we have this set up, every time we make a change to the site, all these tests will be run. Now I'm going to show you a little demo of using Behat, CircleCI, and Docker together — so if you were asleep, wake up for this part. We have a site that we put together, and we're just going to run it. We have a little command that runs it automatically; all it does is run, with Docker, the container that we've already built under a certain name. So then we can go to our site. So now you can see, this is the site that we have.
It's from a marriage of some friends of ours from a long time back. You can see the happy couple here. As you can see, the site has a navigation bar, and clicking on the bar will move between different sections of the page. Every time you do that, the active section of this bar changes. There's also a form at the bottom where you can register for the wedding. It's been over for a while, but I can still register. And we can see that when you register, you get a little thank-you box. So now we can look at the behat.yml that we put together, which defines how Behat will work, plus some informational metadata about our tests. So here's how the behat.yml looks. We have a bunch of lines at the top which tell Behat to pull in the Drupal extension. We have to give it a couple of URLs. One tells Behat how to talk to Selenium, because we're going to use Selenium to drive Chrome when we run our tests. Another one tells Behat and Selenium how to get to our site in the first place. We have some magic in our Dockerfiles and in our Makefiles that sets these IPs to reasonable values. We tell Behat to use a certain API driver, which tells it how to talk to Drupal. There are a bunch of different drivers: there's one that uses Drush, there's one that basically doesn't use anything, and there's this one, called the Drupal API driver, which we prefer, because it lets you do lots of interesting things you couldn't do otherwise. We tell Behat where our Drupal site's root is. And we define a certain CSS selector and give it a name. In this case, we call it menu active, and it matches the active item in this menu. Right now it points to the Our Story one. It's important to define that because we're going to use it in our tests. So now I'm going to go into the container that I just ran, the Docker container. Now I'm inside, and you can see that this looks like a Drupal site.
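For reference, a behat.yml along those lines might look something like the sketch below. The IPs, paths, and the menu-active selector are illustrative placeholders standing in for the values our Makefile generates, and the exact key layout is a guess from memory at the Drupal Behat extension's config format.

```yaml
# Hypothetical behat.yml sketch; all values are placeholders.
default:
  extensions:
    Behat\MinkExtension:
      base_url: http://172.17.0.2            # how Behat/Selenium reach the site
      selenium2:
        wd_host: http://172.17.0.3:4444/wd/hub   # how Behat talks to Selenium
    Drupal\DrupalExtension:
      api_driver: drupal                     # the Drupal API driver we prefer
      drupal:
        drupal_root: /var/www/site           # where the Drupal root lives
      region_map:
        # Named CSS selector used by the tests, e.g. "menu active".
        menu_active: ".navbar li.active"
```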
I can run behat, through a little wrapper that I set up, and it'll run our Behat tests. You can see that they're successful, they're all green, and you can see that they're being run in Chrome: it's automatically filling things out and performing operations. So let me show you what the tests look like. OK, so here's our test. We have one that submits the form. It says I should be an anonymous user; I visit the home page; I put some data in the form, press Submit, and I should see that little thank-you text that we saw before. And we have another test that checks the header navigation. When I go to the home page, then the active section of the menu should be the little home button, this thing, and Lodging should not be active. And then if I click Lodging, then Lodging should be active afterwards. So now let's start doing some upgrades. You can run drush ups, and it'll show us that some updates are available. Actually, a lot of updates. One of them that we want to look at is Webform, because the Webform release is a security update, and we don't want to be out of date. So let's update it. I'll do drush up -y. You can see that it updated the code and performed some updb steps, and now the update's finished. So we can test if it still works by running our command. And everything looks OK. Then let's try another upgrade. One of the other ones that we saw was Bootstrap, which is our theme, and we'd really like our theme to be up to date. So we can do that: update Bootstrap. Now when we run our tests again, you can see that the first test, using the Webform, is successful. But the second one fails. And that makes us ask what's going on. We can visit the site and see. Now if we reload, you can see that this is clearly not working right. It's not in the right place, the smooth scrolling isn't working, and there's no active indicator. So Behat has detected a broken update on our site. On a larger site, you wouldn't just test this live.
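The two scenarios just described would look something like this in Gherkin. The feature text, field labels, and button names are illustrative, and the active-menu steps assume custom step definitions built on the named "menu active" selector; this is a sketch, not the repo's exact feature file.

```gherkin
# Hypothetical sketch of the two Behat scenarios described above.
Feature: Wedding site smoke tests

  Scenario: Registering through the form
    Given I am an anonymous user
    When I visit "/"
    And I fill in "Name" with "Alex"
    And I press "Submit"
    Then I should see "Thank you"

  @javascript
  Scenario: Navigation bar tracks the active section
    Given I am an anonymous user
    When I visit "/"
    Then "Home" should be active in the navigation
    And "Lodging" should not be active in the navigation
    When I click "Lodging"
    Then "Lodging" should be active in the navigation
```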
Actually, let me show you something first. One nice thing about Docker is we can just reboot the site immediately from the stored image. And it's working again. Everything's fine. That's really nice when you're doing tests, because you never have to worry that you're going to mess everything up and never be able to recover. So on a larger site, you would probably not do these updates manually, right at the terminal. Instead, you would turn them into pull requests. If you're not familiar with pull requests, a pull request is basically an issue or a bug that has some commits attached to it and says, I think these are ready to go for our site. So in this case, we've submitted a bunch of pull requests which perform the updates we talked about. One of them does Bootstrap, and one of them does Webform. And CircleCI, which we've set up for the site, has annotated them with little icons indicating whether our tests were passing or failing. In this case, we can see that the Bootstrap one failed. And if we click on it, there's a little Details button. You can see first, from the diff, that this was actually the Bootstrap update we expected it to be. All that was updated was Bootstrap; it's really a Bootstrap upgrade. And then we can see what CircleCI has to say about this. Why did it say it didn't work? It has all the different steps of building the site, the same way we did make run just a moment ago. It did make run, and then it ran the tests at the end. And you can see that the exact same test failed. Because we were using Docker, both locally and on CircleCI, we had absolute confidence that a test that succeeded locally would succeed remotely, and a test that failed locally would fail remotely. They were exactly the same. Yeah, any questions about the demo? Go ahead with your question. Sorry. Yeah, so I noticed in your test cases, you had a wait for Ajax.
And in Cucumber, when I worked in Rails, there was a lot of talk about how to actually wait for Ajax. I was curious if you guys had any magic, or if it was just a sleep or wait. No, I mean, what Behat does for Drupal by default now is kind of sad and not likely to work in all cases. I think it's the kind of thing where maybe you just have to customize it for your tests. All right, anything else? OK. All right. So that was a fairly simple website that we did as a favor to a friend a couple of years back, and we thought it was a nice combination of Drupal and Docker and Behat to show you guys. And I want to add that in the slides, there's a link to the GitHub repo for that. It's all open source, so you can poke around at the Behat YAML file, the CircleCI YAML file. It's all there, even the Dockerfile for the site. Now let's talk about something a little bit more involved. We have an upgrade case study from McGill University. The site is their course calendar, which is the Canadian parlance for an academic catalog, basically. It lists all the programs, all the courses, and all the regulations that used to be printed up every year in a book about this thick. And now it's online. These are legal documents that students follow as part of their academic program. They can sue if they are not graduating when they should be according to that year's version of the program. There's a bunch of metadata and lots of cross-references, done with old-style node references in Drupal 6. Because there's so much data in there, it's a search-driven UI, and there's lots and lots and lots of data. The search-driven UI goes back to when we built the site originally, five or six years ago. It was one of our first Drupal projects, and Drupal was a very new thing. We actually pushed the envelope a little bit with having search tabs, with different facets for each search tab, and different content types and different templates.
And we also had embedded search list sections: you navigate to Arts Undergraduate through the menu, and now you have a search over just that section. That was actually not available in the Apache Solr integration module in Drupal 6. It is in Drupal 7, I think partially because of some of the work we contributed back when we did the project. So we were happy that the Drupal 7 Search API did all of our custom stuff, and we were able to throw away a bunch of custom code. Another interesting part of the site is that there's a huge hierarchical organization of every single page on the site. From a user's perspective, there are like 10,000 items in a menu-driven hierarchy, and this is consistent across the menus, the breadcrumbs, the URLs, and probably other things that I'm forgetting now. But actually, there are only a few hundred menu items in the primary links, which is the Drupal 7 main menu. The rest are in separate book menus, because all these regulations are being written by different academic units and teams with a different import workflow. But each one of those top-level books sits in the primary links, and then we had a lot of custom code that glues each book menu to its right spot in the primary links menu. That code used menu_tree_page_data(), menu_tree_all_data(), the Menu Block API, and so on. What else is interesting is the site has about 70,000 node revisions per year. There are five years, and each one is a separate Drupal database. Most of the content on the site is imported from McGill's academic ERP, called Banner (SunGard Banner). And the regulations are imported from Documentum, which is an EMC document management system. There are 15 content types, 170 field instances, and lots of node references. And we also have lots of custom input filters; for example, anywhere a course code is mentioned, it's going to get linked to the right page.
What else made this project challenging is that we wrote it four years ago as one of our first big Drupal projects. And since then, there have been four years of cruft: devs doing their best to understand what's going on, who didn't write the original code, who copied and pasted a block of code somewhere because they had a related use case and modified it a little bit. And I don't think anyone's guilty. I mean, we maybe didn't document it well enough, and maybe they didn't figure out exactly what the hell we were trying to do well enough. But after four or five years, there's a lot of code that nobody in this world could understand exactly why it was written that way. All they knew is it kind of worked. So when we were going to upgrade to Drupal 7, and we were going to break everything, we said, OK, well, we can't try to fix all this code that we don't really understand why it works. Let's refactor it in Drupal 6 first. We also had a legal requirement that everything must be correct. And another major challenge is that we didn't deliver an updated code base and a database dump. We delivered an upgrade process that would work on their production database, which was live and changing as we were developing, and that would work on the other four years too, at least with adaptation. What made the project easier than other Drupal projects is that it had very few contrib modules, 20 to 30, standard stuff, well supported. McGill's very disciplined in that way. Almost all of it is anonymously accessible. The content is imported but doesn't change beyond that. And we knew the original code, at least in some form, from some years ago. And the client was a fairly disciplined client. They had a solid development team, and they knew this was a hard project. They weren't trying to fix everything they didn't like about the site, come up with a new organization, and so on. This was just an upgrade project. But there were still surprises that we didn't necessarily expect.
I already mentioned the de-featurization that we had to do at the start; I guess we didn't figure that out until we started. We were surprised that we couldn't even get a deployment of a dev site that's exactly the same, because there were some custom modules at McGill that wouldn't run outside of McGill's servers, or that they didn't have the legal ability to share with us. So we had to work around all that, and our dev site was an approximation of their deployment to begin with. We also figured out along the way that when you run that drush updb and then run Content Migrate, it takes two days, because there are just so many fields and so many records for each field. And it's kind of a pain to have a two-day step in a build process that you're iterating on. So we worked around that. And then we discovered misc upgrade-path bugs: modules that had a broken update path, despite the fact that Drupal 7 had been out for years. So what helped? What did we resort to? First, we did the Drupal 6 refactoring in a test-driven kind of way, using unit tests. Second, we took Content Migrate, which was the main culprit for the two-day step, and we made a completely refactored version, I guess, for lack of a better word, that ran all the transformations as a single query. That was much faster, and we tested that. We used Docker, which Dave just talked about, for the build process automation. And we developed an integration testing tool called SiteDiff, which we'll talk about, that really made us confident that at every step along the way we actually had the exact same content being moved forward. So for the refactoring side of things, when we were developing this stuff in D6, we said let's clean up the code base, let's get rid of all this duplicate code. Because it was custom code, we wrote a bunch of PHPUnit tests. We could have used SimpleTest, but we were sort of forward-looking to D8. And it was OK.
It wasn't too complicated to run PHPUnit with D7. You just had to write some wrapper scripts and require some files manually, or you can even try autoloading with Composer. We used fixtures, which were really super for testing the menu-gluing functions that we had: the input primary menu, simplified, looks like this (a PHP array data structure); the input book menu data structure looks like this; glued together, they should look like this. So those were input fixtures and output fixtures. We actually used this to refactor the code and make sure that every single change we made still gave proper menu output the whole time. And the same for nodes. Because we were refactoring the code, we were able to make sure we could feed it fixtures and mock implementations of Drupal functions instead of the real thing, to really decouple it. That's called dependency injection. If you're trying to test code that uses Drupal functions otherwise, it's actually quite tricky, so it's something to be aware of. So, our build process. I don't have too much time to go into it, but we had to deploy the site in Drupal 6 and make sure it mirrors what they have in production. We had to refactor it and make sure we didn't break anything. We had to de-featurize things and prepare for upgrades, as I talked about. Then we had to run the actual drush updb and the code update. Then we had to finish that off with Content Migrate. And we had to do all kinds of crazy adjustments, which I don't have time to go into, but these are three or four scripts with 30 drush commands each, fixing various bugs and settings. So that took a really long time. Like I said, initially Content Migrate alone took two days. We pared down the production database to have only one faculty, with about 5% or 10% of the content, so we were actually doing the upgrade on a subset of the content, which really allowed us to iterate a lot faster. Still hours, but better than days.
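The fixture-and-injection style described a moment ago might look roughly like this. Everything here is a made-up sketch: glue_book_menu(), the fixture file names, and the menu IDs are illustrative stand-ins for the real McGill code. The point is that the Drupal menu-loading function is passed in as a callable, so the test can hand it a stub instead of bootstrapping Drupal.

```php
<?php
// Hypothetical sketch of a fixture-driven, dependency-injected menu test.

function glue_book_menu(array $primary_links, $book_mlid, callable $menu_tree_loader) {
  // Attach the loaded book tree under its slot in the primary links.
  $primary_links[$book_mlid]['below'] = $menu_tree_loader($book_mlid);
  return $primary_links;
}

class MenuGlueTest extends PHPUnit_Framework_TestCase {
  public function testBookIsGluedUnderPrimaryLinks() {
    // Input and expected-output fixtures, as plain PHP array structures.
    $primary  = include __DIR__ . '/fixtures/primary_links.php';
    $book     = include __DIR__ . '/fixtures/book_menu.php';
    $expected = include __DIR__ . '/fixtures/glued_menu.php';

    // Stub stands in for menu_tree_all_data(): no Drupal bootstrap needed.
    $stub = function ($mlid) use ($book) { return $book; };

    $this->assertEquals($expected, glue_book_menu($primary, 42, $stub));
  }
}
```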
Then in parallel, we rewrote Content Migrate into Content Migrate Tweaks, which made certain assumptions, later shown to be correct, that allowed us to do the whole update for the field tables one table at a time with an INSERT ... SELECT query, old table structure to new table structure, instead of one record at a time, one delta of every field at a time. So this took 30 minutes for all the fields, instead of days. And we were actually able to do a checksum in MySQL of the resulting tables, both from the full Content Migrate run, with all the hooks it allows, one record at a time, and from the simplified form. And we saw that the output checksums were the same. So we were sure that our optimizations were valid for this project; maybe for your project, with crazy custom modules, they wouldn't be. Great. And then we had some other performance improvements. Then we needed to come up with a build automation tool, a build environment that ran these builds. It had to be easy to deploy dev and testing environments that are consistent with each other, so if we saw a bug in one, we'd see it in another. We needed to have checkpoints: first for the initial deploy of D6, then the refactored D6, then D6 prepared, with all the modules disabled, then D7 upgraded. And then, for testing, we needed to make sure that at every step along the way we didn't introduce bugs in everything we saw. So we needed something that allows us to have side-by-side deployments of these steps for iteration. And we needed something that's simple, because whatever our crazy deploy tool was going to be, McGill devs weren't necessarily going to learn Chef or Puppet or this or that. We needed something that looked a lot like bash. I guess you see where I was going with this: Docker fits the bill quite well. We used a bunch of different Dockerfiles, one for each step of the process that I described.
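Going back to the Content Migrate rewrite for a second, the set-based idea looks roughly like the sketch below. The table and column names are invented stand-ins, not the real D6/D7 schema: one INSERT ... SELECT per field table, then a MySQL CHECKSUM TABLE to compare against a row-at-a-time run.

```sql
-- Hypothetical sketch: migrate one whole D6 field table into its D7
-- field table with a single set-based query, instead of one row per
-- field delta. All names here are illustrative.
INSERT INTO field_data_field_course_code
  (entity_type, bundle, entity_id, revision_id, language, delta,
   field_course_code_value)
SELECT 'node', n.type, n.nid, n.vid, 'und', 0, c.field_course_code_value
FROM content_type_course c
JOIN node n ON n.vid = c.vid;

-- Then compare this output against the slow, hook-by-hook Content
-- Migrate run: identical checksums mean the shortcut was safe.
CHECKSUM TABLE field_data_field_course_code;
```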
Each one of those Dockerfiles, when you build its image, starts with a clean Ubuntu image, then installs all the packages, configures things like vhost settings, and runs our deploy scripts, which include checking out the code base at the right branch, pulling in the database, and then doing all the refactoring and the drush updb. And they worked well. These Dockerfiles actually look almost line for line like bash scripts, except the catch is that for every single line it runs, every command at every step, it produces an internal cached image, almost like a hard drive snapshot. And it can detect if nothing changed in your build script since the last time you ran it: if nothing changed between lines 1 and 33 and only line 34 has a change, it'll use the snapshot from lines 1 to 33 as a base and continue from there, and that speeds it up greatly. So this worked super well, but we still had to experiment a lot with setting it all up, with various Makefiles; you have to leverage some pretty good Linux sysadmin knowledge, and there's more than one way to solve the same problem. So it wasn't obvious how to do it, but we got it working for us pretty well. And on the testing side of things... so that was the deployment script, and that really saved our butts. On the testing side of things, we needed to make sure that each step was correct, right? We needed to make sure that our dev D6 reflected prod in a meaningful way. We needed to make sure that no D6 refactoring problems were introduced. We needed to make sure that once we did upgrade to D7, it was just like the refactored D6. And we were thinking, what kind of tool can we use to automatically compare them? Should we be writing Behat tests for, you know, 50,000 to 70,000 pages or something? How are we going to capture all that? I guess we could have representative pages, but exactly what are we going to compare, each element, or what?
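To picture the layer caching just described, one of those per-step Dockerfiles might look roughly like this. The package list, repo URL, and script names are placeholders, not the real build; the point is that each instruction becomes a cached layer, so editing only a later line reuses everything above it.

```dockerfile
# Hypothetical sketch of one build-step Dockerfile; names are illustrative.
FROM ubuntu:14.04

# Each RUN/COPY below produces a cached layer (a snapshot of the image).
RUN apt-get update && apt-get install -y apache2 php5 mysql-server
COPY vhost.conf /etc/apache2/sites-enabled/site.conf

# Check out the code base at the right branch and add the deploy script.
RUN git clone -b upgrade-d7 https://example.com/site.git /var/www/site
COPY deploy.sh /usr/local/bin/deploy.sh

# Run the refactoring steps and drush updb. Changing only this line
# invalidates just this layer; all the layers above are reused.
RUN /usr/local/bin/deploy.sh
```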
And the main legal requirement was that this static content that existed in this version needs to be at the same URL, in more or less the same place, in the other version. So the HTML output shouldn't change very much through this upgrade, right? So we said, well, why don't we just see if we can do an HTML diff on a page-by-page basis? That's where I'm going with this; that's what we did. And we had the tool to track our update progress: we could see which pages had diffs and how far along we were. Yeah, so the tool that we ended up refining in the course of this and previous projects is called SiteDiff. It's open source; it's on our GitHub page. The way it works is it downloads a subset of pages from the before site and the after site. It can also be the same site: before, it's cached, and then you change the site, re-download, and compare. And then it computes the diff for every page. It has a way of cleaning up various changes that you don't care about, such as, for example, if you have absolute URLs, with port 8080 in the old version and port 8081 in the new version; there are some absolute URLs in Drupal, and that's going to show up in your HTML diff, right? But they're very easy to take out with a simple regex. So this tool lets you provide these regexes to quickly clean out the differences you don't care about. And then when you run this diff, it gives you a command line UI, just like we saw with Behat. But because diffs are way more complicated than Behat output, we also have a web report that's fairly useful. And what's amazing is that it took this giant upgrade, with all the complexity of the site, and broke it up into, almost gamified, the task: make these diffs go away. Is it a spurious diff? Write a regex, easy, done. Is it a real diff? File a bug and work on it and fix the bug. So it really allowed us to track the upgrade process.
And then, when we were done, prove, well, not prove, but be quite confident, that we didn't miss anything. So the configuration here: this is the before URL; here's the after URL, like I mentioned. Here are the paths. Obviously this is a simplified file; these are the paths that it fetches. And here's what a sanitization rule can look like. There are different kinds of sanitization rules. You can have a CSS-style selector that says: remove this block; it's got random garbage in it, and it's a block I don't need to diff. But this one is just a simple find-and-replace, for absolute URLs. And here's an old screenshot of what the report UI looks like. It's a little bit different now; Dave will show that in a quick demo after this. What you see is: here are the two URLs; then, for each page that was tested, the before link, the after link, if you actually want to visit them, and whether or not there was a diff. And if you click on this little diff link, you see the whole page of HTML with the diff, and mind you, we're using a nice diff library that does inline diffs, so you actually see which parts of a line changed. We also normalize the HTML so that the diffs come out clean. The advantages I've already talked about, but in particular we have caching and we have parallelization, so it's actually quite fast and quite thorough. But as we talked about earlier, it obviously doesn't pick up JavaScript errors or CSS errors, because it just analyzes the HTML. If you have workflow and webform kinds of stuff, you're going to need something like Behat to check that. I mean, this is really comparing HTML. It's useful for every Drupal upgrade, but it might not be enough if you have something that's quite a lot more complex than a static site, and the UI, for example, can't do much to help you there.
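For readers of the transcript who can't see the slide, a simplified config along those lines might look like the sketch below. The URLs, paths, and selector are placeholders, and the exact key names are our best recollection of SiteDiff's format rather than a verbatim copy, so treat this as illustrative and check the project's own README.

```yaml
# Hypothetical simplified SiteDiff configuration sketch.
before:
  url: http://localhost:8080      # the old (D6) deployment
after:
  url: http://localhost:8081      # the new (D7) deployment

paths:
  - /
  - /courses
  - /programs/arts-undergraduate

sanitization:
  # Simple find-and-replace, e.g. to neutralize absolute URLs.
  - title: strip absolute urls
    pattern: 'http://localhost:808[01]'
    substitute: ''

dom_transform:
  # CSS-style selector rule: drop a block full of random garbage
  # that we don't need to diff.
  - type: remove
    selector: '#block-random-garbage'
```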
And the use cases for it are refactorings; comparing deployments (are my dev and prod the same? they should be); content migration, even non-Drupal to Drupal; and upgrades. And we have a quick demo in the next few minutes that Dave will show. All right, hi again. So I have a different site we're going to show you this time. It should already be running. OK, so it's a pretty simple Drupal site. There's not that much here, but there are a few content types, and we can see them here. There are the basic Article and Basic page ones, and there's just a test type. So what I'm going to show now is a basic demo of how to use SiteDiff, getting started with it: how you can use it on your own site to look for differences over time, and how you can use it to detect bugs on upgrades. I'm specifically going to show it in a slightly different mode than the one Alex was talking about. Alex was talking about how you can compare one site to another site, but you can also use SiteDiff to compare a site to itself as it was at an earlier point in time, which is really great for things like updates. So I have a directory here, and I'm just going to initialize SiteDiff with this site. I have my URL, and I'm going to do sitediff init with my URL, and this will create a configuration file for me. It'll visit all the pages of the site that it can find. It crawls the site automatically, finds a bunch of pages, and for each of these pages it stores the content in a cache file. You can see it here; there's a cache file right there. It also creates a configuration file, sitediff.yaml, that we'll be looking at later. So I haven't made any changes to the site since I initialized it. So an initial sitediff diff, to do a comparison, should, you would think, show nothing. But instead it shows a whole lot of things. What's going on here? There's a whole bunch of things in a Drupal site that change every time you visit a page.
Some of them are things like the form build ID, which is basically a random string that Drupal creates every time you visit the page. We're not really interested in that kind of change. So we're going to take a look at the SiteDiff configuration file, which lets us exclude them. We have a whole bunch of things that we call sanitization rules here, and one of them is here: strip form build IDs. It was automatically added to the configuration file because SiteDiff detected that it might be useful, but it left it disabled by default, just in case you didn't really want it. So we can remove the disabled line. Whoops. You can remove that. And now if we do sitediff diff again, we can see that it's not there anymore. There are still other differences, but that one is gone. So now I can reinitialize SiteDiff with all the rules turned on, so you can see how it really should work. Now when I run sitediff diff to do a comparison, it excludes all the spurious changes, and it says it's all successful; nothing has changed at all. Now I'm going to show you how you can add a single page and SiteDiff will detect that change, just as an example of the simplest possible change that SiteDiff could detect. So here I've added a node. It's on the front page, so there's a change to the front page, and we would hope SiteDiff would detect that. So let's find out. We do sitediff diff, and it does indeed find some changes. You see that the notice Drupal gives you the first time you visit a site, that says there's been no front page content added, that one's gone. And a lot of other stuff has been added, including the node that we just added. So this was an expected change. There's nothing unusual about us adding something and that adding to the HTML output. So we can just tell SiteDiff we're happy with that, we don't think that's a change we're interested in anymore, and we're going to keep the site as it is now, with this added content, as the new baseline for future comparisons.
So there, we did a sitediff store, and that tells it to use that as the baseline. Now we can go to our site and try to do an upgrade. This site right now is using Drupal 7.35, so let's try updating it to 7.36. It's going to take a minute: Drush is going to download all the code for Drupal 7.36 and run updb, and hopefully, if the update's good, nothing should change. But if you all have been paying attention, there was a bug in Drupal 7.36 where some content types would just disappear under certain circumstances. So now, after we run the update, if we go to the front page everything looks OK, but if we run sitediff diff we find that there's a lot of stuff that's changed. This is really hard to read, of course, because it's a giant command line diff. So we have a web UI that we can display. I'm going to launch that now by typing sitediff serve, and we get this little web report. It tells us that several pages have changed, like node/add and node/add/article, and we can see what changed. So for example, for node/add/article we can click on a diff, and it displays the whole HTML of the page, pretty-printed so it's easy to read, including a diff. In this case, we can see the page is still there, but Lorem, the content type that we had before, has disappeared; it's just not there anymore. It can be a little hard to read these diffs, so we also have a side-by-side display. If you click this link that says both sites, it shows the old site as served from the cache (it no longer exists on a real web server, but we still have it cached in SiteDiff) and the new site as it currently exists. You can see that Lorem existed in this sidebar here, but it doesn't exist here. So even though your homepage looks exactly the same as it did before, SiteDiff detected a change that would have broken your site. And yeah, that's basically our demo. We hope you try out SiteDiff, and we hope your upgrades go well.
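Pulling the demo together, the whole loop boils down to four commands. These are the subcommands used in the demo; the URL is an illustrative placeholder, so treat this as a sketch of the workflow rather than authoritative documentation.

```shell
# Crawl the live site and cache every page as the baseline.
sitediff init http://localhost:8080

# After any change (content edit, sanitization tweak, module update),
# compare the live site against the cached baseline.
sitediff diff

# Happy with an expected change? Re-cache it as the new baseline.
sitediff store

# Browse the differences in the web report instead of the terminal.
sitediff serve
```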
All right, thanks very much guys, and we have one minute for questions if anyone is brave. Hi. So let's say there's a site that's really not going to upgrade from Drupal 6, and we know it's going to go unsupported. What should our strategy be at that point? Should we just keep an eye on the forums and see if there's a security update that we could patch, and just keep that site alive for its remaining life? That's a great question; I'm not sure. I'm actually not sure what's best. I think the Drupal community expects you to upgrade. That's why they pushed back the end of support for security fixes for Drupal 6 to three months after Drupal 8 comes out. Originally it was supposed to be the day Drupal 8 comes out, so you do have three months to upgrade or figure it out. Or panic, yeah. So that's a great question that the community, in my opinion, doesn't have a satisfying answer for. Hi, I always find myself manually doing minor upgrades for plenty of sites. So what I thought is: maybe fork a branch of each website, apply the update automatically, and then merge it if the tests succeed. Do you recommend that? Do you see any problems with that approach? Sorry, I actually couldn't hear you. Maybe you can speak louder? Yeah. Yeah, I always find myself manually doing minor upgrades to a lot of sites. That's drudgery, actually. So do you foresee any problems with automating this whole thing? I think with tools like this, it becomes easier to imagine how continuous integration can give you some peace of mind: if you have an excellent set of Behat tests, or if a tool like SiteDiff can give you peace of mind. So if your tests are really good, perhaps you can do it. There's something called continuous deployment. There are hosted tools; one I've tried recently, called Cloud 66, integrates with GitHub and with your CI testing in Travis or Circle. By the way, I don't have automated tests for all my sites. You don't? Yeah, so you do? No, I don't have it.
You don't? So, in my opinion, until you have good test coverage, you can't really automate that. But once you do, that's called continuous deployment, and that's an emerging trend that's really exciting. Thank you. Are there any more questions? No? All right, well, thanks very much everybody. Please find us tonight, or just after this, and we'll be very happy to chat about this with you. Thanks a lot for coming to our late session.