 Okay, so I'm gonna ramp back up. So yeah, so if I'm the developer on a project and I start talking to the client and it looks like it's gonna be an interesting project and everything sounds great and then all of a sudden I hear this from the client. We think we have a way to save some money. You build the site, we're gonna take care of the content. Just don't worry about that. So the schedule says that the site's gonna be finished but just empty maybe two weeks before the site goes live and what could go wrong? So you build this beautiful site and everything's in place, you have all your blocks, all your views, you have mock-ups and IA, everything's there, all that's missing is content and then this happens. The content that the client has in mind just doesn't always fit what was planned for and how they described the site to you at first. Often clients don't understand that when they want images they don't have to put them in a body field, they can put them as actual image fields. Sometimes they think that you should put links to things instead of using entity references. Sometimes you just don't know the size of text that you expect, you build your view expecting that you'll have text 100 characters long and it ends up being practically a novel and everything just looks really bad and unresponsive. There's so many different things that can go wrong with content and that you have to plan for when you, early on in the project, if you wait until two weeks before it's done it's just, it's not gonna get finished on time. So we think we have a solution to that and I'm gonna show you an example of how we use this solution. We had this project we were doing for Trent University and when we started out doing this project we told them instead of waiting until we have a site built here's a spreadsheet with a bunch of different columns for the sort of thing that you're gonna be putting on the site for one content type, in this case it's for programs. 
So we just throw this up in Google Docs and we tell them, fill out the spreadsheet. Obviously we have to talk to them first and figure out what we think their information architecture is but once we're done, instead of waiting until the site's built we go and do this and we tell them, we give them a deadline when they have to do this by that's way, way before the site is done. They have to have this filled out maybe five weeks into the project. But then to make sure they're not just wasting their effort, we don't wanna collect their data and then have them have to type it in all over again once we're finished the site. So later we use migrate and we turn these things into nodes that look like beautiful Drupal nodes and aren't just spreadsheets. And so this basically saves everybody's time. The client gets to work on their content at the start of the project. They get to make sure that they know what their content really is and if there's something that they were wrong about or something that we were wrong about when we were doing the analysis we find out really early because they start filling in the spreadsheet and they say this isn't what we expected to be here. The client gets to use the tools that they know. Often you have to do Drupal training when you have a client who's new to Drupal but pretty much anybody can use a spreadsheet. So at least for getting the site up and running this is way faster than waiting until the site is done and then training the client. As I mentioned you can check if the IA is correct and also important is that you don't have one site that's sitting up on a server somewhere that has all the content and if there's a bug there your developers have trouble figuring out what's going on. Each developer can have their own copy of the site they can just migrate the content in from the same spreadsheet and have the exact same content as every other copy of the site. 
They'll be exactly the same but they can work on it on their own workstation. And you don't have to use Google Docs to do this you can use CSVs, there are tools like Gather Content which some other folks talked about yesterday I think. And this is the kind of thing that this talk is about. It's not about necessarily what you think of when you're thinking about migrate but it's about how migrate lets you have content workflows that are better. So before I go on further I have to talk a little bit about migrate so people know what it's about. So it's about migrate, who here has used migrate? Okay and who's heard of migrate? This is important. Okay that's good, it's a lot of people. So let's talk about what migrate is. Migrate's a system, it's built into Drupal Core now. It's an experimental module in Drupal Core but as experimental modules go it's pretty stable. I would not warn you away from it. And it lets you import data into Drupal and the data can come from pretty much anywhere. It can come from databases, from CSV files that are like spreadsheets. It can come from JSON files from basically any data source that you can generate. You can migrate stuff into Drupal. And in Drupal it turns into basically anything you want. It can turn into nodes, it can turn into users, it can even turn into config if you wanna set up a content type based on something from an external site that you can do that. It's part of Drupal Core so it uses all the same plugin systems that Drupal uses so it's super flexible. And then one thing that's really important is to realize what migrate isn't because people hear the word migrate. It's one of those words that Drupal uses that can mean a lot of different things to people but Drupal uses it in a specific way. When people say the word views that can mean lots of stuff but in Drupal a view means one thing. And so migrate is not certain things. Migrate is not about taking a module that works on Drupal 7 and making it work on Drupal 8. 
We usually call that porting, not migrating the module. And secondly, migrate isn't just about upgrading. A lot of people think like, oh, migrate, that's when you take a site that's Drupal 7 and you get the data into Drupal 8. And migrate does that but it does a lot more too. So I'm not gonna just tell you what migrate is here. There's lots of talks that give you a tutorial on migrate. Evolving Web has a blog series that we've written about it so if you wanna learn the basics of migrate and sort of move forwards from that into getting more at an intermediate migrate level, that's great but that's not this talk. This talk wants to talk not just about how you do a migration or how you start doing migration but why you care about doing migrations. When do you wanna do a migration? We only talk about upgrades from Drupal 6 and Drupal 7 or taking some data from a legacy site like that kind of legacy site and moving it to Drupal. But migrate does tons of other things and it does things that you don't think of as being at all related to migrate. It enables you to have these new workflows like the one that I just talked about where we're collecting data in a spreadsheet and migrating it in. This is a brand new site, right? There's no migration coming from a legacy site. There's no migration coming from Drupal 7 in this example. Instead, it's a migration that you're setting up to make your content workflow better, to make your project manager's life easier so they don't have to go and yell at the client to get their content in shape two weeks before the site's ready. And so that's why I think that you don't just migrate things that, yeah, that sound like legacy. You wanna migrate all the things or at least a lot of the things. There are also some cases where you really don't wanna use migrate and I'm gonna talk about that too because I don't want anybody coming out of here being really disappointed about how something didn't work. 
So we just talked about this workflow before on that Trent University site, which I call the content first workflow. That you want your content to be really early, you want it to be complete or as complete as you can get it before you've built the site. But now we have to talk about how to actually implement this. And the good news is that this is super simple. But I am gonna have to explain a little bit more about how migrate works. Just a little bit. So a migration in Drupal has three parts. There's a source. And the source is some code in Drupal that is responsible for taking data from something, from anywhere. It could be a CSV file in this example, but in other examples, it could be a database or some external site. And it turns it into a bunch of rows, like basically like a spreadsheet table. And then there's a destination on the other side. And the destination is responsible for taking a bunch of rows and turning them into something in Drupal. So it says the destination of this data is gonna be users, let's say. And then you'll notice that if you look at these two little mini spreadsheets, these two sets of rows, they're kind of organized differently, right? In the first one, there are two columns for names, for first and last name. And in the second one, there's just one column where they're combined. And so something has to make those match, something has to make sure that the stuff that comes from your source can fit into your destination. And that's what the process does. The process defines how things are combined, how things are changed around so that everything fits together. And when you're doing migrations, this is just one migration, but you'll often have many migrations, maybe one for each content type, and then one for users, and then a few for taxonomy terms. So on a complete example, you might have five or six different kinds of migrations, different migrations that you're running around. 
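To make the three parts concrete, here is a minimal conceptual sketch in Python. It is not Drupal code and not how Migrate is implemented; the sample data, the field names and the three function names are all made up for illustration. It just shows a source producing rows, a process step reshaping them (combining first and last name into one field), and a destination saving them.

```python
import csv
import io

# Hypothetical sample data standing in for a spreadsheet exported as CSV.
SAMPLE_CSV = """first_name,last_name,email
Ada,Lovelace,ada@example.com
Grace,Hopper,grace@example.com
"""


def source(csv_text):
    """Source: turn raw data from anywhere into rows (one dict per row)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def process(row):
    """Process: reshape a source row so it fits what the destination expects."""
    return {
        # Two source columns are combined into a single destination field.
        "name": f"{row['first_name']} {row['last_name']}",
        "mail": row["email"],
    }


def destination(rows, store):
    """Destination: save each processed row; a plain list stands in for Drupal users."""
    store.extend(rows)
    return store


users = destination([process(row) for row in source(SAMPLE_CSV)], [])
```

In a real project you would have one pipeline like this per migration, so one for each content type, one for users, and so on.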
When you use migrate, you also use a couple of contrib modules that are really important. The core of migrate is in core, but you can't really make the best use out of it without some contrib modules, which were both written by Mike Ryan, who also gave me the t-shirt. So really, I thank him very much. Migrate Plus is a contrib module that lets you define migrations with YAML files. You don't have to write any code to do that. You can just go and put some YAML files together, and I'll show you a very brief example later for how that's done. And then there's Migrate Tools, and Migrate Tools just lets you run a migration using Drush instead of having to write some custom code to do it. So pretty useful. So I'll go show you the world's shortest migration, and this is the kind of thing that we would use for a simple example with a bunch of pages, let's say. If we were gonna migrate some pages, and each page has a title and a body, how would we do that? So a migration has basically four parts here. There's just the basic metadata. This is a YAML file. This is how you actually define a migration. This is a real migration that really works and is useful. This is all it takes. So it has an ID and a label that just lets you identify it. That's pretty simple. It defines a source. So remember, a source says where the data comes from. This says the type of the source, which we call a plugin in this case, is a CSV. The data's coming from CSV. The CSV file lives at a certain path. It has a header row, and then we have to define how to uniquely identify each row to make sure that we don't accidentally migrate something twice. So we say that the title is unique in this case. That's not always the case. It's pretty normal on a Drupal site to have multiple pages with the same title, in which case you'll have to choose something else to be your unique data, your unique identifier. 
You have a destination, and that just says where the data is gonna go in Drupal. So these are gonna become nodes, and they're gonna become nodes of type page. And then you have the process, which just defines how you turn the data from the source into data that goes to the destination. So the node title comes from the title column, and the node body comes from the body column. It's pretty simple, and this works, and this is the kind of migration that my colleague Suzanne wrote for the Trent University project. And she does mostly site building and theming. She's not a backend developer, and she was still perfectly capable of doing this and making it work on that project. So let's talk again about what we gain. We have all these things where the client can write content early. The client can use the tools they know. We get a chance to check if our architecture is correct, and we get a better environment for developers, and we do all this without much work, right? With just a little 15 line file in this case. But there's still some things that can go wrong. First of all, you only wanna do this when it's worth it. If you have some really complex data model where you have hundreds of fields, I mean, I don't know, you have a problem with your site if you have hundreds of fields in one content type anyway. But if you have that for some reason, you might not wanna write a giant migration that has to deal with that. And especially, let's say you have a special landing page type that you only use for the front page, maybe don't migrate one page. That's not worth putting your developer's time into or your time if you're a project manager or site builder. And second, we're used to Drupal. Drupal gives you the ability to do lots of really cool things. It can have images. It can have rich text. It can have related content. So things like entity references. Those are really hard to put in spreadsheets. We've seen people who try it. 
It can get a little awkward. You start having like a spreadsheet, and then a folder full of images or something. It doesn't work very well. And so this specific model of using a spreadsheet, the specific workflow is too simple for some sites. And for some sites, it will limit you and you don't wanna use this model for those sites. So let's talk about a potential solution to this because in this example, we're getting data from some source and our source is a spreadsheet and we're putting it into Drupal. And spreadsheets aren't powerful enough. They don't have certain things we want like rich text and images. But what does have that? Drupal has that. Drupal already has a great UI for content management, for rich text, for images. So maybe let's not take our data out of a spreadsheet. Let's take our data out of Drupal. We can have two separate sites. Our real site, which is gonna be the final site and looks beautiful and is where the client sees all their data. And then another site that we call content staging that we only use to collect content in advance and it can collect complicated content. You can collect content that again is rich text or content that has images and then instead of migrating from the spreadsheet, we can migrate from a Drupal site. So this has advantages and disadvantages compared to the other workflow. If you have a client that doesn't know how to use Drupal, you're gonna need training to put stuff in Drupal. But as I said, it also lets you have more advanced types of content. So you have to consider those when you're choosing this. But there's a few things that have to change here that are more complicated than in the very simple 12 line migration I showed you before. So one of these is the data has to come from somewhere. A spreadsheet, it's pretty clear how that's structured. It's pretty clear how you can get data out of it. But how do you get data out of Drupal to put in another Drupal? 
And another thing that's really complicated is how do you deal with relationships? If you have entity references, you can't just have, let's say you have terms. Terms are entity references now in Drupal 8. You're gonna import some terms and tags and you're gonna import some nodes that reference those tags. But you don't just wanna import them separately. You wanna connect them after you import them, right? You wanna make sure that your nodes that have tags point to the tags you just imported, not to some tags from somewhere else. So let's talk about those problems. First of all, you've got where do you get the data from? So the answer is this is built in now in Drupal 8. We have a REST module that comes with Drupal 8 and you can just make a view of type REST export and it'll go and generate some JSON for you, which is just a format that lets computers read data, essentially structured data. And then this JSON is perfectly readable by migrate and you can just import it. So this is not a problem anymore. It used to be, but it's solved now. Relationships are a bit more complicated. So here's an example of related data. We have a list of companies, or maybe nodes, and then a list of users, each of which has a company. And we wanna not just import those, but we wanna connect them. And this gets hard because we connect things in Drupal by IDs, right? In this case we'd say that I am a user and I point to a certain ID, a company with a certain ID. And we don't know what these IDs are until we're done with the migration, right? Until we're done migrating the company. We don't know which ID it's gonna get because maybe you created a bunch of other things first. So you can't just add a really simple line to do this. You need a more complex solution, but migrate comes with something to do this built in. I showed you before that the process plugins let you do mapping. 
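Here is a small conceptual sketch in Python of why the IDs are a problem and what kind of bookkeeping gets around it. This is not Migrate's code; the function names, the sample companies and users, and the starting ID are all hypothetical. The idea is just that the first migration remembers which source value ended up with which new ID, and the second migration looks the ID up instead of guessing it.

```python
import itertools

# IDs are handed out by the destination. We start at 9 to pretend that
# eight other things were already created on the site (an assumption
# for illustration only).
_next_id = itertools.count(9)


def migrate_companies(company_names, id_map):
    """Create companies and record source name -> new destination ID."""
    companies = {}
    for name in company_names:
        new_id = next(_next_id)
        companies[new_id] = {"title": name}
        id_map[name] = new_id
    return companies


def migrate_users(user_rows, company_id_map):
    """Create users, swapping the employer name for the ID the company got."""
    users = []
    for row in user_rows:
        users.append({
            "name": row["name"],
            # Look the ID up in the earlier migration's map; None if unknown.
            "employer_id": company_id_map.get(row["employer"]),
        })
    return users


company_ids = {}
companies = migrate_companies(["Evolving Web", "Acme"], company_ids)
users = migrate_users(
    [
        {"name": "Alex", "employer": "Evolving Web"},
        {"name": "Sam", "employer": "Unknown Co"},
    ],
    company_ids,
)
```

Note that the companies have to be migrated before the users, otherwise there is nothing in the map to look up yet.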
So I showed you that if you have a first name and a last name, but you wanna stick them together, a process plugin can do that. And so this is what we call the concatenation plugin, which lets you take two things and stick them together into one thing. This is a very simple one, but there's a whole bunch of other process plugins. And there's a great list at the URL over there, which is really the only place to find good information about this. So I would encourage you to check it out. And there's a special process plugin that is about relationships. It has the very confusing name of migration. I think we're working on changing that. But what it basically says in this case is when you're looking at the employer field for each user, instead of just mapping it directly as a string, as like Evolving Web as text, look at what it says, go look in another migration, in this case in the companies migration, and then see what the result of that companies migration was and use that as your value. So if the companies migration said Evolving Web had ID nine, then use the ID nine as the real value. It's a complicated process, but it really just works. And when you do this, it lets you map relationships. So how does it work when we put this all together? So we have a Drupal editing experience that is all the nice things we like about Drupal. It has images and upload fields. It has fancy bodies. And this is what we did on the Evolving Web site actually, I forgot to mention this. On our own site, when we built it, we used a process like this where we collected all the data in Drupal in a different Drupal instance. So the different Drupal instance had a normal Drupal editing UI, but it wasn't a nice site. It was just a plain Bartik site with nothing special, just to let you collect the data. All it had was content types and a few views. From those, we used the REST export module to export a whole bunch of JSON. 
And then the JSON all turned into a beautiful node in the end. So we were able to collect our data, in this case, our own data, and do it right at the start of our project and end up with a beautiful site in the end without having to wait until the site was ready before we got our content there. This was super important for this particular project because we built our site, I think, two months after Drupal 8.0 was released. There was a lot of learning involved, a lot of things that we didn't know how exactly they would work out. And by doing it this way, when we wanted to change something, we just could. We had some fields, for example, that let you select which level a training course was in. So it could be beginner, intermediate, or advanced. And then at a certain point later, we realized maybe we wanna have a list of all the intermediate courses. Maybe we wanna have some text that goes with a description of what it means for a course to be intermediate. So this shouldn't be a select field. This should really be a taxonomy term. And we were able to just have our migrations take care of that difference. Instead of migrating select fields, we just migrated taxonomy terms. And we had enough time to change this instead of getting all our content in and then having to change it all on the live site and being left with some crufty field that we didn't want anymore. So yeah, that's the content first in Drupal workflow, the second one. It lets you gather content early and lets you use the Drupal UI. There are some things that are complicated. First of all, if you're collecting content on this plain old Drupal site that lives on some server somewhere, you probably want it to be protected. You don't want random people showing up and finding out what all your content is before you've actually published it. But then migrate still has to be able to get to that content even though your site is protected. Fortunately, this is supported now. 
You can use the Shield module to protect your site with a username and password for all content on that site. And then Migrate Plus contains an authentication parameter now that lets you just tell your site when you're going to get this JSON data, here's a username and password. Some things are more complicated. We are, as I said, in Montreal, so we have to deal with translations. And when you deal with translations, you have to kind of connect them all together and it gets a little more complicated. You generally need two migrations to do it. I'm not gonna go into what's involved with that, but we have a series of blog posts that my colleague, Jigar, at Evolving Web has published and so I encourage you to check those out on our blog. And then there's some things in Drupal that are hard to export with this REST export system. You can't export menus easily and you can't export custom blocks easily or rather you can export them but you can't do anything with them really. So if you're gonna do these and try to put them into this content workflow, you're gonna have to do some custom work or maybe that's something that we should try and get into a contrib module someday and make it easier for people. But then there's a couple of reasons you might not wanna do this workflow at all even aside from the complications. First of all, your customer, your client is gonna be using this site that is a plain old Bartik site probably. It's not gonna look pretty and clients like pretty things. You might not wanna show them that unless they're prepared for it, unless you can make them aware that what they're seeing now is different from the final site and don't worry and here's what the final site looks like. Thankfully migrations can run pretty fast. You can show them at least the beautiful theme that you're in the middle of working on but you have to prepare them. And secondly, you don't wanna keep migrating all your content every time once your site goes live, right? 
Once your site is live or once it's even close to being live, once you're prepping it and you're getting final acceptance testing, you wanna stop doing the migrations and you wanna just edit content on the site then because otherwise it just gets too hard to fix little tiny things. I showed you how easy migrations can be but there are some gotchas. Sometimes there's a migration that's hard to make 100% correct and that's okay if you're stopping before you go live because you can just fix it then. You just make a list of things you need to fix and you fix it and it's much easier than trying to write code that's 100% bug free. So that's the second workflow, content first in Drupal. Now let's talk about something that's completely different. Sometimes Drupal is not the right tool for managing some content. Drupal, again, we love Drupal, it has so many great tools but we've seen people have to manage a membership management system which integrated with all kinds of other tools in their organization and we don't wanna write integrations between Drupal and all these other tools. We wanna let them keep using their system. Some people have image management systems that are designed so their photographer can just plug a camera into something and all the images go somewhere and we don't wanna write something like that. Sometimes there's specific tools that are useful for a certain vertical so if you work at a university you probably have a better system for building your course catalog than putting everything directly into Drupal. And right now when we build Drupal sites we do a kinda bad job of dealing with this. We tell people we'll just copy your content into Drupal somehow and then keep maintaining it in two places. That's not really friendly. 
Or sometimes we try to write API, we try to use external APIs and we tell people we'll just use the external system and every time somebody makes a request to your Drupal site we'll just make a bunch of web requests and everything will be really slow but that's cool, right? We think there's a better solution and we think that's to keep using the external system, yeah, but get the data in Drupal too and then keep them in sync. And so this is just like with mirrors of software. You have Drupal has its software live on Drupal.org but there are other sites that hold a copy of that all over the world. And so yeah, we call it the mirror system because it's like that. And it might sound a little crazy because Drupal's a content management system and we're telling you for some things don't manage your content in Drupal. But it's not so crazy. We already talk about Drupal having a lot of different layers. You can do your rendering, your permissions, your custom code and your content. And we sometimes analogize that to a human body and we say, you know, we don't wanna do the rendering in Drupal, we wanna do the rendering in some kind of JavaScript. And we call it headless Drupal and this has become an accepted thing even though Drupal's great at rendering cool content and making it look good. So for our mirror workflow, we're just kinda doing the opposite of that. We're cutting off the content part. And that's okay, Drupal's still great at permissions and Drupal's still great at rendering so I guess it's footless Drupal. And so we did this sort of thing on a project for a site called the Council for Responsible Nutrition. And so for these guys, they had some external system that just gave them a giant CSV dump of content. We built them a little form where they could upload their file. It would go, they'd submit it, it would give them a little report. Hey, we've created a bunch of users, we blocked a bunch of users that no longer subscribe to your member management system. 
And then when somebody would visit the member directory, let's say, it would just list all the users and this is just a normal view. You can't do a view over external content very easily but because we synced it into Drupal, we can do that and our site builders are happy. Again, there are some complications. These migrations now are not the same as the migrations we were doing before that we're starting at zero and migrating all the content. Now these are recurring migrations. You migrate once and then again on top of the site that already has users and then again and again. And so this has to do some things differently. If you've got 20,000 users on your site, you don't wanna recreate or edit or save even every user because it'll take forever. You wanna change only the ones that have changed. If there are things that disappeared in your source, if there are users that no longer exist in this case, you wanna get rid of them somehow, either block them or delete them. You wanna be able to trigger this with migrations. Normally you trigger them with Drush if you're a developer but in this case, the client has the site, they're not gonna run Drush, so we have to make a UI for them. And finally, we have to be really careful about mistakes because if you accidentally delete every user on your site, it's gonna be bad. Yeah. You've got a backup. Yeah, it's true, we do have backups but still, the client's not happy when we have to do that. So the first of those problems is really easy to solve. If you only wanna update things that have changed, you just add one line to your migration YAML file that says track changes and it just takes care of it for you. Thanks, Migrate Core. For triggering migrations without Drush and doing it from the UI, it's a little more complicated but this is where it's important to realize that migrations aren't magic. 
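As a rough illustration of what a recurring migration has to figure out on each run, here is a conceptual Python sketch. It is not Migrate's implementation of track changes; the function names, the sample users and the choice of email as the unique key are assumptions made for the example. It remembers a hash of each row from the previous run, then sorts the new rows into create, update, unchanged and missing, so only the changed ones need saving and the missing ones can be blocked or deleted.

```python
import hashlib
import json


def row_hash(row):
    """Fingerprint a row so a later run can tell whether it changed."""
    encoded = json.dumps(row, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def plan_sync(source_rows, previous_hashes, key="email"):
    """Compare this run's rows with the hashes remembered from the last run."""
    plan = {"create": [], "update": [], "unchanged": [], "missing": []}
    seen = set()
    for row in source_rows:
        row_key = row[key]
        seen.add(row_key)
        if row_key not in previous_hashes:
            plan["create"].append(row_key)
        elif previous_hashes[row_key] != row_hash(row):
            plan["update"].append(row_key)
        else:
            # Nothing changed, so there is no need to save this one again.
            plan["unchanged"].append(row_key)
    # Anything we imported before that is gone from the source now.
    plan["missing"] = sorted(set(previous_hashes) - seen)
    return plan


first_run = [
    {"email": "a@example.com", "title": "Developer"},
    {"email": "b@example.com", "title": "Project manager"},
]
previous = {row["email"]: row_hash(row) for row in first_run}

second_run = [
    {"email": "a@example.com", "title": "Lead developer"},
    {"email": "c@example.com", "title": "Tester"},
]
plan = plan_sync(second_run, previous)
```

Because the plan is computed before anything is saved, it is also a natural place to stop and check for surprises, like every single user suddenly showing up as missing.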
Sometimes when you learn migrations, when you look at a tutorial, let's say, it just tells you, hey, you create this YAML file and a migration just happens and your content gets into Drupal. Wow, and it's great. It is really cool that it happens as if by magic but the different parts of Migrate, they're just Drupal plugins and Drupal interfaces and all the other normal things that are part of Drupal. If you have a developer who can write Drupal modules, they can also customize your Drupal migrations and they can write code that uses the migrate system. So here, this is all the code that was necessary to put in our form submission hook to make the migration happen. It's not a lot, it's not impossible but people sometimes act like it is so I encourage you to look at the interfaces that are part of Migrate and just try using them. So another problem is if you're uploading a file, now the migration's not coming from a static place, right? When you upload a file into Drupal, it gets put somewhere in the files directory and it has a different name now. So we have to tell Migrate, don't just do this migration from a static YAML file but be able to change some parts of it. In this case, be able to change the path where the data comes from. So again, that's not too hard to do with the tools that Migrate gives you. In this case, we just look at our form, we get the URI of the file, we get our migration and we stuff a path into it. It's not terribly difficult. It's something that we need more documentation for because it's not obvious to people always how to do this. If you are having trouble figuring out something like this, there's a lot of people in the Drupal Migrate channel on IRC who are always happy to help. At least I know I am, I don't know about the rest of them. More importantly is telling the users what's gonna happen when they do this. Because when you build a UI, it's not about just letting somebody upload a file. 
You also have to make sure that they realize what they're doing. If the file somehow is slightly corrupted and it's gonna remove all their users, you wanna tell them that. If they somehow have a thousand new users and they're not expecting to have a thousand new users, that's a problem. Maybe they had a bug in their export system and every single user's job title changed, if that's in their site. They should know about that. So, again, you can talk to Migrate. In this case, we can talk to the source and ask it. So what do you have that's different for us? What do you have that's changed? What do you have that's deleted? Migrate doesn't contain this already. It would be a good idea if we could get this in core or contrib. But the script that I linked to here is about 60 lines of code, I think. It's not terribly complicated. And in this case, I wrote a little dry run script that just tells you, hey, if you do this migration, this node is gonna be updated, this other one's updated, this one's new, this one's deleted. And so that's not too hard to do. And then what's cool is that once you know what's gonna be deleted, well, you can just delete them, right? You can just take care of these users or nodes or whatever it is that's going away based on the external system. So it's not hard to add a hook that does that, but it's really, really important that you're careful when you do this. As I mentioned, you have to validate your data and make sure it's the right data. It can be really useful to have a hook that runs before your migration runs that checks, if it's a CSV file, does it look like a CSV file? If it's something else, don't just treat it as an empty file, give the user an error that says, this is not the right thing, you should go and talk to your development team. So if you're validating your data, it's really important that you're able to debug problems with it. So a couple of useful ways to do that. 
There's a module called Migrate Devel, and it just prints every row. If you're running a migration from Drush, it'll print every row before it's migrated. So that lets you see what's going on after your source has given you data and after all the processes have run. There's also tables in the database that help you figure things out. Migrate doesn't always do a great job of telling you when something was skipped, why it was skipped or what happened, but the information will be in these two tables. And finally, by far the best way to do this, really, is to get a developer who can use Xdebug and have them step through the different steps of the migration. The code's not too complicated and I think people should be able to follow it.
Okay, we talked about all these problems. What do we get from all this? First of all, the client gets to use the tool that they like, right? If they're used to managing their courses in a course management system, they don't have to find some kind of not perfect way of doing it in Drupal. They get to keep using their system. And then we also don't have to do any of these weird API integrations. Like, there are modules that do that, like External Entities, they're pretty cool. But, I mean, every time you access stuff, you're gonna be doing really slow API calls. You can't do things like Views most of the time. You really wanna have the data live in Drupal if you're gonna have your other Drupal developers working with it.
And then, again, there's a bunch of reasons why you might not wanna do this. I mentioned that you have to validate your data. And of course, when you're doing this, data will be overwritten. So let's say you're migrating your users and you let users edit their job titles on the site. When you do the migration, maybe they edited their job titles on the remote external system as well. And they didn't do it the same way. And they differ now. So you can't allow that to happen.
You can't allow there to be two states that have both been edited and differ. So, generally when you do this, you will turn off the edit forms, essentially disable the edit forms for those content types in Drupal. Or at least for the fields that you're migrating. I also really wanna caution everybody not to do bidirectional sync. Sometimes clients really want this. They ask you, can I just change the data in Drupal and change the data in an external system and get them to talk to each other and figure out what's better? We tried to do this on a project many, many years ago and it was the worst project Evolving Web ever had. You really don't wanna try it.
So this is the sort of generic concept of how you do one of these mirrors, with the pretty simple source I talked about this time, which is users in a CSV. And, I mean, it's useful. It's still really good for this particular client. But it's not necessarily what gets a client really excited about their content workflow, about using a different and better workflow, in this case in Drupal. But we do have an example of a client that had something that they really wanted. In this case, we were working on a site for the AllSeen Alliance and they had a bunch of developer docs, documentation that their developers were writing to tell other developers how to integrate with their systems. And of course, these are developers, so they didn't wanna go write their data in rich text. They wanted to write it in Markdown in a Git repository. It sounds really weird to a lot of people. Why would you use that for content management? But for a developer, this is great. This is exactly the way you wanna manage your content. It lets them use all the power of Markdown and Git. If they have external Markdown editors, they can use those to edit stuff. They don't have to rely on Drupal's editor.
Drupal's Markdown editor is, I mean, I hope the person who developed it isn't here, it's okay. But there are custom Markdown apps that are really, really good. And Git lets you do all these things that you can't do on Drupal content. So here's an example of what their data looks like. They have a big repo that has a whole bunch of different files, all Markdown files in this giant structure. And they can do all these Git things on it. They can ask, tell me the last few changes to the entire site. They could even change multiple articles at the same time and have the change happen at the very same time. You can't do that in Drupal. You have to change them one after the other, right? So they get the capability to do a lot of cool things that they wouldn't be able to do in Drupal. But after our migrations touch it, it looks like a nice Drupal site and developers don't have to go and clone some weird repository to read the docs. They get to read it in a beautiful Drupal site with the menu, with formatting, and just all the things we like about Drupal.
So what were the complications in getting this to work? What did we have to build to let the client make this work? So first, in this case, we're not migrating from one file, from one CSV file, we're migrating from a whole giant directory of files. And each of those files becomes one node, right? Each of those is one row from the source. So how do we get those all? Well, again, in Drupal, everything's a plugin and that's the same thing in Migrate. So you can just write your own Migrate source. Again, it sounds intimidating when you start, but it's really not that hard. So here is a complete Migrate source that basically does that. I don't expect you to all read through this and understand what it all does, but the point is it fits on one slide. It's not something that you have to learn a ton of stuff before you do.
It's something where you can just look at one of the existing sources that's in the Migrate module or in one of the Migrate-related modules, like Migrate CSV, look at what it does and build something that does something slightly different. So in this case, we use file_scan_directory to get all the files and then return them all. And it's not so hard in this case, but some things are harder. Sometimes the client gives you source data that's really strange. They gave us a giant hierarchical file in YAML, but it could have been XML or something else, that they wanted us to turn into a menu. And that's really hard, because Migrate wants a list of rows, right? It doesn't want a tree of rows where things have children and parents. But there's a tool that's part of PHP called the RecursiveIteratorIterator. It's a really weird name, but it basically means give me a tree thing and I'll turn it into rows for you. And I wrote a blog post about it, which you can read there. And all these slides, by the way, are already online, so if you can't scribble down one of these links, you can go find them on the DrupalCon site.
Finally, again, I mentioned validation is really important, but what's an example of how you do that? In this case, we wrote a whole bunch of hooks that ran after the migrations to check for things that indicated a potential problem. So we would look for, is there one node that links to something that doesn't exist? If that happens in data that you're migrating, that probably means that somebody edited one thing and forgot to edit the other thing and has to go and change those. Maybe there are nodes that no menu item points to, which in this particular case indicated an error. We wanted a menu item for every one of the nodes we were importing, but we can just check for that and then tell the client after they run the migrations, hey, we noticed this problem. Maybe you wanna try and fix that and run the migrations again.
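Two of the ideas above, turning a tree into flat rows and checking the result after a migration, can be sketched roughly as follows. This is an illustration in Python, not the PHP from the talk: the RecursiveIteratorIterator is replaced here by a small recursive generator, and the names flatten_menu and check_migration, along with the dictionary layout of the menu and the nodes, are assumptions made up for the example.

```python
from typing import Dict, Iterator, List, Optional


def flatten_menu(items: List[dict],
                 parent: Optional[str] = None) -> Iterator[dict]:
    """Walk a nested menu depth-first and yield one flat row per item."""
    for item in items:
        yield {"id": item["id"], "parent": parent, "target": item["target"]}
        # Children become rows too, each remembering its parent's ID.
        yield from flatten_menu(item.get("children", []), parent=item["id"])


def check_migration(node_links: Dict[str, List[str]],
                    menu_rows: List[dict]) -> Dict[str, List[str]]:
    """Look for signs of trouble after an import.

    node_links maps each imported node ID to the node IDs it links to.
    """
    node_ids = set(node_links)
    menu_targets = {row["target"] for row in menu_rows}
    # Links whose destination was never imported.
    broken = sorted(
        f"{src} -> {dst}"
        for src, links in node_links.items()
        for dst in links
        if dst not in node_ids
    )
    # Nodes that no menu item points to.
    orphans = sorted(node_ids - menu_targets)
    return {"broken_links": broken, "nodes_without_menu_item": orphans}
```

The report is meant to be shown to whoever ran the import, so they can fix the source files and run it again.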
And then in this case, our client actually had a little mini site that they were creating with other tools that let people sort of demo the Markdown content before importing it to Drupal. So we built this little tool that we use for other things as well, called SiteDiff, which lets you look at two sites, in this case the little demo site and the real Drupal site, and look for differences in them. So we would detect if our migrations got anything weird. At one point, the Markdown processor we were using didn't deal correctly with accented characters. And we wouldn't have noticed that very easily, because we would have had to look at every single node very closely, but our automated tool did it for us and found the mistake, and we were able to fix it pretty quickly.
So again, sort of the theme of this talk is that you wanna use the best tool for the job. And in this case, again, it's not what I would use for most Drupal sites, but for our client, it was the best tool for the job, and it's the client who has to stay happy. So they got to use all these tools that they like. They got to use git blame to see who made a change that was problematic, and Git merges, so two people could change a node at the same time and the changes would just merge together. They can use whatever Markdown editor they like. And what's really cool about this is because they're using Git, they can just switch back to a different branch in Git and run the migrations, and now they've reverted their site to an earlier version, which is pretty hard to do in Drupal, generally. So yeah, I guess that's our fourth workflow.
And finally, I'm gonna talk about one of the more normal workflows, I guess. People always talk about using Migrate for D6 and D7 upgrades to Drupal 8. But there are problems with using Migrate as it's advertised, I guess, that way.
I was just in a session talking about Drupal 6 end of life and running through the default migration that you do from Drupal 6 to Drupal 8, let's say, with the fancy migration UI. It works on a very small site, but it migrates everything about the site. It migrates every field that the old Drupal 6 or 7 site has, every content type, every setting, every setting that it knows about anyway. And that's not really appropriate most of the time. In the eight years since you built your site in Drupal 6, things have changed a lot. People want responsive design now. People want different fields. People want different field types. They want images with alt fields. They want all these different things. And you probably don't wanna just keep everything exactly the same. Some things you wanna keep. The important content you wanna keep, for sure, because you don't wanna go and have to type up 10,000 blog posts again or whatever you have, but you don't wanna keep the whole site configuration most of the time. So people usually think they have to choose: do I do a full upgrade and keep all this stuff, this cruft, or do I build the site from scratch and then have to re-enter all my content? But you don't really have to do that. You can build your new site from scratch with all the modern D8 techniques that you like, but then get the content and only the content from the old site.
And so to do this, you have to write some custom migrations, and that sounds hard, but there's a shortcut in the Migrate Upgrade module. I believe there's a Drush command you can run, and this one doesn't run the migration. It's the same tool that you would use to run a D6-to-D8 migration, drush migrate-upgrade. But in this case, instead of running it, it just builds the configuration files that it would use to run it if it was gonna run it. And then you can just look at all these config files and you see that a lot of these are things that we don't wanna migrate.
We don't wanna migrate the filter format settings. We don't wanna migrate the node settings most of the time. We just really wanna migrate the content, the nodes. So you can just export all this config and then delete most of it. Just get rid of almost all of it. And you only keep the things you wanna save. So if you wanna save your pages, save the one that's called upgrade_d7_node_page. If you wanna save your terms, your tags, you can save that one too, but get rid of most of them. And what's great about this is that writing a custom migration for simple data that doesn't need to have that much to it is pretty easy. But if there's more complex data, like there might be for your old content, it can be harder. And this will generate exactly the configuration that works for your old data, even if it's complicated. And then once you've done this, you can modify these migrations too. So if you're not migrating users, you can take out the UID from the migrations because it's not gonna be useful.
So something that can be complicated with this is that often you figure out you wanna migrate one content type and not another one, because one of them is, let's say, all your blog posts, in the example of our site, and another one is your pages, which are gonna change because you have a new organization to your site now. So when this happens, you wanna keep all the images from your blog posts, but you don't wanna keep the ones from your pages. So there are ways to do this. You can kind of filter each migration before you run it. And I've written a blog post about how to do that. And there's other similar changes that you can do where you just take things out as you don't need them.
A really important point to note, though, if you take this approach, is that broken links are terrible. Google will hate you if you have broken links. It's really not good.
So if there's a URL alias migration on your site, which Drupal should generate, you wanna keep that one for sure, even if it doesn't seem like the kind of thing you're interested in, because it'll point to all your new nodes. But if you're throwing out content, even that won't help, because there'll be links that are dead, right? So you wanna have a plan for that. You can identify the things that you're filtering out, see what's there anyway, and add aliases for them manually, maybe. You can use our SiteDiff tool, which lets you compare the new and old site and figure out what's broken. Or you can wait until you've deployed your site and then just figure out where users are going that's broken and then fix it after. Which isn't as great an experience, but it still works and it's easier. The Google Webmaster Tools will give you something like a top broken links page where you can just figure that out.
So here's an example of where we did it, which again, was on our site. We had an old D7 site, which was actually a direct upgrade from D6. And so it was really crufty and had a whole bunch of old fields we didn't want, but it had these blog posts and we wanted to keep those. So we generated some migrations and here you can see I've highlighted some of the things that we kept. The users, for example: you have to use a plugin here in your YAML, and it isn't obvious which one to use if you haven't done this before. So that was just generated by Drupal and so we kept it. It's not clear which field is used for revision timestamps. I hadn't even considered that at all, but Drupal just tells us which one to use, so we keep that. But then other things we can change. We use a custom plugin here to munge links, because we were adding HTTPS and so these things were gonna die. And so we wrote a custom plugin to deal with things that we wanted to change.
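To give a feel for the kind of link munging mentioned above, here is a small sketch in Python. It is not the custom plugin from the talk, which would be a Drupal process plugin written in PHP. The function name upgrade_links and the idea of passing in a list of your own hostnames are assumptions for the example.

```python
import re
from typing import List


def upgrade_links(html: str, hosts: List[str]) -> str:
    """Rewrite http:// links to https:// for the given hostnames only."""
    if not hosts:
        return html
    host_pattern = "|".join(re.escape(host) for host in hosts)
    # The lookahead makes sure the hostname really ends here, so that
    # example.com does not also match example.com.evil.org.
    pattern = re.compile(
        r"http://(" + host_pattern + r")(?=[/\"'?#:\s<]|$)"
    )
    return pattern.sub(r"https://\1", html)
```

For example, upgrade_links('<a href="http://example.com/blog">Blog</a>', ["example.com"]) changes the link to https://example.com/blog and leaves links to other sites alone.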
We combined what was generated with what we wanted to change and then we stuck them together and ran the migrations. We got a new site that's organized in completely different ways but keeps all the same blog posts with all the same content. So yeah, what we got out of that is we keep what's worth keeping, we throw out the junk, and what's kinda cool is you can merge these together. In this case, you might notice this example is from the Evolving Web site, and so was one of the earlier ones where we were gathering content in Drupal. We had one bunch of migrations that migrated new content, one bunch for old content, and that just works fine. There's no problem writing different kinds of migrations and just sticking them together.
So yeah, that's basically my talk. You got an introduction to Migrate if you haven't seen it before. You saw a bunch of different workflows for better ways that you can manage content in Drupal by using the Migrate module. Not using Drupal the way you're used to, but in a way that's often better for the client. And so yeah, that's the point of this. If you wanna find out more, you can find me on Twitter and I will answer your questions there as well as here. I definitely encourage you to learn more about Migrate, because this wasn't a complete introduction to Migrate and there are people who have better versions of that, and I encourage you to check those out. And yeah, try out these workflows or tell me about your own workflows that you've tried. Evolving Web, as I mentioned, also has a ton of trainings. Some of them involve Migrate, some of them don't. But they're still good ones, even the ones that don't. We've been all over the place, and if you have new people starting up with Drupal who have to learn how to use it, you should contact us. There's contribution sprints on Friday, as I'm sure you're all aware. And yeah, it's open for questions now.
Tell me anything I got wrong, tell me any ideas that you think I should look into, and tell me how you think Migrate in core should change, because there are definitely ways it can get better. So thank you, everyone.
What if you have a contrib module like Biblio or something like that? How can you migrate if they haven't fixed Biblio or whatever module for Drupal 8?
Okay, so there's two parts to this. One is getting the functionality in Drupal 8, and Migrate can't help you with that. You have to do that somehow. So maybe, I don't know what Biblio does really, but there are lots of things that people did in Drupal 6, let's say, that you would now replace with Paragraphs, for example. You would find a new and better solution in Drupal 8. And then what Migrate can do is it can manage the matching between those. So it can take your data out of, let's say you had like, you know, field data one, field data two, field data three or something, and turn those into paragraphs, and it can do that fine. So I don't know what Biblio does, but presumably Migrate can help you do something like that with Biblio.
All right, thank you very much.
Thank you.
I don't know if there's anybody in the room from ThinkShout, but I just wanted to give a thanks to them, on your first strategy of migrating from Sheets. They have a Migrate Google Sheets plugin module that was very helpful for us on our project and just part of the community, so.
I better go and give them one of those flowers then. Thank you.
Hey there. I was just curious, if you're doing a Drupal 6...
Can you get a little closer to the mic?
For a Drupal 6 or Drupal 7 upgrade, migrating the content to Drupal 8, is another option actually creating a view in Drupal 6 or 7 that would give you a CSV export?
Absolutely. It's a combination of version one and version four. It's slower. I'm not sure what the advantage would be to doing that, do you have something in mind?
I was just thinking about a site that maybe has like 50 nodes as opposed to thousands.
Yeah, so the built-in Drupal migrations, the upgrade migrations, they work directly from the database, which is usually easier to get than to build a view that exports something and then build a thing that uses that, so.
Gotcha, okay, thank you.
But if you have a giant database, for example, if somehow you have like a three gig database that has like 50 things you wanna export, maybe that's a good idea.
Okay, thank you.
So I'm asking about those partial upgrades, using the export that you can get with the config-only option. I was trying that out a few weeks ago. And you mentioned, for example, filter formats being something that you don't want to include.
Usually.
And I totally get that, because I was upgrading a former six site that was upgraded to seven a couple years ago, and then to eight.
PHP filter?
Yeah, well, I eliminated that when we went to seven. But regardless, the thing is it was all IDs in six, and then it went to named IDs in seven, and I didn't catch that. So when it migrated to eight, it did horrible things, horrible things. But what I noticed in the config files that it generated is that the filter formats looked like they were a dependency for some of the fields in, like, the content types and fields. So how do you get around that?
She's saying that filter format, oh, I see. So I mean, it depends on your model. In ours, we were able to identify that the nodes that we were interested in only used one of two filter formats. So we could recreate those on the new site, but better, because now we have new tools that we can use for that. And then take out the dependency and take out the part that uses the filter format migration, and just replace it with our own little section that says if it's using plain text, keep it plain text, you know, something like that.
Do you have an example of that potentially that you could share?
Not right now, but if you stick around, I'll show you how you do something like that, okay? Yeah, do you want to say that at the mic? Because you're right, that's the answer, so.
There's a process plugin, a static map process plugin. So if you know the ID, because we have the same thing with filters, you just, I mean, I'd have to look at it, but in one of those YAML files, where you say the migration, you say static map and then one to...
Yeah, so you could do something like, and I'm not gonna type it in here, it's impossible.
Hi. I was wondering if Migrate would be a good solution for moving data, a node type that was unwisely created using field collections, and changing it to paragraphs in the same site?
So I have not tried it, but I think it would be a great way to do that, yes. Can I imagine a way it wouldn't work? It might not work out of the box for some reason, but I wouldn't be surprised if it did just work, and if it didn't, then come find me in the sprint room and I'll see what I can do.
Awesome, thank you.
Speaking of the sprint room.
Yeah.
I got handed one of the D6 related issues, and for me to do it, it would help to have like a diagram of the internals, especially for things related to efficiency, because I think there's...
Do you have something in mind?
The what?
Do you have something in mind in particular?
Well, the specific issue is moving to the vocabularies and...
Yeah, I know what you're talking about.
Yeah, so I mean, now that you know the issue, what do I need to bone up on so that I can actually be productive on that issue?
Oh man. I was hoping there was a patch on the issue, there's not. That one is a particularly hard one, I admit, and again, maybe come talk to me in the sprint room after and I'll help you out.
Okay, cool, okay.
Because that one's actually quite a complicated thing, and I don't wanna go into it right now.
Yeah, okay. Thanks, though.
What about the feasibility of doing HTML scraping with Migrate and D8?
Yeah, we ended up doing that in our Markdown thing. We didn't actually convert the Markdown in Drupal. We used a tool to do that first.
All right, so you scraped the content out of this page.
And then we scraped the content from, I mean, we didn't really scrape it. We had a directory full of HTML files, essentially, which we ended up using. You could scrape the content in the migration itself, it would be interesting. I would maybe do it in two steps, though. First, write a program that does the scraping, then make sure the data looks okay, and then from there migrate it, unless you have some particular reason why you have to scrape it during the migration. Did you have something in mind for that, or?
The CMS in question is highly denormalized, and the HTML output is a more reliable indication of what the fields would correspond to.
So yeah, in that case. But if it's relatively static, and it's a one-time migration, then just, yeah, do a dump of the whole site somewhere, and then migrate from that, because it'll be much faster than downloading. Like, you're basically running a denial of service attack on the site, right?
Well, yeah, it's internal, so.
Okay, let's do it, it's not the...
Yeah, we would throttle it, so.
Okay, cool. Thank you. Time to unplug my stuff.