A little bit about migrations. I'm pretty sure most of you will have come across migrations at some point, maybe not in Drupal, but as a concept they exist pretty much everywhere. It might be something super simple, like reading some data from a CSV and importing it into your system. Or it might be something a little more complex, like reading data from a JSON API, modifying that data, and then migrating it into your system. Today I'm going to talk about some techniques I've learned from working on a couple of big migrations with some interesting challenges. In particular, I'll be focusing on process plugins and how we can transform data to match our data model when the source data doesn't match our destination.
Before we get too far into that, let's go over some of the Drupal migration basics as a bit of a refresher. The basic migration flow looks something like this: you've got your source, your process, and your destination steps. Your source might be a CSV or an API, basically a read-only source, and sometimes an existing website as well. Then there's your process step, which we're going to talk a lot about today. And your destination is where your data is going to end up in your website. That might be a field, a custom entity, or perhaps a media entity.
So what exactly is the process step doing? Basically, this is the step where you modify, transform, or restructure your data if needed. Let's take a simple example. Imagine we had a customers.csv file with three columns: first name, last name, and email. And we had a customer entity with only two fields: full name and email. In this case we'd need to combine the first name and last name columns, with a space in between, into a single string so that it maps nicely to our full name field. So the process step would just need to combine those two columns.
If you had to do something like this in PHP, you'd pretty much just use the implode function. It'd be pretty simple, right? And if you wanted to, you could write a custom process plugin to do that. But thankfully, core provides a bunch of great process plugins for us, so we don't need to write any PHP code in this case.
This is an example of what the migration YAML file for that migration might look like. After the ID and label, which are mostly just for developer use, you've got your source, your process, and your destination steps. If we look at the full name field here, this is straight from our customer entity; it's the name of the field. The keys here are generally field names in your destination. We're specifying a plugin, and we're saying to Drupal: use these source fields, first name and last name, which we specified up here, call the concat plugin, and pass in a delimiter of a space. The concat process plugin basically calls the PHP implode function behind the scenes and returns a string, which we can map directly to our full name field.
If the source and the destination are already compatible types, you don't need to call any particular process plugin. In the case of email down the bottom there, you can just map your source value straight to your destination. Behind the scenes that's calling what's called the get plugin, but it doesn't have to be named explicitly, so it's a little bit special.
Drupal core comes with a great set of process plugins. You can see a full list here at this URL. If you're migrating your data from Drupal 6 or Drupal 7, most of the hard work is already done for you, so you're not going to have many problems. If you're migrating something small, like a blog or a simple website, you probably won't need to write any PHP code at all, which is amazing.
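For reference, here is a minimal sketch of what the customers migration described above might look like. The slide itself isn't part of this transcript, so the migration ID, the CSV source settings (which assume the contrib Migrate Source CSV module), the column names, and the entity:customer destination are all assumptions for illustration.

```yaml
# Hypothetical migration for customers.csv; all names are illustrative.
id: customers
label: 'Customers from CSV'
source:
  # Assumes the contrib Migrate Source CSV module provides this plugin.
  # Its key names vary between versions of that module.
  plugin: csv
  path: 'public://import/customers.csv'
  ids:
    - email
process:
  # Destination field full_name: join two source columns with a space.
  full_name:
    plugin: concat
    source:
      - first_name
      - last_name
    delimiter: ' '
  # Shorthand for the get plugin: copy the source value as-is.
  email: email
destination:
  # Assumes a custom entity type with the machine name "customer".
  plugin: 'entity:customer'
```

The email line is the implicit form of the get plugin; writing plugin: get with source: email underneath it would do the same thing.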
A lot of contrib modules also provide great process plugins, and they're almost like an upgrade path. A good example of that is Address Field. Migrating from Drupal 7 to Drupal 8 using an address field is very simple: rather than having to map each part of the field individually, you just call the process plugin and all of that is handled for you.
I want to take a quick look at two super simple process plugins that I use in almost every migration. The first one is skip on empty. As the name suggests, it just skips over empty values. It's not exactly a process plugin in the sense of transforming data; it's more for flow control. But it's quite useful if you have things like pipelines. We'll talk more about pipelines shortly, but basically, if you're passing the value of your first process step into something else that relies on a value being passed in, you need some sort of flow control to prevent errors.
The second one I want to talk about is static map, which just maps a value in your source to a new value in your destination. This is particularly good for things like select lists. We can imagine here that the legacy system had a dropdown saying yes or no for whether something is promoted, and we're mapping that to a boolean field, so a nice checkbox.
Before you jump to writing your own process plugins, it's always a good idea to check what's available in contrib. Migrate Plus provides a bunch of great, more advanced options. I won't read through all of them, and you can see a full list here. But two that I will mention in particular are entity lookup and entity generate. These are amazing when you're importing content that references other content, and even simple pages do that: article pages might reference tags, that sort of thing. Entity lookup does exactly what the name suggests: it looks up entities on the fly.
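As a rough sketch of how static map and entity lookup might be configured, assuming a node destination with a promote field and a field_tags reference to a tags vocabulary. The source column names is_promoted and tags are hypothetical, and entity lookup requires the Migrate Plus module.

```yaml
process:
  # static_map (core): translate the legacy dropdown value to a boolean.
  promote:
    plugin: static_map
    source: is_promoted        # hypothetical source column
    map:
      'yes': 1
      'no': 0
    # Used when the source value is not in the map.
    default_value: 0
  # entity_lookup (Migrate Plus): find an existing term by its name.
  field_tags:
    plugin: entity_lookup
    source: tags               # hypothetical source column
    entity_type: taxonomy_term
    bundle_key: vid
    bundle: tags
    value_key: name
```

The map keys are quoted so the YAML parser can never read yes and no as booleans.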
And entity generate builds on top of entity lookup. It looks up an entity first; if it exists, it maps it straight to your destination, and if not, it can create it. A good example is a tags field on an article: if the data you're moving across references a tags taxonomy, those taxonomy terms can be created on the fly.
Sometimes your data needs to pass through a few different process plugins to get the desired output, and this is where the pipeline I mentioned before comes into play. It allows you to pass the output of one process plugin directly in as the source of the next process plugin.
Let's take a look at an example. Assume that we're migrating a reference field pointing to a taxonomy vocabulary, article type. As I mentioned before, we've got skip on empty for flow control, which stops us passing empty values further down the line and prevents errors. After our first plugin, you'll notice that the second plugin doesn't have a source. That's because the source value is implicitly the output of the previous process plugin. So it's super easy to define a pipeline like this, and pipelines can get pretty much as long as you need. After skip on empty, we're calling the callback plugin with strtolower, which is basically just calling a PHP function. And right down the bottom, we're calling entity generate. You'll notice that some of the keys for entity generate actually come from entity lookup: value key, bundle key, bundle, and I believe ignore case are all used by entity lookup. They'll check whether the entity already exists and map it, and if it doesn't exist, entity generate creates it.
Before we take a look at writing a custom migrate process plugin, I want to cover one more technique that I use in some more advanced cases.
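The article type pipeline walked through above might look roughly like this. It is a sketch only: the destination field name field_article_type, the source column article_type, and the vocabulary machine name are assumptions, and entity generate comes from Migrate Plus.

```yaml
process:
  field_article_type:
    # Step 1: stop processing this field when the source value is empty.
    - plugin: skip_on_empty
      method: process
      source: article_type     # hypothetical source column
    # Step 2: no source key; the input is the output of step 1.
    - plugin: callback
      callable: strtolower
    # Step 3: look the term up by name and create it if it is missing.
    - plugin: entity_generate
      entity_type: taxonomy_term
      bundle_key: vid
      bundle: article_type     # hypothetical vocabulary machine name
      value_key: name
      ignore_case: true
```

Only the first step names a source; every later step receives the previous step's return value, which is what makes the list a pipeline.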
Normally, as I said before, keys in your process step map to a destination field or an entity, something like that. But it's actually possible to create, I guess, pseudo properties that you can use by referencing them within the migration itself. It makes a little more sense with an example.
This is an example of copying some image files from a particular folder into a new directory. You define your constants at the top, similar to how you would in PHP; here we've got a source base path and an old files path. In our process step we've got three keys: file name, source full path, and URI. In this case only file name and URI actually exist on our destination entity. Source full path is just something we've created here so that we can run it through a process plugin without mapping it anywhere. We're using the concat plugin again to take the old files path and add the file name after a slash, so that we get the full path to the old file.
Now, when we reference this property that we've created, we use an at symbol and wrap the field name in quotes. That tells Drupal that we're referencing something we've defined in the migration itself. You can see that file name exists in both the source and the destination, and when we reference file name here, we're not adding the at symbol. Without the at symbol, Drupal knows that we're referencing the source value and not the value that we've created. In this case it would actually be fine to add an at symbol and reference the processed value, but I think it's better practice to reference the source value where possible.
Something else worth mentioning here is the file copy plugin down the bottom, which does exactly what the name suggests: it copies a file to a new destination. This one's great when you're migrating things like media or files.
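A sketch of the pseudo property technique, under some assumptions: the constant names and paths below are made up, and unlike the three-key slide described above, this version adds a second pseudo property for the destination path so that file copy receives a complete source and destination pair.

```yaml
source:
  # Source plugin settings omitted; only the constants matter here.
  constants:
    old_files_path: '/var/legacy/images'   # hypothetical
    new_files_path: 'public://images'      # hypothetical
process:
  # Real destination field, copied straight from the source.
  filename: filename
  # Pseudo property: not a destination field, only used further down.
  source_full_path:
    plugin: concat
    delimiter: '/'
    source:
      - constants/old_files_path
      - filename                # no @, so this is the source value
  # Second pseudo property (an addition in this sketch).
  destination_full_path:
    plugin: concat
    delimiter: '/'
    source:
      - constants/new_files_path
      - filename
  # Real destination field: copy the file and store its new URI.
  uri:
    plugin: file_copy
    source:
      - '@source_full_path'      # @ refers to the values built above
      - '@destination_full_path'
```

The quotes around the @ references matter, because a plain YAML value cannot begin with an @ character.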
And finally, if you find that your process pipeline is getting too long, too complex, or hard to debug and manage, it might be time to write your own process plugin. Writing a process plugin is pretty simple. You basically just need to create a class that extends the ProcessPluginBase class. The only required part of the annotation is the ID key, so that one's pretty easy, and you'll reference that ID as your process plugin when you're writing your migrate YAML files.
Once you've done this, the only required method is the transform method here. You get a value, which is your source value coming in; your migrate executable, which you can use to check things like the status of the migration; your row, which holds the raw source values; and your destination property, which is just a string with the field name that you've defined in your YAML file. In this method you're basically in PHP land, so you can do whatever you like. If you wanted to use something like ContainerFactoryPluginInterface, you could do things like injecting access to the database. At this point in the migration Drupal is bootstrapped, so you can do anything you like.
It is important, though, to consider rollback support. If you're doing things like creating entities here or writing to the database, you might want to consider what happens when you roll back your migration. Should those entities remain? Should they be deleted? Things like that. But otherwise you can do anything you like here and just return an output, which will map to your destination field.
You can pass configuration into the process plugin simply by adding new keys in your migrate YAML file. We have an example of string replace here, where we're passing in these two arguments, search and replace. You don't need to explicitly define these anywhere, so you can pretty much make up any key you like and write it there.
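On the YAML side, using a custom plugin with its own configuration keys might look like the sketch below. The plugin ID string_replace stands in for whatever ID the custom class declares in its annotation, and the title field and the search and replace values are purely illustrative.

```yaml
process:
  title:
    # Hypothetical ID taken from the custom plugin's annotation.
    plugin: string_replace
    source: title
    # Arbitrary keys: the plugin reads them from its configuration array.
    search: 'Acme Corp'
    replace: 'Acme Corporation'
```

Nothing validates these keys automatically, so a typo such as serach would simply mean the plugin never sees the value it expects.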
Those keys don't need to be defined in an annotation or in PHP, which brings me to this: it's good practice to check that your configuration exists. When these keys are defined, they're made available as part of the configuration array here, $this->configuration, which you can access by key name. But if you don't throw a very clear error message like this, a MigrateException, when they aren't passed in, you might find yourself in a bit of trouble debugging typos or perhaps a wrong property somewhere. Depending on your context and what you're doing, you might also add type checks to make sure that the source data is coming in correctly. Where possible, it's great to write generic plugins that you can reuse, but depending on your case, that's not always possible.
Any questions?
So it depends what you're mapping to. If you're just mapping to a text field, you can just pass a string. But in other cases, you can pass different structures. Yeah, sure.
For rollback support, do you just have to implement rollback, or how do you provide that?
That one's a little bit of a different situation. We could probably chat about that afterwards if you like. I think part of it is just making sure that it's mapped and that you're tracking which entities you're creating. If you're using things like entity generate, I believe it actually does that for you. I could be wrong there, so I'm not certain about that.
Hey, thank you very much. It's a lovely explanation. I would just like to stress one point about process plugins and doing things like writing to databases and accessing databases. You can get yourself into trouble doing that. The process plugin is really designed just to process your data, to massage it from one format to another. And if you're writing to your database and trying to do permanent things, it can make it more difficult to debug and find problems.
All that stuff really should be in the destination plugin writing out, and all your inputs should be coming from your source plugins.
Yeah, definitely.
I just want to make that clarification so that when people start getting deep into this, they're aware that that concept is there. Thank you.
Is there any Drupal Console generator or something that can generate these things?
I'm not actually sure about that. There probably is, but I haven't used Drupal Console extensively.
Or anything in Drush even, you know, like how they generate modules and stuff?
I generally write them by hand. If you're just defining things like these YAML files, they're not too complex to write. And for the process plugins, again, this is all you need. There probably is, yeah.
Thanks.
Thank you very much, Eric. Thank you.