All right, welcome, everybody. Hello. Hope you have a good time, whether that's having fun or taking a good nap. So this is our topic for today. A quick word about where we come from: we work at Evolving Web. It's a digital agency that does pretty much everything listed here, all the steps of a big web project that you can imagine. And here are some of our interesting clients. You can see a lot of higher ed in here, but we also have government and health among our greatest clients. We are also a very diverse team of about 80 people from different places all over the world.

And just for fun, can anyone recognize all the flags here? It might be a bit tricky to get all of them. No, I'm not going to tell you. No cheating. The game is that we don't tell you; you get to pick which pair represents our nationalities of origin. Let's do a quick raising of hands here just to warm up. Who thinks it's A? I'll just repeat the question: which pair represents our nationalities? OK, so who thinks it's A? Some. B? No, I didn't make it easy — not everyone will get it. So who thinks it's B then? Raise your hands. C, D, E. A few more hands. F, four hands. G, H. OK, that was a bit of a spread, but there was some concentration there.

You want to introduce yourself? Yes. So hello, everybody. My name is Robert Ngo. I'm a solution architect at Evolving Web, and I have been doing Drupal since Drupal 7 — like 40 years ago — since I worked back in Vietnam. And I'm Fran. I also have 14 years of Drupal experience and am also a solutions architect. I started working right after I left school in Brazil, so that was my country of origin. So if you guessed correctly, it was C. All right, warmed up then.

So for the agenda today, we would like to present the problem, a use case that we had in a recent project. We will present the requirements of the project, the system architecture, and the approach that we chose. And of course, the solution that we chose is the Feeds module. In this project we needed to create a custom workflow for our feed importation, and that's why I think it's quite interesting to present how we went through each and every step. In reality, you will not need to go through all of those steps, because the Feeds ecosystem is already really strong; you can basically create an importation without custom code. We will also mention some common problems that come up when working with Feeds: how to manipulate data before you import it, how to chain feeds, and how to deal with multilingual content, because we are in Canada and need to work with English and French. And at the end, we will talk a bit about the lessons learned in this particular project and what we found about this solution. So yeah, it's really a solutions architect talk.

Does anyone know Beneva? A few people? Yeah. So this is a big insurance company in Canada. It's based in Quebec, but it serves the whole of Canada. What is particularly interesting for this project is that it was actually a merger between two already big insurance companies, so they came up with a new name and a new structure. That is part of the problem we had, because there are multiple sources of data that have to be combined together. This tool here is our focus. It's on their website.
By the way, it's a very fast and beautiful website if you want to check it out yourselves; it is recommended. This is a tool that allows you to search for an advisor based on your institution. So you work at an institution, that institution is served by Beneva, and they will give you a recommendation of an advisor. We'll talk a little bit about that.

So the initial requirement was that we had to pull data from two different sources, and I'll explain a bit what these sources are. These are our first thoughts, the thoughts you have when you first read the problem: the data we're going to get will be in a standard format; we can just fetch it all from Drupal; all of it will be available and reliable; you just need a simple request for the data; and it will be fetched in a timely manner, ready to process. These are the first assumptions you make when you start working on a problem.

The two different sources are there because, going back to this, what they wanted to provide their clients is a way to very dynamically swap the advisors that are recommended for each institution, based on some business rules, to balance the load — so one advisor doesn't get too many requests and the other advisors get the same amount. So they have a list of all the advisors, but they also have a list of these associations of advisors per institution, which is refreshed daily. That's why we have to pull it, and that's where the BI comes in, because the business intelligence tool is what provides that data to us. So yeah, easy problem, right?

So we come in and do some preliminary discovery on the system, and we discover a few things. First, the two companies are big and have been there for decades, so the legacy architecture is hard to change, not to mention that there are so many microservices running around the company. At first we said, oh, the data is not standardized; how about we just standardize the data before importing it into Drupal? But that is not possible, actually, because the data is consumed by all those microservices. We cannot do anything about that; we just take it and import it. And if we fetch it directly into Drupal, that means Drupal needs to make a request to that server and get the response. That is not possible either, because the site is hosted on Acquia and the BI system sits inside a VPN. There's no way for Acquia to get in there and pull the data out. We also thought, OK, the data source should be reliable: if we fetch it, we should get something. But what happens if there is some migration or upgrade process going on during that synchronization? That is a risk we need to account for.

So we take a step back and look through the system. This is the very basic view of it. The Drupal site is what we are building. We have an entity repository that provides the list of advisors, and the second source is the BI system sitting inside the VPN. We need to build a solution to fetch those two sets of data, transform them, merge them, and then import them into Drupal. The entity repository is a Java-based CMS, and the BI system is inside the VPN, as I said. We also need the solution to handle the situation where a new advisor is added during the day, or an advisor is removed during the day.
That needs to be mapped correctly with the data about institutions coming from the BI system, not to mention that all the content needs to be bilingual. The two sources map to each other and share the same UUID. And the synchronization needs to happen every night, so it's recurring, which means that if today I have 100 advisors and tomorrow I have 150, we add the 50 new ones and the other 100 stay there. Or if they change a name — say an "s" is added to a name — it needs to be updated there as well.

Here is an example of the JSON file that we get from their list of advisors, the advisor repository. Because the system is old, the structure is not optimal, and sometimes when we want to get an attribute we need to dig down quite deep. This is just a very simple one, but there are other properties that are more complex to get. On the other hand, we also have a sample of the JSON file received for the institutions. As you can see here, we have the advisor number, which is the ID that represents the advisor in their system. In this system it's just part of a string, while in the entity repository it is a bit more than that: it can appear with a prefix and a suffix. So during the transformation of the data, we need to account for that.

OK, so not as easy as it looked. To summarize: we have to make sure there is a daily re-synchronization of all these sources, so everything is always consistent. We have multiple data sources. We have VPN restrictions, so we can't simply fetch the data. We have, as Robert was explaining, some sophisticated field mapping to do, along with some transformations. And finally, we have to combine the imported data into existing entities.

So let's talk a little bit about the approaches we decided to take to attack these problems. We needed a fail-safe solution, first of all, because of what we mentioned before: being behind a VPN, and the sources not always being reliable. We needed a solution that could handle those errors and faults and still work. We needed something that would be easy to manage on the processing side of things, easy to modify the data — to transform it; we use the term "tamper" here because Feeds uses that term, and we'll talk more about it later. And we needed a recurring importation, which means it will be running all the time and able to update items as we go.

This is what our solution actually looks like in the end, after, of course, a lot of back and forth. It's a bit of a complicated chart. But as you can see, we basically had the BI system post the data to us, so that we can get it even though it's behind the VPN. Then there is a long process of receiving this data and triggering another feed — another import from the entity repository — to first update the list of advisors: existing ones to update, new ones, or removed ones. In parallel to that, we save the data received from the BI in JSON format so we can go back to it later. And once we're done importing those advisors, we trigger the final importation. So it's a back-and-forth process between two importers, and once it's done, we have the data finally synchronized.

We chose Feeds for a few reasons, but it's obviously not the only way to do this; it could be done in other ways. I'll just briefly mention it could be done with Migrate as well. But we chose Feeds because of the way it's structured, especially in terms of the UI. It was interesting to have that flexibility.
Let me give you a brief overview of Feeds for those who don't know it. It's a contrib module in Drupal, a very old module — it's been around for a long time. It was refactored and redone for Drupal 8, and it's still well maintained; it has been redesigned and improved all the time. It's very pluggable — that has always been a big feature of it — and it supports a lot of import formats. It has a very good UI where you can actually build your whole importer, from fetching the data all the way to saving it into entities. We'll talk a bit about how it does the mapping and how you can modify and transform the data in multiple ways. It's also very easy to set up for a periodic importation.

Like I mentioned, you can actually build the whole import without custom code. There is a lot available out of the box, and there are additional modules — I'll briefly mention them — that extend this. You can use these plugins at every step of the process and build a Feeds importer without any custom code, which is a very interesting feature for content editors and site builders. Here are some of the modules that allow you to extend Feeds a bit more. I'm not going to talk about all of them, but Feeds Tamper is for sure the most commonly used one: it gives you another level of plugins to transform the data of each field individually before importing it. In our case, however, we could not simply use things out of the box; we did have to implement some customizations, and Robert will talk about that.

Yeah, so in our case we needed to create a custom Feeds workflow. Why? We wanted to combine multiple sources of data, like I explained in the introduction. We also needed custom mapping logic. And we needed to create a custom processor, because for other technical reasons we chose a custom entity, and out of the box Feeds doesn't support it.

So here is what a custom Feeds workflow looks like. There are three steps: fetch it, parse it, and then process it. The fetcher is what calls the API and gets the data from it; its output is a raw fetcher result object — in our case, an array of the fetched JSON objects. The parser is where you transform that raw result into items; the output of the parser stage is an array of items with the structure that you want. In the normal case we don't touch this part, but in some cases you can create an item class that extends the base item class, where you can customize the structure the way you want — that helps for the next step, the processor. The processor basically imports those items into a node entity, a media entity, or, in our case, a custom entity.

So here we are talking about the first step, the fetcher. Let me go a bit faster and get to the point: why do we need a custom fetcher? This is our case: we have six endpoints that provide advisors of different types. And the funny thing is that in order to get an advisor in English and in French, we need to make two calls to two endpoints. No other option; that is just how their API is structured. So we need to build a custom fetcher to combine all of those. Its output will be an array of all the advisors, with English and French mapped together.
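To make that more concrete, here is a minimal sketch of what such a combining fetcher could look like. It assumes the Feeds 3.x HttpFetcher base class; the module name, plugin ID, endpoint paths, advisor types, response keys, and authentication header are all placeholders, the annotation is trimmed down (the real plugin also declares its settings forms), and real code would handle errors, pagination, and dependency injection as well:

```php
<?php

namespace Drupal\my_module\Feeds\Fetcher;

use Drupal\feeds\FeedInterface;
use Drupal\feeds\Feeds\Fetcher\HttpFetcher;
use Drupal\feeds\Result\RawFetcherResult;
use Drupal\feeds\StateInterface;

/**
 * Fetches advisors from several endpoints and merges them into one result.
 *
 * @FeedsFetcher(
 *   id = "advisor_combined_fetcher",
 *   title = @Translation("Advisor combined fetcher"),
 * )
 */
class AdvisorCombinedFetcher extends HttpFetcher {

  /**
   * {@inheritdoc}
   */
  public function fetch(FeedInterface $feed, StateInterface $state) {
    $items = [];

    // One request per language and per advisor type, then merge on UUID.
    foreach (['en', 'fr'] as $langcode) {
      foreach (['individual', 'group'] as $advisor_type) {
        $url = $feed->getSource() . '/' . $advisor_type . '?lang=' . $langcode;
        // A real implementation would inject the HTTP client and send the
        // authentication headers required by the API.
        $response = \Drupal::httpClient()->get($url, [
          'headers' => ['X-Api-Key' => 'REPLACE_ME'],
        ]);
        $data = json_decode((string) $response->getBody(), TRUE);

        // 'advisors' and 'uuid' are placeholder keys for the real response
        // structure. Keying by UUID puts English and French on one item.
        foreach ($data['advisors'] ?? [] as $row) {
          $items[$row['uuid']][$langcode] = $row;
        }
      }
    }

    // Hand the merged payload to the parser as a single raw result.
    return new RawFetcherResult(json_encode(array_values($items)));
  }

}
```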
Because remember, we have the UUID, which is something we can trust. On the slide here is a sample of our code — a simplified version of what we did. To create a custom fetcher like this, we basically extend the HTTP fetcher and add the annotation, which is the FeedsFetcher annotation, with an ID. When we override the HTTP fetcher from Feeds, we need to provide what we want inside the fetch method. In our case, we loop through the list of languages, English and French, and we loop through the list of queries, which are the types of advisors. In there you do all the data manipulation and build the array that you want, then wrap it in a fetcher result and return it. That result goes on to the next step. Why did we want this? Not only because we want to combine data from different sources, but also because we need to add extra headers to the request — we have to send an authentication header — and set some special query parameters to get the right data. That is why we need a custom fetcher in our solution.

The next thing to talk about is the parser and the item. If we install Feeds, it provides default parsers, and if we install Feeds Extensible Parsers, we get a parser for JSON, which is very valuable and easy to use. So by default we have that kind of parser, and basically we don't need to customize it. Honestly, I don't see a real reason to override it, so just use it as it is.

Then the next step is the processor. We have a custom entity, so in order to create a custom processor for Feeds, we again extend a class, EntityProcessorBase. It also comes with an annotation, and for the destination we just specify the entity type ID. So the content will be mapped, processed, and saved into that type of entity. And there is the mapping function: this is the place where we can do additional tasks, for example if you want to add extra information to the entity. In the normal case, we just leave it as is.

In the next part, we will talk about altering data before importing it into Drupal. All right, so a few problems are solved on the source side; now we need to make sure we can alter the data the way we want. There are a few ways you can alter data using Feeds, and we're listing them here, separated into two categories: using the Feeds Tamper module, like I mentioned before, and using event subscribers — the Events API from Drupal itself. In each of them, you have two different ways to do it. For the first one, you can use the Tamper plugins that are already available out of the box, with no custom code needed, and if you really need something very special, you write a custom Feeds Tamper plugin and use it. For the events, there are two events for two different use cases, and we'll talk a bit about that. For Feeds Tamper, here are some of the common Tamper plugins that are available.
These are normal transformations, usually ones you do on strings — string transformations. There might be even more than this, but these are the most common. And here's an example, on the right, of a custom Tamper plugin. I mentioned at the beginning that this was a merger of two big companies, and we had to handle that as well: there was one field where we needed to detect whether the value came from one company or the other, and then assign the right value. It's a very simple plugin — there is more metadata in this code than actual logic — but that's how simple it is to create one. You just use the annotation for a Tamper plugin to describe it, and then all you have to do is implement a tamper function that receives the data from a specific field and returns the transformed data.

Why would you need to use events instead? Like I said, Tamper transforms the data for a single field. But sometimes you want to transform the whole item; the item is a collection of fields — one advisor, for example. You might want a transformation that reads through multiple properties of the item and then changes one or several properties accordingly. For that you would use the first of those events triggered by Feeds, the one fired after parsing.

We have another example here. If you're not familiar with the Events API in Drupal, it's an event-based, subscription-based system: you declare a service in Drupal that subscribes to a specific event, and then you implement a method that responds to that event. I would like to add one thing here: the examples for the event and the custom after-parse event are taken directly from the Feeds documentation, because they are simplified and very clear to understand. For the after-parse version, we basically extend the AfterParseBase class, define which feed type it applies to, and then we can alter the data the way we want in the method that handles each item. The second one is the event subscriber: by implementing the event subscriber interface, we define when we want to alter the data. If we pick the right moment — the Feeds parse event, after the parse — we can get all the data there and do additional transformations at that point. This example also comes from the Feeds documentation.

Next, we would like to talk about chaining feeds, like Fran showed in the diagram before: when we finish the first importation, it needs to trigger the second one. There are several ways to do that. The first is to use a module named Feeds Dependency, which provides a UI where you can define that after running this feed, you run that one — and it works well. But in our case it's a bit trickier, because we need to define the right moment to run it. So we decided to go with an event subscriber again. By default, Feeds provides several events, which you can find at this particular link. The implementation is fairly simple: we declare a subscriber that implements the event subscriber interface, and in this case we use the Feeds event named "import finished". OK, when this is finished, run the additional task that I'm describing here.
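A minimal sketch of such a subscriber might look like the following — the feed type machine names ("advisors", "institutions") and the module name are placeholders, and a real implementation would inject the entity type manager instead of calling \Drupal statically:

```php
<?php

namespace Drupal\my_module\EventSubscriber;

use Drupal\feeds\Event\FeedsEvents;
use Drupal\feeds\Event\ImportFinishedEvent;
use Symfony\Component\EventDispatcher\EventSubscriberInterface;

/**
 * Starts the institutions import once the advisors import has finished.
 *
 * Register this class as a service tagged as event_subscriber in
 * my_module.services.yml.
 */
class FeedChainingSubscriber implements EventSubscriberInterface {

  /**
   * {@inheritdoc}
   */
  public static function getSubscribedEvents(): array {
    return [
      FeedsEvents::IMPORT_FINISHED => 'onImportFinished',
    ];
  }

  /**
   * Triggers the second feed when the first one is done.
   */
  public function onImportFinished(ImportFinishedEvent $event): void {
    // Only react when a feed of the "advisors" type has finished importing.
    if ($event->getFeed()->bundle() !== 'advisors') {
      return;
    }

    $storage = \Drupal::entityTypeManager()->getStorage('feeds_feed');
    foreach ($storage->loadByProperties(['type' => 'institutions']) as $feed) {
      // Queue the dependent import; it runs on the next cron/queue pass.
      $feed->startCronImport();
    }
  }

}
```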
So the additional task is very simple logic: I get the type of the current feed, and if it is of type feed 1, I trigger feed number 2. And if, in another context, we need feed number 3 to run after feed number 2, we can set it up the same way. The setup is simple and easy to understand. I would strongly recommend having a look at the Feeds events file; it comes with some additional Feeds events that help us manipulate the feed in various ways at the right moment.

The next part is the importation of multilingual content. There's a funny thing here. I was looking for a solution on how to import it using Feeds, and I stumbled on an issue on Drupal.org where somebody said, I need documentation for Feeds explaining how to import multilingual content. There's a comment where somebody says, we should do it like this. And that issue is still open, and the documentation is still not up to date. So I want to cover it here. The importation of multilingual content is actually simple: in the mapping step, we can specify that content from the source goes into a specific field — in this example, the name field — and we can specify the language for it. So in our case, we can map the English and the French into two different translations of the node.

So that is the customization we did on the Feeds importation in our system. In reality the solution consists of a few more steps and more custom logic based on our business rules, but the base structure is like this. We did go through each and every step to customize our workflow, but in most use cases you won't need to do those kinds of things, because Feeds already provides enough tools to do an importation without going through the hassle of creating custom plugins like this.

Lastly, we have a few more lessons learned during the process that we didn't mention so far, but that are important to share. We learned that doing more logging during the process was very helpful to understand where in this long chain things were failing or not working as expected. We also set up notification channels — we can use Slack or email — to notify us when there is a problem, for example when an importation failed or was incomplete, so we can jump in and investigate further. With Feeds running in the background, we learned that sometimes you have to be patient, depending on what your setup is: if it's a long feed, it might be imported halfway and then wait for the next cron run to finish, so you have to keep that in mind. And finally, we also had to develop some additional tools just to visualize the data we were importing and spot anomalies. The imported data was quite complex, so sometimes you have to write a script just to go through it and make sure everything is imported correctly — the multilingual content, whether items are correctly combined, and all of that. And yeah, I think that's it for the presentation part. We have a few minutes for questions if you have any.

Yeah, for the problem with the VPN: we said that if the data cannot come to us, we will go to it. So we wrote a script inside the VPN that gets the data and posts it to our endpoint, so it comes to us.
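As a rough illustration only — the URLs and the shared-secret header below are made up, and this is not the client's actual script — the push side can be a small script along these lines, scheduled to run nightly inside the VPN:

```php
<?php

// Minimal sketch of the "push" script that runs inside the VPN.
// The real script also handles retries, logging, and error notifications.

// 1. Pull today's advisor/institution associations from the internal BI API.
$bi = curl_init('https://bi.internal.example/associations/today');
curl_setopt($bi, CURLOPT_RETURNTRANSFER, TRUE);
$payload = curl_exec($bi);
curl_close($bi);

// 2. Post the raw payload to the public Drupal endpoint, which stores it
//    temporarily and kicks off the first Feeds import.
$push = curl_init('https://www.example.com/api/bi-associations');
curl_setopt_array($push, [
  CURLOPT_POST => TRUE,
  CURLOPT_POSTFIELDS => $payload,
  CURLOPT_HTTPHEADER => [
    'Content-Type: application/json',
    'X-Shared-Secret: REPLACE_ME',
  ],
  CURLOPT_RETURNTRANSFER => TRUE,
]);
curl_exec($push);
curl_close($push);
```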
Yeah, and that is the starting point of the whole process: when our endpoint receives something, after storing that information temporarily somewhere, we trigger the first feed to import the advisor entities. And when that is done, we circle back to the data we received before and run the second feed. Yeah. Thank you.

Yes. Yeah, actually I found that the custom Tamper plugin is super simple to do, and there is a lot of documentation that we can find on Drupal.org. Basically, we just extend the class and then we have full control over how we want to manipulate the data in there.

Yeah, by default in Feeds there is a configuration where you can say that if an item is no longer present in the incoming data, it will be deleted on the Drupal side. There is a configuration for whether you want to keep those items or delete them. No, actually that doesn't need a Tamper plugin, because that is built into Feeds. Right, there's a configuration for that. Yes. Oh, nice. Yeah. Yes, there's a configuration for that. I don't think so. Yeah. Thank you.

Yes. So, kind of playing devil's advocate, many would say that this looks like a case for some custom code. What was the driving use case that said, we're not going to write custom code, we're going to do it in Feeds? What was it that Feeds offered out of the box that made this the best solution versus just writing it?

Yeah. By default, Feeds already provides a synchronization mechanism, which is quite robust. You can simply set it up to import data from a source — let's say a JSON source, an XML source, or an RSS source — into Drupal, into a single entity type. You can configure it to go into one content type, and you can easily map each field of the data to a field in your node, in your content type. And it's very easy to do some basic tampering: say we need to trim whitespace, make text uppercase, or encode or decode a URL string — stuff like that is already built in and we don't need to do anything. Not to mention that the periodic importation is already built in; it runs through cron, through the Queue API. So by default it provides enough tools for us to do the importation. And one more thing? Yes. I just want to add that, in general, our design choice is to write as little custom code as possible, because it's always harder to maintain in the long run. With a module as strong as Feeds, we would like to stay as close as possible to what it provides. That's why we chose Feeds, which has plugins for everything, so every piece of custom code we wrote is either a plugin or an event subscriber.

Yes. Yeah, actually we can answer that question, because Fran is the former maintainer of Feeds, and he is also the maintainer of Feeds Tamper. Yeah. Thank you.

Yes. Yeah, so the question is whether we can skip an item, you mean? No, not skip — you want to roll back. Yes, I believe that is super easy, because you can do it using the admin interface. In the settings of the feed, I think on the second tab, there is an option for you to delete all the existing items. One click and everything gets rolled back, and then you can rerun the importation again.

Yeah. I'm not as actively involved right now in maintaining Feeds, but I believe the current version is quite stable, even though it's not marked as stable.
It's just that the criteria there are pretty high on bug fixing and all that. But I would encourage you to try it out and use it. I mean, we have been using it in production, and it's been pretty solid so far. Thank you for coming today. Yeah, thank you so much.