Good morning. Thanks, everyone, for being here at 8:30 on a rainy day. My name is Tony Savorelli. I work as a senior web architect at Pegasystems, a software company based in Cambridge, Massachusetts. The focus of our web team at Pega is largely on our internal sites. I've been with the company for about three years, and for the past two my personal focus has been on Drupal localization, which has largely meant supporting our localization specialists and providing ways for them to, and here's a technical term, declunkify their jobs.

My presentation is a very distilled version of the work that I've been doing. I will talk about translation a lot, but I think the main takeaways from my talk can be applied more broadly to any development that requires integrating with external services. And lucky for you all, and most of all lucky for me, there are not going to be any live demos and no live coding, so that's great for me.

Two years ago, when I started in my role as a localization developer, this is where we were. Of all our sites, only our main marketing site, pega.com, was being translated, into six languages. The site has a very complex data structure, and we have a very active marketing team. We were working with two translation vendors to translate the site. One of them is Lingotek, which is very well known for its seamless Drupal integration. The other is Acclaro, which provides the actual humans who translate our content. The reason we have two is probably historical, first of all, but Acclaro is also the provider of translation services for the whole company: product, web, non-digital marketing, and all that. So we need to, let's say, comply with that requirement.
Lingotek, on the other hand, really makes life easier when not only localization specialists need to perform translations, but developers also need to be able to access the translation memory to simplify their work. Because Lingotek provides a web-based platform to handle the translation process, and this platform integrates with Drupal very well, that was a decision made years ago. And I still think things work pretty well.

A little background on Drupal localization for those who haven't dabbled in it. If you have done any translation in Drupal, I would guess that you did it in a more manual way: create a node, save the node, create the translation, save the translation, and so on. Whether or not you have internal translators available at your company, agency, or client, this manual process works well if you don't have a lot of content; if you're not translating it into too many languages, and by too many languages I mean more than one; and, perhaps more importantly, if the content doesn't change all that often. Having an active marketing team means that some of the content, the most prominent content, will change, I want to say, daily sometimes, and translations need to be kept up to speed with that. In cases like ours, you will need to automate at least part of the process.

Two years ago, as I mention further down, our process was only, I want to say, 20% automated, to be optimistic. The manual part was very time-consuming, which is a kind word for mind-numbingly painful. The process was as follows, and I can show you a little diagram here. Our Drupal content was sent to Lingotek through the Lingotek module, which works.
And then, if you can see those other arrows that are squiggly and sort of handwritten, that's where the manual part comes in. There was one person, our main localization specialist at the time, who would manually copy and paste each of the translatable segments (not these, these are just examples) into a spreadsheet, then upload the spreadsheet to Acclaro's platform, where they would provide estimates, do the actual translation work, and then complete the order to say, hey, it's done. Then our localization specialist would come back, download the spreadsheet, or spreadsheets, because one is never enough, copy and paste all the translated segments back into Lingotek, and then download all the translations into Drupal again.

This process was bad enough when we had only one site, the marketing site, to localize. As I mentioned briefly (I'm going to have to go back and forth), we had a huge project on the horizon: our new training website, whose development, luckily enough, I had actually led up until the moment I became the localization developer, so I knew that project very well. I'll show you in a second why I call it huge.

Here are two examples of our marketing pages. Our marketing pages are fairly simple; effective, but simple. They're mostly composed of nested paragraphs. The architecture is complex, but the idea is simple: on the front end, each of the paragraph types matches a front-end component in our design system. So ultimately, the complexity of these pages lies pretty much entirely within the page itself. Translating a marketing page, if you only have to translate one at a time, is a fairly painless job. The Lingotek module handles paragraphs, even when they're infinitely nested, in a fairly straightforward way.
So no issues with that, except for the manual part of the job, which was still problematic. However, this is what, in Pega Academy, we call a mission. A mission is our main training container. Mission is a content type, which in turn references a few other content types, such as modules and challenges. In turn, modules reference what we call topics, which are another content type, with nested paragraphs and so on. So as you can see, compared to the marketing pages, the complexity of the training content lies outside the bounds of a single content type. Translating an entire mission, especially one of the more complex ones, which in turn contain references to other missions in a sort of Russian doll situation, is potentially a pretty big job, because you need to take into account all the different entities that are referenced across all this content. Again, doing so through Lingotek is pretty easy, but I wanted ways to make it even easier, almost a single-click situation.

So some of my main goals were these. First, allow translatable data and translated data to flow between our two vendors' APIs: flow, not be copied and pasted. Second, use Drupal as the main control center. I wanted our localization specialists, particularly once there was no longer only one of them (now we have about four or five, if I'm not wrong), to have a consistent interface in Drupal through which to do their job. Sure, they still had to visit the Lingotek dashboard and the Acclaro dashboard for part of the work, but within Drupal I didn't want them to have to jump around too much. Third, especially for the sake of Pega Academy, I wanted to allow them to translate hundreds of related entities at once without having to wonder where they were. And I had a few driving criteria.
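
As an aside on the third goal, the "Russian doll" structure of missions, modules, topics, and paragraphs can be pictured as a graph of entity references. The following is a minimal sketch in Python, not the Drupal code used in this project: the `REFERENCES` dictionary, the entity ids, and the `collect_related` function are all hypothetical, and real Drupal entities would be loaded through their entity reference fields. It only illustrates why a mission that references other missions needs a "seen" set, so each entity is collected once.

```python
from collections import deque

# Hypothetical content graph: entity id -> ids of the entities it references.
# The ids and the dictionary shape are illustrative assumptions only.
REFERENCES = {
    "mission:1": ["module:10", "challenge:20", "mission:2"],
    "mission:2": ["module:11", "mission:1"],  # missions can reference each other
    "module:10": ["topic:100", "topic:101"],
    "module:11": ["topic:101"],
    "challenge:20": [],
    "topic:100": ["paragraph:1000"],
    "topic:101": [],
    "paragraph:1000": [],
}


def collect_related(root, references):
    """Return every entity reachable from root, each listed exactly once."""
    seen = {root}
    order = [root]
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for target in references.get(current, []):
            if target not in seen:  # skip shared topics and circular missions
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


if __name__ == "__main__":
    for entity_id in collect_related("mission:1", REFERENCES):
        print(entity_id)
```

A breadth-first walk is used here only because it lists the closest references first; a depth-first walk would collect the same set of entities.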
First off, avoid reinventing the wheel. Drupal already has an integration with Lingotek through the Lingotek module, and I wanted to continue using that. As I'll show in a couple of slides, there is also the possibility for Drupal to integrate with the Acclaro API, which fortunately exists, and I wanted to take advantage of that too. I didn't want to have to create something that was just for us. Second, as I mentioned, I wanted to maintain a consistent interface. If you're a localization specialist, your job is getting translations done. That's a unified concept; it shouldn't depend on how many different APIs we are using on the back end. Third, I wanted to avoid patching modules. Sure, originally we had only one site localized, but how many sites do we have in total? Eight or so? We're not translating all of them, but potentially we might be. Patching modules is one hell of a job if you want to scale up something this complex. And it is complex. Fourth, keep everything generic. As much as possible, I was thinking of anything I was doing as something that I could contribute back to the community. Anything that was Pega-specific, on the other hand, and there were a couple of spots that were, should just piggyback on the core technology that I was developing and be confined to small internal modules.

So here are my technical building blocks, starting with some contrib modules. One, of course, is the Lingotek module, plus Lingotek Overrides, with an asterisk (not the cartoon character). The asterisk here means that I'm cheating, because this is actually a module that I developed and contributed. It's fairly easy to contribute when it's your own module. The reason I created it was mainly to solve a couple of workflow issues that we were experiencing when using just the Lingotek module.
I'm not going to spend too much time on this here, because I will talk more about the inner workings of what I did during the talk that I'll share with Hector Lopez from Straker Lingotek on Thursday morning. Same time, not same place, though.

The other contrib module is TMGMT, or Translation Management Tool, which is an API suite that allows translating content from different sources and sending it to different translation providers, including Acclaro, which is very lucky. However, TMGMT provides its own interface, which doesn't naturally interact with the Lingotek interface. That was one of the issues I wanted to solve.

Then I have a few custom modules. I call them custom because, almost two years after they were done, I still haven't contributed them. Eventually, I will want to clean them up and document them as well. Like I said, despite some similarities between the architecture of the Lingotek module and that of the TMGMT module, they don't talk to each other, so I had to create something that bridged the gap between the two. The TMGMT Lingotek module tries to do that by adding a few plugins for the Lingotek interface: new filters for the admin page, new columns (because it's a giant table), and, more importantly, entity operations, so that we could perform actions that are related not strictly to Lingotek but to our other vendor. It also provides a source plugin that allows TMGMT to use XLIFF data (I'll get to XLIFF in a sec; I know we have to define our acronyms), so that data grabbed from Lingotek becomes the translatable data added to any order on Acclaro.

So what is XLIFF? It's the XML Localization Interchange File Format, which is the common format for data exchange among computer-assisted translation, or CAT, tools. It has been around for a couple of decades and, as often happens with XML, different tools can read it and write it. So Lingotek can import and export data in this format.
Acclaro can import and export data in this format too. However, there's one problem: Drupal natively can't. What happens in this whole process is that when you send data from Drupal to Lingotek, that data is sent as a JSON data structure. And if Lingotek were not in the picture and you wanted to send data to Acclaro (or to any of the, well, not any, I don't want to go that far), Drupal would be able to send XLIFF data. But the data that Drupal gets from its own database is not XLIFF at all, so the XLIFF gets created within Drupal. However, we have these two integrations, both of which can read and write XLIFF. So my goal was to grab XLIFF data directly from Lingotek and send that to Acclaro and back. This is super important because Lingotek saves within the XLIFF data structure all sorts of unique identifiers, for the main document, for each of the translatable segments, and so on. Those UIDs need to be maintained once we get the translated content back and put it into Lingotek. If those UIDs go, our process is broken. So that needed to be maintained.

I'll go back a sec. TMGMT File Overrides: why overrides? One of the submodules of the TMGMT suite is called TMGMT File, which allows you to export something like an XLIFF file and push it up to a translation vendor. And TMGMT Acclaro XLIFF is another piece, which allows me to modify the data structure very carefully, mostly so that we can save information on where the process is at any given point. Small stuff; important, but very small. So I'm going to show a couple of, well, not a couple, just one.
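
To illustrate the point about identifiers, here is a minimal sketch in Python of adding translations to an XLIFF 1.2 document while leaving its ids untouched. It is an assumption-laden illustration, not the Drupal modules described in this talk: the sample document, the `doc-abc123` and `seg-…` identifiers, and the `unit_ids` and `add_targets` functions are all hypothetical, and the files Lingotek actually exports will carry more metadata than this.

```python
import xml.etree.ElementTree as ET

# XLIFF 1.2 namespace; registering it keeps the output free of ns0: prefixes.
NS = "urn:oasis:names:tc:xliff:document:1.2"
ET.register_namespace("", NS)

# Hypothetical sample; the document and segment ids are made up.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="doc-abc123" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="seg-001"><source>Start your mission</source></trans-unit>
      <trans-unit id="seg-002"><source>Complete the challenge</source></trans-unit>
    </body>
  </file>
</xliff>"""


def unit_ids(xliff_text):
    """List the trans-unit ids in document order."""
    root = ET.fromstring(xliff_text)
    return [unit.get("id") for unit in root.iter(f"{{{NS}}}trans-unit")]


def add_targets(xliff_text, translations):
    """Add or update <target> elements, keyed by id, without changing any id."""
    root = ET.fromstring(xliff_text)
    for unit in root.iter(f"{{{NS}}}trans-unit"):
        text = translations.get(unit.get("id"))
        if text is None:
            continue  # leave untranslated segments exactly as they were
        target = unit.find(f"{{{NS}}}target")
        if target is None:
            target = ET.SubElement(unit, f"{{{NS}}}target")
        target.text = text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    translated = add_targets(SAMPLE, {"seg-001": "Commencez votre mission"})
    # The round trip is only safe if the ids come back unchanged.
    assert unit_ids(translated) == unit_ids(SAMPLE)
    print(translated)
```

The design choice worth noting is that translations are matched to segments by id rather than by position or by source text, which is what makes the identifiers the part that must survive the trip to the vendor and back.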
So after all of this, I don't know if any of you are familiar with the original Lingotek interface, but the most important part... Oh, by the way, I'm using Gin here, and it's not really playing well with some of the UI elements, but that's what you get for using experimental stuff. This screenshot reveals a few of the features that I've been working on for the past couple of years. Some are enhancements to the Lingotek module itself: new columns, especially the ones that reflect the moderation state of the current revision and of the latest revision of a node. And I could give 10 talks about content moderation and translation, maybe next year. Others, in particular, have to do with the integration with Acclaro. Again, one of my goals was not to force people to go somewhere else.

There are a few operations that are specific to Acclaro. Through the Lingotek admin interface within Drupal, a localization specialist can just sit there and send content to Lingotek. Once the content is ready to be sent to Acclaro, they can send it to Acclaro, where the translators do their thing. Once the order is ready, they can bring it back, always through this interface. So Create order and Fetch translation are the two complementary operations that they need to perform, plus another couple that deal with the Acclaro jobs and are mostly for development purposes. So most of the work can be done directly through this interface.

What I want to say is, and I guess this is a joke that would kill in the Pega Academy team: mission accomplished. Of course, I've hugely simplified this diagram, but what we have now is data being exchanged between Drupal and Lingotek as it was in the past, and data being exchanged as XLIFF files between Lingotek and Acclaro. Drupal is merely used as a conduit, or as a control center, as I like to call it. It's mostly seamless.
There are always a couple of snags here and there, but it largely works. Last year we were able to publish a number of Academy missions in French, German, and Japanese; Italian, Spanish, and Portuguese are coming this year. So I consider that a personal success. And people seem to like it. They thank me sometimes. So that's pretty much it. Like I said, it was a distilled version of my work, and I hope it provided some insight into something that can be as daunting as this kind of stuff. I should probably walk around.

I was just wondering what some of the gotchas you ran into were.

Ooh, I need to go back in time. I think one of the main things was manipulating the XML data structure, frankly, because it's never pleasant. If you've done any of that, you know it's just not. For the rest, occasionally one of the APIs would not work as I expected. On both sides, the support is great. Maybe we're just good clients, so they respond to our emails very quickly. And sometimes the modules don't reflect the latest changes to the API, which is inevitable. But because we're a Drupal shop, we know how those kinds of things work, and it's always easy to go in and fix things, again, possibly without patching. But yeah, mostly, because it's an external integration, there's always that point where you're faced with something you can't directly control, and that's the part that, I was going to say, scares me. It doesn't scare me anymore. But initially, I mean, this whole task was something that had been talked about for years. Kelly here can confirm. And finally we got the resources and the time to do it. But mostly that.

What is Lingotek providing that Acclaro can't? Could you just do a direct connection between Drupal and Acclaro without Lingotek in the middle?

Potentially, yes.
But on the Lingotek side, we get a web-based platform that allows even non-translator people, I guess, to manipulate the translations. During the development process, for example, you might want to translate things with a machine translation engine, then look at what all the translatable segments are and do all the reviews before bringing them back into Drupal. So basically, even as developers, as non-translators (I don't know if "translator person" is a word), we can go through the whole translation process, which otherwise we couldn't do, because all the translation that happens at Acclaro is completely offline. Without Lingotek we also wouldn't get the tools that manage all the translations. It maintains a translation memory, which is precious if you want any form of reasonably accurate translation of content that's already been translated once. So I think it does add value to the process, even though having these two integrations makes everything more complex.

So the people who are doing the translating aren't doing that within Drupal at all? Drupal is just showing the status of everything.

Yes, exactly. The translators aren't even within the company. When I say localization specialists, they are "just" (quote unquote, because they do a huge job) the people within the company who handle all the content that needs to be translated and review the translated content. We have them in different areas of the world, with different specializations. But the actual translators are all external, and they never touch Drupal. They know their tools and we know ours.

So, anyone else? I guess not. Well, thank you, everyone.