This session, hello. Thanks, everyone, for coming, especially for your last session of DrupalCon this year. This session is on decoupling Drupal with Silex. If that's not what you're here for, then one of us is in the wrong room, and I don't think both of us are wrong.

So, my name is Larry Garfield. You may know me online as Crell; if you want to make fun of me during this session on Twitter, that's where you do so. I'm a senior architect with Palantir.net, a web development shop based in Chicago in the United States. We do mostly Drupal work, some Symfony work as well. For Drupal 8, I was the Web Services Initiative lead and the Drupal representative to the PHP Framework Interoperability Group, which is kind of the United Nations of PHP, with all the positive and negative connotations that has. Advisor to the Drupal Association and general-purpose lovable pedant. So that's me.

My name is Hagen Nass. I'm a senior solutions architect at Ooyala. Not as many titles as somebody else has. But let me tell you a little bit about Ooyala. We are one of the leading online video platform providers. The company was founded in 2007, so we've been around for a while. We now have over 300 employees worldwide and a global footprint of about 200 million unique users in 130 countries. So we do know video. And we're working with some of the biggest and most successful broadcast and media companies in the world; a short list of those is here. I'm sure that some of you have watched video on at least one of those sites.

For this specific project, we wanted to build a management system for an OTT solution. OTT? Good question: over-the-top video. So, think a Netflix-type video service. Some of the features we gave Palantir to develop: we wanted a really well-structured data model for the video metadata, and of course a UI to manage that metadata. We wanted to implement a workflow for publishing videos. That's very important, because the quality of the data you typically get from the studios is not all that great. So there are at least two steps in that workflow: first a QA step, and then the actual publishing to the service. We wanted to curate content as well: lists and collections that could be displayed in the applications and websites, banners for the websites, and some application-CMS aspects too, like editing the home page and lists of lists for other parts of the website. In addition, we needed to sync the data with Ooyala's existing DAM, the digital asset management system, because we need some data in there as well. And on top of that, APIs that work at scale, because we didn't just want to drive websites but also applications on different devices. The service had to support thousands of movies and TV shows, tens of thousands of users, and, as I mentioned, both websites and applications.

So Palantir had worked with Ooyala before, and they came back to us and described this project they wanted to do for one of their customers. And we said: you want us to build Netflix-in-a-box in Drupal? Cool, sounds fun. So we put together our team. Our project manager was Amy Dosimo, our senior engineers were myself and Robin Berry, and Beth Binkabits and Beck White were on the engineering team as well. One of the complexities of this project was that there was no front end.
We were not actually building the user-facing part of the system in Drupal. We were not building the user-facing part of the system, period. Ooyala wanted us to build a content management system that served a REST API, and that's it. And then a web application, mobile application, whatever, could consume that data.

And we looked at this case: all right, so we've got data that we need to bring in, do content-management-type stuff on, and then serve out as a REST API. There's a system that is really, really good at that. You know what that is? Drupal 8. Except we were saying this in early 2013, so no, there wasn't a Drupal 8 yet. All right, so what do we do? This is a bit more than our usual "oh good, let's just throw up a Drupal site" kind of approach.

So we looked at the options. We could use Drupal 7 with the Services module. Problem: Services is not actually a REST system. It's an RPC-type approach, and architecturally it's fighting against Drupal 7, because Drupal 7 architecturally wants everything to be a page. So it's not going to be that performant, and it's going to be kind of clunky, with lots of moving parts that don't need to be there.

There's the RestWS module. Who's used RestWS? A few people. In early 2013 it was still very new and kind of experimental, and we didn't want to do that much with it. It has since become kind of the model for Drupal 8's REST module, but at the point when we built this project, it wasn't a serious option.

Another option we considered was: let's just do it ourselves and go all-Symfony with it. At that point, we knew that Drupal 8 was adopting a lot of Symfony components. Symfony is a good framework for doing custom bespoke applications; why not use that? And we spent a lot of time wrestling with that question of whether Drupal was even the right platform for this. What it came down to for us was: do we want to spend the extra time ramping up on Symfony? Do we want this to be our first Symfony project, learning all the pieces of Symfony and having to build the UI from scratch? Drupal gives you an awful lot out of the box in terms of content management, so much that when you start using some other system, you realize just how much you're missing that you take for granted in Drupal. And we kind of wanted to take that stuff for granted.

And then we looked at another option: Silex. Who's worked with Silex before? Good number of you, okay. So those of you who have can probably see why we were looking at it. It's kind of Symfony Junior: a lightweight system built on the same components as Symfony and as Drupal 8, so there are a lot of commonalities there. It's also very fast. I did some quick benchmarks for serving a REST response, just a little hello-world-level REST response, and Silex was three times faster than Drupal 7. And that's before we did any optimization of any kind. Downside: it's a micro-framework, so it would do even less for us out of the box than Symfony would, and that's even more we would have to do ourselves.

None of these looked like the right platform to build the whole system on. So, all right, let's combine. Drupal 7's strong advantages are as a content management system, displaying pages and administrative screens. It's really good at that, but Drupal 7 is kind of lame for REST services, truth. Silex, on the other hand (that's the Silex logo; one of the nice things is they both have kooky eyes, so they work well together as projects), Silex is really, really good at HTTP handling.
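Just to make "micro-framework" concrete: a complete Silex application serving a REST response fits on a slide. This is a minimal sketch for illustration, not code from the project; the route and payload are invented.

```php
<?php

require __DIR__ . '/vendor/autoload.php';

use Silex\Application;
use Symfony\Component\HttpFoundation\JsonResponse;

$app = new Application();

// One route, one callback: Silex gives you routing and glue, little more.
$app->get('/api/hello', function () {
    return new JsonResponse(array('message' => 'hello world'));
});

$app->run();
```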
It's raw Symfony, which has a very strong API for dealing with web services. The downside, of course, is that any kind of UI you want is going to be hand-rolled, artisanal, from scratch. You're responsible for every tag in the markup. So let's just combine them and leverage the right tool for the job. There are a lot of jobs here, so let's use multiple tools.

So how do we get these talking to each other? If we were talking about Drupal 8 instead, they're both built on Symfony, so you could stack them; here, we need them to talk to each other somehow. How do we get data from Drupal to the thing serving the API? Well, Drupal's data lives in Drupal, and it has its own very specific structure. But the structure you need for editing is not the same as what we wanted to serve on the API. You may want editors to edit content in these three chunks but serve them as one chunk to end users, or vice versa. So we decided: let's put Elasticsearch in the middle. Let Drupal control Drupal content, let Silex do APIs, and the only place they have to talk to each other is Elasticsearch.

So, who's worked with Elasticsearch before? Good number of people. Who saw the Elasticsearch session here at the conference earlier? Completely different set of people. So everyone understands Elasticsearch now, right? Okay. This was actually the first time we used Elasticsearch, but from what we understood, it's the same underlying engine as Apache Solr, which Drupal uses all the time; it's a Lucene engine, but its API is a lot easier to work with than Solr's. We have done custom development on Solr projects as well, and we found Elasticsearch just way easier to develop with than to fight against. The JSON API is a lot easier to work with, and the fact that you can mess around with the schema without restarting the server and re-indexing all the data was very nice. If you're doing custom development, I recommend Elasticsearch as your search server. It's just a lot easier to work with.

So we ended up with this pipeline for the system. Some kind of incoming XML arrives from the customer: the definition of some movie or TV show or whatever, the metadata we need to be managing. We import that into Drupal, where content editors can edit it, update it, make sure nothing is misspelled, make sure the image is the right poster image for this video. You really don't want to put an R-rated movie's poster image on a kids' movie by accident. That's just a bad idea, trust me. Drupal then dumps its data into Elasticsearch, where Silex reads it and serves it out to whatever the API consumer is going to be: phone, website, whatever. All right, there are a lot of moving parts here, but it really breaks down into three sections.

Our data model itself I'm not going to go into in much detail, because this is Drupal node modeling and you've all done it a dozen times. In our case, the major content types were programs, the record for a movie or a TV show or whatever; assets, the wrapper around the actual video file (we were not storing the actual video files, that's what Ooyala's Backlot service is for; we were really just doing the browsing system); offers, the thing you can actually buy that gives you access to some movie for some period of time; and then stuff like collections and lists, like Hagen mentioned, for the other curatorial features. Again, I'm not going to go into much detail; the point is we had bunches of nodes.
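And the "dump it into Elasticsearch" step of that pipeline really is about that simple from PHP. Here's a minimal hypothetical sketch using the Elastica library, which comes up again on the Silex side; the index, type, and field names are invented.

```php
<?php

use Elastica\Client;
use Elastica\Document;

$client = new Client(array('host' => 'localhost', 'port' => 9200));

// No schema migration, no server restart: just push a JSON document.
$type = $client->getIndex('catalog')->getType('program');
$type->addDocument(new Document(42, array(
    'title'    => 'The Dark Knight',
    'synopsis' => 'Batman faces the Joker.',
)));
$type->getIndex()->refresh();
```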
So where's the data coming from? We're not entering it manually, because that would be slow. Instead, the customer sends us content in XML form, and that XML looks something like this. Please don't be scared; I don't want you to understand all of it. The thing I want to call out is that this one XML file includes several of those content types wrapped up together. Most of this is part of the program. This thing down here, the video element, is actually the asset. And the exhibition window here is what helps define the offer. We also had device filtering, where certain devices can access certain content, so we had to work that in as well. And all of that was coming in as one big XML file. So how do we break that up?

So, the importing process. What are the options in Drupal? Well, there's the Migrate module. Problem: Migrate is designed to be user-triggered. From within Drupal you push a button, or you run it from drush scripts or something like that. That's great for the use case it's designed for, but we wanted to accept pushed data from the customer. We wanted them to be able to push ingested data to us so they could just keep throwing data at us and we'd consume it. So that wasn't an option.

We thought about Feeds. Problem is, Feeds really wants to map things to one object, and we needed to map to multiple objects. It also has an awful lot of moving parts, some of which are better built than others. We have worked with Feeds in the past for Drupal-to-Drupal communication, but we didn't want that overhead on this project. Services, again: kind of clunky, not really RESTful, and we didn't want to add more layers to the project than we needed to.

So we went custom, and it ended up working out really well. By custom I just mean a custom module. We parsed the XML with a library called QueryPath; there's a Drupal module for it. Who's used QueryPath? Wow, I'm impressed, that's a lot of people. For those not familiar with it: think jQuery for PHP. It's a really nice little library. And with that, we had a little import engine we built using just standard PHP 5.3 tools. If you want to get into the developer part of it: who's seen my functional programming talk? I've given it at a couple of DrupalCons. The import-engine example I use in that talk is from this project, so for those of you who recognize it, that's what it is. And that gives us: an XML file comes in, a bunch of nodes come out, we call node_save(). Nice and simple, even testable. Yay, object-oriented programming.

So we've got that import engine wired up to Drupal. And yes, we said Drupal 7 is terrible at REST, but that's okay: all we need is POST. We're not actually doing REST for this part. So we just had a couple of custom page callbacks, straight-up Drupal API behavior, to shuffle data from the incoming request to that importer. And we did add authentication, because we didn't want anyone in the world to be able to shove data into the system. It's just the HTTP Auth module, I think that's the correct name, which restricts access with HTTP Basic auth; then we threw SSL on top of it, just for selected paths. Great.
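To make "a custom module with QueryPath and a couple of page callbacks" concrete, here's a hypothetical Drupal 7 sketch. The element names, paths, and the mymodule namespace are all invented, since the real feed schema was customer-specific.

```php
<?php

/**
 * Implements hook_menu().
 */
function mymodule_menu() {
  $items['ingest/program'] = array(
    'page callback'   => 'mymodule_ingest_program',
    // The real access control was HTTP Basic auth plus SSL; simplified here.
    'access callback' => TRUE,
    'type'            => MENU_CALLBACK,
  );
  return $items;
}

/**
 * Page callback: accepts one POSTed XML document and imports it.
 */
function mymodule_ingest_program() {
  $doc = qp(file_get_contents('php://input'));

  // One incoming file maps to several nodes: program, asset, offer.
  $program = new stdClass();
  $program->type = 'program';
  node_object_prepare($program);
  $program->title = $doc->top('Program > Title')->text();
  $program->language = LANGUAGE_NONE;
  node_save($program);

  foreach ($doc->top('Video') as $video) {
    // Each Video element becomes a separate asset node; the exhibition
    // windows would similarly define the offers.
    watchdog('mymodule', 'Importing asset @id', array('@id' => $video->attr('id')));
  }

  drupal_json_output(array('status' => 'ok'));
}
```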
So we've got all the data we're going to need. Almost. The problem is that the XML file I showed is often missing stuff, because the customer in this case had an awful lot of data that wasn't very good. So we've got to fill in this data somehow. How are we going to get replacement data? Do we make someone just sit there and edit it all the time, coming up with a new synopsis for every movie? That's a waste of a human being's time. Instead, we talk to a third-party service and bring in third-party data. In this case, the customer contracted with Rotten Tomatoes, the website. If you know IMDb, they're kind of like that, but they have a really nice API for getting at the content in their system: movie information, reviews of movies, other information like that. They offer a free API up to a certain level of usage and a paid API above that. The customer paid for the paid API. Great: swap the credentials in and start talking to it.

To actually talk to it, we used the Guzzle library. Guzzle, for those not familiar, is an HTTP client for PHP. Who's used drupal_http_request() in Drupal 7? Okay. It no longer exists in Drupal 8, because it is the second most impossible-to-debug-and-maintain piece of code in the universe. There have actually been measurements done of that, and its cyclomatic complexity is obscene. So we didn't even want to touch it. Guzzle is the library Drupal 8 is using; it's a third-party PHP library that Drupal 8 just pulls in. And we said: if it's good enough for Drupal 8, it's good enough for us. So we brought it in.

Then we wrote a standalone Guzzle extension to talk to the Rotten Tomatoes API. This is a link; it's a standalone Guzzle extension that we've released on GitHub. It's just an open-source library that lets you talk to Rotten Tomatoes, load movie objects and review objects out of it, and navigate between them. A nice little wrapper around their API. And then a Drupal module, which we've also released, that leverages that library and provides a mapping from their data to Drupal. Solve one problem in each of these little pieces along the way.

What this means is we can say: all right, we've updated this movie, linked up by its IMDb ID. Rotten Tomatoes uses the IMDb identifier as their unique identifier, because monopoly, I don't know. So when we save a node in Drupal (it's a movie, it has an IMDb ID on it), we go out to Rotten Tomatoes, fetch its data, and map that into a parallel node in Drupal. So now we have The Dark Knight, the movie from our customer, and The Dark Knight, the movie from Rotten Tomatoes, sitting in our database. And we can periodically re-fetch it to keep the data fresh, on save or on cron, with various business logic around that; the system was set up so we can vary that whenever we want. And then we also brought in reviews. So someone using the API wants to look at a movie: okay, what are the reviews on it? How good is it? That data is coming from Rotten Tomatoes.
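The wrapper library is on GitHub, but for a flavor of the raw call underneath, here's a hypothetical sketch using the Guzzle 3 API of that era. The movie_alias.json endpoint and its parameters are from the public Rotten Tomatoes API as I remember it, so treat them as illustrative.

```php
<?php

use Guzzle\Http\Client;

$client = new Client('http://api.rottentomatoes.com/api/public/v1.0');

// Look up a movie by its IMDb identifier (The Dark Knight is tt0468569).
$request = $client->get('movie_alias.json');
$query = $request->getQuery();
$query->set('type', 'imdb');
$query->set('id', '0468569');
$query->set('apikey', $apikey);

$movie = $request->send()->json();
print $movie['title'];
```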
Okay, so now we've got all this data in Drupal. What are we going to do with it? We edit it, and Drupal's good at that, right? Almost. Drupal's node edit page, especially in Drupal 7, can get very large and unwieldy when you have as complex a data set as we had here. So we had to do a bit of custom work on this one. The main admin interface, though, is straight-up Views. If you're doing any kind of custom admin work in Drupal, custom admin Views are your go-to tool. You rarely even need to theme them. In this case, we're showing all the data we imported. There's one of the offers; here's its corresponding program (this is the movie Brave, from Disney, and the other information for it); somewhere down here are the actual assets. And we've got QA state and publish review state; those are just fields on the node as well. There's a lot of data in here, and it's going to be thousands of movies, so fortunately Views offers really nice filtering. There's really nothing custom going on here. This is nice and simple. There's a separate page to show the in-Drupal content: collections that people create, our custom lists and so forth, banner images and other things that are going to get shown.

The fun part is the actual node edit experience, and this is not what your normal node edit page looks like. What we wanted to offer was not "here's a dump of the entire huge node" but "here are the pieces of it and how they relate to each other." So in this case, we're showing the overview for the Dark Knight program. We've got some information on it here, but also the curatorial status of this data, the offer it relates to, and the assets it relates to. So an administrator can see at a glance: all right, this whole movie, in total, is ready to be published, ready to be approved.

Then we have an overview page. I made the screenshots recently, so there would normally be a preview of the video here, but my current dev copy doesn't have a connection anymore; imagine there's a video screen here. You can see when this was imported, and you get the current QA state. Certain users with certain roles can approve it; that's just normal Drupal access control, like we're all used to. They can put in a log message for why it's being rejected. We had some custom form alters behind the scenes to say the log message is only required if this field is set to failed for some reason. The actual information itself shows below that, including various-sized thumbnails of the actual poster images. Those are all just created by Drupal image styles, and they're actually the images we then serve out to the public. And it goes down further. This is the overview page, and it's all built with Panels. We did all of this custom UI work with Panels and a couple of form alters, essentially. It's a bit more than that; when I say "a couple of form alters," I'm sure a few people are chuckling at that, yeah.

For editing, we broke it up into several different screens. Rather than one gigantic edit page, you can edit different parts of the record, different parts of the node. In this case, we're editing just the credit information: who the director is, who the writer is, who the actors are, and so on. We've got a QA state for the record, and a switch between the different credit sources. On this piece of content, I trust the incoming data from the customer's data source; on that piece of content, the data source is crap, so I'm going to use the information from Rotten Tomatoes instead. You can make that decision on a per-node basis, and it's really just saving a Boolean toggle in the database for now and showing both the incoming data and the Rotten Tomatoes data. Then you can edit the generic content; there are a couple of other fields. Again, built with Panels.
This is a very long page. I'm not going to bore you with the whole thing, but you can see all this metadata and information that's in the incoming data, which they want to be able to expose in the API and then build whatever filtering around in the client applications. Synopsis: again, same idea. Use the incoming data or use Rotten Tomatoes, and you can make that choice independently of the credit data or the other metadata. And again, for now, we're just storing a flag as part of the node. It's just a field.

I said this is built with Panels. Who's tried to override node edit forms with Panels? It's painful, isn't it? That's why I say there were "a couple of" form alters. I'm papering over an awful lot there that I don't want to go into in too much detail; it would bore most of the room. Most of it came down to doing black magic around required fields, so that they were required only on certain pages, because Drupal 7 really, really, really doesn't like you doing this. There's a sketch of what I mean at the end of this bit. So these are all Panels using node edit as a context. It was actually right around the time we were working on this project that the concept of form modes went into Drupal 8, which is basically this concept built into core, out of the box, as a fully supported feature. So if you want to do this kind of interface in Drupal 8, it should be a lot easier now. Are we porting this thing to Drupal 8, Hagen? Unfortunately not, yet.

This is what editors actually see. And again, we just threw the Seven theme on everything. There was no front end; we had no themers on this project. Who was in the developer-versus-themer session earlier? A few people, yeah. We bypassed that fight by not having any themers on the project.
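As promised, a sketch of that required-fields black magic. This is hypothetical Drupal 7 code, not the project's; the module, field, and state names are invented. The #states API handles the client side, with a matching validate handler because #states is client-side only.

```php
<?php

/**
 * Implements hook_form_FORM_ID_alter() for the program node form.
 */
function mymodule_form_program_node_form_alter(&$form, &$form_state) {
  // Only require a log message when the QA state is set to "failed".
  $form['field_qa_log'][LANGUAGE_NONE][0]['value']['#states'] = array(
    'required' => array(
      ':input[name="field_qa_state[und]"]' => array('value' => 'failed'),
    ),
  );
  $form['#validate'][] = 'mymodule_qa_log_validate';
}

/**
 * Validate handler: the server-side counterpart to the #states rule.
 */
function mymodule_qa_log_validate($form, &$form_state) {
  $values = $form_state['values'];
  if ($values['field_qa_state'][LANGUAGE_NONE][0]['value'] === 'failed'
      && empty($values['field_qa_log'][LANGUAGE_NONE][0]['value'])) {
    form_set_error('field_qa_log', t('Please log why QA failed.'));
  }
}
```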
Okay, so we've got the data, we've got it curated, we've got it edited, and we know what we're going to do with it. Now what? Now we get to the decoupled part. This is headless Drupal, in a sense, because when a node is ready, we dump it out to Elasticsearch. Problem: this actually takes a while, because, as we said, the data structure we want in Drupal is not the same as the data structure we want in the API. Do we want to reformat that data every time a user requests it? Absolutely not; that would be horribly slow.

Backing up a moment: the contributed module support for Elasticsearch in Drupal in 2013 was rather poor. It has since gotten a lot better; modules have been released since then that actually use the supported API libraries from Elasticsearch, the company. We also looked at Rules to do this preprocessing, and again: too many moving parts, not enough places to hook in and just write some code. A lot of this project was "this is easier if I just write code." Fortunately, Drupal lets you just write code. And we had this interesting problem where sometimes when you export one node, you have to export some other node, but only conditionally. You export a program, you have to also publish its corresponding asset; but some pieces of the asset get moved onto the program, and some pieces of the program we just delete. So we had to have somewhere to put this kind of logic, including merging in that Rotten Tomatoes content.

So if you're exporting a program and its toggle is set to use Rotten Tomatoes as the source for the synopsis, we need to pull in that data at export time. That means loading up the Rotten Tomatoes node, pulling the data off of it, and putting that into the information we're exporting to Elasticsearch instead of what's in the node we're about to export. Complicated, right? Great: another little custom system. It's actually not that hard to do in Drupal if you go object-oriented with it, because then you're not coupled to Drupal; you know, an object per node type. This ended up being a very simple, very robust way to handle it.

So, all right, when are we going to export content? What are the publication rules? Do we want to export content before it's been approved by an editor? That would be a terrible idea, so let's not do that. We made the decision that if a node is published, that means it's public, it's ready for public consumption, and we push it out to Elasticsearch. If it's unpublished, we don't. Nice and simple.

So when is a node ready to be public? When it's been approved? Well, it's a bit more than that. A program is publishable when it's been approved by an editor, with some caveats. An offer is publishable when it's been approved and it's within its publication window, because we may only have a license to sell this movie during the month of May, or only in certain regions, and so on. So we only want to publish it within its publication window, and maybe a few weeks before, so people know it's coming, or maybe a few days after. It's more complicated than I'm making it out to be. An asset, the actual information about the video itself, is publishable when it's been approved and it has an offer that's been approved, because if there's no offer for it, there's no sense publishing that information; or when its offer is within its publication window, because then we can talk about it even though you can't actually buy it yet. It's complicated. Let's just go with that.

And we don't want to do complicated work when people save nodes. When you save a node, its publication status may change, and that's a terrible, terrible time to do all this complicated logic, because then we slow down the UI. The editors using the site would have to sit and wait for two seconds every time they hit save while we figure out whether the node should be published and, if so, which other nodes get published with it. That would be a terrible thing to do to the user. So let's not do that.

Instead, we used Drupal's queue system and cron. All of that hard work of computing (do we need to publish this thing? do we not? what else do we publish? how do we reformat the data?) only ever happened in a queue. That's the only place it happened, which means it's completely out of band from the user. Responsiveness for the user stays snappy, because all we do when you save a node is toss an item into a queue that says: by the way, check this node when you get a chance. Then on cron, we'd find nodes that we think are ready to be published based on their time windows and so forth (SQL queries), and nodes that are about to be published because of something else, and throw them all into a queue. Do the same with nodes that are about to go out of their window: just find them, toss them into a queue, and come back to them later.
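Mechanically, that pattern is just hook_cron() plus hook_cron_queue_info() in Drupal 7. A minimal hypothetical sketch; the module, queue, field, and helper names are invented, including mymodule_is_publishable(), which would hold all that publication-window logic.

```php
<?php

/**
 * Implements hook_cron().
 *
 * Cheap: just find candidate nodes and queue them for later.
 */
function mymodule_cron() {
  $queue = DrupalQueue::get('mymodule_publish_check');
  $result = db_query(
    'SELECT entity_id FROM {field_data_field_window_start}
     WHERE field_window_start_value <= :now',
    array(':now' => REQUEST_TIME)
  );
  foreach ($result as $row) {
    $queue->createItem($row->entity_id);
  }
}

/**
 * Implements hook_cron_queue_info().
 */
function mymodule_cron_queue_info() {
  return array(
    'mymodule_publish_check' => array(
      'worker callback' => 'mymodule_publish_check_worker',
      'time' => 60, // Spend at most a minute per cron run.
    ),
  );
}

/**
 * Queue worker: the expensive part runs out of band.
 */
function mymodule_publish_check_worker($nid) {
  $node = node_load($nid);
  if ($node && mymodule_is_publishable($node)) {
    $node->status = NODE_PUBLISHED;
    node_save($node); // Triggers the separate export-to-Elasticsearch queue.
  }
}
```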
This is the exact same model that core uses for things like the Aggregator module, and I think Drupal 8 does it even more. If you have hard work to do, and it's the same work over and over, use a cron hook to say: all right, it's been an hour, it's been a half hour, whatever; grab all of the things I need to update or check, throw them into a queue, and then just let the queue figure it out. The queue will churn through them at whatever rate it can, and update each feed, publish each node, whatever.

So nodes just toggle to published or unpublished states as appropriate. That triggers a node save. The node save triggers another queue that says: all right, we've just saved this node, so go do the pushing to Elasticsearch that you need to do. If it's published, we do all that fancy processing to reformat the data, merge content, and so on, and send it to Elasticsearch; if not, we delete it from Elasticsearch. That way, again, the hard work all happens here and here, never in node save. Node save stays cheap. We had these set to run very frequently; I think it's a five-minute cron. If you want, you can even use a queue-runner module. We didn't here, so cron drives all this queue processing.

Another aspect of this is that it's quite possible, if you save a node and then edit something and save it again a minute later, that a node ends up in the queue multiple times. That's not actually a problem in this case, because what's the worst that happens? We delete a node from Elasticsearch twice; Elasticsearch doesn't care. Or we push the same node to Elasticsearch twice; Elasticsearch doesn't care. Overwriting in Elasticsearch was a non-issue, so we just ignored it. If we process something multiple times, we don't care.

So now we've got content in Elasticsearch. Now what? Now we get to Silex, the non-Drupal bits. This is just serving a REST API; there is no user interface. We didn't actually get to play with Twig on this project, I have to say. Silex is a very, very simple, lightweight micro-framework. It gives you a routing system, some glue, and the ability to register callbacks, and that's about it. And that's all we needed. Everything else you build yourself, and so we built it ourselves. It uses the same core pipeline, as I said before, as Symfony and as Drupal 8, so we already had some familiarity with it from watching Drupal 8 happen. And unlike Symfony full-stack, which gives you a whole ton of stuff (Twig out of the box, Doctrine, whatever), or Drupal, which gives you all this functionality, with Silex you add what you want. It's a very minimalist system out of the box.

So we added Elastica, an open-source PHP library for talking to Elasticsearch (the same one we used on the Drupal side, in fact); Guzzle, to talk back to Ooyala's Backlot DAM on occasion when we needed to; and this HAL library. HAL stands for Hypertext Application Language. It's a JSON spec, currently an IETF draft: a very simple extension to JSON that provides hypermedia links, so you can actually build a RESTful API with it. It's really, really nice. It's what Drupal 8 uses out of the box, it's what Zend's Apigility uses out of the box, and it's becoming quite popular. There's an XML flavor of it too, but I don't know anyone who uses it.
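As a hypothetical sketch of that "add what you want" composition, this is roughly how you'd wire those pieces into a Silex 1.x container. The service names and the ProgramRepository class are invented stand-ins for the little repository wrapper I'll mention in a moment.

```php
<?php

require __DIR__ . '/vendor/autoload.php';

use Silex\Application;

$app = new Application();

// Register the Elasticsearch client as a shared service.
$app['elastic'] = $app->share(function () {
    return new Elastica\Client(array('host' => 'localhost', 'port' => 9200));
});

// A small repository wrapper around an Elasticsearch index.
$app['repository.program'] = $app->share(function ($app) {
    return new ProgramRepository($app['elastic']->getIndex('catalog'));
});

$app->run();
```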
Fun little story, actually. In February of last year, 2013, I was at another conference, in Miami, and I was going to the onsite with Ooyala as soon as I got back. While I was there, Lin Clark, from the Drupal 8 REST team, emailed me to say: Larry, this thing we've been trying to do in Drupal 8 using JSON-LD is not going to work; I suggest we switch to HAL. And at that conference, SunshinePHP, there was a presenter talking about building their entire company on HAL, showing the benefits of HAL as a format and the tool chain around it. I came away from those two inputs saying: so, Hagen, this API we're building for you? Let's use HAL for it. So we used HAL on this at the exact same time we were switching Drupal 8 over to HAL, and the two nicely played off each other.

So what are we doing in Silex? Again, I'm not going to get deep into code, but it's a very, very simple pipeline, because Silex is simple, and all we needed to do was suck data out of Elasticsearch and serve it. That's it: pull data out, format it, serve it. We've got the routing system, the same one that runs in Symfony, in Silex, and in Drupal 8. We load data out of a little repository wrapper we built around Elasticsearch, get that data back, convert it to a HAL object, and return that from the controller. Then, in a view listener (a kind of post-controller step in a Symfony application), we take that HAL object and convert it to an actual JSON string. We actually support XML too, because it took an extra three lines of code. All the hard work happens there, and also HTTP caching.

Fun fact: we're moving to this exact same model in Drupal 8, returning an actual object from the controller and then doing the hard work in a view listener. That's what Drupal 8 is going to be doing internally, based in large part on our experience on this project convincing us that this was a good way to do it. So this played into Drupal 8's architecture, in fact. And all the HTTP caching is easy to do in Symfony; they've got the tools there for it. We don't do any caching at all in the Silex application. There's not one bit of caching in there. HTTP takes care of everything.
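Put together, the controller-plus-view-listener pipeline looks roughly like this. A hypothetical sketch using the Nocarrier\Hal library, one of the common PHP HAL implementations of that era; routes, fields, and the repository service are invented and assume the earlier bootstrap sketch.

```php
<?php

use Nocarrier\Hal;
use Symfony\Component\HttpFoundation\Response;
use Symfony\Component\HttpKernel\KernelEvents;
use Symfony\Component\HttpKernel\Event\GetResponseForControllerResultEvent;

// Controller: load from the repository, return a HAL object, nothing else.
$app->get('/programs/{id}', function ($id) use ($app) {
    $data = $app['repository.program']->find($id);

    $hal = new Hal('/programs/' . $id, $data);
    $hal->addLink('collection', '/programs');
    return $hal;
});

// View listener: the one place HAL objects become HTTP responses.
$app['dispatcher']->addListener(
    KernelEvents::VIEW,
    function (GetResponseForControllerResultEvent $event) {
        $result = $event->getControllerResult();
        if ($result instanceof Hal) {
            $response = new Response($result->asJson(), 200, array(
                'Content-Type' => 'application/hal+json',
            ));
            // HTTP caching lives here too; Varnish does the rest.
            $response->setPublic();
            $response->setMaxAge(300);
            $event->setResponse($response);
        }
    }
);
```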
One of the nice things about HAL, I'd say the best thing about HAL, is that there's a nice HAL browser available for it. It's a simple, downloadable, single-page app that lets you browse any arbitrary HAL API. As long as the API is well structured, you can browse through it entirely. So this is the index: a client application only ever hard-codes this one URL, domain-name-slash or domain-name-slash-API or whatever it's going to be, and then there are various links you can traverse, just like on a web page, to get to some other resource. This is how you navigate around a REST API. This is the proper way of building REST APIs. Some of these need the device filtering applied, like we said before; there's a lot of other complexity I'm just going to wave over for the moment.

If we follow the link to a program, this is what a program looks like. The full response that comes back is over here. This is the actual structure of the content itself: just a blob of JSON, cleaned up, formatted the way we want. Some bits of it look like Drupal, some don't; that's fine. This is something we negotiated with the team doing the front end. You can look at all of the headers that come back, for debugging; in this case we're in dev mode, so there's no caching turned on. In production, we turn on caching.

And from there, you can follow the link to a person who was on this program, an actor who was in this movie. In this case, we're following it off to Heath Ledger. And there are links back here that you can traverse to movies this person has acted in. Which means in a client application it becomes really, really easy to say: all right, I want to find the movies this actor was in. I look up that actor object and I see: oh, here are the relationships to movies this actor has been in; I'll go get those. Nice and simple. Navigating an API as if it were a web page, by just traversing links: that's how you do a proper REST API. That's actual hypermedia.

And of course, we wanted this to be highly available. We wanted it to not crash, to be fast, to be stable. Fortunately, the Silex app is pretty much stateless. It has no state of its own other than the Elasticsearch server. So you spin up three or four of them, tell them where the Elasticsearch server is, and you're good to go: three, four, five web heads that you can load-balance very, very easily, and they are themselves very lightweight. Elasticsearch, as well, is almost disturbingly easy to set up in a replication mode. You just spin up multiple Elasticsearch servers, give them the same cluster ID, and they work themselves out. So in production we've got at least two Silex heads, at least two Elasticsearch servers, and I believe two Drupal servers. The Drupal servers are completely behind a firewall; it is not actually possible to get to them from the public web. The only thing you can get to is the Silex servers.

Yeah, for Elasticsearch, one of the systems administrators at the client kept asking us: so, how do we set up replication in Elasticsearch? You turn on a second Elasticsearch server, give it the same ID, and it figures itself out. So how do you set up clustering with Elasticsearch? You spin up multiple servers and it figures itself out. So can you provide the documentation for spinning up clusters in Elasticsearch? You spin up multiple servers and it figures itself out. It was too easy for him to grasp. When a system is too easy for a person to believe, that's a good sign.
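That "same ID" really is a single line of configuration per node, at least with the multicast discovery Elasticsearch of that era used on a shared network. A hypothetical sketch; the cluster name is invented.

```yaml
# elasticsearch.yml, identical on every node in the cluster.
cluster.name: ott-catalog
```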
And then we stick Varnish in front of it. We're using HTTP caching, and Varnish is an HTTP cache. Great. Everything we're doing is read-only, so it'll cache just fine. If it's a little bit stale because of updates from Drupal, we don't care; a five- or ten-minute window is fine. Which means almost everything gets served out of Varnish, and Varnish is blazingly fast. To the point that I think we have just the one Varnish server, and it's not even load-balanced, because we don't need it to be. Or is it load-balanced for redundancy? I'm not actually sure. No; performance was never actually an issue for us.

So that's what we did. That's how we put this thing together. So, a year on now since we built this thing, how is it working out? Really well, actually. The service launched on web, for desktop, and shortly thereafter they launched applications on Android and iOS as well, all running against the same API. Just recently they launched Chromecast also, again without any changes to the actual APIs or the back end of the service. Yeah, unfortunately we couldn't involve Palantir in building the website; maybe for the next project. And the service is growing. Highly reliable, like Larry told you we all wanted it to be. I don't know of any outage that actually happened; so far, a perfect record. A library of thousands of assets, and they're about to launch TV box sets, so they're adding a high volume of TV shows. Thousands of subscribers, maybe not quite the numbers they wanted yet, but steadily growing, and again, no issues in terms of the service running. We're also still adding new features; like I just mentioned, TV box sets required a minor change in the data model, but the architecture in general held up. It's modular; it's easy to add and change things. And that's it.

All right, so thank you. We have some time for questions. We can go deeper into the architecture, or not as deep, if you want. Please use the microphone here in the middle of the room so it gets recorded. What do you want to know about this project that we haven't already covered? Here's somebody.

Audience: I was just wondering, what is the latency from pressing save to the content actually being available in the front end?

Worst case, I think, depends on what you set your cron interval to and what you set your HTTP cache lifetime to. As of when I was last on the project, they were both set to five minutes, so ten minutes is your worst-case scenario. Set them down to a minute and a minute, and two minutes is your worst-case scenario.

Audience: So you have no problem processing all of the...

No, no. Okay, cool. And one of the advantages of Drupal's queue system is that it can be run in parallel. So if the queue got way too big, you just run drush queue-run, or queue-process, whatever the command is, three or four times in separate windows, and it'll burn through the whole thing and shove it all out. So when they get to processing that much data at once: spin up more queue workers and you're good to go.

Audience: You mentioned that you do the imports from XML files, and I'm wondering if you had tests for those, and if so, how did you write the tests to check that the import from the XML is done correctly?

The way you test that is to have sample files on hand, run them through, and make sure the right nodes get saved at the end. Drupal 7 really doesn't make unit testing easy, so I think we just did integration-level testing on that, but we did have a bunch of sample files lying around that we could run the import on and make sure we got the nodes out the other end that we wanted. When a new XML format was provided by the customer, which happened several times with slightly different formats, we had the tests there already: tweak them for what we expect from the new content, tweak the mappings, keep running it, and you're good to go. So Drupal 7 means you do integration tests with SimpleTest. I wish we could have done better, but: Drupal 7.

Audience: I had the same experience; that's why I asked.

Yeah, thank you. In Drupal 7, that's pretty much all you can do. In Drupal 8, you can wrap most of that up in PHPUnit tests and then just do a simple wiring test on top with SimpleTest.
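For flavor, here's a hypothetical Drupal 7 SimpleTest sketch of that kind of integration test: load a sample file, run it through the importer, check that the node came out. The module name, fixture path, and mymodule_import() entry point are invented.

```php
<?php

class MymoduleImportTestCase extends DrupalWebTestCase {

  public static function getInfo() {
    return array(
      'name' => 'XML import',
      'description' => 'Runs a sample feed file through the importer.',
      'group' => 'mymodule',
    );
  }

  public function setUp() {
    parent::setUp('mymodule');
  }

  public function testProgramImport() {
    $xml = file_get_contents(
      drupal_get_path('module', 'mymodule') . '/tests/fixtures/program.xml'
    );
    // Hypothetical entry point into the import engine.
    mymodule_import($xml);

    $nid = db_query(
      'SELECT nid FROM {node} WHERE title = :title AND type = :type',
      array(':title' => 'The Dark Knight', ':type' => 'program')
    )->fetchField();
    $this->assertTrue((bool) $nid, 'A program node was created from the sample XML.');
  }
}
```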
Audience: I was wondering what the added value of Elasticsearch was in your case. Is it for search, for search requests? Because since you were caching everything anyway, you could maybe have used the primary database as well.

So, we were actually doing keyword searches as well, and for that Elasticsearch was the obvious choice; and once we had it in there, it made sense to also use it to hold the denormalized data. We technically could have just used HTTP caching for everything. The problem there, again, is that any time we pass through the cache, it's going to be very expensive for that call, and whoever that user is, well, it sucks to be them. We didn't want that. It also kept the systems decoupled. The Drupal site can go down; we can take it offline to change the data model there; and as long as what gets pushed out to Elasticsearch doesn't change, we can leave the Silex servers untouched. Technically we could even completely replace one side or the other, because we just have Elasticsearch in the middle. Yeah, it's layers of separation and redundancy, on top of the fact that we actually were doing searching. Thanks.

Audience: Did you care about invalidating the Varnish cache?

We talked about that. We actually... I'm not sure if they ever implemented it. They were asking for it, and we told them how, but... Yeah, in practice the cache lifetime was short enough that the customer didn't care. You certainly could push an invalidation command to Varnish if you wanted to. I think we had a user story for it; they just never prioritized it, because their timeline was short enough. But you certainly could do that.

Audience: Yeah, so that was the other question I had: the agile approach you mentioned in the description of this session.

Yeah, this was actually the first pure agile project Palantir did, and it worked out really, really well. Probably the most important thing for an agile project is having a product owner who is active and aware and able to evaluate and then adjust on the fly. And we had a really, really good product owner on this project.

What you're trying to say is that I couldn't make up my mind and changed things all the time.

Yes, but you were okay with that. And so we had the Rotten Tomatoes integration, which we had talked about a little bit but didn't actually figure out until halfway through the project. And that was okay, because our contract and our product owner were structured in such a way that we could do that. We also had handling around genres for different assets, which are far more complicated than you would ever expect them to be. We had this problem where genres come in from the source data and genres come in from Rotten Tomatoes, and sometimes one is good and not the other, and you want to change it: the incoming genre is called "action adventure," but you want to just use the word "action," so you have to have mapping logic. We figured that out halfway through the project. We just paused: all right, let's stop and take two days and figure this part out, rather than doing all of that up-front design. So it actually worked out really, really well. This is the project that convinced me that agile can actually work.

Audience: Was there any consideration of using Node.js instead of Silex?

We talked about it briefly, but we're a PHP shop, not a Node shop, so we would have had to learn a completely new stack in a language we spend less time in, and that wasn't worth it at that point. Silex was a shorter jump for us in terms of the technology stack to learn. You certainly could do this kind of work with Node.js, or with Rails or Sinatra, or something in Go, or whatever you want on the other side of Elasticsearch. And that's kind of the point: that side is independent of the Drupal part. We could totally junk that entire Silex piece and rebuild it from scratch with the cool technology du jour, and nothing else needs to change. It's just a different thing pulling content out of Elasticsearch and serving it. We'd be fine with that.
The architecture was built to support that kind of evolution. But yeah, the initial decision for Silex was: we're a PHP shop, so let's use PHP.

I guess that's it. So thank you, everyone, for coming. Hope you enjoyed DrupalCon. Do review this session online. You can follow Palantir at @Palantir. Follow me at @Crell. Hagen doesn't really use Twitter, so don't bother following him. Yeah, enjoy the rest of your time in Amsterdam.