And now it's time for our second presentation, from Morgan Strong, who will be presenting a case study on Tome and the Algolia search engine, which we also saw last month. So thanks Morgan, over to you.
So yeah, I'm here to talk. I should actually put a caveat right out there: I'm not speaking on behalf of my employer, which is the Queensland Art Gallery and Gallery of Modern Art. I work there and I'm just sharing some stuff that I've recently done there. So this isn't an official presentation from my employer; this is Morgan Strong presenting about some stuff that he's done while working there.
So effectively, we had a problem. It was called COVID-19, I don't know if you guys know about it, but it means that a lot of things changed, and one of the really big ones for an institution like an art gallery, a museum or a library is that we can't open our doors. There's a lot of digital engagement, and as a side note, something like a library is more about use, whereas something like an art gallery or a museum is more about consuming. So they're much more physically based. If you close down, you've kind of closed one of the major avenues for being.
And I know that probably most people who don't have a lot to do with museums and art galleries will know that they put on exhibitions, but they also have these things called collections, and collections are built up over often hundreds of years, usually decades. For Australian ones, I think the first one in the Queensland Art Gallery is from 1896 or so. A lot of these works are on display, but the vast majority are not. So we had a problem in that we had a lot of works on display, we had some exhibitions, but COVID happened and we had to close.
Now, our gallery didn't have a very good collection search. This, which is still what's up there, it's, you know, it's old. You have to know exactly what you're looking for. The search is very much like filling out a form.
I should full screen, shouldn't I? Look at that. Wow, wait, it's coming. Yeah, so you can see it's slightly better now. But if you go to our site, you can kind of see it.
So what are we going to do about this? Well, we decided that we'd try a few iterative releases to improve our offering of our online collections. We're closed, but we wanted to make ourselves more available. And it was a different way of approaching it. I know that a lot of people in software are well aware of agile approaches, minimum viable products and early releases, but it's not as common in some institutions, particularly ones where you traditionally only get one go. In terms of exhibitions and physical publishing, usually you put something on, and if you got it wrong, you got it wrong. So this was very different in that, from the moment that we closed, I gave myself three weeks to redevelop our collection online from the ground up. In hindsight I probably should have given myself a little bit longer. But this is where we ended up.
And this is the beta collection site we've got. It is Algolia search over the top of a Drupal site. The Drupal site is actually static, though. I've used a static site generator within Drupal called Tome. Effectively, I import all the content in, then push out all of these pages, and we end up with a static site. After three weeks, we pushed out 1,000 works. There are approximately 18,000 items in the collection, about half of which are fully digitized. We'll be spending the next three years trying to get the other half digitized. Then three weeks later we put out 2,000; that was last Friday. So we're about six weeks into this thing. And I'm now in the process of making version three, which will hopefully have timelines and maps and some other fun things that come with richer interactions with the data.
So this is what our current search looks like. You look at a work, you get a small image.
If you do click on that little view button, it does bring it up slightly larger, but you can only see part of the work at a time. And this is just standard copyright. You only really get this kind of little hint of information.
And now (and I will go to the site in a little while) I'm searching for the artists. I can find their works, and then within a work, if I look for similar words, I can search across the catalog. And if I actually land on a result, I can bring back some actual interpretation. Yeah, this is an image from George Lambert from around 1922 relating to the First World War. That bit of context, we didn't really put as much of it there. But this is great information that's been researched, and it's just now more accessible. Everything's kind of more linked. And Algolia and Drupal are really good at making that happen.
So what is it? It's Drupal 8, although I did check, and I think I'm going to be able to go to Drupal 9 pretty quickly. I'm using Feeds. I didn't know Feeds was still a thing. I was going to use Migrate or an entity import. But when I started a few weeks ago, I looked at Feeds and it's really picked up again. It was a big thing in Drupal 7, and it's now Drupal 9 ready, even though it's in alpha. And it now imports JSON rather than just XML and CSV. Tome, this generates the static website. I'm using Views to output JSON. That's what Algolia likes to drink.
So the workflow I've got, I think I put this in the next slide. The workflow I've got is I've got this database, which is pretty old. It's really well-structured, but it's in a thing called Texpress, which I don't think this is Tina. I think it's T-E-Xpress, but it's very 90s. It's very, well, I don't actually know. I think it's a bit more recent than that, but it's from the 90s. And as a result, when you export data out, it just magically changes it to UTF-16 for no reason, even though it's a UTF-8 database.
And it inserts all these illegal characters. In the regular world, you're used to double quotes being escaped with a backslash. In Texpress land, you use three double quotes, unless, of course, it's the one that's been inserted by Word and that's slanted, and then the quote doesn't count. So it's really good; regex doesn't work. So I actually spent about 80% of my time in this column here, where I get the data out, I put it into a holding MySQL database, I cleanse it and I take out all the weird things. Then it gets converted to CSV. I map that to Drupal Feeds and use Feeds Tamper to explode out the relationships in the concatenated ID relationships. And then I create a whole bunch of nodes and taxonomies. At that point, I run the view to output to Algolia. I cleanse the data, and I misspelled the word "imported" here, but I import the clean data back into the Texpress database so I don't have to do all this stuff again. And then I run Tome to generate a static site. So this is a bit of a quick run-through, and I'll quickly show the site off after I do this slide, I swear.
So why static? Why did I not just put a Drupal site up there? Why did I do it this way? Well, museum catalogs are really good candidates for static sites. There's a lot of content, especially when you look at the taxonomies, because every kind of term, like an artist's name or a medium or an art movement, is a clickable term that's a taxonomy. So 1,000 works makes about 4,000 pages. There's a lot of content, but it's not that frequently updated.
Why did I use Drupal? Well, it's good at data modeling. It's very good at creating content types and taxonomies and adding fields to them and so on. Drupal has Views. I think Scott mentioned that. It's the reason I loved Drupal back in Drupal 6, and it's still something that I love. There are some good ways of getting data in.
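As an editorial aside, the cleansing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the speaker's actual code: the tab delimiter, the field names, and the choice to normalise Word's curly quotes to straight quotes are all hypothetical, and `cleanse_field` and `export_to_csv` are made-up helper names.

```python
import csv
import io

# Assumption: Word-style curly quotes are simply normalised to straight ones.
CURLY_QUOTES = {
    "\u201c": '"', "\u201d": '"',
    "\u2018": "'", "\u2019": "'",
}


def cleanse_field(raw: str) -> str:
    """Normalise one exported field value."""
    # The talk describes an embedded double quote written as three double quotes.
    text = raw.replace('"""', '"')
    for curly, straight in CURLY_QUOTES.items():
        text = text.replace(curly, straight)
    # Drop control characters other than tab and newline.
    text = "".join(ch for ch in text if ch in "\t\n" or ord(ch) >= 32)
    return text.strip()


def export_to_csv(raw_bytes: bytes, fieldnames: list[str], delimiter: str = "\t") -> str:
    """Decode a UTF-16 export and return cleaned CSV text (write it out as UTF-8)."""
    decoded = raw_bytes.decode("utf-16")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(fieldnames)
    for line in decoded.splitlines():
        if not line.strip():
            continue  # skip blank lines in the export
        writer.writerow([cleanse_field(value) for value in line.split(delimiter)])
    return out.getvalue()
```

Letting the `csv` module handle quoting on the way out avoids hand-written escaping rules, which is the part the talk says regular expressions struggled with.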
I'm now using Feeds, which I really didn't expect. I didn't easily have a server either. I have somewhere where I could serve static files, but I didn't really have easy access to a server. So dishing out static pages was fun. Tome. And Algolia, we saw that talk last month about it. It has a React-based app called InstantSearch, which can be quite easily inserted as a Drupal block. Now, I use the block and Views to give the JSON over rather than using Search API, partially because it's static and also because it works. But by far the most important reason was that I had three weeks to do all this, mostly by myself in terms of the technical side. There are great content people at the gallery who did all of the prep for getting the materials ready, but I have my other job to do, which is actually a strategy and project management position. It's not a technical role. So I had to do this, and if you've got to do something quickly and it's a beta and you're moving ahead, you use the tools that you're familiar with, right? I'd like to think so.
So let's take a quick look, shall we? Now, I will confess my computer gets quite slow when I screen share, but here we go. Let's go for... and there's the Picasso. You can kind of see I've got seven results, which came back in one millisecond. Let's have a look at this one; this is one of our iconic pieces. Because it's static, it serves really fast, which is quite nice, I should say. And each of the terminologies we've got, like collection area, becomes a taxonomy. And when you run Tome, it does actually scrape Views output, because this is a view of the taxonomy page. It does scrape it and does the pagination and all of that kind of stuff, which is a bit nicer than running the queries live. I mean, taxonomy pages are not the most complicated queries, but they can get pretty big.
But also the way that we imported the data (when I say we, I mean the way I have it structured) is we actually import a work like this, then we import the artist, which is this. This is actually a lookup. And so as a result, anything that references that ID of the artist gets linked. If I click on that name, let's see if they've got more than one work. No, they've only got one in this particular release. That means they get everything linked to it. And then we import what are called narratives. These are the interpretations that are written. There are two different types. There's what's called a primary narrative, and that's like the kind of curated text, and then the secondary narratives. These are ones where there could be multiple works, or it's like a secondary thing written about it. So this one is, let's have a look. This is obviously written about a movement, and it references a lot of different works here. That's kind of cool.
So, yeah, so we've now got... oh, yes, sure. Going back to security, this is probably my favorite part: I only ever run on localhost to build this site. So when I see this message, well, I should get around to it, but I don't have to immediately worry about it, because I'm just outputting static pages.
So let's go to Tome. If anyone is interested, Feeds is kind of good again. Another reason I chose Feeds for data is that when I get around to doing continuous deployments of content, it works in such a way that you can set a sync where it looks at a directory for a CSV, XML or JSON file and just reprocesses it every 15 or 30 minutes, 1 hour, 12 hours. So that could be kind of cool. And the Tamper module works, which allows me to explode out entity references and so on, so that's kind of cool. As I said, apologies, my machine just hates screen sharing, so it runs slowly.
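As an editorial aside, the "explode" step mentioned above (turning one concatenated string of IDs into separate references) amounts to a split-and-lookup. The sketch below is plain Python rather than the Feeds Tamper plugin itself; the `|` separator, the sample IDs and the helper names are assumptions made for illustration.

```python
def explode_ids(value: str, separator: str = "|") -> list[str]:
    """Split a concatenated ID field into individual, trimmed IDs."""
    return [part.strip() for part in value.split(separator) if part.strip()]


def resolve_references(value: str, lookup: dict[str, str], separator: str = "|") -> list[str]:
    """Map each exploded ID to its looked-up label, skipping unknown IDs."""
    return [lookup[item] for item in explode_ids(value, separator) if item in lookup]


# Hypothetical artist lookup keyed by source-system ID.
artists = {"12": "Artist A", "40": "Artist B"}
print(resolve_references("12| 40 |99", artists))
```

Skipping unknown IDs silently keeps the example short; in a real import it would probably be worth logging them so the source data can be corrected.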
But the way Tome works is this. The module is Tome, tome.fyi, a static site generator for Drupal 8. You can do it the way I'm doing it, which is the pretty easy way, which is to enable a module. And then if I want to generate a new static site, I quite simply just click "generate static site". And that's it. I then put in the URL that it's going to be linked off to, and it does stuff a little bit more clever. So that's the URL. Then it's just scraping the pages, because it does know the Drupal parlance and it does listen to a lot of rules. One thing I don't think it does well is path redirects, so I've got to manually do those. But it just generates out quite a nice... let's see if I've got one up. No, that's Algolia, yeah. Based on whatever your Pathauto rules are, say I've got slash object, it'll just go slash object slash, and I'm using the ID, so slash 16. And then inside that folder 16, it'll make an index.html. And then you can download it, and it's pretty good. It takes a little while. I've got to say it's not like Jekyll or something like that. It takes me about 40 minutes to generate the site, but I've got things that just run it off in the background.
You can do Tome Sync, which I'm not using, which allows you to commit your content as JSON. So you make a change, you can commit it, spin down Drupal, keep it all local. And it does actually have this module called Lunr, which is for client-side static searching, so you don't have to use Algolia. Lunr's like a play on Solr, and it's just a lightweight search in JS. So that is, I guess, a little bit about Tome and a little bit about this project.
So I made it in three weeks. So what's next, and how much tech debt? Well, look, there's a lot. Not going to lie to you. I can't see you, but I'm not lying to you. A lot of it was because I had to learn this quite legacy application and how all of the data structures worked.
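As an editorial aside, the output layout described earlier (a path such as `/object/16` becoming a folder `object/16` containing an `index.html`) can be expressed as a small mapping. This is a sketch of that layout only, not Tome's own code, and `static_file_for` is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def static_file_for(alias: str) -> str:
    """Return the relative file a static export would write for a URL path."""
    # Drop empty segments so leading and trailing slashes do not matter.
    parts = [segment for segment in alias.split("/") if segment]
    return str(PurePosixPath(*parts, "index.html"))


print(static_file_for("/object/16"))
```

Because every page is a directory with an `index.html`, an ordinary file server can serve clean URLs without any rewrite rules, which fits the constraint of only having somewhere to host static files.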
And some of the calls I made were the wrong calls, but look, it is what it is. The idea is we're going to transition out of the beta by March 2021. So I've got a bit of time to do more releases. We're getting great feedback about what our public want to see. I've had, I think, about 17,000 searches run through it, so I've got a bit of an idea of what people are looking for. Some of the results from the survey about what they want to see next have surprised me a little bit. We reopen on the 22nd of June for QAG, and in August for the Gallery of Modern Art. If you're in the Brisbane Meetup and you want to come back, we are doing social distancing and there's lots of hand sanitizer. I am interested to see whether the results change once we're open, because there are definitely a lot of people saying they want to see things that are on display inside this beta.
And one of the things that I think has been a real win out of this is, you might remember from that diagram, that all of the data improvements are being fed back. Well, that has resulted in some really significant achievements. Some of the staff have been aligning place names, which were just text strings, against vocabularies like the Getty, which is a library of place names and can identify things a bit better. So if you were to say, you know, the USSR, which doesn't exist anymore obviously, but it's a work from that period, it would be able to locate it within that space, which has now got 17 countries in it. So we're now able to try and do some slightly more clever things with maps and time and dates, because we're not just dealing with blobs of data. We're taking it out and improving it. A lot of that's manual processing, of course, but one of the things that Drupal does do is strip out a lot of the redundant HTML which Microsoft Word adds. So through this process we're getting better data, and there should be a new release in two weeks, which will have lots of data.
So look, I went 15 minutes instead of 30 because I spoke fast. I started slow and then I sped up. So that's really all the things I wanted to say. More than happy to play around a little bit in the site if anyone wants to see anything. We've got Creator Search, if you want to look for others. On any particular page, we've got this nice little hidden catalog search. There are some gorgeous works on this, and it's nice that we can bring them to life a little bit better. I really look forward to the next phase, where we start bringing in maps and video and stuff like that which relates to these works. Now they're not just pretty pictures; we've got this awesome research and interpretation that really talks about the significance of these works. So if you are interested in learning a bit about Queensland's artistic heritage, have a look.
All right, thanks a lot, Morgan, for that. That was really impressive, cool. Now you may have spoken a little bit quickly, but we've surely got heaps of questions for you, so don't worry about that. I'll just kick it off with one. You've mentioned Algolia there, and being able to track the searches and see what people are searching for. Do you want to just say a few words on what Algolia has brought to you on the back end? I presume they've got some way of seeing the statistics on the searches, which we may not get in Drupal unless we track them explicitly. So is that an advantage you've seen there? Oops, you're on mute there, Morgan.
I did the smart thing of muting when I started talking rather than the other way around. Look, the major advantage of Algolia was that initially, and this got rushed along pretty fast, it was about trying to visualize our data internally. So it started as an Algolia-first project in that.
I was getting some stuff out of their system, and it was easier to visualize if you had a search like Algolia. What attracted me to it was more that I could put a lot of logic into tuning the search itself. I could bring out the facets and really emphasize things, so that artist name would rank a bit higher than in just a kind of flat, dumb search, and start tokenizing certain components. So it started with an internal project where I wanted to make our collections more searchable. Yeah, I could have used Solr, could have done whatever, but Algolia is really nice to work with. And we're on the paid tier; we're forking out 29 bucks a month, which gives us a quarter of a million operations. That sounds like a lot, but I imagine that once we get this out of beta it probably would be a monthly spend. But if you're looking at running a static site and then just putting all of the dynamic elements through Algolia, which is what a museum catalog could really benefit from, it is a really good way.
In terms of analytics, they're actually pretty crap, because unless you pay the really premium price they disappear after seven days. I could just be looking in the wrong spots, but there doesn't seem to be an easy way to pull out the analytics without manually going in every seven days and pulling everything out. Yeah, it's more the actual power of the search itself, and the fact that I can customize it, that attracted me.
Well, thanks. Other questions there for Morgan?
I have a question about the static site generator. I actually found it quite useful for one of my clients who didn't know how to run Drupal locally: I could quickly generate the site, connect it to Sass and CSS, and let them actually play around with styles. That was quite useful. Did you find any other useful examples of how to use it, or did you just use it for deploying to production?
And the second question: where are you hosting, if it's not a secret?
It's not, and that example you gave was a really useful one. I did some basic stuff in Bootstrap 4, but I handed it over to somebody in the team to do the final design on it. I just handed the whole site over, and she just tweaked the one CSS file that was generated, and I was able to put that back into the theme. So that was a really useful way of getting the theming done without having to teach a non-Drupal person Drupal. I mean, I didn't have to document how you do the build and actually create that site locally. So that was really good. And it's actually hosted on the gallery servers. Their current corporate site is a Squiz Matrix site, which is what it is. But any of the subsites on the qagoma domain are run locally. And that's why I said it was difficult for me to get a web server: it was easy for me to get somewhere we could serve files from, but it would have been hard to get a full server.
Cool, thanks.
Hi Morgan.
Hey David, how you doing, mate?
Yeah, good. Yourself, mate? Just a couple of questions. Did you encounter any gotchas along the way that you would kind of...
Oh yeah, I've heard about that one. Every sixth time you generate a site in Tome, it only generates a quarter of the pages. Serious, like clockwork: every sixth time you run it, it doesn't generate all the pages, and then it's fine.
Ah, the old six by four rule. Incredible.
It's a thing. Look, the other one was redirects. Between releases, it's okay with the pages that have just IDs in the URLs, but some of them are actually made using aliases, like the stories themselves, and between releases they might tweak the names of the essays that accompany the works.
And as a result, what I found out is, A, there weren't redirects when the page had been changed, and B, our deployment pattern wasn't wiping the directory before uploading. So we had these kind of old stubs hanging out there. I probably shouldn't have actually said that, because I only found that out a few days ago, so it's still there. But the thing where not all the pages get generated, that's neither here nor there really. If that happens, just do it again. But the other thing was definitely a gotcha.
Then just another question. What was the reason why you chose this particular site generator? Did you look at anything else aside from it?
Convenience is the main one. When we go to production, I'll probably reassess the whole build. Right now, it was about getting something built from the ground up. And I had three weeks, but I spent two weeks on data, so I had to do pretty much the whole thing in a week. So that was the main reason. Oh, I just remembered another one too for Tome. If you're using the kind of Drupal Composer project, it's fine. There's a Tome Composer project that's fine. But I tried it, and I don't know why I did this, I was just interested in playing around, on a site that hadn't followed a standard Composer build, and it wouldn't work.
Hey, I'm just interested. Have you got the entire deployment process automated? Because it seems like you're not actually doing any content management inside Drupal. Drupal is kind of just an aggregator and processor of this other old system. So could you potentially just, in a CI pipeline, boot up the Drupal site, run the Feeds import, run the Tome generate, and then push all the HTML files to your server?
That's the idea, one day, yeah.
Yeah, that's awesome.
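As an editorial aside, the redirect and stale-stub gotcha discussed above can be checked mechanically by comparing the pages of two generated releases. The sketch below is a hypothetical helper, not part of Tome or of the gallery's deployment; it assumes each release is a directory tree in which every page is an `index.html`.

```python
from pathlib import Path


def generated_pages(root: Path) -> set[str]:
    """Collect the relative paths of every index.html under a release directory."""
    return {page.relative_to(root).as_posix() for page in root.rglob("index.html")}


def stale_pages(previous: set[str], current: set[str]) -> list[str]:
    """Pages that existed in the previous release but not in the current one.

    Each of these needs either a redirect to its new alias or deletion
    from the server, otherwise it lingers as an old stub.
    """
    return sorted(previous - current)


# Hypothetical example: an essay was renamed between releases.
old_release = {"object/16/index.html", "stories/old-title/index.html"}
new_release = {"object/16/index.html", "stories/new-title/index.html"}
print(stale_pages(old_release, new_release))
```

Running a check like this before each upload would flag renamed aliases even when the deployment does not wipe the target directory first.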
I mean, Tome actually does that. If you go to the Tome homepage, that's one of the workflows they set out: you can spin the site down and just spin it up for the generation or for regenerating the content. Yeah, so that's the goal.
Well, people are coming in and out. I am interested, Morgan: you're in beta now, and I get that. When you go to production, how are you actually going to publicize the asset?
What do you mean, sir?
In terms of advertising the URL, or, you know, advertising that it's an asset out there for your users to consume, how are you going to market it?
Oh, well, we have that dodgy site, or sorry, I shouldn't say that. We have our existing catalog already. So right now, collection.qagoma is the existing site, and this one is collection-online-beta.qagoma.qld.gov.au. When we go to prod with this, we will just switch over to the new collection.
Okay, cool.
And the good thing about the current collection is that all of the URLs are offsets off a temporary table created when you run a search. So there are no persistent URLs, which is exactly what you want with a persistent resource. That's me being completely facetious and sarcastic. It's not good. As anyone who's presented knows, it's kind of weird when everyone's muted. Scott, you would have got this: you can see people kind of talking, but it's dead silent.
Yeah, humor in the land of Zoom: everyone's a tough crowd.
Yeah, when I present, I like to say things that, you know, and it's pretty hard to judge how well they're going down.
Maybe after virtual backgrounds, the next thing will be room generation. Do you want to be in a big conference or a small conference?
Cool, well, I think that's me. But one closing thought is that if you do have a client that's not really doing much with their site, but they're on Drupal, I reckon Tome it and spin down the site.
Particularly if there are no interactions, in which case the site is effectively static anyway. So it's now static.
Well, yep, thank you so much, Morgan. That was really, really entertaining, and it's always great to see new techniques. So yeah, thanks for sharing that today.