Let's get started. Hello and welcome, everyone, to the session. My name is Thomas Seidl, I'm drunken monkey on drupal.org. You're all hopefully here to hear a quick overview of the Search API module and see me do a live demo of how to set it up on a new installation. A quick warning: 40 minutes is pretty short for this talk, so I'll try to speed it up, maybe talk a little bit faster, and drop all the good jokes. If you have questions afterwards, either just watch the video online again or come find me and I'll be sure to answer any questions you have. So, a quick overview of the module. The Search API is a contrib module I developed in 2010 for Drupal 7 and Drupal 8, and it's made to be a replacement for the core Search module. It gives you complete search functionality, just much more powerful and flexible than the core Search module. But powerful and flexible, as usual, also means more complex, which is why I'm standing here and telling you how to set it up, because it's a bit hard to figure out for newcomers. With the Search API, you can use different dedicated search backends like Solr or Elasticsearch, as well as the normal database, for searching, and it comes with great Views integration. So you can use the normal Views tools, which most of you will be familiar with, I guess, to set up your search. As for the internal structure, it consists of two config entity types. For simple sites, you'll usually just have one of each. A search index defines the kind of items you want to index and search, like content, media or user profiles, which fields you want to index, and how they should be processed. All of that is completely backend-independent. The server then defines where this data will really be stored and retrieved during searches. So this is the backend-dependent part, which connects to the database, Solr or Elasticsearch.
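The split between a backend-independent index and a backend-dependent server can be pictured with a small Python sketch. This is only an illustration of the idea, not the module's actual code (which is PHP); the class names `Backend`, `InMemoryBackend`, `Server` and `Index` are hypothetical.

```python
from dataclasses import dataclass
from typing import Protocol


class Backend(Protocol):
    """Backend-dependent part: where indexed data is stored and searched."""

    def index(self, item_id: str, fields: dict) -> None: ...

    def search(self, keyword: str) -> list: ...


class InMemoryBackend:
    """Hypothetical stand-in for a real backend such as the database."""

    def __init__(self) -> None:
        self._items = {}

    def index(self, item_id, fields):
        self._items[item_id] = fields

    def search(self, keyword):
        kw = keyword.lower()
        return sorted(
            item_id
            for item_id, fields in self._items.items()
            if any(kw in str(value).lower() for value in fields.values())
        )


@dataclass
class Server:
    """Says where the data lives; wraps exactly one backend."""

    name: str
    backend: Backend


@dataclass
class Index:
    """Says what gets indexed (which fields); knows nothing about storage."""

    name: str
    field_names: list
    server: Server

    def index_item(self, item_id, item):
        # Only the fields configured on the index are sent to the backend.
        fields = {f: item[f] for f in self.field_names if f in item}
        self.server.backend.index(item_id, fields)

    def search(self, keyword):
        return self.server.backend.search(keyword)
```

Anything that only talks to `Index` keeps working when `index.server` is pointed at a different `Server`; the items just have to be indexed again on the new backend.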
And the idea is that all other modules, all other features that are built on top of the Search API then just reference the index, so that if you later switch the backend, things will just keep working the same with the new backend. So with that out of the way, let's begin. Can everyone see that okay? I hope it's large enough. We begin, of course, by installing the Search API module. Along with the Search API module, we'll also need to install one module that provides a search backend for a server. For this we have the option of using the Database Search backend, which comes with the Search API project, or the Solr module, which I'll show you later, or some other backend modules in contrib. There's also the Database Search Defaults module, which gives you a ready-made installation of Search API based on the standard profile. So if you have the standard profile, you can get up and running right away and then just customize that. But of course, that would be a bit boring for this talk, so we'll not use it here. Also, with Umami, of course, it won't work. As I said, you need the standard profile. Once we have the module enabled, we can go to its configuration page. I don't know who of you was already at Vinov's talk on Monday. Okay, not a lot. Okay, that's great, because he actually talked about some of the same topics and it would be kind of boring. But yeah, we go to the configuration page and bookmark it for later reference. First we add a server, so that we can later add the index right to it. As I said, it will be a database server, so we just call it "Database server". Everything else can stay at the default. For the minimum word length, three is usually a good default value. And that already gave us the search server, so that's very, very simple. The index is a bit more complicated and has more options. So first we need to decide what we actually want to index.
And for this demo, we'll just index normal content entities. But as you can see here, all other content entity types are also available for indexing, and contrib modules can add even more of those. These options can all stay as they are, we just want to select the database server as the search server. In the index options, I need to quickly talk about the "Index items immediately" option. Normally, items are indexed during cron runs. So when a node gets changed, it just gets marked as dirty in an internal table. Then, during cron runs, all those dirty items will be indexed on the search server, and only then will they appear in search results. As this option already describes in its label, it causes the items to instead be indexed right away after they get changed. So when a new node gets created, or an edited node gets saved, it will be added to the search server immediately, which of course has lots of benefits. For example, maybe you're using the Search API on your site also for normal listings of content. When a visitor creates a node, if they are able to do that, and then doesn't see it in a list, they'll wonder what happened. So this can surely prevent some head-scratching. Also, if you're using the indexed data for search access checks, disabling this option could be a security risk. If you unpublish a node and it should not be in the search results anymore, the changed "published" field won't get indexed until the next cron run, so the node might still appear in search results until then. Therefore it could be important to have this option on. The downside is that dedicated search backends especially, like Solr or Elasticsearch, are very happy to just receive 500 items in one go, but are very unhappy if you ping them with a new item every 10 seconds.
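The trade-off between immediate indexing and cron-based indexing can be sketched in Python as follows. This is a simplified model under stated assumptions, not how the module is implemented: `IndexTracker`, `item_saved` and `cron_run` are hypothetical names, and a plain dictionary stands in for the search server.

```python
class IndexTracker:
    """Toy model of "index immediately" versus "index during cron runs"."""

    def __init__(self, index_immediately, batch_size=50):
        self.index_immediately = index_immediately
        self.batch_size = batch_size
        self.dirty = {}            # changed items waiting for the next cron run
        self.indexed = {}          # what the search server currently knows
        self.backend_requests = 0  # how often the search server was contacted

    def _send_batch(self, batch):
        if not batch:
            return
        self.backend_requests += 1
        self.indexed.update(batch)

    def item_saved(self, item_id, data):
        if self.index_immediately:
            # One request to the search server per saved item.
            self._send_batch({item_id: data})
        else:
            # Only mark the item as dirty; search results stay stale for now.
            self.dirty[item_id] = data

    def cron_run(self):
        # Index up to batch_size dirty items with a single request.
        batch_ids = list(self.dirty)[: self.batch_size]
        batch = {item_id: self.dirty.pop(item_id) for item_id in batch_ids}
        self._send_batch(batch)
```

With `index_immediately=False`, an unpublished node keeps its old `published` value in `indexed` until `cron_run()` is called, which is the security concern described above; with `index_immediately=True`, every save costs one backend request.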
So the indexing performance and searching performance will go down for larger sites with a lot of frequent changes if you're using this option. My recommendation is to always keep this enabled for small sites and, only for larger sites, investigate whether changing this makes sense for your site, and make sure you don't create a security problem that way. For our small site, of course, keeping this is fine. Now that we have added the basic index, we can configure which fields the index should use. One great field to have here is the rendered HTML output, which will basically just include everything a user sees on the page in the search index, which in a lot of cases is, of course, exactly what you want. For that we just have a dedicated view mode, which I already pre-configured for the search index, because of course you don't want to have field labels, edit links, share links or something like that in the search index. That would be pointless and just muddy the search results. Then we just add other fields that we want to search or filter on later, maybe. Body we don't need, because this is, of course, already part of the rendered HTML output. But content type could be nice for filtering. Just add fields that make sense. Normally, of course, you would first create the index, maybe add some fulltext fields, and then later just think about which other fields you need for filtering or sorting, and go back and add those. So yeah, okay, I seem to have added tags twice. I hope I didn't forget anything. Another thing: we did index the title, because, A, that's normally not part of the rendered HTML output, and B, we want to be able to boost this field. The boost basically says that words in the title are eight times more important than words in the normal body, so that items containing a search keyword in their title will be ranked further up.
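The effect of a field boost can be illustrated with a deliberately simple Python sketch. The scoring formula here (boost times number of keyword occurrences) is an assumption made for illustration; real backends use more elaborate relevance formulas. The field names `title` and `rendered_item` are likewise just example names.

```python
def score(item, keyword, boosts):
    """Sum, over all boosted fields, boost * occurrences of the keyword."""
    kw = keyword.lower()
    total = 0.0
    for field_name, boost in boosts.items():
        words = item.get(field_name, "").lower().split()
        total += boost * words.count(kw)
    return total


# Title words count eight times as much as words in the rendered body.
BOOSTS = {"title": 8.0, "rendered_item": 1.0}

in_title = {"title": "Pasta basics", "rendered_item": "How to boil water"}
in_body = {"title": "Dinner ideas", "rendered_item": "pasta pasta pasta"}

ranked = sorted(
    [in_body, in_title],
    key=lambda item: score(item, "pasta", BOOSTS),
    reverse=True,
)
```

Here a single title match (score 8.0) outranks three body matches (score 3.0), so `in_title` ends up first in `ranked`.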
So we save this fields page, and then we get the warning, or message, that we need to reindex (or, in this case, index for the first time) so that the settings can take effect. But first we want to look at... Why is the autocomplete tab there? Okay, did I enable autocomplete by accident before? Did anyone pay attention? Yes, okay, good. Well, that makes sense, I didn't mean to, but anyways: processors. So what are processors? They are a type of plugin configured on the index, and they can really do lots of different things, which is why they have such a generic name as "processors". They can change the indexed data, add new fields, and transform search queries or search results. There are countless of them, some provided by the Search API itself and some in contrib modules. For example, ones that are contained in the Search API itself are the one that makes searches case-insensitive, which is of course something you almost always want, the one that provides highlighting for search results, or the one that adds access checks, as I mentioned earlier, so automated content access on the search results. Normally, access checks can't be provided generically by the Search API, so if you define an index on something other than nodes or comments, then making sure nobody can see stuff they're not supposed to see is, well, your concern, because it's just not possible to do this generically, unfortunately. But for content and comments, there is the ready-made processor that does it for you. What you should keep in mind in any case is that some of these processors can cause problems with dedicated backends like Solr or Elasticsearch, because those already have their own internal way of making searches ignore case, and of stemming and tokenizing and whatever. They just get confused if you already do that on the Search API side, and then things won't work properly and I get unnecessary bug reports. So just make sure to keep that in mind when using a dedicated search engine.
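What the text-oriented processors mentioned above do to a field value can be sketched roughly in Python. This is a simplified stand-in, assuming a naive regular-expression tag stripper and a minimum word length of 3; the function names are hypothetical and the real processors are configurable PHP plugins.

```python
import re
from html import unescape


def html_filter(text):
    """Drop HTML tags so that markup is not indexed as words."""
    return unescape(re.sub(r"<[^>]+>", " ", text))


def ignore_case(text):
    """Lowercase everything so that searches become case-insensitive."""
    return text.lower()


def tokenize(text, min_length=3):
    """Split into words and drop those shorter than the minimum length."""
    return [word for word in re.findall(r"\w+", text) if len(word) >= min_length]


def preprocess(text):
    # The same chain has to run on indexed text and on search keywords,
    # otherwise the two sides will not match each other.
    return tokenize(ignore_case(html_filter(text)))


tokens = preprocess("<p>Cooking <b>Vegetables</b> at home</p>")
```

For this input, `tokens` is `["cooking", "vegetables", "home"]`. A dedicated engine that already does its own case folding and tokenizing would receive text that has been processed twice, which is the kind of confusion described above.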
But for a database backend, most of these make sense. We want automated content access. We want highlighting. We want the HTML filter, to not index HTML tags. We want to ignore case. Stemming and tokenizing are also standard search features. If you're not sure what they do, check Wikipedia. The default processor settings are largely fine. Just for the HTML filter, we need to disable it, or should very much disable it, for those fields that don't actually contain any HTML, because it doesn't make any sense for them. So we just save these settings now, and now it's finally time to actually reindex. For that we could wait for a cron run, or just go to the View tab, click "Index now", and great. We now have a working search setup, just no way for the visitor, or actually us, to search. So we'll use the Views integration which I mentioned earlier. Just create... who here has never used Views before? Okay, just making sure. I mean, I wouldn't have explained it really, but yeah, just making sure. So we call it "Search". The type to use here is "Index Content index", so "Index" and then the name we used. Then we realize something is not working with caching. Then we create the page. I just cleared the cache, if you're wondering. We want to use a pager, create a menu link, in the main navigation maybe, and then save and edit the view. Then you get the normal Views UI. Here we can change to have a rendered entity with the search result view mode I set up earlier. We could also add the search excerpt as a field, but I don't have time for that right now. So now we look at the results and see, yeah, we already have search results there, unfiltered at the moment. And as you see, we also have the same result twice, once in English and once in Spanish. So the first thing we should do is, of course, add a filter for language. When you add a filter, you have all the fields on the index available as filters. But for now, we just want one on the item language, which is always present.
You don't have to index that specifically. And one fulltext search, which gives you a normal search keywords field like you'd expect in any fulltext search. The item language should, of course, just correspond to the language selected for the page. We could also expose the filter if we wanted to, but let's not do that now. And fulltext search, of course, largely only makes sense when exposed. We could make it required if we don't want people to be able to get an unfiltered listing without any keywords. But sometimes allowing that is what you want to do, and yeah, we want that in this case. We'll use a placeholder because it looks nicer. The final setting we'll change is the minimum keyword length, which should correspond to the one you have picked on the search server, and also on the tokenizer plugin, which also has a default minimum length of 3. This just gives the user a UI warning if they enter too short keywords. Then we could also add sorting. By default, sorting is done by relevance, so the most relevant items get shown first. We want that too, but if the user doesn't enter any keywords, then there is no relevance. The relevance for all items is the same, 1 usually. So we want to have a backup sort in this case, and we sort by the creation date for that. Both of those, of course, descending: newest items first and most relevant first. And then we put the relevance above, so it's the one that gets used if there is any relevance. So what more is there to do? We could add headers, no-results behavior. Yeah, maybe let's add a result summary header at least. And since we do want a search box on every page of the site, we expose the form in a block. And we could activate caching. First of all, these two don't actually work with the Search API. Only these two do work, in some sense, and you should evaluate for yourself whether using one of these caching strategies, rather than none, makes sense for your site.
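The sort order configured above, relevance first with creation date as a backup, both descending, amounts to sorting on a two-part key. A minimal Python sketch, assuming each result is a dictionary with hypothetical `relevance` and `created` keys:

```python
def sort_results(results):
    """Most relevant first; ties are broken by newest creation date first."""
    return sorted(
        results,
        key=lambda result: (result["relevance"], result["created"]),
        reverse=True,
    )


# Without keywords every item has the same relevance, so only the
# creation date decides the order.
no_keywords = [
    {"id": "a", "relevance": 1.0, "created": 100},
    {"id": "b", "relevance": 1.0, "created": 300},
    {"id": "c", "relevance": 1.0, "created": 200},
]
ordered_ids = [result["id"] for result in sort_results(no_keywords)]
```

Here `ordered_ids` is `["b", "c", "a"]`. As soon as the relevance values differ, they take precedence over the creation date.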
We save the view. And now we just need to enable the exposed form block. We add this to the pre-header region. We don't want to display the title. That's like you would do for any normal view, too. Save blocks. Now we go back to the site and we should have a search block right here. Yeah, and there it is. So when we now search for something, we'll get the results page as we expected, hopefully, where there are just cooking results. And as you can see, if we switch to Español, sorry if I don't pronounce it right... No, of course, no cooking. "Cooking" won't appear in any Spanish content. Okay, but if we just use the unfiltered search, we'll get 18 results, so exactly all of the Spanish content. So let's see. If we now want to find special information about cooking vegetables, let's say... back to English, sorry. Okay, we get two results. We used a phrase search, so that should mean that exactly those two words appear in this order, one right after the other, in the content. But watercress soup? Let's check that out. And yeah, this is not actually found in this item. If you go down to the other one, let's see there. Yeah, "cooking vegetables" is a part of the second item, but not of the first. So what's the problem here? Anyone know? Well, the database backend (rhetorical question, sorry) is currently not capable of doing phrase queries. It will accept them, so the quotation marks are just ignored, but it will find any item that has both the words "cooking" and "vegetables", or "cook" and "vegetable", or yeah. So if we want phrase searches, or faster searches, or better matching behavior, or just have lots more items, then we should switch to Solr. So, Apache Solr. If you don't know it, Apache Solr is an open-source search engine, so it's dedicated software that implements a search server for you. And it really has all the search features that you could ever imagine or want, more or less.
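The difference between a real phrase query and the "all words anywhere" matching that the database backend fell back to can be shown with a short Python sketch. The function names and the sample texts are made up for illustration, and stemming is left out.

```python
import re


def tokens(text):
    """Lowercased words of a text, in order."""
    return re.findall(r"\w+", text.lower())


def matches_all_words(text, query):
    """Every query word appears somewhere in the text, in any order."""
    return set(tokens(query)) <= set(tokens(text))


def matches_phrase(text, query):
    """The query words appear in the text directly after one another."""
    words, phrase = tokens(text), tokens(query)
    return any(
        words[i:i + len(phrase)] == phrase
        for i in range(len(words) - len(phrase) + 1)
    )


soup = "Cooking this soup is easy: add the vegetables at the end."
tips = "Some tips for cooking vegetables without losing flavour."
```

Both texts satisfy `matches_all_words(..., "cooking vegetables")`, but only `tips` satisfies `matches_phrase`, which mirrors why the demo search returned two results where a phrase query should return one.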
And it can deal with almost any amount of documents, so it's happy to take 50 million as well. So once the database gets too small, switch to Solr, and you shouldn't have any problems if it's properly set up. Now some of you might wonder: what about Elasticsearch, what about Xapian or other search backends, which are also great? I'm not saying Apache Solr is better than any of these, but what's true in any case is that its Drupal integration is far more mature and superior to the others, as far as I have seen. Historically, Apache Solr has always been the go-to solution for Drupal sites, since, I don't know, Drupal 4. So it's just a lot better supported and better integrated. I usually just suggest using this, even though, for example, I like Elasticsearch very much personally, but it's just not properly integrated with the Search API at this moment. How to set up Solr? First of all, there is the option of just paying someone to do it for you. There are web hosters which provide a Solr service for you. You can just use one of them, or you can set it up locally, which I'll show you in a moment, to easily evaluate it, or even on your own server if you have the knowledge to do so. Either way, but especially if you set it up yourself, do not forget security. By default, the Solr server is just accessible to everyone. Everyone who guesses the right URL can just go to the server, read all the items, index their own, delete items, do whatever they like. So definitely do not neglect securing your server. I've seen production servers where you just needed to put in the standard Solr port and "/solr" and you were in. And this is really, really, really bad. But I'll just do a basic local setup without security. You'll have to research some options for adding security to a Solr server yourself. So, doing that: first of all, let's start our Solr server. I've already downloaded the package and, of course, verified it with the PGP signature.
I'm a nice guy. Extract it here. Open the terminal and then just execute this command: bin directory, solr command, start. After a few seconds, you already have Solr running on your server. A few more seconds maybe. Yeah, and that's it. Solr is running. Now we need a core, a Solr core, which is a dedicated Solr index for our site. For that we go to server/solr and create a new directory for our site. You can call it anything. I'll call it "demo". Then we need a conf directory inside of there, which needs to be named exactly like this. And now we need configuration files for Solr. If someone has done this previously, a few years ago maybe, then they will say, "Oh, oh, oh, I know this one. You go to the Search API Solr module, there's a folder there, and you just copy over the configuration files." And that used to be the way to do it. But now the Solr module actually comes with a configuration builder, which will build the configuration based on your own configuration. So you can set up your own field types, and most importantly, it will recognize which languages are enabled on your site and include all the language-specific configuration. That's really great to have. So let's see that in action. First of all, we again have to enable the module, of course. There's also a defaults module, once again, if you want to quickly test it out with a standard installation. Now we go to the Search API administration overview and add a new server. We call this, unimaginatively, "Solr server". Choose Solr as the backend, of course. And then the standard Solr connector is fine for us. If you don't know what the others are, just don't use them. Standard is fine for now. Basic auth might be a way to get proper security restrictions. The other options can all stay the same, except we need to enter the core name we chose earlier into the Solr core field. And save the server.
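The directory layout for the new core, as created by hand above, can also be sketched as a small Python helper. The function name `create_core_dirs` is hypothetical, and the `server/solr/<core>/conf` layout is taken from the steps described in the talk; the Solr server itself is started separately (in the demo, with the `solr` command in the `bin` directory).

```python
from pathlib import Path


def create_core_dirs(solr_root, core_name):
    """Create <solr_root>/server/solr/<core_name>/conf and return its path.

    The generated Solr configuration files are later extracted into the
    returned conf directory.
    """
    conf_dir = Path(solr_root) / "server" / "solr" / core_name / "conf"
    conf_dir.mkdir(parents=True, exist_ok=True)
    return conf_dir
```

For the demo this would be called as `create_core_dirs(path_to_extracted_solr, "demo")`, where `path_to_extracted_solr` is wherever the Solr package was extracted.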
Now it says the server could be reached, but the core could not be accessed because, well, we haven't made it yet. We need the config files, and we can get those just by clicking here. Open with, and you'll see there are files for both English and Spanish here. We extract all of those into the folder we just created: server, solr, demo, conf. And now when we go back to the folder view, we see the files are extracted properly. So what we just need to do now is tell Solr about the new core. We can just click this link and it takes us to the Solr admin panel, at least if you haven't secured it in a way that would prevent that. Then we go to Core Admin. If there aren't any cores, it already gives us the new core form. Otherwise we need to press "Add Core". But yeah, not that hard to figure out. And with this, we already have our core up and running. All right. Now if you refresh, we see the core could be accessed. There aren't any items on it yet, but it has the right schema. It already works. We just need to move the content index to this server: Solr server, save, then reindex. And the content is all on the Solr server now. And when we re-execute the search... hmm, okay, the search content is definitely there, but "cooking vegetables" isn't found. Who knows why? Who paid attention? No one? Okay, ah, yeah. "You have the filters already set up, so it gets confused." Exactly. Yeah, the processors are still set up for the database index and it gets confused. Very well done. Okay, we go to the processors. And once we're there, we already see a big bold warning: "It is recommended not to use this processor with the selected server." So let's just follow these hints and disable the three processors that have this warning. And for the HTML filter, something that isn't explicitly said: tag boosts don't make any sense for that, as far as I know. So better just disable those. But they also shouldn't do any harm. So, reindexing now. We have all day. No hurry.
This is, of course, the great thing about live demos: everything always goes wrong. So if I can do it, you can definitely do it. If it works well enough in a live demo, nothing can go wrong for you at home. Okay. "I think you maybe forgot some checkboxes to tick, there are some checkboxes I think you forgot to tick." No, they don't have any warnings. Content access, highlight, HTML filter: they are all fine, and the others don't have any warnings. So no, it's fine this way, and indexing finally worked. Don't know what the holdup was. And now, back to the search. You see, we have this one article which does have "cooking vegetables", and we don't have the other one anymore. So this now works. Solr upgrade complete. And actually, I was much faster than during practice, so we might have the chance to look at one more module if you want, or maybe go deeper into an existing one. So, lots of extension modules are available, as I said. Two of the most popular ones are Facets, which gives you a faceted search, and the Autocomplete module, which gives you autocomplete, and there are several other modules. Some of you might not be aware of the "Projects that extend this" link on project pages, or on some projects' pages at least. So on the Search API project page, this link will take you to a complete list of all the modules that extend the Search API in some way or integrate with it, and didn't forget to put in this information. So I now want to talk about the Facets module, unless we have a clear majority for autocomplete. Autocomplete? Oops. Okay. Facets, facets. Yeah, I thought so. Overruled. Everyone who wants to see location out of the room. I'll go to talk to Marcia as well. Where is she? Sorry. Good. Ah, okay. I did enable autocomplete already, but sorry, still no. Let's enable the Facets module. So we install that. Or range, which might also have been a nice idea, but probably no time for that.
Go to the Search API, then go to the "Search and metadata" section. There we can add facets, again for all the fields we have indexed. We have to pick the search in question, so that's the Search view we set up, in this case. The field we want to filter on: content type is, of course, a very classic example. Just being able to filter by whether it's an article, a basic page or a recipe that matches. I want to see a list of links. And there's now an endless list of options and processors which you could enable. You can read through all of them if you want sometime, but not right now. We just want "Transform entity ID to label", because the content type field contains references to content type entities, so we need that to have the label properly displayed. It's "List item label", I think. No, maybe both work, but I practiced. Chill, chill. It's fine. Okay, let's use the AND operator for this one, so we can only pick one of the content types. And the rest should be fine. So let's just save this, and maybe add another facet on, let's see what we have. Maybe tags. Tags is also, of course, a very classic example of a facet. For this we might want to use checkboxes instead, so that we can filter by more than one value, or at least so that it's also represented in the UI that way. Maybe set a soft limit for that, so it doesn't show us all the countless tags right away. For this we also want to transform the entity ID to a label. Tags in this case are not a hierarchical taxonomy, so the hierarchical features don't make sense. But everything else should be fine like this. A hard limit might be a good idea if there really are thousands of tags, but in our case it's not that many, so no hard limit is also fine. We save this, and lastly I want to add a facet on the difficulty field, which is only present on recipes. So for this it makes sense to show the facet only if the user first clicked "Recipe" in the content type facet.
So the way to do this is very simple. You just go to "Dependent facet", enable the condition for content type, and check whether the facet is set to specific values, which makes sense. And now we enter the machine name of the content type. Okay, maybe not that simple, but it's simple enough, let's say it like that. And for this one we use "List item label", because the difficulty field is not a taxonomy term reference, it's just an option list with three possible options. The rest is fine again, so let's save this. Now we just need to add these to the site. We add the facet blocks to the sidebar. Right, we can't filter by category, unfortunately. That would have been easier, but yeah, we want it like that. The difficulty too, and tags was the last one. I hope I'm not forgetting some setting here. Of course, you could, if you want, restrict the blocks to only be shown on the search page, but if no search query is present then they won't appear anyway, so it's not necessary to do that. Good, yeah, that's it. Do I need to save this? Let's be sure.
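The facet setup configured above boils down to counting field values over the current results, plus a rule for when the dependent facet is shown. A minimal Python sketch, assuming results are dictionaries with hypothetical `type`, `tags` and `difficulty` keys and that `recipe` is the content type's machine name:

```python
from collections import Counter


def facet_counts(results, field_name):
    """Count how many results carry each value of the given field."""
    counts = Counter()
    for item in results:
        value = item.get(field_name)
        values = value if isinstance(value, list) else [value]
        counts.update(v for v in values if v is not None)
    return counts


def visible_facets(results, active_filters):
    """Content type and tags always; difficulty only for recipes."""
    facets = {
        "type": facet_counts(results, "type"),
        "tags": facet_counts(results, "tags"),
    }
    # Dependent facet: only shown once the content type facet is set
    # to the recipe content type.
    if active_filters.get("type") == "recipe":
        facets["difficulty"] = facet_counts(results, "difficulty")
    return facets


results = [
    {"type": "recipe", "tags": ["vegan", "soup"], "difficulty": "easy"},
    {"type": "recipe", "tags": ["soup"], "difficulty": "hard"},
    {"type": "article", "tags": ["vegan"]},
]
```

With no active filter, `visible_facets(results, {})` contains only the `type` and `tags` counts; the `difficulty` counts appear once `{"type": "recipe"}` is passed as the active filter.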
Going back to the site. Okay, "cooking vegetables" is a boring example, but if we now go to the unfiltered search with no keywords, we now see there's the content type facet, and there's the tags facet with checkboxes. The checkboxes aren't zoomed, so they are not as well visible as the rest. When we click "more", we get the whole list of tags. We might have wanted to order them alphabetically instead of by count, but yeah, whatever works. And if we click on "Recipe", then we just see the nine recipe results, and the difficulty facet now also appears. And yeah, that's a basic facet setup, and we have three minutes left, so no autocomplete, sorry. But yeah, are there any questions? Eight minutes? I thought 40 minutes session. Yeah? All right, autocomplete. Yeah, 45 minutes, okay, great. Yeah, I'm fine with improvising. So this is really much quicker than during practice, it's amazing every time. Okay, so, right, I should talk while doing this. First you enable the Search API Autocomplete module, this is clear. You go to the index's Autocomplete tab, also pretty obvious once you see the tab is there. Then you have to enable autocomplete for any search where you want it. We currently only have one search view, named "Search", so we enable it for that, and then we can edit it. We can display live results, so when the user starts typing, they get the results in real time while they type. Though that only really works well with partial matching, which we don't have set up at the moment, but once the user completes a word, they'll get the results right away. Then there are several suggester types. The rest of the four suggesters really only retrieve the suggestions from the Solr server. One uses spell checking, so if you mistype something, it tries to find the correct word. And the others, yeah, let's just use Solr terms for now. I think that should give us the best results in our case. And we just want to display three live
results and then up to three Solr terms. We can now also configure the suggester displays. We could just search in the title, for example, and we can choose which view mode should be used for displaying the results in this autocomplete pop-up. But if you don't have one configured for that, of course, this is what makes sense, because you don't want to have a whole teaser or anything in the autocomplete pop-up. And for Solr terms, let's also just choose the fields that should be used. If you properly set this up, then you might want to have an additional field that just contains terms for autocompletion, so the results will look nicer than with stemmed terms. But for illustration purposes it should be fine like this. We want this for all of the Views displays, we want it to complete right away, and yeah, let's display result count estimates too, so the user can already see approximately how many items will be returned by a certain search. So, going back to the site, when we now go to the empty keywords field, you see there's already this autocomplete ring. If you start typing, then we get suggested terms, and if we complete a word, then we already get live results. So these two are ones that would contain "pasta", these three contain "past" or some variation of that, et cetera. If we pick one of those, we go directly to the node in question, and if we pick one of the suggestions, then we go to a search filled with that suggestion. Okay, yeah, then we can set up the search index a bit more properly for the live results, which is also pretty simple with Solr if you know how to do that. First of all, we want to add a new field, which is again just the rendered HTML output with the search result view mode, basically a clone of the existing one. But this time we are going to use a different field type, namely one that is better suited for displaying live results, because it will already deliver results if you have
incomplete keywords. We should pick some label for that, so we don't confuse those two. Save the changes. We can then reindex. What we also want to do is go to the search view and make sure that the search view doesn't actually use this new field for its own search. Unless, of course, we want that, but let's say for now that we just want complete words in the normal search, on the normal search page, but want to be able to deliver live results. So we go to the fulltext search filter, and down here we have the option to restrict the searched fields. We just pick the two original ones and leave the autocomplete field out of it. Save that, and then of course we need to set up autocomplete to actually use this field. Might have done that first. Yeah, for the "Display live results" suggester, we just want to use this rendered item 1 field, and for the Solr terms, it will already use what the search is using, so it should not use this rendered item 1 field. If you want to make sure, you can also just check these two fields here. Save again, and hopefully now we get a better live results autocomplete. So let's try it out. Yeah, and now, when you type even just two letters, you already get three live results, with results starting with those letters. And now I think we are really done with the session. One thing I should, of course, also point out is that tomorrow we will be at the contribution sprint. We already have a table reserved for the whole week, where we try to make Search API a bit better. So if you have questions or suggestions, or want to help out, try patches, write patches, review, whatever, then please come by and help us get there. Or just come there in general and help other modules or core improve. And also please remember that you can rate this session and the whole DrupalCon, give feedback, and hopefully help us all improve. So yeah, thanks
everyone for attending, and have a nice day. Please meet me outside if you have any questions, I'm happy to answer them there.