Hello and welcome everyone, my name is Thomas Seidel and, as you are hopefully aware, I'm going to talk about the Search API in Drupal 8 today. This will be a live demo of the current state of the Search API in Drupal 8. Just a quick poll before I start: who has already used the Search API in Drupal 7? And who hasn't used it in Drupal 7? A lot of undecided people here, but good to know. So if you've used it in Drupal 7 already, a lot of this will seem familiar. There have only been slight changes user-facing-wise, but there's a lot of new stuff under the hood, and a lot more still in development.

But before I begin with the demo, a quick overview of the Search API in general. It was created in 2010 for Drupal 7, based on some of the suggestions floating around regarding core search for Drupal 8. There were a lot of ideas about being more flexible in the kind of data you can search, in the kind of backend you can use for searching, and so on. And I really tried to come up with a completely new implementation of search, not based on core search, that would incorporate as many of these features and functionalities as possible. So with the Search API it's possible to index different kinds of data, with different search engines, and present this to the user with different types of user interface.

The basic architecture: if most of you have used it in Drupal 7 already, this is familiar. There's the search index, which holds the generic information about what is being indexed: the type of content, the type of data really, the fields, et cetera. And then there's the search server, which incorporates all the backend-specific information, so whether it uses the database or Apache Solr or something else for indexing. It really takes care of the actual process of indexing and searching. And visually, it looks like this.
Every index has a single server, but there can be multiple indexes per server, and all the other modules then just use the indexes and don't generally care about the backend. So almost everything you do is backend-independent and can be reused no matter what backend you end up using. The backend can even differ, for example, between test and live servers, to make things easier.

So with this short introduction out of the way, let's start right away with the demo. Is this large enough for everyone to see? Okay, I guess so. We'll go, as usual, to the Extend page and there enable the Search API and Database Search modules. As you see, there's now also a Database Search Defaults module, implemented a few months ago, which allows you to easily get a full setup with index, server and a view available, without having to configure all of it yourself. Also important, while this is loading: when you're using the Search API, you should almost always disable the Search module from core, because it's not needed and will just consume unnecessary resources if it stays enabled.

The Search API configuration page looks like this, and here you can, as said, add servers and indexes. We first add a server, for our example just a database server. The description is optional and just displayed in the admin UI, and you can select the minimum word length to index, but really nothing else for the database backend, which is very simple. Then, once you have this server, you can add an index on it, and here is one of the larger changes in Drupal 8: you no longer have just the node index as previously, you can actually index any number of different item types in the same index.
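For those who prefer the command line, the module setup just described can also be scripted with Drush; this is a rough sketch (the Drush 8 command names and module machine names are my assumptions here, not something shown in the demo):

```shell
# Enable the Search API and its database backend
drush en -y search_api search_api_db

# Optionally, the pre-configured defaults module for a ready-made setup
drush en -y search_api_db_defaults

# Uninstall the core Search module, which is no longer needed
drush pm-uninstall -y search
```

The Extend page achieves exactly the same thing; Drush is just handier for deployment scripts.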
So you can index content, but then also index comments and taxonomy terms, and all of this will be found in a single search. But you can of course also create searches for just one or several of these. And, as in the latest versions of the Drupal 7 module, you can change the bundles that will be indexed for each of these data sources. So for nodes, you can index just articles or basic pages, and the same for taxonomy terms with the vocabularies. Then you just select the server you want to use, and then there are just advanced options. One of these, "Index items immediately", I'll shortly discuss, because it's a rather advanced option, but a very important one.

What this does is: if it's enabled, then as soon as a node or taxonomy term or comment gets edited or added, it will be indexed right away, so searches will return it right away. The advantage of this is of course that there is no stale data in your searches. Once something changes, the search will reflect this, which is especially important if you use the indexed data for any security checks. So if you have a search view and are only showing published content, for example, which is of course a very common thing to do, then if you don't have this option enabled and you unpublish a node, it will still show up in the search results, which can of course lead to information leaks. This is one large concern you should keep in mind if you plan to disable this option. Then of course there's user experience: if a user is allowed to add new content or comments, and they add a comment and then do a search, or maybe you use the search to build a simple content list, they will wonder why their content or comment isn't showing up if you disabled this option.

On the other hand, there are performance issues in some cases. Especially when you're using Apache Solr, it is much, much more performant to index large batches of nodes at one time, for example during cron runs.
If you index one item at a time, this can really be a drain on performance for larger sites; for smaller sites, it usually won't make any difference. And the indexing of course takes some time, so it might lead to longer page load times when the user edits or creates a node, because then, in the background, it will have to index before the next page can be loaded. So to sum this up: it is usually a good idea to enable this on small sites, especially if you're the only one editing content anyway. Just enable it, and unless you run into real performance problems with it, keep it enabled. For larger sites, you'll really have to think about what the effects will be on your site and make a decision based on that. Since this is just a small demo, we keep it enabled. We save and edit.

Now, the next step: we select the fields we want to index. As said, this is also very similar to Drupal 7, although we have changes planned for this, just not implemented at the moment; I'll come to this later. Here you have more or less the same interface as in Drupal 7, just with several data sources: one each for comments, content and taxonomy terms, with their respective fields. One improvement here is that you don't have to add the body text as a related field before indexing it. I think that's a UX plus, because we got a lot of complaints about that in Drupal 7, and internally it's actually much easier to do in Drupal 8. Likewise we can index the subject, since of course we want to find that too. The type I'm selecting here is actually pretty straightforward in most cases. You have integers, dates, and so on; the only real difference is full text versus string. String really means a single keyword or name, like the content type: there can only be a few different strings, and you want to index them as-is, as a single token.
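If you do disable immediate indexing, remaining items get indexed on cron, and the Search API also ships Drush commands for triggering indexing by hand. A sketch (the exact command name and arguments vary between Search API and Drush versions, so treat these as assumptions and check `drush help`):

```shell
# Index outstanding items on all enabled indexes
drush search-api-index

# Or limit indexing to one index; "my_index" is a placeholder machine name
drush search-api-index my_index
```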
Whereas full text will be split into individual words, which can then be found. So in full text fields you can really search for the individual words they contain; for string fields you can just filter: does the field have exactly this value? And the boost is simply a configuration for how important the field is. So with a boost of 8, the subject is eight times more important than the body field, and hits in the subject field will count more than hits in the body field.

Here you have the new related fields form. You can just enable all the related fields you want to have for this data source, and they will be added above. So, for example, if we want more information about the user, we can add it here. And then we have the user... okay, that's a bug, or not really a bug, it's just a complication in core that leads to this double step; we'll still try to eliminate that, of course. But then you can index the user's fields as well, if you, for example, want the user's roles indexed with the comment. Then the same for content, just indexing body and title for the moment, and again for the taxonomy terms.

Now we're saving this, and the last step, as in Drupal 7, is configuring the processors. This again works to a large extent as it did in Drupal 7. You have different processors here. Aggregated fields can be used to add additional fields to the index. Content access is a very important one I should also talk about shortly. What it does is automatically add content access checks for nodes and comments to the index, because in general it's not really possible to do entity access on a generic level in Drupal. So most of the time, when you create a search, it's your responsibility to make sure that only content the user can actually view is being displayed. But since nodes and comments are of course the main use case here, we implemented Drupal's own node access mechanism in the Search API.
So if your site is using that, you can just enable this processor and everything will be taken care of automatically. And as mentioned before, "Index items immediately" is important to enable here, because otherwise the node access checks will use stale data and you'll run into problems. There are other modules that don't use Drupal's core node access system; those will have to be handled separately. And for all other data sources, as far as they have some access control, you would have to take care of that yourself, too.

Back to the demo. The other filters are there as before. Node status is also an option: if you just never want to show unpublished nodes, you can use that, too. Rendered item is for when you want to index just everything that would be displayed on the node page, more or less the rendered node content as HTML. If you want that, you can enable this and search through that field, because it will of course contain everything that should be found. Just take care that no field labels or similar are included there, and also note that you cannot boost the title or other fields that way, so it's a trade-off. Stop words are also practical, to exclude common words from the index.

Something that's new in Drupal 8 is that the stages at which the processors operate are now explicitly shown, and you can rearrange the processor order separately for the different stages. "Preprocess index" runs before items are being indexed: you first want to add the aggregated fields, then the content access information, and then HTML filter, Tokenizer and Ignore case can also go up there. This default sorting will of course still be refined to reflect the most sensible defaults; it isn't working completely at the moment. "Preprocess query" determines, when you do a search query, at what point the different processors will preprocess that query.
The order there should be more or less the same as for indexing, as far as the processors are the same. Then you have the processor settings. With aggregated fields you can add new fields, and one practical use for this is to have a sortable title. We have the indexed title as a full text field, and in Drupal 8 you can actually sort on full text fields, but the title lives in three different fields: it's the subject for comments, the title for nodes and the name for taxonomy terms. So there wouldn't be a practical way to sort on it, because it's really three different fields. But with aggregated fields we can combine this information into a single Search API field, and that way it becomes much easier to sort on.

The other processors are more self-explanatory. The HTML filter should of course only work on the fields that actually contain HTML; Ignore case can work on all of them; Stop words, again, will only work on full text fields; and the Tokenizer settings are also okay as they are. So we just save this.

Then we can quickly take a look at the fields list, where the fields these processors add will now be displayed as well. There's now a system in the Search API to allow fields to be always enabled, force-enabled more or less, so that you cannot not index them. Because of course, if you have an aggregated field, you always want to index it, so you cannot even change that here. Similarly, the content access processor uses the status field, and this will also always be indexed as long as the content access processor is enabled. Now we just need to index the content with these settings, which will take a few moments, and then we're already ready to create a search with this server and index. The way we create the search is of course with Views.
With Views in core in Drupal 8, it's of course clear that Views will be the primary way to create searches, and it's the only way included in the Search API itself. So here we select the index we just created, and we of course want to create a page with it. "Rendered Search API item" lets you use a view mode of the nodes and comments to display the search results. So first we define the view modes that should be used for that: we of course want just a teaser for the nodes, the rest can stay the same. You might also want to create new view modes for the search results, to be able to customize them better.

We add a full text search filter to be able to do full text searches, and we of course also expose it, so visitors can actually use it. The settings are also pretty self-explanatory, I think. The one that gives a lot of people trouble, in Drupal 7 at least, is this one, which should in 99% of cases just remain at "Search keys"; so if you're unsure, just don't touch it. You can also restrict the fields that are being searched. The minimum keyword length should be set to the same value you configured for the database backend, so the two play well with each other, and the user will also see what's wrong if they don't enter words that are long enough. Also practical, of course: a results summary, so we'll see what's going on. And you might want to add a "no results" behavior that just says, sorry, no results could be found for this search. But that's normal Views tweaking, of course.

Now that it's saved, we can already view this page, and we see just a list of all the content we have, with comments. If we now enter one of the words that are contained, this is of course just Lorem Ipsum dummy content, then we get only results that contain it. There are a few results, as we see here, and items that have the keyword in the title, which has a higher boost, will come first, or generally be ranked higher.
And of course, if we then use more terms, the results will get fewer and fewer.

Okay, then I'll also showcase how to set up a search with the Solr module, which isn't as finished at the moment as the database backend. There are more things to flesh out yet, but it basically works. First, of course, you need to enable the Solr Search module, which provides the backend implementation for the Search API to use with Apache Solr. Then we would just create a new server here, but before we do that, we actually have to do two things. First, once we have the module installed, we need to install its dependency, the Solarium library, which we do with Composer: just go to the module directory and execute this command, and it will automatically install Solarium inside the Solr module for you, which the module requires to work. And the other thing... okay, this seems to have worked fine.

The other thing we have to do is actually start a Solr server on our machine, which is also pretty easy. I have already downloaded the Solr package from the website; 5.3.0 is, I think, the most recent version, or at least one of the most recent ones, and there are no known problems with even the latest versions, so you can really use any of them. Then we need a bit of configuration: we first make a drupal directory for our server, and a conf directory inside that drupal directory for the configuration files. These configuration files come packaged with the Search API Solr module, in its solr-conf directory, organized by Solr major version. So we just copy all of them in here. Then, oh, of course, we have to start the server, too, which with Solr 5 is now a single regular command: just bin/solr start, and it will automatically start the Solr server for you. So there's no need to invoke Java explicitly or install Tomcat on your server; this is now all pre-bundled in a single application.
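Put together, the steps just described look roughly like this; all paths, the Solr version and the solr-conf subdirectory name are assumptions based on the demo, so adapt them to your installation:

```shell
# Install the Solarium library the Solr module depends on
cd modules/search_api_solr
composer install

# Prepare a Solr core directory with the module's shipped config files
cd ~/solr-5.3.0
mkdir -p server/solr/drupal/conf
cp /path/to/search_api_solr/solr-conf/5.x/* server/solr/drupal/conf/

# Start Solr (bundled Jetty; no separate Tomcat or explicit Java invocation)
bin/solr start
```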
Now that this works, we just go to the core admin and create a new core, named drupal in our case; as you saw earlier, I used drupal for the directory name. So this Solr server is now running with our custom configuration, and we just need to tell the Search API about it. Again, just specify any name, then use the Solr backend in this case. Most of the settings will already be the right ones for a normal installation; the only thing we have to adapt here is to add our instance directory name to the path. The rest is advanced stuff which we don't have to care about, especially for a simple test installation.

One important thing to keep in mind here, though, is that when you install a Solr server this way, everyone can access it. So you have to use some way of restricting access to the Solr server; otherwise anyone can add or delete indexed items. Just go to the Search API Solr handbook, which explains several ways of implementing access checks. Anyway, we see the server was successfully saved and the Solr server could be reached, so we set it up correctly. No items have been indexed yet, and everything else seems to be fine, too. So now we just need to move the index to the server, and we can use the search as before. Oh no, we first have to index, of course. Now cross your fingers, please. ... This never happened, nothing to see here. Okay, but in general, once this has been fixed, it should just work as before with the database backend. Not a very good presentation, I admit, but believe me, it will work again, and it worked before.

Okay, but that's already it for the demonstration. Oh yeah, I forgot this slide, but I mentioned most of it already: instead of installing Solr yourself, you can of course also use a Solr host, you just have to take care that the right config files are used.
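The core creation and the access restriction can also be done from the command line; this is only a sketch under my own assumptions (the `bin/solr create` flags are from Solr 5, the iptables rule is just one crude option, and the handbook mentioned above describes better ones):

```shell
# Create a core named "drupal" from the module's config directory,
# as an alternative to the manual copy plus core admin UI
bin/solr create -c drupal -d /path/to/search_api_solr/solr-conf/5.x

# One simple way to restrict access: firewall Solr's port (8983 by
# default) so only localhost can reach it; adapt to your setup
iptables -A INPUT -p tcp --dport 8983 ! -s 127.0.0.1 -j DROP
```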
Security concerns I mentioned; in the drupal.org handbook, all these access checks are explained, how to set them up with Solr 3, 4 or 5, I guess.

Now, to quickly go over some of the planned future changes: I already shortly mentioned the simplified UI. This is something we really want to tackle once all the basic framework stuff has been finished, because the user interface of course caused a lot of problems, especially for newer users: most of it is unclear for new users, not very well explained, and maybe even in the wrong mindset for most users. So one thing we want to do is have a more Views-like UI for the fields tab, or at least we're planning to; that's the current idea. You would just add new fields the way you would in Views, and not have this huge table, which can really get out of hand for larger sites, as I'm sure has happened to some of you in Drupal 7.

We also plan to have a wizard for easy search creation, where, right after you install the module, you just click "okay, I want to set up a search", enter a few basic things, and the wizard takes care of all the rest, applying sensible defaults: saying, okay, if you're indexing nodes, you probably want these fields available, indexed this way, and a search with this and that. This is also just a plan or idea at the moment, but we definitely want to do something like that, because setup for new or inexperienced users is really a problem in Drupal 7, and maybe we'll also backport some of it. As you saw, the Search API Database Defaults module already does something like this: just by enabling the module, you get a basic setup. It's just not very flexible, because it's all pre-configured, so if you're using different fields on your nodes or something, it will all fall apart. So this is just the first step, but we definitely want to improve there.
Then there are really not many other large changes planned. Some of the smaller changes are for indexing performance, and especially initial setup performance. For huge sites with something like a million nodes, it sometimes already caused problems in Drupal 7 when creating the index, getting all these items tracked, and the problem will be even worse in Drupal 8: now all of the nodes can potentially exist in different languages, and you have to check for that, and you have different data sources, of course, so in addition to the million nodes there are maybe even more comments. That's something we'll still have to deal with.

One thing that would be a huge benefit, if we can implement it correctly: Drupal core has really been focused on caching lately, providing even authenticated users a very smooth experience through caching. If we can implement proper caching for the Search API, this would of course also be a huge benefit, at least for some sites, I guess, or for the most common pages. And one other thing is search operators. For example, there isn't a BETWEEN operator in the Search API in Drupal 7.
So you cannot say, I want this value to be between one and three, or maybe do something like a "begins with" search for strings. Some of these operators will be added and also made available in Views. And speaking of Views: that is, as you might have seen, far from finished at the moment. We have basic support for viewing the rendered items and having a full text filter, but really not much else. So once the basic framework is done, this will of course be the next step: enabling users to create views actually using all of this functionality, the old and the new.

Then there are of course facets. An effort was started at the Dev Days in Montpellier, in March I think, and so the Facet API module port is on the way. It's still a long way from being done, but there will surely be progress on that as well. The Autocomplete and Saved Searches modules for the Search API will also definitely be ported, but, at least if no one else gets to it, I plan to do this after the Search API itself is stable, because, A, that's the priority, and, B, otherwise these modules might still implement stuff on a shaky basis. So it's better to first get the Search API itself stable. The other modules, I expect, will also be ported; Attachments already has a port.

Just two modules that at least I don't plan to port: the Multi-Index Searches module, because with the new functionality of creating one index for different types of entities at once, this will be pretty useless, I hope, and it didn't work that well anyway, so it will almost certainly not be ported. And the Pages module: since Views is now in core, I guess using views will just be the primary way to create searches, and it won't be worth it to port the Pages module, which works around that. But if anyone wants to port it, that's of course also fine by me.

Okay, so this was my presentation. Thank you very much. Are there any questions?
Yes. Okay, the question was about the Apache Solr module. I didn't mention this because I thought it was already pretty well known: the Apache Solr module won't be ported to Drupal 8. It was already merged, more or less, with the Search API Solr module, and we are working to make sure that the new Search API Solr module for Drupal 8 will support all the use cases that the Apache Solr module supported in Drupal 7, so that everyone has a good upgrade path. That's been decided already. Other questions?

Yes. The question was why, on the processors tab, the lists for "Preprocess index", "Preprocess query" and "Postprocess query" are different. This is because processors work at different stages, and this form doesn't only allow you to configure the order, it also shows you which processor works at which stage. The Aggregated fields processor only adds a field, and it only does that at index time; there's nothing for it to do at search time, so it is only listed in the first of the three columns, "Preprocess index". Things like the Tokenizer, which splits the content into individual words, have to work at indexing time but also when doing a search, because you have to split the search keys the same way. On the other hand, when postprocessing the query, only a little has to be done: you can do highlighting, so the Highlight processor is active at this stage, and the Stop words processor just adds the stop words it filtered out to the results, so that they can be displayed. Other than that, most of the other processors don't have anything to do at this stage. So this is really a kind of introspection into what the processors are doing, something that was actually also requested for Drupal 7: just to make it clearer what the processors actually do. Other questions, yes?
You mean the config system in Drupal 8? So the question is whether the Search API plays nicely with the new config system in Drupal 8, and the answer is yes. Drupal 8 comes with a really great system for creating entities based on configuration, and the Search API uses this for both indexes and servers. So every index or server you create will actually be a single configuration object and can easily be exported and imported on a different site, or exported as part of the whole configuration. In Drupal 7 there were a lot of pains with using Features with the Search API, especially with the database backend, and now that this functionality is in Drupal core, we of course want to ensure that this doesn't happen again and that exporting and importing configuration, servers or indexes really works smoothly. Any other questions? Yes?

No, this is actually just a text field where you can enter them; here you just have the list of all the stop words. In Drupal 7 there was actually the alternative to either use a file or this text field, so maybe that will be added later. But basically the idea is to have this configuration in the index itself, also for better configuration support.

Yes, the question was whether there are plans to make things like stop words multilingual. That's a good question, yeah.
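Concretely, exporting those configuration objects goes through the normal Drupal 8 config workflow; a sketch using Drush 8 command names (the example file names below assume an index machine name of `default_index` and a server machine name of `my_solr_server`):

```shell
# Export the whole site configuration, including Search API entities
drush config-export

# The sync directory then contains one YAML file per server and index, e.g.
#   search_api.server.my_solr_server.yml
#   search_api.index.default_index.yml
# These can be committed to version control and imported on another site:
drush config-import
```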
Basically, we are trying to do a much better job at supporting multilingual sites in Drupal 8, also because, again, core has made great improvements in that area, and what core does, the Search API should of course support, too. So translated entities are actually now indexed separately, and these will already work very smoothly in most cases. Regarding language-specific configuration, we are unfortunately not that far: there are no multilingual stop words, and I don't even think this has been raised before. So if you want, please create an issue for that and we'll see. It's a really good idea, of course, because language-independent stop words are basically useless for multilingual sites, so we should definitely have that. We are also trying to bake better multilingual support into the Solr module itself, so this will also work better out of the box, or at least work smoothly with an additional module. Thomas?

Yes, may I answer this question as well? Oh, of course, yeah, hi. Yeah, I just want to inform you that we started porting the Apache Solr Multilingual module to the Search API and Drupal 8, and there's already a working prototype, and maybe that's the answer for multilingual stop words, handling all the languages and stuff like that. There's still a lot of work to do, but we already started with that.

Yeah, so there's an additional module for this. Excellent, yeah. So, for Solr, that's the module, but of course we should also support it in the Stop words processor if you're using the database backend, so the issue is still valid. But of course it's very good that this already exists. Yes?
So the question is about which processors Solr can basically replace, and the answer is almost all of them. If you're using Solr, you should definitely disable Ignore case, Stop words and the Tokenizer, because they will just mess things up and keep Solr from doing what it does, and it does it of course much better than we could; especially the Tokenizer can lead to really, really bad results. The HTML filter, on the other hand, is useful, because Solr at the moment cannot know whether a field contains HTML or not. I'm actually thinking about maybe having a different type for that, so "full text" and "full text HTML", so that the server is aware of it and can take care of it as well. But otherwise, just keep the HTML filter enabled, and of course things like Aggregated fields and Content access. If you enable the HTML filter, though, you should also remove all the tag boosts, because that is also something that won't work with Solr. Other than that, yeah, most of the processors really won't work well with Solr: really all of those that change what is being indexed, as opposed to those that add something new, like Rendered item or Aggregated fields.

Any more questions? Yes. The last question was about Elasticsearch in Drupal 8. I fear I'm not using Elasticsearch myself, so I'm not very familiar with how well all the modules are working, but one of the modules was already ported to Drupal 8 last year in the Google Summer of Code. I don't know what happened there since, whether it's currently working or to what extent, but I'm pretty sure that at least the Elasticsearch client module will be ported; I think that's the most active at the moment. But as said, I'm really not an expert, I'm not working with the people maintaining these modules, so I'm the wrong person to ask, really. Other questions? Okay then, thank you very much.