Okay, let's jump into the world of Hibernate Search. Take it away. So, hello everyone. I'm Yoann Rodiere. I work at Red Hat as part of the Hibernate team, and I work in particular on Hibernate Search. What I'm going to talk to you about today is not really search itself; it's more how to integrate search into an application. And more specifically, it's about integrating search into an application using Java 8 and above, using Hibernate ORM to store data in a relational database. It could be any framework: Spring Boot, Jakarta EE, whatever really. So it's really about adding search to a business application that you might find in a lot of shops today, an application that uses the ORM to create, update and retrieve data from the relational database. What we want to add is some way to query this data in a full-text way: not just scalar comparisons, but fuzzier and at the same time more precise, more textual search. So one way to do that, when your data is in your database, would be to use Elasticsearch right next to your database. But then you have a problem: you would need to store your data both in the database and in Elasticsearch. Every time you write to the database, you would need to write to Elasticsearch too. So ideally, you would like some kind of automatic synchronization, some tool that would allow you to not really care about synchronizing the two sources of truth, but have it done behind the scenes. The question then becomes how to trigger the synchronization and how to map the relational world to the document world. In the relational world, you may have data which is spread out over several tables, whereas in the document world, you want to have as much as possible in a single document, to minimize the need for joins and ideally to not do joins at all. The solution, then, the one I am proposing, is Hibernate Search, which is a library that integrates into Hibernate ORM. So it's two dependencies.
If you want to map ORM entities to Elasticsearch, you have to add a mapper, which turns Hibernate ORM entities into documents, and a backend, which will index those documents and allow you to query the Elasticsearch cluster. The configuration first. The configuration options, the properties, go wherever you would configure Hibernate ORM. It could be a persistence.xml, hibernate.properties, or if you're familiar with Spring, it could be in application.yml or application.properties. Wherever you push settings to Hibernate ORM, you can push settings to Hibernate Search. So in Hibernate Search, first you'll define a backend. You'll give it a name; here it's backend1. You say this backend will be the default backend for all your entities. Then you say that your backend type is Elasticsearch. We support Elasticsearch but also embedded Lucene mode, so you have to choose something. Then, of course, you have to tell us where to find your Elasticsearch cluster and, optionally, if you have authentication enabled, how to authenticate. Once this is done, you still have to tell Hibernate Search how to map the Java entity to an Elasticsearch document. Here on the left, you have the model for a Java entity, a JPA-annotated entity. On the right, you have the Elasticsearch mapping. In this state, there's nothing to link the two. In Hibernate Search, if you add this annotation, then automatically Hibernate Search will know that you map the Book entity to a book index on the Elasticsearch side. If you only add that, though, the document will be empty, because we don't know what to put in the document yet. So you add more annotations. Here we have a title field. We say that this title field is a full-text field, analyzed with a "clean text" analyzer. And automatically, Hibernate Search is able to translate that into an Elasticsearch mapping so that you have this.
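The backend configuration described earlier in this section might look roughly like the following properties fragment. This is a sketch: the exact property keys changed between the Hibernate Search 6 alphas and the final release, and the host and credentials are placeholder values.

```properties
# Declare a backend named "backend1" of type Elasticsearch
hibernate.search.backends.backend1.type = elasticsearch
# Where to find the Elasticsearch cluster
hibernate.search.backends.backend1.hosts = localhost:9200
# Optional, if authentication is enabled on the cluster
hibernate.search.backends.backend1.username = elastic
hibernate.search.backends.backend1.password = changeme
# Use backend1 as the default backend for all indexed entities
hibernate.search.default_backend = backend1
```

These properties go wherever the rest of the Hibernate ORM configuration lives, as explained above (persistence.xml, hibernate.properties, or Spring's application.properties).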
And of course, you add new fields as necessary, potentially on different properties, potentially on the same property if you need to do different things with the same data, like searching on the title but also sorting on the title. If you're sorting on the title, you don't want to use an analyzer that will break your title down into multiple words, because sorting on multiple tokens is really not something you want to do. So you create a different field for the same data, and Hibernate Search will just push the same data to both fields. I talked briefly about an analyzer. So the last part of the configuration would be to define these analyzers, which are, after all, the core of what you do when you do full-text search. You could do that directly in the Elasticsearch server, but since it's such a common need, there are APIs in Hibernate Search to do that too. You would implement an interface, you would create an analysis configurer, and you would reference it in your configuration. Here I'm using Spring, so I annotated it with @Component and I referenced the component name, but you could do that using reflection, without any framework, and just put the fully qualified class name. Now, this configurer is not very useful; it doesn't do anything. What you will do is just use a DSL to create the analyzer and tell us which tokenizer you want to use, which char filters, token filters, and so on. And we're all set. Once we've done this, Hibernate Search can work. So this code, which is usually used to persist entities with Hibernate ORM, can be adapted to also work with Elasticsearch, so that when you persist the entity, you will also index it. And to adapt it, you will do this: nothing, because it's all automatic. It's all behind the scenes. As soon as you make changes to an entity and you commit the transaction, Hibernate Search will, after the commit of the transaction, send everything to Elasticsearch. And of course, it also works for updates and deletes, not just for adds, like I'm showing here.
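An analysis configurer like the one described could be sketched as follows, using the interface names from the final Hibernate Search 6 release (the alpha API discussed in the talk may spell things slightly differently, and the "clean_text" analyzer name and filter choices are illustrative assumptions):

```java
// Requires the hibernate-search-backend-elasticsearch dependency.
// Registered either as a Spring @Component referenced by bean name,
// or by fully qualified class name in the configuration properties.
public class MyAnalysisConfigurer implements ElasticsearchAnalysisConfigurer {
    @Override
    public void configure(ElasticsearchAnalysisConfigurationContext context) {
        // Define the "clean_text" analyzer referenced from the mapping:
        // a standard tokenizer followed by a few token filters.
        context.analyzer("clean_text").custom()
                .tokenizer("standard")
                .tokenFilters("lowercase", "asciifolding");
    }
}
```

Hibernate Search translates this DSL into the analysis section of the Elasticsearch index settings, so the analyzer doesn't have to be declared on the server by hand.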
How it works is that when your application asks the ORM to perform entity changes, the ORM will send inserts and updates to the database, but it will also publish change events, which are captured by Hibernate Search. And then when the ORM commits the transaction, there's a commit event, which Hibernate Search will also capture. And when the commit event happens, Hibernate Search sends everything to Elasticsearch. There are a few features that make it a bit less naive, like automatic bulking of Elasticsearch requests. We don't want to send one request at a time for each document you index; we actually want to put it all in a single request, to minimize latency and to optimize the flow of information between your application and Elasticsearch. So that was nice: you don't have to do anything once it's configured. But the example I gave is really simple, too simple actually. In the real world, you don't want to map one entity to one document. You really want to map a tree of entities to a document. You want to denormalize your schema and to include, for example, if you have a book entity with a few chapters, when you search on the book, you would want to search also on the contents of the chapters. That's what's really interesting when you map your database data to an Elasticsearch model. So in order to do this, Hibernate Search also offers a feature. Here we have a Book entity, a Chapter entity, and here is the Elasticsearch schema for the book. You have a list of chapters here. You want to embed these chapters into your book document. You would just add an annotation for that, and Hibernate Search will be able to add a chapters object, a chapters list of objects, in your schema. Now, as before, the chapters are initially empty, because you didn't tell Hibernate Search what to index from your chapters. So you also need to add annotations on the fields of Chapter to tell us what you want to index, really, and how to index it.
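The book-and-chapters mapping just described could look like this with the annotations of the final 6.x API (a sketch only: the alpha shown in the talk may differ, and the entity shapes, field names and "clean_text" analyzer are illustrative assumptions):

```java
@Entity
@Indexed // map the Book entity to a book index
public class Book {
    @Id
    private Long id;

    // Full-text field, analyzed with the analyzer from the configurer
    @FullTextField(analyzer = "clean_text")
    private String title;

    // Embed the chapters' indexed fields into the book document
    @IndexedEmbedded
    @OneToMany(mappedBy = "book")
    private List<Chapter> chapters;
}

@Entity
public class Chapter {
    @Id
    private Long id;

    @FullTextField(analyzer = "clean_text")
    private String title;

    @FullTextField(analyzer = "clean_text")
    private String text;

    @ManyToOne
    private Book book;
}
```

Without the `@FullTextField` annotations on Chapter, the embedded `chapters` objects would stay empty, as the talk notes.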
Once you've done that, then everything is indexed as part of the book. And there are several options; actually, you can do more complicated things. Imagine your chapters are not only referenced from your book, but also from another table-of-contents entity. In the table-of-contents index you need the page count, but not in your book. You will be able to tell us: when I embed the chapters in the book, I only want the title and the text, not the page count; I don't care about that. There are several options. You can also use nested storage, to not just put the data as-is in the document, but really ask Elasticsearch to store the chapters as nested documents, so that the structure is preserved and you keep the information about which title goes with which text, which can be useful when you combine queries. Say you do a query on both the title and the content of a chapter, for example: I want a book with a chapter that has John in the title and Smith in the contents. You need to preserve the structure to be able to do that kind of query. Otherwise, you might get a book which has John in the title of chapter one and Smith in the content of chapter two. That's not exactly what you want. There are many options here, and you can really customize how you map your book to a document. Once again, re-indexing is automatic. If you have a piece of code that changes a chapter where the book is never involved, Hibernate Search will know that when a chapter changes, you need to re-index the corresponding book, and it will do that. You don't need to tell us to do that; we know it. That was nice. We are able to index everything in Elasticsearch, but you index in order to search: at some point you need search APIs, and there's also that in Hibernate Search. You can, of course, just use the native Elasticsearch APIs. You can craft your JSON queries, send them to Elasticsearch, receive JSON responses, parse that, and do whatever you want in your application.
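Going back to the embedding options described above, the field filtering and nested storage might be sketched like this in the final 6.x API (hedged: `pageCount` is a hypothetical property, and older 6.x alphas used a different attribute name for the nested option):

```java
// Embed only title and text (not pageCount), and store chapters as
// nested documents so Elasticsearch preserves which title belongs
// to which text when combining predicates on both.
@IndexedEmbedded(
        includePaths = { "title", "text" },
        structure = ObjectStructure.NESTED
)
@OneToMany(mappedBy = "book")
private List<Chapter> chapters;
```

With the default flattened structure, the "John in the title, Smith in the contents" query from the talk could match across two different chapters; nested storage prevents that.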
But it's also nice to have a Java type-safe API, so that's what we offer here. Taking the user input and the entity manager, you can extract the entry point to the Hibernate Search API: you'll create a full-text entity manager. From there, you can create a query. You will say: I want to search on the Book class, and I want to create a search query. Then you'll say: I want the results as entities, and that's where it gets interesting. You'll give a predicate, or multiple predicates, saying you want the title of the book to match the user input, and you build your query. And then, when you execute your query, you will receive books which are managed entities, entities that are bound to the database. But you made the search in Elasticsearch. So you exploited Elasticsearch for its search capabilities, but as to the data, retrieving the data, it all comes from the database. And as such, you can benefit from some features of Hibernate ORM, like lazy loading. In your book, maybe not everything was loaded right away, but if at some point later in your code you access a getter of your book, the data can be loaded lazily as needed. And you could also use the book to make some changes and persist it, but that's not usually what you need here. So how it works is that whenever Hibernate Search sends the query to Elasticsearch, it retrieves the hits, it retrieves the IDs of each document, and then it will ask the ORM to retrieve the managed entities corresponding to these hits and just return them to the user. Now, you may not want that; you may just want the data from Elasticsearch, and that's fine too. You can do what is called projections, which is basically retrieving the data from Elasticsearch instead of the database. And there are lots of other features; you can do sorts.
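The talk predates the final 6.0 API, so the exact calls on the slides aren't reproduced here; in the released API, the flow described above looks roughly like this (a sketch, assuming a JPA `EntityManager` and a `userInput` string, with a running Elasticsearch cluster behind it):

```java
// Obtain the Hibernate Search entry point from the JPA entity manager
SearchSession searchSession = Search.session(entityManager);

// Search on Book, matching the user input against the title field.
// The hits come back as managed Hibernate ORM entities: the search
// runs in Elasticsearch, but the entity data is loaded from the
// database by ID, so lazy loading still works afterwards.
List<Book> hits = searchSession.search(Book.class)
        .where(f -> f.match().field("title").matching(userInput))
        .fetchHits(20);
```

The earlier alphas exposed the same idea through a "full-text entity manager" wrapper, as the speaker describes.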
Each time, you need to tell us in your mapping what you want to do with each field, and then you'll be able, when you perform a query, to create a projection. Using the DSL, you describe your projection: I want to project on the title field, and I expect it as a string. You use predicates, and you can define a sort too, on the category field, and then by score when the category is the same for two documents. You build your query and you retrieve the results as strings, since you said you wanted the title as a string. There are many other features. You have, of course, all sorts of predicates. You have Boolean junctions that allow you to combine predicates. You can do spatial predicates. You can do more complex projections, where you project on multiple fields and combine them in a single bean that you will retrieve from your query. For example, if you want both the title and the score of the document, you can do that: you put them in a pair and you just retrieve that from your query. So there are a few features like that which are made to make your life simple. Do we have... yeah, we do have a bit of time. So, a few details about the Elasticsearch integration. The schema I mentioned: Hibernate Search can push it automatically to Elasticsearch for you. There are several strategies to manage what is called the index lifecycle. You could do nothing: you tell Hibernate Search, I want to manage the schema myself, I know what I'm doing, I'm putting lots of particular settings in the schema and I don't want Hibernate Search to mess with that; that's possible. You can tell Hibernate Search to create the schema and, if it already exists, just do nothing. You can tell Hibernate Search to expect it to exist and to validate that it matches the configuration you have in your application. You can tell it to update the schema automatically, which is a bit more dangerous, because updates can fail.
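A projection-plus-sort query like the one described might be sketched as follows in the final 6.x DSL (field names are assumptions; a field sorted on, like `category` here, would need to be declared sortable in the mapping, typically as a separate non-analyzed field):

```java
// Project on the title field as a String instead of loading entities,
// sorting by category first, then by relevance score for documents
// in the same category.
List<String> titles = searchSession.search(Book.class)
        .select(f -> f.field("title", String.class))
        .where(f -> f.match().field("title").matching(userInput))
        .sort(f -> f.field("category").then().score())
        .fetchHits(20);
```

As the talk explains, Hibernate Search converts the projected values back from their JSON representation to the declared Java type, so a date field would come back as a `LocalDate`, not a string.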
Of course, if your mapping changes in a way that means data needs to be re-indexed, this kind of strategy will fail, but still, it's useful in development environments. The last ones are mainly useful for tests: you can tell Hibernate Search to just drop the current index, create a new one, and drop it again on shutdown. Of course, I talked about on-the-fly indexing whenever you persist your entities, but you might have an application with data already in the database, and you want to re-index all of it so that everything is available in Elasticsearch. You can do that with what is called the mass indexer, where you just tell us which entities to re-index and tell it to do its work. Obviously, it's a heavy process. It will take some time, so you will need to tune it if you want it to perform well. You have several options to tune that process, which will spawn several threads, load the data from the database and push it to Elasticsearch. And if you want to give it a try: Hibernate Search 6 is still in alpha, so there are some serious limitations still. You don't have all field types, there are some missing features, and the APIs are still unstable. So we're getting closer, but you can try it. Reports and contributions are welcome. There's a demo at this address. And if you want to use Hibernate Search in production, you should rather have a look at Search 5 for now, which has different APIs and is focused on embedded Lucene mode. It also has experimental support for Elasticsearch, but since its APIs are optimized for Lucene, it's not the best fit. That's why we're doing Hibernate Search 6. And that's all. Thank you. [Audience] Because Elasticsearch is not really a fit for doing a lot of updating, you usually would drop the index and re-index it; how would you keep your search alive in the meantime? So that's a problem we don't tackle yet. The question was how to deal with an update-intensive application, where usually you would drop the index and re-index everything.
Hot updates are not something we support right now, but we know we have to address it. We're thinking about solutions, but in order to do that, we first need to implement some sort of synchronization between our nodes, our Hibernate Search nodes. Currently we don't synchronize between the nodes, and we need to add some sort of communication. We plan to do that, but it will probably be for 6.1 or something like that. So the question is: is the projection an Elasticsearch document or a Java object? It's a Java object. It's type-safe. Actually, Hibernate Search does all the work to convert the data. For example, if you have a LocalDate field, it is sent to Elasticsearch as JSON, it's also received as JSON, and it will be a string in the document; we will convert it back to a LocalDate and give that to you. [Audience question, partly inaudible, about going from a retrieved document back to the entity.] Yes, so the question is: if the application retrieves the Elasticsearch documents, is Hibernate Search able to use these documents later to retrieve the entities? There's enough information inside the document to retrieve the entity. We actually have a specific projection for that, which is called the reference projection. You can retrieve an object that represents your entity; it's basically just the class and the ID, and then you can use the Hibernate ORM multi-load operation to retrieve the entities. I don't have much time, but I'm sure you can ask in person afterwards. Thanks again. Thank you everyone.