So my talk is about how we at Gojek built a query semantics engine for GoFood to help our users find food faster.

Let me start with a little bit about myself. I have a background in physics, after which I did a master's in data science, and I've been working as a data scientist for about three years now. I also recently wrote a book called Applied Supervised Learning with Python, which was published by Packt in April this year. And I'm currently working as a data scientist at Gojek.

Just a show of hands, how many of you have heard of Gojek? Okay, that's quite a lot of people. For those of you who haven't, Gojek is Indonesia's first and fastest growing unicorn. It has grown 1100% in just the last three years, and now, with over 20 services within the app and products across five countries, we are what we call a super app. One of the services we have is called GoFood, which is essentially the food delivery platform within the app. It has over 400,000 restaurants and 16 million dishes on the platform itself. So that's the kind of scale that we operate at in Indonesia and other countries.

Let me start with the focus areas of the talk. I'll be talking about how we took advantage of word embeddings to make our search engine more intelligent, about some of the data challenges that we faced, including how to find the right data and how we built our training and test datasets, and about how we chose the right metrics for evaluating our models.

The agenda I have for today: first, I'll introduce why we need a query semantics engine and how it adds value, what the existing workflow was, and what we proposed. Then I'll take a little aside into how we represent semantics using embeddings, and then go into how we built the query semantics engine, that is, what the components are that we built using those embeddings. And finally, I'll show you some of the results and talk about the impact that we had.

So let's start with why we need a query semantics engine. The way that we look at search, we break it down into two parts. The first one is retrieving the documents. This depends on what the search query is, and on analyzing the search query to figure out what is relevant to it. The second is ranking the documents that have been retrieved, and this includes personalization and relevance-based post-processing.

To give a little bit of context, let's look at what we had before we implemented a query semantics engine compared to after. Take the search query "ayam"; ayam means chicken in Indonesian. The reason we have queries like this is that our largest user base is in Indonesia, so most of our data is in the local language. For the search query "ayam", we see a list of restaurants, and every single one of those restaurants includes the search query in its name. But this may not be the most useful behavior, because when you're searching for something like chicken, you're probably looking for a dish and not a restaurant name, right? So what we wanted was, besides restaurants whose names contain "ayam", for the results to also contain translations, say restaurants whose names contain the word chicken, along with restaurants that have been tagged with a category like "chicken and duck", or restaurants that include dishes called chicken or dishes called ayam.
So this was the problem that we defined for ourselves. We were using only Elasticsearch's native retrieval functions, which included exact matches and partial matches, along with somewhat fuzzy matches to account for misspellings. What we wanted in addition was an ability to understand the semantics of the query, to derive the meaning of what the user was trying to tell us they wanted. This is why we wanted to build a query semantics engine. If we are able to build a system that understands what the user wants, that means we'll be able to give them what they want with minimal effort on their part. Minimal effort on the user's part means they end up more satisfied, so that leads to higher user satisfaction, and overall it means the user has a better experience.

Now, user experience is somewhat qualitative; it's not something we can define directly. So in order to define how we wanted the user's experience to improve, we had to figure out the total addressable market, which depends on the potential for impact that we have on our user experience. We look at things like, for example, when you search for something very specific and maybe that restaurant or that dish is not available in your area. Say you want chicken biryani and chicken biryani is not available, but mutton biryani is. If you search for chicken biryani, you may not get any search results. So in order to surface something that is relevant, but maybe not an exact match, we wanted to be able to understand what the user wanted. That is one aspect of it. Another aspect is the case where we only look for exact matches: the user tries searching for something, they're not able to find it, they try modifying the query slightly, they're still not able to find it. At the end of doing multiple searches, they get frustrated and they leave the platform; they abandon the search session. We wanted to reduce that.

So in order to measure this, we had to define certain metrics for ourselves. Let's look at what they were. In order to quantitatively measure improvement, we had to define what relevance was for us. The most obvious way to figure out whether something is relevant or not is to use human judgment, because when you're looking at a list of results, it's difficult for an algorithm to say these results were relevant; the only person who knows what they were actually looking for is the user themselves. Since human judgment was not something we had, we had to use clicks and bookings. If you're looking at a list of results and something catches your eye, you click on it; that means it's slightly relevant. If you end up ordering from that restaurant, that means it's definitely relevant. So these are the signals we considered as relevance.

For our metrics, the first one I talked about is session abandonment: how many search sessions led to a booking versus how many search sessions the user dropped out of afterwards. The second one is null search rate, which is the proportion of searches that return a very small number of results, which basically means the user experience was not as good. These are the two ways that we quantified the qualitative side of the user experience.
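To make those definitions concrete, here is a minimal sketch of how the two rates could be computed from a search-session log. The column names and the result-count threshold are illustrative assumptions, not our production schema.

```python
import pandas as pd

# Hypothetical search-session log; the columns and the
# "fewer than 3 results counts as a null search" threshold
# are illustrative, not the production definitions.
sessions = pd.DataFrame({
    "session_id":   [1, 2, 3, 4, 5],
    "result_count": [25, 0, 2, 40, 13],
    "booked":       [True, False, False, True, False],
})

# Session abandonment rate: sessions that did not end in a booking.
abandonment_rate = 1 - sessions["booked"].mean()

# Null search rate: sessions that surfaced (almost) no results.
null_search_rate = (sessions["result_count"] < 3).mean()

print(f"abandonment rate: {abandonment_rate:.0%}")  # 60%
print(f"null search rate: {null_search_rate:.0%}")  # 40%
```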
Our main business metrics were search-to-click conversions, that is, did a search result in at least one click, and search-to-booking conversions, our main quality metric: how many of our searches resulted in a booking?

So now that we have an idea of why we wanted to build a query semantics engine and how we can measure it, let's look at what the existing workflow was. Essentially, a user would type in a query, we'd receive the query as part of a service, we'd send that query directly, as is, to Elasticsearch, and it would return all restaurants which contain the search query in their name. What we wanted was a step in the middle to process the query and extract meaning out of it. That query-understanding bit in the middle is what we called the query semantics engine.

So what exactly is inside this query semantics engine? First, when the original text query comes in, we have a spell correction mechanism as a pre-processing step. After that, we send the spell-corrected query to an intent classifier, which determines whether the search query is a dish, a cuisine, or a restaurant. Then we send it to a query expansion engine, which gives us expansion terms, which can be synonyms or translations of the user's search query, and we keep a record of those. And from the intent classes we got, along with the expansion terms, we generate an Elasticsearch query, which we then send to Elasticsearch.

Let's take an example. (Can I take questions at the end?) Say the original search query was "coffee", but misspelled. The spell corrector corrects that word to "coffee". When we send it to the intent classifier, the intent classifier says it was identified as a dish with 67% probability, a restaurant with 23% probability, and a cuisine with 10% probability. We record these numbers, and then we find the expansion terms. Words that are similar to coffee would be, say, cappuccino or espresso, because if you want to order coffee from somewhere, you're probably open to ordering one of these different types of coffee, right? Using these intent probabilities as well as the expansion terms, along with their level of closeness to the original query, we build the Elasticsearch query using the DSL and send that query to Elasticsearch.

Before we move on to how each of these components is built, let's talk a little bit about how we represent semantics. Why do we need a numerical representation of text? As Katherine mentioned in her talk earlier, computers don't understand text; they can only take numbers as input. So essentially what we need is a sequence of numbers that can represent each word we have in a corpus. It can be any sequence of numbers. One way of doing this is one-hot encoding, and here is an example. Say you have two sentences in your entire corpus, "have a good day" and "have a great day", which means there are five unique words: have, a, good, day, and great. If you one-hot encode them, you create a vector the size of the vocabulary and put a one at the position that represents each word.
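As a quick illustration of that example, here is a minimal sketch in Python (assuming numpy) that one-hot encodes the five-word vocabulary. The dense vectors at the end are made-up values, foreshadowing the comparison that follows.

```python
import numpy as np

# Five-word vocabulary from the two example sentences.
vocab = ["have", "a", "good", "day", "great"]
one_hot = {w: np.eye(len(vocab))[i] for i, w in enumerate(vocab)}

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| |b|)
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# One-hot vectors carry no meaning: "good" vs "great" is just as
# dissimilar as "great" vs "day".
print(cosine_similarity(one_hot["good"], one_hot["great"]))  # 0.0
print(cosine_similarity(one_hot["great"], one_hot["day"]))   # 0.0

# Hypothetical dense embeddings that do encode similarity.
dense = {"good":  np.array([0.8, 0.1, 0.0, 0.1]),
         "great": np.array([0.9, 0.1, 0.1, 0.0])}
print(cosine_similarity(dense["good"], dense["great"]))      # ~0.99
```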
So here in the one-hot encoding, we can see that "good" and "great" are two orthogonal vectors, which means the degree of similarity that "good" has with "great" is the same as what "great" has with "day". That is not right, because one-hot encoding doesn't capture any kind of meaning. What we want is a way to make sure that words with similar meanings occupy close spatial positions in the vector space. So if you look at the same example again, instead of a five-dimensional vector, we've made a four-dimensional vector, and "good" and "great" would have values like 0.8 and 0.9 in the same position instead of ones at different positions. This indicates that there is a high degree of similarity between these two words, and that they are relatively orthogonal to the other three words in the corpus.

Now, what is closeness? How do we measure closeness? We measure it using a metric called cosine similarity. If we have two vectors A and B, the cosine similarity is a measure of the angle between them: cosine similarity = (A · B) / (|A| |B|) = cos θ. It takes a value between -1 and 1; the closer the value is to 1, the more similar the two vectors are, and hence the more similar the words represented by those two vectors.

So how do we encode semantics into vectors? One way of doing this is by using word embeddings, and one of the methods used to generate word embeddings is word2vec. Word2vec is essentially a shallow neural network. I won't go into the details of it, but there are two different ways of training a word2vec model: one is continuous bag of words (CBOW), and the other is skip-gram. What both of them do is look at a word and at its context, that is, the other words that exist around it in a sentence, and try to learn co-occurrence relationships. So if two words, say "good" and "great", both occur in contexts like "have a ... day", that is how the model determines that these two words are similar.

One advantage of word embeddings is that you can form word analogies with them, and the words roughly follow the laws of vector algebra. So if we take the vector for "fried chicken", subtract "chicken" and add "duck" to it, we get the vector for "fried duck". This is one of the qualities of trained word embeddings that we wanted to take advantage of.

So now, let's come back to how we built the components. Here you can see the workflow that we had: first spell correction, then the intent classifier, then query expansion, and then the Elasticsearch query generation.

First, let's start with our pre-processing step, which was spell correction. We wanted to build a frequency-based spell corrector, consisting of unigrams and bigrams, which would also take into account how probable each word was to occur in the corpus itself.
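The Q&A later confirms this is a modified version of Peter Norvig's algorithm; here is a minimal sketch of the frequency-based idea, with illustrative counts, only single-edit candidates, and the bigram handling omitted. The counts mirror the example that follows.

```python
from collections import Counter

# Illustrative unigram counts mined from the corpus (restaurant and
# dish names plus queries); the numbers here are made up.
WORD_COUNTS = Counter({"crispy": 10_000, "krispy": 1_000, "chicken": 50_000})

LETTERS = "abcdefghijklmnopqrstuvwxyz"

def edits1(word):
    """All strings within one edit (delete, replace, insert) of word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes  = [l + r[1:] for l, r in splits if r]
    replaces = [l + c + r[1:] for l, r in splits if r for c in LETTERS]
    inserts  = [l + c + r for l, r in splits for c in LETTERS]
    return set(deletes + replaces + inserts)

def correct(word):
    # Consider the word itself plus everything one edit away, and pick
    # the candidate with the highest corpus frequency. This is what
    # corrects "krispy" to "crispy" even though "krispy" is in the corpus.
    candidates = {word} | edits1(word)
    return max(candidates, key=lambda w: WORD_COUNTS.get(w, 0))

print(correct("krispy"))  # crispy
```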
Let me explain this using an example. Say the user types in "krispy" with a K, and "krispy" exists in the corpus, but it has a count of 1,000; say it appears in a restaurant name like Krispy Kreme. Now, there is another word, "crispy" with a C, which occurs 10,000 times in the corpus, for example in "crispy chicken". What we wanted was that even though "krispy" exists in the corpus, because "crispy" has a higher count, we correct "krispy" to it. So if the input word to the spell corrector was "krispy", the output should be "crispy".

Now, for this to work, the user actually should have meant that. What if the person actually meant "krispy" with a K? The way we account for this is that in our Elasticsearch query we still include a fuzzy match clause, which makes sure that results within a certain edit distance of the raw query still surface, even if the correction was not what the user meant. That fuzzy match clause is given a slightly lower boost, so the corrected query gives the more prominent results, but results for the original query still show up.

One of the alternatives that we chose not to use was a library called SymSpell. It is extremely fast, significantly faster than what we have, but it does not account for errors in the corpus. A lot of our restaurant names and dish names are misspelled, which means that even if a person typed a misspelled query and we wanted it spell-corrected, since the misspelling may already exist as a dish name somewhere, it would not get corrected. That is not what we wanted, which is why we wanted something that also took the frequency of the terms into account, and that is why we chose not to use SymSpell.

Next, we have the intent classifier. The three labels we use for intent are dish, restaurant, and cuisine. We know exactly what the list of dishes in our corpus is, we know the list of restaurants, and we know the different cuisines that we have. So we essentially built a supervised learning dataset with the names of these three as the X, and as the Y the corresponding label, which we know because it exists in our content. We used a library called fastText, which is a supervised learning library from Facebook. Essentially, it internally builds embeddings, but character-level embeddings rather than word-level embeddings, so it also handles misspelled words and words that do not exist in the corpus. This is how our prediction worked: if you have a search query, say "kopi" (coffee), it will return each of the labels with a certain probability, and the probability for each label is what gets plugged into the Elasticsearch query when we finally generate it.

As for the challenges we faced: there are a lot of ambiguous words. For example, "burger". If you search for burger, how many of you would actually want the dish, a burger? And how many of you would want the restaurant Burger King? It's quite divided; there are almost equal numbers of people searching for both, and the two exist in our corpus fairly equally, so the algorithm would not give us a very clear output as to what the intent is. In cases like this, what we do is try to enrich our intent by asking the user which one they were looking for, so that we can collect data and feed it back into the algorithm, as well as provide the user with results more relevant to what they're looking for.
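Here is a minimal sketch of what that setup could look like with the fasttext Python package. The training-file format is fastText's standard supervised format, while the file path, hyperparameters, and example rows are illustrative assumptions.

```python
import fasttext

# train.txt is assumed to hold one example per line in fastText's
# supervised format, e.g.:
#   __label__dish ayam goreng
#   __label__restaurant burger king
#   __label__cuisine chicken and duck
model = fasttext.train_supervised(
    input="train.txt",
    minn=2, maxn=5,  # character n-grams give robustness to misspellings
    epoch=25,
)

# Ask for all three labels with their probabilities; these become
# the boosts in the generated Elasticsearch query.
labels, probs = model.predict("kopi", k=3)
print(list(zip(labels, probs)))
# e.g. [('__label__dish', 0.67), ('__label__restaurant', 0.23), ...]
```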
So what were the results of our algorithm? We had about 88% true positives and 84% mean accuracy across our labels.

The next step is query expansion. What is the purpose of query expansion? It tries to figure out what the user wanted to say rather than what the user actually said. It broadens the query by adding additional tokens in order to include more relevant results and increase recall, where recall is defined as the number of relevant results that were retrieved divided by the total number of relevant results. The kinds of relationships we wanted as expansion terms were synonyms, for example coffee and cappuccino could be considered synonyms, and translations, so "ayam", the Indonesian word for chicken, and "chicken" would be a translation pair. This is what we wanted to capture using semantics.

Unlike the intent classifier, where we used fastText, here we decided to build our own word embeddings. The main challenge we had was creating the data, because algorithms like word2vec expect grammatically correct sentences as input, but what we had were phrases: search queries, restaurant names, dish names, and the names of the dishes that the person ordered. How do we create a sentence out of all of this so that word2vec can learn the kinds of relationships we wanted it to learn?

We iterated over several ways of formatting the data. In the first one, for a single booking that a user made (the user searched something, clicked on a restaurant, and ordered dishes from it), we put the query words, the brand name of the restaurant they ordered from, and each of the cart items together in one row, as a single sentence. This didn't work that well for us. We tried a couple of other things, where we included each query-item pair as a separate sentence, on a separate row, for a single booking, and in the last iteration we also included the dish descriptions of each of the cart items. What we saw perform best was the third format, which did not include the brand name at all, just the query words along with the cart items. This is because the majority of people actually search for dishes, not restaurants. For example, there's a dish called ayam goreng, which means fried chicken in Indonesian and is very, very popular; ayam goreng is one of our most frequent queries on a daily basis, and it's a dish, not a restaurant. So that is why we speculate that this format performed best, producing embeddings that were the most relevant to each other.
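Here is a minimal sketch of training such embeddings with gensim's Word2Vec on the query-plus-cart-items format; the example sentences and hyperparameters are made up for illustration, not the production pipeline.

```python
from gensim.models import Word2Vec

# Each "sentence" is one booking: the search query's tokens followed
# by the tokens of every cart item (no brand name), per the format
# that worked best. These rows are invented examples.
sentences = [
    ["ayam", "goreng", "ayam", "goreng", "pedas"],  # query + cart item
    ["kopi", "es", "kopi", "susu"],                 # query + cart item
    ["burger", "zinger", "burger"],
]

model = Word2Vec(
    sentences,
    vector_size=100,  # embedding dimension (illustrative)
    window=5,         # context window over the query/cart-item tokens
    min_count=1,
    sg=1,             # skip-gram
)

# Expansion terms are then the nearest neighbours by cosine similarity.
print(model.wv.most_similar("ayam", topn=5))
```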
From the embeddings we trained, we plotted a t-SNE plot. t-SNE is essentially an algorithm that reduces the embeddings down to two dimensions so that you can visualize them easily. This plot shows some of our more popular queries. We can see that this cluster over here is different types of noodles that are popular in Indonesia, which means the embeddings we trained gave different types of noodles word vectors that were spatially close to each other. Another example of a cluster we had was this one, which is different types of fish that are very popular in Indonesia.

So how did we perform the prediction using these embeddings for query expansion? We took the search query, found the embedding for that word, used cosine similarity to find the embeddings closest to it, and looked up the corresponding words.

Let's look at some of the examples we observed. Bihun and kwetiau are two different types of very popular noodles, and these were determined by the algorithm to be expansion terms of each other. Then we had chicken and duck, which are used interchangeably in a lot of Indonesian dishes, so we were quite pleased with this example as well. Another one was "jus" and "juice", an example of a translation expansion pair. And another thing we were quite amused by, but very pleased with, was the association between similar brands: McDonald's was the word vector closest to the vector for Burger King.

Now, how do we generate the Elasticsearch query? I do have some examples of code at the end; if anyone has questions, we can look at them, and the slides will be up. Here I'll just go over it qualitatively. For the intent classes, we saw that there was a probability associated with each class, and we used that probability as the boost for the different intent clauses. For example, if you searched for KFC, it would be classified as, say, 90% probability restaurant, 5% dish, and 5% cuisine. So the boost on the restaurant-name match would be the highest, compared to the boosts for the match terms on the cuisine and the dish names.

For query expansion, we added additional match clauses inside a dis_max query. This makes sure that if a restaurant name or a dish name matches both the original word and an expansion term, the scores don't simply add up and double: only the best-matching clause counts towards the final score. For example, for a dish called "coffee cappuccino", if the match score with "coffee" was higher, only that score would be included in the final score, rather than increasing the score based on matches with both "coffee" and "cappuccino". The match terms for the query expansion were added for both the restaurant names and the dish names. As for the boosts, we obviously had the intent boost, but for the expansion terms the boost was also a function of the cosine similarity between the expansion term and the original term.
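The actual generated query isn't shown in the talk, but here is a hedged sketch of what one could look like, built as a Python dict in the Elasticsearch DSL. The field names, boost values, and weighting scheme are illustrative assumptions, not the production query.

```python
# Illustrative intent probabilities and expansion terms for "kopi".
intents = {"dish": 0.67, "restaurant": 0.23, "cuisine": 0.10}
expansions = [("cappuccino", 0.81), ("espresso", 0.78)]  # (term, cosine sim)

def build_query(term):
    # One dis_max clause per field: the original term and its expansions
    # compete, and only the best match is scored, so a name containing
    # both "kopi" and "cappuccino" isn't double-counted.
    def field_clause(field, intent_boost):
        queries = [{"match": {field: {"query": term, "boost": intent_boost}}}]
        for exp, sim in expansions:
            # Expansion boost scales with cosine similarity to the original.
            queries.append({"match": {field: {"query": exp,
                                              "boost": intent_boost * sim}}})
        return {"dis_max": {"queries": queries}}

    return {"query": {"bool": {"should": [
        field_clause("dish_names", intents["dish"]),
        field_clause("restaurant_name", intents["restaurant"]),
        field_clause("cuisine", intents["cuisine"]),
        # Low-boost fuzzy clause keeps results for the raw user input.
        {"match": {"restaurant_name": {"query": term,
                                       "fuzziness": "AUTO", "boost": 0.1}}},
    ]}}}

print(build_query("kopi"))
```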
So let's look at some of the results and how they manifest in the app. This is again the before and after that we had for the same term, ayam. Over here, we can see that the first result includes ayam in the restaurant name itself. The second result has the word chicken, which was one of the expansion terms for ayam, as a translation. The second, third, and fourth results have a cuisine, if you can see at the back, called chicken and duck. Since these results matched the cuisine, they were boosted up as well, even though two of them did not contain either chicken or ayam in the restaurant name. And in the last result, we don't see any obvious matches in the name itself, which means this result was surfaced because it had dishes that were called either ayam or chicken.

So let's look at how, once we built the query semantics engine, it actually improved performance. Looking at the session abandonment rate, we saw a 10% drop in the sessions that ended up being abandoned, so roughly a 10% increase in sessions resulting in a booking. The null search rate saw a 99% drop, meaning 99% of searches that previously returned very few results now had a significant number of results. On search-to-click conversions, we did not see much of a lift, 8%, but on search-to-booking conversions, we saw a 30% lift. And this was just one of the first versions that we built; we're still iteratively improving both the intent classifier and the query expansion bit.

Some of the future work: we're trying to add time and geographical context to our expansion terms (I'll take questions once I'm done), and maybe use customer-specific booking and search history so that the expansion terms are somewhat personalized. Some related efforts we're working on are building a food knowledge graph, for which we're currently using open-source data from DBpedia, and building auto-suggest and auto-complete features.

Going back to the key takeaways that I mentioned at the beginning of the talk: we talked about how we're using word embeddings to make a search engine intelligent, in the form of an intent classifier as well as embeddings for query expansion. We talked about some of the data challenges we face in intent classification, because sometimes the query can be misclassified or the intent can be ambiguous, and about how we structured our data to train embeddings for query expansion. And we talked about how we chose our metrics for evaluation based on the total addressable market that we saw. That's it, thank you. If you have any questions on the approach that we took, or if you have any suggestions on how we can improve our system, I'd be happy to take those.

Thank you. One question that I had was, when you talk about search-to-click or search-to-booking conversions, do you look at the unique number of users who searched versus the unique number of users who clicked, or do you take instances of searches and clicks? What is the view here?

We take each search independently. It's not user-based; it's based on each search session that happened. So each time a user typed in a query, did that result in a booking or not?

Okay, so if I search ten times, I would count as ten searches?

Yes.

Thank you, Ishita. It was a nice session. Two questions I do have. One is: earlier, your search was hitting that Elasticsearch cluster directly. Once you added those two components, how did you handle the latency part? Because in the middle there will be more processing, and for the end user, if I just keep typing and the results are not appearing, that is a challenge. And the second, which is related, is on the spell correction side: I'm not sure which library you used, because natively Elasticsearch doesn't support that.

Yeah, so for spell correction, we did not use a library.
We included it as part of our code base itself. And your first question was, sorry? On the latency, yes. The latency did not increase by much. Our latency limit, I think, is about 200 milliseconds at the 99th percentile for the search response, and this increased the latency by about 10 to 20 milliseconds, because we used a Redis cluster to store the key-value pairs for expansion terms and so on. So it did not really increase the latency by much. That being said, this is because currently, in our experiments, we have static predictions for expansion terms and intents: we've taken the top N most frequently searched queries and built the system for those, so that it's easier for us to do A/B testing of control versus experiment.

But for query expansion, you used word2vec. Any particular reason why you didn't go for fastText in query classification, sorry, in the expansion phase, as well?

It's mostly because we wanted more control over how we structured our data and over the kind of algorithms and hyperparameters we used. We actually built our intent classifier after query expansion. We were experimenting with embeddings and how we could use them, and we thought, why not just use embeddings for query expansion instead of representing each restaurant as a doc2vec vector? So there were a couple of iterations that we went through when we built query expansion. After that, we decided we wanted to build an intent classifier as well, and since we didn't want the intent classifier to depend on the spell-correct latency, we decided to use fastText there, because it takes care of misspellings very well. That's the reason.

It may help in the query expansion phase as well; that's what I think, anyway. And one more thing: after you trained your word embeddings, how did you evaluate how well they had been trained? Because it's unsupervised.

Yeah, so some of the pre-experiment validation was mostly human judgment. The rest we validated once we experimented with them: we saw the conversion lifts and attributions. That's how we know that if users are clicking on and booking from the results surfaced by the expansion terms, those terms are relevant.

Hi, I have a question. I'm expecting all those models, the intent classifier and the query expansion, are supervised models built using neural nets?

The query expansion embeddings were not supervised; those were unsupervised. The intent classifier was a supervised model.

What was the measure, actually, the metric you used to measure the model performance?

I think I mentioned this in the beginning as well. For the intent classifier, we looked at true positives. What we did was take the top 1,000 queries and manually label them; the top 1,000 queries were roughly 40% to 45% of the total search volume. We took those, manually labeled them, looked at offline predictions, and then compared them to the manual labels to find the true positives and the accuracy.

Sorry? We did think about that, but that would work better for a binary classification problem, right? We had multiple classes. Okay, point noted, thank you for your suggestion.

Hey, this is Jay here. I have two questions. One is on the algorithm, word2vec.
Have you considered using lda2vec? Because the intent itself is a big problem in the query the user is giving. So did you consider lda2vec, and why did you choose word2vec over it? Second question: you spoke about replacing the misspelled word with the most frequently occurring word. When there is a misspelled word, you look for the similar, correct word and replace it. What happens in the other case, if the misspelled word has the higher frequency and the correct word has the lower frequency? How do you handle it? Because I'd like to take this as an inspiration, and if I were to use it, I'll take your idea.

So we actually did not consider lda2vec because, to be honest, I have not heard of it before. So thank you, I will go back and read up on it. As for the spell correct: if the misspelled word has a higher frequency, then chances are it's not misspelled and it's a legitimate spelling. That is the base assumption on which we built our spell correct, because each of the more frequent words has, say, more than 10 to 15 misspellings, and those have significantly lower frequency, say 1 million occurrences compared to 100. So we were pretty sure that the misspelled words would get corrected if we only considered candidates within a certain edit distance that had a higher frequency. And if a word looks misspelled but has a high frequency, that means it genuinely occurs in our corpus a lot, and we would want to search for it.

Can you explain the food knowledge graph that you are working on?

Essentially, this is something we actually just started working on a little over a month ago. We took data from DBpedia, which is basically a graph network built on top of Wikipedia data, and we ingested it into Neo4j. We are planning on using that for our next version of query expansion: based on traversing the graph, what are the most similar words to the search query? And yeah, we're still working on it.

Now you'll have to work on converting natural language to Cypher queries as well.

Yeah, so one of the things we are looking at is how to map the search query to a node in the knowledge graph, as well as how to map nodes in the knowledge graph to entities in our content. For example, if there's a node called ayam, or chicken, how do you map that onto the different chicken dishes that exist in our database? These are two things we're still working on; we don't have a solution yet. But if you have any suggestions, we can catch up after this, and I'd be happy to hear them.

Hey, regarding the spell correct: do you use Peter Norvig's algorithm?

A modified version of that.

Okay, so it would be more CPU-intensive than memory-intensive, because SymSpell stores all the possible errors, or combinations of them, in the database and then looks them up. So do you see a CPU-intensive approach?

Yeah, it is CPU-intensive.

And what are the latencies for this, at the 99th percentile?

I don't have the exact numbers, sorry about that.

This one? Yes. So, for example, for the third format, if you have a query, say chicken burger, and the cart item that was ordered was, say, a Zinger burger from KFC.
That means that for "chicken burger", if you consider the center word to be Zinger, then chicken and burger would be the context for that. In the first format, the sentence was extremely long, and it varied a lot across different bookings, because there might be people who order from KFC and just order a Zinger burger, in which case the sentence would be, say, four words long, while there were people who ordered a lot of things with very long dish names, et cetera. I think the maximum sentence length we saw in the first format was around 250. So the window size we would have had to use was very large, since otherwise words towards the end would not see the search query, which is at the beginning, as context.

So when you change to the query-and-cart-item lines, the second line is no longer in the context of the first line, and the large window size is not necessary?

No. And in that case, cart item one and cart item two would be completely different data points.

Yes. Okay, no worries. So query expansion works best for single terms. If you have multiple terms, how would query expansion go?

We have built embeddings for unigrams as well as bigrams in our dataset, but that's the maximum we go to, because at the moment we are considering, say, the top N most frequent queries in our experiments, and most of them are either one-word or two-word queries.

Even the boost factor would be with respect to the same? So basically the boost factor is term-specific, right, based on the synonyms and the expansions? But the boost factor would need to be generated for the multi-word terms as well in that case.

Yes, it's there. In the Redis map that we have, the search query is the key, and the value is a list of dictionaries, where each dictionary has the expansion term and the corresponding cosine similarity.

Okay, thank you. One last bit of suggestion, for autosuggest: you could have trained on queries as well, instead of content, to get the potential candidates for expansions.

Okay, we can catch up on more of this afterwards. And your question, we can also take after this, I think. Thank you, everyone.