Today I'm going to talk about site search. By way of background, I work as an architect at a studio, and I've been working in digital for about fourteen years, a good part of that on site search. I've had the opportunity to develop custom search solutions in various content management systems, and I've also implemented paid and open-source third-party search products for on-site search.

Today we're going to talk about the types of site search that are available. We'll look at a case study I was recently involved in, and I'll discuss some of the tactics I employ when implementing site search. With a little bit of luck, you'll be able to take some of these lessons away and apply them to your own projects.

First, I want to share some observations from the last few years of implementing search. The first is that everybody wants Google. It doesn't matter what they say; they all go, "I need to have Google on my site." The problem is that's really expensive. Google has spent millions and millions of dollars on its algorithms, so replicating that experience is far too costly, however much people ask for it.

In addition, you find that search, particularly on-site search, is the first thing that gets neglected once the deadline hits or the scope starts to get squeezed ahead of go-live. The other thing is that it tends to be a black box. People get nervous about it: you've got this index, nobody really knows what goes into it, you throw some keywords at it, results come back, and nobody quite knows what happens under the hood. Hopefully I can share my experience and take away a little of that black-box feeling.

There are essentially two types of on-site search: pattern matching, and what I would refer to as a relevance engine.

With pattern matching, you take a keyword and throw it at the index. The engine goes through the index, matches that keyword against the documents, and increments a count for every match. The more matches a document has, the higher it appears in the result set. This is typically the out-of-the-box search solution; a lot of off-the-shelf CMSs use pattern-matching algorithms. It's a quick and cost-effective way to implement an on-site search, but it's not wonderfully sexy.

I'm not a big fan of pattern matching, because it can result in a really poor user experience. We've all seen the result set where you think, "How on earth did that result get to the top of the list for that keyword?" There's no visibility into how the result set is produced.
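To make that concrete, here is a minimal sketch of the keyword-count idea. The document set and tokenizer are invented for illustration; real CMS implementations differ in detail, but the ranking principle is the same.

```js
// Minimal sketch of keyword-count ("pattern matching") ranking.
// More keyword hits => higher rank, with no notion of relevance.

const documents = [
  { id: 1, text: "Family law advice for separating couples" },
  { id: 2, text: "Mergers and acquisitions: a legal guide" },
  { id: 3, text: "Law firm news: our family law team grows" },
];

function tokenize(text) {
  return text.toLowerCase().match(/[a-z]+/g) || [];
}

function patternMatchSearch(query, docs) {
  const terms = tokenize(query);
  return docs
    .map((doc) => {
      const words = tokenize(doc.text);
      // Count every occurrence of every query term.
      const score = words.filter((w) => terms.includes(w)).length;
      return { doc, score };
    })
    .filter((r) => r.score > 0)
    .sort((a, b) => b.score - a.score); // more matches first
}

console.log(patternMatchSearch("family law", documents));
// Doc 3 ranks first with three raw hits; doc 2 drops out with none.
```

The weakness is visible even in the toy version: raw term counts say nothing about which document actually answers the query.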
The other problem is cost: if you want to customize or modify the search, particularly with pattern matching, it tends to be expensive to change, and it tends to incur technical debt as well, because you've got to manipulate the core and mess around with it.

On the other side, we've got the relevance engine. A relevance engine is more like the Google experience: you do a search, somebody else does a search, people click on results, the algorithm blends all of that together, works out what the most effective results are, and returns them. It's based on a more intelligent algorithm rather than simple pattern matching.

Typically these relevance engines come as third-party products, often software as a service. They vary between open-source and proprietary engines, and they can be very expensive; some third-party products carry licence fees in the range of 30 to 50k a year. However, the good thing is that they offer a wonderful experience, both for the user actually running the search and for the business owners, who get the ability to manipulate the search and manage the index without developer assistance. Relevance engines, whether proprietary or open source, also tend to be very easy to customize: they provide support files and APIs that you can manipulate and theme however you want.

Now that we've covered the types of search, I want to introduce the case study. I was involved in building a website with on-site search for a large law firm of about 150 partners. Let me highlight some of the user requirements for the project.

Because it's a law firm, and law firms are built on individual expertise, people in the organization needed to be ranked across multiple capabilities: they had to be findable through family law, or mergers and acquisitions, or whatever their skill sets were. The business wanted to be able to tweak the result set according to business needs; they wanted to inject results at the top of the list based on their requirements. They also needed support for synonyms, because not everybody searches on the same terms. The results had to be faceted, so users could filter by section and categorization. The budget was quite low, particularly for search; the whole site budget was fairly tight. And we needed to deliver within six weeks, in time for a conference: the partners would go to the conference and hand out cards, and people would come back and search for their names on the site. On top of all that, they said, "We want Google." No time, no budget, but you want Google.

So the first challenge was to implement a relevance engine. How were we going to do that? The first thing we asked was: can we develop this in-house? The answer was no, we can't. Why? It was just too costly.
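To show the contrast with pattern matching, and to hint at why we ruled out building this in-house, here is a minimal sketch of the click-feedback idea. The text score, the log-dampened click boost, and the data are all invented for illustration; no real engine is anywhere near this simple, which is exactly the point.

```js
// Toy sketch of the "relevance engine" idea: blend a text-match score
// with behavioural signals such as clicks. All names, weights and data
// are invented for illustration; real engines combine many more
// signals, which is a big part of why building one in-house is costly.

const docs = [
  { id: 1, text: "family law advice for separating couples" },
  { id: 3, text: "law firm news: our family law team grows" },
];

const clicks = new Map(); // docId -> times users clicked it in results

const recordClick = (id) => clicks.set(id, (clicks.get(id) || 0) + 1);

function textScore(query, text) {
  // Same naive term-count idea as the pattern-matching sketch above.
  const terms = query.toLowerCase().split(/\s+/);
  const words = text.toLowerCase().match(/[a-z]+/g) || [];
  return words.filter((w) => terms.includes(w)).length;
}

function relevanceSearch(query) {
  return docs
    .map((doc) => ({
      doc,
      // Log-dampened click boost, so popularity cannot
      // completely drown out the text match.
      score: textScore(query, doc.text) + Math.log1p(clicks.get(doc.id) || 0),
    }))
    .sort((a, b) => b.score - a.score);
}

// Users keep clicking doc 1 for this query, so it rises above doc 3:
recordClick(1); recordClick(1); recordClick(1);
console.log(relevanceSearch("family law"));
```

Even this toy needs click capture, storage, score blending and tuning; multiply that by facets, synonyms and index management and the in-house option stops looking cheap.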
The development time involved in building indexing, search manipulation, and faceted results was just too pricey. So we decided not to build it and to look at third-party suppliers instead.

Now, the client said they wanted Google, so we thought, fine, we'll go to Google Site Search. Then this happened: in April 2018 we suddenly realized it had been discontinued. We panicked; we went, "Oh no, what are we going to do?" After a while we calmed down and expanded our search to the wider market. What we found was that there are loads and loads of search products out there, but the majority of them are really expensive, which didn't help us. It was very difficult to find a cost-effective search solution. So we refined our search, searched a bit more, and whittled the options down to five products. These are all software-as-a-service products, and they all met our cost criterion: each could maintain an index for under $1,200 per annum, which is very cheap when it comes to site search. They all met the content requirements too: faceted search, the ability to manipulate the index, and effective statistics engines. After some deliberation, we eventually decided on Swiftype for the implementation.

In summary, the key takeaways from this stage: when you're looking at a relevance engine, don't build it yourself. It's too expensive, and people have already done it before. I know it can be quite a sexy technical challenge, but it's just too costly to implement yourself. Also, third-party products give you visibility of the index; they're really effective at helping you understand what goes on in there and why particular results are returned for specific keywords. Finally, these products let you manipulate the results. Say you've got a particular keyword and you need to inject a service or a very important document at the top of the list, so you can move the customer along that journey; that's what these products are great at.

So, challenge number two. We'd made a decision and we had our search engine. We knew search was a critical success factor; it was vital for the site to be successful. The next stage was dealing with the unexpected. The first thing we did was develop the search engine in a test environment, and it was humming. It was perfect. We tested it with test data, it returned lots of information, it was brilliant. We were like, pat on the back, job done, drop the mic, leave the room.

But there was a problem. When we got to UAT and started testing with real-life data, the client said, "This is not meeting our expectations. We're not getting relevant information back." We weren't meeting their business requirements for the search, which was really scary, because we were a couple of days away from launching the site, and this was the first time we'd had the opportunity to test with real-life content. The client started to lose confidence. They said, "We can't ship this search. The search has to be right before we go live." So, unfortunately, we had to postpone the launch.
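Before moving on, a quick aside to make the earlier takeaway about manipulating results concrete. Here is a rough sketch of what "inject a document at the top of the list" can look like behind the scenes, assuming a hypothetical rules table; hosted products such as Swiftype expose this kind of pinning through their own dashboards and APIs rather than code you write yourself.

```js
// Sketch of the "inject results at the top" idea. The pinning rules
// and result shapes are invented for illustration only.

// Business-managed rules: when the query contains a keyword,
// pin a given document id to the top of the result set.
const pinnedRules = [
  { keyword: "merger", pinDocId: 42 }, // e.g. promote the M&A service page
];

function applyPins(query, results) {
  const q = query.toLowerCase();
  const pins = pinnedRules
    .filter((rule) => q.includes(rule.keyword))
    .map((rule) => rule.pinDocId);
  const pinned = results.filter((r) => pins.includes(r.doc.id));
  const rest = results.filter((r) => !pins.includes(r.doc.id));
  return [...pinned, ...rest]; // pinned items first, engine order after
}

// Usage: doc 42 jumps to position one; the engine's ordering is kept below it.
const engineResults = [{ doc: { id: 7 } }, { doc: { id: 42 } }];
console.log(applyPins("merger advice", engineResults));
```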
Back to the postponed launch, then. A couple of key takeaways from this: factor in time to test your search with live data, because nothing is a substitute for live data. And if you can't do that, manage expectations. Tell the business users: we have a search, we haven't had time to test it properly, so post-launch we will have to tweak and fine-tune it.

That brings us to challenge number three. We'd postponed the launch, and we knew the search didn't meet the requirements. The client had said they wanted relevance, and we had implemented relevance; but when we came to testing, relevance was not quite what they wanted. So we spun up a bunch of rapid prototypes, lightweight things we could test quickly to figure out what they actually wanted. And we found that what they really wanted was something called fuzzy search.

Fuzzy search is a kind of pattern-matching algorithm. It works like a spell checker: it looks at variations to find the closest words, much like what Word and Google do for autocorrect and spell check. So we figured that out, implemented it, and got it done.

The key takeaways from this challenge: don't be afraid to get it wrong. We got it wrong, we had to postpone the launch, but it gave us more clarity about what we needed to do and what the success factors were. Sometimes it's OK to get it wrong in order to get it right. And I also found that pattern matching can be perfectly acceptable. We spent all this effort on a relevance engine, only to find that a pattern-matching solution was more effective, and we could have discovered that earlier with more iterative development from the start.

So, in summary, on the case study: we didn't get the site live in six weeks; we got it live in eight. But we got it live in time for the conference, and everybody was happy. The site has been live for about nine months, the search is a great user experience, and we've seen an uplift in search conversions because the business has the ability to inject results into the index. We're still tweaking the search and improving the experience, using the analytics from the third-party product to tune the engine and get better results.

To finish, I want to leave you with three key points. First, allow yourself time to fine-tune the results with real content, and build that contingency into the project plan; it's vital when it comes to search. Second, when it comes to relevance, third-party products are the way to go; don't try to build one yourself, it's a waste of time and too costly. Third, remember to bring the business along on your search journey: consult them, show them the things that go wrong, so you can refine your understanding of the requirements, manage their expectations, and keep them with you on that journey.

And that's everything. I want to thank you for coming and listening to me talk today. Do you have any questions?
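For the curious, here is a minimal sketch of the fuzzy-search idea from challenge three, using classic Levenshtein edit distance. The word list and edit threshold are invented for illustration and are not what any particular product ships.

```js
// Minimal sketch of "fuzzy search": match query terms against the
// closest words in the index, the way a spell checker does.

function editDistance(a, b) {
  // dp[i][j] = edits needed to turn the first i chars of a
  // into the first j chars of b.
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) =>
      i === 0 ? j : j === 0 ? i : 0
    )
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1, // deletion
        dp[i][j - 1] + 1, // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1) // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

// Find indexed words within a couple of edits of a (possibly misspelled) term.
function fuzzyMatch(query, indexedWords, maxEdits = 2) {
  const q = query.toLowerCase();
  return indexedWords
    .map((word) => ({ word, distance: editDistance(q, word.toLowerCase()) }))
    .filter((m) => m.distance <= maxEdits)
    .sort((a, b) => a.distance - b.distance);
}

console.log(fuzzyMatch("aquisitions", ["acquisitions", "advice", "mergers"]));
// -> [{ word: "acquisitions", distance: 1 }]
```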
One question from the audience: why did we select Swiftype over Elasticsearch? Well, it's partly down to cost. The other applications had a slightly higher cost ratio; Swiftype was still under budget, and its ability to inject items into the index was a lot more sophisticated than the other products'.

As for why we didn't have real data earlier: with this kind of website, because it's a greenfield development, we simply didn't have the actual content. That's usually what happens when you develop these sites; the content is the last thing to come. The development isn't the tricky bit, it's getting everybody to write the content. It just wasn't there. We had to write all the profiles and create all the insights and that sort of thing, so we didn't have the data at the time.

What goes into the annual fees? These are subscription-based services, usually priced on the number of documents you index; the tiers we looked at were all capped at around 50,000 documents. They also charge on API calls: as soon as you throw a query at the index, that counts as one API call. So you've got to be careful if you're doing autocomplete, because each keystroke is an API call. Those are the lowest-tier subscription models; there's a bit of a price leap after that, but you get a lot more API calls and indexed documents.

Another question: you mentioned you built some lightweight prototypes to test out what kind of search the client wanted. Was that with real data? Yes. And what tools did we use to build them? Very, very simple JavaScript algorithms. That's how we established that we needed fuzzy search; it was just a very lightweight way of hitting the index quickly. What we found was that it wasn't the relevance engine itself that was the problem, it was how the results were returned, how the data came back from the API.

OK, thank you all. Thanks.
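A footnote on the pricing point about autocomplete: a common mitigation is to debounce keystrokes, so a request only fires once the user pauses typing. This is a generic sketch; the searchApi() function below is hypothetical, standing in for a real hosted-search request.

```js
// If every keystroke fires a billable API call, autocomplete gets
// expensive fast. Debouncing waits for a pause before querying.

function debounce(fn, waitMs) {
  let timer;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

async function searchApi(query) {
  // Placeholder for a real hosted-search call (one billable request).
  console.log(`API call for: "${query}"`);
}

// Fire at most one request per 300 ms pause, not one per keystroke.
const onKeystroke = debounce(searchApi, 300);

// Typing "law" quickly now costs one API call instead of three:
onKeystroke("l");
onKeystroke("la");
onKeystroke("law");
```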