Personally, it's really nice to be back here after almost three years and to see all these familiar faces. I've missed you all as a Drupal community, so thank you for being here in this session. Let's get started. My name is Jibran Jal and this session is a location-based search case study.

First things first, the Acknowledgement of Country: we acknowledge the traditional custodians of the various lands on which we work today and the Aboriginal and Torres Strait Islander people participating in this meeting. We pay our respects to Elders past, present, and emerging, and recognise and celebrate the diversity of Aboriginal peoples and their ongoing cultures and connection to the lands and waters of New South Wales.

Let's go through the contents of the presentation: a brief introduction of the agency I work for and of me, then the problem statement we are going to solve today, all the tools we have available on our platform to solve this problem, then the solution design, which will be a quick one, and then we'll jump into questions if you have any. Let's get started. We tried this video before; I hope it works.

We're making it easier for customers to access the information they need without having to understand or navigate the structure of government. We're working with each government agency to build nsw.gov.au as the one place for customers to source information. Here customers can find accessible, relevant content, complete with personalised services, and we'll keep improving their experience by capturing and acting on feedback. Our customer-centric approach has already delivered real value to the people of New South Wales over the last 12 months. And it's all in one location: nsw.gov.au.
As the program grows, our agency partners will see more benefits, such as greater exposure, with over 3 million visitors per month. And to support digital capability uplift, we provide platform training and service desk support so that agencies can spend more time focusing on the customer. We're building a best-in-class open-source tech platform with security at the heart. Together, we're improving the digital experience of government by putting our customers at the centre of everything we do.

Sorry about that, we didn't rehearse that bit. So the OneCX program is transforming the customer digital experience for the New South Wales government. It's making it easier for customers to access the information they need without having to understand or navigate the structure of the government. With over 750 websites across 10 NSW government clusters, the OneCX program is working with agencies to build nsw.gov.au as a single location for customers to source information. The OneCX program is making it easier for customers to find and do what they need by creating a digital experience of New South Wales based on their needs rather than our structures.

So let's begin the official part of the presentation. I have been a Drupal developer for more than 10 years now. I'm a core contributor: I maintain the Shortcut module and the Contact module in core, and I maintain Data Pipelines and various other contrib modules. We'll talk about Data Pipelines in more detail later. I'm a solution architect at the Department of Customer Service, and I've solved these kinds of complex problems along with the technical lead of the platform team and Nathan, sitting in the front row. So let's start with the problem statement. What was the actual problem? Someone said, we want location search on our site; let's design this.
They came up with this design: the top-right part with postcode, suburbs and address, and then you can click on "use my current location" and search from that as well. This little piece is what we are trying to solve; we are trying to find a solution for this problem.

So first of all, we gathered everyone in the room. This took a while, because every team has their own needs and different types of content. As you know, we are onboarding different agencies to our platform, so we have health-related content, disaster-related content, education-related content, and content for different communities. There is a whole range of content living on our platform. So we gathered everyone who had a voice or a concern, or wanted to bring this product to life on the platform, brought them into the room and started the discussion. The product owner is there to give their opinion about the must-haves, should-haves and could-haves; business analysts to record all this information; the UI/UX designers, because they built this design and want to understand whether it is feasible, how it will look, and what amendments they'll have to make after the solution changes; the content managers, because they have to tag the content with location information so that it can appear in search (what type of location information, we'll discuss in detail later); and then the technical lead, for what we need on the platform, and the solution architect, for what the actual solution is going to be after hearing the problem statement. So we got everyone in the room and had multiple lengthy sessions to hear everyone out and understand the problem. The conclusion of that discussion was: in that search bar, you can use postcode and suburbs.
And then, using that "use my location" link, you can use proximity search. So if your postcodes or suburbs match the content, you will see the relevant content, and if you click "use my location", the content nearest to your location will appear as well.

Then there's the content side: the content needs to be tagged, and we have a whole variety of content. Some content is tagged with region information, and that region information is different for different clusters of agencies within New South Wales. Some content is tagged with just suburb information; some is tagged with both region and postcode information, and sometimes suburbs as well. And if you are viewing events on our site, they have a specific street address. So a whole range of location information is in play. And that's not it: sometimes, for example, you have a disaster declaration in some area, and you have information attached to a referenced content item saying that this grant is available for a specific disaster area, the area affected by the disaster.

And as I said, we have different types of regions in New South Wales; I will discuss regions in more detail. We needed the full list of regions: what are these actual regions, and how many are there in total? One thing working in our favour was that Australia Post has all this information available publicly, so we could use it as a single source of truth. Because we work with a whole range of agencies, we have LGAs, local health districts, DPC regions, DCJ regions, transport regions, and the content team mentioned that there is a chance we will get other types of regions as well.
So, structure-wise and content-management-wise, this was starting to become a very complex problem. First of all, we needed a mapping from regions to postcodes and suburbs, but we needed to store those postcodes and suburbs somewhere, and because it's a large amount of data, database storage is not ideal for this type of thing. Then we needed to maintain the list of regions within the CMS so that content editors and the content team can go and edit it. Regions should know which postcodes and suburbs belong to them, and the content team should be able to update and edit those at the same time. So again: CMS changes and data model changes. And then, once that's done, how is this going to work with search, whether the content is directly tagged or referenced, and how is the proximity search going to work? We had to identify which tools we already had and which tools we needed to implement all this. Those were all open-ended questions, and we spent a lot of time thinking about these problems, going back to fix some of the other technical issues. And once it's rolled out, we want to make it available for the whole site: if you have news items or media releases, we want them to be searchable by suburb, postcode and maybe region, and if you want to see news items near your location, we want that ability on the site as well.

So we checked the inventory first: what tools do we have on the platform? First of all, our hosting solution is container-based, so we can spin up any container with the appropriate services we need and start using those services in a very secure manner. Then there is the Data Pipelines module in contrib. The purpose of the Data Pipelines module is this: there are a lot of data sources we can consume on the platform; data.nsw, for example, provides a lot of APIs.
The problem is that you can't build a system that depends on the incoming format all the time. You need a translator in between which can convert the incoming data into a shape we can reuse again and again. The Data Pipelines module in contrib helps with that: you send JSON or CSV to the module, create a data pipeline, and it uses a Migrate-style API to translate the data into a canonical form. You can then reuse it without thinking about the source data, because there's a template and the output will always follow the template.

Then, on our infrastructure, we have Elasticsearch set up. We have an Elasticsearch endpoint, our content is indexed using Search API and the Elasticsearch Connector module, and we can access that anywhere on the site or the front end. And in core we have tools like custom fields, computed fields, and field widgets which can improve the editorial data experience, because a lot of this problem was about data as well.

So let's talk about the solution design. The first point is to convert all the regions into postcodes and suburbs. Say we have a region, a city: how many suburbs are in it, and what are the postcodes of those suburbs? That list should be maintained within the CMS. The next item is to maintain all the region types in a single region taxonomy, so content editors don't have to move between multiple places: they go to one taxonomy vocabulary, add terms to it, and add additional regions if they like. It will be hierarchical: LGAs with the full list of LGAs underneath, DPC with all the DPC regions, then transport regions, and so on. And then, since custom widgets are possible, we write a custom widget where you can select, per content type, which parent term is applicable for that variety of content.
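To make the translator idea concrete, here is a minimal Python sketch, not the Data Pipelines module's actual API. The column names are assumptions about an Australia Post-style export: whatever the raw source looks like, one mapping function turns each row into the canonical shape the rest of the platform consumes.

```python
import csv
import io

# Hypothetical headers for an Australia Post-style export; the real
# export's column names may differ.
RAW_CSV = """postcode,locality,lat,long
2000,SYDNEY,-33.8688,151.2093
2010,SURRY HILLS,-33.8847,151.2113
"""

def to_canonical(row):
    """Translate one raw row into the canonical record shape, so
    downstream code never has to know about the source format."""
    return {
        "postcode": row["postcode"].strip(),
        "suburb": row["locality"].strip().title(),
        "location": {
            "lat": float(row["lat"]),
            "lon": float(row["long"]),
        },
    }

records = [to_canonical(r) for r in csv.DictReader(io.StringIO(RAW_CSV))]
print(records[0]["suburb"])  # Sydney
```

If Australia Post renames a column, only `to_canonical` changes; everything built on the canonical records keeps working.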
So if it's a health-related blog post or media release, you select health regions, and your content editor will only be able to reference regions from the local health districts; no other data will be available to them. That vastly improves the editor experience and editor training as well. Then we talked about having a data pipeline to import the suburb and postcode data, which is available to us from Australia Post.

And the last thing: yes, the data is in the pipeline, but how are we going to use it? We index it: we create an Elasticsearch index and index all this data in Elasticsearch so that our front end can use it without interacting with Drupal at all. Once all this is connected, we have a taxonomy which can be tagged on the content, or a referenceable content item which has region information attached to it and can itself be attached to the content. We want to collect all of this into one field which can be indexed and searched. For that we use computed fields: in the computed field, you gather all the information attached to a region, its postcodes and suburbs, into an array, and then you index that array with Search API. That is very handy, because you can run proximity searches and proximity queries if you are using Elasticsearch properly.

The next thing: because it's a one-to-many relationship, a region will have multiple postcodes and multiple suburbs, so we needed to support the geo_shape index data type in Elasticsearch. Out of the box, that functionality is not available in the Elasticsearch Connector module, so we had to implement it, and we had to read up on how the geo_shape API collects all this information.
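As an illustration of that flattening step, here is a hedged Python sketch, with invented field names rather than our actual schema, of turning a region's referenced suburb/postcode pairs into one document carrying a geo_shape multipoint. Note that GeoJSON-style shapes put coordinates in [lon, lat] order.

```python
# A region term references several suburb/postcode pairs; the computed
# field flattens them into one multipoint geo_shape for indexing.
region = {
    "name": "Far West",  # illustrative region name
    "localities": [
        {"suburb": "Broken Hill", "postcode": "2880", "lat": -31.96, "lon": 141.46},
        {"suburb": "Wilcannia", "postcode": "2836", "lat": -31.56, "lon": 143.38},
    ],
}

def region_to_es_doc(region):
    """Build a document body for the content index. The 'coverage'
    field would be mapped as geo_shape; multipoint coordinates follow
    GeoJSON [lon, lat] order."""
    locs = region["localities"]
    return {
        "region": region["name"],
        "postcodes": [l["postcode"] for l in locs],
        "suburbs": [l["suburb"] for l in locs],
        "coverage": {
            "type": "multipoint",
            "coordinates": [[l["lon"], l["lat"]] for l in locs],
        },
    }

doc = region_to_es_doc(region)
```

One document per content item then carries every point its regions cover, which is what lets a single distance filter answer "is any part of this region near me?".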
Once that was done, the next question was whether our hosting was going to support this Elasticsearch geo_shape data type or not. We had to upgrade our containers to the latest Elasticsearch version so that we could support it. And last but not least, because we want proximity search to work, we use the geo_distance query in Elasticsearch against the geo_shapes, which gives us the proximity search.

Looking at the design, we have the suburb/postcode source data at the bottom right. Because postcode and suburb form a unique combination, we can create a unique hash out of them, and we use that hash for lookups via the Elasticsearch index. Once that hash has been stored properly, we can reference it using custom fields and taxonomy terms. Those taxonomy terms can be attached to the content, or to referenceable content, and the content can create computed fields that pull the appropriate postcode and suburb information from the source data via the hashes. The computed field then gets indexed with the content in Elasticsearch, and our front end can run the queries. So, for example, as in the screenshot: if a user comes in and types a postcode, we go to the suburb-and-postcode hash via the Elasticsearch index, get the lat/long, and pass that lat/long in a geo_distance query to the Elasticsearch content index to get the appropriate results. That's how the whole puzzle came together. This is a flow diagram of it, in a very complex way, but I hope it makes sense. And that's it. Any questions?

Are you using AWS Elasticsearch? For now we are using the AWS Elasticsearch service, but with the PHP 8.1 upgrade we will migrate to OpenSearch and then start using OpenSearch.
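The hash and query steps can be sketched as follows. The hashing scheme and field names here are illustrative assumptions, not the production implementation, and the query body assumes an Elasticsearch version where geo_distance filters work against geo_shape fields.

```python
import hashlib

def locality_hash(postcode, suburb):
    """Postcode + suburb form a unique pair, so a stable hash of the
    normalised pair can serve as the lookup key in the suburb/postcode
    index. (The exact scheme here is an assumption.)"""
    key = f"{postcode.strip()}|{suburb.strip().lower()}"
    return hashlib.sha256(key.encode()).hexdigest()

def proximity_query(lat, lon, radius_km=20):
    """geo_distance query body for the content index: match any
    document whose 'coverage' shape has a point within radius_km
    of the user's coordinates."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": f"{radius_km}km",
                        "coverage": {"lat": lat, "lon": lon},
                    }
                }
            }
        }
    }

# Normalisation makes the key stable across case and whitespace.
h = locality_hash("2000", " Sydney ")
```

The front end never needs Drupal for this round trip: hash lookup gives it coordinates, and the coordinates feed the distance filter.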
And does OpenSearch support the geo features? We'll have to assess that, and add that capability if not.

Did you tag all the content manually, or was that automated, like automatic location tagging? Really good question. Some of the legacy or existing content had already been tagged with regions, because that was the convention; in the sidebar search you want, say, City of Sydney related news items. So some of it was already tagged, and for new content the content editors add the tags. What we did populate automatically was the taxonomy terms with their postcode and suburb information, because that's a big piece: one region can hold 10 or 15 suburbs or postcodes. That part is automated. But the content team needs control over the region piece, and there is no mapping you can generate that is reliable for regions: you can't say this content belongs to this region without some rules or predefined decisions, like this node ID belongs to this region. So, yeah.

And how did you resolve data integrity issues, like the spelling of suburbs differing between places, or was that not an issue for you? We are using the Australia Post data as the standard, so we didn't consider that an issue. But if it does become one, we can update the source data and re-index, and it will be fine.

So you've got regions as geo_shape data too; I'm guessing you're getting that from a GIS data source. Never mind. With overlapping geo_shape data, how do you deal with relevancy? Say you've got a region and an LGA and some content between them: how do you deal with relevancy if something is only tagged against one of those?
Yeah, so we did a spike on that part before implementing all this: would proximity actually work with multi-shapes or not? Within the shape data types in Elasticsearch, there are different kinds of shapes you can create: a polygon, a multipoint, and a few more. We went with multipoint, and with multipoint, Elasticsearch makes sure the proximity is properly relevant. So we are relying on that. If it falls short, we can write a custom script in a filter plugin in the search and make sure the relevancy is right.

Sometimes a postcode will go across two regions, you know, at the border. Do you use the geographic mapping, the edges, the multipoint, to try and say which side of the fence it's on?

That's a very good question, and I'm glad you picked it up. The reason we gave content editors the ability to update the suburb and postcode list within a region is to fix exactly the problem you are describing. Within these LGAs, local health districts and transport regions, postcodes and suburbs move: regions get reclassified, suburbs or postcodes get added or removed, the data changes, and Australia Post updates the coordinates as well. The way we are indexing, we can just change the source and it will re-index with a cascading effect. And within a region, a content editor can go, okay, this suburb or postcode is not there anymore, just remove it; because it's a multi-value field, you can edit it and remove that value like any text field. Once that is done, you have the updated information throughout, and we don't have to worry about the Elasticsearch part at all.
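The "closest point decides" behaviour we relied on can be sanity-checked with a plain haversine calculation. This is an independent sketch, not Elasticsearch's implementation: a multipoint region matches a radius filter if any one of its points falls inside the radius, rather than its centroid.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two coordinates in kilometres."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def multipoint_within(points, lat, lon, radius_km):
    """A multipoint matches a distance filter if ANY of its points is
    inside the radius -- the closest point decides, not the centroid."""
    return any(haversine_km(lat, lon, p[0], p[1]) <= radius_km for p in points)

# Sydney CBD against a region whose points are Parramatta and Penrith:
# Parramatta is roughly 20 km away, so a 25 km radius matches.
region_points = [(-33.815, 151.003), (-33.751, 150.694)]
print(multipoint_within(region_points, -33.8688, 151.2093, 25))  # True
```

This is also why trimming stale points out of a region's list (as described above) directly tightens the relevancy: fewer points, fewer spurious matches.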
We don't have to worry about the relevancy of the multipoint search, about which point is on the edge or not part of that suburb, because our data makes sure only the relevant postcodes are in the content index. I hope this answers your question.

So you use your index, rather than checking every object; you have that central index that settles which side of the fence, keeping it consistent at a higher level? Yes, because we need more control, and the content team wants more control over the source data as well. Anything else would be extra.

On to the second question. Something wasn't quite clear for me: when someone proximity-searches themselves with the button, or types a region or postcode into that search, does the search that actually hits Elasticsearch resolve a postcode hash or a postcode/suburb hash first and then simply send that to Elasticsearch, ignoring the details of the search text they typed?

Yeah, so there are two different paths here. If you have a postcode and suburb, we can get the exact hash for that pair from the Australia Post data. From that hash we get the coordinates, and from the coordinates we run the proximity query.

Cool, so you're sending in lat/long. Yes. And when you use "use my location", we already have the lat/long, so we can just run the content query directly. So there are two different paths, by intent, to evaluate the results.

So do the content items themselves have a specific address, or are they just tagged with a region? As I mentioned, there's a whole variety. Sometimes it's just regions: it's health-related content and it's valid for Far West health, for example.
So it's just tagged with the Far West health region. If it's a disaster declaration or a grant, it will have a collection of postcodes, and then it will have some region attached to it as well; that region might be a DPC region, a DCJ region or a transport region, it doesn't matter.

So with the proximity search, if there's no address, is it just the lat/long of, like, the middle of the area? No. As a user, you input something, and that input is translated into a lat/long. But proximity cuts both ways: your lat/long is a specific point, while the content, if it's only a region, is a blob, not a point, right? Yes, and that is what the Elasticsearch geo_shape support helps with: for a multipoint, it makes sure the closest point is what counts. Oh, gotcha; the closest point. Yeah.

Hi, just a quick one on the data pipelines you use to get data directly into Elasticsearch. What data was that? Was it separate from the taxonomy terms that content gets tagged with? Yes. The source data is a CSV from Australia Post. If you go to the Australia Post website, you can download the postcode and suburb information with the geo-coordinates in it, extract it as CSV, and we index that CSV into Elasticsearch.

So are you effectively keeping two lists of postcodes then: the data you're directly importing via the data pipeline, and the taxonomy terms that get tagged against content? Yes, you are correct there, with a slight correction: the data pipeline feeds one index from the Australia Post source, and then there's the actual content index which has all the suburb and postcode information attached to the content. So, as you can see in this picture, there's an Elasticsearch postcode-and-suburb index and there is an Elasticsearch content index.
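Putting the two indexes and the two search paths together, here is an end-to-end toy sketch with in-memory stand-ins for both indexes. All names are illustrative, and the flat-earth distance check is a demo shortcut, not our production geo_distance query.

```python
# In-memory stand-ins for the two Elasticsearch indexes described above.
postcode_suburb_index = {
    # key -> coordinates; here the key is a readable "postcode|suburb"
    # string, whereas production would use a real hash of the pair.
    "2000|sydney": {"lat": -33.8688, "lon": 151.2093},
}

content_index = [
    {"title": "City of Sydney news", "points": [(-33.8688, 151.2093)]},
    {"title": "Broken Hill grant", "points": [(-31.9558, 141.4655)]},
]

def search(user_input=None, user_latlon=None, radius_km=25):
    """Path 1: typed postcode/suburb -> key lookup -> lat/long.
    Path 2: 'use my location' -> lat/long directly.
    Either way, the final step is the same proximity filter."""
    if user_latlon is None:
        loc = postcode_suburb_index[user_input.strip().lower()]
        user_latlon = (loc["lat"], loc["lon"])
    lat, lon = user_latlon

    def close(p):
        # Crude flat-earth distance check (km per degree at ~34 S);
        # fine for a demo at city scale, not for production.
        return (abs(p[0] - lat) * 111) ** 2 + (abs(p[1] - lon) * 93) ** 2 <= radius_km ** 2

    return [d["title"] for d in content_index if any(close(p) for p in d["points"])]

results = search(user_input="2000|Sydney")
```

Both entry points converge on one filter over the content index, which is why the front end never needs to care how the user expressed their location.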
The content index is just like any index you have with Search API enabled, whether Solr or Elasticsearch sits behind it; it doesn't matter, you can search the content from it. So yes, we have two indexes.

So why did you have the two indexes? Is it because Search API doesn't really support the geo stuff? No. The top part of the diagram is the search front end, and it's not a Drupal front end; it's a partially decoupled front end, a React application, which takes the postcode or suburb from the input field, gets a hash value, and from that hash value finds the nearest content.

Thanks, I think I got it, yeah. Thank you. Thank you.