Tēnā koutou. Yes, yes. Ko Rere-No-A-Rangi tōku ingoa. He uri ahau nō Ngāruahine. Arā ko Ngāti Haua, Ngāti Tū aku hapū. Āe.
I'm Rhys, Rhys Arn. I work at Victoria with Sydney and with Rere. Sydney's here in spirit. She left it to the boys.
So, I thought I would start off with a whakataukī. So, whakataukī, our proverb, one of my favourites: me titiro whakamuri kia haere whakamua. In order to understand where you are going, you must look to the past, or conversely, in order to know where you're looking, you must understand the past. So that kind of underpins our approach to our problem and to our mahi. So this map behind us shows us the extent of land confiscations in Taranaki. By around about the 1860s, there were just over a million acres of land that had been confiscated. Come to about 2019, about 70,000 acres had been given back to Māori, so about 5%. Well, yeah, just about 5%. The reason why I've shown this is because it gives us an understanding of the kind of context that we're working within, the historical context. So we work with Parininihi ki Waitōtara. They're a Māori land trust, and they have been tasked with administering a good proportion of the remainder of these lands that were given back to Māori. They have a shareholder register of about 10,000 whānau, and they can contact around about, ambitiously, we'll say, 30% of their shareholders. There's about $5 million in dividends that are yet to be paid out to these missing shareholders. And in addition, if you needed to make any kind of administrative changes to the land, you need to have 75% consensus to make those changes. So our problem really is trying to find people. That's the project that we're working on. How do we find people? So in order to figure out how to find people, we needed to figure out how people go missing, and what it is to be missing. Yeah, so I guess that's what I mean by the whakataukī: we must understand where we've been to know where we're going.
Technical difficulties. Here we are. So if you were to go on to Māori Land Online and search for a landowner, you could go to their website and it would show a bunch of names. So if I did the same for my nana, it would show her Māori name. It would show her English name. And then a combination of those first names with some of her married names. Didn't have TV back in those days, so she had a few marriages in her time. She told me that. That's not me saying that about her; she said that to me. So over the years, the succession of land, or the changing of interests in land over generations, is a flawed process. But also, when we're trying to think about how we reconstruct the past, we have a look at the data and it's messy. So if we take into account the six or so different names that represent my nana, then if we look in the data, there'll be six or seven different entities that represent the same person. So how do we know which one is which? How do we know who is who, and how do we find people?
It didn't take us long to realise we were dealing with a very large data set, and we needed to figure out enough of the meaning of the very, very, very rich data that was in there to make sense of it. So just stepping back a moment, we're funded by a National Science Challenge and there's kind of an expectation that we do science. But we figured that we had to work towards that, and we had to start with some data. We didn't want to try and make a machine or something do what we thought the Parininihi ki Waitōtara people already had under control. We needed a different approach. So we thought: we will help you with the context, and the context is the Māori Land Court, a very rich and detailed context. So we just harvested it. We harvested the whole thing. All of it. We ended up with a triple store with 30 million triples in it. Oh, okay. That gave us some data.
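As a minimal sketch of the multiple-names problem described above: one person with two first names and a few surnames turns into several separate entities once each recorded name is stored as its own subject in a triple store. Everything here is an illustrative assumption rather than the project's actual data or schema: the names are invented, and the `owner:` identifiers and the `hasName` predicate are hypothetical.

```python
from itertools import product


def name_variants(first_names, surnames):
    """Return every first-name/surname combination a register might hold for one person."""
    return [f"{first} {last}" for first, last in product(first_names, surnames)]


def as_triples(variants):
    """Store each variant as its own entity, as the raw data would.

    The 'owner:N' identifiers and the 'hasName' predicate are hypothetical.
    Nothing in these triples says the entities are the same person.
    """
    return [(f"owner:{i}", "hasName", name) for i, name in enumerate(variants, start=1)]


# Invented example: one Māori first name, one English first name, three surnames.
variants = name_variants(["Hine", "Jane"], ["Tamati", "Smith", "Brown"])
triples = as_triples(variants)

for subject, predicate, obj in triples:
    print(subject, predicate, obj)
print(len(triples), "entities for one person")
```

Two first names and three surnames already give six entities, which is the scale of duplication mentioned for a single person. Linking those entities back together is the part that needs the extra context discussed in the rest of the talk.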
How are we going to go about developing our understanding of what that data is, and render it computational, so that the real scientists (and we have other people in the team) could get on to the linkages: the multiple names of his grandmother, the occasions that people have come into court over the issues of land for so long? This is an early attempt. This is April Fool's Day. We got there. We got there with an ontology called CIDOC CRM. CRM is a conceptual reference model, thank you, and CIDOC is French, so I won't go there. But this is an ontology, and if you go looking at what people say about it, they will mostly say it's way too complex and don't use it. Big ups to Rob Sanderson at the Getty for taking Linked Art into that complexity, because they needed that complexity, and we thought we needed a rich ontology. We just kept working away at it. We would get on the plane, go to Taranaki, have our idea of what the data meant and what was valuable in it completely blown apart by Adrian, and then we would come back and get on with re-comprehending our data. We just hammered away at it.
We were lucky enough to sit with Adrian, who had 15 years of experience in the Māori Land Court, and we were lucky to sit around some whakapapa experts. What that meant is we were able to fine-tune the ontology to match what the data landscape was. And I guess the bonus of an ontology like this is that we can start to homogenise, or not homogenise, make interoperable, disparate data sets. So if we have a data set that is very messy and we get some understanding, some kind of base semantic meaning, of that huge data set, we can start to use other data sets that talk about the same people, the same entities, and they will hopefully tell us a little bit more and help connect the dots. The next step for this ontology is to hopefully localise it, because there are some things that, although this international standard can help to explain what, say, interest in Māori land is, or
identity or stories. So the next step is to localise that, and to find the places where we've had to try to retrofit this thing, this colossal kind of ontology, onto our data: where can we find the gaps, and, having a bit more understanding of te ao Māori, how can we use this international standard and make a bit of a localised version of it?
So while we were doing all this work, we constantly had in our mind: where is the analytics going to fit? How are we going to put his grandmother back together again? And we had a lot of discussions with Adrian at PKW. We were also talking with Marcus and Valerie. Marcus is our machine learning, Bayesian stats, probabilities guy, and Valerie is his student. So we were talking with them, and we basically had an approach handed to us, and that was that we can identify people if we know their brothers or their sisters. So that was really the crucial insight, and it was pretty much Adrian saying: well, you might have 3,000 names in that list, but if you look at the ones who got the same share amount, that was probably a division between children, so you don't have to look at 3,000. And in fact it's very awkward, because you're scrolling the screen up and down. All the women go all over the place, because their surnames change. The men tend to stay in a tight little group, but the women are mostly older women, they're all married, and they don't succeed to interests until they're the senior one in the family. So how are we going to find brothers and sisters? We ended up with a triple lock on it: same minute book reference, same shares, same land block. Now, each of those things taken alone might not get you very far. So then we could take our huge data sets and pull them into these groupings. That's fine. We did that. We got 570,000-odd groups of what we think might be siblings. 570,000, except that we've just discovered that we think we missed out half the data. We did, actually, or kind of more. We are going to have to get back to our
conceptual reference diagram; we've got some things that we sort of missed, 300. Well, while all this was going on, we were looking at another source. We had our eye on births, deaths and marriages. We ended up going with births, and we pulled births out of the BDM historical records, as many as DIA would put on a screen for us. We grouped them up. For pre-1920 birth records, we have about 330,000 groups. So there we are, we're celebrating, we've got this. The analytics people are just going to have to solve it. They've just got one job: give us a probability. How hard can that be? Well, I think now we're catching up. We're only two weeks ago now, three weeks ago. We've done it, except we needed to know that at least one pair of siblings could be found in historical births, deaths and marriages, and that what looks like exactly the same pair of siblings could be found in Māori Land Online. And we had a presentation we were doing. We were going to New Plymouth, weren't we? We were going to New Plymouth, that's right, to tell them what we'd been doing. We needed something. So I turned to this whānau, Harris. If there's any of you who trace your ancestry back to the Harrises in the Hokianga, well, it'd be nice to see you. Almost certainly, almost certainly.
So if we can just roll a slide on, we'll take you really, really quickly through. So that's the book. Elina is my mother's mother. The Twink is Mum, the blue ballpoint; the pencil is me. So, rolling straight on ahead: there's two of her there. There's a story there, but I have no idea what it is. But she is at least there once in births, deaths and marriages. And rolling on, she is there in one of our groups. We pick one out of 330,000: there she is, and those are her brothers and sisters, according to births, deaths and marriages. Rolling on, there she is. There she is in Māori Land Online: a court reference, hallelujah (many of them don't have that, but this one does), shares, and a land block. And of course it's on Motukaraka Road, of course. Our group: we found her brother. We've
also found her maiden and married name, but we found her brother. So that was a very, very nice moment. And now, we didn't write this. None of you are allowed to leave. This is mathematics. This is Marcus and Valerie, and where their thinking is at the moment. So their job, one job, is to just put a probability on that, and that means that they need to build a model for how names change and why a name might be not quite recorded the same. And all of that theoretical machinery, we are going to turn it into something running on the whole thing, and we have the national do some heavy lifting for that. We've got some big toys.
What was really interesting with the birth records is that we can only access birth records that were registered over 100 years ago, and so you'd think, if you're looking for current owners of land, then it's not going to work. However, as we can find (I mean, young man, but his grandmother is still considered a current owner in Māori Land Online, and she was born at the turn of the century; and my great-grandfather, he was born in 1911 and he's still considered a current owner), there have been two generations of people that should have succeeded by then. But I guess it highlights the actual fracturing of whānau, because you need to get people at the table to walk through these processes, to talk about what we do with this land. This is like my little slice of heaven, South Taranaki. This photo was taken just down from the urupā where my great-grandmother and my great-grandfather are buried, and they were both shareholders of PKW as well. So I guess, bringing it back down to some version of Earth, what it really means is that there are 70% of this register of 70,000-plus people that can't be contacted. What it really means is that they are unaware of where they connect or how they connect, or that they perhaps don't want to be connected. But I'd like to think about the other side of it, which is that there's a lot of opportunity in this
project, and this mahi is seeing who we can find and how we can help them reconnect back to land, and then, through rebuilding these kinds of structures of whakapapa, helping them understand how they connect to land, helping them figure out what this place is that they whakapapa back to and why it is important for them. We're not doing this alone. As Rhys said, we've got some other bright and beautiful faces here. And there's something to note as well: it's quite a multidisciplinary team. We have backgrounds all over the place: information management, political science, law. That kind of interdisciplinarity means that we are trying to tackle a problem with a kind of holistic view, because data science and tech is only going to get you so far. A problem that is not tech at its core is going to need more than just tech to solve it. Yeah, yeah. If anyone knows how to match a second verse, you can stand up and join in. All welcome.
After the talk, minutes for questions. If there's a couple of people that have any burning questions, I have a mic, so I will bring you the mic.
I can hear from here.
Kia ora, I'm Jane from the University of Waikato, and we have a lot of Māori land books there. I was just wondering, is there something that could be used in other areas at some stage? You'd obviously have to go out and see the iwi and get that information from them, but we have a lot of people coming in inquiring. So what are the prospects with it further on?
Well, we're relying very much on our collaborators, and we're really wanting to land a bit of a fish before we start saying how nice it is. But at this stage, what we've done is, we believe, with publicly available data. Those data sets that we're working with are Crown data sets, and so at least this exploration that we're doing at the moment we would see, hopefully, if it's successful, as rolling out quite widely. The work that we're looking to get started with Parininihi ki Waitōtara sooner rather than later is actually to use other resources like this to tell us what's working and, more importantly, who didn't get found out of the data harvesting, the data munging, the data rearranging, and the analytics and the assignment of probabilities. So we're looking to work initially on having an ability to fully assess what we're trying to do, and hopefully then, at that point, it's really, I would hope, good to go.
Just something to add as well: we're keeping the engagement end with the community that we're working with. So people like Adrian and Mitchell Detay, who are actually based in Taranaki, based in the community, have rapport with the community. It's up to them to be the outreach. We can just give them as much information or as much mātauranga as we can, but really leaving it in their hands is the right thing to do, because, as we all know, relationships take years to build and can break in a second. So it shouldn't be from some freaky data people in Wellington. It should be on the ground. It should be at the marae level.
Any further questions? One more?
Kia ora to both of you. Just a couple of comments, really: that I really enjoy, or like, the way that you're working, both on the level of looking at that model and kind of extending, and possibly, depending on how the work goes, maybe decolonising a little bit, that international model, and engaging with that complexity and seeing it as an opportunity rather than "oh nah, too hard", but also the way that it is relationship based and led. That seems to me to be a really exciting way of using those skills that the university communities have, but being led by PKW in this case, and the relationships with that. So just a mihi to both of you. Kia ora.
I think that's a good place to wrap us up. Homai te pakipaki tino whakahirahira mō ngā kaikōrero o tēnei rā.