That's a really hard act to follow; that was a fantastic presentation. This will be a little bit shorter, but I think it's going to be fairly complementary. Usefully, I'm also talking about the ANZSRC Fields of Research, but for a different purpose. I'm talking about an exercise that I did last year, I guess, which was mapping a subject classification to the ANZSRC Fields of Research, and Rowan has usefully introduced the Fields of Research, in case anyone's not familiar with those.
Can everyone hear me okay? Does that sound all right, Simon?
Yes, you're good. Thanks, Les.
Thank you. I was having some problems before.
Okay, so there's a little story. The work that I did on this mapping was to address a particular need that was communicated to me by the Australian Data Archive. That was about the data set metadata that they share with (sorry, I'm just going to minimise something on the screen) Research Data Australia. The issue here, as I understand it, was that Research Data Australia likes to have Fields of Research codes in its metadata, and we didn't have those codes available in the Australian Data Archive metadata. So I guess there are two, or maybe more than two, ways forward. One is to do a big sort of back-cataloging exercise where you tag all of the data sets in the Australian Data Archive with Fields of Research codes; or, hopefully, there's some other way. This is required in the Research Data Australia repository. I'm not sure how hard a requirement it is, to be honest: I'm not sure if it actually blocks records going in, or if a record just gets flagged as not having those codes. Nonetheless, there's an intention to get those codes in. What we do know is that the Australian Data Archive has been tagging its data sets with a controlled vocabulary.
So this is a good start in trying to address this problem. That vocabulary is the Australian Public Affairs Information Service (APAIS) Thesaurus. Now, we're interested in making a mapping between APAIS, as I'll call it, and ANZSRC, so that perhaps some kind of bulk tagging of the data sets with ANZSRC codes could be done using that relationship. And we can map to ANZSRC using linked data principles, let's call it, because we have URIs, which is great. So here's an example of a Fields of Research concept, industrial relations, and it has a URI. It's distributed, and obviously managed, by Research Vocabularies Australia. It's available in interoperable formats: JSON, RDF, XML, et cetera. So that's in good shape.
I suppose in contrast, this is the most machine-readable version that I could get of APAIS. I work in Hawthorn; I can go down to Readings bookshop and order the last printed edition of APAIS, but really it hasn't been published or updated in over 10 years. In this presentation I do want to talk a little bit about, I guess, the collaboration context. I did have conversations with the National Library, who originally developed APAIS and maintained it for a long time. I also had conversations with RMIT Training, who had taken over some kind of custodianship of APAIS. But between the two parties, I wasn't able to get a machine-readable copy of the vocabulary, and that's just because it wasn't available. I was able to get this spreadsheet from the Australian Data Archive, so this is what I was working with.
So unlike a sophisticated collection of triples, what we have here is a label, or a term, on the left-hand side. There is a standard number, which is important. A synonym or an alternative label we can see for industrial relations is the used-for term, labour relations. It has some narrower terms defined, delimited with pipes in that column.
And also some related terms, which are non-hierarchical. So this is what I'm working with. I took it upon myself to import this into the RVA editor, so that's the PoolParty environment. There was a fair bit of stuffing around. I found, and I have found this before working with PoolParty, that when you have a vocabulary with a mix of hierarchical structures but also equivalence or synonym references, it's very, very difficult to import the whole thing at once. You really have to import one structure at a time, and then there's a bit of manual work after that. Possibly I could have done it better with some help from a developer or something, but I did, unfortunately, spend a bit of time trying to get this in. In hindsight, the next time I undertake something like this, I'll seek more help, I think.
But nonetheless, I got it into the RVA editor. This is an editor account front page. So here's APAIS. You can see the address there, editor.vocabs, and there are some top concepts. As you can see, all those top concepts start with A, and that's because there are over 200 of them. APAIS, in my humble opinion, is not very well structured. It has a very, very broad top level. But that's how it is, and I've made no changes to the structure. I've just imported it as is into this environment.
Here's a look at APAIS under the hood in the RVA editor. At the top arrow, there's the preferred label, and it has its own URI minted by the system. You can see the narrower concepts that were imported at the second red arrow there. The whole tree is displayed on the left-hand side. There's a bit more detail on the right-hand side of the page. We can see that there's an alternative label defined there, labour relations. That was in APAIS already. And I've taken the term number from APAIS and defined that as a notation. So these are all SKOS properties. There's a SKOS notation property.
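As a rough illustration of the import described above, here is a minimal Python sketch of how one spreadsheet row could be turned into SKOS statements. It is not the PoolParty import itself. The column names, the base URI, the term number and the narrower and related terms in the sample row are all hypothetical; only the SKOS property names (prefLabel, altLabel, notation, narrower, related) come from the SKOS vocabulary.

```python
import csv
import io

SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/apais/"  # hypothetical base URI, not the one the editor minted

# Hypothetical sample in the shape described in the talk. The term number and
# the narrower/related terms are invented for illustration only.
SAMPLE = (
    "term,term_number,used_for,narrower,related\n"
    "industrial relations,1234,labour relations,"
    "arbitration|collective bargaining,trade unions\n"
)


def slug(label):
    """Turn a label into a URI-friendly local name."""
    return label.strip().lower().replace(" ", "-")


def split_pipes(cell):
    """Split a pipe-delimited cell into clean, non-empty values."""
    return [part.strip() for part in cell.split("|") if part.strip()]


def row_to_triples(row):
    """Map one spreadsheet row to (subject, predicate, object) tuples."""
    subject = BASE + slug(row["term"])
    triples = [
        (subject, SKOS + "prefLabel", row["term"].strip()),
        (subject, SKOS + "notation", row["term_number"].strip()),
    ]
    for alt in split_pipes(row["used_for"]):
        triples.append((subject, SKOS + "altLabel", alt))
    for narrower in split_pipes(row["narrower"]):
        triples.append((subject, SKOS + "narrower", BASE + slug(narrower)))
    for related in split_pipes(row["related"]):
        triples.append((subject, SKOS + "related", BASE + slug(related)))
    return triples


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
triples = row_to_triples(rows[0])
for triple in triples:
    print(triple)
```

Keeping the hierarchical (narrower) and non-hierarchical (related, used-for) columns in separate loops mirrors the point made in the talk that each kind of structure tends to need its own import pass.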
And that notation property is where I store the term number. Right, so what I've also done is scan the ANZSRC Fields of Research. Quite manually, I'll stress at this stage, I've combed through the Fields of Research and then gone looking in APAIS for matching concepts. So here's an example of a URI from the Fields of Research concept we were looking at before. I've put that in a field here called Exact Matching Concepts. And that... let me just see... yeah, okay.
So there's a couple of things I want to say about this. I thought about titling this talk "How not to do a vocabulary mapping", because I really think that there are two things that I've done wrong here. First of all, the mechanism that I've used, which is very manual: me literally combing through the tree of Fields of Research. I mean, Fields of Research is about three or four levels deep. It's not impossible to comb through manually, but probably there's a better way of doing that, and I'm going to show you a better way of making quick matches between vocabularies in this environment.
But another thing I think I've done wrong is that I've defined this as an Exact Matching Concept. Now, Rowan talked a little bit about the many considerations that go into deciding whether concepts should be linked as Exact Matching or Close Matching, and that was really useful. I can really relate to that scenario, where I've been doing some consolidation work on some other taxonomies. One thing I often want to know, when two terms appear to be the same, is how the first one is being used in a particular repository. Has it been used consistently? Has it been used differently to how the other term's been used? And there could be a number of reasons for that. It could be that it's a structural...
Yeah, it could be where the term sits in the vocabulary, how it's structured, what its broader terms are and what its siblings are as well, or it could just be how it's been used in a particular cataloging operation.
But I just want to talk a bit about this distinction between an Exact Matching Concept and a Close Matching Concept. I got some way into this project and I thought: sometimes I'm using Exact Matching Concept when I feel very confident about something, and I'm using Close Matching Concept when I feel a bit less confident about something. And I started to wonder if that was the best way to be progressing. So I actually went back to the SKOS reference to have a look at what these relationships are, and I realised that I should have been a little bit more aware of what these mean. If we look at the definitions of a Close Match and an Exact Match in SKOS: with a Close Match, I can't make assumptions that the other party would make the same match back to my vocabulary, whereas with an Exact Match there's an assumption that the match is transitive. And I think with that comes an assumption that there's been some kind of dialogue between the parties: a dialogue between me as a vocabulary manager and someone else as a vocabulary manager, and some sort of agreement, some sort of handshake. And I can't say with my hand on my heart that I have that agreement between APAIS and the Fields of Research. I haven't consulted enough, really, so I'm really in no position to make these kinds of assertions that there are Exact Matches between these concepts. So, usefully, the way I interpret these SKOS definitions is that they really suggest that if you're going to make an Exact Match, you need to be doing certain consultative work, I think, to bring parties together.
The second thing that I did wrong is really the manual linking I did.
There is a facility in the Research Vocabularies Australia editor, the RVA editor, for doing batch project linking. I think Rowan was referring to this, and I'm sort of showing you the back end a little bit. So in this screenshot, I've got APAIS on the left-hand side, and on the right-hand side I've got another vocabulary. This is just a dummy test; it's called A Thousand Terms. There's a facility here for matching concepts. I can scroll the left-hand terms or the right-hand terms, find what I think is a match, and drag one term onto another, and that creates a match. But there's also a facility to do what's called batch linking. If I do that, the system will parse all the terms and go looking for matches. So if I hit that batch linking, I end up with two columns, one from each vocabulary. It's then up to me to review those: on the right-hand side, you see a thumbs up or a thumbs down, and I can decide whether or not they're a match. I can also change what kind of match it is. You can see that exact match is the default there, for better or worse. So this is useful for getting through the list more quickly, though you still make decisions along the way to approve all the matches.
I'll just point out something else which is really useful. Here's a match that the batch linking predicted, and the reason it's predicted that industrial relations and labour relations are a match is that, in the vocabulary on the right-hand side, industrial relations is the alternative label. So this matching algorithm doesn't just match preferred terms with preferred terms; it also matches preferred terms with non-preferred terms. And really, any sort of matching or mapping algorithm that doesn't take that into consideration is, I think, not very useful. So anyway, it's good that this facility is here.
Okay, so just in summary, if I was to go about this again, there's a couple of things I would have done differently.
Probably I really should have worked more closely with RVA, because this is something I went about really on my own. What I should have done is ask for a copy of the Fields of Research, or find some way of getting the Fields of Research into my editing environment, so that I could do this kind of batch linking. I really should have investigated that, and it's something that I'll probably be talking to Rowan and others about going forward. Also, possibly, assistance with liaising with the third parties. I feel a little bit out of the loop there, to be honest. I don't know who to talk to about the Fields of Research. I don't know who you talk to about the possibility of linking back to another vocabulary, or whether there's any interest in that or not. I don't know. Got no idea.
And the matching, well, yeah. I suppose the rule that I would use going forward with this kind of work is that it's a close match by default, and you only get to an exact match with that consultation in the background. That's my reflection on that. Okay. I think I'm done.
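The batch linking behaviour and the close-match-by-default rule from the talk can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm the RVA editor or PoolParty actually uses: the `Concept` class, the `propose_links` function and the example URIs are hypothetical. The sketch compares every label (preferred and alternative) on both sides and proposes `closeMatch` unless the reviewer asks for something stronger.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A minimal stand-in for a SKOS concept (hypothetical structure)."""
    uri: str
    pref_label: str
    alt_labels: list = field(default_factory=list)

    def labels(self):
        """All labels, preferred and alternative, normalised for comparison."""
        return {label.strip().lower() for label in [self.pref_label, *self.alt_labels]}


def propose_links(left, right, relation="closeMatch"):
    """Propose candidate mappings for a human reviewer to approve or reject.

    The default relation is closeMatch; exactMatch should only be chosen
    deliberately, after consultation with the other vocabulary's custodians.
    """
    # Index the right-hand vocabulary by every label it carries.
    index = {}
    for concept in right:
        for label in concept.labels():
            index.setdefault(label, []).append(concept)

    proposals = []
    for left_concept in left:
        seen = set()  # avoid proposing the same pair twice
        for label in sorted(left_concept.labels()):
            for right_concept in index.get(label, []):
                if right_concept.uri not in seen:
                    seen.add(right_concept.uri)
                    proposals.append(
                        (left_concept.uri, relation, right_concept.uri, label)
                    )
    return proposals


# Example mirroring the talk: the right-hand vocabulary prefers
# "labour relations" and carries "industrial relations" as an alternative label.
apais = [Concept("http://example.org/apais/industrial-relations",
                 "industrial relations", ["labour relations"])]
dummy = [Concept("http://example.org/dummy/labour-relations",
                 "labour relations", ["industrial relations"])]

proposals = propose_links(apais, dummy)
for proposal in proposals:
    print(proposal)
```

Each proposal records the label that triggered it, so a reviewer can see whether the pair was found through a preferred or a non-preferred term before giving it the thumbs up or changing the match type.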