I'm just going to try to address these questions for you: how do you choose which vocabulary to use to enhance discovery and description of your data sets, and how do you know their quality? These are questions that are being asked more and more. I know I'm a bit of a mad hatter of Australian informatics, and the hat I'm going to wear here is my position with CODATA as chair of the Academy of Science National Committee for Data in Science, which is also the de facto Australian national committee, because this is starting to become a serious issue scientifically. Next.

So what's the context? There's an emerging issue with vocabularies: to put it bluntly, they're proliferating, breeding like rabbits. A review, as you know, was carried out of Research Vocabularies Australia (RVA) that highlighted some issues, and this one came out in particular: the level of community or domain endorsement per vocabulary is unclear. We have authoritative data sets, but are there authoritative vocabularies? And if so, how are they identified within a vocabulary service? I'm sorry it didn't come out on the right, but within RVA there are 13 vocabularies on boreholes. And this is the issue, because the users are starting to say, well, gee, it's really nice having 13, but which one do I pick?

Now, the attitude in setting up RVA with Adrian Burton has been that if somebody wants to come along and publish a vocabulary in RVA, then we will publish it. We are not going to be judgmental and say this one's better than that one, because, as you can imagine, ARDC is research data infrastructure; we're not domain experts. We don't have the ability to differentiate, and who would want to go down that path anyway? So what we thought was: can we actually provide users with guidelines as to which are the better vocabularies? Next slide, please. Because when you think about it, what they're asking ARDC for is something substantial.
And I said to Adrian, oh my God, I'm not going to sit down and do that by myself; let's go and see what everybody else is doing. So Simon, Rowan, Adrian and I have run four sessions on this topic in the last six or eight months, starting with ESIP, the Earth Science Information Partners, which is an Earth science group. We then brought the same questions back to eResearch Australasia. We took it to the RDA plenary in the Earth and Space Science group, so we went back to the Earth, space and environmental sciences. And finally we presented it at the GO FAIR CODATA symposium. If you click on all those links, you've got all the documentation and all the presentations. In a nutshell, all four sessions came to the same conclusion: we're aware of the proliferation problem, and in some cases that proliferation is necessary, but are we really tackling the problem at the moment? And that problem is being able to help users know which vocabulary they should choose. Next.

I really want to say that I'm using a broad definition of a vocabulary: value sets, concept sets, vocabularies, glossaries, thesauri, lists, knowledge graphs, et cetera. We're putting all of them in that bag. Next. The reason we did that is that if you look at this diagram, which I've based on some work by Leo Obrst, you can see that down in the bottom left you have lists, which are just lists, and as you move up to the top right you've got people doing ontologies and the axiomatization of those ontologies, which is fairly sophisticated. We also started to notice a discrimination occurring: the people up in the top right doing ontologies were saying, oh, we should ban all these lists; they should be taken out of the equation. The bottom line is I'm a scientist by nature, it's my primary role, and in some cases a list is all we've got. So we can't discriminate on the maturity of a vocabulary; we just need to be aware of it. Ideally, I'd like to watch many groups start to move up towards the top right.
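That list-to-ontology spectrum can be sketched as a rough classifier. This is purely illustrative: the level names and the criteria for each level are my own assumptions based on the diagram described above, not a feature of RVA or of Obrst's work.

```python
# Illustrative sketch only: place a semantic resource on a rough
# list-to-ontology maturity spectrum. The levels and criteria are
# assumptions for illustration, not an agreed classification.

def maturity(has_definitions: bool,
             has_hierarchy: bool,
             has_axioms: bool) -> str:
    """Return the highest maturity label the resource qualifies for."""
    if has_axioms:
        return "formal ontology"   # classes, properties, axioms
    if has_hierarchy:
        return "thesaurus"         # broader/narrower/related links
    if has_definitions:
        return "glossary"          # terms with definitions
    return "flat list"             # bare terms only

# A bare list of terms sits at the bottom left of the diagram:
print(maturity(False, False, False))  # flat list
# Definitions plus hierarchy moves it towards the top right:
print(maturity(True, True, False))    # thesaurus
```

The point of the talk stands either way: a "flat list" result is not grounds for exclusion, just something users should be aware of.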
But at the moment we have to live with the fact that a lot of the vocabs in RVA are lists, and they're valid lists. Next.

So what is important to know in deciding whether you can trust a vocabulary? These are the questions we've been asking at all those sessions. In the machine-actionable world, unless we know what we mean by the terms we are using, communication can be fuzzy and poorly understood, particularly in transdisciplinary or multilingual environments. For effective and accurate communication, we need the adoption of shared vocabularies and terminologies. That's critical to the discovery and understanding of our published resources, and it will help reduce ambiguity while increasing interoperability. Next.

So how widely can I share my data and have it understood? As I said, many organizations have local lists that serve multiple functions, including enhancing discovery, annotation and description. But the group that can interact with that data set and its terms is only as large as the group that understands the definitions, concepts and languages being used to describe it. So that's another attribute we have to understand. Next.

With the increasing globalization of data and information resources, which we've seen particularly with COVID and climate change, enabling a common understanding of any concept used to describe or define a thing in our world is becoming critical. So we have this need to develop common vocabularies that can be used across broader communities and support the harmonization of information across disciplines and languages. But we need to be able to communicate to the users how reliable, persistent and usable the asset they want to use is. Is it governed or endorsed by an authority? Is it FAIR, which is what Simon has been covering? And, more importantly, what is the quality of the vocabulary? Next.
And again, as I said, we're going to take a vocabulary at any stage of maturity. Next. So the challenge is to communicate to the users information about a vocabulary's quality, sustainability and conditions of use, so that users can make informed choices. And it's not easy. Next.

What I started to do was delve into the work being done by Ramapriyan et al. on ensuring the quality of information. It was designed for the quality of data in science, and you can see it breaks quality up into four dimensions. For a vocabulary, the science dimension is the content: what are the terms being used, are they endorsed, are they valid? Then you put that content into a vocabulary, in the bottom right, as the product. Then you have to steward it and preserve it so others can use it. And finally you make it available as a service. So those are the four dimensions. Just an aside: this US group will be presenting at the ARDC Data Quality Community of Practice in April, if you're interested in following this quality-dimensions theory. Next.

And you can see here how myself and a couple of others have started to ask: what is scientific quality in a vocabulary? Who defined the terms? Are they recognized as authoritative? What is the vocabulary's structure? Does each term have a definition? Does it follow standards? Who is the gatekeeper, which is what Simon was saying, keeping the vocabulary FAIR? And above all we come to the service, like RVA or NERC, that is delivering up the vocabulary. You can see how you get different communities working around each of those dimensions. But to work through all of that is going to take years, and we don't really have the gift of time. Next. So, as per usual, what's a quick and dirty approach? What we thought of was that Tim Berners-Lee has his five-star scheme for open data on the web, which we're all very familiar with.
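The four quality dimensions above can be laid out as a simple per-vocabulary checklist. The dimension names follow the Ramapriyan et al. framing described in the talk; the exact question wording and the data structure are my own illustrative paraphrase, not the published instrument.

```python
# Sketch: the four information-quality dimensions from the talk
# (after Ramapriyan et al.), recast as a vocabulary checklist.
# Question wording here is an illustrative paraphrase only.

quality_dimensions = {
    "science (content)": [
        "Who defined the terms, and are they recognised as authoritative?",
        "Are the terms valid and endorsed?",
    ],
    "product": [
        "Does each term have a definition?",
        "Does the vocabulary structure follow standards?",
    ],
    "stewardship": [
        "Who is the gatekeeper keeping the vocabulary FAIR?",
        "Is it preserved so others can keep using it?",
    ],
    "service": [
        "Is it delivered through a service such as RVA or NERC?",
        "Under what conditions of use?",
    ],
}

for dimension, questions in quality_dimensions.items():
    print(dimension)
    for question in questions:
        print("  -", question)
```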
And it just gives you a quick guideline to the quality and usability of the asset you're trying to take up. Next. Already, if you go to data.gov, it has a five-star rating for the data sets on its site. So one of the things we toyed with was whether or not, on RVA, we could do a quick five-star ranking for our vocabularies. Next.

And you can see what we've taken here. One star is just a list somebody has thrown up on the web, what we call a brain fart. Then it becomes machine actionable, then concept-based RDF, and then we get up to concept-based RDF that is linked, endorsed and multilingual. So when somebody goes into RVA, they can see the five stars. And if you've got a ranking on each of the vocabularies that we have, like the 13 borehole ones, and your purpose is a sustainable resource, then you'll take the five-star one. Anyway, that's just a proposal; I don't know where it will go next.

But where are we going in the longer term? Again, this is a CODATA background. They are planning a mapping-the-landscape exercise for semantic resources and will seek the involvement of the science unions and other relevant societies for endorsement; it's critical to their decadal plan. It's also relevant to anyone who works on the Sustainable Development Goals. CODATA is also planning to lead the development of a CoreTrustSeal-like certification. It won't be the one you've got for repositories, but something similar, for vocabularies and for endorsement. So anyway, that was just an aside on what we're doing to try to get over to the users of the vocabularies on RVA the things they should think about when they choose one, without going to the drastic step of saying: this is the one ARDC feels you should choose. It should be up to you, the user, and your use case to make those decisions. Okay, so I think that's all.
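The proposed five-star ranking can be sketched as a cumulative checklist, modelled on Berners-Lee's five-star open data scheme. The per-star criteria below are paraphrased from the talk; treating them as strictly cumulative is my own assumption, since the proposal was only a sketch.

```python
# Sketch of the proposed five-star vocabulary rating, modelled on
# Tim Berners-Lee's five-star open data scheme. Criteria per star
# are paraphrased from the talk and assumed to be cumulative.

def star_rating(on_web: bool,
                machine_actionable: bool,
                concept_based_rdf: bool,
                linked: bool,
                endorsed_multilingual: bool) -> int:
    """Count how many cumulative criteria the vocabulary meets."""
    stars = 0
    for criterion_met in (on_web, machine_actionable, concept_based_rdf,
                          linked, endorsed_multilingual):
        if not criterion_met:
            break   # cumulative scheme: stop at the first gap
        stars += 1
    return stars

# A plain list published on the web: one star.
print(star_rating(True, False, False, False, False))   # 1
# Concept-based RDF that is linked, endorsed and multilingual: five.
print(star_rating(True, True, True, True, True))       # 5
```

In a service like RVA, a rating like this would sit next to each vocabulary record, so a user choosing among the 13 borehole vocabularies could filter on it rather than evaluating each one from scratch.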