Morning everyone, or good afternoon, depending on where you are. My name is Matthias Liffers, and I would like to start this webinar by acknowledging the traditional owners of the land on which we all are today. For me in Perth, that is the Whadjuk people of the Noongar Nation. I would like to pay my respects to the elders past and present, and to extend that respect to the traditional owners of the lands on which you are this morning, or this afternoon, sorry.
So this is the third Q&A webinar out of four for our FAIR Data 101 Express course, and today we'll be focusing specifically on interoperability. We have two excellent, very knowledgeable speakers: Rowan Brownlee and Catherine Brady. As in previous sessions, they'll each give a bit of a story about some of the work they've been doing, and then we'll get into the Q&A itself. As with previous sessions, in your GoToWebinar window, which is hopefully on the right of your screen, there is a little questions module where you can type your questions in at any time. When we get to the Q&A session, I will be pointing those questions at Catherine, Rowan and myself.
Okay, I don't think there's anything else to cover, so we'll get right into it. I'd like to introduce my first speaker, Rowan Brownlee. He not only works at the Australian Research Data Commons, but is also quite heavily involved in the Australian Vocabulary Special Interest Group. Over to you, Rowan.
Thanks, Matthias, and welcome everyone. I've got a few slides that I'll go through, and while I'm going through these I'll switch off my video, and then I'll come back in after the presentations.
So I understand that you've discussed the use of vocabularies to support data interoperability. But if, in the one subject domain, you have a choice between separate vocabularies, how do you decide which one to choose? Faced with semantically overlapping vocabularies, how do you make an informed choice? And if you choose a vocabulary, how do you know that it's fit for purpose, that it's backed by the research community you want to work with, and that it will be managed and maintained over time? Next slide, thanks Matthias.
I'll talk about this issue in relation to one example research domain, that of earth and environmental sciences. These are some organizations involved in earth and environmental sciences, a mix of Australian and overseas organizations. Each of these organizations publishes vocabularies in the ARDC vocabulary service. Next slide, thanks Matthias.
As more vocabularies are contributed to the service, users have reported uncertainty in making selections when faced with semantically similar or overlapping vocabularies. They express concern about making the wrong choice and want guidance to make informed choices. They want the vocabulary publishing portal to provide a community lens, a domain view of their community's vocabularies, explaining which vocabularies to use for particular purposes. In this example, over on the right-hand side, is an extract of results for searching on a concept titled "borehole". Four separate vocabularies are returned, each containing a concept for borehole. So how is the user to know which vocabulary to use for their particular purpose? Next slide, thanks Matthias.
Vocabulary users want guidance, but how does ARDC get the information needed to make recommendations about particular vocabularies? How does ARDC know which vocabulary to recommend for a particular purpose? One approach is to work with the research communities themselves.
In particular, working with those who have an interest or expertise in informatics. One such attempt involves working with the National Earth and Environmental Sciences Facilities Forum, otherwise known as NEESFF. NEESFF provides a means for earth and environmental science research facilities to meet periodically to discuss issues of common interest. NEESFF member organisations have links to related organisations within Australia and across the globe, and they have an interest in encouraging uptake of community standards. So there are some good starting points: an interest in informatics, an understanding of the importance of using shared standards, and relationships to like organisations here and overseas.
And so we're aiming to work with NEESFF to benefit from their domain expertise, to gather information, and to improve the documentation and presentation of information about vocabularies, to help users make informed choices and answer questions like: which vocabularies address a common need or purpose? If I choose this vocabulary, how can I tell if it's well governed, trustworthy, authoritative? So you might consider NEESFF an important element of research data infrastructure, one that offers a level of social interoperability (communication, common interest) in support of technical and semantic interoperability. If this approach is successful, ARDC will seek to explore its applicability to other domains. Thanks, Matthias, that's it.
Great, thank you very much for that, Rowan, and for that insight into how we get domain-specific expertise contributing to our vocabulary service. Next up, I would like to introduce Catherine Brady. Catherine also works for the ARDC, in our data and services division. Over to you, Catherine.
Thanks, Matthias. As Matthias said, I work in the data and services area. I'm the program manager for the Australian Data Partnerships program, which sits under what we're calling our national data initiative.
Now, I just wanted to talk you through a story from one of the projects that we ran last year. We, the ARDC, co-invested in what we called our data and services discovery projects in the last half of 2019. They were small, short exploratory projects conducted in a limited time frame. The one I'm going to talk about today, to give you a real-life example, was the Australian Brain Data Commons. So the ABDC, nice acronym, which was led by the Australian Brain Alliance, under the auspices of the Australian Academy of Science. So some big players there.
Their aim was to use a consultative approach to determine requirements for what they called a coordinated and internationally compatible national brain science data framework. They also wanted to educate their neuroscience community on how to reuse their data for maximum benefit, and to promote data sharing standards across that sector in Australia. So they assembled a working group of stakeholders that represented a range of disciplines in the neurosciences and across the private sector. So who thinks this is already sounding like a bit of an interoperability challenge, right?
So what problems did they have in this sector? This is actually quite a good example. Brain data commons sounds like quite a homogeneous sort of discipline, but it actually covers a broad spectrum of methods and data across a range of sub-disciplines in the neurosciences. When you're thinking about the neurosciences, it could be psychology data or data coming out of the cognitive neurosciences. There's animal behavior data. There's data that comes from MRIs and PET scans and molecular imaging. There's microscopy data, as well as molecular neuroscience data. There's electrophysiology and calcium imaging.
They also collect neurogenomics and clinical data, and computational neuroscience, AI and machine learning data. There's patient observational data and all sorts of other imaging data. So a discipline that seems narrow actually crosses quite a diverse range of data types. And what they had was no clear standards within or across those sub-disciplines. That's not surprising, really, because those disciplines, from psychology to calcium imaging in a cell, are quite different things. It can also depend on the type of equipment and software you use to acquire the data. Obviously, if you're getting something off a PET imaging machine, the machine you're getting it off is going to determine the data types that you're collecting.
What they found as well was that there were real and perceived barriers to data sharing in their community. There was a lack of appropriate sharing solutions. There was a lack of appropriate technical and other resources. There were legal and ethical concerns around the data, and obviously a lot of it is sensitive clinical data. And there were concerns about data ownership. They found that software platforms did exist, but they weren't always user-friendly enough, and they weren't always adapted to cope with the different types of data in neuroscience.
So what they did was convene some workshops around the country to discuss and build consensus around identifying the infrastructure, the technical and human resources, and the culture required to develop a neuroscience data sharing standard that met the FAIR principles. Now, it's really interesting, I think, and very insightful of them, that they included culture here, because when you think about interoperability, you're probably thinking about technical challenges and technical solutions to the problem.
And perhaps you're thinking about technical infrastructure, metadata and data standards, and exchange standards for that data. But these guys identified culture as one of the important things they had to work on in their community if they were ever going to achieve FAIR. They started this process by bringing their community together, talking, and asking them what they would need to do, what they would need to make happen, to make their data more FAIR. And not just that: what were the perceived barriers? Where was the incentive? What was the return on investment? What were the social or discipline norms in the sector? And how can we make it easier to participate?
So in this case, interoperability started out with getting the right people around the table, just a conversation, acknowledging those shared problems and getting agreement on a way forward. A shared vision of what interoperability might look like for the Australian Brain Data Commons, if you like. So acknowledging what is common and what's different, what standards and infrastructure exist already, what can be built on and what needs to be started from scratch, and working out what would be fit for their purpose if the goal was a national brain science data framework.
So I don't really have an ending for this story. The project ended at the point where they did achieve some consensus around a way forward for the Australian Brain Data Commons: that it was a feasible idea and that more exploratory work had to be done. We'll be following up with them again in time to see how they've gone with that.
But I think they made a really good start at the ground floor on getting the culture right and getting the people on board so that interoperability might be achieved and might flow from the work they've put in place. They started, I think, from the right place, putting a framework in place and putting that groundwork down, and not just approaching it from a technical point of view that goes over the top of everything else.
So on the basis of the success of the discovery projects, of which the Australian Brain Data Commons was one, we've launched a larger program in the same vein. It's called the National Data Assets Initiative. Underpinning this initiative is the principle that collections of research data can be national infrastructure when they support leading-edge research at national scale. So that's what that initiative is about. That's what a national data asset is from where we're looking. Building on the back of those discovery projects, we've made sure that the FAIR principles, including interoperability, are an important component of the projects we intend to co-invest in under the auspices of this program. All of our projects are being asked to sign up to the FAIR data principles and to explore what it would take to make these collections more FAIR. They'll be learning from the early experiences of the projects and groups that did that work last year in the discovery projects: how they went about getting their community to look at the various data, how they might achieve FAIR, and how they might work towards interoperability. So I think we had some really good lessons there, and some things that people can pick up going forward in our National Data Assets Initiative and the Australian Data Partnerships Program, which I'm program managing. So that's it from me as a little story.
Great, thank you very much for that, Catherine.
And I'm just going to stop sharing my screen for this Q&A session. Now, we haven't got any questions just yet. So please, if you have a question for Catherine or Rowan, or you have some questions about the webinars for me, pop them into the question box and we will get those answered.
Now, Catherine and Rowan, what I noticed there is that both of you talked about the need for communities to work together, or for consultation to help underpin interoperability work. Perhaps a bit of a left-field question, but have either of you observed a situation where an attempt at interoperability has failed because there was no community consultation or no real community involvement?
It's a good question. Go, Rowan.
Just thinking of how that would... How would it happen? Who would be the groups that would be seeking some level of interoperability independent of the community or the people in their particular group? I don't think I've seen that, if that's what you're asking about, Matthias.
I suppose it could be seen more as somebody thinking that something will help interoperability, so they propose a standard or develop something, but because there was no community involvement, it was never really adopted. Nobody really went along with it.
I'm sure there probably are plenty of examples of exactly that, but nothing concrete comes to mind immediately, Matthias. I think from the stories we've just related, that would be a likely outcome, and I'm sure that has certainly happened. Perhaps we don't hear about them because no one really likes to talk about the things that go wrong and fail. So those stories probably aren't out there so much, but...
Yeah, I think that would be... Sometimes they can be really useful stories to hear.
I can remember, one or two years back at the eResearch Australasia Conference, there was some sort of panel or forum where a bunch of people got up and described projects or situations, things that just hadn't worked out at all, and that was very useful and informative. In areas like health and medicine, hospitals have morbidity conferences or analyses. Something terrible happens, and everyone gets around and discusses it in order to learn what happened and why it happened. Maybe more of that approach could be brought into our area.
Yeah, I mean, it's really easy to imagine interoperability not working out. There are the times when I've gone to a meeting and a group of us have prepared something that we think, oh, this is going to work, this is going to be of interest, this is going to provide a way forward. It reflects all of our assumptions, and then we provide that to the other members of the meeting and we find that, no, it's something else entirely that's needed.
Yes. OK, so in the meantime, we have received some questions from our audience. The first question, for both of you, is whether either of you has advice on best practices for capturing decisions, agreement or consensus in this way. What's the best way to document and share standards?
I haven't seen an example of any one that I would promote or that you would necessarily follow. Again, it would probably depend on the context and the discipline. I know in medical and health there are some. I might be wrong, but I think there's something called the Delphi principles or guide, and I know some of our discovery projects were using that to structure how they went about seeking some of that consensus. That might not cover all of the way you document, but they looked to their sector and their discipline, or internationally, for a kind of robust process, I guess, for bringing stakeholders together.
And perhaps part of that would be documentation. But I'm not aware of particular others or something that's generic.
Yes, and certainly I feel like this can be a very community-specific thing as well. One of my favorite things to say to people asking any kind of data question is just check what the standard or the norm is in your discipline, which isn't very helpful. And I suppose that's where people then have to go, well, how does my community determine this? But I have attended workshops and seminars where people have been brought together to try to agree on standards or on a way forward.
OK, here's another question. Possibly this is a vocabulary question. Are there any examples of parallel coexisting vocabularies in a particular field that have a crosswalk built between them?
Sometimes that does happen. I'm aware of examples in agriculture. I'm pretty sure that AGROVOC has mappings from concepts within that vocabulary to like or similar concepts in other vocabularies. Where data is associated with different vocabularies, this can be a way of taking a step toward bringing that data together, by expressing relationships at the level of the concept. These might be direct relationships from a concept in one vocabulary to a like concept in another vocabulary. Another approach is to express relationships from a number of vocabularies up to what might be described as a backbone, pivot or upper-level vocabulary. In that case, there can be less maintenance over time because, rather than creating a mesh of relationships directly from one concept to another, you have connections made up to common upper-level concepts. Essentially you're hooking up trees from a vocabulary into that upper-level vocabulary. So, yes, that does happen.
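As a rough illustration of the pivot approach just described, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the concept identifiers (vocabA:sandstone, pivot:Sandstone and so on) and the function names are hypothetical and are not taken from AGROVOC or any other real vocabulary. It shows how concepts from several vocabularies can be found to be equivalent through a shared pivot concept, and compares how many mappings a direct mesh needs with how many the pivot needs.

```python
# Hypothetical crosswalk: each source concept maps up to one pivot concept.
# All identifiers below are invented for illustration only.
PIVOT_MAPPINGS = {
    "vocabA:sandstone": "pivot:Sandstone",
    "vocabB:sand-stone": "pivot:Sandstone",
    "vocabC:arenite": "pivot:Sandstone",
    "vocabA:core": "pivot:CoreSample",
    "vocabB:drill-core": "pivot:CoreSample",
}


def equivalents(concept, mappings):
    """Return the other concepts that map to the same pivot concept."""
    pivot = mappings.get(concept)
    if pivot is None:
        return []  # concept has not been mapped to the pivot vocabulary
    return sorted(c for c, p in mappings.items() if p == pivot and c != concept)


def mesh_link_count(vocabulary_count):
    """Direct mappings needed to link one concept across every pair of vocabularies."""
    return vocabulary_count * (vocabulary_count - 1) // 2


def pivot_link_count(vocabulary_count):
    """Mappings needed when each vocabulary links once to the pivot concept."""
    return vocabulary_count


if __name__ == "__main__":
    print(equivalents("vocabA:sandstone", PIVOT_MAPPINGS))
    for n in (3, 6, 10):
        print(n, "vocabularies:", mesh_link_count(n), "mesh links vs",
              pivot_link_count(n), "pivot links")
```

With three vocabularies the two approaches need the same number of links per concept, but the mesh grows with the number of vocabulary pairs while the pivot grows only with the number of vocabularies, which is one way to read the point about less maintenance over time.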
And there are some approaches to expressing relationships between concepts in different vocabularies.
Great. Thanks, Rowan. OK, now here is actually a question for me, because it is about the prerecorded videos that our attendees have watched. In a video presented by Liz, she made a reference to Principle I3, which is that metadata should have qualified references to other metadata. And the question is, what exactly is a qualified reference? Principle I3, unfortunately without being explicit about it, is implicitly talking about linked data: the idea that a piece of data or metadata can be linked to another piece of data or metadata with an explanation of the nature of that link, and that explanation is what makes it a qualified reference. So, for example, you could have metadata about an author or a data creator, and there is a link from that person's metadata record to the metadata record of one of their data sets, saying that this person is the creator of that data set. There might be a reverse link as well, from the data set to the creator, saying this data set was created by that person. So the idea is that these qualified references create semantic links that let you understand not only that relationships exist between objects, but also the nature of those relationships.
OK, here is another vocabulary-related question. How does the ARDC fit in with other organizations that might manage data vocabularies or dictionaries? For example, the Australian Institute of Health and Welfare has a range of data dictionaries related to health, community services and disease.
There are quite a number of domain-based registries of vocabularies. One of the differences with the ARDC service is that it is cross-domain. We're interested in publishing and enabling access to authoritative vocabularies from any number of domains.
And we would also be interested in being either a primary or a secondary point of publication. So, for example, some of the vocabularies that ARDC publishes in the Australian service are also available elsewhere, in North America or in Europe. And certainly some of the resources available from AIHW, I'm thinking in particular of their code lists, could be of a type that might be represented in a format that could be made use of within the ARDC service. And as we see more registries of vocabularies becoming available internationally, we see increasing discussion in the vocabulary community about vocabulary registry interoperability, which would enable different registries to draw in and provide access to vocabularies that are accessible from other registries. So why would you want to do this? One reason may be to have multiple points of access, in case one access point falls over. Another can be that one registry provides a type of service or access that is not available from another registry. Or it could simply be a case of having a registry geographically closer: you might get a faster response than from one in another country. So those are other areas of interoperability, I suppose, and ARDC is really interested in working with other registries in that way.
Great, thanks, Rowan. Now, while we were answering those questions, some people have been contributing to the discussion with a few clarifying comments. One comment here is around consensus building: in health and medical areas, consensus statements are often published by the relevant lead professional organisations, such as the Royal College of whichever health and medical discipline, or sub-discipline, rather.
And somebody has been brave enough to mention an example of a schema and system that was not terribly interoperable: an attempt at Western Sydney University's Adela project. It was originally supposed to be interoperable with the European cell project, but unfortunately it was not quite set up with the right fields and the right systems to actually be able to contribute to that project.
And then an example of a meta vocabulary that in the dark combines all the other vocabularies. Sorry, nerdy reference there. So the Unified Medical Language System, from the United States National Institutes of Health National Library of Medicine, is a meta vocabulary that helps combine or link together other vocabularies.
And then a question about the FAIRness of vocabularies. This particular person has been working on a project that FAIRifies vocabularies, and apparently in that case they're called semantic artefacts. Does the panel think it is necessary that not just the data, but also the vocabularies used to describe the data or used in the data, are FAIR? Well, Rowan's nodding and look, I...
Yes, yes, it would be helpful for these resources. I'm aware of some of the efforts around this, looking at what sort of properties we should reasonably expect to be associated with any vocabulary to make it more explanatory, and to give provenance and information about authority. So I think if you're working on one of those initiatives, that's a great endeavour to be involved in.
Yes, absolutely. And I think to myself, if a vocabulary is not FAIR, so an unfair vocabulary, I ask myself how useful it actually is, especially in the context of interoperability, where, when anybody is describing their data or using vocabularies in their data, we would like that vocabulary to be one that other people use as well.
And so there are certainly some FAIR elements involved in that.
OK, now here's an interesting question, possibly a little philosophical. Could a common key language assist in targeting links between disciplines? Now, that's an interesting one. I've been debating the meaning of certain words with some colleagues recently, and I think that, certainly to me, highlights the issues of any kind of human language, especially where words can have ambiguous meanings, mean different things to different people, and have different meanings in different disciplines. From my point of view, I think one of the most misunderstood words is the word theory. When you use it in science, a theory is considered essentially to be fact. Whereas in common vernacular usage, when somebody says, I have a theory, what they're referring to is the scientific notion of a hypothesis, which is a case of, I think this is the case, but I need to collect evidence to show that it is. So you have lots of people saying, oh, the theory of evolution is just a theory. Well, scientifically speaking, it's considered essentially fact. But even between disciplines there are different meanings for the same words. If you ask a librarian what a database is, you're likely to get a completely different answer than if you ask a computer scientist. There is some commonality between the two, but I think there's enough difference between the meanings to possibly cause confusion if you throw the word database into a document without a little clarification as to exactly what kind of database you're talking about. Although, Rowan, you're the vocabularies guy. You probably have a few more thoughts about this than I do.
I was going to ask you, because I'm not familiar with the terminology of a common key language. What's that about?
Well, I'm sorry, I understood there to be a comma in there. So could a common, key language assist in targeting links, as opposed to a common key language? Sorry, if the person who asked that question could let us know if a common key language is a thing, that would be great.
And here is our last question that's come up so far. What is the difference between a vocabulary and a terminology? It sounds like one for Catherine.
Just a vocabulary. There are probably some definitions around it. I suppose the terminology about what these things are is itself overloaded, and one person's terminology can be another person's vocabulary. This came up recently, again, in discussion around health and medical services. There's a national vocabulary service, the National Clinical Terminology Service, and they provide access to a major health and medicine vocabulary called SNOMED. They refer to SNOMED as a terminology, and it's provided through a terminology service. And yet I would understand it to be a vocabulary, perhaps a more complex vocabulary than we might typically see represented through SKOS, the Simple Knowledge Organization System. SNOMED is quite complex. It has a large number of specifically typed relationships between things that are more specific than just generic concepts. But yeah, I've heard these terms used interchangeably: taxonomy, terminology, vocabulary. I don't think I'm able to offer anything more specific. Do you have some thoughts on this, Catherine?
No, I like your answer, Rowan. Yes, it's really the semantics around it, and one person's vocabulary is another person's terminology, I do like that. I mean, really, whatever it's called, you're looking at how authoritative it is, how well structured it is, and all those other things that you might look for in a vocabulary.
And I suppose, whatever it's known by, the distinction in meaning between those things may have been lost over time, and in some senses they have come to mean the same thing rather than something different. It sounds like an examination of said vocab or terminology would be the only thing that would reveal whether it suited your purpose anyway.
And I just went and had a check of the dictionary. Look, a standard dictionary doesn't necessarily reveal very discipline-specific uses of terms. But interestingly, the first definition I found for terminology was the vocabulary of technical terms used in a particular field, subject, science or art. So I think that does lend itself to Rowan's interpretation of terminology and vocabulary being used interchangeably. And again, I think that's a classic example of where confusion can arise when people use different words to mean the same thing, or use the same word to mean different things. So perhaps what we need is a meta vocabulary that defines the difference between a vocabulary and a terminology.
Those definitions become recursive, though. I don't know about yours, but I think the one you read out uses both vocabulary and terminology in order to define it.
Yes. And here, somebody else has introduced another term: an overarching term for a vocabulary used by a branch of knowledge is a lexicon. And somebody has now commented that their head is exploding, so perhaps we need to pull back from this part of the discussion. That being said, we are coming up to time, and we have got some discussion happening here. In fact, here's a reference to NKOS, Networked Knowledge Organization Systems. I believe that's being raised as an example of something that helps define knowledge organization systems, which, yes, we could talk about all day, but we won't.
So in our final moments, unless any more questions come up, I'll start the wrap-up of this session. In fact, I think I need to go back to sharing my screen. If anyone doing the course has any technical issues with accessing the quiz or doing the activities, please either contact Nicola Burton directly or bring them up in the Slack. Maybe you'd like to discuss one of the activities: we do have a discussion channel in the Slack workspace. We're also welcoming any kind of silly discussion, if you have a GIF or a meme or any terrible FAIR puns. When this webinar finishes, you will be asked some questions for feedback, as with previous webinars, and we do use this feedback to improve future webinars. So thank you very much for all the feedback you might have given on a previous webinar.
Now, we have not had any more questions, but we are getting all the thank-yous rolling in. Actually, there is a comment from somebody saying we do not need more terms. So I'll leave that with you. Otherwise, I'd like to thank you, Catherine and Rowan, very much for your time, for sharing your stories with us and for being on a somewhat philosophical panel. And to everyone else, thank you for coming and have a fantastic day.
Thanks for this. Thanks, Catherine. Cheerio, everyone. Thank you all. Bye.