Okay, I think it's about time to get started. Welcome, everyone. I'm Cliff Lynch, the director of CNI, and I'm very pleased to welcome you to the first project briefing session of the third week of our 2020 virtual conference. This presentation is on ontology for scholarship: revising the VIVO ontology. We have two speakers with us today. They will start presenting in just a moment, and Violetta will lead off. We will take questions at the end. Diane Goldenberg-Hart from CNI will be moderating the questions, and we invite you to use the Q&A tool at the bottom of your screen to pose questions whenever they occur to you; we'll take all of them up at the end. So with that, let me just say welcome one more time and turn it over to Violetta.
Thank you, Cliff. Thank you, CNI, for having us talk to you today about the VIVO project and the VIVO ontology. Both Mike and I are presenting on behalf of the VIVO ontology interest group. We hope you're all safe and staying home. So with that, I would like to start off by saying that the VIVO project provides an open source approach to collecting, studying, and showcasing the scholarly work of an institution and sharing data about that work using a common ontology. The VIVO ontology has experienced demands to represent new forms of scholarship while adopting best practices in ontological design and development. Domain representations have emerged, and much good work can now be incorporated in a revised and modularized ontology, which is the goal of VIVO ontology 2. A first domain has been isolated and a work product is available: the language ontology, which is based on ISO 639-3 and can be used to represent the language capabilities of people and the languages of their works. This is very important because VIVO is international.
There are many requests to be able to represent the language of the scholarship and the language capabilities of the scholars. So in this presentation, we will describe the pressures on the ontology to modernize, the domains of representation that can now be separated, new forms of scholarship to be represented, best practices for ontological development being adopted by the project, and the current state of the work to revise the ontology. Next slide, please. Some trouble, just a sec. It's not advancing. This worked, just a little bit. So why do we want to revise the VIVO ontology? There are many reasons, among them improving the representation of scholarship: its breadth, depth, and consistency. We would also like to address technical debt, licensing issues, methodology, and the use of other ontologies. What exactly is happening here? The new ontology and its related ontology modules are based on community standards. There are multiple tools, documents, and training available to work with the new ontologies. And how? Because this is an open source project, this is an open, multi-year effort to develop, test, and implement the new ontologies in the VIVO software, upgrade the existing data, document, and train. Next slide. VIVO development has a goal of decoupling various components of the VIVO environment. The current software is heavily dependent on the ontology. Future interface software will operate from a decoupled index easily populated from any ontology. The VIVO ontology interest group has been meeting for more than a year to identify issues related to the ontology. Implementing incremental changes to the ontology is difficult in the current VIVO environment. It was last attempted in 2013, with significant difficulties. A full-scale version upgrade to the ontology can be planned years in advance to include all the changes and the change management required.
Ontological practice has matured significantly in the past 12 years since the VIVO ontology was created. Best practices can be implemented to significantly improve the ontology and its related models. So, on slide four: BFO, the Basic Formal Ontology, is for data sharing. BFO came out of biomedicine because that community started doing this kind of work first. It is heavily used in the Open Biomedical Ontologies. BFO is an upper-level ontology. It can be used in any domain. I just want to mention that it is not restricted to the biomedical sciences. Any ontology can be expanded. If you use your vocabulary and I use my vocabulary, for example, we need to translate to be able to share. The OBO standards are a good thing. The large community of OBO ontology maintainers has adopted best practices for ontology development and maintenance. We are planning on following those standards. Because, to be fair, VIVO did not do a perfect job with subsumption hierarchies. We have a lot of things that are misclassified. We would like to revise this ontology and put everything in place the way it should be, so we will be able to easily share the data with anybody else. VIVO does not fully comply with the OBO Foundry because of the numbering, the class names, the naming, and the way we version the ontology. For example, OBO requires class labels to be unique, and we don't do that currently. We are working on all of this, and the work will be shared with all of you when we are ready. On the next slide, slide five: of course, much good work has been done across many projects, for example BIBO, ERO, the research ontology. VIVO needs a consistent, well-maintained set of ontologies to represent scholarship. Ontologies based on standards allow for sharing of data regarding scholarship. VIVO intends to capitalize on this good work to build consistent, well-maintained ontologies to represent scholarship.
We found out, unfortunately, that the CC BY licenses are not recommended for ontologies. We would prefer to use the CC0 license. As Mike says, attribution is polite but not required under CC0, and that is what we would like to be able to do, so that we can build on other ontologies. So VIVO ontology 2 will contain assertions regarding the relationships between terms in the VIVO ontology and those in other ontologies. And now I will hand it over to Mike.
Okay, so when you're building an ontology to represent scholarship, you want to reuse as much work as you can. In version one of the VIVO ontology, we used quite a number of different ontologies, but that created some ontological inconsistency because the ontologies were not using the same upper-level ontology. It also created some potential licensing difficulties, as Violetta described. And then over the years, we found out that some of these ontologies were not actively maintained. And so as we came to own our collection of ontological work, we came to the kind of conclusions that we just heard about with regard to selecting ontologies and then assembling them into something that might be useful for representing scholarship. Our current thinking is that we'll continue to use the ontologies that you see there, which meet the criteria of being well maintained, having BFO as an upper-level ontology or being readily mapped to BFO, and being open licensed in an appropriate way. We expect to be referencing a number of other ontologies, not directly using their terms. We may also reference terms that are in Wikidata, for example, which isn't actually an ontology but is a set of RDF terms that are useful for representing things. And so our ontology is likely to reference those things so that people using the VIVO ontology can find additional material about the terms that we're defining and using.
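
As a rough illustration of a term that references external terms without reusing them directly, here is a minimal Python sketch. It is not the project's actual data model: the `Term` class, the `see_also` field, the `external_references` helper, and every `example.org` IRI are hypothetical placeholders introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Term:
    """A locally defined term plus pointers to related external terms."""
    iri: str                     # hypothetical local IRI
    label: str
    see_also: List[str] = field(default_factory=list)  # external IRIs, referenced only


def external_references(term: Term, prefix: str) -> List[str]:
    """Return the external IRIs of a term that fall under a given namespace prefix."""
    return [ref for ref in term.see_also if ref.startswith(prefix)]


# All IRIs below are placeholders, not real VIVO or Wikidata identifiers.
organization = Term(
    iri="https://example.org/vivo/Organization",
    label="organization",
    see_also=[
        "https://example.org/wikidata/organization",
        "https://example.org/other-ontology/Organization",
    ],
)

print(external_references(organization, "https://example.org/wikidata/"))
```

The point of the sketch is only that the local term stays defined locally, while the references give a reader somewhere to look for additional material.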
And then we intend to create modules that are complete ontological entities, which may not reference VIVO at all but can be used by VIVO to represent scholarship. We've always supported the concept of a local module, which allows extension of the entities. But as we move forward, we will create ontological artifacts that can be assembled in a scalable way. And Violetta described... I just got one of those little notes that my connection is unstable, so I'll stop talking whenever I see one of those things. Violetta described the language ontology, of which we will have a draft available to the public shortly. The language ontology, as Violetta described, is a complete standalone ontology. It's based on ISO 639-3 and the Common European Framework for language capabilities, and it can be used to represent the language capabilities of people and organizations, translations of works, and the language of works. And you see a GitHub repository there. The VIVO domain is the domain of scholarship. And this scholarly domain can be quite broad, as we all know. Scholars are involved in a wide range of different activities. Here we show examples, in the class hierarchy, from the SRR point of view. For those new to this, an example: a cat is a mammal. Right. So the arrows that we're seeing here are all is-a type relationships. In the lower left, we see an ORCID iD. An ORCID iD is a centrally registered identifier, and a centrally registered identifier is an information content entity. This figure shows our proposed hierarchy. The yellow items are from BFO, the upper-level ontology. The white items are from an existing information artifact ontology that is consistent with BFO. And the blue items are introduced by VIVO to create a comprehensive approach to representing scholarship. As we move forward: the ontology has been around for about 13 years.
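
The is-a chain described for the figure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary holds just the examples named in the talk, and the `ancestors` and `subsumed_by` helpers are hypothetical names, not part of any VIVO or BFO tooling.

```python
# Each key is a class; each value is its direct parent in the is-a hierarchy.
# Only the examples mentioned in the talk are included.
IS_A = {
    "ORCID iD": "centrally registered identifier",
    "centrally registered identifier": "information content entity",
    "cat": "mammal",
}


def ancestors(term, hierarchy=IS_A):
    """Return every class that subsumes the term, nearest parent first."""
    chain = []
    seen = {term}
    while term in hierarchy:
        term = hierarchy[term]
        if term in seen:  # stop if the data accidentally contains a cycle
            break
        seen.add(term)
        chain.append(term)
    return chain


def subsumed_by(term, candidate, hierarchy=IS_A):
    """True if the candidate appears anywhere above the term in the hierarchy."""
    return candidate in ancestors(term, hierarchy)


print(ancestors("ORCID iD"))
print(subsumed_by("ORCID iD", "information content entity"))
```

Because is-a is transitive, anything said about an information content entity also holds for an ORCID iD, which is why correct placement in the hierarchy matters for sharing data.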
But there are areas in the representation that we currently have that need to be extended and improved. So, for example, attribution has been an emerging topic over the last decade: trying to improve our ability to attribute the work of people who participated in scholarly activity beyond simple authorship. We might want to know who created the figures, who did the data analysis, who wrangled the data, or who handled other elements of the construction of a scholarly work. And that would be true across all domains of scholarship, whether it's a scientific domain or a humanities domain: we'd want to be able to get inside the work and understand the nature of the participation of the people involved. And all of the other areas that we see up there have similar depth issues, where we can go deeper into representing the nature of the scholarship. We've done a fair amount of work to date, trying to lay groundwork for the work to come in creating ontological artifacts. And so we see thoughts about different parts of the scholarly domain and how we may choose to represent them in the work that goes forward. All of these are links. You'll have access to the presentation, and in the presentation these are actual links to Google Docs that are publicly available. So you'll be able to dive down into any of the areas if you're interested. And then for additional information about the work that we're doing, you can check in with our ontology interest group. We meet biweekly and discuss all manner of representing scholarship and all manner of owning and operating an ontology in a public way. And then there's a link there to the actual text of the current VIVO ontology, version one. And as we move out with additional artifacts, there will be artifacts for those as well. So with that, I'd like to thank you for listening. And perhaps we have time for some questions.
Thank you very much, Mike and Violetta.
It was really interesting. I have to say it's been fascinating to watch the development of the VIVO project over the last decade, and really fascinating to hear these developments in the ontology. So we appreciate you coming to CNI to share that with us. At this point, I'd like to invite anyone who has questions to please type them in the Q&A box, which you should see at the bottom of your Zoom screen. You're also welcome to share them via chat if you prefer. While we are waiting for folks to think about the presentation and some questions that they might have, I just want to take the opportunity to remind you that this webinar is part of CNI's ongoing Spring virtual membership meeting, which will continue through the end of May with lots of offerings yet to come. I'm pasting into the chat box the direct link to the complete schedule of upcoming project briefings. This afternoon, we have two more project briefings, including the first presentation from our call for proposals involving topics relating directly to the COVID-19 crisis. That one will start at 2:30 this afternoon, Eastern time, followed by a project briefing on SimplyE, about an academic e-book experience. So I hope you will join us for those project briefings and many more throughout the next several weeks. And with that, it looks like we do have a question for our presenters. Our question is: could you please explain a bit more about what would be included in the "teaching extension" that you mentioned as a possible new direction?
Sure. So in the original VIVO conception and modeling, teaching was handled in a very simple way. We basically wanted to be able to answer the question, who taught what? And so we could connect a person with an instance of a teaching thing. But we weren't clear about what that teaching thing was.
And we weren't clear about the difference between the classroom experience, from an ontological point of view, as an occurrence, something that occurred in time, akin to an event, a performance, a gathering, and the concept or idea or term of a course, the way the registrar might conceive of a course as being an entry in a catalog of courses. That is an information content entity, clearly; it appears in a collection of courses and then takes part in things like requirements for degrees. So you must take this course in order to complete that degree. That concept of a course is very different from the concept of a course as a gathering, like this one that we're having today. And so we discussed that contention or difference between the course as an information content entity and the course as a gathering. It turned out that the University of Florida had clear language about this and had always considered these things to be completely and totally separate. "Course" was reserved for the concept of the information content entity, the thing that would appear in a degree requirement. That is a course. The thing that you teach is not called a course; it's called a course section. And so Economics 101 might have many course sections, one taught by me and others taught by other people. The catalog says Economics 101, and then the student finds a course section that they want to participate in and take. And so that language about teaching and taking courses, that's about course sections. That was very clear where I came from, but was not very clear to others. And so when we saw the VIVO ontology originally, we said, "Course, what's that?" And they said, "You know, the thing you teach." And we said, "No, we don't know what that is." So we started this conversation about separating those two ideas almost 10 years ago.
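
The course versus course section distinction can be sketched as two separate types. This is a minimal illustration, not the University of Florida extension or the planned VIVO model: the class names, fields, the `who_taught` helper, and the instructor and term values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Course:
    """Catalog entry: an information content entity that degree requirements point to."""
    code: str
    title: str


@dataclass
class CourseSection:
    """A taught offering of a course: an occurrence in time, like an event or gathering."""
    course: Course
    instructor: str
    term: str


@dataclass
class DegreeRequirement:
    """Requirements refer to catalog courses, never to individual sections."""
    degree: str
    required_courses: List[Course] = field(default_factory=list)


def who_taught(course: Course, sections: List[CourseSection]) -> List[str]:
    """Answer 'who taught what' by looking at sections, not at the catalog entry."""
    return sorted({s.instructor for s in sections if s.course == course})


# Placeholder data: only "Economics 101" comes from the talk.
econ = Course(code="ECON 101", title="Economics 101")
sections = [
    CourseSection(econ, instructor="Instructor A", term="Fall"),
    CourseSection(econ, instructor="Instructor B", term="Fall"),
]
requirement = DegreeRequirement("B.A. Economics", [econ])
print(who_taught(econ, sections))
```

Keeping the two types apart means a degree requirement and a teaching record can never be confused, which is the separation described above.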
And the University of Florida immediately created a local extension and used "course" to mean the thing that participates in the degree requirement, and treated the thing that VIVO sent us as a course section. So we separated them immediately. So yes, what we're planning to do is eventually separate them and then be able to say things like: this course section consisted of course meetings, and the course meeting on September 13 was hosted by so-and-so. So we could get all the way down to the detail that we had a special guest star appear in our course at that moment. And when you're representing the scholarship of people who are doing guest lectures in courses, you need that kind of representation. And we didn't have it.
So interesting. That's great. Thank you. Thanks for that explanation. And thank you for that question, Nancy. We do have a little more time if there are other questions. Similarly, if you would like to make a comment or engage directly with Mike or Violetta, this Zoom environment does permit me to move you into speaking mode. So if you want to raise your hand, I can recognize you and allow you to engage directly with our speakers. I don't see any more questions coming in via the Q&A. Except now I do see another question coming in through the Q&A, so let me read you that question: earlier you mentioned an ontology to describe institutions, and I'm wondering if you're leveraging a standard for institutions.
It's an excellent question. Oh, yes, we are. Yes. So this is difficult. There are two parts here. One is the ontological work to represent an organization and what we want to say about an organization. And then there's the data about organizations that we might use to create instances of organizations in an ontology. On both sides, we're interested in standards. And on the data side, there appear to be at least three.
And so we have been in conversations with the ROR people at CNI, actually, at previous CNI meetings. And well before that, we were working with the GRID people at Digital Science before there was GRID. So we were talking to them about how they would create GRID and how the representation might work. We mapped GRID to the existing VIVO ontology, created a full RDF representation of GRID for VIVO, and put that in OpenVIVO, which is available on the web. So openvivo.org contains a complete RDF representation of the GRID data. And we have the tools to update that as necessary. We'll do the same thing for the ROR data. We've been in conversations about the ROR data; I think our last ontology meeting may have been about it. And so we're definitely interested in ROR and in establishing a standard for identifying organizations and for representing information about them ontologically. One concern we have about the ROR data, since I'm here, is that ROR is very focused on scholarship, as are we. But it turns out that in certain elements of scholarship, there is strong interest in large collections of organizations that are not considered scholarly. An example might be the laboratory people I work with. I'm a funded NIH investigator. I work with metabolomics people. They buy equipment from companies. It's important that we are able to represent what company made the equipment that is in the laboratory. And so we have a chain from the investigator to their equipment to the vendor. The vendor is not scholarly, but the vendor needs to be known, and needs to be known in the scholarly realm. And so we have the need to represent a large collection of organizations, not just scholarly organizations. To that end, we've also been looking at Wikidata. Wikidata has hundreds of thousands of organizations and has no particular interest in scholarship. They represent everything. So that might be useful to us.
The ontological representation will be the same regardless. The basic ontological elements of an organization are: what kind of organization it is, contact information, location information, and identifier information. That ontological model we would expect to hold up regardless of whether we're representing GRID data, ROR data, or Wikidata.
Well, that was really interesting. And Robin says, thank you, Mike, and good to hear from you; interesting issue on corporations and vendors. And if anyone from ROR is here, perhaps they would like to weigh in. Very interesting observation. So we very much appreciate that explanation. And we do still have a little more time. If there are any other questions for Mike and Violetta... here's another question, similar to the earlier one, regarding a standard for subject domains and subdomains. What do you say about that?
Well, for subject domains... I think you mean like topics. So you want to say the person is interested in physics or this book is about aeronautical engineering. Is that what you mean, Nancy?
Thanks. No, she means: from which subject discipline does a teacher or scholar come?
Oh, okay. Well, actually, that's interesting, because those are vaguely related. Of course, we always represent the person as being a member of a department. That's an explicit thing that has to do with their academic home. And then VIVO provides the opportunity to represent the research areas that a faculty member, or anybody, is interested in. Those research areas usually come from a controlled vocabulary. And I think Violetta is the expert on this. So do you want to talk about the... we use a few vocabularies?
Yes, we use a few vocabularies in VIVO. And by the way, depending on the institution and the needs of that institution, what the focus is, what are the...
Some liberal arts colleges may not want to use the MeSH vocabulary, for example. You can upload your own vocabulary in VIVO and use that.
Yes. And FAST: we've used the FAST vocabulary, but it did not have a term for biostatistics, which is my academic discipline. It had 500,000 terms, but it didn't have that one. So local extension will always be necessary, apparently. I think it's time. Is that right?
I think we lost Violetta. She said she had to drop off at two for another meeting.
Oh, I see. Okay. And I think we're coming up to the time for your second. Okay. I don't know how you do this.
Well, we do have a little more time if there are others that have any questions. Nancy wanted to thank you. And then she comments, there's always a problem with whatever you use.
That's correct. That's why we have that local trap door for all of our...
Well, with that, I will go ahead and close out the formal public portion of this webinar. And thank you so much, Mike. And thank you to Violetta, although I'm sorry I wasn't able to thank her before she had to leave. But again, we really appreciate you coming to CNI to talk to us a little bit more about VIVO, and we wish you luck with...