Welcome, everybody. Sorry we started a moment late there; we'll give everybody a minute to come on and then get started. Okay, it's time to get started. Welcome, everybody. I see we have a good house here, and I welcome all of you to this session. You are participating in the CNI Spring 2020 Virtual Meeting, and this is the project briefing on the work to enhance the jointly ESIP-, DataONE-, and USGS-founded Data Management Training Clearinghouse that ESIP hosts. We will have a presentation today from Carl Benedict of the University of New Mexico and Nancy Hoebelheinrich of Knowledge Motifs; they are the co-PIs on the IMLS grant that we're going to hear about today. I'm Cliff Lynch, and I'm the director of CNI. Once again, I extend my welcome. Carl and Nancy will take questions at the end of the session, and while the chat is open, we'll take questions primarily through the Q&A tool; the button for that is at the bottom of your screen. You can queue up questions at any point, and when we get to the question-and-answer period they will look at the queued questions and try to answer them. Diane Goldenberg-Hart of CNI will be moderating the Q&A, and with that, let me turn it over to Carl and disappear. Welcome, Carl. Thank you, and thank you for handling the logistics of moving the entire meeting online, which is a non-trivial undertaking under the best of circumstances. As Cliff said, I'm Carl Benedict.
I direct our research data services program and IT programs at the University of New Mexico Libraries, and we're here today to introduce the Data Management Training Clearinghouse to the CNI community, share some of the capabilities we've been developing and some of the content we have continued to bring into the clearinghouse, and finish with an opportunity for members of the CNI community to join us in enhancing and contributing to this growing collection of materials on data management training, which of course is a key area of focus and interest across many institutions and of particular importance to our libraries and academic institutions. Let me see if I can advance my slide here. As I said, we're going to provide an update on the current clearinghouse, but an explicit part of this project, and a specific intent of IMLS, is also to increase the visibility and use of the materials in the clearinghouse, especially in the library community, and to recruit folks in this community to participate in the project going forward in our various working groups and activities. So we're going to start with an introduction to the Data Management Training Clearinghouse, then go through a very brief demonstration of the search and resource-submission process, and at the end have an opportunity to answer any questions you might have and for you to express interest and tell us how we may be able to engage you in this continuing process. And I'm getting a message here that my screen sharing is paused; are you seeing my live screen, or are you still seeing my introductory screen? We're seeing your intro screen, I believe, Carl. Okay, let me... there we go. That is interesting, but hopefully you're now seeing the presentation-outline screen. Yes.
Okay, great. So far there's nothing very substantive on the slides anyway, so you didn't miss anything. As background to our work with the Data Management Training Clearinghouse, I wanted to start with a brief introduction to the work, which actually started in 2015 as a collaboration between the US Geological Survey's Community for Data Integration, the DataONE project, and the Earth Science Information Partners, or ESIP. It started with the recognition that, with the growing collection of research data management training materials across the internet, there is an increasing need to curate that collection and increase the discoverability of and access to those materials, while recognizing that the materials continue to be maintained across the internet by the original developers of that content. With the project starting in 2015, the first content was actually added to the clearinghouse in 2016. Starting in July of 2018, we began an IMLS-funded enhancement project for the Data Management Training Clearinghouse, and this three-year project has a number of key goals. The first is to continue to increase the quantity of content in the system, but also to diversify that content, going beyond the core content that had originally gone into the clearinghouse, which, where there was a disciplinary focus, centered on the earth sciences. There is a large amount of content in the clearinghouse that does not have a specific disciplinary focus, but for the content that does, there is a strong center of gravity, as you might expect from the original collaborators and contributors, around earth science data. So one of our goals has been to diversify the content.
The other areas of focus are to revisit the documentation model originally adopted by the clearinghouse and look for opportunities to expand and improve it; to further improve the search capabilities as the collection has grown, since search has become an increasingly important capability beyond the earlier emphasis on browsing through the content; and an explicit assessment objective, both assessing the content in the clearinghouse, developing mechanisms to capture feedback from instructors using the training materials and from learners working with them, and assessing more effectively the clearinghouse itself and its ability to serve its growing population of users. And finally, as I alluded to a moment ago, broadening the outreach about the clearinghouse to more explicitly target the library community, as it is the library community that is really stepping up and taking a key role in delivering training around research data management and other data management and analysis issues; increasing the visibility and use of the clearinghouse is a key component of the work we're doing. You can see in this illustration the success we've had so far: the vertical line in the middle of the plot marks the beginning of our IMLS project. Everything to the left is the content added to the clearinghouse before our IMLS project began; the right half of the graph shows the increase we've made in the content. And I can also highlight here that the solid black area is the published resources: currently just over 400 resources have been published.
The dotted line, and this will be a pattern in the other graphs I show as well, indicates the number of items we have added to the system that are currently under review. This highlights one of our areas of need: bringing in additional folks to help with the editorial and review process so we can work through a substantial backlog of materials that have been identified but still need to get into the clearinghouse. When we talk about the diversity of the content in the clearinghouse, there are a number of ways to measure it. One is subject categories. On the left, in the colored solid plot, you can see the distribution of the high-level subject categories and their growth, again both before the initiation of our IMLS-funded work and, on the right half of the graph, since it started. While we've continued to increase across the board, we've seen particular growth in the social and behavioral sciences and the arts and humanities as additional disciplinary areas where content is being added. The somewhat busier graph in the upper right-hand corner shows the subdisciplines, or sub-subject areas, where the main plot shows everything except the earth and planetary sciences and physical science and mathematics, as those still dominate the collection. So we can see the growth of both published and added materials, represented by the solid and dashed lines, there as well. Another dimension of content diversity is illustrated by the distinct keywords, in the lower left-hand corner, where we have the trend line of distinct keywords added to the system since the IMLS project started.
One key dynamic here is that we typically add the keywords at the tail end of the publication and review process, which is why the plot is relatively flat in this area as we've been working through our review backlog; and of course I managed to advance prematurely. You can also think about diversity in terms of our target audiences, and the growth and diversification of the wide array of target audiences the clearinghouse is designed to have content available for. The figures I've been showing are also available through the link on this slide, where you can download a handout with the more detailed breakdowns of the frequencies in these graphs. I mentioned that we're also working on enhancing the metadata model and using that to enhance the search capabilities. On the left-hand side of this table are the existing metadata elements, the ones you would commonly associate with online materials: authors and organizations, citation information, the types of contributors, some of the information relating to the trend lines I was just showing, and information about licenses associated with the content. One key point there is that, with very few exceptions, the focus is on content that is open access in nature, so it does not have restrictive terms on its use. There are also educational frameworks within which the training materials can be placed, whether it's DataONE's data lifecycle or the framework USGS uses for organizing the research and data lifecycle. We have a number of frameworks available for flagging content and enabling search aligned with those frameworks, as a complement to these other capabilities.
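As a rough illustration, the kinds of metadata elements just described (authors, contributor types, citation information, license, educational frameworks) might look like the following record structure. This is a hypothetical sketch in Python, not the clearinghouse's actual schema; every field name here is invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class TrainingResource:
    """Hypothetical sketch of a clearinghouse metadata record.

    Field names are illustrative only; the project's actual
    documentation model may differ.
    """
    title: str
    authors: list = field(default_factory=list)
    contributor_types: list = field(default_factory=list)
    citation: str = ""
    license: str = "CC-BY-4.0"  # most content is open access in nature
    frameworks: list = field(default_factory=list)  # e.g. DataONE data lifecycle

# Build one example record (contents invented).
resource = TrainingResource(
    title="Introduction to Research Data Management",
    authors=["A. Example"],
    frameworks=["DataONE data lifecycle"],
)
print(resource.license)  # -> CC-BY-4.0
```

A structure like this is what a facet-driven search interface would index: each list-valued field (frameworks, contributor types) becomes a candidate facet.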
But as we look at extending the metadata model to further enhance search, we're looking at extending our information about additional educational frameworks and making our references to those frameworks more robust; at additional information about the resources themselves, such as when they've been modified; at increasing our capacity to use and leverage persistent identifiers; and at a strong focus on increasing documentation about accessibility features of the training materials. We are also looking to the future in terms of the FAIR principles as they relate to the metadata, and how we can make the metadata model itself and its contents more accessible and FAIR, so the content of the clearinghouse can be shared and additional capabilities built on top of it. This is our first round of work in enhancing the metadata model, and it is being built into the new version of the clearinghouse currently under development, but we will likely do another round of enhancements with feedback from the community. I mentioned assessment as another activity area. Here we're applying Donald Kirkpatrick's four levels of assessment for training materials. This starts at the lowest level with the immediate reaction to a training opportunity; then learning, in terms of what skills have been developed; then changes in behavior; and finally, results that are an outcome of that. The reality is that, in the context of our work here, we can most effectively focus on the reaction and learning tiers, but we're developing within a larger framework that will allow future work to address the behavior and results tiers in a more longitudinal fashion as well.
This also highlights the things we need to do in terms of defining performance indicators, which translate into targeted behaviors, which translate into learning objectives, which ultimately give us a set of performance standards that we need to roll into the assessment tools even in the lower tiers; that provides the raw material for the assessment strategy we're building. As part of our outreach, we're taking every opportunity we can, as we are here, to do workshops, training sessions, and hands-on opportunities to work with the clearinghouse and get feedback on it, performing this dual outreach-and-training function while also engaging the community to bring their insights back into the ongoing development of the clearinghouse as we go through the second half of our three-year award and produce an enhanced platform. Speaking of which, there was an unplanned opportunity, let's call it, with the announced end of life for the Drupal 7 platform that the current clearinghouse is based upon. The end of life was announced for just a few months after the end of our three-year project, which is not a good situation when you're talking about developing capabilities you want to sustain well beyond the life of the project. So what we've decided to do, with additional support from ESIP and some additional pro bono contributions we're looking at from the Science Gateways Community Institute, is essentially rebuild the platform from the bottom up, moving from Drupal to a Python-based system.
So far we have actually built out two of the three components of that new system: the lower tier, the blue boxes in the right-hand corner of the slide, a new indexing system for the metadata; and a middle tier, a set of web services that allow new content to be integrated, content in the clearinghouse to be updated, and content in the clearinghouse to be queried, which will ultimately feed into the top tier of client applications, a new web interface for the clearinghouse. And with this, I hope to hand it over to Nancy to demonstrate our browse and search capabilities, hopefully without a technical issue like the one we had at the beginning. Nancy, are you online and able to take over the screen share? I am. Can you hear me? Yes. Great progress. Okay, I'm also starting my video. Can people see anything from the screen, or is it still you as the presenter? No, I stopped sharing my screen, so now you can start screen sharing. Okay, let's see what I've got here. I'm going to switch to an online version. Can people see the front page of the clearinghouse? Oh, great. Okay, and I've got slides too, and I think I can move back to that other screen to continue the slides if need be. So again, thank you very much to Diane, to CNI, and to everybody here on the call for the opportunity to share the clearinghouse with you. What I want to do with the last few minutes is two things that Carl mentioned: give you a quick précis, or preview I should say, of the functionality of the clearinghouse, mostly search, a little bit of browse, but also submission, because as you can tell from Carl's slides and graphs, we would love to engage other people in closing the gap between what's published and what is still in the queue.
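The three tiers Carl describes, a metadata index at the bottom, web services over it, and client applications on top, can be caricatured in a few lines of Python. This is a toy, in-process sketch under invented names, not the project's actual code or API:

```python
# Lower tier: a minimal inverted index over resource metadata.
# Records and IDs are invented for illustration.
RESOURCES = {
    1: {"title": "Persistent Identifiers 101", "keywords": ["PID", "DOI"]},
    2: {"title": "Data Lifecycle Basics", "keywords": ["lifecycle"]},
}

def _terms(record):
    """Tokenize a record's title and keywords into lowercase index terms."""
    return record["title"].lower().split() + [k.lower() for k in record["keywords"]]

def build_index(resources):
    index = {}
    for rid, rec in resources.items():
        for term in _terms(rec):
            index.setdefault(term, set()).add(rid)
    return index

INDEX = build_index(RESOURCES)

# Middle tier: service functions to query and to add/update content
# (in the real system these would be web-service endpoints).
def query(term):
    return sorted(INDEX.get(term.lower(), set()))

def add_resource(rid, record):
    RESOURCES[rid] = record
    for term in _terms(record):
        INDEX.setdefault(term, set()).add(rid)

# Top tier: a client application calls the services.
print(query("pid"))  # -> [1]
add_resource(3, {"title": "PID Registration", "keywords": []})
print(query("pid"))  # -> [1, 3]
```

The design point is the separation: clients never touch the index directly, so the index implementation can be replaced without changing the client-facing services.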
So what I want to do here is start by showing you the clearinghouse; if you saw the link at the bottom of Carl's slide, you can certainly come here yourself if you'd like. This is the home page, and as you can see, there are three basic functionalities: search, browse, and submit, down at the bottom here and also up at the top. What we're going to do first is go to browse. This is not rocket science; I think this crowd especially is very familiar with this kind of search portal. The browse function gives you a sense of all of the resources we've got in the clearinghouse, as Carl said, a little over 400 now, almost 410, I think. Here you can see some very brief information about the resources, and one of the filters, or facets, is on the left side: the frameworks that Carl mentioned earlier. As I said, 410 is a lot of records to browse through, which is why we are focusing more on the search capability. To get to the search screen, you just hit the search button at the top, whether you come from browse or from down below. There are a couple of ways of searching that are pretty obvious, I think, but there are a couple of tips that are important to know. First of all, up at the top here is where you would enter your search terms and hit the search button; your results will show up down below, with the number of search results and the list of them. That's one approach you can take. Another approach is to cut to the chase, to some extent, and look at some of the filters, or facets, we've got.
Framework is one. We've got keywords, which is the controlled vocabulary we are adding to as we get resources in; organizations, which show up as either authors of content or contributors to content; the names of people who are, again, either authors, creators, or contributors to the content; the publication date of the resource; the license information; and the cost. Those are the key filters we chose at the beginning; we may well change that a little with our new interface, now that we've had some experience with it. So I want to do a real quick example of how you can use those different approaches together. If you're interested, for instance, in looking for information about identifiers, you would enter the search here and get 41 results, which, as you look down through them, are about identifying materials to be created, persistent identifiers, and other kinds of information like that. If what you're actually interested in is persistent identifiers, you can look at the filter on the left, click that particular one, and refine the results to just the ones about PIDs, or persistent identifiers, because a submitter or a reviewer assigned that term. So that, I think, is pretty straightforward. If you wanted to go backwards from there, you would just hit clear all. Again, pretty straightforward. The other thing you can do is add more filters onto that whole process. So if you wanted, for instance, to take the search we just did, identifiers, then persistent identifiers, and then only the ones that meet a particular educational framework, say the FAIR framework, you would reduce it to even fewer results. So that's a way to refine.
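The refinement just demonstrated, a text search first, then narrowing by a keyword facet, then by a framework facet, behaves like set intersection. A toy sketch; the record IDs and facet values are invented for illustration and are not the clearinghouse's data:

```python
# Hypothetical result set for a text search (e.g. "identifiers").
HITS = {1, 2, 3, 4}

# Hypothetical facet value -> record-ID mappings, as assigned by
# submitters and reviewers.
KEYWORD_FACET = {"persistent identifiers": {2, 3}}
FRAMEWORK_FACET = {"FAIR": {3, 4}}

# Clicking a keyword facet intersects it with the current hits.
refined = HITS & KEYWORD_FACET["persistent identifiers"]
print(sorted(refined))  # -> [2, 3]

# Adding a framework facet narrows the results further.
refined &= FRAMEWORK_FACET["FAIR"]
print(sorted(refined))  # -> [3]
```

"Clear all" simply returns you to the unfiltered hit set; each facet click only ever shrinks (never grows) the current results.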
One thing that's important to know about search is that, at this point, it's a full-text search. What the search mechanism uses is not only the description but a lot of the narrative text in the descriptions. To give you a quick example of what a record actually looks like: this is the brief record, and here's the full one. All of this text is going to be searched in a full-text way. So if you're interested in something precise, given that it's all about research data, the suggestion is not to use the term data in your search, at least at this point. That's a quick tip. So that's basically that. Then, to go quickly through submit: what we have is the opportunity to just drop in a title or URL, changing the access fee if necessary, your name as the submitter, and a contact email address so we can come back to you if need be. That's the only required information. The other information is on different screens to make it a little easier to look at; you can see on the left-hand side what those are. Of course, I'm logged in with privileges as a submitter, a reviewer, and an editor, so you're seeing things on my screen that you wouldn't see, depending on the privileges you had, and I won't go through the rest of those. There are some tips and tricks, of course, to doing that kind of submission, but the important thing to know is that the privileges change based on the role: reviewers and editors have more rights to change some of the controlled vocabularies, to delete records, and that sort of thing. The whole process follows a workflow that we're using GitHub for, and there's a tutorial on this on the website.
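The tip about not searching for "data" follows directly from how full-text matching works: a term that appears in nearly every description matches nearly every record, so it carries no discriminating power. A toy sketch with invented descriptions:

```python
# Invented sample descriptions; in a research-data collection,
# almost every description contains the word "data".
DESCRIPTIONS = {
    1: "An introduction to research data management planning.",
    2: "Persistent identifiers for data and software.",
    3: "Metadata standards for describing research data.",
}

def full_text_search(term):
    """Naive case-insensitive substring match over the description text."""
    term = term.lower()
    return sorted(rid for rid, text in DESCRIPTIONS.items()
                  if term in text.lower())

print(full_text_search("data"))         # -> [1, 2, 3]  (matches everything)
print(full_text_search("identifiers"))  # -> [2]        (far more selective)
```

The same logic explains why combining a more specific text query with facets, as shown above, is the recommended way to narrow results.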
So if I can get back there... I guess I can attempt to change screens, though it doesn't look like we have time. But it's possible to see what that workflow is and get a better sense of the possibilities. I think I'm going to leave it at that for now and stop sharing, because we've just got a minute or two left and I want to make sure there's time for some questions. So I guess, Diane, you would change the screen and get back to Carl. I think if Carl turns on his microphone and starts talking, he will be highlighted and I can unmute him. Welcome, everyone. My name is Diane Goldenberg-Hart from CNI, and I want to thank Carl and Nancy for that really great talk and fascinating tool. Carl, have you got more to add before we move to questions? The last thing I wanted to highlight is the areas where we are seeking folks to join our merry little band in working on this, as we have a number of working groups that help provide input and guidance to our activities: our metadata working group, helping with our continuing assessment of the documentation model and the search built on top of it; our assessment working group, helping to guide our development of the assessment tools; and our editorial team, which is a really key need at this point. We have over 400 published items but over 680 items overall in the system, so if you do some quick math, that highlights that we have a bit of a backlog to work through. We have an interest survey linked here, and it will also be linked at the bottom of the next slide as we move on to questions; we are really interested in broadening participation, in whatever way possible, from the rest of the community in moving this project forward. That's great. I love the slide. Thank you.
And the link for that survey, that interest survey, is right here at the bottom of the screen. Thanks. So I just pasted in the link for the interest survey that we have on your project briefing page; it ends in DMTC-interest, and I see the one you have linked to ends in CNI 2020. Is that a different survey? Actually, those both point to the same survey; it's just a different entry point. Okay, either of those will work. Terrific. I also pointed to the clearinghouse in the chat there, for folks interested in taking a look at the tool; I'm sure they will be interested in exploring it further. That was really fabulous and interesting, and what a terrific tool at a time when we need access to this kind of information. I see we're already getting a few questions. The first question is: what are you using for org identifiers? I can take that one, Carl, if you want. Sure. At this point we don't have identifiers per se; we're getting the information from the landing pages, often of the resource itself. We've got someone searching to find these things, and what she can find she puts into the form itself, so it depends on what the organization provides in that kind of public venue. We're looking to add that kind of specific identifier in the next iteration, both for people, in the first instance ORCIDs or other individual IDs, and also organizational IDs. It's a little unclear what would be the most community-supported ontology, or registry I guess, for organizations, but if you have any ideas or suggestions, please let us know. That'd be great. Okay, great, thank you. So, soliciting ideas and suggestions: please feel free to share them here in the chat, or if you want to stick around and have a conversation with Nancy and Carl later, I'd be happy to facilitate that as well. Moving on to our next question: are the materials primarily submitted by American universities?
American content only, or international? English language only, or other languages? And what is the vetting process for inclusion in the clearinghouse? Okay, that's a good question. Most of what we have is English language; we have a couple of primary languages in Spanish and French, and some materials are available in translations. We're limited to some extent because the people working on it are mostly English speakers, although our grad students speak Spanish. We haven't found much in Spanish yet, but we have the capability of adding that to the extent that we can within our content management system, and also to the extent that our reviewers and editors are able to really evaluate the translations. So that's something we really want to look into; we know there's a real interest and need there, and that's part of what we're looking at, to some extent, in the new iteration of the clearinghouse. The other question, I think... I forgot what it was. What's the vetting process? Oh, the vetting process, right. When people submit resources, some of the people who are submitting know what we're looking for. There's a selection criteria that we've developed; that's what I was trying to show you with that other slide, where I lost my mind for a second there in the presentation. It's also on the slides when they're made available, and it's on the clearinghouse itself under the help guide. The selection criteria allow us to scope the collection, and the vetting is done by the reviewers and the editors. So if something comes in that doesn't meet the selection criteria, it may just not be accepted.
So part of the idea is that we want to stay on research data management topics and research data skills topics, which evolve a little bit over time, but also, because it's a community-maintained service, we have to scope what we can do. Got it, okay, thank you. Going back to the identifiers question: Cliff just weighed in in the chat box there with a suggestion that you consider the ROR IDs. Yeah, good idea; thank you. That's one that's been on our shortlist. I bet, yeah. Okay, if we've got any more questions, we can take another minute or so to take those. In the meantime, I just wanted to put in a quick plug for CNI's virtual meeting, which is ongoing through the end of May; I'm pasting in a direct link to the meeting schedule so you can see what other webinars are coming up. We hope you'll be able to join us for more of these great offerings; we are continuously adding to the lineup, so check back often. At this point it looks like we do not have any more questions. If you would like to stay on and chat with Carl or Nancy, please feel free to raise your hand; I'll move you into chat mode, or I should say I will turn on your microphone. With that, I want to thank Carl and Nancy once again for coming to CNI and sharing your great work; it's really fabulous. Thanks for your time, thanks to our attendees for joining us, and we hope to see you back at CNI webinars very soon. Thank you. Thank you very much, Nancy and Carl, and much applause indeed. Be well, everyone. I can hear it, I can hear it; it's coming in through the chat box for sure. Thank you, take care, everyone. Thank you.