Hello. We're going to go ahead and get started. I'm Maria Gould. And I'm John Chodacki. We're at California Digital Library, which is one of the operating organizations behind the Research Organization Registry, otherwise known as ROR. And we're here today to talk about both of those things. So for those who may not know, California Digital Library, or CDL, is the systemwide library unit for the University of California system. And UC research output is immense. One of the questions that frequently comes up across our university system is: how do we identify and track all of the research that's coming out of the University of California? It seems like it should be a very simple thing to do. Just find the research. Just find all the research associated with researchers at UC across all of the UC campuses. But this is actually much more challenging than it should be, and we end up spending lots of time and lots of money tracking this down with very mixed results. So what we're talking about today is not just how we keep track of things internally and in our catalogs, but how we make sense of where all of the research coming out of UC is being disseminated in different systems around the world. There are lots of layers to this problem that we're trying to untangle, and it comes up for a few different reasons. One is that this is a metadata problem. The metadata just isn't there. Metadata about the research, who's producing it, where it's published: it's just not there, or it's not in a usable, reusable, consistent format. If we think about the name UC Berkeley, for instance, trying to find all of the articles published by UC Berkeley researchers is a challenge because we may not have usable metadata that has a consistent form of UC Berkeley's name. It's also an infrastructure problem. 
The information about that research coming out of UC Berkeley may be out there, but it's available only in siloed or closed systems or behind paywalls. It's also a labor problem: because the metadata is bad and because our infrastructure isn't open, we end up having to duplicate efforts and reconcile data from multiple systems over and over again. And so what this ends up with is essentially a values problem. All of these challenges present a situation that goes against the openness and connectivity of knowledge and information that we value as a library and as a public institution. As a public institution, we firmly believe that the research that's coming out of our campuses and across our institutes and other research centers should be available and open to all. But our infrastructure is not set up to support this vision, so we end up depending on imperfect solutions and on external entities to make this research available and open. To give one example, when UC passed an open access policy, we realized that we needed some way to actually find all of the publications coming out of UC and identify the ones that needed to be made open access and deposited in our repositories. We didn't have an easy way to do that, so we ended up having to purchase a contract with a commercial vendor, Symplectic Elements, to actually go out there and find and harvest all of those UC publications so that we could then deposit them in our open repository. This was intended to be a stopgap measure, but eight years later the situation still hasn't changed and we're still relying on that temporary solution. So we have this interest in knowing where our research is coming from, because this is crucial data for pursuing the goal that all of our research is openly available. But if we're going to be able to get that information downstream, what do we need to do to actually address that upstream and make this happen? 
I'll pass it over to John.
So, taking a step back and thinking about how we in general find information about research outputs from our universities and from our campuses, it really boils down to the systems that are in place. Not the local systems within our campuses, but the wider open infrastructure systems that are trying to tackle the connection between people, places, and things. We, as a wider scholarly enterprise and community, have really tried to invest in certain types of identifiers and systems. We have publishers who have invested in DOIs through Crossref, and data centers, software, and other outputs that have invested in DOIs through DataCite. And we have, as a library community, really embraced ORCID iDs, trying to get all of our researchers to think about identifying, through persistent linkage, their CV and their research outputs. One of the challenges that we've had, though, is that there hasn't been a clean, easy way of connecting these through affiliations. So at CDL we sat down and talked about this back in 2016: why are there so many of these identifier schemas for organizations, but none that actually fits this use case of an open, CC0, reusable identifier that we could leverage as a wider community? So we brought together a group of people who had worked on this, over 17 organizations that each had their own schema for organizational identifiers, and said: how do we tackle this problem? How do we tackle the problem that, even with all of these identifiers for outputs and people, we still don't have a clean way of tracking affiliations so we can find things that are from UC? And actually, one of the ways that we did that was at CNI. One of the presentations that we did was with organizational identifier schema organizations at CNI, where we brought together the community to discuss this. 
And ultimately, what came of it was the launch of ROR. So, as Maria said, it's an open identifier registry for research organizations. And we, CDL, are now one of the operating organizations that help run it. It is a partnership between CDL, Crossref, and DataCite. Sorry, I messed up my slides. So now, today, we have over 102,000 entries in this registry. We have made all the information about each of these organizations openly available under a CC0 waiver. And with that, we have a set of infrastructure tools, like a REST API and a public data file, that allow the community to invest in and leverage the information in their systems. ROR IDs really are meant to be integrated into other systems. You can think of Crossref, DataCite, or ORCID as big systems that have information about people and outputs. ROR is more intended to be something that is integrated into those other systems. It's really about how you identify that affiliation piece of information. So, to demystify it, it truly is just a string, a link that sits within a metadata file and declares an open identifier scheme that can be leveraged and reused. And so the question that we have is: this is very powerful, this is very important, but what does it mean to us as a library when we're thinking about the questions that Maria was posing? How do we think of this when we're talking about our open access policy, about our strategy to push towards open access and openness and transparency within research?
So with that in mind, taking a little bit of a step back: we just talked about the launch of the Research Organization Registry and what ROR is. So now, how does that connect to the open access strategy at UC? In 2018, the University of California Libraries launched what was called Pathways to Open Access, or Pathways to OA. 
And this was really intended to be a guiding framework and a multi-pronged strategy for advancing open access across UC and addressing some of the challenges we discussed earlier in opening up our research and making it available to all. This has been a really important framework, not just for the University of California but for other libraries as well, as a model, and it has been guiding subsequent strategies across our campuses and specifically at CDL to figure out the diverse set of tools that we have to look at every single layer of our research infrastructure and research ecosystem and figure out how to make everything as open as possible. One of the things that we're focusing on today, and one of the things we've been exploring at CDL, is this: if we have these pathways that we have developed, this set of diverse strategies, they lead to a state of openness where the outputs coming out of research activities are made open. What we're taking a closer look at today is how open those pathways themselves are. How open is the infrastructure that we're using to actually construct those pathways? And how open does it need to be? That brings us to what we are doing at CDL to really walk the talk and figure out how to make those pathways as open as possible to enrich our infrastructure. One of the ways that we're doing that is by building open identifiers and open infrastructure into our own systems and our own services at CDL. We have a number of different services that we operate, including the Dryad repository and the DMPTool. In those services and systems, we're building in opportunities to collect ROR IDs for researcher affiliations, making sure that we're capturing those at the source, and also sending those ROR IDs in the metadata that we register as part of the DOI registrations that we send along to Crossref and DataCite. 
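To make the idea of sending ROR IDs along with DOI metadata a bit more concrete, here is a minimal sketch in Python of what an affiliation carrying a ROR ID might look like. It is not how Dryad or the DMPTool actually do it. The field names are modeled loosely on DataCite's affiliation properties and should be checked against the current schema before use, the helper names `affiliation_entry` and `creator_record` are hypothetical, and the ROR ID shown for California Digital Library should be verified against the registry. The pattern only checks the general shape of a ROR ID; it does not verify the checksum or confirm that the record exists.

```python
import json
import re

# Shape of a ROR ID expressed as a URL: a leading 0, six base32-style
# characters, and two digits. This checks format only, not the checksum.
ROR_PATTERN = re.compile(r"^https://ror\.org/0[0-9a-hjkmnp-tv-z]{6}[0-9]{2}$")


def affiliation_entry(name, ror_id):
    """Build one affiliation dict (hypothetical helper; field names are assumptions)."""
    if not ROR_PATTERN.match(ror_id):
        raise ValueError(f"does not look like a ROR ID: {ror_id!r}")
    return {
        "name": name,
        "affiliationIdentifier": ror_id,
        "affiliationIdentifierScheme": "ROR",
        "schemeUri": "https://ror.org",
    }


def creator_record(creator_name, affiliations):
    """Attach a list of (name, ror_id) affiliations to a creator (hypothetical helper)."""
    return {
        "name": creator_name,
        "affiliation": [affiliation_entry(n, r) for n, r in affiliations],
    }


record = creator_record(
    "Example, Researcher",
    [("California Digital Library", "https://ror.org/03yrm5c26")],
)
print(json.dumps(record, indent=2))
```

The point of the sketch is that the identifier travels with the affiliation name at the moment the metadata is created, so downstream systems do not have to guess the institution from a free-text string.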
And that means that throughout those system integrations, we're collecting standard identifiers for researcher affiliations and making those affiliation identifiers available in open, public metadata, so that not just the University of California but anyone can more easily track research that's coming out of UC. A second way that we are building open identifiers and open infrastructure into our open access strategy is through our publisher negotiations. One of the strategies we've taken more recently is to make sure that we're including ROR and other identifiers in our publisher agreements. That means emphasizing to the publishers we're signing transformative agreements with that it's extremely important that they collect ROR IDs in their systems, specifically for the researchers at UC, and that the metadata gets deposited in Crossref or other places where it can be made openly available to UC and anyone else. A third area of our strategy for incorporating open identifiers and infrastructure into our open access plan is to support persistent identifiers and rich metadata as part of the best practices that we're implementing for library publishing at CDL. This is something that John and I don't work on directly, but it's a key part of what we do at CDL to open up research. We think it's really important that we follow best practices for opening up not just the outputs of those publishing activities, but also the core metadata about those publications. And lastly, we're involved more community-wide with a wider set of initiatives and groups that are working collectively to open up our infrastructure and make information about research as openly and publicly available as possible. 
So that includes PIDapalooza, the festival of open persistent identifiers, the Open Access Switchboard project, the Initiative for Open Citations, the Initiative for Open Abstracts, and Metadata 2020, just to name a few. We really see the work that is going on with ROR as a case study for how CDL can incorporate open infrastructure into our open access strategy. And we wanted to talk about that experience today because this isn't just our story or our mandate. We think that this can be everybody's story and everyone's mandate. It's something that we can all be thinking about, because we should all be paying closer attention to our choices and strategies regarding open infrastructure. When we do that, we all benefit. So this isn't just jargon. This is a tangible investment that we are making at CDL, and it's a tangible investment that others can be making as well to support the long-term sustainability and openness of our research. One of the things that other organizations and other libraries can do is to look at enriching and opening metadata. We're not talking about cataloging metadata here. This is about the scholarly metadata about institutional outputs and research activities, which we all need in order to discover and track research and to understand the long-term impact of that research. We need to consider metadata openness in our publisher negotiations so the metadata can be openly accessed and reused. And we also need to consider metadata quality and metadata openness when we're evaluating vendor platforms and service providers, specifically ones that deal with research tracking. A second thing that libraries can do is to look at opening infrastructure. We can't transform our systems if they're not open, so we all need to think about investing in open and connected infrastructure for publishing and tracking research, and to think carefully about the vendors that we're paying and what we're getting or not getting from them. 
And a third thing that libraries can do is to pay attention to where time and resources are being invested. We specifically need our teams to be equipped to work in and navigate open information systems. We need to get around this challenge of everybody working in silos, trying to solve the same problem, or working with imperfect or closed systems and information. One thing that would really help is to have more librarians and library programmers trained to work with Crossref, DataCite, and ORCID, and with open APIs and open data. We need to do this collaboratively and try to solve our local problems in a scalable way so that others can benefit from them as well. And lastly, we need to factor in where our values come in when we're navigating these challenges, and make sure that we're not coming up with solutions that end up compromising our strategies and pathways toward open access. So those are the four things that we're trying to do at CDL to advance our strategy toward open access and open infrastructure, and four things that we think other libraries can be doing as well. We wanted to share that story and talk about what we've been doing at CDL and with ROR, and we also wanted to turn the conversation over to those of you in the room to see what kinds of challenges you're grappling with yourselves, or what kinds of questions you have about ROR or about how it fits into this larger open access strategy at CDL and UC. So thank you for listening, and we look forward to your questions and comments.
You can also just shout out questions or comments if you have any. Definitely.
Yeah, so the question was about how you leverage ROR IDs when you're looking at an aggregate level across a whole system or a campus. So first of all, at a very basic level, ROR is an open system that can be leveraged by anybody. 
Identifiers within ROR are intended to be at the highest level possible. We're not trying to recreate an organizational chart within a university. We're really looking for organizations. So, for example, the University of New Mexico would be the University of New Mexico itself, as well as the many institutes and affiliated organizations that are within UNM. So within that family, the question would be: how do you create a set? Very simply, one thing that can be done is just aggregating the 30 to 50 ROR IDs that exist. It's also about all of us starting to think about how we can ensure that there is a clean set of 30 to 50, or thousands in the case of big systems, and starting to think about that as the set that you're leveraging. There has been a set of projects where people have thought: okay, if we're running our own IR or our own data repository, and we know that the majority of people there are going to be from the University of New Mexico, or from wherever you are, then pre-populating a submission system with those ROR identifiers guarantees that the downstream metadata that goes into open systems will be clean and will be customized to your use case. So instead of thinking of ROR as just this big open field of 102,000 identifiers, it can be really focused on what your local use case is. And I think that there are definitely ways that, within the ROR metadata, we can do more to connect. There is a level of very light hierarchy that has parent, child, and sibling kinds of relationships. But in general, that kind of information is something that you would want to work out for your specific case.
I guess it could also be the same as ROR.
Yeah, I mean, I think it's very important to just state that this is a very simple issue that has huge ripple effects across our entire organization. 
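As a rough illustration of the "clean set" idea described here, the following Python sketch keeps a small curated set of ROR IDs for one institutional family, uses it to pre-populate a submission form, and matches free-text affiliation strings against it. Everything in it is an assumption for illustration: the institution names are invented, the ROR IDs are placeholders rather than real registry entries, and the helper names are hypothetical. A real set would be assembled from the ROR registry itself, and the matching here is plain normalized string lookup, not ROR's own affiliation matching.

```python
import re

# Hypothetical curated set for one institutional family.
# Keys are placeholder ROR IDs (NOT real ones); values are known name variants,
# with the preferred display name listed first.
CURATED_SET = {
    "https://ror.org/PLACEHOLDER-1": [
        "Example University",
        "Example Univ.",
        "EU Main Campus",
    ],
    "https://ror.org/PLACEHOLDER-2": [
        "Example University Health Sciences Center",
        "EU HSC",
    ],
}


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def build_index(curated):
    """Map every normalized name variant to its ROR ID."""
    index = {}
    for ror_id, variants in curated.items():
        for variant in variants:
            index[normalize(variant)] = ror_id
    return index


INDEX = build_index(CURATED_SET)


def match_affiliation(raw):
    """Return the ROR ID for a free-text affiliation, or None if it is not in the set."""
    return INDEX.get(normalize(raw))


def picker_options(curated):
    """(display name, ROR ID) pairs for pre-populating a submission form."""
    return sorted((variants[0], ror_id) for ror_id, variants in curated.items())


print(match_affiliation("  example univ "))
print(picker_options(CURATED_SET))
```

Offering `picker_options` in a deposit form means the identifier is captured at the source, while `match_affiliation` is the kind of cleanup needed only for records that arrived as free text.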
Really, across our whole community. And that's the point of our talk: this is truly just a string being matched against an identifier. It's people starting to just normalize. It's the stuff that libraries do all the time. The problem is that we do it internally, within our systems or within our community, and then the business intelligence tools and all of the metrics tools that we're all buying are grabbing information from these open metadata suppliers. So the point here is that, as a library community, it's important for us to think of having some level of ownership over that metadata about our outputs, too. It's not just the catalog that we maintain. It's also the set of metadata fields out there that a publisher puts in Crossref for a UC researcher. I need to start thinking about how that information is being managed, and about whether the identifiers within that metadata are open, because I am downstream and paying for that information later through business intelligence tools. So it's really about thinking of open infrastructure, as a community, as something that we are investing in and that we are making better over time.
Just curious, show of hands: if this is your first time hearing about ROR, raise your hand.
We have a little bit more time for questions if you want to continue thinking about anything that comes to mind. But just to add to what John was saying and to follow up on your question: we're not here telling our story because we know all the answers and have it all figured out. This is a problem at UC and a problem at CDL that we're trying to solve, and something that a lot of you, a lot of other institutions, are trying to solve as well. One of the opportunities we have with ROR is providing this core set of open data that can populate many other systems that we're all using or could be using. 
So it's a problem that we can try to solve locally, but in a collective way, or vice versa, however you want to phrase that. A couple of other things came to mind when John was talking. There are some opportunities now with the kind of data that exists in ROR, and that is beginning to exist in the systems that are integrating with ROR, and with the way in which ROR IDs are becoming embedded in ORCID records and in DOI registration packages: systems that are starting to aggregate that kind of research are making it easier to solve these kinds of tracking problems. Many of you will be familiar with the story of Microsoft Academic Graph and the decision that Microsoft made to shut that down. There's now a successor project called OpenAlex that the team at OurResearch has been building, which is really meant to fill the gap that was created with the sunset of Microsoft Academic. That's something that's relying heavily on ROR as a core institutional identifier, and because they're providing an open API and open tools for querying their data, it's going to help with some of these downstream tracking problems, like how we find and roll up all of the University of New Mexico ROR IDs, query them, and track all those research outputs. I just wanted to point to that as an example. That's not a solution that CDL came up with, and it's not a solution that ROR came up with, but it shows the way in which other developers and other initiatives can take the open data that ROR has and build something that other people can use.
Yes?
Do you know if the problem that UNM or the University of California has, with so many ROR IDs for one institution, is getting worse or better?
I was going to say, yeah. I wouldn't say it's on a spectrum of worse to better. 
I would say it's just getting more complex, but it's also clarifying in a way. The challenge that we've had is the wild, wild west of text strings. It's not an insurmountable challenge for our community to catalog, to create a registry of, all research organizations. It's not. We have just not focused the attention on doing that as a single community. In our library world, we have focused on doing it through different strategies, and the scholarly metadata world has kind of left it alone. And now we're seeing, at the intersection of our own catalogs and the way we look at the world and the way the rest of the world has built metadata catalogs, that it's not penetrating through that wall. So I think it's at this level of us working together with metadata from Crossref, DataCite, and ORCID, and with the systems that are being leveraged, like knowledge graphs and business intelligence tools, that we really need to focus the attention on creating that registry and making sure it's being leveraged. So, I mean, with great complexity comes great responsibility, or some phrase like that. It's not necessarily bad or good. It's a journey.
I'm not sure if this is relevant, but do you take any action with a publisher like Wiley, who uses Ringgold?
Yeah, I can speak to that. There are a number of publisher integrations with ROR that are already underway. In some cases, publishers are looking to replace Ringgold with ROR, and in other cases to supplement Ringgold with ROR, because publishers are hearing from librarians and from some of their other users that they want that information in the metadata that they're producing. So Wiley is one that is working on this. 
Other publishers are a little bit further along: Cambridge University Press, Hindawi, eLife, just listing a few off the top of my head. But it's definitely a growing topic among the publishers that we've been in contact with, and something that we're trying to support. Everybody's integration journey is a little different, and every use case is a little different. But that's definitely going to increase as adoption grows.
It's also important to know that people might want to use a proprietary system like Ringgold for their internal business operations at a publisher or at any organization. What we're really trying to focus on here is, as a library community, asking publishers, data repositories, vendors, and others to leverage an open identifier schema, like a CC0 one, when interchanging information through places like Crossref, DataCite, and ORCID. It's really about that interchange of information. So it's less about trying to destroy the business of a proprietary system and more about focusing on our values and on the systems that we use when we as a library community consume and disseminate information. So it can be a complementary situation with Ringgold or others when it comes to the publishing industry.
Time for one last quick question? But I know we're coming up on lunch. Well, thank you all. Thanks for your questions. We'll hang out up here if anybody wants to come up and chat. I hope you have a good rest of your day.