Hi, everyone, I think we'll get started. I want to leave some room for questions at the end, so I want to kick off the show. I am Maria Gould. I am with California Digital Library, and I am the project lead for the Research Organization Registry, or ROR. You can say "R-O-R," but it's more fun to roar like a lion. So now you know how to pronounce it. I'm here today to introduce ROR to the CNI community and give you an update on what we've been doing with the project and where we're going next. So, a quick show of hands in terms of who's in the room today. I just want to get a sense for how many of you work in a library. OK. How many of you are publishers? OK. Research institutes? Museums? OK. A little bit of a cross section here. How many of you have already heard about ROR? OK, great. So some familiar faces and some new faces as well. And how many of you have to work with affiliation data as part of your jobs? OK. And how many of you think that working with affiliation data is super easy? Affiliations are, excuse me for a minute, they're complicated, right? They are messy. So I just took this picture of my badge from CNI and my colleague Daniela's badge from CNI. We happened to enter two slightly different affiliations for ourselves in registering for this conference. So that's a simple example of how messy it can be when we leave the description of affiliations up to an individual. I'm going to talk about what ROR might be able to do to address that issue. The Research Organization Registry is a community-led project to develop an open, sustainable, usable, and unique identifier for every research organization in the world, no small feat. So I'm going to explain a little bit about what that actually means, why we're doing this, what we've done to get up to this point, and where we're going next.
So this concept of a registry of organization identifiers might sound familiar to some of you, especially those of you who were at the spring CNI conference in 2016. There was a presentation at that meeting by Geoff Bilder from Crossref, Trisha Cruse from DataCite, and Laurie Haak from ORCID. In this talk, they shared this image of a wobbly, three-legged stool to illustrate the need for organization identifiers. The problem statement that they presented in this earlier talk was that, as a scholarly communication community, we had made a great deal of progress in building tools and infrastructure to track, manage, and assess different parts and points of the research and publishing process. We had content identifiers in the form of DOIs for articles and for datasets. And we had people identifiers in the form of ORCID IDs. But we didn't quite have a great solution for organization identifiers that could really help to enrich and make connections between these different points, these legs of the stool, or the people, places, and things, or what have you. To be fair, there were at the time, and still are, some existing identifiers for organizations, but they didn't quite meet the mark as far as a need that had been articulated for a global, open, and community-governed infrastructure for identifying organizations. So that's the flashback to 2016. This led to a collective effort starting in 2016 to work across organizations and develop a vision and a plan for what it would mean to build truly open infrastructure for organization identifiers. There's a lot of history about what happened after that point on our website, ror.org.
I'm not gonna go into all of the details here, but here are a few highlights, which are listed on the timeline. This really started at meetings like CNI and others in 2016, with collaborative workshops with members across the scholarly community to discuss what it would look like and what the requirements would be. They created several documents for public comment to get feedback on the proposals. There was an org ID working group that formed with about 17 different organizations to flesh out the feedback and develop a more full-fledged set of proposals. They put forth a request for information to get more feedback. Lots of meetings and lots of feedback. And then in January 2018 and January 2019, there were a couple of meetings with specific stakeholders, people who had been involved in the earlier requests for information and people who had given feedback, to really drill down and say, we're gonna take this forward. At that time, a smaller project group formed to really deliver this first prototype of a registry, and I'm gonna tell you all about that in a minute. So the major development and the major project update since that earlier 2016 CNI meeting is that this earlier org ID initiative is now an actual registry of organizations, ROR, and we now have an identifier for organizations called a ROR ID. So yay, our stool is wobbly no more. We can identify people, places, and things, and we can make really meaningful connections between them. Going back to the description I presented earlier, I want to highlight and emphasize a few things that distinguish ROR from other organization identifiers and that really help frame what this project is and also what it is not. The first thing that I want to emphasize is that this is a community effort.
This is something that has been driven by and driven for the community, and that has always been a really central concern dating back to the early days of the project, when these different organizations all came together to say, let's solve this problem and let's work on this shared vision for open infrastructure for organization IDs. What this ultimately led to, as I mentioned before, was the streamlining of this broader group into four steering organizations. That's CDL, where I am, and we're working with Crossref, DataCite, and Digital Science as the steering group to bring the next stage of the project forward and to collect feedback and input from people across the community. We also have a community advisory group that we just formed, with about 60 people from around the world who we're calling on to give feedback. All of you are welcome to take part, and I'll echo that call at the end as well. A second thing that sets ROR apart is this focus on providing a truly open solution to this problem. I mentioned before that ROR is not the only organization ID that exists. We're not trying to replace other ones that exist, and I'll show you in a minute how ROR IDs actually map to other IDs for organizations that may be in use elsewhere. So we're not trying to be intentionally duplicative or competitive, but what we really lacked was an open solution to this problem. The ROR dataset is not for purchase. The data is licensed under CC0. It's human and machine readable. It can be used and accessed as widely as possible without restrictions. That's really crucial to understanding what sets ROR apart from other organization IDs. And the third thing is the focus that we have in this project on a very specific scope and a very specific use case. This is the concept of an identifier for research organizations or, put another way, organizations that conduct research.
And so the question that we're really trying to answer is which organizations are affiliated with which research outputs. ROR is not a registry of every single legal entity in the world. ROR is also not, even though you're all going to ask about this, mapping departmental affiliations and hierarchies within institutions. We know that that's really important. That's a problem that has yet to be solved. It is a problem that we with ROR are not trying to tackle right now, because we think that we can be really successful in establishing a single and open top-level registry of organization identifiers. We solve that problem, then see where we are at and work with other people who might be tackling the departmental question, and perhaps the ROR IDs can be interoperable with some of those other efforts. So this focus on research organizations is also really crucial to understanding what we're doing and what we're not doing. These are really the three things, just to sum up, that set ROR apart and that define our mission in this space and the reason why we are here. So that was sort of giving you the preamble of how this emerged and the problem statement that we are addressing. Now I'm gonna give a little bit of an update on where we are at right now. I've mentioned that this is the culmination of a couple of years of extensive discussions, planning, workshops, and proposals to get to this point. We reached a really important milestone in January when we launched a minimum viable registry, or MVR, for ROR. So I'm gonna talk about what that means. The first thing is that we have a registry that's up and running. You can search it right now at ror.org/search. The example that I'm showing here is for California Digital Library. What we did to jumpstart the launch of the registry is that we had been working with Digital Science as one of our project partners.
They had their own database of organizations called GRID. But they weren't necessarily interested in turning GRID into a broader community-governed effort in the way that ROR was interested in doing. So as part of Digital Science's contribution to the ROR project, they donated the GRID dataset that ROR will now be building on. The one caveat about where we are at right now in this minimum viable registry stage is that we are still very closely tied to GRID. As we move forward in the project, as I will explain, the task that we're taking on is how we build up ROR further. So I'm just gonna highlight a few things here in a ROR record and walk you through the different pieces. In the upper left-hand corner, we have the actual ROR ID, in this case for California Digital Library. This is a URL that will always resolve to the record. It includes an opaque and unique identifier string that always starts with a leading zero to distinguish it, and it includes a checksum as well. We're capturing the official name for the organization. I'm sure there'll be some questions about what that means, and we'll talk about that in a minute. We're also capturing alternate names, abbreviations, names in other languages if they exist, and things like that. In the bottom right, you can see that the record is also capturing other identifiers for the organization, if those exist and if they are available. So California Digital Library also has entries in the GRID database, ISNI, and Wikidata. That's all super important for understanding that, as I mentioned before, ROR IDs are not necessarily intended to replace any other identifiers that are out there. We're building this registry with pieces of metadata that can be very interoperable and linked to other identifiers that are in use. There's additional metadata in the record as well, like the location and the category.
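As a rough illustration of the identifier structure described above (a URL, a leading zero, an opaque string, and a checksum), here is a minimal Python sketch of a format check. The talk does not say how the checksum is computed. The sketch assumes a Crockford base32 body and an ISO 7064 mod 97-10 style two-digit checksum, and the function names are hypothetical, so treat it as an assumption to verify against the ROR documentation rather than the registry's own implementation.

```python
import re

# Assumption: the opaque part uses the Crockford base32 alphabet (no i, l, o, u).
ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz"

# Optional URL prefix, then a leading zero plus six base32 characters,
# then two checksum digits.
ROR_PATTERN = re.compile(
    r"^(?:https?://ror\.org/)?(0[0-9a-hjkmnp-tv-z]{6})([0-9]{2})$"
)


def checksum(body: str) -> str:
    """Compute the assumed mod 97-10 checksum for the base32 body."""
    n = 0
    for ch in body:
        n = n * 32 + ALPHABET.index(ch)
    return f"{98 - (n * 100) % 97:02d}"


def is_valid_ror_id(value: str) -> bool:
    """Check the shape of the identifier and its checksum digits."""
    match = ROR_PATTERN.match(value.strip().lower())
    if match is None:
        return False
    body, check = match.groups()
    return checksum(body) == check


if __name__ == "__main__":
    # Sample identifier in the expected shape, used only for illustration.
    print(is_valid_ror_id("https://ror.org/03yrm5c26"))
```

A check like this only catches typos and malformed strings. It does not confirm that a record exists in the registry, which still requires resolving the URL.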
These are all based on GRID, as I mentioned, and as we move forward, we'll be adding additional metadata to the records and other types of information as well. One thing that we know we need is a parent-child relationship. We have an entry for University of California System, and we'll be adding metadata to the record so that we can say that California Digital Library is a child of University of California System, and be able to establish relations across records as well, like California Digital Library is related to UC Berkeley. As part of our launch of the minimum viable registry, we also released an API. The link here is just if you wanna go look at the JSON files. We have documentation for the API available on GitHub, and I will circulate links after this talk or send them to you if you have questions. All of this is ready to use right now, and we really are looking for feedback and use cases and things like that. So if you are interested in checking it out, I encourage you to do so and let us know what you think and what feedback you have. We also built a reconciler that works with OpenRefine, and this is available on our GitHub page. I know it's hard to read the small text in the screenshots here, but this is an option for people who may already have their own lists of affiliation strings. The OpenRefine reconciler allows you to map those affiliations to ROR IDs and automate and streamline a lot of that data cleanup and make your affiliations a little less messy and less complicated. So we have this registry, and that's wonderful. We have about 91,000 organizations already in the registry, all with their own ROR IDs. So now what? That's obviously a really important milestone, but the work is far from complete. There's a lot more that we need to do right now and a lot of questions that we need to answer. And so this is a call to the broader community to help us in this next stage of the project.
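To make the API mention above a little more concrete, here is a small Python sketch of searching the registry by affiliation string and pulling out the matching IDs. The talk does not spell out the endpoint or the response shape. The base URL, the `query` parameter, and the `items` and `id` field names below are assumptions to confirm against the API documentation on GitHub, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; confirm against the API documentation before relying on it.
API_BASE = "https://api.ror.org/organizations"


def build_search_url(affiliation: str) -> str:
    """Build a search URL for a free-text affiliation string."""
    return f"{API_BASE}?{urlencode({'query': affiliation})}"


def extract_ids(response: dict) -> list[str]:
    """Collect IDs from a response assumed to hold an 'items' list of records."""
    return [item["id"] for item in response.get("items", []) if "id" in item]


def search(affiliation: str) -> list[str]:
    """Query the registry over HTTP and return the IDs of matching records."""
    with urlopen(build_search_url(affiliation), timeout=10) as resp:
        return extract_ids(json.load(resp))


if __name__ == "__main__":
    # Requires network access; results depend on the live registry.
    print(search("California Digital Library"))
```

Keeping URL construction and response parsing separate from the network call makes it easy to test the parsing against a saved JSON file.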
One of the biggest problems that we will be turning to next is this question of data curation and how we're going to be maintaining the database and the records over time. This is something where we're really looking for community input on the best approaches, and there are some really complicated questions that need to be answered here. One of the challenging things about managing identifiers for organizations, as opposed to people, is that organizations might change over time in ways that people don't. So there are some specific situations that we need to account for in ROR and develop appropriate guidelines and approaches for handling. One very simple thing is having transparent and consistent criteria for adding new organizations to the registry. There are also situations when institutions might merge, or an institution ceases to exist, or there might be a joint collaboration between two institutions: they're not merging, but that might be an affiliation use case that needs to be captured. So the question is, how do we account for that? How do we make sure that those guidelines are really clear? When we make changes to the database, what kind of history do we need to maintain? What kind of versioning might we need? And lastly, something that many of you might already have thoughts about: how do we make sure that we are making the right changes to records? To what extent should somebody be authorized on behalf of an organization to request or approve a change? We've heard different suggestions on our community calls, like, oh, somebody should just be authorized on behalf of a country to approve all changes. We know that that's not something that will work for every country. But even within a single organization, there's this question of who should be that person who can say the official name is this, or this website needs to be updated.
That's something that we are actively seeking input on, and we would love to hear what your thoughts are. We're also working, as I mentioned already, on updates to ROR metadata and the search functionality within the registry to support better discovery and disambiguation. For example, who knew there were so many Lincoln universities in the United States? This doesn't even include Lincoln Christian University, Lincoln Memorial University, and University of Lincoln in the UK. So we need to make sure that our search functionality is really robust and that the metadata really helps to support disambiguation between universities that might happen to have the same name. A major aspect of our success, as some of you might already be thinking, is making sure that wherever affiliations are requested and collected in systems and platforms, that capture of affiliations is tied to ROR IDs in the back end. That is how ROR is going to be most successful as a project. That would be journal submission workflows, funding workflows, and data publishing platforms. One example here is our first proof of concept. This is a screenshot of the Dryad data publishing platform that CDL is relaunching this summer. The situation here is that Dryad did not previously collect affiliations as part of the dataset submission workflow. In the new Dryad that is launching this summer, they will be collecting affiliations. The field that's circled will basically be a lookup. The researcher who's submitting to Dryad will look up an affiliation, and those affiliations will actually be pulled from the ROR database. So the affiliation that the researcher chooses will be capturing the ROR ID in the background. That means that all of the datasets published in Dryad will be connected to a ROR ID.
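The lookup described above can be sketched in a few lines of Python: the researcher types an affiliation, the form matches it against registry names and aliases, and the stored record carries the ID alongside the display name. This is a hypothetical illustration, not Dryad's implementation. The `Org` class, the function names, and the placeholder IDs are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Org:
    ror_id: str
    name: str
    aliases: list[str] = field(default_factory=list)


# Hypothetical registry extract; the IDs are placeholders, not real ROR IDs.
REGISTRY = [
    Org(
        "https://ror.org/placeholder-1",
        "University of California, Berkeley",
        ["UC Berkeley", "University of California at Berkeley"],
    ),
    Org("https://ror.org/placeholder-2", "California Digital Library", ["CDL"]),
]


def lookup(typed: str) -> list[Org]:
    """Return organizations whose name or alias contains the typed text."""
    needle = typed.strip().casefold()
    if not needle:
        return []
    return [
        org
        for org in REGISTRY
        if any(needle in label.casefold() for label in [org.name, *org.aliases])
    ]


def record_affiliation(typed: str) -> dict:
    """Store the registry's display name together with its ID."""
    matches = lookup(typed)
    if not matches:
        return {"affiliation": typed, "ror_id": None}
    chosen = matches[0]  # a real form would let the researcher pick from the list
    return {"affiliation": chosen.name, "ror_id": chosen.ror_id}


if __name__ == "__main__":
    print(record_affiliation("UC Berkeley"))
    print(record_affiliation("University of California at Berkeley"))
```

Because both spellings resolve to the same record, downstream systems can group datasets by ID instead of by whatever string the researcher typed.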
We won't have one researcher submitting a dataset saying "I'm at UC Berkeley" and another researcher saying "I'm at University of California at Berkeley" and having those be essentially registered as two different organizations, because we'll have one ROR ID for all of those affiliations. Dryad is going to be doing this retroactively as well for all of the datasets that were previously submitted to Dryad. So this is our first pilot, our first proof of concept, and it will be a really helpful example to show to other platform administrators. We also know that it's not sufficient to just capture affiliations in these systems. We also need to make sure that the metadata schemas that are used widely across the community support ROR IDs. On our project team, Crossref and DataCite are both going to be updating their metadata schemas later this year to include ROR IDs. That means that when publishers, for example, adopt ROR IDs and submit their metadata to Crossref, we'll be able to have ROR IDs for all of those articles that have been published. ROR IDs can then be exposed in searches and in the API for both DataCite and Crossref. And lastly, we're looking at different approaches and options for how we sustain this going forward. We've put so much work into getting to this point, and we're really committed to supporting this for the long term. There are some different things that we're considering, and again, this is another opportunity to get input from the community about how to proceed. So, some of these questions: as I mentioned, the policy question is really important, and how we can have sustainable, long-term, and transparent policies and procedures for managing the data.
We're also looking at this governance question and how we keep the community involved and what kind of strategic planning structure we might need to sustain this going forward. And then lastly, there's this question of sustainability and how we ensure that we can maintain this for the long term. Essentially, right now we have this project team, CDL, Crossref, DataCite, and Digital Science, donating their resources and some of their staff to supporting the project. We know that that can't last forever, so we're looking at building out something that can be more long term. I've already mentioned that we are actively seeking input, feedback, and involvement in different ways. We have a community advisory group, and you're welcome to be a part of it. It basically just means that we have a Zoom call every other month, and we look to the community advisory group to give feedback on some of the policies that we'll be drafting and some of the decisions that we'll be making. We have a Slack workspace if you like to hang out in Slack, and you can email info@ror.org. That goes to me. If you have questions, if you wanna set up a call, if you want someone else at your institution to talk to me, I'd be happy to. We have a mailing list and a Twitter feed as well. So there are lots of different points of contact and ways to stay up to date. That's the end of my overview and update, but I imagine that some of you might have some questions or ideas, and I would love to hear them.