Hi everyone. My name is Maria Gould. I am here from California Digital Library, and today we are going to talk about ROR. Not like the lion's roar, but that's how you pronounce it. So I'm going to talk to you about a project called ROR, in which we're building an open registry of research organizations.
The first thing I wanted to point out is how amazing these badges are. I don't know if you noticed on the back that you can plant them. They will become wildflowers. It's very Portland. The other thing I wanted to point out about these badges: if everybody takes a minute to look at the affiliation on their badge, it's probably something that you entered when you were signing up for this conference. Think about any other variations in how your affiliation could be written. For instance, the images of the badges that we have here on the screen, for myself and for John Chodacki, have California Digital Library and CDL. John and I have the same affiliation because we both work in the same place, and they are nicely written and expressed in the same exact way. Very nice, very clean, not messy at all.
This is not at all how affiliations really work in the real world. They are complicated. For instance, here is another set of badges, from myself and my colleague Daniella. We attended a conference last month, and we each entered our affiliation in a different way when we registered. My affiliation ended up being so complicated that it didn't even fit on my badge. Say, for instance, Daniella and I were each submitting papers to the same journal, or even to different journals. Because we entered our affiliations in two different ways, it's quite possible that whatever tracking systems or analysis are being used to identify how many publications are coming out of California Digital Library wouldn't necessarily capture that Daniella and I both work in the same organization.
So this is one of the problems that the ROR project is trying to solve: addressing a simple yet also very complicated question, which is, which organizations are affiliated with which research outputs? I work in a digital library that's part of a massive university system, and I've also previously worked in open access publishing, so I'll talk to you a little bit about this problem and why it's so hard to actually identify which organizations are affiliated with which research outputs.
We have three pieces of this puzzle in the world of scholarly publishing. In the upper left of the screen, we have the people who are producing the research, who are publishing articles, who are creating and submitting datasets. In the upper right, we have those outputs themselves: the datasets, the articles, the presentations, what have you. And at the bottom, we have the organizations or institutions with which these people are affiliated.
So raise your hand: are people here familiar with ORCID iDs? Okay, great. We've sort of solved the problem of how to disambiguate the people who are involved in a research project. We have ORCID iDs, which kind of function like social security numbers, so we can know that John Smith over here is not the same as John Smith over there. Raise your hand: how many people are familiar with DOIs? Okay, excellent. So we've kind of solved the problem of identifying content, that is, being able to assign a unique identifier to a piece of research.
And we have connections that can work back and forth between the people who are involved in the research and the research itself. So we can say that a person who has this ORCID iD produced this piece of research that has this DOI, and we can make those connections.
So how many people here are familiar with an organization ID? Okay. This is sort of the missing piece of the puzzle, or the missing leg of the three-legged stool, that we are trying to work on with ROR. We can only go so far in measuring the full extent of research coming out of a given organization, or the full scope of work coming from a particular research community, because we don't necessarily have an organization identifier to be able to triangulate between the people and places and things involved in the research process. Or we didn't until recently, and that's what I'm going to tell you about.
So this is where the Research Organization Registry comes in. This is a community-led project to develop an open, sustainable, usable, and unique identifier for every research organization in the world. I am working as the project coordinator, representing CDL, in collaboration with a few other organizations that some of you might be familiar with. We are collaborating with Crossref, DataCite, and Digital Science, and all of our organizations are contributing developer time, project support, outreach, and communications to bring this project to fruition. We are also working as a community project with a larger group of community stakeholders and advisors, and all of you are welcome to take part, because you are all part of the research and scholarship and data community.
So there are a few things that are important to understand about ROR, and a few things that I wanted to emphasize in this talk.
The first thing is that ROR is community-led. What we're working on now is the fruition of several years of work by many, many different organizations, from publishers to system providers to data analysts, to really think about how we solve that problem of identifying organizations and connecting them to the people and the research outputs. So this is very much a community-driven effort.
The second thing that's really important to understand about ROR is that we're building an open registry. ROR is not the first registry to exist that identifies organizations, but it's the first one to be truly open. We believe that if you're a librarian or an institution and you're trying to find out where your researchers are publishing, that information shouldn't be locked behind a paywall. It shouldn't be owned by a commercial provider. The things that have come before ROR have not necessarily solved that challenge of making the data fully open and fully accessible. So the ROR dataset is available under CC0. It's human and machine readable. We want it to be used as widely as possible and to be as useful as possible.
The third thing is that with ROR we're focused very specifically on identifying organizations. That means we're not necessarily working on identifying every single department that exists within an institution. We know it's really important to some people to be able to answer a question like how many papers were published by people in the Department of Philosophy or in the Department of Biomedicine, and that's a really complicated question to sort out. What we're focused on right now is identifying the top-level organization and building this at a truly global and open scale, and then other kinds of local efforts can fit into that and be built around it.
So in January we launched our registry. We're calling it the MVR, our minimum viable registry.
We have 91,000 research organizations in ROR. They all have their own unique ROR ID. You can go to ror.org/search right now if you are on a laptop or on a mobile device. It should be working on mobile; let me know if it doesn't. You can look up your own organization or any other organization that comes to mind. I'm showing an example right here of California Digital Library.
If you go to ror.org/search and look up an organization, you'll see a few different things in the ROR record that I wanted to point out. The ROR ID is in the upper left-hand corner. It's a URL that resolves to the record. We're capturing a bunch of other metadata about the organization, such as an acronym like CDL and the website, and we're mapping to other identifiers as well. I think we have a Wikidata person here in the audience. ROR IDs are currently mapping to Wikidata, and pretty soon it'll be vice versa as well. As part of building this open registry of organizations, it's really important that it's interoperable with the other identifiers that are out there.
We have an API. All of this is available on our GitHub, and you're all welcome to play around with it. We have an OpenRefine reconciler, so if you're working with your own messy list of affiliations, you can run it through the reconciler, map it to ROR IDs, and come out with a super clean list of affiliations that are all tied to ROR IDs.
There are still a few problems that we are trying to solve with ROR. Remember, we're at the MVR stage right now. We have the registry up and running, and we have IDs for 91,000 organizations, but there are still some things that we're sorting out in the next stage of the process. If anyone is really interested in these problems, I encourage you to get involved.
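
The reconciliation idea mentioned above, taking a messy list of affiliation strings and tying each one to a single identifier, can be illustrated with a minimal sketch. This is not how the ROR reconciler itself works; the alias table, the `normalize` and `reconcile` helpers, and the placeholder identifier are all assumptions made up for illustration.

```python
import re

# Hypothetical alias table: normalized name variants -> one placeholder identifier.
# "https://ror.org/EXAMPLE-CDL" is NOT a real ROR ID; it only stands in for one.
ALIASES = {
    "california digital library": "https://ror.org/EXAMPLE-CDL",
    "cdl": "https://ror.org/EXAMPLE-CDL",
    "university of california california digital library": "https://ror.org/EXAMPLE-CDL",
}


def normalize(affiliation: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", affiliation.lower())
    return " ".join(text.split())


def reconcile(affiliations):
    """Map each raw affiliation string to an identifier, or None if unmatched."""
    return {raw: ALIASES.get(normalize(raw)) for raw in affiliations}


messy = [
    "California Digital Library",
    "CDL",
    "University of California, California Digital Library",
    "Calif. Dig. Lib.",  # a variant the alias table does not cover
]

for raw, org_id in reconcile(messy).items():
    print(f"{raw!r} -> {org_id}")
```

The first three variants land on the same identifier, while the abbreviated one comes back unmatched. Exact lookup breaks down quickly on real affiliation data, which is the reason to lean on a shared registry rather than a hand-maintained table.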
So one of the things that we're working on next is improving the metadata in ROR, and also looking at some of the problems that come up with disambiguation and just the nature of affiliations and institutions around the world. A lot of them happen to share the same name. There are a number of institutions in ROR that are all called Lincoln University, and the three that I'm showing right here all happen to be in the United States. They all have unique ROR IDs, and that's really a step toward identifying these unique entities, these unique Lincoln Universities.
But here is an example from my colleague at CDL, Daniella Lowenberg, who will be talking more about this later today. This is from CDL's data publishing platform, and we're building a ROR integration into it. You can see in the example that I'm typing Lincoln University, and it's pulling up all of the Lincoln Universities that match in ROR. But there's no way for me to know which Lincoln University is which from this dropdown. So that's one of the problems that we're now working on in ROR: how to further disambiguate after assigning unique IDs to all these organizations.
Another problem that we're trying to solve is how we curate data in the ROR registry over time, and how we account for the sometimes unpredictable nature of institutions and research organizations. This is an example from my home state of Vermont, a small college that recently went under. I didn't check right before this talk, but presumably this college, Green Mountain College, has a ROR ID. But after this semester the college will no longer exist. So how do we account for that kind of situation in ROR? What other kinds of metadata do we need to build into the registry to designate organizations that are obsolete, or two institutions that decide to merge? And how do we add more institutions to the ROR registry over time?
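
To make those curation questions concrete, here is one possible shape for the extra metadata, offered only as a sketch. The `status` values, the `successor_id` field, and the `resolve_current` helper are hypothetical, invented for illustration, and the talk does not say how ROR will model closures or mergers.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class OrgRecord:
    org_id: str
    name: str
    status: str = "active"              # hypothetical values: "active", "closed", "merged"
    successor_id: Optional[str] = None  # hypothetical link to the record that replaces this one


def resolve_current(org_id: str, registry: Dict[str, OrgRecord]) -> OrgRecord:
    """Follow successor links until reaching a record that has no successor."""
    seen = set()
    record = registry[org_id]
    while record.successor_id is not None and record.org_id not in seen:
        seen.add(record.org_id)  # guard against accidental cycles in the links
        record = registry[record.successor_id]
    return record


# Made-up records: org-a merged into org-c, and org-b closed with no successor.
registry = {
    "org-a": OrgRecord("org-a", "Example College A", status="merged", successor_id="org-c"),
    "org-b": OrgRecord("org-b", "Example College B", status="closed"),
    "org-c": OrgRecord("org-c", "Example University C"),
}

print(resolve_current("org-a", registry).name)
print(resolve_current("org-b", registry).status)
```

The design choice worth noting is that the old identifiers are kept rather than deleted, so affiliations already recorded against a closed or merged organization still resolve to something meaningful.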
So we're going to be working a lot on questions around data curation and maintenance, and on how we involve the community in that effort as well. Questions like: should somebody be authorized on behalf of a specific organization to approve changes or to request changes?
There are a lot of ways to be involved. I mentioned we have a community advisory group. We have a Slack workspace. You can email info@ror.org; that goes to me. We're on Twitter at @ResearchOrgs. And we're encouraging as many people as possible to get involved in this effort. So if you're interested in organizational IDs and affiliations, in disambiguation, in data governance, please come talk to me. I would love to talk to you. Thank you.