Hi everyone. Thank you all for joining us today to learn about OpenOrgs. We're really excited to present this service, which is so vital to data quality, and its newly launched website. Today we'll start off by presenting OpenOrgs, the rationale behind the service, and why it's important for the OpenAIRE Graph's data quality. After that we'll run through the new website and then break down how you can participate and contribute as an OpenOrgs curator yourself. Then we'll finish off with 15 minutes of Q&A at the end. We kindly ask you to keep your microphones muted, but in this case I don't think you have a choice. We'll open the floor at the end for the Q&A, where you can either raise your hand or post in the chat or in the Q&A section. With that said, I'll pass the floor over to Ivana to get us started.
Hi everyone, and thank you for coming. Just let me share my screen. Do you see it now?
Yes.
So thank you for the introduction. I'm presenting today together with my colleague Martina. We'll get to what we will talk about in a second. Let's go through the agenda for today. First off, we're going to talk about a pretty common issue: the challenges that come with organization affiliations. Next, we're going to talk about solving this problem through OpenOrgs and give an overview of the service itself, what it deals with, and how you can benefit from it. Then we're going to skim through, as mentioned, our new OpenOrgs website. We'll navigate the site together, pointing out key features and how you can make the most of it. Then my colleague Martina takes over, and it's all about you and our curators. We'll talk about how you can pitch in and help, because community expertise is the backbone, you could say, of what makes platforms like OpenOrgs work so well. We'll also delve into why your knowledge and skills are invaluable and how you can share them with the community.
At the end we'll also share an invitation to an OpenOrgs training, which is a great way to learn more and get comfortable with how you can contribute. And finally, as mentioned, we'll wrap up with the Q&A session and some concluding remarks. This is your time to ask questions and share your thoughts. Other than Martina and me, I think our colleague Claudia from CNI is also here. She's more knowledgeable, we can say, about the technical side of OpenOrgs. So if you have questions in that domain, Claudia will be here to answer them. So let's start. Let me begin by emphasizing the catchphrase that has become synonymous with OpenOrgs. It has featured in past webinars and sprints, and also on the portal I will showcase later: bridging registries of research organizations. We can say that this title captures the core message of the service itself, which acts as a tool for creating connections between existing initiatives that maintain the identities of organizations involved in the research ecosystem. OpenOrgs addresses a critical need within the OpenAIRE infrastructure, offering added value to other OpenAIRE products. Essentially, it's a tool for managing the disambiguation of organizational entities and enhancing the quality of the metadata that describes these entities. So let's address the problem at hand: the ambiguity of information, which significantly impacts the quality of the data that underpins the scholarly communication system. Specifically, regarding organizations, information is scattered across various data sources, leading to inconsistent representations through legal names, full names, acronyms, short names, aliases, etc. Moreover, organizational structures, such as the division into branches, faculties or departments, often appear unclear, further complicating clarity across data sources.
This ambiguity poses a substantial challenge, making it imperative to disambiguate this information to ensure the efficiency of the scholarly communication system. It not only affects the entire system, but also complicates the development of robust services essential for advancing open science and science in general. So why do we need this unambiguous, clear information? For example, clear information is crucial for researchers, who need search and discovery services that facilitate the findability of research outputs. Research performing organizations require services to aggregate and showcase their scientific production. Similarly, research funding organizations need monitoring services to have consistent information on the impact generated by their funding. And in general, all stakeholders want open science services that are functional and up-to-date, which highlights how critical this disambiguation issue is. Within OpenAIRE, organizations are introduced through the OpenAIRE aggregation subsystem, which gathers information about organizational entities from a diverse range of data sources, each presenting the data in its own way. On the left side of the screen, you'll notice names such as OpenDOAR, re3data, CORDIS, GRID, ROR, and others, which are key contributors of organization lists to OpenAIRE. Because we are collecting information from many different sources, it is not easy to determine whether the information refers to the same organization. Organizations are mentioned in various ways: as responsible bodies for managing data providers, as managers or beneficiaries of projects, and as affiliations of researchers in the full text of articles and papers. However, harvesting data from these sources presents challenges, such as the lack of common identifiers across all initiatives, despite the existence of IDs for organizations. This often results in the same entity being represented differently across sources.
Just as an example, here is how this looks in OpenOrgs once it is already curated. We can see all of these different versions of the same organization that come from various sources. So here we have OpenDOAR, National Science Foundation, Ruđer Bošković Institute Library, this one is from the project database, etc. These are all the curated duplicates of the same institution. You can see that different names can be used: names in English, names in the national language, etc. So the primary issue in reconciling this data is disambiguating and finding duplicates among these mentions of organizations. OpenAIRE began addressing these issues through an automated system that, despite being configurable to various criteria, faced uncertainties due to data incompleteness. The current matching algorithm groups organizations by legal name and website URL, then performs pairwise comparisons within each group. However, a significant challenge is that comprehensive information is often lacking across different data sources. Among the fields that describe an organization, only a couple of characteristics can be used to drive the matching. Namely, as you can see in this table, the legal name is basically always present as a strongly distinctive characteristic, along with the website URL. In many cases, other information is not available. As a consequence, an automated mechanism cannot really decide on its own whether two mentions of a given university, research institution or school are the same or not. So even if a large portion of the data is correctly disambiguated, the uncertainties start to harm the quality of the results in the end. These inaccuracies or limitations in disambiguation pose several issues, such as generating false positives that then skew statistics and affect the reliability of searches in the OpenAIRE portals.
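As a rough illustration of the "group first, then compare pairwise within each group" idea described above, here is a minimal Python sketch. It is not OpenAIRE's actual deduplication code: the field names (`legal_name`, `website`), the grouping keys, the `0.85` threshold, and the use of `difflib.SequenceMatcher` as a similarity measure are all assumptions made for the example, and the website URLs are placeholders.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations
from urllib.parse import urlparse


def normalize_name(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", (name or "").lower()).split())


def url_host(url):
    """Return the host of a website URL without a leading 'www.', or ''."""
    if not url:
        return ""
    if "//" not in url:
        url = "//" + url  # let urlparse treat a bare domain as a host
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def blocking_keys(org):
    """Yield the (hypothetical) keys used to group records before comparing."""
    tokens = normalize_name(org.get("legal_name")).split()
    if tokens:
        yield ("name", tokens[0])
    host = url_host(org.get("website"))
    if host:
        yield ("host", host)


def suggest_duplicates(orgs, threshold=0.85):
    """Return index pairs that a curator should review as possible duplicates."""
    blocks = defaultdict(list)
    for index, org in enumerate(orgs):
        for key in blocking_keys(org):
            blocks[key].append(index)

    suggestions = set()
    for members in blocks.values():
        # Pairwise comparison only inside a group, never across the whole set.
        for a, b in combinations(members, 2):
            name_a = normalize_name(orgs[a].get("legal_name"))
            name_b = normalize_name(orgs[b].get("legal_name"))
            host_a = url_host(orgs[a].get("website"))
            host_b = url_host(orgs[b].get("website"))
            same_host = bool(host_a) and host_a == host_b
            similar = SequenceMatcher(None, name_a, name_b).ratio() >= threshold
            if same_host or similar:
                suggestions.add((a, b))
    return sorted(suggestions)


# Placeholder records; the URLs are invented for the example.
orgs = [
    {"legal_name": "University of Zagreb", "website": "https://www.uni.example.org"},
    {"legal_name": "Zagreb University", "website": "http://uni.example.org/"},
    {"legal_name": "University of Split", "website": "https://split.example.org"},
]
print(suggest_duplicates(orgs))
```

With only a name and a URL to go on, a sketch like this can only produce suggestions. Two records that share a name prefix but nothing else are left alone here, while a stricter or looser threshold would change which pairs are flagged, which is why the final decision is left to a human curator.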
If our data isn't precise, we might miss some important details. For instance, if we are focusing on just one organization in OpenAIRE, we might not see all the services or products it offers. Similarly, if we try to look at the big picture and consider all institutions in a country, we might not capture all the relevant services or products for each of those institutions. Also, when using a search tool like OpenAIRE Explore to find information about a specific institution, we might end up with several entries that in reality refer to the same place. This makes it hard to get a clear and accurate understanding of what we're searching for. On the other hand, false negatives can cause similar issues, for instance when different aspects of the same organization, like its research papers and projects, are treated as if they belong to separate entities. So let's say the project belongs to the University of Zagreb and the research papers belong to Zagreb University. Essentially, this means we might overlook that all these aspects are actually part of the same organization, because in some cases it appears under a different name, even though it's the same. The visuals here represent the previous approach in OpenAIRE. Initially, there was a direct link between the deduplication system and the OpenAIRE portals. This means the results from the deduplication system would automatically show up in these portals. However, due to the problems with accuracy and reliability I mentioned, the decision was made to remove this direct connection, and that's when OpenOrgs was introduced. This system acts as a middle layer, we can say, between the deduplication system and the public portals. As you can see here, within OpenOrgs the metadata curators play a crucial role. They review and mediate the information produced by automated algorithms before it is shown to users on the OpenAIRE portals. This change aims to improve the quality and reliability of the information that users see.
So, as we have seen in the previous slides, OpenAIRE services benefit from organization disambiguation. The services most affected by the problem of metadata ambiguity are the ones you can see here: OpenAIRE Explore, Monitor, Connect, and the Open Science Observatory. For these services, the curation work can be really helpful because the organizations resulting from the deduplication and enhanced by user feedback are indexed and then exposed by the OpenAIRE portals. In summary, we can say that the process of organization disambiguation significantly enhances the effectiveness and utility of the OpenAIRE services by ensuring that data and metadata are accurate and clearly attributed. This in turn supports better research outcomes, better interlinking of research outputs, enhanced discoverability, more effective monitoring, and the advancement of open science initiatives. To recap and give you an overview, OpenOrgs is essentially based on three main activity pillars. The first is the automated approach. We have, as we said, an algorithm which detects similarity between organizations, but that similarity has to be accepted by a curator. So the second pillar is the manual management of duplicates. This enables curators to manage these duplicate suggestions by either confirming or denying them, recognizing that some decisions can only be made by humans. And the third pillar is metadata curation, in which the curators can enrich the quality of the data and improve the findability of organizations within the large space of data that OpenAIRE has developed. So, that's it for the short intro. Let's hop on to the new portal, openorgs.openaire.eu, and I will show you what it offers.
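The first two pillars recapped above (an algorithm suggests, a curator confirms or denies, and only reviewed results reach the portals) can be sketched as a tiny state model. This is only an illustration under stated assumptions: the class, the status names, and the `publishable_merges` helper are hypothetical and do not describe the actual OpenOrgs data model.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    SUGGESTED = "suggested"   # produced by the automated matching
    CONFIRMED = "confirmed"   # a curator agreed the records are duplicates
    REJECTED = "rejected"     # a curator decided the records are different


@dataclass
class DuplicateSuggestion:
    org_a: str
    org_b: str
    status: Status = Status.SUGGESTED

    def review(self, is_duplicate: bool) -> None:
        """Record a curator's decision; each suggestion is reviewed once."""
        if self.status is not Status.SUGGESTED:
            raise ValueError("suggestion has already been reviewed")
        self.status = Status.CONFIRMED if is_duplicate else Status.REJECTED


def publishable_merges(suggestions):
    """Only curator-confirmed pairs are passed on to the public portals."""
    return [(s.org_a, s.org_b) for s in suggestions if s.status is Status.CONFIRMED]
```

The point of the design is that nothing moves from the automated step to the public view without passing through `review`, which mirrors the "middle layer" role described earlier in the talk.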
Just a heads up: although there are all of these complicated algorithms behind it, in the world of OpenAIRE, OpenOrgs is a small, rather simple service, and as such we didn't want to complicate things much on the portal itself. We still wanted to give you all the info and also provide some sort of everyday online support for our curators. So, when you first land on the page, this is what you will see. As I already explained, the "Bridging registries" catchphrase has become synonymous with OpenOrgs since the beginning, so we wanted to showcase it here also. Right in the hero section, you have two buttons for the two, I would say, most important things: to either explore OpenOrgs and start curating, or become a curator. We'll go over these a bit later. Other than that, we also wanted to showcase who the key beneficiaries of the service are. We went over those at the beginning, if you remember, when we talked about who actually needs unambiguous and clear information. So, just to recap, the key beneficiaries are researchers, RPOs, RFOs, and in essence the entire research community. Then we have a section named "Why OpenOrgs", where you can find information about OpenOrgs capabilities and benefits, including how it enhances the discoverability of research organizations, how it integrates different organization registries, resolves duplicates, curates metadata, and features this curated information on the OpenAIRE portals. The next part is very dear to me because it showcases that OpenOrgs doesn't just talk the talk, it walks the walk. Here you can explore case studies which illustrate the impact OpenOrgs has on research collaboration and the global research landscape. These examples highlight the tool's capabilities and the difference it makes. By clicking on them, you can explore real cases from Serbia, Greece, and Cyprus. So, let's just click on one of them.
If we scroll through here, you can read all about it: get an overview, see what the challenge and scenario were, how they implemented the solution, and the impact it had on their community, along with details and screenshots from the service. Okay, let's go back up and check out the menus. On the About menu, two main articles pop up. One is about OpenOrgs in general, the problem statement, etc., and the other is about metadata curation. Let's check out the first one. Here we have the main problem statement and how OpenOrgs works to solve that problem. It explains that OpenOrgs works by using automated workflows and curator feedback to deduplicate and manage organizational records. So, this is basically the magic that happens behind OpenOrgs, but explained in a simple manner. For those of you eager to delve into the nuts and bolts of OpenOrgs, the data curation section offers a deep dive into the processes and practices that ensure high standards of data quality. We have a few things here, but overall this section of the website highlights the community-driven nature of OpenOrgs, underscoring its role in enhancing the OpenAIRE Graph with clear and accurate data to elevate research quality. The things that pop up on this page immediately are the benefits we get from data curation. Martina will mention them later, but let's briefly go over them. The curation work in OpenOrgs plays a significant role in enhancing the functionality and accuracy of the Graph, contributing in several ways: data quality and accuracy, enhanced discoverability, interlinking of research outputs, compliance and reporting, facilitating research assessment and metrics, and supporting the open science movement. Next here we have a section on how you can help and on the roles and responsibilities of data curators. Then there is the curation methodology, and these are the activity pillars we talked about earlier. And lastly, an overview of the metadata that is available for curation.
So, you can enrich organizational entities by adding the official name and type, geographical location, acronyms, aliases, identifiers, URLs, and relations. Relations are very, very valuable because you can define hierarchies between organizations, such as departments, institutes, subunits, etc., clarifying the structure and enhancing the understanding of the organization's composition and scope. On the About menu, we have a few more articles. The website provides a statistical overview of the service's reach and impact: over 70 curators from 24 countries and more than 100,000 curated organizations across 200 countries and territories are numbers that speak to the, I would say, vibrant community and collective effort driving OpenOrgs forward. The website also introduces you to the team behind OpenOrgs. Here you can meet the service managers, some of the data curators who are also actively involved in the development of the service, and all the people who actually made OpenOrgs, such as developers, data engineers, and data scientists. We also introduce you to the newly formed curation board, which is a dedicated team with one goal: to enhance the quality and reliability of research organization metadata globally. Beyond data curation tasks, the board actively engages with the OpenOrgs community, providing feedback, answering questions, etc. For new curators joining the platform, the curation board provides training resources, guidelines, and ongoing support to ensure that they are all well equipped to contribute effectively. So whether you're a potential curator looking for guidance or a researcher with questions about our data, the curation board is here to ensure your OpenOrgs experience is both rewarding and impactful. Okay, let's go to another tab in the menu, called Support. Here you have a few things to go over. Let's look first at the supporting material.
As of now, you have six tutorials available here, covering different aspects of data curation. First we have an introductory tutorial called "Understanding OpenOrgs". Then we dive deeper into metadata enrichment and curation, approval of new orgs, curating duplicates, resolving conflicts, and establishing relations, and more will be added soon, of course. Other than this, whether you're new to data curation or an experienced curator, our glossary section explains key terms used throughout the service, making sure you have a solid understanding of our processes and objectives. Of course, there is also a frequently asked questions section where you can find the answers to most of your questions. We have a general section and also a section about the curation process. Here you can find the answers to questions like "What qualifications do I need to become a curator for OpenOrgs?" or "Can I contribute to OpenOrgs if I'm not a professional curator?", etc. Martina will go over these too. It is important to note that the materials in this menu will be updated regularly. This is something that evolves over time. Of course, we plan on updating the application itself, and with that will probably come new materials and new functionalities, and the answers will be updated as new questions arise, and so on. In the menu, we also have this Publications section, which leads to the new OpenAIRE Graph site. This is something for those who want to know even more. Here you can find the articles on deduplication and read more about it. And of course, we have a helpdesk here with a contact form where you can submit questions and comments, which will then be sent to our OpenAIRE helpdesk. Another thing you might have noticed in the menu is "How to become a curator". This is something we will go over in detail, but just know that here you can reach out to us through this form if you want to join the team, and we will get back to you as soon as possible.
And the cherry on top: let's go back to the home page for this one. To explore the service itself, there are two buttons. One is here in the hero section. The other one is in the menu. Click on it and it will lead you to the sign-in page. If you enter your sign-in details, you will be directed to the OpenOrgs service. If you are not yet a curator, you can send a request to become one through this sign-in as well. The interface of the application itself will change in the near future, and we hope to make it consistent with the service website we've just seen. This is where my tour ends. Martina, the floor is yours.
Thank you, Ivana, for the presentation so far. And hello to everyone from me. My name is Martina and I am a data curator and part of the newly formed curation board, as Ivana has mentioned. I'll be talking to you about the role of data curators and how you can become part of our OpenOrgs community. First, let's start things off by exploring how you can help. As data curators, you play a vital role in curating and enriching metadata for organizations, ensuring that information is accurate and comprehensive. This involves identifying and resolving duplicates, both those suggested by the algorithm and those identified manually. You also establish parent-child relations to create organizational hierarchies; approve organizations, which adds them to our database and gives them a stable OpenOrgs ID; add new organizations to the database; document your insights in the notes section; and resolve conflicts through either merging or distinguishing organizations. Now, there are two curator roles in the OpenOrgs community. The first one is the user. As a user, your scope of data curation is typically limited to a set of countries.
As a data curator, you are editing and enriching the metadata of organizations, you are approving suggestions from the automated duplicate identification, and you are potentially creating new organizations, which then depend on national admin approval. The second one is the national admin. The national admin curator role has similar responsibilities, but with the additional tasks of giving final approval to new organizations, managing user access within their nation, resolving conflicts, making decisions on suggested organizations, and finalizing their approval status. Now, why do we need you, and what are the benefits of curating within OpenOrgs? Your expertise is unparalleled, especially when it comes to knowledge of your country's research landscape. Your contributions are vital in enhancing the OpenAIRE platforms, ensuring that shared information is thorough and accurate. Your input also helps maintain up-to-date and precise data about research organizations. It increases the visibility of research outputs, projects, and datasets, and it also simplifies adherence to open access regulations, which in turn promotes greater transparency and accessibility in research. Next, I'll be talking about the skills of a data curator. As a data curator, you will need expertise in research organizations within your chosen country, as well as awareness of major research institutions and universities, and familiarity with national or regional research information systems. Other than these specialized skills, as we have labeled them, there are some general skills that we think a data curator should have: attention to detail, strong data management skills, analytical skills, technical proficiency, and problem-solving skills. If all of this resonates with you, we encourage you to join our team as a data curator.
As Ivana has mentioned, the process of becoming a data curator, if you're interested in it, is pretty straightforward. As Ivana has shown you, there is an online form which you fill out to express your interest in becoming a data curator. After that, you will have a one-on-one meeting with a member of the curation board, so either Ivana, Bojan or me. In that meeting, we will discuss your role as a data curator in more detail and answer any questions that you may have. After that meeting, you will receive comprehensive training on how to use the OpenOrgs platform effectively. You'll sign a volunteer statement outlining your rights and responsibilities. And once you are accepted, you will be ready to start contributing as a valued member of our data curator community. Now, if what Ivana has said and shown you, and what I have said about being a data curator and becoming one, has sparked your interest, we also invite you to join us for an engaging training session on OpenOrgs. The training session is scheduled for April 15th, from 2:30 to 4:30 Central European Time. This session will be different from this webinar. You will gain a deeper understanding of how OpenOrgs operates, you will learn about the workflow involved in data curation, and you will participate in a hands-on session where you can ask questions that are tailored to your country's context. And this is it from me so far. We have already had some Q&A, but if there are any more questions, we can answer them now.
Perfect. Thank you so much. Yeah, we do have another question or two. And now that the Q&A session is officially open, feel free to also raise your hand. If anyone has a question, you can type it in the chat or the Q&A, or simply raise your hand.
So first off, we have: "Will OpenOrgs also have API interfaces to query and use all the collected data?" This is from Jordan Piszczak. Yes, Claudia, thank you.
I think I can answer that. For the moment, no, but it is indeed in the plan to support programmatic interfaces for returning the added value built inside OpenOrgs thanks to the disambiguation activities. I cannot provide a timeline yet; that has to be evaluated and prioritized against the many other activities OpenAIRE is involved in. But yes, this is the idea.
Anyone else in the audience, if you have any questions, feel free to raise your hand. A reminder that this is recorded. What we can also do is send the registration link for the curation training webinar here in the chat. That way, if you didn't have a chance to get the QR code, you can have it now. So I'll just share that. If you're interested in becoming a curator, we have the training session, as Martina said, on the 15th of April from 2:30 to 4:30 CEST. And here is the registration link. There's a question here: "It sounds very time consuming. How is the experience of current data curators at this scale?"
I'm a data curator and it is time consuming. But once you get into the rhythm of curating, it becomes easier and easier. We will explain more about how to curate and what the workflow is in the training session. But once you know what types of organizations there are, how the duplicates work, and how the suggestions work, I think it becomes easier and easier. I don't know if Ivana has anything to add to this.
I agree with you. I mean, nobody expects you to work on it eight hours a day. It depends on your schedule.
Great. Thanks. The next question is: "If we indicate in OpenOrgs that some records of one Zenodo community are affiliated to an organization, will future records of that community automatically be linked to that organization?"
So this is about linking OpenOrgs to Zenodo communities, and the same goes for other institutional repositories. Maybe this is something Claudia can answer?
Hello, hi everyone. Can you hear me?
Yes.
So to answer that question, the linking of the affiliation information coming from Zenodo depends on the availability of such information in the metadata record that Zenodo delivers, not only to OpenAIRE but to everyone, through the OAI-PMH export format. Let's imagine that Zenodo exports the ROR identifier of the organization indicated as an affiliation of the authors of a given deposition. Then OpenAIRE is able to capture that ROR identifier and use it to assign all the products to that specific institution. This is how the integration works. However, if OpenAIRE is aware of multiple variations of the same institution, then some curation might be needed to disambiguate them. In general, in Explore, in this case, those organizations would appear as multiple occurrences. So until the curation is done, it might not be straightforward to search for the organization and identify it in Explore right away. It might take some time to perform the curation and update the index, but that's how it works.
Okay. This next one was already answered in the Q&A, but just so we have it recorded for those who weren't able to make it and are watching the recording: "When you become a curator, is that only for your own organization?"
No, it's not only for your organization. You will get the rights to curate all of the institutions in your country, so you can curate all of them. But yes, if you want to curate only your own organization, you can.
Thank you. All right, so another one. We have: "What's the benefit of curating data that's not from your own organization, nor linked to it in any way?"
Let me take this one. The benefits are, I would say, global, because it also benefits your own country.
You know, you work for a higher goal, and all of the research outputs get mapped and attributed to the right organizations, and things like that.
Yeah. I can support what Ivana said, or give another angle. It would in general improve the overall quality of the data so as to better support its discoverability. Maybe today some organizations are not linked to research outputs, but maybe tomorrow they will be. Perhaps the platforms that host the products from that institution do not support exposing affiliation information today, but maybe tomorrow they will. So it's important to disambiguate the information anyway, to leverage the persistent identifiers.
Thank you. And then we have another one. Julie, I think you said you can respond to this one: "Is there any relation between OpenAIRE and some publisher repositories, such as Elsevier?"
Yes, I can answer. In general, in the OpenAIRE Graph we are making data exchange agreements with publishers and editors in order to get metadata from them. This is part of our agreements and of the transparency that we have in the OpenAIRE Graph. Other than that, we don't do that much. I don't know if Claudia would like to comment, or if that's fine.
Actually, I think you summarized the situation. As far as I know, OpenAIRE is only collaborating with some publishers to experiment on some research activities, but that's it. We are getting some metadata and some full texts from them, but that's it.
All right. So that looks like all the questions in the Q&A and the chat. Are there any other questions? I don't think I see any hands raised.
I can also add to what was said before about how time consuming the activity is. What Martina said is true, but we are also organizing two sprints per year to help you do this together. So it's also going to be a community activity, and we would like you to be part of these curation activities. So don't feel that it's just a burden on you.
It's actually something that can benefit your country. You will have more visibility. We will also expand the section of the website about the curators' activities and what they look like. So be part of our community and don't be afraid of it.
Perfect. Thanks. Oh, sorry, I thought I was muted. It looks like we have another question coming through: "If national curators will do it for all other institutions in the country, how many curators will be accepted per country?"
I don't think we have a general consensus on that. At this point, whoever wants to join and can join, we encourage you to join.
The next one asks how it will be organized. I think the question is: how will the curation be organized?
Oh, so who will curate which data? This is something we leave to the teams in each country. Whenever we get a request for curation and there are people from that country already curating, we send them an email saying who is who, everyone who's in this country and who's a curator, and it's up to them what they will curate. It's up to their agreement.
Can I add something, again to give another angle? I think different countries might require different levels of effort to perform the curation. Some countries have a lot of research performing organizations and might require a more structured approach, and finding someone who is actually knowledgeable about how to address the disambiguation task in different languages can be challenging. My perception is that once a given country has a certain base of curators, the time will come to organize some meetings, maybe on how to shape the curation activity per country. Some approaches could also be reused in countries that feature similar characteristics, maybe small to medium by number of organizations, but large countries might require a dedicated, more structured approach.
But this is up to the specific curation board, so those who do enroll to perform the curation, to organize. Of course, OpenAIRE will be there to support and guide them, but we'll see.
Consider that, by now, to whoever is requesting a Monitor dashboard for their own organization, we are giving the opportunity to curate the data, so there will be more curators and more curated dashboards. But we are also open to members. If you would like to contribute to the data in OpenAIRE and also have an active role in OpenAIRE, you can contact the NOADs of your country. It requires you to also be a member of OpenAIRE, and this will give you extra visibility and an extra opportunity to co-curate and co-design the services with us.
Thank you. "Will there be a public list of curators?"
No, but we'll think about it. For now, no, it is not public.
So it looks like someone wants to know whether there's already a curator: "How do we know if there's already a curator in our country?"
For now, you can reach out to us at, I will write an email address here, OpenAIRE admin, and we'll tell you if there is anyone already curating for your country, of course.
Anything else? So a reminder: the webinar is being recorded and we'll upload it to the OpenAIRE YouTube channel. Also, I think we have the list of everybody here, so we can email you those links as well. That way you can go back and watch, and also see where to find certain things on the website and who to contact when. So, no final notes? Anything? It looks like not. Okay, so I guess we'll give you four minutes back, so you have four minutes before your next meeting to get some more coffee. Thank you everyone so much for joining us. It was great presenting OpenOrgs to you all, and thank you again to our presenters, Martina and Ivana, for giving us such a good walkthrough. It looks like we had some good comments on the website as well. People said they liked the interface and that it looks very user-friendly, so congratulations.
Job well done. All right, all right, well I think we'll leave it there. Thank you again everybody, and have a wonderful rest of your day. Thank you everyone. Thank you, bye.