Okay, hi everyone, thank you for joining us on this community call. Today we have several things to cover. If you move to the next slide, please. We're going to give a brief intro to the Monitor service, in case anyone here is new to us and doesn't know about the service yet. Then we're going to discuss the World Bank Monitor dashboard. In particular, Leonidas from OpenAIRE is going to tell us how the entire pipeline goes from PROVIDE to Monitor. And we have with us Katie Bannon from the World Bank, who went through that process with Leonidas; she'll discuss her experience, what they expect from a dashboard, and so on. At the end I'm going to close quickly with the new indicators that we have in the dashboard and the ones that we're working on for the fall.

Okay, so let me start. The idea of offering a monitoring service, and why a research performing organization would like to monitor, is so that they can make decisions in a way that is evidence based, timely, consistent and replicable. In order to do that, the first step is to, as the ancient maxim goes, "know thyself": to understand the resources that you put in, the output that comes out, the collaborations, the visibility, the impact you have in different aspects of interest, such as one that is of a lot of interest lately, the open science uptake. Tracking all these measures leads to an understanding of how things work in your organization, especially as it relates to everyone else. The impact pathways become clearer, opportunities for growth and improvement emerge, and in general insights are gained from the large world of data that is available to us. This allows the organization, in the end, to position itself, make decisions, and invest in research activities in an efficient and profitable way. Besides decision making, I have also added reporting and storytelling, because, although they are not the main goal, these are two additional capabilities that are required from a monitoring service.

So why OpenAIRE Monitor in particular? Here we emphasize the fact that OpenAIRE is built by the community: we work very closely with the community to develop indicators and provide things that are relevant. And this nice quote on the side, "not everything that can be counted counts, and not everything that counts can be counted", I love it because that is really the approach: when you have so much data available, you have to pick and choose what to show, and in which way, so that it is meaningful. It is also about open science, open data and open methodologies, so that whatever we do is transparent, replicable, clear and inclusive. And the OpenAIRE Graph, which is what Monitor is based on, aims to have a very wide coverage of the scientific domain, linking different research products with projects, organizations and so on. It is fully embedded in the EOSC infrastructure. I briefly mentioned this before, but the approach that we take, can you go one back please, thank you, holds both in the way the indicators are built and, in fact, in the entire pipeline, and Leonidas will emphasize this in a bit.
So the idea here is that the tracking and evaluation of research activities, and of open science uptake if that is of interest to a particular organization, needs to be data driven and relevant, so we work very closely with the community to make sure that we pick data aspects of interest; comprehensive and granular, so you can both see the entire picture and zoom in on particular areas of interest; automated and timely, of course; sustainable, so we can keep tracking in the same way, repeatedly; and trustworthy, which comes from the openness, transparency and replicability of the methodology and the data sets. And this leads to our dashboards. Next slide, please. There are actually two monitoring services, OpenAIRE Monitor and the Open Science Observatory; I just want to mention the Observatory as well. Monitor has three types of on-demand dashboards: for institutions, funders and research initiatives. Each of them is tied to a very friendly OpenAIRE expert, who populates the dashboard automatically from the OpenAIRE Graph, but then works with you closely to guarantee the data quality of what is shown in the dashboard, the indicators and so on. And that's it for the introduction; now I'll pass the floor to Leonidas. Thank you.

Yes, thank you, Joanna. Before I start, I would also like to thank Katie Bannon from the World Bank, who was kind enough to join us and share her experience of the collaboration in creating and also evaluating the dashboard for the World Bank. I will continue the community call by presenting a few details regarding the Monitor dashboard, and especially the data quality that we check and evaluate with you, in order to have the best result in the institutional dashboard. It's important, that's why I'm noting it, to have a good and constructive collaboration during this phase, while we create the dashboard and also check and validate the data quality. That's why I'm displaying this as two gears: we work together, we collaborate, in order to build and maintain the Monitor dashboard. And that is what we have done with the World Bank, as with every institution.

Okay, let's start and see the big picture. In OpenAIRE Monitor we have the institutional dashboard, and this is the workflow, the pipeline, that starts from the bottom and goes to the top in order to reach the institutional dashboard. As you can see, at the bottom we start from the OpenAIRE Graph, where we have the data that has been ingested for the specific organization. This is the part we will focus on in this presentation from now on. So what is the OpenAIRE Graph? It is the basic infrastructure of OpenAIRE, where we get all the metadata and the information in order to produce the output in the Monitor service: a collection of metadata that describes objects, research products, in the research life cycle, and of course the relationships among them. We collect information from around 70,000 trusted data sources. This metadata is further enriched with other metadata and links that are provided by end users, who can provide links from scientific products to projects, funders, communities or other products; enriched by full-text mining algorithms running on full texts from open access articles; and also from research infrastructures and scholarly services that are bridged to the graph via OpenAIRE.
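As a concrete illustration of working against the Graph, here is a minimal sketch of querying the public OpenAIRE search API from Python. This is not an official client, and the parameter choices shown (a free-text `keywords` filter rather than a disambiguated organization identifier) are assumptions for illustration; verify names against the current API documentation before relying on them.

```python
# Minimal sketch (not an official client): query the public OpenAIRE search
# API for publications mentioning an organization. Parameter names should be
# checked against the current API documentation.
import requests

BASE = "https://api.openaire.eu/search/publications"

def fetch_publications(org_name: str, page: int = 1, size: int = 10) -> dict:
    """Fetch one page of publication metadata matching the given text."""
    params = {
        "keywords": org_name,  # assumption: free-text filter; a real dashboard
        "format": "json",      # matches on the curated organization id instead
        "page": page,
        "size": size,
    }
    resp = requests.get(BASE, params=params, timeout=30)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    data = fetch_publications("World Bank")
    # Inspect the top-level structure of the JSON rendering.
    print(list(data.get("response", {}).keys()))
```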
This is a slide showing the infrastructure at a glance: the hardware that OpenAIRE uses, the heart of the OpenAIRE Graph, and all the OpenAIRE services that we use in order to provide the Graph to you, along with all our other services. I won't go further into the numbers here; we can look at them in the presentations after the end of the community call.

So, we're talking about data quality. What exactly do we mean, and what can go wrong, for example, in an institutional dashboard? I have a few examples to illustrate issues that we have seen in a few cases in institutions, and we will see the reasons behind them. First of all, we might see relatively small numbers of research products: of publications, datasets, software or other research products. This can happen for several reasons. We might need a disambiguation of the institution's different names. We may have an issue with the records during the aggregation of the institution's repository metadata records; for example, there may be missing mandatory metadata fields, such as the research product type. And of course, the registration of an institution's repositories sets, validates and enriches the affiliation information; the registration is done via the PROVIDE dashboard, and we will get more details in the next slides. The correct affiliations come from the institution's data sources that are registered and aggregated by OpenAIRE. As another example that has come up in a few cases, there was missing funding information regarding projects from the European Commission. This happens because the funders that provide this information must interact with OpenAIRE and provide it for the funded projects and the respective institutions and organizations, in order to have this information linked to the institutions. And of course, there might be a disambiguation issue in OpenOrgs.

So how can we improve the data quality? There are two stages. The first concerns the data sources of your organization before they are ingested into the OpenAIRE Graph. Here we have the registration of the data sources, which is done through the PROVIDE dashboard, including the compatibility check against the OpenAIRE guidelines when you register your data sources. The second concerns the data sources that are already ingested in the OpenAIRE Graph. Here we have the disambiguation, via OpenOrgs, of the different names of the institution or organization, and also the step of improving the compatibility level with the OpenAIRE guidelines, if, for example, the repository was registered a few years ago and is not compatible with the latest version of the guidelines.

"So what are my data sources?", someone from an organization, from an institution, might ask. Well, the answer is pretty straightforward: the institutional repositories, which may contain literature, data, software or of course other research products; the CRIS systems; and any journals that your institution might host. How can I register my data sources? You can register your data sources in OpenAIRE, so that they are ingested in the OpenAIRE Graph, through the PROVIDE dashboard service.
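Before registering, it can help to sanity-check what an OAI-PMH harvester such as the OpenAIRE aggregator would actually see at your endpoint. A minimal sketch using the `sickle` library (`pip install sickle`); the endpoint URL below is a placeholder, not a real repository.

```python
# Sketch: peek at what an OAI-PMH harvester would collect from a repository.
# ENDPOINT is a placeholder; substitute your repository's OAI-PMH base URL.
from sickle import Sickle

ENDPOINT = "https://repository.example.org/oai"

sickle = Sickle(ENDPOINT)

# Identify() returns the repository's self-description.
info = sickle.Identify()
print("Repository:", info.repositoryName)

# List the first few Dublin Core records, as an aggregator would harvest them.
records = sickle.ListRecords(metadataPrefix="oai_dc")
for i, record in enumerate(records):
    print(record.header.identifier, record.metadata.get("title"))
    if i >= 4:
        break
```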
The prerequisites for this registration are that the metadata records of your repository should be exposed via the OAI-PMH protocol, and that they should be compatible with the OpenAIRE guidelines. Why should I register my data sources? Well, first of all, if you are compliant with the latest version of the OpenAIRE guidelines, then you will automatically be onboarded to the EOSC portal, catalogue and marketplace integrated platform. Additionally, your metadata will have improved interoperability, meeting the latest IT and repository standards. Your content will be more contextualized, with links and relationships among all the research outcomes and entities; more flexible, by using different and improved vocabularies; and embedded in the R&I ecosystem, in line with the open science mandates and standards. There is also support for well-established metadata schemas. Of course, registering your data sources and being compatible with the OpenAIRE guidelines is the road to FAIRness: if you are compliant with the OpenAIRE guidelines, then you are also FAIR enough. And finally, you will have accurate and qualitative metrics in the Monitor service and the institutional dashboard. The common metadata framework for exchanging minimum metadata information through the OpenAIRE guidelines provides orientation to organizations in defining and implementing their local data management policies according to the requirements of OpenAIRE. And through the OpenAIRE guidelines you expose your research products via the OAI-PMH protocol in order to integrate with our infrastructure.

OpenOrgs. OpenOrgs is a platform, a service of OpenAIRE, for disambiguating organization names. First of all, we should say that the information about organizations is distributed across all the data sources that OpenAIRE collects metadata from, and it is ambiguous: an organization may appear with different names, legal names, acronyms, short names, alternatives and others. And the organization structure may not be so clear: faculties, departments, different branches appear. The disambiguation of this information is performed through OpenOrgs, where you can deduplicate the organization names and also identify the parent-child relationships of your organization. The activity pillars of OpenOrgs are the following. First, the automatic suggestion of duplicates: when you enter the platform, it produces new suggestions from the various sources registered in OpenAIRE that carry your organization's name in a form different from the existing one (a toy sketch of the idea follows below). Second, you can enrich the metadata description of your organization entities, improving discoverability. Third, you can manage the duplicates, a task that cannot be performed by machines; only humans can carry it out precisely. Of course, we are here to assist you: we have the PROVIDE dashboard validation and registration guide, our support email, and we can have dedicated calls with our experts for any information or help you need in each phase of the process. So, thank you very much. Now I'll give the floor to Katie Bannon from the World Bank.

Thank you so much. I think you were going to share my presentation. Yes.
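OpenOrgs automates duplicate suggestion itself; purely to illustrate the idea, here is a toy sketch of name-variant matching using only Python's standard library. This is not the OpenOrgs algorithm, and the names and threshold are invented.

```python
# Toy illustration of duplicate suggestion for organization name variants,
# in the spirit of what OpenOrgs automates (NOT the actual OpenOrgs algorithm).
from difflib import SequenceMatcher

canonical = "International Bank for Reconstruction and Development"
variants = [
    "World Bank",
    "The World Bank Group",
    "Intl. Bank for Reconstruction & Development",
    "University of Belgrade",
]

def similarity(a: str, b: str) -> float:
    """Normalized string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Suggest likely duplicates above a threshold; a human curator makes the call.
for name in variants:
    score = similarity(canonical, name)
    if score > 0.6:
        print(f"possible duplicate ({score:.2f}): {name}")
```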
So while you're pulling that up: my name is Katie Bannon, and I work at the World Bank. For those of you who don't know about the World Bank, we're a large multilateral institution that lends money and generates knowledge for development in all our member countries, which is many countries around the world. I work in the IT department; we use more the "I" part of IT. We manage the libraries, archives, digital publishing, records management, and also implement a lot of the access to information program. I have two slides: the first gives a little bit of context, and the second jumps into some of the details that we've run into while trying to build our dashboard.

First of all, for us, why the OpenAIRE Monitor dashboard? The main reason is that we're just beginning to explore how open science could apply to the World Bank. The Bank has invested a lot of money in open data and open access to information, so open science is the big thing we're trying to understand: how are we doing compared to these open science standards, are we already meeting them, where are our gaps? Based on that information we'll hopefully be able to decide how to move forward. Reliable data is really essential to persuading management to make changes in some of our implementations, so we're really hoping the dashboard can provide clear, reliable data about how we're doing currently. In terms of connecting the dashboard to other institutional initiatives: the World Bank Group has a data roadmap, which is reported on annually to senior management, and one of the indicators on that roadmap, just added last year, is the existence of a public open science dashboard. So we're hoping that by the end of this year we'll have something to show. We also produce a public report on the implementation of the access to information policy, and we're hoping we can include some of the data from the dashboard in that report.

Secondly, why OpenAIRE instead of building it in house? At least from my perspective, it's extremely cost effective to leverage the technical and intellectual investments made by OpenAIRE and not rebuild them ourselves; that would take us forever, and we really don't have the staff or the skill sets to do it. The big appeal is the very transparent methodology. It makes the dashboard very defensible within the institution: people might argue about the numbers, and you can always refer them to exactly how those numbers were arrived at. I really appreciate, too, that it's walking the talk of open science, showing the work and being transparent. And it's always good to have a third party running the monitoring, because it's very hard to self-monitor; you're just inherently biased, and it's very hard to see yourself, or know thyself, I guess. Another big appeal has been OpenAIRE's use of standards. The World Bank does follow some standards, but maybe not all the time, so working with OpenAIRE has really made us pay a lot more attention to these standards, which helps us become more interoperable with others. That's been a big draw as well. And lastly, having comparable indicators. Not that we want to compete with others, but our senior management does seem to respond to seeing how we're doing compared to others, for example for access to information.
There is a group, Publish What You Fund, that creates an annual index ranking development agencies on the transparency of their aid flows. So every year we get to see how we came in, in terms of the ranking, and that's very helpful for spurring action on the part of the institution: you can see whether you're doing well or need to make improvements in a certain area. Having some sort of comparison of how we're doing relative to others is great. Not that we want to compete instead of collaborate, but it's good to have a little healthy competition; it helps institutions move forward, I think. And we really welcome the opportunity of this community to learn from others. We're hoping that others are finding solutions to some of our struggles and that we can learn from all of you. Okay, the next slide.

So, our experience building the dashboard. Really, a huge thanks to Leonidas; he suffered through many, many questions from me, often repeated questions. He's been great at helping us walk through some of these details. These are the sources that we've been trying to integrate into the dashboard. The first is the Open Knowledge Repository, and that's been the easiest because it uses DSpace, which is really set up to be an open access repository. It's registered in the Directory of Open Access Repositories, by default it was already following the OAI-PMH and metadata standards, and the data about its content was already included in the dashboard. The remaining tasks there are really to validate the data, to make sure that everything we have is actually showing up as it should, and to add something on our side so that OpenAIRE can count the downloads that happen in the Open Knowledge Repository. So that was the easiest and the least work to integrate.

The next repository is Documents and Reports. This is a humongous repository, with more than 400,000 documents; it really is the archive of public documents and is our official disclosure mechanism for the access to information policy. We do have an API for it, but it was not aligned with OAI-PMH, so we worked with Leonidas to map our API to that standard; a rough sketch of the kind of mapping involved follows below. At which point we also discovered that only some of those documents had a DOI, so we're now trying to figure out whether all those documents should have a DOI or whether they don't need one. There's also some duplication with the Open Knowledge Repository, which is making us reflect on how to best use these two repositories. And a larger question that's come up is where to draw the line about what should be included in the dashboard, because the Documents and Reports repository has everything: procurement records, project records, board minutes. There's a lot of gray literature that's not necessarily formally peer reviewed but is kind of working knowledge: toolkits, best practices, those sorts of things. So our challenge is really to decide what we should include, what is relevant for this dashboard. I'd be very curious to hear whether others are facing this challenge and how you've drawn that line.
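To make the mapping step concrete, here is a rough sketch of converting one record from a hypothetical internal JSON API into the Dublin Core fields that an OAI-PMH interface exposes. The internal field names (`docTitle`, `docDate`, and so on) are invented for illustration; the real Documents and Reports API will differ.

```python
# Rough sketch: map a record from a hypothetical internal JSON API to the
# Dublin Core terms used in OAI-PMH (oai_dc) responses.
from xml.sax.saxutils import escape

def to_oai_dc(rec: dict) -> str:
    """Render one internal record as an oai_dc metadata fragment."""
    mapping = {
        "dc:title": rec.get("docTitle", ""),
        "dc:date": rec.get("docDate", ""),
        "dc:creator": rec.get("author", ""),
        "dc:identifier": rec.get("doi") or rec.get("url", ""),
        "dc:type": rec.get("docType", ""),
    }
    fields = "\n  ".join(
        f"<{tag}>{escape(str(val))}</{tag}>" for tag, val in mapping.items() if val
    )
    return f"<oai_dc:dc>\n  {fields}\n</oai_dc:dc>"

print(to_oai_dc({
    "docTitle": "Example Development Report",  # made-up record
    "docDate": "2019-10-24",
    "docType": "Report",
    "url": "https://documents.example.org/record/123",
}))
```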
Next, journal articles. We have lots of very energetic researchers at the Bank, and they often submit journal articles and we don't know where they've submitted them; their manager and unit may know, but centrally, as an institution, it's sometimes hard to track. They often submit the author-prepared manuscript to our Open Knowledge Repository, but often they do not. So one of the difficult things is figuring out which articles are affiliated with the World Bank, and that's led us to start looking more closely at the use of ORCID iDs for our authors (a rough sketch of this kind of affiliation lookup appears after this section). We also need to be looking more closely at that open organization registration tool, OpenOrgs, that Leonidas just mentioned, because we really need a clear way to associate our content with the World Bank from a metadata standpoint. Another thing that's come up is that sometimes there's a final article published in, say, a paywalled journal, while the preprint was published in our Open Knowledge Repository, but there's not really a link between them. This linkage between pre-versions and the final published article has been confusing to us: how do you link them together, how do you count them? That whole relationship is something we're trying to understand better.

Then the datasets. The World Bank has a development data catalog and has been investing in open data for a long time. They do have an API, but again we needed to do some mapping of that API to the OAI-PMH standard, and that's still a work in progress. I'm also very interested in exploring tools that can automate the assessment of how a dataset complies with FAIR standards. I think OpenAIRE has some tools available, and we're going to explore how to use those to assess the FAIR level of the World Bank's datasets.

Then the research code. A lot of researchers at the Bank use GitHub, and they occasionally use Zenodo, but not too frequently. We're just in the middle of planning how to scale up how we store research code: how to capture it, archive it, and how to align the capturing of that code with, again, these standards, the OAI-PMH standards, so that we can accurately connect it with the datasets and the text research. A challenge for us, as an international organization, was figuring out what license to apply to this research code. We recently reached a decision that we could use MIT and BSD 3-Clause licenses for research code. But this opened up a whole discussion about what exactly code is. There's a very wide range: there are just instructions about how to use code, some people write their own code, there's software, there's an entire range of code. So that's something we're also trying to better understand. Another thing we've been looking at is that NASA recently produced a policy on scientific information, and in that policy they have outlined two big categories of code. I think that's going to be helpful to us in understanding how to treat different categories of code.
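On the affiliation-tracking question for journal articles: one possible approach, offered here only as an illustration and not necessarily what OpenAIRE or the Bank uses, is to query an open scholarly index such as Crossref by affiliation string. Affiliation strings are noisy, which is exactly why ORCID iDs and curated organization identifiers help.

```python
# Sketch: find journal articles whose author affiliation strings mention an
# institution, using the public Crossref REST API.
import requests

def crossref_by_affiliation(affil: str, rows: int = 5) -> list[str]:
    """Return titles of works whose affiliation metadata matches the query."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.affiliation": affil, "rows": rows},
        timeout=30,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    return [(item.get("title") or ["(untitled)"])[0] for item in items]

for title in crossref_by_affiliation("World Bank"):
    print(title)
```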
Other research products. Again, this is similar to Documents and Reports in that we're trying to understand more precisely what should be counted in the dashboard. We have lots of final research products, but we also have lots of raw-material products: digital archives, a photo catalog, maps, these other assorted things that are more raw material than final products. So we're trying to figure out how to best include those. Lastly, the impact. It's going to be great when we're able to measure our impact better. Researchers are obviously very interested to see who's downloading their work, so we're very interested in capturing that information better, but we realized that to do so we need to improve our metadata and make it all capturable in OpenAIRE, so we can accurately measure downloads from these different sources. We're excited to continue working on our metadata in order to give researchers a better view of the usage of their content. I think that's everything I had to say; I'm happy to answer any questions. Maybe there's something in the chat.

Okay, thank you Katie, that was very interesting. And just to add to that: during that process, your team's work with Leonidas and the team has really uncovered many things for us to improve as well, so the collaboration has been very fruitful in both directions. Thank you, that's right, we really enjoyed working with you. Very nice to hear.

Okay, let me wrap up with the new Monitor indicators. I'm not going to spend too much time on them, so that we have time for questions. We recently worked on four groups of new indicators. First of all, just to motivate them: as we said before, we try to capture all the aspects that will allow organizations to make informed decisions. The themes of indicators that we cover, please move to the next slide, are the following. We have funding, where we have indicators for grants and projects. We have research outputs: the different kinds of research products, but also the fields of science they belong to, whether they are interdisciplinary, and so on. Then we have open science, where our target is to cover FAIRness, access rights and access routes, different journal business models, article processing charges (APCs), and Plan S indicators. We have collaborations, both via project participation and via co-authorship, or co-creation; by co-creation I mean, you know, for datasets and so on. And then we have the impact indicators, where right now we're working on downloads and citations, and also the Sustainable Development Goals. Almost all of this is already available in the dashboard, but we're working to finalize everything by September, and then we'll move on to other areas if there is demand. We break these down by different fields of interest, to allow the user a granular, zoomed-in view of what is happening, so as to identify weak spots and really understand and analyze the data.

So the first thing we worked on is composite open science indicators for organizations. We created three composite indicators: the openness score, which is the average share of open access research outputs for an institution; the findability score, which is the average share of research outputs with a persistent identifier; and the FAIRness score, which is the average share of research outputs with complete metadata. In the methodology section you can see what metadata completeness exactly means, which metadata elements we chose. A small sketch of how such scores can be computed follows below.
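To make the three composite scores concrete, here is a toy computation over a handful of made-up records. The exact element sets and weighting in the OpenAIRE methodology may differ (for instance, which metadata elements count as "complete"), so treat this only as an illustration of the definitions above.

```python
# Toy illustration of the three composite scores described above, computed
# over a small sample of research-output records (data is made up).
records = [
    {"open_access": True,  "pid": "10.1234/a", "title": "A", "authors": ["X"], "date": "2021"},
    {"open_access": False, "pid": "10.1234/b", "title": "B", "authors": [],    "date": "2020"},
    {"open_access": True,  "pid": None,        "title": "C", "authors": ["Y"], "date": None},
]

REQUIRED_FIELDS = ("title", "authors", "date")  # assumed completeness criteria

def share(flags) -> float:
    """Fraction of records for which a boolean condition holds."""
    flags = list(flags)
    return sum(flags) / len(flags)

openness = share(r["open_access"] for r in records)
findability = share(r["pid"] is not None for r in records)
fairness = share(all(r.get(f) for f in REQUIRED_FIELDS) for r in records)

print(f"openness score:    {openness:.2f}")     # share of open access outputs
print(f"findability score: {findability:.2f}")  # share with a persistent id
print(f"fairness score:    {fairness:.2f}")     # share with complete metadata
```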
Something of interest here is, for example, the trends for these scores. On the left chart, the openness score is pretty consistent over time for the particular institution we're looking at; however, there are some dips. Using the link to OpenAIRE Explore that we have available from the dashboard, someone could go and look at the research outputs that are closed access and are causing the dips at different points in time, and see what happened there: why, for example, is there such a dip in 2020, when there is such wide use of repositories? That's just an example. On the right you have the trend for the findability score. Here one can see that, although there was a short increase at the beginning, it dipped down again, but now it seems to have reached perfect findability for the last few years at least: all of this institution's research outputs in the OpenAIRE Graph have a PID. We're also working on other breakdowns of interest; now that these indicators are ready, we will work with all of you that have dashboards to see what other breakdowns would be of interest.

Now, as you may know, OpenAIRE has an SDG classification system, and you can see the publications by SDG in OpenAIRE Explore as well. Not the entire graph is classified yet, but a large chunk of it, and we're classifying more and more publications as we go along. This allowed us to start building indicators in terms of SDGs for the dashboard. Next slide, please. The first simple thing is publications by SDG, which you see on the left. For example, this institution clearly does a lot in medical and clinical health care, since good health has a big spike in terms of publications; but it's actually interesting to see that pretty much all SDGs have some coverage. On the right you see a graph of the downloads, and we're including other indicators now, such as downloads per publication and the characteristics of the SDG-related publications in terms of openness and so on.

The last thing to mention is that we have included an additional dimension of analysis, another way to break things down, which is the fields of science: OpenAIRE now has a FoS classification system. Again, you can go to Explore, via the link on the slide, to see the different classes. Level one is what you see on the left: natural sciences; engineering and technology; medical and health sciences; agricultural and veterinary sciences; social sciences; and the humanities and the arts. Then there's a level two, which is what you see in bold, and then a level three. This allows organizations to really track which areas, which disciplines, they are contributing to, and where there is perhaps room for improvement. Okay, next slide. Here are some breakdowns that we have included; a small sketch of this kind of breakdown follows below. Number one is the trend of FoS level one over time; for example, here you see that, proportionally, social sciences are becoming more popular in this institution. Then you see the most popular level twos by the number of gold open access publications: clinical medicine, health sciences, education and so on. This makes it possible for a university to really fine-tune its decision making by discipline, and notice that we didn't say by department, because it's not necessarily the same: you can have cross-disciplinary work, with different departments working on the same disciplines.
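The kind of trend chart described here can be reproduced over any export of classified outputs. A small pandas sketch with made-up data, shown only to illustrate the shape of the computation:

```python
# Sketch: trend of publications by Fields of Science (FoS) level 1 over time,
# the kind of breakdown shown on the slides. The data below is made up.
import pandas as pd

df = pd.DataFrame({
    "year": [2019, 2019, 2020, 2020, 2021, 2021, 2021],
    "fos_level1": [
        "natural sciences", "social sciences", "natural sciences",
        "social sciences", "social sciences", "engineering and technology",
        "social sciences",
    ],
})

# Counts per FoS per year, then row-normalized to shares, so that a growing
# discipline shows up proportionally, as in the slide's trend chart.
counts = df.groupby(["year", "fos_level1"]).size().unstack(fill_value=0)
shares = counts.div(counts.sum(axis=1), axis=0)
print(shares.round(2))
```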
Here on the left you have a breakdown of FoS level two within FoS level one. In the second column you see that, within engineering and technology, the biggest share for this university is in nanotechnology. And on the right you have the different open access flavors, or colors, by FoS level one, where you can see, for example, that for engineering and technology there is, proportionally, more uptake of repositories, while for natural sciences there are, proportionally, more gold publications. Okay, next slide please. Let's skip to the last slide. Next. Okay, so these are some upcoming indicators, in terms of APCs, citations and so on; we'll send the slides around, so as to leave a few minutes for questions. The indicators that we showed today will be available in the dashboards in May, and the new ones in September. And please, if you have already spotted something that you would like, for example if you want something as a share as opposed to an absolute number, we'll be happy to incorporate it immediately. In case there's someone here that doesn't already have a dashboard, this is the website on the right on how to get started. Thank you. Any questions?

We have two questions in the chat. I'm going to ask Biljana about the Monitor dashboard for the University of Belgrade. She is referring to the data that is available in the dashboard and asking when the data will be updated; the last data in production is from March 2023. So, there must be a little bug with the cache, so we'll update it immediately, unless there's an issue with the affiliations from the university. How do you want to tackle this?

Biljana speaking. Let me give more detail on my question, because we are in the process of including this OpenAIRE Monitor dashboard in the official website of the university. The data for the university in the Monitor dashboard is quite nice, and it is public, but the last data is from 2021, even though for the organization everything else is clear, arranged and updated.

Okay. Could you please send us details on the repository so that we can check? Well, there are quite a lot of repositories harvested there, because this is a big university; it includes more than 15 repositories. And the data are visible in the OpenAIRE Graph, also for 2022, but in the Monitor dashboard the last year we can see is 2021. I think it must be a little bug, because the dashboards are automatically updated from the Graph; once you see the data in Explore, you should see it in Monitor. We'll take a look to see if there's an issue with the cache or something. Okay, please. Yes, we'll try to get in touch tomorrow or Monday, if that's okay. Just a remark for Julia: Julia, I did not forget to sign those papers, but they are stuck in some rector's office; I will push to have them signed. I mean, if we are on the official university website, that's a good sign. Yeah, that's great.

Okay, and I see there's another question: which indicators are the most common to be included in the dashboard? At this point, every indicator that we have has been requested to be included. Julia, do you have a dashboard yourself? Hi, I'm Julia. No, I don't have one yet, but I was interested in the service, so I was wondering what is most commonly monitored and which indicators are the most common. Yes, we will...
We will add here a link to public dashboards so that you can view them, but just to briefly describe: every dashboard has an overview section that includes some representative trends and numbers from that dashboard. Then there are different sections on research output, open science and so on, and the open science section has many subsections: FAIRness, journal business models and so on. And then we have academic and societal impact. So far, everyone has requested everything that is available; so if you want, take a look at the dashboard link that Leonidas just added, and then we can have a call. May I ask which university you're from?

I'm from the University of Bologna. We've been in touch with many people from OpenAIRE lately, so Julia knows me. We're just looking around, because we are trying to implement a few new things for open science here at the university, and we were interested in Monitor mostly to monitor how many datasets our researchers are producing, more than publications, because we already have many different methodologies to check their publications, whether they are open access and so on. But we still need to implement something for data and datasets. So I was very interested in this call today, to understand Monitor a little bit better, and then yes, of course, we will get in touch.

I think this is also a very interesting use case for us, the datasets and how they show up. In any case, if you want to discuss this further, it is very little effort for us to create a new dashboard, so we can see what we have. I'm saying that because Leonidas will do it, right? For me it's very easy. And Leonidas is the data guru; he will check your datasets and everything. So if that's something you're interested in, let's do it. Yes, it would be very interesting, and thanks for sharing the dashboard. Okay, thank you.

Any other questions or concerns? I would like to thank Katie for presenting and everyone else for joining. I think Andrea will send a feedback form and the presentation, or it will be posted on the website. Yes, yes, I will. Okay. Thank you very much everyone. Have a good afternoon.