 of this webinar. So thank you again for joining us for this OpenAIRE webinar. We are starting a series of webinars focused on the OpenAIRE services, so in the coming months we will organize further webinars. Please stay tuned on our communication channels. We will promote other webinars, targeting other services and other stakeholders. Just a brief introduction and some housekeeping rules. The webinar will be recorded. Participants' microphones are off at the beginning, but if you want you can open your microphone during the question time to put your questions directly to the speakers. If you want to participate, you can use the chat to introduce yourself, to interact with other participants, and to write questions to the speakers. Or you can use the Q&A document that we will provide in the chat, in which you can add your questions. After the presentations we will address all the questions. You can also raise your hand to speak. Presentations and the recording will be shared with you by email and also on the webinar event page after the presentation. If you want to share this event on your social media channels, you can do it using, for example, the OpenAIRE hashtag or by referencing the OpenAIRE account on Twitter. Now just a brief overview of the OpenAIRE services and a bit of context. For the last 10 years or so, OpenAIRE has been developing a set of services targeting different stakeholders, mainly to support the adoption of good open science practices. And at the beginning of this year OpenAIRE started a new project, the OpenAIRE-Nexus project. That is an H2020 project that onboards to the EOSC 14 services to implement and accelerate open science. We can see these 14 services on this slide. These services are provided by public institutions, infrastructures and companies, and are grouped in three portfolios, as we can see here: the Publish, Monitor and Discover portfolios. 
These services are widely used in Europe and beyond, and they are integrated in OpenAIRE-Nexus to assemble a uniform open science scholarly communication package for the EOSC. Here we can also see the OpenAIRE Connect service, which is the service we will talk about today. And for this we have two speakers: Alessia Bardi from CNR, who is also the service manager of OpenAIRE Connect and who today will present the service in detail. And we also have Diane von Gunten from CREM, who will present a use case from the EnerMaps project, which is using the OpenAIRE Connect service to create a scientific gateway to collect research outcomes in the energy research domain. So I think we can start with Alessia, who will tell us about the OpenAIRE Connect service. So Alessia, the floor is yours. You can close my... Yes, thank you, André. Let me stop my sharing. Okay, so you should now be able to see my screen. Okay, so first of all, thanks for being here at this webinar. And thank you also, André, for the introduction, because this makes my first slide useless: you already presented the services that are offered by OpenAIRE for open science. So today we will focus on the Connect service for research communities and its relationship with the OpenAIRE Research Graph. So first of all, OpenAIRE Connect delivers on-demand open research gateways for research communities. And the goal is to lower the barriers that hinder the adoption of open science practices in the research communities. In particular, there is the problem of the literature and data deluge and the fact that research products are scattered across different repositories and sources. So there is no easy-to-find entry point towards the outputs of our research communities. There is a lack of community awareness, in the sense that in some cases communities lack awareness about themselves. In other cases, there is a lack of awareness about the open science practices that they should follow. 
Finally, another barrier that exists is the lack of open science services and tools. And regarding these, well, some communities are more mature, for example research communities which are backed by a research infrastructure. But some other communities are not mature in this sense, so they really lack the tools. So the goal is to support those communities, those that are not mature enough, so that OpenAIRE can help them shift towards an effective implementation of open science publishing practices in their domain. To do this, the gateway offers a community view of the OpenAIRE Research Graph. And we will see in the next slide what the OpenAIRE Research Graph is. But it's not only a discovery portal, it's more than this. You can see Connect as an open science toolkit, so an entry point to different tools for the implementation of open science practices. So, the OpenAIRE Research Graph. The OpenAIRE Research Graph can be described as a big collection of metadata records describing entities of the research lifecycle. So we have publications, datasets, software, and other types of research products like methods, protocols, compound objects, research objects, and all the different outputs that researchers produce during their research activities. And these are linked to other entities, which can be, for example, the funding project, the organization, the authors with their ORCID identifiers, and the funders, and they are also linked to each other. So for example, we have links between publications and the datasets they are supplemented by. The graph also includes information about the access rights, so that you can see which versions are open access, for example. And we also have information about the research communities and the research infrastructures thanks to which these products have been produced. And the graph is built by aggregating metadata records from many different sources from all over the world. 
So we include Crossref, DataCite, ORCID, and we have software from Software Heritage and also GitHub. And we have, let's say, the very famous players in scholarly communication. So preprints from arXiv, we have Zenodo, we have SciELO, PubMed, PLOS, and many others, open access journals. Regarding the funders, we have the projects from the European Commission, the FP7 programme and H2020, and soon we are going to introduce Horizon Europe as well. But we also have other European national and international funders. And so in the end, thanks to this aggregation, we have a graph with more than 150,000 publications, and clearly it's hard for a community to find what's relevant. So this is why with Connect we are offering this community view, that is, the slice of the graph that is relevant for the community. This slide shows how the graph is created. I am not going into the details of each phase, but I just want to explain the challenges that we are dealing with. You can see that we collect the metadata and we harmonize them according to an internal data model. Then we identify the duplicates and we merge them together, because we may collect different metadata records about the same entity from different sources. So we identify them as duplicates and we merge them together, so that we count the entity only once when we produce statistics. After this, we have a step of enrichment. So we enrich the graph with additional properties and links that we can infer from the metadata and from the full text of open access publications. For example, we infer links to projects, links to datasets and software. We enrich the metadata with additional subject classification, and of course we identify if a product is related to a community or to a research infrastructure. Finally, we have a last step of cleansing, and then we make the graph available to our portals. So we index it on a Solr index that powers our portal and our API. But we also calculate statistics. 
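The duplicate identification and merging step described above can be illustrated with a minimal sketch. This is not OpenAIRE's actual deduplication algorithm: the matching key (a lowercased DOI, falling back to a normalized title), the record fields and the sample records are all assumptions made up for illustration.

```python
from collections import defaultdict


def dedup_key(record):
    """Hypothetical matching key: the DOI if present, else a normalized title."""
    doi = record.get("doi")
    if doi:
        return "doi:" + doi.strip().lower()
    # Keep only letters and digits so that case and punctuation do not matter.
    title = "".join(ch for ch in record["title"].lower() if ch.isalnum())
    return "title:" + title


def merge_duplicates(records):
    """Group records that share a key and merge each group into one entry."""
    groups = defaultdict(list)
    for rec in records:
        groups[dedup_key(rec)].append(rec)
    merged = []
    for key, group in groups.items():
        merged.append({
            "key": key,
            "title": group[0]["title"],
            # Remember every source the entity was collected from.
            "sources": sorted({r["source"] for r in group}),
        })
    return merged


# Made-up sample records: the first two describe the same publication.
records = [
    {"doi": "10.1234/ABC", "title": "Heat demand in Europe", "source": "Crossref"},
    {"doi": "10.1234/abc", "title": "Heat Demand in Europe", "source": "Zenodo"},
    {"doi": None, "title": "Solar output data", "source": "DataCite"},
]
merged = merge_duplicates(records)
print(len(merged), "distinct entities")
```

Counting `merged` instead of `records` is what makes an entity count only once in the statistics, which is the point made in the talk.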
We calculate statistics and we produce numbers and charts that are made available on our Monitor portal for funders, research infrastructures and organizations, and also on the Open Science Observatory. And the graph is delivered as an open resource, and basically it will be the resource catalogue for the EOSC, the European Open Science Cloud. The graph is made available via our API but also via dumps, which we publish on Zenodo. And the API and the dumps can be used by others to build on top of this content. Here you can see some of the clients, let's say, that we have. So we have services of thematic research infrastructures, but we also have SyGMa, which is the grant management system of the European Commission and which is using our API to suggest publications for the projects. And we also have Scopus using our APIs to get the links between publications and datasets. So let's go back to OpenAIRE Connect. As I said at the beginning, you can see it as an open science toolkit for the research communities. And to start with, researchers get a discovery gateway where they can explore, navigate and search all research products and entities of the research lifecycle that are relevant for their community. And the portal is also customizable to their needs. You will see an example of this in the presentation that Diane will give after mine. So how do we do it? Community managers are experts of the community and they have access to an administration dashboard. And basically, through this dashboard, they can configure the criteria for the inclusion of research products in their gateway. OpenAIRE then applies the configuration to the OpenAIRE Research Graph. This is used to identify the products of the community, so that those are made available via the web portal of the gateway and via our OpenAIRE API. And they are also published as JSON dumps on Zenodo, so that the content is reusable by others as well. 
And for example, the community may decide to develop their own domain-specific portal, or their own specific statistics. So the configuration criteria are fundamental, they are very important, and they are in the hands of the community managers, who are the experts in the domain. Specifically, with the dashboard they can specify these different criteria of inclusion: they can provide a list of keywords, project grants, thematic repositories, thematic journals, Zenodo communities, and also organizations which are known to be working in the field. Regular users can also contribute to growing the record of the research outputs of the community. They can do it using the link functionality, which is available in the gateway and on the OpenAIRE portal, so that researchers can basically add products to the gateway if they are not already included, but also add links between them. So they can add links to projects, links between publications and datasets, links to software, and so on. Finally, we have the OpenAIRE algorithms. We have the so-called propagation algorithm, which basically propagates the fact that a product is relevant for a community from one product to another. So for example, if we know that a publication is relevant because it was collected from a thematic journal, and this publication is supplemented by a dataset, then the dataset is also added to the gateway, even if the dataset didn't come with any metadata information about the community. Then we have the full-text mining algorithms. While the propagation algorithm only works on the metadata, the full-text mining goes into the abstract and into the whole full text of the open access publication. Here we mine for new information that is not explicitly available in the metadata, and this includes links to projects, affiliations, document classification, relevant research infrastructures, related data, and much more. 
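The inclusion criteria and the propagation algorithm described above can be sketched as follows. This is a toy illustration under stated assumptions, not OpenAIRE's implementation: the field names (`keywords`, `projects`, `sources`), the grant identifier, the sample products and the choice to propagate over a single hop of "is supplemented by" links are all hypothetical.

```python
def matches_criteria(product, criteria):
    """True if the product satisfies at least one configured inclusion criterion."""
    text = (product["title"] + " " + " ".join(product.get("subjects", []))).lower()
    if any(keyword.lower() in text for keyword in criteria["keywords"]):
        return True
    if product.get("project") in criteria["projects"]:
        return True
    return product.get("source") in criteria["sources"]


def propagate(selected_ids, links):
    """Add the products directly linked from already selected products (one hop)."""
    result = set(selected_ids)
    for product_id in selected_ids:
        result.update(links.get(product_id, []))
    return result


# Hypothetical configuration entered by a community manager.
criteria = {
    "keywords": ["energy"],
    "projects": {"grant-0001"},
    "sources": {"Journal of Energy Studies"},
}

# Made-up products; data1 carries no metadata about the community.
products = {
    "pub1": {"title": "Wind energy forecasting", "source": "Some Repository"},
    "pub2": {"title": "Medieval manuscripts", "source": "Some Repository"},
    "data1": {"title": "Measurements 2019", "source": "Some Repository"},
}

# pub1 "is supplemented by" data1.
links = {"pub1": ["data1"]}

selected = {pid for pid, prod in products.items() if matches_criteria(prod, criteria)}
gateway = propagate(selected, links)
print(sorted(gateway))
```

Here `data1` ends up in the gateway only because of its link to `pub1`, which mirrors the publication and dataset example given in the talk.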
And this is all information that we can exploit to identify products of the community. Then we have the tools to bridge the places where research is done, the research infrastructures, and the places where research is published, which is the scholarly communication ecosystem. For this, there is another API that can be used by the services of the community to publish any type of research product and make it available via the gateway straight away. In addition, we can also provide up-to-date information to repositories of the community or other services of the community, thanks to the OpenAIRE content provider dashboard, which is one of the other services that OpenAIRE offers and which targets content providers and repository managers. Finally, statistics and tracking. In the gateway, you will find tools that ease the reporting, for example to funders, and that give you information about the open science uptake in the community. The community managers can then decide to use this information, for example, to shape new policies on open science for the community. We provide some default indicators, but those indicators are configurable. So if the community has specific needs or specific indicators to check, this can be done on request. So to conclude, OpenAIRE Connect delivers configurable open research gateways for research communities that lower the barriers that hinder the adoption of open science practices. It is the service by which OpenAIRE supports community building, strengthening, and empowering. And as you can see in this slide, we have already delivered 10 community gateways to communities of different disciplines, and we are working with some more. So we range from energy research, as you will see, to transport research and digital humanities. Then we have, for example, agricultural and food science, we have a gateway for COVID, and we are also working with Galaxy workflows. 
And there is, let's say, work in progress with science and technology innovation policies as well. So if you are interested in these and you want to know more, you can go to connect.openaire.eu. You will get the full list of the gateways that are already available, and you will also find a contact form that you can use to contact me, and then we can start a conversation to understand what your needs are and how OpenAIRE can help you. So this ends my presentation. André, shall we give the floor to Diane? Yes, thank you, Alessia. After the presentation, we can address the questions from the participants. So now we can start Diane's presentation. Diane, the floor is yours. Yes, I will start. Do you see my screen now as well? Yes. Great. Do you still see my screen when it's in presentation mode? Yes, it's perfect. Perfect. So hello, everybody. Welcome to my short presentation. The goal of this presentation is to show you an application of what Alessia told you about, so what we can actually do with this gateway for a particular community, in this case energy research. This gateway is actually part of a larger project, an H2020 project that we call EnerMaps, with the goal of improving data management and accessibility in the field of energy research. So mostly the goal is to get better data management and to improve FAIR practices in energy research. It's a two-year project with about one million in funds. If you want to know more about this project, you can of course go to our website. So the general goal is to identify interesting databases in energy research, to check that they are FAIR, and to make sure that we can actually reuse them and use them long-term for the energy community. So what is the current situation in energy? One thing in energy is that we have quite a lot of data. I mean, especially in these last years, we have started to have a lot of data, for example on electricity, electricity costs, consumption, and also heat. 
For heat consumption there is still less data, but we still have, let's say, an increase in the sheer amount of data. However, it is usually spread across different databases. They're not easy to find, and the quality is also up for discussion. So there are sometimes high-quality datasets and sometimes lower-quality datasets, and we don't really know where to find the right data. So scientific data is stored, but it's not easy to find, which in effect means nobody will use it. Also, we don't have many visualization and analysis tools. So for the moment, we have different tools, but we really lack a generic tool that we can use, let's say, to visualize data. And in general, I would say that the notion of open science and FAIR is spreading in the research community, in the energy research community, but it's still quite low. And of course, we would like to improve that. More generally, when we look at open science, the challenges we have are, often, that projects are quite short. They last two or three years and then they stop, so nobody really has an incentive to keep the data long term, to keep it available. Often, it's unclear which type of license is linked to the data, so we're not really sure if we can reuse it. It's also not clear how the data was created in the first place, so we don't really know if it was created or measured. With access to software you have the same problem: you don't really know exactly how it was created. And there are differences between countries or data providers, so we also have a problem of, let's say, unification, or at least convergence, between the different ones. With the energy field in particular, I think there is one particularity, because what I said before could be true in a lot of different fields: there are strong differences between the user needs. Energy is generally quite political, so we have high data needs from non-technical actors, which are important. 
They really need precise data, quantitative data, but they also want to access it easily. And since we see this strong difference between the user needs, we thought, okay, it would be too difficult in the EnerMaps project to create just one entry point, to have one tool which can answer the needs of all our users. That wouldn't be possible. We also think that the two tools we have created might not be sufficient, but our idea was to create, to start with, two tools which, let's say, would provide the best support for the maximum number of users. So we created a two-layer tool, and we talk about these two layers. One is this gateway, where we really have the chance of using the OpenAIRE Research Graph to have accessibility, which means we can really find as many papers and as much data as we want. We can search for particular datasets, we can look for links between publications and datasets, so we really have access to many things, let's say. And at the same time, we thought, okay, what we also need is to have a layer on top, which would be more of a curated dataset selection. So we decided to select 50 datasets which we think are very important, which are critical, and for those we provide a visualization tool, of which we already have a part, to really help to visualize these datasets. We know their quality controls, we know they're of quite high quality, and we know they are really important for most of the actors to understand, let's say, the field of energy quickly. And then, in case somebody needs something more precise or a particular dataset, they can go back to this gateway and have access to many things. 
And our idea was really to make the link between a visualization tool with fewer datasets, but more controlled datasets, and the research community gateway, which will be, let's say, larger, and where it may be harder to find exactly what you want, but which gives access to everything. You can also go from one side to the other, or let's say, go from one view to the other, and switch between these two tools. That was, let's say, the idea of EnerMaps, and this is how we, let's say, use the gateway in practice with these two layers. So yes, that's our idea of the concept of the project: to get a system with two entry points for data management for our users. With these two entry points, it functions as one coherent system, but it lets different users access the data in the way which is most practical for them. Okay, so the first layer, as I said, is a community gateway, which is exactly what Alessia showed you before. And this would be how the gateway generally looks, with here a way to search by the different fields, title, author, et cetera. And here is what we did in the EnerMaps project. We actually modified the gateway for our purposes, so that it's adapted to the needs of our community. Here we added the featured datasets, the critical datasets that I wanted to show in the visualization tool, so that there is a link between the visualization tool and the gateway. So this new tab is added there. That was what was new in EnerMaps, let's say, to be able to have these two layers with a link between the visualization and the gateway. So here, when you click on these featured datasets, you get the list. Here there's only one, but there is a longer list with the different datasets that we have selected as critical. It took something like a long round of interviews with different experts and so on to actually select the right datasets. 
And there, when you click on one particular dataset, it actually shows more info about this dataset, and you can click there on the image. And in this case, it will go to the visualization tool. So if you go back here, there is a tab with the critical datasets, here is the whole list of them, and you can select one. And after that, you click here and you go to the visualization tool, which looks like this. So here you have the list of the different datasets, although there is only one shown here at this stage. And you can select and visualize the different datasets. And if you click there again, well, you go back here. So you go back to the gateway with all the information about the datasets. And that was part of our work. Besides having these two layers, what we also did was to check that these critical datasets were FAIR. So we checked, for the 50 selected datasets, that they were actually easy to reuse. We reached out to the data providers, we improved the existing metadata, and we checked for consistency, to be sure that the datasets we select are of high quality. And also to provide a common front end for these different datasets, I mean the visualization and also this list of the datasets. And we also had to add some of the datasets which were not available online anymore. So that's also a part of the project. One part is indeed to have the tools online, to have the digital part of the tool online. But another part is also just to work with the data providers to make the datasets as FAIR as possible. So there is indeed a part of software development, and there is also a part of community development, let's say, where we work with different actors to try to make the data as FAIR as possible. Okay, so that's a very short overview of what EnerMaps is. And I wanted to say that the advantage of the gateway is that we could use the strengths of the OpenAIRE Research Graph for our purposes. 
Through the gateway, we could really use the power of this Research Graph, but turn it into something which is useful and practically applicable for us, in a larger system than we had actually imagined before we started with the gateway. When we were creating this layer, we thought, okay, actually this layer already exists: we can reuse this gateway and use what OpenAIRE already does, instead of starting from zero. And that was actually quite useful for us in trying to create this larger view of data management in energy research. One challenge of the gateway is that you actually have to personalize it. So you have to find the right keywords and the right projects, so that it's relevant for your community. But when you actually do that, you have the advantage of getting the interesting papers and the interesting datasets related to your community very quickly. That's all. Thanks a lot for listening to me. I'm obviously available for all the questions you have and for the discussion. Okay, many thanks, Diane, for your presentation. So now we can open the time for questions from participants. Let me share my screen to share the questions. So we already have one question from Neam Brennan. I think this question is for Alessia. I'm looking for a way to improve the discoverability of open educational resources. Is it possible for OpenAIRE to harvest these resources directly from a learning management system such as Moodle? So regarding the inclusion of content in OpenAIRE, we provide guidelines for different types of repositories. As for Moodle, honestly, it is a platform that I don't know. So if Moodle is able to expose the metadata about these educational resources in a format compliant with the OpenAIRE guidelines, there should be no problem for us to harvest them. But this is something that you should check against the capabilities of Moodle itself. 
So this is a rather technical question. Okay, thank you. I think we have another question in the chat. Right now I'm not able to see the chat. I can see it. Okay, can you read it, please? Yes, from Asta Matur. Sorry if I pronounced the name the wrong way. So does OpenAIRE itself have any human curation steps to create the research graph? Okay, yes. We have some human curation activities which can also be performed by external people, because via the gateway the users have this link functionality, and this can be used to add missing links or add missing products. And when adding new products, people can also specify the access rights that are to be associated with the product. Then internally in OpenAIRE, yes, we certainly do some curation, not completely manual, because we are supported by tools. So for example, we have the validator service, which can be used to validate the compliance of the content with our guidelines. And then we have a suite of internal tools that check the quality of what we produce. Okay, thank you. If someone wants to ask a question directly, you can open your microphone and ask Alessia or Diane directly. I think Kirina just sent a question here in the chat, André. There was also this project of set OpenAIRE mobile beam for space geospatial sector. Does anyone have more information about this? Maybe someone can. Personally, I cannot help on this. But probably we can investigate offline and see what we can find. And there is also a suggestion from Bruce Herbert for Neam. He says that they create their open educational resources at Texas A&M in their DSpace repository, and the OpenAIRE Research Graph could harvest that metadata. Yes, yes, indeed. Another question is about the coverage. So is the OpenAIRE knowledge graph representing European research? Not only. We started from Europe, but then we really widened our scope. 
And for example, we have all of Crossref, which is basically 99% of all publications published with a DOI. And this is really worldwide. And as I was showing in one of my first slides, we also have SciELO, and SciELO is one of the most important platforms in South America. And we also have LA Referencia, for example, and JAIRO, which is a big aggregator in Japan. So we can say that the coverage is pretty wide. And this is also a benefit for the research communities, because in many cases what we see is that the communities are geographically distributed, let's say. So having research outcomes that are not focused on a single region is very important for them, because it also gives the numbers and the feeling of doing something that is possibly reused outside the original boundaries of the community itself. We have another question, from Serge: are there any user statistics for the gateways? You mean for the content? About the content available in the gateways, or about the gateways themselves? Serge, if you want, you can open your microphone and clarify your question in detail. Yeah, I'm back. I was trying to find the microphone. No, it's about the usage. I mean, it's a service for people who want to use it. Okay, so you mean the statistics about the usage of the service, of the gateways? Okay. Yes, we keep some statistics, currently only internally. But these are, of course, available for the gateway managers. We are monitoring the access to the gateways via the Matomo platform. And I think it's in the plan of Nexus to also make this information available to the public. So it's part of the project we are currently running. And we also have a question from Carlos. Any access to Chinese data sources? Okay, I have to be honest, I don't remember. But what I can suggest is that you go to the Explore portal. You go to the menu on the right, which says content providers, and you can click on Browse All. And one of the browse options is by country. 
So from there, you can see how many providers from China we have. Okay. And another question from Richard. I noticed on the Develop portal that you will no longer support exporting via OAI-PMH. There are quite a few repositories which use OAI-PMH bulk functionalities. This is very disappointing to see. Curious why you are discontinuing this service? Okay, so it's not that OpenAIRE is not going to support OAI-PMH harvesting anymore. We will keep harvesting from repositories. What we will not do anymore is expose the whole graph via OAI-PMH, because basically it was not sustainable to operate the service as it should be, since the graph is too big. So we went for a solution that is also adopted by other research graph initiatives, such as Crossref, or the ENDS research graph, or the PID graph from FREYA. So what we are going to have is that we will offer the search API, but for bulk consumption we are going to deliver dumps. So you will no longer be able to consume the whole content in bulk via OAI-PMH. Okay, thank you, Alessia. I don't know if someone wants to raise an additional question or comment. There's a question here from Beem Long, Do Long. Are only scientific publications included in OpenAIRE? No, not only scientific publications. In fact, one of the, let's say, pillars of the concept of open science is that of giving credit for all types of research products. So one type is the scientific publication, but we also have data. We also have research software and other research products of very, very different types. In OpenAIRE we also have, let's say, non-peer-reviewed materials. So we have preprints, we have presentations, we have technical reports and project deliverables. So we have very different types of research outputs, and scientific publications are just a fraction of that. Any other questions? Diane, I don't know if in the end you want to raise a comment or some final thoughts. You can also say something if you want. 
No, nothing special from my side. It's all right to everybody. Okay, thank you. Thank you. We don't have any questions in the chat. If there are no more questions, I think we can close the webinar. Many thanks first to the speakers, Alessia and Diane, for your participation and collaboration, and for sharing your thoughts and your knowledge about OpenAIRE Connect and the EnerMaps gateway. Thank you to all the participants for accepting our invitation to come to this webinar. So just to conclude, both the presentations and the recording will be made available on the event page. We will also send them to you by email. And stay tuned for the upcoming initiatives of OpenAIRE. We'll have more webinars targeting other OpenAIRE services. So we'll continue to communicate, and we hope you can join future webinars from OpenAIRE. Thank you all. Goodbye.