Welcome to the fourth OpenAIRE Connect community call. I'm Alessia Bardi, a researcher in computer science and scholarly communication at the Italian National Research Council, and the service manager for Connect, the OpenAIRE service that allows research communities and organizations to promote and better implement open science practices in their community. The past community calls were reserved for community creators, that is, people who already have a Connect gateway. Today we made an open call for all those who would be interested in having a new Connect gateway, so the agenda is a little different than usual. I will go through an introduction to the service and what it can do for you and for your communities. I will briefly cover a couple of highlights and novelties we are working on. Then I will give the floor to Yvonne Desmond from the Technological University Dublin, who will present how the Connect gateway was used by the EUT+ alliance. And then we have some time at the end for questions and answers. So let's start from the beginning, from Connect. The problem we would like to address is that, due to the publication and data deluge, research communities and single researchers, but also citizens and organizations, find it very difficult to discover the research products that are useful for their needs, both in terms of, for example, data discovery for doing their research, and in terms of having indicators about their performance and the impact that the organization has on the research landscape. In Europe, we can find almost 20% of all the higher education institutions in the world.
And they tend to form alliances, that is, groups of universities that work together to create more knowledge and to improve their organizations and the learning paths for their students and researchers. We think these are very important activities. So they need tools to discover, to assess and to disseminate their research findings. How can they do this? Typically, they would like to have a personalized portal for their research communities, their researchers and their audience, so that they can really show to the world, to the stakeholders, to the funders, to everyone involved, what they are doing and what the researchers are doing, and make it more visible rather than hide it behind a wall. The problem is that it's not easy to put in a single place all the research outputs that are relevant for a community, because research products (publications, data, software, and also other types like protocols and methods that are created and reused by researchers) are scattered in different places. They are available from different repositories or data archives, and it's not easy to find them. We need an infrastructure that aggregates research works of specific interest and puts them all together. But this requires time, resources, and expertise at different levels, because there are several challenges to tackle. I can mention four. There is heterogeneity: the different repositories where these research works are hosted may be compliant with different protocols, so you have to build a system that is able to collect the metadata about the research products from all these different places. And there are a lot of them, as I said: the publication deluge.
So you have to deal with big data, both in terms of numbers and in terms of structure, because the description of a publication is probably very different from the description of a protocol or of a data set. Then there are quality, accuracy and completeness. There is not one place where all the information is available; you have to integrate. You may want information about the authors, for example, from ORCID, which is a registry for researchers. You may want the links between publications and data sets from another service. So you need the expertise to put all these things together and create a rich information space on top of which you can have portals for discovery and also portals for statistics and monitoring indicators. The fact is that OpenAIRE can tackle those challenges for you. OpenAIRE is already operating an infrastructure that does that. Thanks to this infrastructure, we are collecting metadata about research products of different types from more than 2000 scholarly communication sources. We are talking here about institutional repositories, thematic repositories, funder databases, and authoritative registries like OpenDOAR, re3data and ROR for the organizations, but we also have Crossref and DataCite, which are the main persistent identifier authorities. We put everything together and further enrich the information we collected by finding duplicates, because we may collect different metadata descriptions of the same product from different repositories. We identify that two records describe the same thing, we merge them, and we keep information about the provenance of each piece of information that we merge. We also have algorithms that analyze the metadata and the full text of open access publications in order to extract additional information that is not explicitly available in the metadata.
So, for example, citations, links to data sets, links to software, and subject classification. The result of this work is the OpenAIRE Graph, where all these different entities (software, publications, data, organizations, data sources) are linked together. And the numbers are pretty impressive: we're talking about 160 million publications, 58 million research data, and so on. So how can it be useful for a research community or for a university alliance? We need to find a way to identify the slice of the graph that is relevant for the alliance or for the research community. Once we have identified this slice, we can build a customized portal on top of it. And this is exactly what OpenAIRE Connect does. In addition to the portal, you can consider Connect as a facilitator of open science practices. In fact, it is integrated with other scholarly communication services like ORCID, Crossref and UsageCounts, but also with other OpenAIRE services like Zenodo and Explore. The thing is that in this case you do not have to build the infrastructure for the aggregation and processing of the data. You don't have to think about the maintenance and operation of the machines needed to run the portal, or implement the search facilities, because everything is operated by OpenAIRE: the installation, maintenance and upgrades of the machines and the services, the regular backups and the automatic data updates. So how does it work? Basically, with Connect we will deliver you a gateway. As I said, we need to identify the slice of the graph that is relevant for you. For this, the gateway offers an administration interface with which you can provide a configuration, that is, the criteria of inclusion. These criteria are applied to the graph, and they are used to tag the research products that are relevant for you.
And based on this, we have an end-user portal with discovery functionality. We also publish the metadata records in a dedicated data set on Zenodo, so you can get the metadata and build your own portal, or other added-value services, if you want. And clearly it's also available via the OpenAIRE API. The configuration criteria are very important because they identify the subset of the OpenAIRE Graph that will be searchable in your gateway. We can do this in different ways, and different criteria are more useful for specific types of communities. For example, if you are a discipline-specific community, then keywords are a very important criterion: you can select, for example, all the research products that have, I don't know, digital humanities among the subjects in the metadata. Then we have the projects, and by projects we mean the project grants, so the funding. You can select from the 25 funders integrated in OpenAIRE. We have three million projects, and you can select the grants whose publications, deliverables, reports and data sets you want to include in your gateway. Then clearly we have the data sources: thematic repositories, archives, journals, but most important for university alliances is the fact that we have the institutional repositories. So a gateway creator can say that all the research products that are available from the repositories of their universities should be included in the gateway. Similar to that, we have the Zenodo communities, and finally the organizations. This is very important, especially for university alliances, because sometimes researchers do not deposit their research products where they should, but they do add their affiliations in the metadata or in the full text. So OpenAIRE can extract the affiliation from the full text.
And in such a way, you get not only everything that is in your institutional repository, but also everything that is affiliated with your institution, so that the record of your organization is as complete as possible. Then there is also an option for end users of the portal to add missing products to the gateway. You can link products that are already available in OpenAIRE, but also products that are not yet in OpenAIRE but are available from Crossref, DataCite and ORCID. Users can also add links to other research products and to project grants, in order to grow the records of your community. Then we have the OpenAIRE algorithms. I already mentioned the full-text mining algorithm that finds links to projects, affiliations and document classifications. But we also have an algorithm that propagates the fact that something is relevant for your community to its related items. For example, if a publication is relevant for your community and is supplemented by a data set, then the data set is also added to your community. So what will you get with Connect? You will get an administration dashboard to configure the gateway, not only in terms of the inclusion criteria, but also in terms of look and feel. You don't need to code if you don't want to. If you want, you can write a little bit of HTML to make your pages a little nicer according to your needs. But if you don't, there is a what-you-see-is-what-you-get administration panel where you can do everything you need. And you will also get the portal where users can search for any type of research work and browse with the filters on the left. They can find the right place to deposit, that is, whether there are any community-specific Zenodo communities or any institutional or thematic repositories to use. And, as I said, they can link research products to each other and to funding projects.
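As an illustration of the inclusion criteria and the propagation step described above, here is a minimal Python sketch. It is not OpenAIRE's implementation: the `Record` and `Criteria` classes, their field names and the `select_slice` function are all hypothetical. The sketch assumes that a record is included when it matches any single criterion (exact, case-sensitive matching) and that relevance is propagated one hop to directly related items.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical, simplified research product taken from the graph."""
    id: str
    subjects: set[str] = field(default_factory=set)
    projects: set[str] = field(default_factory=set)
    sources: set[str] = field(default_factory=set)
    organizations: set[str] = field(default_factory=set)
    related: set[str] = field(default_factory=set)  # ids of linked products


@dataclass
class Criteria:
    """Hypothetical gateway configuration; each field is one kind of criterion."""
    subjects: set[str] = field(default_factory=set)
    projects: set[str] = field(default_factory=set)
    sources: set[str] = field(default_factory=set)
    organizations: set[str] = field(default_factory=set)


def matches(record: Record, criteria: Criteria) -> bool:
    """Assumption: a record is relevant if it satisfies ANY criterion."""
    return bool(
        record.subjects & criteria.subjects
        or record.projects & criteria.projects
        or record.sources & criteria.sources
        or record.organizations & criteria.organizations
    )


def select_slice(records: list[Record], criteria: Criteria) -> set[str]:
    """Return the ids of the records that belong to the gateway."""
    by_id = {r.id: r for r in records}
    selected = {r.id for r in records if matches(r, criteria)}
    # Propagation (assumed to be one hop): items directly related to a
    # selected record are added too, e.g. a data set supplementing a paper.
    for rid in list(selected):
        selected |= {x for x in by_id[rid].related if x in by_id}
    return selected


# Usage with made-up records.
records = [
    Record("pub1", subjects={"digital humanities"}, related={"data1"}),
    Record("data1"),  # no matching metadata, linked from pub1
    Record("pub2", organizations={"Example University"}),
    Record("pub3", subjects={"physics"}),
]
criteria = Criteria(
    subjects={"digital humanities"},
    organizations={"Example University"},
)
selected = select_slice(records, criteria)
```

In this made-up example `data1` carries no matching metadata of its own and is selected only because `pub1` links to it, which is the effect of the propagation step.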
So the idea is that you contact us, we understand what your needs are, we tell you what we can do, and we sign a memorandum of understanding, so we agree on what needs to be done in order to support your community. We create a first version of the gateway, and you assign creators who will manage the gateway and provide the configuration. Then we can support you with the public launch: we can help you deliver a final version of the gateway that can be publicly launched, and we support you with the communication and dissemination activities. Specifically for university alliances, there are three actions that are very important. The first one is to ensure that the repositories of the universities are in OpenAIRE. This means that the repositories must expose their metadata records according to the OpenAIRE guidelines. They can register on the Content Provider Dashboard of OpenAIRE in order to validate their metadata records and to provide all the information that we need to harvest from them. The second one is for the universities to check that their organization is properly represented in OpenAIRE. For this, you can go to our general discovery portal, OpenAIRE Explore, and search for your organization. It could be that you will find duplicates, or a wrong name. These are all things that come from the information that we collect. But we do have a curation tool that we can use to curate all these details, such as names and URLs, and to merge two organizations together if they are the same or split them if they are not. And action three is to find internally, on the side of the alliance, who the gateway creators are. So this is a matter of policies. We give Connect for free because it's funded by a project, which is OpenAIRE Nexus. We need some time for the delivery, about one month, more or less, for the first phase.
But we end soon. So if you're interested, I think it's a good idea to ask us for a gateway in order to bootstrap the process. In addition to that, we also have Monitor. In some cases you also want to monitor the impact on the research landscape, and you want indicators about the uptake of open science practices in your community. With Monitor, you get a dashboard with different types of visualizations, charts and numbers that monitor different aspects, ranging from the number of open access publications to the APCs of the organizations, or indicators related to the collaboration with other organizations. You can keep it public or restricted, as you wish. So it's a powerful tool. Very briefly, the novelties I would like to present to you today are related to the inclusion criteria, the part of the configuration where creators specify the criteria by which a research product is relevant for the community or not. We have added on beta the selection criteria based on fields of science and sustainable development goals, and also the advanced criteria. What this means is that, for now, all the criteria I mentioned before are not related to each other. But now you can, for example, say not only that I want everything whose subject is United States, but that I want the subjects to be United States and Mexico, or I want the subject to be digital humanities and the contributor to be DARIAH, which is a digital infrastructure for digital humanities. You will soon see these two options in your gateway in the production environment. We are finalizing some details before adding them to production. And with this, I think I can leave the floor to Yvonne for the presentation about EUT+.
Hello everybody, I'm just going to share my screen. Yes, so, mine. Yes. I hope everybody can see that.
Yes, we can see it, but it's not yet in presentation mode.
Okay, now it is in presentation mode.
Well, thank you very much for the invitation to come and share our experience with you, which I have to say has been a very positive one. Just to explain what the European University of Technology is: it's basically an EU-funded project to create one technological university for Europe. As you can see, there are eight partner technological universities, and you can see the scale of the project just by looking at the numbers involved. It was initially funded for three years, which will end in October, and we were told that we had to produce very concrete evidence of progress in order to get funding for the next phase. So it's been a very practical kind of project to work on. The proposal included the ambition that the EUT+ would be an open university. This was largely driven by TU Dublin, which thinks this is an ambition everybody should have. Work package 8.6.7 had two specific deliverables. One was to create the institutional repository for the EUT+. The other was to create an online open access academic press. And naturally enough, we were to do this without any money, which is pretty standard. There are about 36 people in the work package, but we had a core of 12 people who are mostly librarians, and I would say if you want something to happen, definitely include librarians in your work package. The first thing we did was a landscape survey of the alliance members to see what the status of open research was. It varied a lot. Some of the universities were very advanced in this area, some were not, and some were literally at point zero. All but one of the universities had an institutional repository already. So a subgroup was established, and its task was, given that we had repositories already, to find a way of putting an interface on top of those and some way to accommodate the one university that didn't have a repository.
So we considered three approaches. One was to build a portal from scratch, which was impossible because we had neither the personnel nor the money. We could look for a hosted solution, because TU Dublin, for example, uses Digital Commons. Again, we had no money. And then we found OpenAIRE, which rapidly became the only solution, given our lack of resources and time. So we contacted OpenAIRE with a request to develop the portal. We signed a memorandum of understanding in September 21. I think you can't underestimate the marketing value of this for us. As you can see, this was the press release. They were really excited that we had something concrete to show, that we were moving forward and all working together. They also made a commitment that we were moving to open scholarship, which is something I have used to my advantage since. So it was a very big deal for the EUT+ that we had done this. We then moved on to build the EUT+ branded web space and to have a Zenodo community for the university that didn't have a repository already. It was the first tangible expression of the EUT+ as an entity. It showcased the diversity and talent within the EUT+. This is really very important for technological universities, as opposed to more traditional universities, because a lot of our high-impact material is not necessarily peer-reviewed literature. So we need a showcase to put everything that we do out there, as most of our research tends to be applied. And it actually happened very fast. We were operational at the end of November, and I have to say the delay was mostly on our side, because the universities had to look at the state of their repositories in OpenAIRE. It's also important to mention that we did not have huge technical support on our end. We were basically librarians with some technical knowledge, but not a lot. OpenAIRE did all the heavy work for us, really, in this instance. So this is how it looks.
It's very simple. I think it's quite elegant. At the bottom of the screen, which I couldn't fit onto my page, the university logos are displayed together. So I think it was a really good job that happened very quickly. As I say, we kept it very simple. The feedback was really positive and everybody got quite excited, because when you look at it aggregated there's quite a lot of material coming from the eight universities. And the subgroup worked really well together, which is amazing given the diversity of languages and approaches and all the rest, but everybody was committed to the idea that it was really important and worked quite hard to make it happen. As leader of the work package I was very relieved, because I felt it was sustainable, and it was simple and intuitive to use, which is really important. There were problems. Making the repositories compatible with the OpenAIRE requirements took a lot of work. We had difficulty with the different versions: we have been harvested by OpenAIRE for years, but we discovered we were on version three and didn't know that there was a version five, so that was a problem. Working with the help desk can be slow and you do require a little bit of patience, which is understandable given the volume of what they had to do. In the initial stage we weren't really clear what had to be done by us, but that did become apparent over time. My biggest problem is the delay in updating the IR. It's currently 12 weeks, which I think is far too long, and I've been assured that that will be worked on. And the big disappointment was that one university did not take up the Zenodo option, as they said they didn't have the resources. The statistics were really, really welcomed because they were coming from an independent source. The funder information was very useful, as again it showed the diversity of funding that existed within the EUT+.
We're delighted that the SDGs are coming, because SDGs are really important for technological universities, as we all work to a number of them. It highlighted the diversity of research communities. We thought we were really bad on data, but yes, when we aggregated we had only 1500 data sets. Again, given the whole thrust of the EUT+, it was psychologically very important that we could demonstrate all this evidence. We then started looking at the OpenAIRE services, so Monitor is now on the EUT+ repository, and TU Dublin has put it on our own. Both are open. We dearly love the visualizations. I just want to remind you that it is really hard to collect data for the seven universities, so this is really good. For example, these are my favorite ones. I love the one on the right, which just shows the publications rising. The next one is openness over time. We're very keen on the green route, and we want to keep demonstrating that green is possible; you don't always have to be paying APCs. And this one, when I was putting together this presentation I was looking at the stats, is alarming me greatly, because there's a lot of material without a license. We have a meeting of the EUT+ next week and this will be highlighted, as people need to clean this up. On published versus deposited, again we're very happy with the green being very prominent. It's very, very useful information when it comes to demonstrating the importance and the impact of openness to the sometimes quite cynical university presidents and rectors. So, all in all, I would have to say it's been a really successful partnership from our point of view. We couldn't have achieved what we have achieved without OpenAIRE. I think updating is a problem which I'd like to see sorted.
And sometimes the communication is not too clear or it's too technical, which is a problem on our end: we sometimes don't understand what people are actually saying to us. I think the documentation could be a little bit clearer, so you are told exactly what you need to do. And it needs to be on the latest version for OpenAIRE compliance, and again you need to know that. I don't know what we thought was happening; we just thought everything was upgraded, but obviously that's not the case. TU Dublin uses proprietary software, so we really need a bit more flexibility. But at the moment, when you are trying to deal with that, you really need to check that you're being properly harvested. It's become very apparent, particularly in TU Dublin (it's not so bad in the EUT+ one), that we need to do a great deal of work to make sure that the harvesting is complete and accurate, because that has ramifications for the statistics, which we really like and which, for example, my head of research finds really, really useful. I would say how the services interconnect is not always very clear. We were told that we should have turned on the UsageCounts service, which we didn't know about, and we're still trying to find out how to do it. But these are minor issues. As I say, we couldn't have done this without OpenAIRE. We're now very conscious of OpenAIRE. For example, again in TU Dublin, we're going to start using the Argos software, which is for data management. We're going to roll it out within the technological university sector in Ireland, because prior to this we've all been using DMPonline, which as you may know is British-based, and given Brexit we kind of feel we shouldn't be doing that anymore. Would I recommend using OpenAIRE if you're trying to do this kind of thing? Yes. I think you need to have very clear conversations in the beginning.
But for people like us, who just wanted it to happen but didn't really want to know how to do it, it has worked extremely well. All I can express is that we are very grateful within the EUT+, because we have managed to produce both our deliverables within a year, which makes us best in class. So that's basically what I have to say. Thank you for your attention. I'm happy to take any questions if people want to ask. I'll just stop sharing my screen now.
Thank you very much, Yvonne, for this overview and for the details you shared. Clearly we will address all the suggestions that you have in order to improve at different levels, including at the level of communication, which, I understand, is sometimes too technical, myself included. I'm sorry for that.
You see, you have to dumb it all down for people like me. But again, I would say, even given all that, it just required asking the questions, and then we could understand. It's not a huge problem and I don't want to make it one. It's just, I think it could be a little clearer. Maybe some straightforward written documentation that tells you exactly what you need to do, like turn on this, turn on that, would help. As I say, it's minor. And the end result was we got our repository in four months, which was incredible.
Great. Let's check if there are any questions. I see some in the chat. Okay. Davide is asking if there is any timeline for the inclusion of data about citations of research products. Okay, so this information is already available in the graph, but we still have to work on the visualization on the portal. And I think you're more interested in the statistics part. I know that this is still a work in progress. I can ask my colleagues for an update on this, and I will. I think that the plan is to make it available.
That would be before the end of the OpenAIRE Nexus project, but I really need to check with them. Martin Diaz has two questions. The first one: could we have a record of the impact of the gateway? In the dashboard we can't see, or at least I don't know how to see, the traffic inside the gateway, so what we are doing is sharing short links to have some data, even if limited. On the gateway traffic: yes, we are monitoring the traffic on the gateways [unclear], but the system is not available to you. I agree that the managers of the gateway should somehow have this information, at least on request. So let me check with the system administrators whether we can open the pages of a gateway to a specific user, or whether I can produce an export of the data that you can use. Thanks for the question. And the second one: we would also like to see in the dashboard, for example, specific data from certain gateway sources. Okay, so I guess you mean having in the Monitor dashboard some charts that are not about everything in the gateway, but only about a subset that is collected from a specific source. I don't think this is on our to-do list for now, but we can evaluate it, because we have the information: for every record that we collect, we know where it comes from. So this is not something technically difficult. And I think I have read all the questions in the chat, but if someone would like to ask something...
Hi. Sorry, can you hear me now?
Yes.
Okay. Yeah, I just needed to switch. My question is a more generic one, regarding the relationship between the records in the gateways and the EOSC resource catalogue: what happens to them, or how they land there. I'm asking this for very selfish reasons, because at DARIAH we receive this question relatively often: okay, we are DARIAH member institutions, how do you make sure that our resources land in EOSC?
Our answer to individual data providers is, of course, the OpenAIRE gateway. But when I do a practical check, searching for individual records that are in the OpenAIRE gateway and checking them in the EOSC resource catalogue, I cannot always find them, which might be a granularity issue as well, if the EOSC resource catalogue is a catalogue of catalogues. So what are your perspectives in this respect? Should we expect these records to be available in the EOSC resource catalogue at the record level or only at the catalogue level, if you understand my question?
Okay. So the EOSC resource catalogue is, let's say, built on top of the OpenAIRE Graph. But not all the research products in OpenAIRE go there. We make a selection. We try to identify which are the EOSC products, and currently the criterion is that the EOSC resource catalogue should contain only the records that come from a data source that is registered in the EOSC service registry. The thing is that in OpenAIRE in general, and in the gateway for DARIAH, for example, we may have more, because it's not only the research products that we collect from the TextGrid Repository, for example, but also more. This is why there is this discrepancy and not all the products that you see in the DARIAH gateway are available in EOSC. We should talk with the EOSC team in order to understand if it's possible to update this criterion. For example, considering that DARIAH is an infrastructure, should everything in the DARIAH gateway also be in the EOSC resource catalogue? Probably yes, it probably should. But we need to start a conversation with the team.
This is extremely helpful. Thanks a lot. I feel enlightened now. Thanks.
Thank you for the question, because the connection with the European Open Science Cloud is often not clear.
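The discrepancy described in this answer can be illustrated with a small Python sketch. It assumes, hypothetically, that each record carries the name of the data source it was collected from and that the catalogue keeps only records whose source is in a set of registered sources. The function name, the record layout and the source names are invented for illustration and do not reflect the actual EOSC or OpenAIRE data model.

```python
def catalogue_view(gateway_records, registered_sources):
    """Keep only records whose data source is registered (hypothetical criterion)."""
    return [r for r in gateway_records if r["source"] in registered_sources]


# Hypothetical gateway content collected from two data sources,
# only one of which is registered.
gateway_records = [
    {"id": "rec-1", "source": "registered-repository"},
    {"id": "rec-2", "source": "unregistered-repository"},
]

visible = catalogue_view(gateway_records, {"registered-repository"})
# Records present in the gateway but filtered out of the catalogue view.
missing = [r["id"] for r in gateway_records if r not in visible]
```

Under these assumptions a record can be searchable in the gateway and still be absent from the catalogue, simply because the source it came from is not registered.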
So, are there any other questions that you would like to ask me, but also Yvonne, based on her experience?
So, I have a question. I know that the OpenAIRE Nexus project will end at the end of June, so I was wondering what will happen next, and whether there is any other way in which OpenAIRE will still be available. I mean, I'm sure that it will still be available, but I don't know about development and technical requests, and what will happen to the current gateways.
Yes, so before I leave the floor to Julia, let me say that you don't have to be worried. We are not going to close the gateways on the first of July. What we need to do is start a conversation with you and with all the research infrastructures we have been working with, in order to find paths to continue these collaborations and to make them sustainable in the long term. And I think this is where Julia can come in.
Thanks, Alessia. Thanks, Davide, for the question. Yes, so the gateways will be available until the end of the year, so that we can continue our conversation and check on the sustainability plan, which means that the gateway may have a price, and this depends also on the development that you would like to have in your dashboard. If you are requesting something that is not in our plan for the moment, it may require some more development. So, anything that you are thinking of to make it richer or more interesting, let us know before June 2023. If you want a Monitor service for your gateway, it's better that we start now, and we update the memorandum of understanding that we already have ongoing. I will start sending the emails. Unfortunately, for now I am alone, but probably other people from the team will help me deal with all the memoranda of understanding and be in touch with all of you. So there is a bit of time, but for sure not before December of this year.
Alessia, I have a question for you.
Yes.
Martin was asking about some specific data from certain sources, and about UsageCounts, and then with Yvonne we will find out how to enable the UsageCounts service as well. Is it possible to have this kind of information from...
I didn't get the question, sorry. The audio jumped a little bit.
Okay, so the question is whether, for the sources that are registered in Provide, UsageCounts would make it possible to enable in the dashboard, or to see, some specific data from the gateway resources.
Okay. Let's say that the landing page of each research product in the gateway already shows the usage statistics, if the corresponding repository has activated the service. So this is already integrated. I think what Martin asked is to be able to see statistics about a subset of the gateway. In this specific case, what is happening is that the gateway includes a lot of resources that are related to human rights, sustainability and law, but he would like to have specific statistics about the research products that are available from the Zenodo community of his own organization. There is a technical issue I'm not going to explain here, but we could somehow do it by exploiting the information about the source from which we collected the records in order to provide the subset. This is something that we probably need to discuss with Leonidas and Ioanna, my colleagues who are responsible for the Monitor dashboard.
Thank you.
Okay, good to know, Martin, that I understood your suggestion correctly. Okay, so if there are no other questions, we have almost finished our time, so I would like to thank you all for being here. Yvonne, thank you very much for your presentation and for your very important input.
Thank you.
And I wish you a lovely day.
And if you have questions, or if you want to ask for more information, just contact me: you can go to connect.openaire.eu, where you will find a contact form. I will reply, and we can arrange a meeting to solve any issues and address any questions you might have.