Welcome, everyone. I just want to check that we are already recording. Okay, perfect. These recordings will also be available for those who were not able to join, and thank you to all of you who have joined. We have here one more Content Providers Community Call, in order to share the recent developments and to receive your feedback about the services. If you have any issue that you want to solve, you can also raise it here. The slides are already shared on the dedicated page that we have for the PROVIDE community calls, and the link is also in the minutes. My colleague André will share the link to the minutes in the chat so that you can access them. If you want to put any question in the chat or in the minutes, please feel free to ask or to comment. We have the notes, where you can place your comments, and the chat, and we have the slides and the recordings of all the community calls available on this PROVIDE page in the OpenAIRE portal.

Today we will dedicate this meeting to the OpenAIRE broker service, which is basically the service that provides all the metadata enrichments that we have available via the PROVIDE dashboard. We have my colleague Claudio, from Italy, from CNR-ISTI in Pisa, to detail a bit how the service works and what new events are being generated, and he will also share some novelties with us. Then we will have time to discuss: you can put questions to Claudio, or we can discuss other issues that we have with the service.
Before Claudio starts with the presentation, I just want to give some more information about recent developments in PROVIDE. Some of them we have already announced via the newsletter, but we want to highlight, as a reminder, that we have a new type of broker event available in PROVIDE in production. We are testing some others in beta, but we already have one in production that is related to ORCID iDs. We provide ORCID information in the metadata enrichments that we send via PROVIDE, so you can check this information and update your records. In PROVIDE you go to the Content part, you see the events, and there is one type of event related to ORCID iDs, where we send the iDs of the authors of that specific publication, and you can update the record in your repository.

The other piece of information is related to the guidelines. There is now a Spanish version of the guidelines. Our colleagues from LA Referencia, the Latin American network of repositories, have worked on this translation, so thank you very much for all the effort that the LA Referencia team put into it. In fact, it was not only a translation: they also helped to fix some small mistakes that we had in the current version, so it was quite important for us too. We also remind you that this version of the guidelines is already in production in the validator, so you can access the validator and test the compliance of your content provider, your repository, against this latest version of the guidelines.

I would also like to highlight that we have a new way to track usage events from your repositories. It is a generic tracker script that you can use, in addition to the plugins that we have available for DSpace.
You can also access the OpenAIRE repository on GitHub, where you can find information about this script and how to use it in your repository, in order to enable your repository to be tracked by OpenAIRE for the usage statistics. The last piece of information is, as always, to highlight that we have a public roadmap, where you can provide feedback but also be aware of the main changes that are in the pipeline. If you have a suggestion and we think that it is relevant, we also put it in the roadmap and try to schedule it in our plan, in order to deliver it as an enhancement of the service. This is the main information that I wanted to share with you. Later you can ask more about these novelties, but now let's have the presentation. Claudio is from CNR-ISTI, as I said, and he is one of the main people responsible for the broker service, one of those who manage it. He will present some details of how the service works and what we are working on to develop it. Do you want to share your screen, or do you want me to pass the slides?

It's better if I share my screen, so that I can go forward.

Yes, so the floor is yours. I have just stopped sharing my screen. Thank you to all those who have joined. Now we will have a presentation from Claudio about the broker service.

Pedro, can you see my screen?

Yes, perfect.

Thank you for the introduction. So, welcome everyone, and thank you for joining. I'm going to give a brief presentation of the OpenAIRE broker service. The concept, essentially, is to leverage the metadata quality that is produced by the OpenAIRE aggregation system, supplemented by all the other processes that contribute to increasing the metadata quality. The aggregation system puts together bibliographic metadata from publication repositories, data archives and so on.
We are able to build an aggregated information space where bibliographic records are cleaned, disambiguated, validated and enriched by inference processes. Some of you might have already seen the sketch of the main model behind the OpenAIRE Research Graph. Essentially, the possibility to navigate these links allows us to get information from surrounding entities, and that additional information can be relevant for your repository. The main concept behind the broker is to produce information that is potentially of interest to the repositories that contribute to creating the OpenAIRE Research Graph. This materializes into essentially two actions: either providing new records that are of interest for a given collection, or enriching records that already exist in a given collection with extra information.

There is a pub/sub architecture behind the broker service, but everything starts from the analysis of the information space built by the OpenAIRE aggregation system, so the OpenAIRE Research Graph and ScholeXplorer. Let me tell you that these are, for the moment, two independent information spaces constructed using two different aggregators. In the first phase, dedicated algorithms identify events. These algorithms are tailored to capture information that can be relevant for the repositories that originally provided the bibliographic information. In the second phase, these events are matched against the subscriptions made by users, starting from the basic assumption that a repository manager is implicitly subscribed to every event that is pertinent to the repository they manage. This is how we are able to assign every event produced by the algorithms to the associated subscriber. In the third phase, there is a periodic process that checks which information has to be sent to the individual subscribers.
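The second and third phases described above (matching events against subscriptions, then a periodic run that collects what has to be sent) can be illustrated with a minimal Python sketch. This is not the broker's actual implementation: the `Event` and `Subscription` classes, their field names, and the convention that an empty topic set stands for the implicit "everything for my repository" subscription of a repository manager are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    # Hypothetical event shape: an identifier, the repository it concerns, a topic label.
    event_id: str
    repository_id: str
    topic: str


@dataclass
class Subscription:
    subscriber: str
    repository_id: str
    # Assumption: an empty set means "every topic", modelling the implicit
    # subscription of a repository manager to all events about their repository.
    topics: frozenset
    already_sent: set = field(default_factory=set)

    def matches(self, event: Event) -> bool:
        if event.repository_id != self.repository_id:
            return False
        return not self.topics or event.topic in self.topics


def collect_notifications(events, subscriptions):
    """One periodic run: return {subscriber: [new events]} and remember what was sent."""
    outbox = {}
    for sub in subscriptions:
        # Keep only matching events that this subscriber has not received before.
        fresh = [e for e in events if sub.matches(e) and e.event_id not in sub.already_sent]
        if fresh:
            outbox.setdefault(sub.subscriber, []).extend(fresh)
            sub.already_sent.update(e.event_id for e in fresh)
    return outbox
```

Calling `collect_notifications` twice with the same events returns an empty result the second time, because each subscription remembers the identifiers it has already been sent. A real service would persist that state rather than keep it in memory.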
So periodically, new information is produced as a set of new notifications, paying attention not to repeat the same novelty more than once. As I mentioned, there is a set of processes that contribute to increasing the quality of the metadata, mainly the deduplication and inference processes. So how do the above-mentioned algorithms work? They are based on a simple concept: they analyze the groups of duplicate records identified by the deduplication system and, by comparing the individual metadata fields in each metadata record, they are able to assert that a certain field that was not found in a given repository was instead provided by another repository. Say that repository A, providing a given metadata record, does not expose the abstract for that record, but the same record provided by another repository does expose the abstract. Then the first repository might be interested in integrating that information, through a dedicated event.

We organize the topics that are associated with the different classes of events into two main macro categories. The first represents events about field values that are different from those available in the repository; we call them "enrich more". The other category, "enrich missing", groups events about metadata fields that are not present at all in the metadata of the repository. At the moment, in production, we have topics that touch upon the publication abstract, the publication date, links to projects, open access versions, persistent identifiers, ORCID iDs and subject classification terms. Most of them are derived from automatic algorithms that extract those terms from the PDFs. In beta, we are testing the production of events related to links to datasets, links to software and links to publications.
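The comparison inside a group of duplicate records can be sketched as follows. This is an illustration under stated assumptions, not the broker's code: the record layout (a `repository` name plus a `fields` mapping from field name to a set of values), the function name `detect_events`, and the exact topic strings built here are hypothetical; only the two categories, "enrich missing" and "enrich more", come from the talk.

```python
def detect_events(group):
    """Compare every record in a duplicate group against the others.

    group: list of dicts with 'repository' (str) and 'fields' (name -> set of values).
    Returns one event per (target record, other record, field) where the other
    record offers values that the target record lacks.
    """
    events = []
    for target in group:
        for other in group:
            if other is target:
                continue
            for name, values in other["fields"].items():
                own = target["fields"].get(name, set())
                if not values:
                    continue
                if not own:
                    kind = "MISSING"   # the field is absent in the target record
                elif values - own:
                    kind = "MORE"      # the field exists, but other values are available
                else:
                    continue           # nothing new to suggest
                events.append({
                    "repository": target["repository"],
                    "topic": f"ENRICH/{kind}/{name.upper()}",  # illustrative label
                    "suggested": sorted(values - own),
                    "provided_by": other["repository"],
                })
    return events
```

For the example given in the talk, a group where repository A has no abstract and another repository does would yield a single "enrich missing" event addressed to repository A, carrying the abstract found in the other record.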
An important aspect to keep in mind is that these events are produced by algorithms, so by automatic procedures that take decisions using information that is not authoritative. In this sense it is important to express this uncertainty as much as possible, because we want to be transparent about the information that is provided to you. Whenever a repository manager asks themselves whether or not to integrate a given reference to a certain project, we want to be transparent in saying: okay, we are not 100% sure that this publication actually acknowledges this project, because the link was inferred in the first place by a mining algorithm that extracted it from the full text. So the confidence level tells you that we are not 100% sure of the trustworthiness of this information.

This table represents the number of events that were built for each topic, not as of today but up to last month. As you can see, the numbers involved here are quite important. I highlighted the novelty of the ORCID events produced for the 700 repositories that were involved in the event generation procedure in the production system. On the right, instead, there are new types of events, mostly related to references between datasets and publications, as well as references between software entities and publications, but they are still available only on the beta system.

At the moment, we are evaluating the quality of this new set of events. For software events in particular, we noticed that the majority are essentially mentions somewhere in the article. So the question is mostly for you: are all these mentions relevant, are they worth being integrated in your collections, regardless of where they occur? For the moment we cannot distinguish whether such a relationship comes from the citation section, or from supplementary material, or carries any other kind of semantics, let's say, that links a publication and a software.
So this is a question for you, and food for discussion after the presentation. For datasets, instead, the majority of events are for the moment related to data repositories like Zenodo. Links from datasets to publications are of good quality, but for the moment they are not yet available on the production portal; they are built on the beta system, so we are validating them. Anyway, consider that the majority of the events that OpenAIRE can generate between publications and datasets depend on the availability of information in the graph. As you already know, the big OpenAIRE Research Graph is currently hosted only on the beta system, so the broker in the production system will be able to benefit from this extra information from the moment the graph is promoted to production as well. So, after the validation phase, we still need to wait for the production system to be ready to host the enlarged OpenAIRE Research Graph.

So which are the new types of events that we are working on? We are evaluating the possibility of generating what we call alerts, for example starting from the continuous validation process. Let's say that at a given moment in time, for the records in a certain collection, the publication title disappears for any reason, or, for technical reasons such as a migration of the repository platform, the records are not available for one day or one week. These alerts could be a means to inform the repository managers that something is not working as it should with their repository, whether it is related to content, to protocol issues, or to the unavailability of their service. At the moment we are thinking about notifying this information via email or through the content provider dashboard, informing you that your content was aggregated or that it was indeed indexed by OpenAIRE, but we can provide more fine-grained information.
We are working on defining an API to allow other services to consume the content of a notification. As of today, events and notifications are only browsable through the content provider dashboard, so integrating such information back into your repository is, for the time being, manual work. But the task is moving forward: we have started to define a preliminary version of an API for the bulk download of the notifications. Future work will also consider relying on a protocol to exchange information automatically, but this is still work in progress. And I think we can move on, Pedro.

Yes, many thanks for this presentation. I think it was very much to the point. I'm sharing my screen now just to highlight a few things, and what I would like to have now is also your opinion. You can comment by typing if you want, but you can also join the conversation: just activate your microphone. Just three remarks. Be aware that you can access the result of this work, the broker, in the PROVIDE dashboard. Here in this Content area you click on Events, and you will see the events. Now I'm showing the events for the University of Minho repository. You click on your source, if you have more than one, and you will see the different types of events. Here you can see the new one, about the missing author information, so links to ORCID iDs. This is the kind of information that you can see. I'm just highlighting it to demonstrate this; I know that some of you have already done it. For example, for this record, as OpenAIRE has found this ORCID iD, you can use it in your repository. Then the other thing that I want to highlight regards the new types of metadata that Claudio was mentioning.
We run a validation and evaluation process on the results of these newly generated events in a beta environment. We are happy to have more repository managers involved, including those participating in this call, if you want to join a small team with which we can share some developments in beta for you to test. One of the things you can do is, in fact, to test these events. If you want to join, just send us an email. You can send an email to me (André can also share my address here in the chat) or use another OpenAIRE channel, saying that you want to join a team of, let's say, a PROVIDE users advisory board, or PROVIDE users board, where you can test some new things. One of those things to test is this kind of new event, so if you want to comment on it, it is quite important.

The question that Claudio raised about software is that the types of events and metadata enrichments that we are generating for software are not completely like the ones we have today. We have some metadata records for which we found links to software that are in fact related to that software, so there should be a kind of link from that publication to that software. The majority, in fact, are just mentions, like references: a mention of a software in the text of the paper, or a mention of a software in the references of the paper. This is interesting information, but it is not a metadata enrichment. So we are not sure whether we should also make this available as a metadata enrichment, because it would cause a bit of confusion given what we have now. So if you want to give your view on this, feel free.
I'm also accessing PROVIDE in beta, and in the meantime, if you want to put your questions, please do. I just want to share this with you, because this is what we are talking about. In beta, these are the new metadata fields that we have: "enrich missing software". But in fact it is not only missing software, it is also just mentions of, or links to, a software. This is the paper, and this is the kind of information that we can provide: as you can see, a link to a specific software that is related to or mentioned in this paper. But if you want to have this information, you need to check, and you need to spend a little bit of time accessing the original link, to see whether it is just a mention or a real enrichment of the metadata. So if you want to provide your feedback, we are happy to hear it. At least from my side, I'm not completely confident that we should put this kind of new metadata in production. This is why we also want to have your input here. This is the kind of information that I'm sharing now, from the specific case of the Minho repository: we have some cases from SourceForge, and from GitHub, for example this one from GitHub. So these are the kinds of examples that we have. If Claudio wants to add something, please go ahead, and if you have any question, I'm not sure if we have any here in the chat, just feel free to jump in. Claudio, do you want to say something?

I can just say that I share your doubt about the availability of simple mentions. Perhaps we could think about renaming the topic, to indicate that those links are software somehow related to a given publication, and give it a generic flavor. Then, in the future, we cannot exclude that the mining algorithm will be able to capture more detailed semantics about the kind of reference.
If a link appears among the references it is one thing; if it appears in other sections of the article, that can give more detailed semantics to the link. The broker would then be able to benefit from this extra information. So we cannot exclude having an improvement on this front too in the coming months.

Many thanks. For sure, if we go ahead and put this in production, we need to rename the type of event, because in fact it is not "missing": they are missing links, or additional mentions, something like that. Let's hear our community and its thoughts about that. Feel free to share your thoughts during this call, or, if you want to join the team that can contribute to the beta testing of some of these developments, please contact us and we will put you in that group. Of course we need to ensure that your repository is well represented in beta, because not all the content from each repository is fully available in the beta environment, but we can try to do that specifically for your repository. I hope that this presentation was useful and comprehensive for you. At least I thought that Claudio did it very well and that it was clear. So now you are aware of all the types of events that we have available for you and that you can reuse. We are aware, of course, that the links to projects, the missing PIDs and the missing open access versions are usually the three types of events that repository managers really want to have and are really using. I have also heard about abstracts, for example missing abstracts. But feel free to give your opinion about the others.
I also want to highlight that, for example, the publication date is something relevant, because there we can say that what we are sending is really a metadata error. If you are not exposing the publication date properly, that is a critical error in your metadata, so we are suggesting that you expose that information properly. But all the other events are quite relevant too, depending on the type of policy that you have for your repository. Okay, there are some comments here in the chat. One is from a colleague from a polytechnic [inaudible]. Yes, Minho is a DSpace repository, that's true. We also have a message saying thank you. Okay. So, is there any other issue that you would like to raise, related to the broker or to any other topic from the PROVIDE service? Feel free to comment. The community calls are not only for us to present the main novelties and the specifications of specific components of PROVIDE, but also to welcome your feedback.

We have ten more minutes for this call if you want to add more, but in the meantime I just want to highlight two things. One is that we have a specific newsletter for PROVIDE. In OpenAIRE we have the generic newsletter, where every month we send out information about OpenAIRE activity, not only the technical side but the whole network, but we also have a specific newsletter for content provider issues. So if you are not a subscriber, please subscribe to this newsletter and you will receive information about content provider developments every month. We usually send this newsletter the day before the monthly community call.
So please subscribe to it, and also be aware that on the community calls page all the calls until the summer are already scheduled. Please add them to your calendar. We also have the links for the sessions there. We had an issue with the session link today, but we will test that everything is working fine in order to avoid the same problem next time. We try to send reminders via the newsletter, but please add the calls to your calendar. Someone is asking a specific question about the link to the tracker code. The new script is available on GitHub, here. But also, if you check here in the Metrics part of the PROVIDE service, we have that information available, also in the support section. If you have not enabled the service yet (I'm clicking here in order to demonstrate this), you have this information there, and also in the guidelines. Let me also put the link here in the chat. All these links are also available in the presentation that we have already shared.

Okay. I know that my colleague Jose is also here in the call. Before we close this meeting, I would also like to hear Jose's opinion about these mentions of software, these software enrichments: whether Jose thinks that this is relevant or not, for example in the context of the Portuguese repository managers. It is one more opinion, in order to also have feedback from the community. I'm not sure if he is available. Okay, Jose, you are here. What is your opinion?

Okay. Can you hear me?

Yes, yes.

My opinion is that it is not yet a general practice for people to mention software using specific fields, for example in repository systems or journal systems, so that it can be shared in a structured way.
So we now rely on other types of processes, like this one, to get this information from the full text. I think it is important to get this information out of the full texts, but it is also important to understand what kind of information we are taking out, to be able to tell whether it is just a generic reference or something that has really been used for the publication itself. For example, if it is a piece of software that users need in order to reproduce the studies described in the publication, then it is very, very relevant. If I have a paper where I describe some available software that can be used by the community for specific tasks that are not related to the paper itself, then it is not so relevant. We can also have a middle ground, where people, mainly repository managers, decide whether or not this is made available in the repository, as a reference or just as something that is referred to by the publication. The manual decision is also very important, and it can be complementary to the way that machines and algorithms extract this information. So for now I think this is the way we can do it. In the future we may have other kinds of practices, for example in the guidelines.

Many thanks, Jose, for your input to this discussion. In the meantime, Jordan has put an important question here. Claudio, you may want to give some more information about this, because it is something that the community is asking for. Jordan is asking if it is possible to export the events from the dashboard in any way. Maybe Claudio can add some more information. This is something that we really want to have. We have a limitation in the way that we expose these events, which is to make them visible here in PROVIDE.
This is something that we really want to have, at least as a simple CSV file, or in the future also via protocols like SWORD. It is also visible here in the roadmap, so it is under our consideration. It is not in progress or completely planned yet, because we do not yet have the proposal, so we cannot predict when we are going to make this improvement. But Claudio, do you want to add something for our colleagues here, so that they are aware of this?

Yes, well, I can confirm this is an ongoing activity. The design of an API that allows bulk download of the events is already implemented. We are testing it at the moment, and we need to define how this functionality will be made available to end users through the user interface. So we can assume it will be available in the near future.

Yes. So this is what we want, Jordan, and we hope that in the coming one or two community calls we can give good news about this. Unfortunately we cannot put a lot of effort into this, but at least delivering a CSV file is what we really want: a CSV file attached to each type of event. Or, and this is the basic idea, when you receive a notification for the types of events or filters that you have subscribed to, you also receive a link to download a CSV file of those events. Let's see what we can do. This is something that I will maybe put in the slides for all the coming community calls.

Okay, so if there are no other questions, we are just on time. We started a little bit late today because of the issue with the link for the session, but thank you, everyone, for joining this community call.
Be aware of the dates of the coming community calls: we will have another one in one month, on the fourth of March, at the same time. You have the information there. The slides are already available, and the slides and recordings will also be available here on this page, under the notes, in the recordings section. If you have any kind of issue accessing the PROVIDE service, just contact us and we will try to solve it quickly. Many thanks for your presence here in this call, and Claudio, many thanks for your support and for your great presentation. So thank you.