So welcome everyone to this first webinar of OpenAIRE celebrating Open Access Week. International Open Access Week is really a great pleasure for us every year: a week where we celebrate open access and open science practices. This year the topic is "Open for Climate Justice", and it is great that several institutions and research communities around the world are celebrating this week together and putting the focus on climate justice. And this is the idea also for what we want to do in OpenAIRE: every day we have a session in the morning looking at the developments of our services, how our services are contributing to open access, and also putting the focus on climate justice. We want to have those highlights every day at 11:00 Central European Time. It's great that we have today this first webinar. We have other sessions as well, also in the afternoon; we will share the link to our program in the chat. But in the morning, and this is what we are talking about today, our focus is on the way our services relate to climate justice and serve open access objectives. The session is being recorded, as you already saw, and you can put your questions in the chat. Some housekeeping rules: we will share the slides with you afterwards and also make them available on the program page on the OpenAIRE website. Please share your thoughts about this event and our program using the hashtags #OpenAIRE, #OAWeek, or #OpenAIREServices, and mention OpenAIRE. We also want this to be an interactive session, so we can have a kind of parallel conversation in the chat, or just raise your hand to ask questions; you can of course also speak. Since we don't have so many participants, we can have a conversation and you can clarify your thoughts about the topic.
So today, the topic is in fact one of our main services, let's say the central service of OpenAIRE: the OpenAIRE Research Graph, and the way we are supporting research and innovation indicators. We have with us Claudio, from OpenAIRE and from CNR-ISTI in Italy, and Ioanna Grypari, from the Athena Research Center and from OpenAIRE, who manage different services: Claudio the OpenAIRE Research Graph services, and Ioanna also the Monitor facilities that we have in OpenAIRE. Thank you, Claudio and Ioanna, for being available to deliver this webinar. The floor is yours. Let's see what Claudio and Ioanna have to tell us about the OpenAIRE Research Graph. Feel free to share your screen and to start; I think Claudio will start. Yes, thank you for the introduction. Let me just share my screen. I hope you can see the first slide of the presentation. Perfect, thank you. Okay, so thank you again for the introduction. Today I will briefly describe the OpenAIRE Research Graph, the principles behind the activities for building it, a bit of the processes, and especially how it can be used to support the calculation of evidence-based indicators on research and innovation. The presentation is organized in two blocks: the first, as I said, around the OpenAIRE Research Graph itself, and then we leave the floor to my colleague Ioanna to describe how IntelComp is using the data in the OpenAIRE Research Graph to calculate indicators. The important point I want to stress in this presentation is the role of the OpenAIRE Research Graph in giving context to research across the data life cycle. Each object in the OpenAIRE Research Graph can be seen not as a standalone dot in a data space; thanks to the surrounding context, it brings additional value, and publications are a case in point.
They are like a constellation of digital objects, surrounded by other publications, datasets, research software, and the research projects that produced them. So there is a multitude of views from which one can look at the data to get a glimpse of what is happening in a given scientific domain or a given slice of the data. So, in the end, what is the OpenAIRE Research Graph? It is ultimately a dataset: a collection of metadata records describing the digital objects that take part in the research life cycle and, of course, the relationships among them; by definition a graph is made of nodes and edges connecting the nodes. The principles guiding the construction of the OpenAIRE Research Graph are the following. It has to be an open collection, as far as possible. It must be deduplicated, because to compute reliable numbers we need to avoid redundancy in the counting: the same research product is often found on multiple sources, so counting it twice, or multiple times in general, would skew the statistics. Transparency is, I think, another important principle: OpenAIRE does not want to be a black box for its users, so all the processes behind the metadata aggregation system, how the calculations are done, all the processes that operate over the graph, have to be as transparent as possible, hence the documentation. It is based on a participatory approach, meaning that new data sources are always welcome on board, through OpenAIRE PROVIDE as we shall see later, and the guidelines for content providers are built with a participatory approach, so that the community can actually contribute to defining how the metadata exchange procedures work.
The content is managed in a decentralized way, meaning that OpenAIRE does not own a master copy of the data: as an aggregation system, it relies on the data as it is managed by each data source, which is responsible for maintaining the master copy. It has to be a trusted source for all the applications built on top of it, so well-known sources are available in the OpenAIRE Research Graph, as we shall see later. Last but not least, the Research Graph is available for consumption under a CC-BY license, so one can do anything with the data as long as one cites OpenAIRE and the Research Graph. So why are we doing this? On one side we want to track open research, so reproducibility and transparency: all the outcomes of science together with, as I said before, the surrounding context; not limited only to publications, since research software and research data also take an important role in tracking research activities in a wider sense. Then, discovering open research, meaning that the discovery of reproducible science outcomes must find new ways: not only driven by keyword-based search, but driven by new methods for discovering research objects. In fact the graph is being used in new projects to drive innovative ways of implementing discovery functionalities. Last but not least, monitoring open research: quality, impact, and openness of science as a transparent and reproducible process for all, of course including the context around research. Now, a glimpse of the underlying schema of the OpenAIRE Research Graph. At the center are research products, authored by the relative authors, who are in turn identified by ORCID identifiers. The research product subtypes account for publications, datasets, software, and what we call "other research products", a container for representing objects that do not fall under the first three categories.
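To make the deduplication principle concrete, here is a minimal sketch, not OpenAIRE's actual algorithm, of how records harvested from multiple sources could be collapsed so the same product is counted only once. Field names and matching rules are hypothetical and much simpler than the real ones.

```python
def normalize_title(title: str) -> str:
    """Lowercase and strip non-alphanumeric characters for fuzzy grouping."""
    return "".join(ch for ch in title.lower() if ch.isalnum())

def deduplicate(records: list) -> list:
    """Group by DOI when present, otherwise by normalized title;
    keep one representative record per group (the richest one)."""
    seen = {}
    for rec in records:
        key = rec.get("doi") or normalize_title(rec["title"])
        # Prefer the record with more filled metadata fields.
        if key not in seen or len(rec) > len(seen[key]):
            seen[key] = rec
    return list(seen.values())

records = [
    {"doi": "10.1234/x", "title": "Open Science!", "source": "repo-a"},
    {"doi": "10.1234/x", "title": "Open Science!", "source": "repo-b", "year": 2022},
    {"title": "A Different Paper", "source": "repo-c"},
]
unique = deduplicate(records)  # the two repo-a/repo-b copies collapse into one
```

Real deduplication also merges the metadata of grouped records rather than discarding duplicates; this sketch only shows why grouping is needed before counting.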
Then research organizations are directly associated to research products, as the organizations to which the authors of a research product are affiliated. Organizations are also beneficiaries of the research projects that funded the publications. Each publication is associated with provenance information declaring from which data source it was collected. Lastly, research projects are organized into funding streams that are provided by funders. So this is the high-level view of the data in the OpenAIRE Research Graph. Some numbers, updated to the last version published last month (the next version is going to be published next week): today the graph counts 25 funders, around 2,000 data sources from which the OpenAIRE aggregation system directly collects metadata records, around 3 million projects, and, as you can see, around 144 million publications, 300k software objects, 17 million research datasets, and some 6 or 7 million other research products. The numbers are important, but this is just what is perceivable through the public services; the actual numbers behind are much higher, because these figures do not indicate the gross number of records before deduplication, for example. So who are the users of the graph? As defined by the European Open Science Cloud, we have direct users like content providers, publishers, libraries, content consumers like data scientists, and service providers. Indirect users, via the services built on top, can be funders, research institutions, research infrastructures, policy makers in general, industry, and researchers, or the services themselves: for example, the EOSC Marketplace, the Commission's participant portal, or Elsevier Scopus and SciVal, Springer, or ORCID can use the data in the OpenAIRE Research Graph. Of course, also institutional and thematic repositories.
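The high-level schema just described can be sketched with a few hypothetical Python dataclasses; the real graph data model is much richer, and these names are illustrative only.

```python
from dataclasses import dataclass, field
from typing import Optional, List

@dataclass
class Author:
    name: str
    orcid: Optional[str] = None  # authors may carry an ORCID iD

@dataclass
class Project:
    id: str
    funder: str
    funding_stream: str

@dataclass
class ResearchProduct:
    # one of: "publication", "dataset", "software", "other"
    type: str
    title: str
    authors: List[Author] = field(default_factory=list)
    collected_from: List[str] = field(default_factory=list)  # provenance
    project_ids: List[str] = field(default_factory=list)     # funding links
    affiliations: List[str] = field(default_factory=list)    # organizations

# A node with its surrounding context: authorship, provenance, funding.
paper = ResearchProduct(
    type="publication",
    title="An example study",
    authors=[Author("A. Researcher", orcid="0000-0002-1825-0097")],
    collected_from=["Crossref"],
    project_ids=["ec__h2020::123456"],
)
```

The point of the model is that every product carries its links (affiliations, projects, provenance) alongside the descriptive metadata, which is what makes the "constellation" view possible.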
So, today, among the data sources that are part of the graph, we can mention registries providing data source descriptors, like re3data or OpenDOAR, or ROR providing research organizations; Crossref as a data source; thematic and institutional repositories; or aggregators, like DOAJ or OpenCitations. There are different categories of sources providing different types of content. To join and become part of the OpenAIRE Research Graph, the gateway is OpenAIRE PROVIDE, where content providers can register and, thanks to the OpenAIRE interoperability guidelines, have support and a way to define the metadata format describing how the information can be accepted and ingested into the OpenAIRE Research Graph. So repository managers can go to PROVIDE and register their data sources. On the right of this slide, you can see the different types of arrows contributing to the OpenAIRE Research Graph. It generally consists of metadata records and relationships among these objects. For example, when a publication has a relationship like "is supplemented by", that can be described within the metadata. Depending on the format it can appear as a related identifier; in the DataCite format, for example, it generally appears as a relatedIdentifier, accompanied by the semantics of the relation, such as IsSupplementedBy or IsSupplementTo. A set of mappings applied to this data can extract the semantics of the relation and build a link between the publication that states the relationship and the supplementary object, identified by the PID generally used to express this relationship; and we know that the metadata record of that PID is available, because OpenAIRE collects the whole set of content from Crossref and DataCite. So in this way we are essentially building a pair of interconnected objects.
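As a sketch of the mapping step just described, the following shows how a DataCite-style relatedIdentifier element could be turned into a typed edge between two graph nodes. The XML fragment and the PIDs in it are made-up examples.

```python
import xml.etree.ElementTree as ET

fragment = """
<relatedIdentifiers>
  <relatedIdentifier relatedIdentifierType="DOI"
                     relationType="IsSupplementTo">10.5555/parent-article</relatedIdentifier>
</relatedIdentifiers>
"""

def extract_links(source_pid: str, xml_text: str) -> list:
    """Return (source, relation, target) triples from relatedIdentifier elements."""
    root = ET.fromstring(xml_text)
    links = []
    for el in root.iter("relatedIdentifier"):
        if el.get("relatedIdentifierType") == "DOI":
            links.append((source_pid, el.get("relationType"), el.text.strip()))
    return links

# Each triple becomes an edge in the graph:
links = extract_links("10.5555/supplement-dataset", fragment)
```

Here the dataset record declares itself a supplement to the article, and the mapping materializes that as an edge; the inverse edge (IsSupplementedBy) can then be derived on the other node.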
So the supply chain behind the Research Graph can be summarized by this picture. On the far left, the aggregation collects, in a continuous way, the bibliographic descriptors, harmonizing the information according to a set of defined vocabularies. All this information forms what we call the raw representation of the research graph, where all the contributions from the different data sources converge. This information then gets disambiguated, according to a strategy that OpenAIRE has identified, to avoid, as I said before, double counting the same objects that are collected from multiple sources. Then the metadata records are enriched according to a variety of algorithms. Some of these are based on text and data mining (TDM) approaches, leveraging the availability of the open access version of the PDFs described by the metadata records that we process. This is the case, for example, for the extraction of references to grants: an algorithm goes into the full text of the publication and searches the acknowledgement section for references to research projects. Another algorithm calculates the similarities between publications; yet other algorithms propagate contextual information from a publication to its supplementary material. There is a dedicated framework, called the Information Inference Service, where different algorithms can be plugged in to introduce enrichments, both as links between the objects in the graph and as properties enriching the existing metadata records. An example of an enrichment of properties is the automatic classification of subjects, and I am going to illustrate some novelties on this later on. Lastly, at the far end of the pipeline, the graph gets materialized into different kinds of backends serving dedicated ways of consuming the data.
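The grant-reference extraction step can be illustrated with a toy version: a regular expression scanning an acknowledgement section for grant agreement numbers. This is only in the spirit of the full-text mining described above; the real OpenAIRE algorithms are considerably more sophisticated, and the citation pattern below is a hypothetical example for EC-style grants.

```python
import re

# Hypothetical pattern: EC grants are often cited as
# "grant agreement No 123456"; real mining uses per-funder rules.
EC_GRANT = re.compile(r"grant agreement\s+(?:No\.?|N°)\s*(\d{6,9})", re.IGNORECASE)

def find_grants(acknowledgements: str) -> list:
    """Return grant numbers mentioned in an acknowledgement section."""
    return EC_GRANT.findall(acknowledgements)

text = ("This work was supported by the European Union's Horizon 2020 "
        "programme under grant agreement No 123456 and grant agreement no. 654321.")
grants = find_grants(text)
```

Each extracted number would then be resolved against the project entities in the graph to create a publication-to-project link.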
On one side, the arrow pointing upwards: the indexing makes the graph available on the EXPLORE portal, as well as on the CONNECT gateways, where the search and discovery functionalities are implemented. Instead, the arrow pointing downwards, the statistical analysis, basically materializes the graph into a data warehousing system that allows us to slice and dice the data, calculate indicators, and perform statistical analysis over the records in the graph. So, as Pedro also mentioned, the graph is a bit the core of the OpenAIRE services; it is probably the major data source that the different services in the OpenAIRE portfolio use to deliver content to end users. This is why, beyond the services as such, a multitude of stakeholders have an interest in the Research Graph, and the purpose is to access it. Currently there is a dedicated community on Zenodo where, every six months, OpenAIRE publishes an updated version of a dump of the dataset. The dump is available in different flavors. Here, the screenshot illustrates the one that is most accessed, the dump of the funded products, but different formats and types of content are also available. There is a subset dump that only relates to the products associated with research initiatives and research communities; there is a dump that includes only material related to COVID-19; and a dump that only includes links between publications and datasets, in the Scholix format. Of course, there is also the complete version, which includes all the entities and the relationships. By the way, the next dump is going to be published late November or the beginning of December; it will have been six months since June, when it was last published.
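For readers who want to work with the dumps, here is a minimal sketch of streaming one part file without loading it into memory, assuming a gzipped JSON-lines layout; the sample records, the `bestaccessright` field, and the file name are assumptions for illustration, so check the dump documentation for the actual layout.

```python
import gzip
import json
import os
import tempfile

def iter_records(path: str):
    """Yield one record at a time from a gzipped JSON-lines part file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            yield json.loads(line)

def count_open_access(path: str) -> int:
    """Count records whose best access right is OPEN (assumed field name)."""
    return sum(1 for rec in iter_records(path)
               if rec.get("bestaccessright", {}).get("label") == "OPEN")

# Build a tiny sample part file just for demonstration.
sample = [
    {"id": "p1", "bestaccessright": {"label": "OPEN"}},
    {"id": "p2", "bestaccessright": {"label": "CLOSED"}},
]
path = os.path.join(tempfile.mkdtemp(), "part-00000.json.gz")
with gzip.open(path, "wt", encoding="utf-8") as fh:
    for rec in sample:
        fh.write(json.dumps(rec) + "\n")

n_open = count_open_access(path)
```

Streaming like this is what makes a multi-hundred-gigabyte dump workable without a cluster, at least for simple counts.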
And we are also working to produce another version of the dumps, much reduced in data volume, so that it is easier for developers and data analysts to have a hands-on, let's say quick, approach. We recognize that the volume of the complete version of the graph may not be tractable on a developer's local workstation: we are talking about a dataset that weighs between 200 and 300 gigabytes, so it might not be that easily approachable. Another way to access the OpenAIRE Research Graph is through the HTTP public APIs. You can go to graph.openaire.eu and find links to the different ways in which the content in the graph can be accessed. Either for discovery, to support discovery operations, indicated here as selective access; or as a content provider, if you want to benefit from the enrichments that the graph introduces into the data, as I mentioned before: the subject classifications, or the availability of ORCID iDs that get propagated from one publication to the surrounding context, for example to get publications whose authors received an ORCID they did not have in the first place. You can learn more about which enrichments are available in the Broker section. Last but not least, the linked open data representation is somewhat outdated, but it is available. Another way to access the graph is through the ScholeXplorer service, which falls under the APIs; you can see it here at api.scholexplorer.openaire.eu. This is a way of representing the contents of the graph oriented to the relationships among the objects. Now, two novelties I wanted to highlight: the availability of two ways of classifying research outputs, by fields of science and by sustainable development goals. Quickly, regarding the first, the fields of science classification: why do we need it?
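As a sketch of the HTTP API access mentioned above, here is how a query URL against the public OpenAIRE search API could be composed. The base URL and parameter names (`keywords`, `size`, `page`) follow the public API documentation as I recall it; verify them against the docs at graph.openaire.eu before relying on this.

```python
from urllib.parse import urlencode

def search_url(keywords: str, size: int = 10, page: int = 1) -> str:
    """Compose a publication search URL for the OpenAIRE public API."""
    base = "https://api.openaire.eu/search/publications"
    params = {"keywords": keywords, "size": size, "page": page}
    return f"{base}?{urlencode(params)}"

url = search_url("climate justice", size=5)
# Fetch with urllib.request.urlopen(url) or any HTTP client;
# the default response format is XML, so parse accordingly.
```

This is the "selective access" style of use: pull only the slice of the graph you need rather than the full dump.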
Because, of course, the volume of scientific publications requires a well-defined way of partitioning the publications, starting from a high-level view of science. This is especially needed by policymakers, funders, publishers; the majority of the stakeholders have an interest in having such a classification available in a dataset of this volume. As of today, we have around 15 million records classified, by DOI. The classification is based on a hierarchy composed of three levels, and the work follows a methodology implemented in SciNoBo, a hierarchical multi-level classifier of scientific publications being developed at the Athena Research Center. Then, the sustainable development goals follow somewhat the same needs. Why is there a need for that? Societal priorities are, at least partially, set by the United Nations, so that research activities too can be organized according to those 17 goals; and again, of course, policymakers need to classify research according to these goals. As of today, we have around 8 million DOIs classified, based on a classification system built, for now, on a silver corpus of keywords and key phrases used to build the training set; the classification uses the publication title and abstract. Currently, the results of these classifications can be seen on the OpenAIRE EXPLORE website as well as on the different research community and initiative dashboards. Moreover, Ioanna will probably spend some time illustrating that some indicators based on fields of science and sustainable development goals are also going to be introduced on monitor.openaire.eu. So before closing, just a few words on how OpenAIRE tries to make this whole process trustworthy. Overall: can I trust the indicators that can be synthesized from the graph? Which can be read as: can I also trust the underlying data?
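To give a feel for the keyword-and-key-phrase idea behind the SDG training set, here is a toy tagger over title and abstract. The phrase lists are invented; the actual system trains a classifier on such a silver corpus rather than doing a plain lookup like this.

```python
SDG_KEYWORDS = {  # hypothetical, tiny excerpt of what a corpus could hold
    "SDG 7 - Affordable and clean energy": ["green hydrogen", "solar power"],
    "SDG 13 - Climate action": ["climate change", "carbon emissions"],
}

def tag_sdgs(title: str, abstract: str) -> list:
    """Return the SDG labels whose phrases appear in title + abstract."""
    text = f"{title} {abstract}".lower()
    return [sdg for sdg, phrases in SDG_KEYWORDS.items()
            if any(p in text for p in phrases)]

labels = tag_sdgs(
    "Green hydrogen production at scale",
    "We study electrolysis pathways to mitigate climate change.",
)
```

A trained model generalizes beyond the exact phrases, which is why the corpus is used to build a training set instead of being applied directly as here.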
There are several challenges behind the construction of such a dataset: data incompleteness; inconsistencies in the formats in which the data is expressed, which become of prominent relevance if you think about the number of different data providers actively aggregated by OpenAIRE (it is not an easy beast to tame, for sure); default values; duplicate data, as I mentioned before; or old, stale data, which relates to the freshness of the information available in the graph. Or inconsistent keys: can I be sure that the related material associated with a given publication is actually a valid DOI or not? Last but not least, coverage: am I sure, for example, that for a given country I have all the most representative sources? On the right here: these problems can be mitigated, for sure not entirely solved. The vocabulary-based cleaning does a huge job in normalizing the representation of the metadata records, and thanks to the automation of the processes it is always up to date with the most recent updates from the data sources. Thanks to the provenance information, we can always know from which data source a given record comes. We are in the process of building more detailed documentation describing how we do what we do; it is going to be released next month and is being written these days. All of this, of course, to support the repeatability and reproducibility of the calculations that deliver the indicators. So with these slides I close my presentation, and I now leave the floor to you, Ioanna. Thank you, Claudio. Let me continue. Okay, I am going to share my screen. Can you see my screen and hear me fine? Yes, thank you. Okay, thank you. So, just to briefly mention that the Monitor service that Claudio mentioned at the end, during the discussion of the SDG and FOS classes, will be presented in our Thursday monitoring session,
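The vocabulary-based cleaning just mentioned can be sketched as a controlled-term mapping: the many spellings that different sources use for the same concept are normalized onto one value. The vocabulary entries below are invented examples around access rights; OpenAIRE's actual vocabularies are larger and centrally maintained.

```python
ACCESS_RIGHT_VOCAB = {  # hypothetical excerpt of a controlled vocabulary
    "open access": "OPEN",
    "openaccess": "OPEN",
    "info:eu-repo/semantics/openaccess": "OPEN",
    "restricted": "RESTRICTED",
    "closedaccess": "CLOSED",
}

def clean_access_right(raw: str) -> str:
    """Normalize a raw access-right value; keep UNKNOWN when unmapped."""
    return ACCESS_RIGHT_VOCAB.get(raw.strip().lower(), "UNKNOWN")
```

Unmapped values surface as UNKNOWN rather than silently passing through, which is what keeps the downstream statistics honest and flags vocabulary gaps for curation.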
our Thursday morning session. But today I will talk about something that is related, which is a project, IntelComp, where we examine evidence-based research and innovation policy making. It is a different view of the graph: a case that shows how it can be used as a key player in evidence-based research and innovation policy making. I am going to take 5-10 minutes to talk about IntelComp first, so that we all share an understanding of where I am going with this, and then I am going to zoom into the role of the OpenAIRE Research Graph. Okay, so IntelComp is a project composed of 13 partners; it is running right now and will finish next December. In it we are building a platform composed of all the tools necessary for research and innovation policy making, and I am going to make this more specific. Okay, so €310 billion was the EU expenditure on research and development in 2020, according to the European Commission. It is still a small share of GDP, below 5%, unfortunately, but it is still a very large amount of money. Research and innovation activities are a priority across different types of policies. At the same time, by this point there are several studies showing that these activities drive a large share of European economic growth, help the development of new and better jobs, and are key in addressing societal challenges. So we have money going into research and innovation activities, and potentially very positive side effects. So, when it comes to research and innovation policy making, it is extremely important to align with the priorities of society, the sustainable development goals among other things, and to make sure that the different dimensions affected by research and innovation policy making are taken into account. Research and innovation policy making has to be open and inclusive,
so that it is transparent, evidence-driven, and participatory: all citizens and all types of actors should feel included, and actually be included, in the policy making, since they are affected so significantly. The complexity is that research and innovation activities create a space that moves along very fast; it is complex, interconnected, large, and growing very fast. So the tools that we provide for policy making have to be up to par with this state of play. Consider the two ends of a policy cycle, from agenda setting all the way to evaluation. For agenda setting the question is: where should I invest next? And by invest I mean any type of resource, not necessarily financial. Which research topic, which organization, which country? Are there opportunities? Is there room for improvement? Is there hidden potential? Am I aligned with my societal goals? Where should I put my resources? All the way to the other end of the policy cycle, where we have the evaluation, and we answer questions such as: what is the impact of research activities across different sectors? What is the timing of that impact? If something is really important now, am I going to see a short-term effect, or am I going to have to wait 20 years? Is the enabling factor of open science helping? What are the pathways, to really understand how my funding policy approach contributed in this setup? So, from agenda setting to evaluation, we need to be able to navigate the research and innovation data and knowledge space easily and make evidence-based decisions. Given the characteristics of this space, the critical need is that this tracking and evaluating of research and innovation activities has to be data-driven, of course, but also relevant:
not all the data out there should be used for decision making, so we need the human experts to tell us what is relevant and what we should look at. We need to be comprehensive, because several aspects are involved, from people to industry to human resources to policy making. And also granular, because to have any hope of really gaining insight, we must go down to a very fine level. Given the characteristics of the R&I space, it has to be automated and timely, otherwise there is no point. It has to be sustainable, transparent, and replicable: we need to be able to keep assessing these activities in the long term, repeatedly, and using the same methodology, so that we can trust our insights and trust that they can be used today, tomorrow, and so on. Now, the methodological needs. The approach that we have is basically a two-pronged one. It relies, first, on policy intelligence. This refers to several characteristics of the approach: we are leveraging big data with AI systems, and this data is dynamic, multilingual, and heterogeneous; we put all this big data and AI to good use, and then we include human experts in the loop, in order to expand the capabilities of the system and really augment the intelligence; we use state-of-the-art technology, both in computational power and in AI techniques, to be up to par with what we are actually studying; and we have frequent updates so that we stay timely. So policy intelligence is the first prong, and the second prong is that it is extremely important to have open and FAIR data, as well as open, transparent, and reproducible methodologies. Open and FAIR data is key in discovering, linking, and tracking how research activities propagate across society.
We have no hope of going from policy to impact across the different aspects and dimensions of society unless we can link different datasets, and different data products, to each other. And of course we also need to be able to do this again and again, for reproducibility; this is where the open, transparent, and reproducible methodologies come into place, so that we have a replicable assessment. As a side note, FAIR data is also important for the automation of the process. Okay, so what is the IntelComp platform, now that we have discussed the motivation and the approach? As I said briefly at the beginning, it is an end-to-end platform, on a high-performance computing environment, for evidence-based research and innovation assessment and policy making. It comprises a set of tools that provide a cooperative environment where actors can visualize, interact with, and analyze information. So let me break this down. First of all, the actors, the stakeholders we refer to, are mainly policymakers, funders, different kinds of research and innovation analysts, and public administrators. Some of these platforms can also be used by SMEs, academia, and citizens, but they are not our main target stakeholders; they are still very important, though. So, let me break down the tools we are creating: there are business intelligence research and innovation dashboards, search and browsing tools, and project proposal evaluation tools, so all the things that end users need for research and innovation policy making. And underneath, they are supported by the data lake, NLP pipelines, and analytic workflows, including an interactive model planner, and a catalog of resources that presents all of this together and can help someone visualize the system, let's say.
So, all of these are co-created using a living-lab approach, which means our work is based on three living labs: climate change in Greece, AI in Spain, and cancer in France. These three obviously represent different domains and different countries, which also means different institutions. They are there to refine and curate the approach and the needs of each particular living lab and to guide the IntelComp platform. So what is the role of the OpenAIRE Research Graph in all of this? Let's break it down using two somewhat simpler examples. Suppose we are at the agenda-setting part of the policy cycle. There is a policy maker who is considering how to set their agenda on a particular topic, let's say climate change, in a particular country, Greece. So we have a policy maker who wants to put some money into climate change in Greece, and they have to figure out how to do it and where to invest. What are the topics that should be paid more attention? What does the future hold, where are we heading? Where are we productive? Where can we actually make a difference and grow fast? Are there topics that we work on in science where nobody really does anything with them in industry? Can I help with that? And do people care about these things? Which means that, in order to make this decision, the policy maker needs a view of the research and innovation landscape for that topic in Greece: they have to know the different areas and how they link, such as knowledge creation and diffusion (what is going on in science, basically), human resources (do I have jobs on that topic? if not, do I have people with skills on that topic?), entrepreneurial activity, societal challenges, and so on. I mean, it is a big picture.
That is the starting point. At the other end of the policy cycle, let's suppose we have a funder who is trying to evaluate the impact of a particular program they ran that guided research and innovation. In this case what you are interested in is a more step-by-step process: how did the inputs, my program and all its characteristics (the topics, the people who participated in the projects, the projects themselves, the amount of money, and so on), lead to outputs, that is, what was produced during the projects; then to outcomes; and then, longer term, to the impact of these projects and of this program? In both of these cases, the major data source is scientific research products. In the agenda-setting case, what is key to answering these questions and helping this policy maker is to understand how this topic, climate change, or something more specific, let's say green hydrogen, plays out in the different sectors of the economy, and how these are linked. And think of the different sectors a little bit as different data sources. For example, when I am talking about knowledge creation, we are talking about scientific research output: you start with the scientific research output on green hydrogen, and then you see how this is linked, what the relationships and dynamics with the other sectors are. When we move to the evaluation, what is of key importance is that you first link the input, the project or projects, with the project output, which is most usually research products, and then you track the outcome and the impact and so on. For example, say we are in the domain of health, and there is a scientific publication that leads, among its outcomes, to a clinical trial with a repurposing of a drug.
You may want to track later whether this drug was successful, or, if it wasn't, whether a substance discussed in the publication was used somewhere else, or whether a therapy was created and tested and then entered clinical guidelines: how many people are being affected today by this therapy, for example? So we need two things here: we need the link between the project and the research output, and then we need to be able to track the innovation in the research output, and the organizations and authors that worked on that innovation, into different parts of society later on. We need to extract the innovation from the research output. What does this mean? It means that a research graph of scientific products is key as a major data source, both for agenda setting, what is going on in science, and as step number one in impact assessment: what did this project directly create? These are of course only two of the many contributions of a research graph, but they are two of the main ones that I will also come back to later on. Okay, so why use the OpenAIRE Research Graph in particular, and not any other one? Well, it aligns extremely well with the methodological needs that I presented earlier and with our approach and principles. We rely on its coverage, richness, and timeliness. The scientific research outputs and the links between them have very good coverage. There is rich metadata, both inherited, such as organizations, data sources and so on, and enrichments, such as citations, article processing charges, SDGs, fields of science, that sort of thing. And it has a fully operational big data infrastructure supporting it. At the same time, it is all about, obviously, open science and open data.
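To make the input-to-output-to-impact tracking idea a bit more concrete, here is a minimal sketch over a hypothetical in-memory set of links; the entity identifiers and the link structure are purely illustrative, not the actual OpenAIRE data model:

```python
# Hypothetical links: project -> publications -> clinical trial -> therapy.
# Identifiers are made up for illustration only.
links = {
    "project:H2020-123": ["publication:doi-a", "publication:doi-b"],
    "publication:doi-a": ["clinical-trial:ct-1"],
    "clinical-trial:ct-1": ["therapy:t-9"],
}

def reachable(node, graph):
    """Collect everything downstream of a node: outputs, outcomes, impact."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for nxt in graph.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen
```

Starting from the project and walking the links forward is exactly the "step number one in impact assessment" described above: first the direct outputs, then whatever those outputs themselves lead to.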
So it is inclusive, transparent and replicable, which means that using it as a major data source really aligns with our methodological approach, and it makes everything trustworthy for the end users, as everything created on top of it will be replicable. And it is fully embedded in the EOSC infrastructure. How do we use it? It is a major data source in both NLP pipelines and analytic workflows. We extract information from the publications, the abstracts and the connections of the graph for topic modeling, classification, extraction, something called claimed evidence, which means we look at what the abstract claims was done and then check whether it was actually done, and all sorts of other pipelines; this is not the place to present everything, but just to give you a hint. Then we also use the linkages across the research and innovation products, and the actors and networks that are implied through them. And we use all of this to create a set of research and innovation indicators broken down at a very fine level, as the granularity of the graph allows. Okay. So what can we see with it in the end? At this point we are still far from the end products of IntelComp, so I am going to give you a sneak preview of something that is called, for now at least, the STI Viewer. It is one of the three IntelComp platforms for end users; the other two are the STI participation portal and the Evaluation Workbench, and please reach out to me if you are interested in the specifics of each. The target audience is research and innovation analysts: someone who really needs to dig into the data and advise someone at a higher level. So the STI Viewer is built to analyze, compare and visualize a comprehensive set of research and innovation KPIs for agenda setting, so the beginning of the policy life cycle, and for evaluation. Okay.
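Just to give a flavor of what "extracting information from abstracts" can look like at its very simplest, here is a toy term-frequency sketch; it is a crude stand-in, the actual IntelComp pipelines use far more sophisticated topic modeling and classification, and the stopword list here is an arbitrary illustration:

```python
import re
from collections import Counter

# A tiny, illustrative stopword list; real pipelines use much larger ones.
STOPWORDS = {"the", "of", "and", "in", "we", "a", "to"}

def top_terms(abstract, k=3):
    """Return the k most frequent non-stopword terms in an abstract."""
    tokens = re.findall(r"[a-z]+", abstract.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [term for term, _ in counts.most_common(k)]
```

Even this naive frequency view hints at how a pipeline can turn free text into structured signals that can then feed classification or trend analysis.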
So I am going to present now some screens of the working version of the STI Viewer; of course it is not in its final form. This will briefly make less abstract what I have been saying up to now. The user story we are following is that we are interested in agenda setting in the energy domain in the EU. We have broken down the economy into six categories: science, technology, industry, human resources, policy and society. And we start with science. What we see here is the evolution of different topics within energy over time, in terms of scientific production. We have extracted the publications in energy and then, within them, classified the topics using different techniques into 25 topics in energy. And here we follow the trends. For example, a policy maker could be interested to see that there is this pink topic here: it seems to be very, very small at the beginning but then increasing in terms of volume. Similarly, the light blue one has really become something very dominant in the production, whereas before it was a much smaller share. In this way the analyst can see what the trends are. Of course, this by itself is not enough for decision making, but it gives a first picture of what is going on within energy. This is based on publications from the OpenAIRE Research Graph. Okay. Here on the left we have the publications by country, where we use the affiliations of the authors of the publications. So again, we are interested in agenda setting in the energy domain in the EU. We see, for example, that Germany and Italy, and maybe Spain also, seem to have the largest production in the topic of energy. So some countries have a real focus on it compared to the rest of the EU.
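The trend view described above boils down to a per-year share of each topic in total scientific production. A minimal sketch, over hypothetical records standing in for publications already classified into topics:

```python
from collections import defaultdict

# Hypothetical (year, topic) pairs standing in for classified publications.
records = [(2018, "solar"), (2018, "wind"), (2019, "solar"),
           (2019, "solar"), (2019, "hydrogen")]

def topic_shares(records):
    """Per-year share of each topic in that year's total production."""
    counts = defaultdict(lambda: defaultdict(int))
    for year, topic in records:
        counts[year][topic] += 1
    return {year: {t: n / sum(topics.values()) for t, n in topics.items()}
            for year, topics in counts.items()}
```

Plotting these shares over time is what lets an analyst spot a "pink topic" growing from a tiny share into a dominant one.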
Next, we see the citations per publication for each country in the energy domain. The citations are taken from the OpenAIRE Research Graph as well, as OpenCitations is integrated into the graph. Here you see that, in terms of citability of these publications, the view is much more even: it is not the same four countries standing out, it is a bit flatter. At the same time, we see that the Scandinavian countries are actually producing less in energy, but their work is more citable. And I am not doing the evaluation here, IntelComp doesn't do the evaluation; I am just giving examples of some of the insights. We also use the OpenAIRE Research Graph to extract the collaborations in these publications in energy, so we see here a timeline where in blue you have the publications over time, in yellow the international collaborations, so how many of these publications are collaborations between two different countries, and the green line is how many of them are cited at least once. Notice that, perhaps not surprisingly, and again all this information is from the OpenAIRE Research Graph, the two lines towards the end here are very, very close to each other. Which means that, if you make an inference, international co-publications are much more likely to be cited than, let's say, local co-authorships. Okay. Here we have classified the energy publications by SDG; we are actually taking the classification from the OpenAIRE Research Graph directly. And we see that, not surprisingly, the energy domain is working on clean energy. Excellent news. We also have the other SDGs that are affected: good health is, perhaps interestingly, the third one most covered in energy publications. And then you can see how this propagates through the other SDGs. That gives a bit of the view, and I am almost done; two minutes.
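The two indicators just described, citations per publication by country and the share of international co-publications, can be sketched roughly like this, over hypothetical publication records; the record shape is made up for illustration:

```python
# Hypothetical records: each publication lists author countries and citations.
pubs = [
    {"countries": ["DE"], "citations": 3},
    {"countries": ["DE", "GR"], "citations": 5},
    {"countries": ["SE"], "citations": 0},
]

def citations_per_publication(pubs, country):
    """Average citations of publications with at least one author in country."""
    mine = [p for p in pubs if country in p["countries"]]
    return sum(p["citations"] for p in mine) / len(mine)

def international_share(pubs):
    """Fraction of publications co-authored across at least two countries."""
    return sum(len(set(p["countries"])) > 1 for p in pubs) / len(pubs)
```

The same record structure supports the third line on the slide as well: counting publications with at least one citation is just another filter over the list.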
We have a different user story here: monitoring, for the European Commission, the Horizon 2020 and FP7 programmes in the domain of health. What we have here, based on the OpenAIRE Research Graph as one of the two sources, is the participant company uptake score. What does it show? It shows how close what companies are working on today is to what they were working on when they were participating in FP7 and Horizon 2020 projects. You take the projects and their publications from the Research Graph, you take what the companies are working on now from the company websites, you compute the semantic similarity, and you create a score such that the closer it is to one, the more the company is still working on the same things they were working on in the projects. This tells us something about diffusion, basically knowledge diffusion. Okay, this is just an example. These are the takeaways from my part of the talk. For evidence-based research and innovation policy making, it is important, given the way the domain looks, to have policy intelligence and open, transparent and replicable KPIs. We leverage the OpenAIRE Research Graph for its links and metadata, and create additional ones for research and innovation policy making, and it allows us to have, at least for the science part of the question, timely, comprehensive and granular views of what is going on. Okay, I see some questions. Thank you. Okay, great. Many thanks, Claudio and Ioanna. We were sharing a lot of inputs and useful information here in the chat, with Paula sharing some of the useful links and Claudio also adding the registration link in the chat for those that are interested. Feel free to ask questions; I think it was quite useful to have this overview of the OpenAIRE Research Graph and then a specific use case from the IntelComp project. If you have any question.
Feel free to ask here, using your microphone, or just put it in the chat, if everything is clear. Any comment, any question, anything that we need to better clarify? Claudio, also, if you want to add something in practical terms to what Ioanna presented, feel free. In the coming days we will also see other ways to use and consume the OpenAIRE Research Graph, in different OpenAIRE services and by different users and stakeholders. We really want to highlight during this week the practical use of our services, and here Ioanna presented a specific example of a project that consumes the OpenAIRE Research Graph for specific purposes. Some comments here. Thank you. Thank you for your comment, Pedro. Just a comment: as you also mentioned, I was thinking that, in general, the role that the graph plays in the portfolio of services will probably be better highlighted, in a comprehensive view, during the other sessions, because it is quite central in the portfolio. The commitment from OpenAIRE is of course to improve the quality aspects of the data it contains, as well as the practices around it, and likewise its approachability, so that it can be approached in an easy way by data analysts, researchers, and practitioners in general, lowering the barrier to using this massive amount of information. Thank you very much. There is also a comment: would there in the future be a way, or a guide, on how to extend the OpenAIRE Research Graph to map it to the variety of ontologies out there, in a way that provides a more granular dataset description? Would there be an option to extend and describe the data more granularly on the end of the user? And should we be aware of the future granularity in order to develop a complementary data description? Let me reply to this. In short, yes.
We are going to provide more fine-grained documentation; it is going to be published as a website, likely part of what you already see under graph.openaire.eu. Then, regarding the mapping towards known ontologies: the tricky part here is to pick the ones that fit best, because the choice, as far as I know at least, is large. We need to be sure about what makes more sense for the communities out there; we need to analyze the requirements from the communities and decide which ontologies are most relevant, because OpenAIRE cannot implement this mapping for every ontology that is available out there. But yes, there is the intention to do it. Thank you. Thank you, Claudio. I think this raises an important issue, and one of the comments I want to make, for those that are here: if you want to send us any comment, any criticism, any assessment of the Research Graph, share it with us, because we are always trying to improve, and in fact we are in a critical phase, also strategically, of developing some updates for the graph. So feel free to send us comments, and any gaps that you may find when you try to use the graph. I know of several initiatives from research communities, even national research information infrastructures, and some software initiatives like VIVO and others, that are trying to consume the Research Graph and have comments and suggestions to send. So feel free, because this will be useful for us. We cannot cope with everything, but for sure some of them we can try to address. There is a follow-up in the chat: would there be a mapping of the upper ontology of OpenAIRE towards other upper ontologies? In any case, I would be very interested to keep up to date; thanks for the answer. Claudio, you already replied; I am not sure if you want to add something about an upper ontology of OpenAIRE.
No, I don't have a comment on that right now, but when we start working on this mapping there is going to be some buzz, some information in the newsletter for sure. To me it is not yet clear when this is going to happen, but as we do regular meetings on setting the priorities on the roadmap, it is going to be a topic. Then I see another question, from Julia: would it be possible for the user to add which SDG can be linked with their own research? Can you elaborate a bit more, Julia? Yes, thank you, Claudio. I was wondering, for instance, if I publish some research but at this moment there is no link to any Sustainable Development Goal, or to a field of science to which I feel my research is more linked, can I provide any feedback, or is there any way in which I can suggest what the link could be? Well, in general, when you publish your own research, you typically deposit your manuscript specifying some topics that are related to it. If you, for example, publish something that is relevant for the climate change SDG, you will likely indicate this subject among the keywords. So this information will be available in the metadata fields when OpenAIRE acquires the bibliographic record and exposes it on the platform. It is not guaranteed that the subject you indicated is going to be typed as an SDG. But, and this is an insight from an ongoing activity, we recently started to extrapolate fields of science, and we still need to dig into the graph for possible candidate SDGs disguised, let's say, as keywords. For the field-of-science subjects there were many, let's say, low-hanging fruits that could be explicitly labeled as field of science, provided for the large part by Crossref, and they touched something like two million publications.
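As a toy illustration of what "SDGs disguised as keywords" could mean in practice, a naive lookup like the following would flag candidate records; the keyword-to-SDG mapping here is purely hypothetical, and the actual OpenAIRE classification is model-based rather than a lookup table:

```python
# Hypothetical mapping from author keywords to SDG labels, for illustration.
SDG_KEYWORDS = {
    "climate change": "SDG 13 - Climate action",
    "renewable energy": "SDG 7 - Affordable and clean energy",
}

def candidate_sdgs(subjects):
    """Flag deposited subject keywords that look like 'disguised' SDG labels."""
    return sorted({SDG_KEYWORDS[s.lower()] for s in subjects
                   if s.lower() in SDG_KEYWORDS})
```

Over millions of records, even a simple pattern like this can surface large numbers of candidates that a curation or classification step can then confirm.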
So the data is so vast that sometimes, to recognize that there is a pattern, you just need to dig in the proper direction. But in general, you can for sure specify that your work is related to climate change as part of the subjects. There is no standard on that right now, and it also depends on the capabilities of the platform where you deposit. But perhaps this could be compensated for by guidelines giving some hints on how to encode the subjects, how to express them. Just an idea. Thanks. So I think we can conclude.