...story on the European Open Science Cloud. In short, EOSC started in 2016 with Carlos Moedas wanting to ensure that all the research data being produced are used to their full potential. Back in 2016, and this tradition is still going on, researchers got public funding for performing their research. They publish in research publications. They have their research data, but once the publication is out in the journals, hardly anything happens with these research data. Reuse is not a common practice, at least.

Sorry to interrupt, but your camera is still off. Do you mind switching it on?

Oh, don't worry. Sorry. Okay, normally you should be able to see me now.

Yes.

Okay, so I'm going to recap a little bit from here. Let me quickly swap them out.

Going back to the story of research data: Carlos Moedas advised that research data should be reused and brought together so that more scientific findings and innovation could come out of them. In 2017, based on that initial thought, the EOSC Declaration was signed in Brussels on the 10th of July. This declaration wants to put a stop to the current practice in the research world, which this picture depicts. Every country, every region, even every research institution produces research data. Most of the time people are not aware of where these research data reside, and the data are not made interoperable. If you want, for instance, to combine a research data set on mobility from Italy with one from Belgium, this will be very hard to do, because there is no way of combining the data given the lack of a common data model, the lack of interoperability, and the lack of metadata standards for sharing the data.
Also, most of the time no thesauri or ontologies exist. So in order to ensure the sharing of data, this was the vision that was put forward: to combine data from several sources, and to ensure that EOSC as a portal could help researchers, other public parties, private parties and the public at large to find the research data they need. EOSC could serve as a kind of service catalog, one highway through which all the research data could be used.

Several models were being researched at the time to see what would do the job, and they came up with a federated data model as the best fit for EOSC in terms of resources, services, governance and costs. They envisaged that the cost of making a federated model would not be that high and that acceptance by the member states and the stakeholders could be quite high, but that some things were needed. In this talk we will focus on how interoperability matters if you want to build a federated data model.

Based on all the information on EOSC and the declaration signed in 2017, a strategic roadmap was signed in 2018. Based on that, an EOSC governance board and an executive board were installed, and under that executive board six working groups tried to find out what was required to build the EOSC landscape. Within this field, one piece of work in particular deserves mention today, and that is the EOSC Interoperability Framework.
I put some links here, so if you would like to review this information, it is openly available. In the EOSC Interoperability Framework four major things were put forward. The order is not a priority; it is just the way they are mentioned.

First, technical interoperability is required. This comes down to the exchange of data and the exchange between different infrastructures, but also the services that go along with that, so everything needs to be combined in a technical layer.

Second, this only makes sense if you also have semantic interoperability, meaning that you need an unambiguous, shared meaning of the metadata and the data you are sharing. This can be done, for instance, in ontologies or thesauri, and it is required so that the data you reuse make sense. It requires a minimal set of common metadata formats. At this point in time many different metadata standards exist. It is not that none exist; there are too many of them, mapping them to each other is quite a burden, and hardly anyone is doing it. Everybody makes a new metadata format of their own when they think it is required. Whether that is the way forward is another matter. Maybe we should first try to see whether the semantics of these metadata standards can be aligned and to what extent they can be mapped to each other.

A third point put forward in the interoperability framework document was organizational interoperability. If you have data stored in your own research data repository with some semantics, those semantics are also based on your context and on the organizational processes linked to those semantics and to the data capture process, so
some kind of organizational interoperability is obviously also required. However, most of the time this is forgotten when people speak about interoperability.

The fourth one was the legal interoperability that is required. If you want to combine data from several sources, from EOSC or from wherever, there are internal regulations but also, to a large extent, external regulations which you need to take into account, and policy measures that might be drafted which impact the manner in which you can reach this interoperability.

So in EOSC, interoperability was and still is a very important topic, because this is really what is required to make things happen. It is not only in the EOSC landscape that this term is predominant. It was also prominent last week during the ENDORSE conference, a conference based on the ISA² programme previously set up by the European Commission. One of the most important questions there was how to foster data linking across sectors, combining data from the public sector, the private sector, industry and so forth, and also across different domains. The thing that came out most prominently, and you see here the percentages, which I captured as a screenshot during the talks, was that alignment of metadata standards, vocabularies and semantics is really crucial to take the next step forward.

That brings me back to my story. If all these things are really required and necessary, what does that mean in the Flemish context? In Flanders there was a regeerakkoord, which is an agreement of the Flemish government, and for the term 2019 to 2024 it was decided that research data should be combined as much as possible and made public as much as possible, based on the requirement that they received public Flemish money. In order to
operationalize this note, there was a decree by which the Flemish Open Science Board was created. The note can be found; there are some references here. What it basically says is that the Flemish Research Data Network, which is the operationalization of the Flemish Open Science Board, needs to tackle three big things. First, a knowledge hub to share expertise, not only between the research data management stewards but between all the stakeholders involved in research data. Secondly, a data hub where the research data can be combined. Thirdly, a discovery hub where the metadata of these research data can be provided to FRIS, and from FRIS coupled to EOSC so that they become findable.

That brings me to what FRIS is. FRIS is a public research information portal. You see a small screenshot of the portal at the bottom of the slide, and the link to it is researchportal.be. This research portal has been used for many years to disclose information on publicly funded research in Flanders openly to whoever wants to use it. At the moment there is information available on researchers, organizations, projects and publications, but based on decrees from the Flemish government on the BOF, which is the special research fund, and the industrial research funds, this will be enlarged. There will also be an obligation, for instance for the Flemish universities, to include research data in this portal. The other stakeholders involved, the higher education colleges, the strategic research centers and the Flemish scientific institutions, all get obligations in their agreements with the Flemish government to provide this information to FRIS, with specific deadlines.

So if we want research data to be discoverable through FRIS, we need some interoperability and we need some metadata standardization work. I might skip these
slides. What you saw in the previous slide is what we have thus far in FRIS, so these four objects. What we will do in the near future is to enlarge this picture. We will ensure that the links between all these objects are made closer, but there is also an enlargement with, for instance, data infrastructure and information on patents. The extension is not only about the objects but also about the people and stakeholders who deliver the research information. Previously it was mostly the universities that provided the information. Now the whole research picture will be involved, meaning the Flemish higher education colleges, the strategic research centers and the scientific institutions.

Having set the scene in which we are working, let's take a dive into how we developed this Flemish application profile for research data sets that will be used in FRIS. The first thing we had to do, which might seem a little awkward, was to define with all our stakeholders what research data are, because this really determines what we want to have and what we want to disclose through the FRIS portal. There are many definitions out there, but for now we have converged in Flanders on this definition. It is quite broad: it includes, for instance, secondary data obtained from third parties, and it covers not only documents and software but also information on, for instance, biological materials.

What does interoperability essentially mean in this process? It means that through this portal we have to know where these data are and what they are about, and we have to know it at least as well, qualitatively, as if they were the portal's own data and metadata. That means that
the information user who uses the FRIS portal should not have to know any of the semantics or processes at the institutions that provide this information. Everything should look and be treated the same way. So the information being sent needs to be contextualized when it is sent. This requires either convergence on a common metadata model, which is one of the options, or interoperation among many metadata models, but then you need mappings from each model to the others. With around 25 stakeholders all creating research data, we had to see which option was best, and I can already tell you that it was the first one: convergence on a common metadata model.

The high-level principles we used for the metadata model are the following. The difference between metadata and data is the mode of use. Metadata are not there just for the data; it is essential that they are useful for users, for services and for computing resources. They are not merely a description; they are there to be useful to any other party. They are also not just for description and discovery but for contextualization: how relevant are the metadata to the data, what is the quality of the data, and are there restrictions on the data we want to use in terms of license agreements, third-party agreements, disclosures, or other legal or even ethical restrictions? A fourth principle is that the metadata should be machine-understandable so that they can also be harvested by other services that would like to use them. Metadata should also be relevant and linked to other information sources already on the FRIS portal, for instance projects and publication output, whenever possible. The
implications of these principles come down to five points that need to be tackled. The syntax: what the metadata cover should be structured according to objects and properties. The semantics: there should be relationships, and preferably the model should also handle the very common problem of multilinguality; however, as we are restricted to Flanders, this has not yet come into play. There should also be temporal information. It should be ensured that the metadata have a high level of integrity. And they should be represented in some form of first-order logic. I will not go into further detail there; if there are questions later on, just tell me and I'll get back to you.

These are quite general principles for drafting a metadata scheme. When it comes to research data, we also checked the research data and metadata guidelines that had been drafted, and the ones we used are those of the GO FAIR initiative, which defines four principles according to the FAIR acronym: findable, accessible, interoperable and reusable.

Findable means that the metadata should be assigned a globally unique and persistent identifier, which immediately places some demands on the metadata scheme we are making. Data are described with rich metadata, metadata clearly and explicitly include the identifier of the data they describe, and they are registered or indexed in a searchable resource. So if we want the metadata in FRIS to be findable, these are the things we have to take into account.

With regard to accessibility, metadata should be retrievable by their identifier using a standardized communication protocol, and they should be accessible even when the data are no longer available, meaning that metadata will live beyond the lifetime of the research data itself. Metadata should also be
interoperable, so they should be handled in a common way so that machines can read them. That means they use a formal, accessible, shared and broadly applicable language for knowledge representation. They should use vocabularies that follow the FAIR principles, so there should be a thesaurus or ontology behind them, and there should be qualified references to other metadata schemes whenever applicable.

With regard to reusability, this comes down to the metadata meeting, for instance, the common agreements made within disciplines, so that they can be reused in the same discipline. As you can understand, many research disciplines have their own metadata standards in place.

Based on these four FAIR metadata principles we started off. First we set up a governance structure: who is involved in this work? For this we directed our question to the Flemish Open Science Board, and we set up a working group on metadata and standardization with representation from all Flemish research universities, the higher education colleges, which are currently building a digital open science platform abbreviated as DOSP, the strategic research centers, the Flemish scientific institutions and the research funders. This kind of governance structure is quite necessary, because later in the process you will see that we need to know who will be involved when we speak about the semantics and about the processes that come into play when you look at the research data. These are people who most often reside in research coordination offices or libraries and who handle the research data themselves, for instance through metadata catalogs, and who already have experience with this from research
publications and with building data and publication repositories. So these people were included in the working group.

Then we checked what we needed to do. Would we draft a new metadata model, or would we rather start from an existing generic metadata model that is already widely in use and has existed for many years? In that regard we reviewed several generic metadata models, and we considered that of DataCite, the 4.3 model that was released in 2019, to be one of the best options. This is also due, to some extent, to the Flemish context, because the DataCite model is already mapped to CERIF: there are CERIF XML guidelines for mapping to DataCite. DataCite is also already used in OpenAIRE, a platform that openly discloses publications as well as research data sets. Because of this, and because FRIS is built on CERIF, it was easier to look at DataCite first, because in the next stage, when we really implement the metadata model in FRIS, we can draw the link between OpenAIRE, DataCite, CERIF and that metadata model. That step is then not as big as it would have been with, for instance, a different metadata model.

What we did next was to check how the DataCite model is built and what we want to be present in FRIS. Obviously not all the metadata fields of DataCite are relevant in the context of FRIS, so we checked what is relevant and what concept each field in the DataCite metadata scheme denotes. We had discussions on what an identifier would mean for us. We also had discussions on what a contributor would be, because the DataCite model is quite large: it has more than 17 contributor roles. Do we really want to include 17 contributor roles in FRIS? And if you look at the roles that
we would like to include in FRIS, is the meaning of the concept as used in DataCite exactly the same as what we intend?

So we checked line by line what DataCite had. This is just a small glimpse of DataCite 4.3. You see here, for instance, what an identifier is. It is semantically defined by DataCite, which is very nice for our metadata model, since not all metadata models in the world have this. We also saw, for instance, allowed values, examples and the other constraints put forward by DataCite. We went over each of these and checked whether there was common agreement or not.

This was done through what are called, in data governance, semantic cycles of reconciliation. Every information provider checks what is technically feasible, not only with the technical experts but also with the business domain experts who will have to store the information once the repositories and metadata models are built, and what they think is required with regard to the research information world. Then they check that against what we have proposed as a generic metadata model at the Flemish level. So each time we checked something, it came down to checking what all institutions, close to 25 to 27 of them, were thinking, and whether we could find common agreement on a high-level generic metadata model. We continued these cycles until we had drafted our metadata model, and as you will see, there were quite a few iterations. The model that has been concluded and agreed upon is version 1.7.

The characteristics of that model are not completely the same as DataCite's. DataCite has many properties; we
have 15 that originate from DataCite, but we expanded some of them. So you see that there is an enlargement, and on the second slide I will go into the details of why we needed to do that. In this scheme you see the term used in the metadata model in Flanders, the corresponding DataCite term, and the link to the property as defined in the DataCite 4.3 model. Then you see an extra field stating whether the property is mandatory, mandatory if applicable, recommended or optional. Here we already deviated from the DataCite scheme. This was required because the FRIS portal, and thus this metadata scheme, will be used to see how open science in Flanders is evolving. To do that, some KPIs were defined, and based on these KPIs some of the metadata fields are really required to allow automated measurement of the KPIs. So this is already something we contextualized, if you want. There is also an indication of whether a field has been included because of a KPI implication or not.

There were also discussions on the values that are allowed for these identifiers. If you look at one of the previous slides, and I will share my slides later on, you see that in the DataCite scheme the preferred value for an identifier is a DOI, while here in Flanders we also allow other identifiers, to ensure that the metadata and the data can be found. The definitions, values and examples we used are mostly quite in line with DataCite. Sometimes additions have been made, and examples and use cases are added to the model to ensure that we all have a common understanding of how to use these metadata fields.

We also had some expansions of the metadata model, as I just mentioned: three fields that were split in two and three new
fields that were inserted.

We split, for instance, the metadata field on description into one that is the abstract and another that is the description. The description is a more formal, technical description, while the abstract is used in FRIS to allow for searches. In this way we can ensure that, for instance, the abstract can be made a mandatory field while the other description is optional.

The same was done for the subject field. A distinction was made between research discipline and keywords. Research disciplines are used as filters on the FRIS portal and keywords are used in the search terms. These are two different fields in FRIS, and they do not coincide with the DataCite metadata scheme, so we split the subject field.

The third one we split was the rights field. We expanded it into the licenses on the data on the one hand and the access rights to the data on the other. The licenses and the access rights can be seen here, and they are used in the KPI implications, so we needed a more detailed level than what was available in DataCite. If you look at the access rights, we now have the values open, embargoed, restricted and closed. These values are imposed at the Flemish level and are not present in the DataCite model, but they allow us to measure the KPI that has to be measured as from 2022. Because it is a KPI requirement, it is a mandatory field. If you look at what the license field has become, there are, for instance, values like no license, a CC0 license or a CC BY 4.0 license. There will be a list of values that can be used. For now this is recommended, but as from
2023 this will be mandatory, because it has been taken into account in the KPI monitoring.

Next to that we also have three new metadata properties: one on open format, which does not exist in DataCite, one on legitimate opt-out options, and one on the FAIR data label. All three also coincide with the KPI implications we have in Flanders. This means that we have not only created a metadata model based on an international standard, DataCite 4.3, and contextualized it to the Flemish context, and ensured that we know what is mandatory, optional or recommended, but we have also extended the metadata model in a manner that can be used to automate KPIs on open science in Flanders.

That being said, I am not there yet, but almost. What is next? We have to implement this, and this is due by December 31st of this year, 2021. That means we have to ensure that the metadata model is built into FRIS but also into these 25 to 27 stakeholder systems. Secondly, we have to ensure that the metadata fields of, for instance, data and projects are linked to each other in order to measure the KPIs. In fact this means that the project metadata model had to be extended with a DMP attribute and a DMP identifier.

This is the work we have presented, and it brings me to my conclusion. We have created a Flemish application profile for metadata based on the DataCite 4.3 standard, but contextualized to the Flemish understanding. We have ensured that the semantics are correctly and unambiguously understood by all who use the information. And thirdly, we have ensured that metrics can be derived from it to see how open science in Flanders is evolving. That being said, I open the floor for questions.

There are already two questions in the
chat, one from Jochen and one from Peter.

Let me quickly check where I can see the questions. Can you read them in the meantime?

Yeah, sure. Jochen asks: I would like to know more about where first-order logic is used in the FRIS metadata schema.

That is quite an in-depth question, so I will come back to that later on. I think it is best, Jochen, that you send me an email to discuss that. And the second question, where can I find that?

I see it in the public chat. Peter asks: is it possible that the repo isn't public?

That can be the case, because at first this was a private conversation until we had a common metadata model. But I will ensure that it is available in a public place. If you drop me a line, I can send you the information, so it is not a secret.

Do you know if Brussels or Wallonia also has the same system of metadata?

For now Brussels at least does not have an information portal on research information. On the Walloon side they are creating something like that, but it is not there yet. So in Flanders we are quite ahead of things in that we already have an information portal on several of these research information objects. This is not the case in the Walloon region nor in the Brussels region. We are also one of the first to have the metadata model implemented in such an information model together with metrics on open science. This is also hardly done anywhere.

Would it be hard to copy-paste the same methodology?

No, the methodology is not rocket science, I would say. It is about discussions between stakeholders and making sure that heads are pointing in the same direction. That needs to be in place first, and then you can work down to the details and contextualize everything. Are there other questions?

There is a question in the chat from Mark and also one from Jochen.

"I take it DataCite is..." Is that a question? Yes, it is now based on that. That's correct, Mark. And then the second one from Mark: are there any relations with DCAT-AP? With regard to DCAT-AP, there are no application profiles for research data at this point in time. They have them for open government data, but research data is lacking there, and that is the reason why we also participate in conferences like the ENDORSE conference, to ensure that research data would also be established as an application profile there. What we do have are communities like euroCRIS, an international organization for common European research information formats. The link to that one is here. These communities are building these common metadata models and have the semantics aligned with them. The metadata standard they are using and creating is CERIF.

Is this not isolating research too much from government data and the Google world wide web?

Yes, in part it is. On the other hand, initiatives are coming up like this ENDORSE conference, whose particular purpose was to bridge the gap between all these different initiatives, because metadata standards are popping up like mushrooms. The big thing is how to combine all these metadata standards, how to create some kind of overarching metadata model on top of the metadata models, let's say. This needs resources, it needs time, it needs money, but it also needs people speaking together in the same language. That is quite hard at this point in time because there is no body that is providing this money. That was one of the things that came out of the discussion at the conference this week. Yes, that's true, Mark.

And there is also a question from Jochen: is this using the extended CERIF model?

Yes, and we'll continue to use the extended CERIF model for
this. That's true.

Mark is typing again. You can turn on your mic if you want, Mark, maybe that's easier. Or anyone else, if you have a question.

You should be able to hear me now.

Yes, we hear you.

Okay, always pushing three buttons. Coming back to the fact that FRIS is now making its own XML schema: I heard a complete explanation of where you are deviating from DataCite. Is the end result then compatible in either direction? I mean, does it validate?

It's not, because we have some fields that differ. Where a field was split, yes, you can do that. But for some entries, for instance the open format, there is nothing there. We could put it in the format field, but that is not exactly what we intend with our field. So I would say no.

Okay.

There are three such fields. There is, for instance, also the legitimate opt-out. This is discussed quite often in the European context, because people think it is really valuable to have this kind of information, but because no classification exists, nobody ever tries to get it in, while it already provides very valid information on why people are not opening up their research data and could give some indications for drafting a policy on that.

The third one is the FAIR data label. Whoever is aware of the research context and all the projects going on around FAIR knows that there is quite some discussion and debate about what a FAIR data label would mean and how it can even be included. You can measure FAIR at the level of the metadata, but more often FAIR is measured at the level of the data itself. Then the question also comes down to which FAIR you are measuring: are you focusing first on findable and accessible, or are you also taking into account interoperable and reusable? And what policy measures are behind that when
you want to measure FAIR? That is the reason why you will hardly ever see someone having FAIR included yet. But I think for FAIR there will be a counterpart. Whether that will then be exactly the same as what we will have in Flanders is another question.

Maybe I can come back to linked data as well. Are there any plans at FRIS? I know there is something like a DCAT-AP mapping for DataCite. Is there already something in the plan to go that route?

Not yet, because in the Flemish Open Science Board we are trying to do many things at the same time to get this off the ground, and while this is very valuable to do, we do not have it in our planning yet.

There is also another question from Jochen in the chat. He is interested in the mapping between DataCite and CERIF.

That is something that is currently going on in the euroCRIS group, so there are also CERIF guidelines on that. I can send you the guidelines later on, Jochen, if you drop your mail address somewhere.

That would be great. And there is an answer from Pascal to Mark's question. But I think we have also reached our time for today, for this session. We said it would take 45 minutes, so I suggest that I stop the official recording, and we can keep the chat going for maybe a minute or two longer if there are more questions.
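
The obligation levels and controlled values described in the talk lend themselves to automated checking, so here is a minimal sketch of how a record could be validated and how a KPI could be derived from the access rights field. This is not the FRIS implementation. The field names, the `validate` and `open_share` helpers, the sample records and their identifiers are all hypothetical. Only the access rights values, the license rule (recommended now, mandatory from 2023) and the abstract/description example come from the talk; the other obligation levels are assumptions marked as such in the comments.

```python
# Controlled values mentioned in the talk for the access rights field.
ACCESS_RIGHTS = {"open", "embargoed", "restricted", "closed"}

# Obligation level per field (field names are hypothetical).
OBLIGATION = {
    "identifier": "mandatory",     # assumption, not stated in the talk
    "abstract": "mandatory",       # given as an example in the talk
    "description": "optional",     # given as an example in the talk
    "access_rights": "mandatory",  # stated: required for a KPI from 2022
    "license": "recommended",      # stated: mandatory from 2023
    "keywords": "optional",        # assumption, not stated in the talk
}


def validate(record, year=2022):
    """Return a list of problems found in one dataset metadata record."""
    problems = []
    for field, level in OBLIGATION.items():
        # The license field becomes mandatory from 2023 onwards.
        if field == "license" and year >= 2023:
            level = "mandatory"
        if level == "mandatory" and not record.get(field):
            problems.append(f"missing mandatory field: {field}")
    value = record.get("access_rights")
    if value and value not in ACCESS_RIGHTS:
        problems.append(f"invalid access_rights value: {value}")
    return problems


def open_share(records):
    """Hypothetical KPI: fraction of records whose access rights are 'open'."""
    if not records:
        return 0.0
    return sum(r.get("access_rights") == "open" for r in records) / len(records)


# Made-up sample records; the identifiers are placeholders, not real DOIs.
records = [
    {"identifier": "doi:10.1234/example-1", "abstract": "Mobility survey.",
     "access_rights": "open", "license": "CC BY 4.0"},
    {"identifier": "hdl:1234/example-2", "abstract": "Interview transcripts.",
     "access_rights": "restricted"},
    {"identifier": "doi:10.1234/example-3", "access_rights": "public"},
]

for r in records:
    print(r["identifier"], validate(r, year=2023))
print(open_share(records))
```

Keeping the obligation levels in one table separate from the checking logic mirrors the way the profile is presented on the slides, and a rule change such as the license field becoming mandatory stays a one-line adjustment.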