Hello, my name is Marjan Grootveld. I work for OpenAIRE at Data Archiving and Networked Services, or DANS, in the Netherlands. In this presentation I will touch upon FAIR and open research data, the role that good repositories can play in preserving and publishing those valuable data, and initiatives for measuring the FAIRness of existing data. And my call to action for all of you is: let's aim for a FAIR-aligned research data life cycle.

But first a few words about OpenAIRE. OpenAIRE, as you can see, supports different activities and different stakeholder groups. We support open science policies, which is relevant for policy makers and for those affected by those policies, and we offer training and support in that area. OpenAIRE also supports infrastructures, both technical infrastructures and human infrastructures, and we run a couple of services. Open research data and FAIR data are important topics for OpenAIRE, as you can also see from this webinar, and of course open access to publications, for which we provide guides, repositories, and information about compliance.

You are probably familiar with the FAIR data principles. The F stands for Findability: we want research data to be findable both by humans and by computer systems, and good descriptive metadata allow the discovery of interesting data sets. Data should also be Accessible, stored for the long term in a trustworthy repository and combined with a good license that tells you what you can do with the data. In the ideal FAIR world, research data are also Interoperable. That means you can combine data in one data set with other data sets, and "you" in this case can be a person as well as a computer system. And in the end, research data should be Reusable in future research, for validation and so on. The image at the bottom is a screenshot from the article in which the FAIR data principles were originally defined, with all the authors involved, and you can find further links at the bottom of the slide.

Now, open and FAIR data are also in your own interest. For some disciplines we see evidence that when data are made available alongside a paper, that publication is cited more often than a publication which only says that the data are available on request. So publishing your data increases your visibility as a researcher. And of course, publishing open data that are not FAIR is perhaps not so useful, which is why we talk about open and FAIR.

The European Commission has installed what they call a high-level expert group on FAIR data, and last summer that expert group published its draft recommendations. You see two of them here. Recommendation 10 is about so-called trusted digital repositories, and it recommends that digital repositories get certified against the CoreTrustSeal certification scheme. Not every repository is as good as it could be, so certification is a means of finding the good ones that are committed to sustaining your data for the long term. Another recommendation is about implementing FAIR metrics: it is a very good idea to make your own data FAIR, but it is also very valuable to know whether existing data sets are in fact FAIR, or to signal that they are perhaps not as FAIR as they should be. So we need ways to assess the FAIRness of data sets.
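To make the findability point a little more concrete, here is a minimal sketch of the kind of descriptive metadata a repository typically asks for when you deposit a data set. The field names and values below are purely illustrative, loosely following common Dublin Core / DataCite-style elements; the repository you choose will have its own metadata form or standard.

    # Illustrative descriptive metadata for a data set. Field names loosely follow
    # Dublin Core / DataCite conventions; values (including the DOI) are placeholders.
    dataset_metadata = {
        "title": "Survey on commuting behaviour, 2017",
        "creator": ["Doe, Jane"],
        "publisher": "Example Data Repository",           # hypothetical repository
        "identifier": "https://doi.org/10.1234/example",  # placeholder DOI
        "date_issued": "2018-03-01",
        "subject_keywords": ["commuting", "survey", "mobility"],
        "description": "Anonymised survey responses on daily commuting patterns.",
        "license": "https://creativecommons.org/licenses/by/4.0/",
        "format": ["text/csv"],
    }

    # Rich, structured metadata like this is what lets both people and machines
    # find the data set and get a first idea of what it contains.
    for field, value in dataset_metadata.items():
        print(f"{field}: {value}")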
Another thing the European Commission does to support open and FAIR data is funding projects and installing a pilot, a flexible pilot called the Open Research Data Pilot, which you are probably familiar with. It also says that research data should be open by default and findable, accessible, interoperable, and reusable. Part of that process, as you can see in the image and in the guidelines on FAIR data management, is deciding where you will deposit your data and all the associated metadata, documentation, and software code for the future. The European Commission says in those guidelines that preference should be given to certified repositories which support open access where possible.

To make your data FAIR, or to assess the FAIRness of data for reuse, it may help to assume the position of a potential re-user. That is why the arrows in this research data life cycle point the other way: we start at reuse and work backwards through the process. When you go through those steps and consider what should be done, it becomes clear that a lot of documentation is needed. That is also one of the recommendations in the FAIR data checklist mentioned earlier, which you can download from Zenodo.

Documentation usually starts with metadata. Metadata are typically the keywords and other bibliographic information that you need to find existing data and to get a first idea of their content. It is recommended that you use community standards for metadata, because you want to speak the same language as your potential users and re-users; community-standard metadata enable interoperability. The repository where you are going to entrust your data may support or expect a particular standard, so it is good to check that in advance. And if you are not yet familiar with the metadata standards in your field, you may want to take a look at one of these resources. FAIRsharing offers a lot of information about standards, databases, and policies, and about how they relate to each other; that is a very good source of information. Another good source is provided by the Research Data Alliance in collaboration with the Digital Curation Centre, the RDA and the DCC: they have a metadata directory covering many disciplines. And another place you can look, also a collaboration between the RDA and the DCC, is the last one, shown in rather small print on the right-hand side. The nice thing about that site is that it also offers a long list of tools for metadata, so you will find tools for validating or producing metadata for different disciplines and fields. If you are not familiar with it, please go and take a look.

In addition to metadata, a good reusable data set needs documentation. Sometimes the assumption is that all the documentation needed about the data is already in the publication, but often that is not quite true. For instance, when you have a codebook with variables, you will not publish that in the article, but it is a valuable source to include in the repository. The same holds for your study design, a lab journal, an electronic notebook, the statistical queries you used, and so on. Ideally you would document and preserve everything that is needed to replicate the study. That is a kind of replication package, you might say, and that is the best thing to publish and to deposit for later use by yourself or by others. And again, if there is a standard in your field, please use it.
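As a small illustration of what such a replication package can look like in practice, here is a sketch of a deposit folder together with a few lines of Python that check whether the usual ingredients are present. The file and folder names are hypothetical examples, not a prescribed layout; follow the standard of your field or repository where one exists.

    from pathlib import Path

    # Hypothetical contents of a replication package prepared for deposit.
    EXPECTED_ITEMS = [
        "README.txt",        # what the study is, who made the data, how to reuse them
        "codebook.csv",      # the variables and what they mean
        "data",              # the data files themselves, in well-supported formats
        "scripts",           # the statistical queries / analysis code that was used
        "study_design.pdf",  # study design, lab journal or electronic notebook export
        "LICENSE.txt",       # tells re-users what they may do with the data
    ]

    def check_package(root: str) -> None:
        """Print which expected parts of the replication package are present or missing."""
        base = Path(root)
        for item in EXPECTED_ITEMS:
            status = "ok" if (base / item).exists() else "MISSING"
            print(f"{status:8} {item}")

    if __name__ == "__main__":
        check_package("my_deposit")  # hypothetical folder name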
About selecting a repository where you could deposit your data and documentation: it is good to look at it from both sides, both from the perspective of giving, that is, entrusting your data to the repository, and of taking, that is, reusing data. It is give and take, as it often is. Some criteria are that the repository is ideally certified, as the Commission also recommends. There are different standards for that, and you will find them, for instance, through the CoreTrustSeal; I will show you a link on the next slide. It is also important that a repository supports your needs when it comes to file formats and the access regime: can it be open, or perhaps only partially open? Do they help you select a license for future use of the data? The repository will typically provide a standard for the metadata. They may offer a persistent and unique identifier, a so-called persistent identifier or PID, which your research funder will probably ask for. They may provide guidance on how the data should be cited if someone uses them, which, again, is good for your visibility as a researcher. And when there are costs involved, either in publishing or in downloading data, that should be clear from the start. So it makes sense to contact a repository of your choice, or perhaps a couple of repositories while you are still deciding, ahead of the actual deposit; don't wait until the end. They may help you when you come early.

As I said, the CoreTrustSeal is one of the certification schemes that are around. There used to be the Data Seal of Approval and the World Data System certifications; at some point these two decided they were pretty similar, they joined forces, and the result was the CoreTrustSeal. You see here a map which indicates where you can find those repositories. This does not mean that repositories without a certification are not good, but the trend is towards getting certified, so maybe you want to push your disciplinary repository if it does not have a certification yet. CoreTrustSeal-certified repositories commit to 16 requirements, and they have been assessed to make sure that they not only commit to them but really meet them in practice. Here you see a couple of them. For instance, such a repository maintains the licenses covering data access and use, and also monitors compliance with those licenses. The repository is responsible for long-term preservation. The repository enables users to discover your data and refer to them in a persistent way through proper citation; this is what I mentioned earlier, the repository can take care of this for you. And because they also focus on reuse of the data over time, they will offer a metadata scheme, which helps you and people who may be interested in your data. But there are more requirements than these four, and you, as a data producer or a potential data consumer, play a role in those requirements too.

When it comes to assessing the FAIRness of existing data, there are a couple of initiatives. Making your own data FAIR is essential, but checking the FAIRness of existing data sets is very useful too; you can learn a lot from that. The easy thing to do is to go to a data repository, any data repository, take a look at the metadata of a data set, and then decide for yourself: would you feel comfortable reusing those data? What is it that inspires trust that this is a useful data set? Or what kind of evidence is lacking, so that you do not quite trust the data set?
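If you want to do that little exercise a bit more systematically, the sketch below shows the general idea: take a data set's metadata record and ask whether the ingredients that inspire trust are there. The criteria and field names are my own illustrative choices for this example, not one of the official FAIR metrics that are under development.

    # An informal FAIRness spot-check on a metadata record. The checks and field
    # names are illustrative only; community FAIR metrics are more nuanced.
    OPEN_FORMATS = {"text/csv", "application/json", "text/plain"}

    def fairness_spot_check(metadata: dict) -> dict:
        return {
            "Findable: has a persistent identifier": bool(metadata.get("identifier")),
            "Findable: has a title and keywords": bool(metadata.get("title")) and bool(metadata.get("subject_keywords")),
            "Accessible: access conditions are stated": bool(metadata.get("access_rights")),
            "Interoperable: uses open, documented formats": bool(set(metadata.get("format", [])) & OPEN_FORMATS),
            "Reusable: has an explicit license": bool(metadata.get("license")),
            "Reusable: points to documentation": bool(metadata.get("documentation")),
        }

    # Example run with a hypothetical record (placeholder DOI, not a real data set):
    record = {
        "identifier": "https://doi.org/10.1234/example",
        "title": "Survey on commuting behaviour, 2017",
        "subject_keywords": ["commuting", "survey"],
        "license": "https://creativecommons.org/licenses/by/4.0/",
        "format": ["text/csv"],
    }
    for criterion, passed in fairness_spot_check(record).items():
        print("PASS" if passed else "FAIL", criterion)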
What would help you to trust these data for reuse? You can do this on your own, and I encourage you to do it just to go through the motions and see how it works, but there are also some initiatives for measuring this in a more structured way. There are some prototypes under development, and part of the discussion that is going on, in which we hope you will play a role, is who should do the actual assessment. A fellow researcher from the domain where the data were generated or produced? Should we wait until the data have actually been reused by someone? Should it be the people at the repository? Can we make computers and machines do it? This is still unclear; it is a very recent process and a recent initiative, so you may want to become part of it.

This was a very brief introduction to FAIR data and trustworthy repositories. The core, I think, is trust, and I can only invite you, as listeners and watchers of this presentation, to aim for a FAIR-aligned research data life cycle. We should credit researchers and others who see value in existing data and who want to add value to existing data, and of course we want to support and credit people who produce new FAIR and open data and use trustworthy repositories. So it is important that we stick to standards for data documentation in the various areas and end up with good replication packages. Another way to contribute is to check how FAIR existing data already are, to learn from that when they are perhaps not that FAIR, and to teach and train early-career researchers to make their data as FAIR and as open as possible. Thank you very much. If you have questions, we hope to hear from you.