My name is Christoph Broch from the Helmholtz Association. I'm here in my function as co-chair of the RDA-CODATA Legal Interoperability Interest Group. It's a very complicated name, and I'm going to explain it a little bit. The purpose of my presentation today is to introduce to you a document which also has a complicated name: the principles and implementation guidelines concerning the legal interoperability of research data. It sounds very complicated, but I hope that at least some of the things I'm going to present to you are less complicated than what we just learned about data protection law, which truly is very complicated.
I'll start out by saying a few words about the Research Data Alliance. I have seen some familiar faces in the audience, but many of you who participate in this meeting today may not participate in the Research Data Alliance, so I'll briefly explain a little bit about it. It was launched in 2013. You can read along with me if you want: it has about 5,000 members, which means individuals who signed up on the website. Anybody can sign up; it doesn't cost anything. If you sign up, you can participate in any of the meetings and you can join working groups, and these working groups are really the focus of the Research Data Alliance. People come together, they find a topic relating to research data, and through a very informal process they can form a working group. Of course, each of the working groups is expected to deliver some kind of outcome, and the outcome of the legal interoperability group is the legal interoperability paper. That's kind of a close match.
Okay, so a little bit about the legal interoperability group. Its name begins with a combination of the Research Data Alliance and CODATA.
CODATA, whose acronym is explained below, is of course a sub-organization of the International Council for Science. That was a very important connection for us because, and now I come to the legal interoperability paper, the paper we have produced addresses scientists, yes, but it mainly addresses policymakers. So through the International Council for Science we hope to reach many research organizations.
So now about the paper, the principles and guidelines on the legal interoperability of research data. Making research data available for reuse means you have to find ways to overcome several hurdles. There are technical barriers, there are legal barriers, and there are financial barriers. If you look at the legal barriers, data protection is one big thing, then liability law, and then of course property law. Within the group, we only looked at property law. So what you will learn today will not give you a complete overview of all possible hurdles concerning the use of research data. It will be focused on property law. We are aware of that, and we know that we do need more papers, if you will, to address all these various legal areas.
So let's start with the principles. The document I will introduce promotes six principles. The first is that you should facilitate lawful access to and reuse of research data. It sounds very simple, but as a matter of fact, as a policy criterion, it's not simple. Many organizations will not have a rule saying that research data produced at their organization should be made available for access and reuse as a default. So that is a very basic policy decision, and it does not mean that you need to make everything available. We heard about the principle before: as open as possible, as closed as necessary.
So openness is always a relative openness, but one of the main decisions to take when building a policy is whether the default is open or whether the default is closed. The decision in one direction or the other also shifts the burden of proof. If you start with the decision that the information should be closed, then someone who would like to make this information available needs to justify that. If you start with openness, people who do not want to make the information available have to justify why they do not want to make it available, why they only make it partly available, or why they need to attach conditions to it. And I can assure you, I'm active in many committees, and a lot of people say it's naïve to have openness as the default rule, because they think about business cases that involve the control of information and they fear that too much openness will harm these business cases. That's a very important argument, one that is usually not considered enough in meetings like this one. I think it is something we have to consider. I'm still arguing for openness as the default, but we have to be aware that there are very good arguments to keep things closed or to open them only to a certain extent.
Once you have decided that you do want to make your information available, the next question is how you facilitate, how you support, making reuse possible to a large extent. One consideration is whether to attach a certain license to the data. We already heard that there is a problem with the concept of ownership in respect to data. So if we believe that the data we would like to make available is protected by an intellectual property right, then a license would be an appropriate legal instrument to make it available.
If we believe that the data we would like to make available is not covered by an intellectual property right, then a dedication, putting it, so to say, in the public domain, or affirming that the information is in the public domain, would be an appropriate legal instrument. Also, if you consider making data available, we think it's important that availability is granted on an equal basis, so that everybody has the same ability to access the data and you don't just open it to certain audiences. What I just said to you is the text; I just didn't want to show you all of it, or you would keep reading all the time.
The second principle is actually the most demanding: determining the rights to and responsibilities for the data. No matter what you would like to do with the research data, you have to determine which person or which organization has the right to decide what is done with the data. That's a crucial decision. If you look at it from a legal perspective, you do need to know who owns or controls the data. This is the person who can decide, and checking on this is probably too complicated for most scientists. So it's part of the responsibility of a research organization to have clear rules and to clearly negotiate who can do what with the research data that are produced within a research project taking place at that organization, or with research data that are produced by or possibly bought from another entity in the course of a research project.
One important aspect in this respect is that if you acquire data from another entity during the research project, you need to check the contract. If you buy satellite data, for example, within the research project, there may be restrictions attached to the license you get to use these data, so you need to check what the restrictions are in order to know what you can do if you want to make the research data of your project available.
As I mentioned before, there are reasons to make data available and there are reasons not to make them available. So we ask in principle number three that you balance these interests. That also sounds easy but can be quite a complicated process, because if you talk to policymakers in research organizations, they will be aware of an interest in openness and of an interest in confidentiality, but setting up a process to balance the two is not easy. There are several ways one could go about that. You can set up a committee, for example, that looks at this. You can have certain rules concerning the justification of confidentiality. Most importantly, I think, you need a transparent process, so that it is understandable why certain data are made available or not.
Once you have decided that you do want to make data available for reuse by third parties, you need to communicate what people may or may not do with the data. It's very important that you state the rights as transparently and as clearly as possible. You may be able to find a lot of data on the internet, but usually you will find very little information telling you what you may or may not do with it. From a legal point of view that's a very unfortunate situation. If you collect data from different sources and you also want to enable the reuse of the new data set, you need to know what you are allowed to do with the data in order to pass that information on.
So we have a real issue at hand: as of now, data are made available in many cases, but often no clear policy is communicated, and no clear license or dedication information. We in the legal interoperability group believe that probably the best way, as of now, to communicate the legal status of the data is the use of the CC licenses. Of course the CC licenses come in different flavors. I'm referring to the CC0 license and to the CC BY license in this respect. The CC0 license would apply if you believe that the data are not copyright protected, and the CC BY license would apply if you believe that the data are copyright protected. I will not go into the details of how to determine whether data are copyright protected or not, because that can only be decided by a case-by-case analysis. Generally speaking, the question is whether it's a work, that is, a creative work in the sense of copyright. I think if we talk about data, in most cases we actually talk about pieces of information. They can be the result of a lot of creative work, but the information itself is not a creation; in many cases it's an observation.
I already referred to the CC BY group of licenses, and that connects to principle number five: promote the harmonization of rights in research data. This principle clearly addresses the policymakers. I applaud the European Commission for having started the Open Research Data Pilot and making it the default now. This is a typical step in trying to harmonize how we deal with research data. The publishing community is another important player. Just before this meeting I looked at some principles that many publishers adhere to when they publish research data. I went through all the principles, and there's nothing about the legal aspects in them. I was a bit shocked by that. So I think the publishers need to think about that too.
Of course there are several aspects of the harmonization. Part of it will be to have a common policy, for example concerning embargo periods, but also concerning what kinds of licenses are appropriate, so that people don't have to go through overly complicated decision-making processes each and every time they want to make data available.
The very last of the principles is to provide proper attribution and credit for research data. That's very important, and strictly speaking it's not part of legal interoperability, at least not if you're referring to data which are not copyright protected. If the data you're making available are not copyright protected, people can usually do with the data what they want, and they do not need to cite the source. So from the legal point of view there may be no strict necessity to cite the source, but from the scientific point of view it is of course very important. So we strongly argue that people should cite sources. Today, I think, that can actually be a technical hurdle, because if you would like to combine information from many different sources, it's sometimes quite complicated to find an adequate way to provide the attribution information. For example, for this presentation I looked for pictures yesterday, and I found all these pictures on the internet. In my presentation I have an extra slide that mentions all the sources, but there's no elegant way, at least not with my technical ability, to embed the source information in the picture so that you don't see it right now, because it's not important during the presentation, but could easily find it if I make the presentation available later. For me it's a lot of work to put all of that together. Imagine I want to combine many sources; then it becomes really complicated. So I strongly believe that the ability to properly attribute needs much more technical support than what is available right now.
After presenting the six principles, I would like to ask you to have a closer look at the document. It's quite a big document, 40 pages, so it's really for the policymakers, but I think many of you are in a position to read more closely all the explanations we give about the principles, and then possibly make this paper available to the leadership of your organization and ask them to consider whether they would like to support these principles. I just got the information about a website, which was not ready until a few days ago, where you can sign: you can say, I support these principles. I will give the link to the organizers so they can make it available together with the presentation. So there will be an option for individuals and for organizations to show their support for the principles. We strongly believe that we need organizations to subscribe either to these principles or to other principles, if you find others, though I think at the moment these are the only ones, in order to build a consensus about how we should deal legally with research data.
This is the first version of the paper. We hope and believe that a conversation will build on it. People will have comments, and there can be future versions that accommodate aspects that may not be covered in the paper yet, or we may change certain things because people say this or that is not dealt with adequately. But I think the paper is a very good start for a conversation to build consensus concerning the legal aspects of making data available and making them available for reuse. Thank you very much for your attention.
So we have time for one question now, if you want; later we have the panel discussion.
I would like to ask why you advise not to use CC0 for data where there is copyright protection. Is there any risk that we take in doing that? You said you advise using not CC0 but only CC BY for data where there is copyright protection in the data. What would be the reason for this? Why would CC0 be a wrong solution? Do we take any special risks if we do that?
So the reason is, from the legal point of view, if the data are not copyright protected, then you cannot own them. If you cannot own them, you cannot license them; you can only give information about their status. So if they're not protected, you could attach CC0, or the PDDL license is another option. If they are copyright protected, then you are right: in theory you could also use CC0, but that is then very much dependent on the copyright law in the various countries. Depending on where you live, you may not be able to waive certain rights. I didn't go into the details of that. In this respect the CC BY license is the second best. We agree, and maybe I was not clear enough about this, that a CC0 license, if you want to call it a license, as it's not a real license, would be the preferred option, but whether you can use it or not depends on the legal situation you are in. Also, and here I'm referring to data which are copyright protected, many researchers would like to use the CC BY license because they feel, and it really is a feeling, that with the CC BY license there's a stronger incentive to cite, which in practice I think is not the case. But it's very difficult to tell a researcher audience to use a CC0 license; to them it sounds like saying their data is not worth much, which of course I'm not saying. It's always a very difficult argument to tell someone who has worked possibly for years to produce a data set that it's not protected. Very difficult.