Hi everyone, my name is Pena and I work in the research data management team at the UK Data Archive. We are based at the University of Essex, and my main responsibilities are to oversee ethical and legal aspects of data sharing, including the guidance and the training covering this. In this session I will focus on intellectual property rights in the context of research. I'll begin by briefly explaining what secondary data is, and then discuss what rights there might be in research. I'll also discuss some issues that are very important in this context, such as establishing rights ownership, the challenges of sharing social media data, and national variation in copyright. I will also discuss some best practice steps to ensure rights compliance when it comes to sharing your data for future use, and finally I will point you to resources that may be useful. I'll answer your questions at the end, and if you have any specific project-related questions you can always email me; my email address is on the last slide. So IP rights are the rights granted to creators of works that are the result of human intellectual creativity, something that is created using your mind, for example a story, an invention, an artistic work or a symbol. There are different types of IP rights. One is trademarks, a type of intellectual property consisting of a recognizable sign, design or expression that distinguishes the products or services of a particular source from others. Another is patents; a patent is an exclusive right granted for an invention. The third type is registered designs; a registered design protects only the shape or appearance of a product, and it gives its owner the exclusive right to the design of that product. The final type is copyright, which is the protection offered for creative works such as books, music, literary works and, of course, your research data.
You get some types of protection automatically; others you have to apply for. As you are aware, we will be focusing on copyright, as it is most relevant to our work. Now let's begin with what secondary data is. I'm sure all of you are familiar with it. Unlike primary data, which is collected by a researcher directly from the original source, secondary data is existing data gathered from studies, surveys or experiments that have been run by other people or for other research. For example, existing data available at archives or from government or organizations, or essays, views or information available on social media. So just a quick question. If you can, go to Mentimeter using the code 64324927 and answer just one question: have you used, or do you plan to use, secondary data? Or you can write in the chat. You can use Mentimeter on your mobile phone. So a couple of people have answered so far. That's good. If you have used or plan to use secondary data, then you can gain something out of today's workshop for sure. So the majority of you so far, that's great. You can keep Mentimeter open, because I will be using it throughout the workshop with the same code. So that's perfect. Thank you. It's good that most of you have used or plan to use secondary data. So the two most relevant types of rights applicable to secondary data sources are copyright and database rights, and I'll talk you through both of these in today's session. Mostly it will cover copyright, but I will touch upon database rights as well. So the first section covers copyright and research data. Copyright is an intellectual property right assigned automatically to the work's creator. It prevents unauthorized copying and publishing of an original work. The creator is automatically the first copyright owner, unless there is a contract that assigns copyright differently, or there is a written transfer of copyright signed by the copyright owner.
It can vary nationally, but under the UK Copyright, Designs and Patents Act 1988, copyright applies to original literary, dramatic, musical and artistic works, sound recordings, films, broadcasts, cable programs, typographical arrangements of published editions, databases, and so on. So intellectual property rights affect the way both you and others can use your and others' research data, and these issues should be considered at the outset of any research project. In the context of primary data, if you plan to share it for future reuse, you need to consider how you want your data to be used by other researchers or students. You can specify this by licensing the data to match the intended use. Various types of licenses for sharing data have been developed by the data archives. For example, here at the UK Data Service, we facilitate three levels of access for data: open access, safeguarded access, and controlled access. I have added links to these access frameworks on the last slide, where you can read more information from our website. So coming back to this, in the context of secondary data, if a researcher wishes to share research data by publishing or disseminating them, all the rights holders need to be identified, and the necessary copyright permissions need to be granted for the data to be shared. What researchers often do not realize is that, although they are allowed to use data from several sources for their personal use, when it comes to data sharing they need to obtain permission from the data owner. You could say that a fair dealing exemption may apply, but if the owners have restricted their data from being shared with third parties, the fair dealing exemption does not apply. So in the UK, copyright arises automatically once the work is created. To enjoy copyright protection, the work must be original. That is to say, it must be your own work, not copied from someone else.
And there is no copyright in ideas or facts, only in the way those ideas are expressed, such as diagrams, tables, and so on. So as researchers, when you are wondering whether you need to obtain copyright clearance, please bear in mind that you do not need it if you incorporate the factual data in your own words, in a structure of your own. You may not need to obtain permission if you are making a copy and using that copy for your own research, as long as it is not made available to others, or when you are citing from the research data. However, you need copyright clearance if you are going to include the secondary data in a publication or plan to share that data with other people. This also applies to incorporating secondary data in your own database that you intend to share with others. There may be other issues arising. For example, where personal data is concerned, you need permission not only from the person who created the work but also from all the people whose personal data is in the work, for example, in the context of diaries. So copyright and research data is a complicated topic. However, I have tried to highlight some important issues when it comes to data sharing. For example, researchers need to keep in mind issues such as rights ownership, the challenges of using social media data, and national variation. Before I go into this, let's go to Mentimeter again for some quick questions, using the same code. So who do you think owns the rights? If someone is doing a research project in a university, is it the university or the staff member, the employee, who owns the rights to the data created while they are in employment? Any thoughts? I'll answer these afterwards, and there are a few more questions related to rights ownership. So the majority of you have said it's the university, and some of you think that it's the employee. Some people are still answering on screen, so I'll wait a few more seconds before going to the next one.
Yeah, so it's clear that most of you, apart from a few, think that it's the university that owns the rights. The next question is: the university or the student? For example, students working on research projects, PhD students. So mostly people go for students. The last one: who do you think owns the rights, the research funder or the researcher? Yeah, so the majority think that it's the researcher. Thank you for your answers. Coming back to this, you need to bear in mind that who owns the IP will depend on national law, first of all, and on individual institutions' policies as well. It may vary from country to country. However, as a general rule, the copyright in a work is initially owned by the work's creator. But this isn't always the case. If a work is created by an employee in the course of his or her employment, the employer owns the rights unless otherwise agreed. So those who answered that the university owns the rights were right, but it could be different if it was agreed differently. Many universities or research centers claim ownership of any IP that is generated by academic staff in the course of their employment, and also when IP is created using substantial institutional resources. And when two or more authors prepare a work with the intent to combine their contributions, the authors are considered joint copyright owners. In terms of students, most universities recognize as a general principle that students who are not employees of the university own the IP rights in the works they produce purely based on knowledge received from lectures and teaching. However, there may be some circumstances where ownership has to be shared or assigned to the university or a third party. Typically these include sponsored students, and students working on a research thesis or publications in collaboration with academic staff. So you can say that it depends, but mostly it is with the students.
Research funders may also wish to exert some claim over rights, although in most cases IP rights are attributed to the researcher unless an output becomes commercially viable. If a university research project has commercial collaborators, there may be joint IP rights in the research output, which are best handled via legal contracts or consortium agreements. Researchers should clarify ownership and rights relating to research data or sources, both for primary and secondary data being used, before embarking on research, because this in turn will help determine how those data can be published and accessed in the future. So the best practice is to find out the ownership as soon as possible. How can you find out who owns the rights? It is not that hard. If you are affiliated to a university or research center, there should be staff there who deal with ethical and legal compliance in research, like REOs. Or you can find out by looking at the applicable national IP law, the IP policy of the university, and the individual contractual agreements among the university, PIs, creators and sponsors. As a last resort, seek legal advice to be compliant, because failure to do so can cause serious issues for the future uses of your research, such as its dissemination, future related research projects, or profits associated with it. So the next issue to consider is copyright when using social media data. I'm sure you are all well aware of what social media is: it is an umbrella term for internet-based or mobile applications that allow users to form online social networks. Some of the very popular social media platforms include Facebook, Twitter, Instagram, Snapchat and LinkedIn. However, the most widely used among these in the context of research are Twitter and LinkedIn. The data is usually obtained through the application programming interfaces, or APIs, of the social media platforms.
APIs are provided by social media platforms to enable controlled access to their underlying functions and data. An API acts as an interface between the social media platform and a consumer of social media data. The Twitter streaming API allows researchers and collecting institutions to obtain tweets generated by users in real time. Thus, accessing data through APIs provides the most authentic record of social media. The social media data available on the platforms includes individual posts and tweets, what people share on a day-to-day basis, how people comment on posts and tweets, how they show their opinions and behavior, their likes and dislikes, visual content such as photos and videos, their interests, social interactions and networks, and what the current trend is in any context through the data on ratings. These different platforms possess a wide variety of functions that appeal to different audiences. They all create a by-product of valuable data about the users who interact with them. So again, time for a couple of questions on Mentimeter, using the same code. Have you ever been involved in research that involves social media data? So the majority of you have not, just like me, but I have recently become aware that it is fantastic data, very valuable data in real time. Some of you have been involved, so that's good. So, from what you have heard so far in this session, what do you think the copyright issues would be when using social media data? Any thoughts? Yeah, verifying the truth, that's right: people can be different in what they post on social media from who they really are. Who owns the data, third-party material, permissions, yeah, that's right, permissions, licensing terms, that's right. Ownership, identifying the owner, getting permission to use, data protection. Yeah, most of the answers are around ownership, that's great. Verifying sources, privacy, objections, more answers, yeah, that's right.
Verifying sources, deleting posts, that's right. If you have taken the data and somebody has deleted that post, you cannot access that historical data, yeah, that's true. Thank you for your answers. Coming back, you are right, most of the answers are just perfect. The terms of use for the most commonly used social media platforms are similar in terms of how they deal with intellectual property rights. So you were right, content is protected by copyright in the same way as books and journals; whatever you post on these platforms is considered your creation, your content. These platforms clearly state that users have copyright in their own content. You are the copyright holder of your tweets or Facebook posts. But although you are the copyright holder, when you agree to the terms and conditions to create your account on these platforms, you sign an agreement that gives the site a license to freely use the work for a variety of purposes, including an opportunity for researchers to access the data for academic research. So researchers using social media data need to abide by the platforms' terms and conditions for API developers. These terms and conditions play an important role in the future uses of data. You can use the data, but when it comes to future uses, such as publishing or archiving, then there is a problem. I will be using Twitter as an example, as it is one of the most widely used social media platforms in research across the world, and it is relatively easy for researchers to collect data from it. As an open platform, the majority of posts are available to public view, and researchers can collect large numbers of tweets in a very short period of time via the platform's API. However, although it is a valuable source for research, researchers face challenges when it comes to publishing social media data or archiving it for future use.
So after a researcher or research team has created a dataset, it is not usually possible for them to deposit that dataset with an archive or collecting institution for reuse. For example, Twitter's policy restricts researchers from sharing any data they obtain from the API, and also from storing data in a cloud. The policy does, however, allow the archiving of tweet IDs, the unique number given to an individual tweet, or user IDs, the number assigned to Twitter account holders. Other researchers could use the tweet IDs to recreate the data used in a previous study, but only if Twitter continues to provide access to that historical data. It is not ideal, but at least it provides a better solution than sharing no information at all about the data sources for a published study. Besides this, there may be another challenge. Researchers use different methods to access social media data from APIs: different tools, different platforms, different types of APIs, and different requests to different services, which create very diverse types of dataset. Individual researchers also use different methods to clean or organize their data, as well as different tools and methods for analyzing it. So in addition to the IDs associated with the dataset, information about how the raw data was collected and how it was cleaned is also important. It will be required for recreating the dataset or understanding how and why it has been altered. Therefore the archiving of dataset identifiers is more effective if the processes used to create them are also documented. Twitter also places particular restrictions on the form in which tweets may be published, requiring certain items of data to be retained in the published form. The forced retention of this material may pose a challenge to privacy. For example, if you need to quote some tweets while publishing, you cannot anonymize the tweets, as Twitter does not allow modification of the content. You need to use the full tweet as it is.
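To make the earlier point about archiving tweet IDs together with documentation a little more concrete, here is a minimal sketch. It assumes the collected tweets are already held locally as dictionaries with an "id" field; the field names, the tool name and the cleaning steps are hypothetical, and the sketch does not call the Twitter API. It simply reduces a dataset to its identifiers and bundles them with a short record of how the data were collected and cleaned.

```python
import json


def extract_ids(tweets):
    """Keep only the tweet IDs (as strings), dropping duplicates but preserving order."""
    seen = set()
    ids = []
    for tweet in tweets:
        tweet_id = str(tweet["id"])
        if tweet_id not in seen:
            seen.add(tweet_id)
            ids.append(tweet_id)
    return ids


def build_deposit(tweets, provenance):
    """Bundle the ID list with a description of how the data were collected and cleaned."""
    ids = extract_ids(tweets)
    return {"tweet_ids": ids, "tweet_count": len(ids), "provenance": provenance}


# Hypothetical collected data: the tweet content itself stays with the researcher.
collected = [
    {"id": 1001, "text": "first example tweet", "user_id": 42},
    {"id": 1002, "text": "second example tweet", "user_id": 43},
    {"id": 1001, "text": "first example tweet", "user_id": 42},  # duplicate
]

# Hypothetical provenance notes describing collection and cleaning.
provenance = {
    "collection_method": "streaming API",
    "collection_tool": "hypothetical-collector 0.1",
    "cleaning_steps": ["removed duplicate tweets"],
}

deposit = build_deposit(collected, provenance)
print(json.dumps(deposit, indent=2))
```

Only the identifiers and the process notes end up in the deposit, so another researcher would still depend on the platform to retrieve the content behind each ID.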
So these restrictions make it really hard if you plan to share your social media data. Here I have added a useful checklist by UCL. It is meant for reviewers, but it can be useful for researchers who wish to use social media data. I'll let you quickly read it. Quite a useful checklist. I know there isn't enough time to read it, but you are going to have the slides, so you can read it in your own time. Now another important point to keep in mind is copyright in the international context. Just a quick question. You don't need to go to Mentimeter, just use the chat function: in which country are you carrying out your research? UK, Jamaica, that's interesting, Switzerland, Australia, Germany. Yes, most of you are from the UK, but we have people from Germany, Australia, Switzerland, Jamaica. So quite a mix. Someone doing a comparative study globally, Nigeria, US focus groups. Thank you. So we do have a diverse background. Every country has its own copyright laws, but over the years there has been extensive global harmonization of copyright laws through treaties and agreements. These treaties and agreements establish minimum standards for all participating countries. This system leaves room for local variation. One of the most significant international agreements is the Berne Convention. It was originally signed in 1886, but it has since been revised and amended on several occasions. This treaty lays out several fundamental principles upon which all participating countries have agreed. One of those principles is national treatment, which means that all countries must give foreign works the same protection they give to works created within their borders, assuming the other country is a signatory. Besides this, the minimum standards also cover the types of work protected, the duration of copyright, and limitations and exceptions.
So the most important thing to bear in mind is that national laws are built on similar basic standards, but there may be variation at country level in terms of types of work, duration and exceptions. This is an interesting map which gives you an idea of the differences in copyright duration around the world. I'll just leave you for a couple of minutes to have a look at this map. We can look at the map from a continental perspective, and we can clearly see patterns of the same duration with small exceptions. For example, in Europe, all countries with the exception of one adhere to life plus 70 years as the copyright duration, and Belarus, which is not part of the EU, has a shorter duration, which is life plus 50 years. In comparison, in Africa we can observe more variability, with Angola and Libya at only life plus 25 years, Yemen at life plus 30 years, and also the only country marked as life plus 99 years, Ivory Coast. Under the 1996 law, copyright in Ivory Coast lasted for 99 years after the death of the author, but under the 2016 law this duration was dropped to 70 years. Overall, we can see the continent is dominated by life plus 50 years. On the other side, on the Asian continent, we can observe a few countries which respect the Berne Convention default, which is lifetime plus 50 years: Tajikistan, North Korea and Vietnam have this. In South America, Colombia stands out with life plus 80 years, and, similar to Europe, the continent is dominated by life plus 70 years. And Mexico stands out as the only country to adopt life plus 100 years. So you can see that despite international agreements, there is national variation. You always need to check what the national copyright law is where you are conducting research. So now I will talk you through database rights, as this is also relevant in this context.
According to the UK's Copyright and Rights in Databases Regulations 1997, a database is a collection of independent works arranged in a systematic or methodical way. Database rights protect and reward the creation and arrangement of a database. A database may be protected by both copyright and database right. For database right to apply, the database must be the result of substantial intellectual investment in obtaining, verifying or presenting the content in an original manner. So that's the rule for database law to be applicable. Simply entering facts into a spreadsheet would not count as substantial effort, but translating or synthesizing and coding information from multiple sources would. The author's time, skill and labor would need to be directed to the selection and arrangement of the database, over and above gathering the information. Database right is an automatic right and protects databases against the unauthorized extraction and reuse of the contents. If a researcher uses parts of the data from a database, as well as the structure in which those data are held, to create another dataset, they should obtain explicit copyright and database right clearance before they publish the data. You always need to check the terms and conditions of the database that you are using. If you plan to use secondary data, always ensure that you consider these questions. Who is the copyright holder of the dataset? Are there any database rights? Can you use these datasets, and in what way? Are you allowed to archive and publish them in a data repository? If not, you may need to seek further permission to distribute material you do not own, because if you do not get permission, you may need to remove copyright material before publishing or sharing, which is a hassle. So here I have added a copyright scenario for you. Imagine a researcher has used secondary data sources for a research project and he intends to share his data for future reuse.
And it's a real researcher. We do get data from him in our collections. He used the World Bank and Microsoft Academic. I have mentioned just two sources here, but he has used several others as well. As he has used secondary sources and plans to share that data with us, he should check whether he is allowed to share the data he has used from these sources. He needs to check the terms and conditions of these sources. The terms and conditions are usually at the very bottom of the webpage, but sometimes it does take some effort to find them. I think we haven't got enough time for you to go to these websites and check the terms and conditions yourselves, so I have added here a screenshot of the license conditions from the World Bank. You can read that it is mentioned here that there is no restriction on sharing the data with third parties, so the information he has gathered from the World Bank website is fine to share with us or archive with us. So that should be okay. Here is a screenshot from Microsoft Academic. Here you can see it is mentioned that you cannot modify, distribute, transmit, display, perform, reproduce, publish, and so on. So you may need to obtain permission before you archive the data obtained from this website. If you are planning to use, or have already used, secondary data and you plan to share it, you always need to check the terms and conditions, because most of the time these secondary sources do allow researchers to use the data for their personal use, but do not allow them to share it. So you always need to check. And here on this slide, I have added links to our web pages on copyright and access levels. I have also added a useful template, which is called a variable information log. For datasets being deposited that include secondary data sources, researchers are advised to prepare a variable information log describing these sources.
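A log like this can be kept as a simple table with one row per variable. The sketch below is only an illustration, not the UK Data Service template itself: the column names, the two example variables and their descriptions are assumptions made up for the example, and the restriction notes loosely paraphrase the two screenshots just discussed.

```python
import csv
import io

# Hypothetical column names; the actual template may use different headings.
LOG_COLUMNS = ["variable_name", "source", "how_collected", "description", "restrictions"]


def build_log(entries):
    """Return the log as CSV text, rejecting any entry with a missing field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=LOG_COLUMNS)
    writer.writeheader()
    for entry in entries:
        missing = [column for column in LOG_COLUMNS if not entry.get(column)]
        if missing:
            raise ValueError(f"{entry.get('variable_name', '?')}: missing {missing}")
        writer.writerow(entry)
    return buffer.getvalue()


def needs_permission(entries):
    """List the variables whose restrictions field is anything other than 'none noted'."""
    return [e["variable_name"] for e in entries if e["restrictions"].lower() != "none noted"]


# Hypothetical variables drawn from the two sources mentioned in the scenario.
example_entries = [
    {
        "variable_name": "gdp_per_capita",
        "source": "World Bank",
        "how_collected": "downloaded from the website",
        "description": "hypothetical country-level indicator",
        "restrictions": "none noted",
    },
    {
        "variable_name": "citation_count",
        "source": "Microsoft Academic",
        "how_collected": "downloaded from the website",
        "description": "hypothetical publication-level count",
        "restrictions": "no redistribution without permission",
    },
]

print(build_log(example_entries))
print("Check permissions for:", needs_permission(example_entries))
```

Keeping the restrictions in their own column makes it easy to see at a glance which variables need clearance before the dataset is deposited.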
So this log not only allows others to understand and use the data correctly, but also ensures that repositories can check the appropriate terms and conditions applicable to onward sharing. As you see here, this information log should include the variable name, its source, how it was collected, a brief description, and any restrictions noted on it. So that's very useful. And these are some useful resources I have added on two slides in the context of copyright, which you can explore in your own time. These are very useful resources. So if you can go back to Mentimeter and answer a couple of questions. What do you think: if the research project has several researchers from different organizations, will they be joint copyright owners? Answers are still coming in. The majority of you think that it's joint copyright ownership, which is right, but some of you have said maybe, and that's right as well. When two or more authors prepare a work with the intent to combine their contributions into inseparable or interdependent parts, the work is considered a joint work and the authors are considered joint copyright owners. But exceptions may occur: if there is an agreement that it will be dealt with differently, then the answer is maybe. But yes, it will be joint copyright. And what do you think: if a researcher incorporates secondary data into his own database, does he need copyright permission? The majority of you think that it depends, some of you think yes, and someone thinks no. Yeah, I think it depends is the answer. You may not need to obtain permission if you are making a copy and using that copy for your own research, as long as it is not made available to others, or when you are citing from the research data. However, you need copyright clearance if you are going to include the secondary data in a publication or plan to share that data with other people.
And it also applies to incorporating secondary data in your own database that you intend to share with others. It always depends on the terms and conditions, which you need to check. Maybe the data is openly available, just as we saw in the example from the World Bank. So another scenario: what do you think? If you have discussed a prospective piece of research with your colleague, are you a rights holder in this scenario? That's good. Everybody thinks no. I think that's an easy one because, yeah, it's a clear no, because ideas are not protected. The work must be recorded in a material form. And what do you think: do you need to apply for the IP rights? That's an easy one, I think. The majority of you think no. Some of you think yes. And the answer is no, because copyright and database rights are automatic. If you have created something, you do not have to apply for them. It's an automatic right. So what do you think the rights issues could be if you use the COVID-19 live data available on the Worldometer website? I'm sure everyone is familiar by now with the COVID-19 live data website known as Worldometer. If you are using data from this website, what could be the rights issues? Any thoughts? Getting permission for any personal information? Yeah. Snapshots and updates, personal data, that's right. National variations, that's right. Any more thoughts? Using the data for what it was not intended for, that's right. You must acknowledge the site. Personal information, regular updates, yeah. Yeah, exactly how much information you can use. And I think in this case, most of all, as I mentioned, you need to check the terms and conditions, as I think the creators have copyright and database right.
So you need to check the terms and conditions for using the information: whether it is freely available for personal use, whether you are allowed to share it, whether you can modify it, whether you can incorporate the data available on this website into your own database, and, when you incorporate it into your own database, whether you are allowed to share that information or that database on any website. So these sorts of things need to be checked. In a project where we collect diary data, what are the copyright concerns and how could they be settled? Any thoughts on that? Personal data, the diary data is considered personal data. Consent for sharing, informed consent, that's true. Authorship and ownership, perfect. Yeah, anonymity, that's right. It could be sensitive personal data, true, and the original owner, yeah, perfect. Confidentiality is an issue, that's right. Seeking consent from persons, permissions for reuse, that's right. Yeah, all of the answers are correct. So you can use the information obtained from the diaries for your own research, obviously after getting permission from the owner. However, when it comes to data sharing or publishing, you need to obtain permission from the participants to share their personal information. You need to obtain consent to share the information and permission to use the data for future research. So all of you are right. So what are your thoughts on this scenario? A researcher uses International Social Survey Programme (ISSP) data obtained from these two archives. These data are freely available to registered users. The researcher incorporates some of the ISSP data within a database containing his own research data. So any thoughts on copyright in this scenario? He needs to check the licenses for sharing and publishing the data, that's perfect. He may need copyright clearance for sharing his own database, that's right. Labeling the ISSP data and your own data.
Yeah, some researchers do incorporate data from third-party sources into their own data. They choose some variables and incorporate these into their own. How much data is being incorporated, how much individual original thought. Yeah. The researcher needs to consider obtaining a copyright license, that's right. Is the data available to share, he may need permission, that's true. It will depend heavily on how the data can be reused; free availability only describes access, that's right. Terms of use, you must cite. Yeah, citation is important if they have allowed you to share it. Yeah, check the data has the correct permissions. That's excellent. Obtaining a copyright license, that's right. So perfect answers. It's again the same as the previous scenario. Although the ISSP data are available for free to all researchers, this does not mean that the data can be published on a website and through this made available to other people. The data can be incorporated into a database and used for personal analysis, but permission should be obtained before putting this database on a website. Providing attributions, yeah, that's important as well, when you have permission. So how would you obtain copyright for your research outputs? I think that's the last question. Any thoughts? Ask the content owner or the author, look at the source information to determine the copyright holder. Yeah, that's right if you are using other people's research, but I meant just your own research outputs, because research outputs do not necessarily have secondary data in them. It's just your own research outputs. Yeah, that's right. You don't need to obtain copyright because, as I said, copyright is automatic. It is applied automatically, that's right. You don't need to apply. Yeah, you need to add the copyright symbol, your name and the date, that's right. You need to set up the licenses anyway, that's true. And you do not need to apply.
It is automatically applied. Yeah, the copyright symbol, the year and the names of the owners, that's right. Yeah, that's perfect. Thank you very much for all your responses. And thank you all for attending today's session.