Hi everyone. Thank you for attending today's workshop. My name is Hina and I work within the research data management team at the UK Data Archive. My main responsibilities include advice, review and training related to the ethical and legal aspects of data sharing.

This session includes, of course, a presentation, and you will be getting the slides after the session. Just to let you know, I'll be using Mentimeter throughout this session, which can be accessed on this link using the code, or you can use the QR code as well. I will give you the code when I ask you to go to Mentimeter. You can also use Mentimeter on your mobile phones.

In this session I will focus on intellectual property rights, specifically copyright, in the context of research and secondary data use. I will begin by briefly explaining what secondary data is, followed by a discussion of what rights there might be in research. I'll also discuss some issues that are very important in this context, such as licensing, establishing rights ownership, challenges in sharing social media data and national variation in copyright. I will also discuss some best practice tips to ensure rights compliance when it comes to sharing your data for future reuse, and finally I will point you to resources that may be useful if you are planning to use secondary data. I will answer your questions at the end, and if there are any specific project-related questions you can always email me. I have added my email address on the last slide.

So IP rights are the rights granted to creators of works that are the result of human intellectual creativity: something that is created using your mind, for example a story, an invention, an artistic work or a symbol.
The types of IP rights include trademarks, a type of intellectual property consisting of a recognizable sign, design or expression which distinguishes the products or services of a particular source from those of others. Patents are another type of IP right: an exclusive right granted for an invention. The third type is the registered design, an IP right that protects only the shape or appearance of a product; it gives its owner the exclusive right to the design of that product. The final type is copyright, which is the protection offered for creative works such as books, music and literary works, and this is the most relevant type of IP right in academia. You get some types of protection automatically and others you have to apply for, and as you know we will be focusing on copyright today.

So let's begin with what secondary data is. I'm sure you are all familiar with it, but there may be some people who are new to this area, so just as a refresher: unlike primary data, which is collected by a researcher directly from the original source, secondary data is existing data gathered from studies, surveys or experiments that have been run by other people or for other research. Examples are existing data available at archives or from government organizations, essays, reviews, or information available on social media.

Just a quick question, if you would like to answer it in the chat: have you used, or do you plan to use, secondary data? All yeses so far, that's great. Someone has said no for the first time, so hopefully you will find today's session useful if you plan to use secondary sources for your research. Thank you for your responses. Most people say yes, so that's great.

The two most relevant types of rights applicable to secondary data sources are copyright and database rights.
I will be focusing on copyright in today's session, because covering both is beyond the scope of this workshop, but I have added a link at the bottom of the slide on database rights and other rights for you to have a look at later.

So copyright is an intellectual property right assigned automatically to the work's creator, and it prevents unauthorized copying and publishing of an original work. The creator is automatically the first copyright owner unless there is a contract that assigns copyright differently or there is a written transfer of copyright signed by the copyright owner. Most importantly, it can vary nationally, but under the UK Copyright, Designs and Patents Act 1988 copyright applies to original literary, dramatic, musical and artistic works, sound recordings, films, broadcasts, cable programs, the typographical arrangement of publications, and databases.

In the UK copyright arises automatically, as I said earlier, once the work is created, but to enjoy copyright protection the work must be original, that is to say it must be your own work, not copied from someone else. There is no copyright in ideas or facts, only in the way those ideas are expressed, such as in diagrams or tables, and in a material form.

So as researchers, when do you need to obtain copyright clearance? You need to bear in mind that you do not need copyright clearance if you incorporate the factual data in your own words, in a structure of your own. You may not need to obtain permission if you are making a copy and using that copy for your own research, as long as it is not made available to others, or if you are citing from the research data. However, you need copyright clearance if you are going to include the secondary data in a publication or plan to share the data with other people. This also applies to incorporating secondary data in your own database that you intend to share with others.
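As a rough illustration, the rule of thumb just described can be written down as a small decision function. This is only a sketch of what was said in the session: the class name, the field names and the function name are hypothetical and chosen for illustration, and real cases depend on the license and the national law that apply.

```python
from dataclasses import dataclass


@dataclass
class IntendedUse:
    """Hypothetical description of how secondary material will be used."""
    facts_restated_in_own_words: bool = False  # facts re-expressed in your own words and structure
    included_in_publication: bool = False      # copied material appears in a publication
    shared_with_others: bool = False           # data passed on, deposited or archived
    added_to_shared_database: bool = False     # copied into a database you intend to share


def needs_copyright_clearance(use: IntendedUse) -> bool:
    """Apply the session's rule of thumb; True means clearance should be sought."""
    # There is no copyright in facts, only in the way they are expressed.
    if use.facts_restated_in_own_words:
        return False
    # A private copy for your own research leaves every flag False.
    # Publishing, sharing, or building a shared database calls for clearance.
    return (
        use.included_in_publication
        or use.shared_with_others
        or use.added_to_shared_database
    )


private_copy = IntendedUse()
deposit = IntendedUse(shared_with_others=True)
print(needs_copyright_clearance(private_copy))  # False
print(needs_copyright_clearance(deposit))       # True
```

The function deliberately ignores personal data and consent, which are separate questions from copyright.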
So as long as you are using it for your personal use, that's fine, but when it comes to data sharing you need to obtain copyright clearance. There may be other legal issues arising as well. For example, where personal data is concerned, permission is required not only from the person who created the work but also from all the people whose personal data is in the work. Examples are diary data and audio recordings of interviews or focus group discussions: you need to obtain permission from the people involved as well as from the copyright owner.

Copyright in research data is a complicated topic, but I have tried to highlight some important issues. When it comes to data sharing, researchers need to keep in mind certain issues such as licensing frameworks, rights ownership, challenges when using social media data, and national variation, and I'll go through these now.

So, the first one. Intellectual property rights affect the way both you and others can use your research data and the research data of others, and these issues should be considered at the outset of any research project. You need to consider copyright when the data is created, shared and reused. For example, when you create data and plan to make it available for future use, then no doubt you are the copyright owner of the data, but there is another related issue that you need to consider, especially when it comes to data sharing: how you want your data to be made available. Here the role of licensing comes in. If you are sharing your primary data, you need to consider which license to choose. On the other hand, if you are using secondary data, you need to pay attention to the licenses under which that data is available.

Data collections can be available broadly under two types of licenses: open licenses and bespoke licenses. As the name implies, an open license is a standardized way to protect work, and it grants people permission to use the data openly. For example, the
most widely used open license framework is the Creative Commons license framework. The Creative Commons framework offers different options, and three of these are listed here.

The first one, CC BY, is the most widely used license. As you can see, you are allowed to use and share the data, create derivations from it, adapt it as you require, and publish your derived data, as long as you acknowledge the original data source. It also allows commercial use, that is, use for non-academic purposes, and the only condition is that credit must be given to the creator. For example, if you have downloaded a dataset which is available under CC BY, you are allowed to use it for your own analysis, you can create your own dataset using a few variables from the original data, and you are allowed to share your data for future use by giving proper attribution to the original source. You can make your data available under any other license that seems appropriate.

The second type is CC BY ShareAlike. It is the same as CC BY apart from one condition: any adaptations must be shared under the same license, so your data should be made available as CC BY ShareAlike. The final one is CC BY NonCommercial, which again has similar conditions, except that the data cannot be used commercially.

However, most of the data made available through responsible repositories such as the UK Data Service is made available under bespoke licenses, because there may be a residual risk of disclosure in the data. For example, the data owner might have removed any identifiable information, but there may be some information left in the data which, if combined with other information, may disclose someone's identity. The conditions associated with these bespoke licenses ensure that researchers act responsibly and ethically with the data. The UK Data Service End User Licence agreement is one example of a bespoke license. So if you plan to use secondary data, always make sure that you are familiar with the
terms and conditions under which the data is made available.

Here at the UK Data Service we have three levels of access for data: open, safeguarded and controlled. Open access is for data that contain no personal information. Safeguarded access is for data that contain no personal information but where the data owner considers there to be a risk of disclosure resulting from linkage to other data. It is available under the End User Licence, and users need to register to access this data. Users also need to agree to certain conditions, such as not disclosing any identifying information. Controlled access is for data that may be disclosive. Controlled data are only available to users who have been trained and accredited and whose data usage has been approved by the relevant data access committee, and access is through a virtual or physical secure environment.

The next issue to consider is rights ownership. Please bear in mind that rules on IP ownership depend on national law and individual institutional policies, and may vary from country to country. However, as a general rule the copyright in a work is initially owned by the work's creator, but this is not always the case. So let's see what the common perception in academia is regarding ownership. For this I have added four questions on Mentimeter for you, so if you can go to menti.com, you can use this code or the QR code to access them.

So what do you think: who owns the rights, the university or the employee, in research that has taken place? Mixed responses. The majority have said university and some have said employee. I think it depends. If a work is created by an employee in the course of his or her employment, the employer owns the copyright, so it's the university, unless otherwise agreed. Many universities or research centers claim ownership of any IP that is generated by academic staff in the course of their employment, and also when IP is created using substantial
institutional resources. When two or more authors prepare a work with the intent to combine their contributions, the authors are considered joint copyright owners. So as a general rule it's the university, but it may vary from institution to institution or country to country.

So what do you think: who owns the rights, the university or a student? Somebody in the chat writes that, in terms of the university or the employee, it is complex: at their institution some of the rights are owned by the university, but they have given them back to the author for some work. Yes, it always depends, so it's better to check with your individual institution.

The majority of people said it's the student and some say the university. Again it depends, but most universities recognize as a general principle that students who are not employees of the university own the IP rights in the works they produce purely based on knowledge received from lectures and teaching. However, there may be some circumstances where ownership has to be shared or assigned to the university or a third party. Typically these include sponsored students and students working on research or publications in collaboration with academics.

Now, what do you think in terms of research funder and researcher? Yes, it's almost equal, either research funder or researcher. Here again, the research funder may also wish to exert some claim over rights, although in most cases IP rights are attributed to the researcher unless an output becomes commercially viable or it is agreed otherwise. So it depends on the individual research funder.

So who owns the rights in a collaborative project? Any thoughts? "It depends on the grant." "All researchers." Yes, it is definitely joint copyright ownership. That's brilliant. If a university research project has commercial collaborators, there may be joint IP rights in the research outputs, which are best handled via legal contracts or other agreements. Everyone is right here that it depends on
the grant and the agreement; there should be joint copyright holders. It's best for researchers to clarify ownership and rights relating to research data sources, for both primary and secondary data being used, before embarking upon research. This in turn will help determine how that data can be published and accessed in the future. So thank you very much for your responses, that's brilliant.

So the best practice is to establish ownership as soon as possible if you plan to use your research or secondary sources and then share your data. It is not that hard to find out who owns the rights. If you are affiliated to a university or research center, there should be staff there who deal with ethical and legal compliance in research, like RU's, or you can find out by looking at the applicable national law, the IP policies of the university, and the individual contractual agreements among the university, creators and sponsors. As a last resort you can always seek legal advice to be confident, because failure to do so can cause serious issues for the future uses of your research, such as dissemination, any future related research projects, or profits associated with it.

The next issue is copyright considerations when using social media. I'm sure you are already well aware of what social media is. It is an umbrella term for internet-based or mobile applications that allow users to form online social networks, and some of the very popular social media platforms include Facebook, Twitter, Instagram, Snapchat and LinkedIn. However, the most widely used among these in the context of research is Twitter. The data is usually obtained through the application programming interfaces, or APIs, of the social media platforms. APIs are provided by social media platforms to enable controlled access to their underlying functions and data, and an API acts as an interface between the social media platform and the consumer of social media data. The
Twitter streaming API allows researchers and collecting institutions to obtain tweets generated by users in real time. Accessing data through APIs thus provides the most authentic record of social media. The social media data available on the platforms includes individual posts or tweets; what people share on a day-to-day basis; how people comment on posts and tweets, which shows their opinions, behavior, likes and dislikes; visual content such as photos and videos; and their interests, social interactions, networks, current trends and so on. These different platforms possess a wide variety of functions and appeal to different audiences, and they all create a byproduct of valuable data about the users who interact with them.

Just a couple of questions on Mentimeter again; I have added a code here to access it. Have you ever been involved in research that involves social media data? The majority of you have not; maybe you plan to get involved at some point. A few of you have been involved, so you must be aware of what the challenges are.

The next question is: what do you think the copyright issues could be when using social media data? I'm sure you are aware of the consent issues and data protection, so what could the copyright issues be? Any thoughts on that? "The tweets are not the original work of the researcher." "Anonymous or not real name contributors, so how to acknowledge or seek permission." "Proprietary data." "Lack of suitable accreditation." Yes, that's fine. In terms of accreditation, yes, identifying where the original work came from. That's right, thank you for your responses. Let's get back.

So that's right. The terms relating to the use of the most commonly used social media platforms are similar in how they deal with intellectual property rights. Content is protected by copyright in the same way as books and journals, so whatever you post on these platforms is considered your creation, your content. So someone noted rightly that it's not the researcher's content. These platforms
clearly state that users hold the copyright in their own content: you are the copyright holder of your tweets or Facebook posts. But although you are the copyright holder, when you agree to the terms and conditions to create your account on these platforms you sign an agreement that gives the site a license to freely use the work for a variety of purposes, including an opportunity for researchers to access the data for academic research. So researchers using social media data need to abide by the terms and conditions of the platforms or API developers, and these terms and conditions play an important role in the future uses of the data, such as publishing or archiving.

I will be using Twitter as an example, as it is the platform most widely used by researchers across the world and it is relatively easy for researchers to collect data from it. As it is an open platform, the majority of posts or tweets are available to public view, and researchers can collect large numbers of tweets in a very short period of time via the platform's API. However, although it is a valuable source for researchers, they can face challenges when it comes to publishing social media data or archiving it for future use. After a researcher or research team has created a dataset, it is not usually possible for them to deposit the dataset with an archive or collecting institution for reuse. For example, Twitter's policy restricts researchers from sharing any data they obtain from the API and also from storing data in a cloud. The policy does, however, allow the archiving of tweet IDs, so you can deposit the tweet IDs in your dataset but not the actual tweets. Tweet IDs are the unique numbers given to individual tweets, and user IDs are the numbers assigned to Twitter account holders. Other researchers could use the tweet IDs to recreate a dataset used in a previous study, but only if Twitter continues to provide access to historical data. So
it is not ideal, but at least it provides a better solution than sharing no information at all about data sources for published studies.

Besides this, there may be another challenge. Researchers use different methods to access social media data from APIs: different tools, different platforms, different types of APIs, and different resellers with different services, which create very diverse types of datasets. Furthermore, individual researchers use different methods to clean or organize their data, as well as different tools and methods for analyzing it. In addition to the IDs associated with the dataset, information about how the raw data was collected and how it was cleaned is also important and will be required for recreating a dataset or understanding how and why it has been altered. Therefore the archiving of dataset identifiers is more effective if the processes used to create them are also documented.

Twitter also places particular restrictions on the form in which tweets may be published, requiring certain items of data to be retained in the published form. So this is a challenge if you use social media data: Twitter does not allow content modification, so you cannot anonymize the content if you think that after anonymizing you can share the data. Here I have added a useful checklist by UCL. It is for reviewers, but it can be useful for researchers who wish to use social media data, and part of this checklist is about copyright.

Now another important point to keep in mind is copyright in the international context. Just a quick question, if you would like to write in the chat, it's entirely up to you, just to see how diverse the attendees are: in which country are you carrying out your research? UK. So far all from the UK, that's fine, thank you.

Every country has its own copyright laws, but over the years there has been extensive global harmonization of copyright laws through treaties and trade agreements, and these
treaties and agreements establish minimum standards for all participating countries. This system leaves room for local variation. One of the most significant international agreements is the Berne Convention. It was originally signed in 1886, but it has since been revised and amended on several occasions. This treaty lays out several fundamental principles upon which all participating countries have agreed. One of those principles is national treatment, which means that all countries must give foreign works the same protection they give to works created within their borders, assuming the other country is a signatory. Besides this, the minimum standards also cover the types of work protected, duration, limitations and exceptions. So just keep in mind that national laws are built on similar basic standards, but there may be variation at country level in terms of types of work, duration and exceptions.

For example, this is an interesting map which gives you an idea of the differences in copyright duration around the world. We can look at the map from a continental perspective and clearly see patterns of the same duration, with small exceptions. In Europe all countries, with the exception of one, adhere to lifetime plus 70 years, while in Africa we can observe more variability, with Angola and Libya at only lifetime plus 25 years. Mexico stands out and is actually the only country to adopt lifetime plus 100 years, so that's interesting.

So if you plan to use secondary data, always ensure that you consider these questions: who is the copyright holder of the dataset? Can you use these datasets, and in what ways? Are you allowed to archive and publish them in a data repository? If not, you may need to seek further permission to distribute material you do not own, because if you do not have permission you cannot archive it, and ultimately you may need to remove copyrighted variables or material before publishing or sharing. Copyright law does
allow certain exceptions. For example, you are allowed to copy limited extracts of works when the use is for non-commercial research or private study. However, in the context of data sharing, researchers are not allowed to share secondary data unless they have permission to do so, so the fair dealing exception does not work when it comes to data sharing. The majority of uses of copyright materials continue to require permission from copyright owners, so you should be careful when considering whether you can rely on an exception or not, and if in doubt you should seek legal advice.

Finally, do remember that the details of the provisions are subject to national law, and while most will be similar, details can vary from country to country. For example, users of copyright works based in the UK are subject to the specific exceptions to copyright outlined in UK copyright law. Since each country has its own exceptions to copyright, users in one country will be able to reproduce copyright work under a copyright exception in ways that users in other countries will not. So always check your national laws.

Just to give you an example of secondary data use, I have added a scenario here. Imagine a researcher has used secondary data sources for a research project and then intends to share this data for future reuse. This is a real example; this researcher has deposited his data with us. He has used the World Bank and Microsoft Academic as data sources. The first thing he needs to do is to check whether he is allowed to share the data he has used from these sources, so he needs to check the terms of use. These are usually at the very bottom of the web pages, but sometimes they are hidden and need some effort to find. I have put the terms of use from the World Bank here; it's a screenshot from their website. I'll let you read it. Here you can see it is mentioned that there is no restriction on sharing the data
with third parties, so that's all fine: he can share all the variables he has obtained from the World Bank. This is the screenshot of the terms and conditions from Microsoft Academic, and here you can see that it clearly says that you cannot modify, distribute or publish the information you have taken from there; you may need to obtain permission. So these sources, if they are open access, do allow you to use the data or information for your own personal use, but if you plan to deposit your data you do need to check the sources' terms and conditions of use.

Here on this slide I have added links to our web pages on copyright and access levels, and I have also added a useful template called the variable information log. For datasets being deposited that include secondary data sources, researchers are advised to prepare a variable information log describing these sources. This log not only allows others to understand and use the data correctly but also ensures that repositories can check the terms and conditions applicable to onward sharing. The log should include the variable name, the source, how it was collected, a brief description, and any restrictions noted on its further use.

These are some of the resources that could be useful. I have added them on this slide; not some, I think it's a long list, for you to have a look at later.

I have also added some case studies on Mentimeter and would like you to share your thoughts on them. I think this will be useful, as these are real research examples that we have when researchers deposit data. So again, a code for you.

Imagine a research project where a researcher is collecting diaries. Given that these are diaries, participants might wish to publish them in future. How would you address the copyright issues? Any thoughts? That's right: "Obtain copyright permission from the creator of the diary, or use 70 years after their death." Lifetime plus 70 years if they
are in the UK, so you check the country-specific laws on the duration of copyright. "Have a data plan and obtain permission to publish anonymized extracts." That's right. "Need a shared license, maybe restrict the amount used." Yes, these are strategies for publishing, that's right. So you can use the information obtained in the discussions for your own research, but as you have mentioned, when it comes to data sharing or publishing you need to obtain permission from the participants to share their personal information. You need to obtain consent to share the information and permission to use the data for future research. The same applies to audio recordings of interviews and focus group discussions. "The diary owner provides a limited license to the researcher covering what use could be made of the diary." Exactly, that's right, so you need permission.

Here's another case study, on information available online. I'll let you read that. Someone in the chat writes, on the previous one, that you can write it into the respondent agreement. That's true.

In this scenario, researchers study how health issues around obesity have been reported in the media over the last 10 years. They have used freely available newspaper websites and library sources to obtain articles on this topic, copied them into a database and coded them according to various criteria for content analysis. So the question is: can the researchers use such public data without breaching copyright? Can the database be archived and shared with other researchers? Any thoughts on that?

Someone has written, "Yes, it's already in the public domain." No, I don't think that is the case. That would be the case if the copyright had expired, which is lifetime plus 70 years for the UK. "Each author of the article would hold the right to their work, so permission would be needed." "May fall under fair use." Fair dealing is the allowance for researchers in academia, which covers their own personal research. You can create your
database, you can copy it into your thesis, and you can use quotations for your own research, but when it comes to data sharing you cannot; you have to obtain permission to share. "It depends on the original copyright." That's right. "No, being publicly available doesn't mean it's okay to share, just like social media posts." Exactly. "As long as you put the facts in your own words." Yes, that is a very nice option that we follow at the UK Data Service: if you cannot obtain copyright permission, you can summarize the facts in your own words and then deposit that, but you cannot use the exact information.

So as a rule, even though the articles obtained are freely available online, they might still be subject to copyright. While such information can be used for personal research purposes under fair dealing, as you have mentioned, the articles cannot be archived unless permission is obtained from the newspapers or other sources you have used; otherwise this would breach copyright. So the terms and conditions of all the data used should be checked before the archiving process begins. Maybe it's open access and the terms of use state that you can use it freely, share it and modify it, but as a rule you need to check the terms and conditions.

So that's another case study related to archiving data. In this one a researcher uses International Social Survey Programme (ISSP) data obtained from GESIS, which is a repository. These data are available to registered GESIS users, and the researcher incorporates some of the ISSP data within a database containing his own research data. Can this database be placed on the researcher's website? It's more or less similar to the previous one, so what are your thoughts?

While you are writing your thoughts I am just checking the chat. Someone has asked whether, for using OGL (Open Government Licence) data, the citation alone is enough. Yes, I think so. The OGL allows you to share the information
with third parties with a citation or attribution, but it's always better to check the specific terms. But I think yes.

There is another comment in the chat: "If you are not permitted to cite an author's own words, then qualitative research is impossible." That's right, it is really tricky in terms of qualitative research, and I think it is very hard when it comes to data sharing. So always try to persuade your participants to give you consent or permission to share. You can anonymize the data, but it is hard, no doubt, in terms of qualitative data.

Coming back to the case study, someone has written, "There's no copyright in facts." Yes, but this wasn't, of course, just data based on facts. The researcher has taken it from the repository, and it is survey data available through that repository. "It depends on the licensing and user agreement for the data." That's exactly right. "This would be a derivative work of sensitive, restricted data, so special permissions will be needed." That's right. Although the ISSP data are available for free to all registered researchers at that repository, this does not mean that the data can be published on a website and made available to others. The data can be incorporated into a database and used for personal analysis, but before this dataset is placed on a website, permission must be sought from the data owner. That's the course of action that needs to be taken.

Another case study, related to transcription, is more or less similar to the previous one. A researcher has copied a series of statistical information from a printed work into a spreadsheet. The transcription is a direct copy with minimal alterations, and the book is in copyright. So what would be the issue here? Any thoughts? I'm sorry, I may have used several case studies, but I believe this is the best way to understand the different issues. "Plagiarism." Yes, it is plagiarism, but I think we
can copy others' work for our personal use, unless we share it somewhere else. So someone has written "there is no problem, need to cite the source, but you can't paraphrase data" and "no copyright in facts"; that's right. Someone has written in the chat that you need to cite it; that's right. "If the excerpt is cited and the original material has a clear license (CC BY), it might be a matter of citation, but permission should be asked to be safe": that's right. The researcher should technically have cleared copyright before transcription. If the work is for personal use, this can probably be disregarded, but if the newly constructed dataset is to be archived and disseminated, copyright clearance will need to be gained from the copyright holder. So yeah, it doesn't matter if you are using it for your personal use, but when it comes to putting it on a website, sharing it with others, or archiving it, then copyright permission does matter. You need to check the terms of use, you need to check the licenses attached to it, and you need to act accordingly.
So that's the last case study: open data obtained from the UK Data Service. This researcher has used data from the Participation Survey, which is available on our website, and this data is under Crown copyright and available as an open access collection. The researcher creates derived variables for their analysis, and they would like to archive the derived data at the UK Data Service. So this is the last case study. What are your thoughts on this? "It would need to be cited": that's right. Sorry, someone needs to see the previous slide. These are long examples, so I couldn't put them on one slide; that's why I have to use two. I will find a way where you can keep on reading the case study and then answer it, but at the moment I couldn't find any way on Mentimeter. I hope you have read that by now. So: "it would need to be cited", "need to cite the original work, but the new work is derived data with new creative input". That's correct. Depends on the
permission in the Crown copyright license, exactly. "Regarding derivatives and sharing, if not granted, permission should be requested." "Cite the original dataset": yeah, I think that's correct. "It's open because it's freely available online, not because it's under a CC license. That means the researcher would need permission to share it." "They cannot archive data that is not theirs when there is no added material; they should cite the original data": that's right. The first issue in terms of copyright is to declare joint copyright ownership. There is joint copyright over the processed data, shared between the researcher and the Crown. So this is the first issue: the researcher must declare this joint copyright. On the other hand, the data collection is published under the Open Government Licence, so no further permission from the data creator is required to archive the derived data, as long as acknowledgement is provided as described in the Open Government Licence. However, it all depends on the licenses attached to the data, so always check the terms and conditions of the sources, including the license.
So yeah, these are the case studies I have added, and I hope by now it's quite clear to you what you need to consider when it comes to copyright in secondary data use, especially archiving or depositing it for future use. So that's all. Thank you very much for listening. Are there any questions? This is my email address; you can always email me if you have any specific questions.