Hi, my name is Margaret Henty and I work as a Policy Advisor in the Frameworks area of ANDS. This means that my work is involved with all of those major policy and legal issues which have an impact on research data and the way in which it's managed. Ethics is one of those issues.

ANDS is developing the Australian Research Data Commons, which means we're encouraging researchers and research institutions to identify what data they have and make its existence known through Research Data Australia. If the existence of data is disseminated in this way, then it's available for subsequent researchers for reuse and reanalysis. By doing this, we're supporting the better use of information resources and, we hope, improving research outcomes.

Ethics is seen to be a barrier to data dissemination and sharing. However, it doesn't need to be, and today's webinar will attempt to provide researchers and others with some practical advice to be followed if their data is to be shared. ANDS has recently produced a practical guide to ethics, consent and data sharing which is available from the ANDS website. There's been a lot of interest in this guide and today I'll be going through some of the information contained within it. The Wordle on your screen is based on the text of the ANDS guide with the words "research" and "data" removed.

What I'm not going to talk about today is the ethics of data sharing itself. That's a worthwhile topic and one which research institutions may well like to consider. Most research is funded from the public purse, so perhaps it could be seen as unethical not to share data, but that's a debate for another day.

Before we go any further, I'd like to be a little more precise about what we're talking about. First of all, what kind of data are we talking about? We mean data which is created in the course of research on human subjects and which is subject to ethical approval and oversight.
The data might be statistical data, survey data, audio-visual materials of one kind or another, or in other forms and formats. While we tend to think of health and medical data immediately here, there are many disciplines involved in research which needs ethical approval and oversight. Examples might include research in anthropology, sociology, psychology, history which involves, say, oral history interviews, journalism, economics, musicology and more. Each institution has clear definitions available about what research will need ethics approval.

And what do we mean by data sharing? This is the practice of making data used for scholarly research available to other investigators under conditions set by the original researcher. It's not practical to suggest that all data can be shared openly via the internet, so conditions around data sharing may well be applied.

And what do we mean by data dissemination? This is the practice of making known the existence of research data which has been newly created in the course of research. It can be measured in a number of ways and these aren't mutually exclusive. We might say that data has been disseminated if a description of the data is available in a publicly accessible registry, repository or catalogue such as Research Data Australia. This doesn't necessarily imply that the data is publicly available, as there may be legal, ethical or commercial considerations limiting access. Data is really only shareable if it's in a machine-readable, open and standards-based format. The data might only be available on request or it may be available for purchase. We might say that non-identifiable data is available for reuse following the obtaining of consent from participants and appropriate ethics permissions.

So who are the players here? Which is to say, who needs to know all this stuff? The ethical conduct of research is a complex business and involves many players.
Researchers, of course, are the first to come to mind, but they work within an administrative framework which oversees their research practice. Human research ethics committees are central to overseeing the design and conduct of this kind of research. Others engaged in research administration provide information and guidance for researchers. Data administrators are those who ensure that data is securely stored and accessed. This is important not just during the research but for later curation, preservation and access. The list here isn't intended to be comprehensive but to show how many people are engaged and the kind of role that they might play in meeting your obligations.

In Australia there are various requirements set out about the conduct of research and the desirability of making research data available for reuse. Research on human subjects is comprehensively covered by the National Statement on Ethical Conduct in Human Research. This recognises the value of making data available for future research while emphasising the need for preserving confidentiality where this is required. How the researcher might deal with confidentiality will depend upon the nature of the research but must take into account the researcher's ethical and legal obligations. Researchers are also bound by the guidelines set out in the Australian Code for the Responsible Conduct of Research and the obligations set out by funders.

The Australian Code for the Responsible Conduct of Research has quite a lot to say about the dissemination and sharing of research data. It says, for example, "the potential value of the material for further research should be considered, particularly where the research would be difficult or impossible to repeat." It says, "research data should be made available for use by other researchers unless this is prevented by ethical, privacy or confidentiality matters."
It also says, "researchers have a responsibility to their colleagues and to the wider community to disseminate a full account of their research as broadly as possible."

Australian funders are increasingly mentioning the need for data dissemination. For example, the Australian Research Council (ARC) Funding Rules for 2012 state that the final report must justify why any publications from a project have not been deposited in appropriate repositories within 12 months of publication, and that the final report must outline how data arising from the project has been made publicly accessible where appropriate. The revised funding agreement of the National Health and Medical Research Council states that, if required by an NHMRC policy about the dissemination of research findings, the administering institution must deposit any publication resulting from a research activity and its related data in an appropriate subject and/or open access repository in accordance with the timeframe and other requirements set out in that policy.

There's also significant legislation that may impact on the sharing of confidential data. The Commonwealth Privacy Act 1988 is an important one and that has various state equivalents, and the Human Rights Act 2004 also has state equivalents. The Privacy Commissioner website provides good definitions of what is meant by personal and confidential data. Personal data, it says, is defined as information that identifies or could identify someone. There are some obvious examples of personal information, such as name or address. Personal information can also include medical records, bank account details, photos, videos and information about personal preferences, opinions or occupation: basically any information where someone may be reasonably identifiable. Confidential data, on the other hand, is data given in confidence or data agreed to be kept confidential, which is to say secret, between two parties.
It is data that's not in the public domain, such as information on business, income, health, medical details or political opinion.

In the same way as people over the centuries have invented mythical creatures such as the unicorn and the two griffins we see below in this slide, so too are there myths around data. These are three common ones: that human research ethics committees forbid the sharing of data, that data must be destroyed after a certain time, and that ethical clearance once given cannot be amended. In practice, however, what we find is that data sharing is not high on the agenda of most human research ethics committees and few have expertise in this area.

More data is shareable than might at first appear. The amount that can't be shared is probably only a small fraction. Even confidential data can be shared provided that consent has been given to allow it. Examples of where a researcher might seek permission for the dissemination of confidential data might be oral history interviews with prominent people who would be only too happy to share their views about a particular topic. Retired politicians come to mind. The kind of data which is not shareable is data which contains information unique to an individual, such as genetic makeup, or data that cannot be de-identified without losing its meaning or involving huge cost. We'll come back to the topic of de-identification later on.

There are three practical steps which researchers can consider if they want to share their data and we'll get to those shortly. But keep in mind that planning ahead is the key.

When I started to think about preparing this ANDS guide, I looked up the ethics requirements set out in the application forms for about a dozen Australian universities. Not one mentioned the possibility of data sharing either directly or indirectly. In the six months since then there have been a couple of changes. Victoria University has recently altered its ethics application form to include two questions.
Is there an ethical reason not to share the data from this project? And can the information collected on the ethics form be reused in other places, such as collecting project descriptions for office of research reporting purposes? The first of these questions goes directly to the issue of data sharing and asks that applicants consider the benefits of making their data available for the later use of others. The second is designed to simplify processes in the research office and elsewhere so that select pieces of information collected on the form can be moved directly into other databases without the need to re-key or to ask researchers to provide the same information more than once. This can be time-saving as well as adhering to the general principle that information should be collected only once but be able to be used for many different purposes. Another university which has taken action in this area is the Australian National University, which is adding the same requirement about data sharing onto its ethics application forms.

There are some things that researchers can specifically do. They can incorporate data sharing and dissemination intentions into their research planning and they can discuss these intentions with the institution's human research ethics committee. Then there are three practical steps that they can take to ensure that data can be disseminated and shared.

Step one involves getting informed consent. Informed consent is an important element in the conduct of this kind of research. Researchers are expected to obtain informed consent for people to participate in research and for use of the information collected. If the researcher hopes to be able to share the data at a later time, the consent form should be designed to set out precisely what will be done with the data, how it will be stored and made available, and how confidentiality will be maintained.
I said previously that confidential data can be shared and it certainly can, but in most cases subjects will feel more secure about confidential data not being shared. While consent can in most cases be obtained retrospectively, this is time-consuming and impractical. At a minimum, consent forms should not preclude data sharing, for example by promising to destroy data unnecessarily.

The National Statement defines three levels of consent for the future use of data, which must be made clear to the research subject. The first level they call specific, which is to say limited to the specific project under consideration. The second, extended, is given for the use of data in future research projects that are either an extension of or closely related to the original project or, alternatively, in the same general area of research. The third level is unspecified, which is given for the use of data or tissue in any future research. If you use the specific option, then you rule out the possibility of disseminating the data further. The third level, unspecified, provides the most options. In addition, the consent form must specify whether the data is to be held in a form which is identifiable, non-identifiable or re-identifiable. And once again, planning ahead is the key.

The second step the researcher can take involves access control. Sometimes the best way to share sensitive and potentially re-identifiable data is by regulating who gets access and under what conditions. If you're going to do this, you'll need to consider very carefully where you plan to deposit the data for the longer term, as you need to be sure that there are proper systems in place to manage access. There are a couple of places where it might be appropriate to place your data. The Australian Data Archive, ADA, which was set up originally to hold sociological data, is one. Data held here is generally not for public use and use is restricted to specific purposes after user registration.
Users sign an end user licence in which they agree to certain conditions, for example not to use data for commercial purposes or to identify any potentially identifiable individuals through data mining or other techniques. ADA Qualitative is the part of ADA which accepts qualitative data rather than survey and quantitative data. ATSIDA is the Aboriginal and Torres Strait Islander Data Archive and contains materials about those groups. Research institutions offering facilities for the storage and access of sensitive and confidential data will need to have similar facilities in place to ensure properly regulated access. You might like to inquire if your own institution can support the kind of access controls you need.

Licensing your data lets other researchers know what they can do with it. For this kind of data, you may need a restrictive licence. If the data is not re-identifiable, then a CC licence, or Creative Commons licence, might be adequate. And again, planning ahead is the key.

The third step is anonymising data. Anonymising data means just that: removing anything from the data which might identify the individual. Data can be classified as identifiable, non-identifiable and re-identifiable. Identifiable data contains information which will allow for the identity of participants to be known, for example name, address, phone number, date of birth and so on. These are the elements which need to be removed or generalised so that an individual can't be identified. Non-identifiable data has been edited to eliminate or generalise any information which might identify the participants. This means not just taking out names and addresses and other obvious information, but putting the data through a broader process to generalise fields such as postcode, age or geographic location. Re-identifiable data is data which, although not immediately identifiable, contains elements which might allow someone to determine identifiers at some later time.
This might be possible through data mining or other kinds of manipulation.

Looking at some of the techniques of anonymising data, some of them are pretty obvious and I've talked about them already, such as removing direct identifiers like name, address and phone number. In addition to this, however, it's a good idea to reduce the precision of information by, for example, removing dates of birth and replacing them with age groups, or generalising occupations so that a pediatric cardiac surgeon becomes a medical specialist. You could also look out for outliers, which is to say subjects who are 104 years of age or over or who own over a million dollars, when they stand out from others in the research sample and that might make them readily identifiable. Other things to consider include geo-references which might lead to identification.

There are those who say it's not possible to anonymise data, and it is true that data mining techniques can go a long way to sifting out identities. If this is a danger, then access control is recommended so that anyone wanting to use the data has to agree that the data will not be subject to this kind of manipulation. It's been shown that in the US 87% of the population could be uniquely identified using only three pieces of information: their zip code, their date of birth and their sex. So the need for anonymisation is clear.

The cost of anonymising data can be significant unless it's something that's been carefully planned for at the outset of a project. One exception to this is audio-visual files, and these are expensive and time-consuming to anonymise. It's possibly better to leave them as they are and get permission to share, if that sounds reasonable, or to make transcripts available rather than the original voice conversations. And again, planning ahead is the key.

Researchers aren't the only people who have a role to play here in doing something about increasing rates of data sharing.
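As an aside on the anonymisation techniques described earlier (removing direct identifiers, replacing dates of birth with age groups, generalising occupations and watching for outliers), here is a minimal sketch of how they might look in practice. It is an illustration only, not a tool provided by ANDS or anyone else mentioned in this talk. The field names, the occupation mapping and the age cut-off of 90 are all hypothetical assumptions, and a real project would need to choose its own in consultation with its ethics committee.

```python
from datetime import date

# Hypothetical field names; a real dataset will have its own.
DIRECT_IDENTIFIERS = {"name", "address", "phone"}

# Hypothetical mapping from precise occupations to broad groups.
OCCUPATION_GROUPS = {
    "pediatric cardiac surgeon": "medical specialist",
}

TOP_AGE = 90  # assumed cut-off: ages at or above this are grouped together


def age_band(dob, today, width=10):
    """Replace a date of birth with an age group such as '30-39'."""
    # Subtract one if the birthday has not yet occurred this year.
    age = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
    if age >= TOP_AGE:
        # Very old subjects are outliers, so they share one open-ended band.
        return f"{TOP_AGE}+"
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def anonymise(record, today):
    """Return a copy of the record with identifying detail removed or generalised."""
    # Drop direct identifiers outright.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Reduce precision: date of birth becomes an age group.
    if "date_of_birth" in out:
        out["age_group"] = age_band(out.pop("date_of_birth"), today)
    # Generalise occupation; anything unmapped falls into a catch-all group.
    if "occupation" in out:
        out["occupation"] = OCCUPATION_GROUPS.get(out["occupation"], "other")
    return out
```

A sketch like this only covers the obvious fields. It does nothing about geo-references or about combinations of remaining fields that could still single someone out, which is why access control may still be needed.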
Human research ethics committees are well placed to support data sharing and dissemination in a number of ways. The role of these committees is to help protect the safety, rights and wellbeing of research participants and to promote ethically sound research. This involves ensuring that research complies with legislation regarding the use of personal information while also adhering to funder rules about the sharing of data.

Human research ethics committees can ensure that the issue of data sharing and dissemination is addressed in ethics applications by making sure that relevant questions are asked and answered thoughtfully. They can actively support the data sharing aspects of funder requirements. They can advise researchers that most data obtained from participants can be successfully shared without breaching confidentiality, that the privacy laws don't normally relate to anonymised data, that personal data should not be disclosed unless consent has been given for disclosure, and that identifiable information may be excluded from data sharing. They can provide advice about appropriate data storage and access facilities and support the development of these facilities locally. They can emphasise the need for good data management and the need for careful data management planning.

I'd like to take the opportunity to acknowledge the contributions of the many people who helped create the ANDS Guide. Some provided text, some comment, and the UK Data Archive gave us permission to use the ethics chapter of their data management guide as the basis for ours. And we've had useful feedback from the staff of Victoria University, ADA, ATSIDA, the staff of the NHMRC, the staff of the Research Office of the Australian National University and, of course, staff of ANDS.
This is actually a repeat recording of the ANDS webinar because we had technical issues with the original recording, but there were questions posed at the end of the presentation and I'd like to go through some of these and the answers which were given.

The first one we talked about was: will ANDS be providing a data anonymisation service, or will indeed anyone? That's a really good question, but the answer is that I'm not aware of anybody who is planning to provide such a service. The Australian Bureau of Statistics, I understand, does provide such a service, so you could always enquire there. There are some developments, however, that will help in a different way rather than actually using an external service. One is the development of some software known as Quadrant. Quadrant is a secure, web-based, cross-institutional research project management and data collection platform to be used for this kind of human research. It's a self-managed, integrated, collaborative environment, it's very secure, it provides really clear data management processes and so on, and it is ethically designed to minimise any problems. One of the things that Quadrant does do is, at the end of a research project, it will actually provide a de-identified data set which can potentially then be used for sharing. Quadrant isn't available to everybody just yet, but we hope it will be sometime in the near future, and ANDS will certainly be publicising its availability when that happens.

One person asked whether ANDS has a document that can be provided to the group responsible for ethics in institutions, outlining issues and presenting suggested approaches and procedures. The answer is that at the moment we have the guide, but if you have any other requests for documentation please let me know. ANDS is doing what it can to reach all the human research ethics committees in Australia. We have distributed information about the guide to about 150 of the over 200 ethics committees.
We are planning to talk to groups through ARMS, the Australasian Research Management Society, and to use any other means we have at our disposal to do that.

On a slightly different topic, there was a comment made by one of the webinar attendees saying that one thing I didn't talk about with anonymising data is that researchers simply don't have to ask for personal information up front when it's not required. That's absolutely true, and no, I didn't mention it at the time.

Another comment came from Monash University. One of the useful things they did at Monash was that all procedures relating to research got reviewed at the same time as the data management policy was done, so discussions covered both ethics and data management at the same time rather than the two being created separately, and that sounded like a very sensible idea.

There was a question around the anonymisation of video data and I had to refer people to the internet and to experts on that, as I'm not personally an expert in that. And there was another question about any tips on overcoming the three myths. Well, it's amazing how many institutions have come forward with those exact three aspects, so I guess it comes down to having an institution-wide approach to disseminating information about ethics.

So thank you very much indeed for listening. Goodbye.