Hello everyone, I think we'll get started. So I'm Bonnie Tijerina and this is Emily Keller. We're both here from the Data & Society Research Institute. For those of you who don't know me, I was an academic librarian until about a year and a half ago, when I went to Data & Society as a research fellow, and I'm now working on a few projects related to libraries at Data & Society. We're going to tell you a little bit more about Data & Society in a few slides. Emily Keller is a project coordinator at Data & Society. She works on ethics-related initiatives, including this project with me and danah boyd, as well as the Council for Big Data, Ethics, and Society, which is led by Data & Society in partnership with the NSF. So today we're here to talk a bit about what we see as emerging issues in a technical space within and outside of academia. Let me know if you can't hear me at any point. We can stand if necessary. We came at this project wondering what roles libraries can play, looking broadly at our skill sets and values, as well as narrowing in on what a sample of research libraries within the United States are doing, or want to be doing, on their campuses. So we're not here to give answers or say, this is what your library needs to be doing. We do give some suggestions at the end of our talk. But what I hope our talk can do is be more of a challenge to the academic library community: to consider what our role is in emerging information and data management issues, and what new services we can create or new partnerships we can develop to better support cutting-edge research. The project we're going to discuss with you is an exploratory project which initially sought to answer the question: is there a role for libraries, as part of a campus support network, in helping technical researchers navigate emerging ethical issues in "big data" research? And sometimes we do put "big data" in quotes.
Many research libraries provide a growing array of research data management support services, partially in response to data management plan requirements. This project was intended to explore big data research in computer science and the new ethical landscape that computer scientists are navigating. We then wanted to map the network of formal and informal support for these researchers, where there are gaps, and what role, if any, the library could play in supporting these researchers at various points in their research life cycle. I'd also like to thank the Alfred P. Sloan Foundation, which funded our exploratory project, and specifically Josh Greenberg from Sloan, who has spent a bit of time with us and helped us brainstorm along the way. I mentioned big data research just now, so I think it would be helpful to define what I mean by big data. There is no one definitive definition. Often we think of big data as just referring to the size of the data set, but even then, what is big enough to be big data? Some define big data not by size but by the tools that allow us to manipulate and analyze data in ways we never could before. Others look to the variety of data sets that can be analyzed simultaneously, providing results or interesting connections that couldn't have been made in the past. The researchers in our study all came with their own definitions of big data or data research, based on how they use data in their research. Before we go on talking about our project, I want to start a little broader. I'm sure many of you have heard or seen, in popular news stories and higher education publications, extreme examples of what can happen when data is misused. Cliff mentioned a little about this in his opening talk too: research that has violated some existing, or not yet existing, agreement in a way that just makes people feel icky, or makes us question how far is too far.
As new research possibilities emerge due to technological developments, policies, guidelines, and protocols are not yet able to keep up with these advances. Many researchers are grappling with what is acceptable behavior and what warrants further thinking. The Facebook emotional contagion study raised issues such as split responsibility between collaborators with different ethical review processes and standards; what qualifies as harm to research subjects; consent in social media; and the ethics of using third-party data for research, which was the basis for Cornell's IRB decision to decline review. I'm sure most of you know about the Facebook contagion story, but if not, I recommend Googling it. I specifically mention it here because the study and its fallout came up several times in our interviews with researchers. Many researchers were concerned about how reactions to the study could impact research; many felt that overreactions were making researchers more hesitant to conduct certain studies or share existing studies publicly. These examples draw attention to the growing challenges in oversight and review, including potential gaps or missed opportunities to teach ethical behavior, and they provide a frame for understanding more routine incidents of questionable ethics. They raise questions like: Should publications require data to support findings prior to accepting papers? How many steps ahead must a researcher think to protect her data from malicious future uses? When should the reuse of public data require a new review? When does the nature of a study preclude consent to ensure accurate results, and in what circumstances is this trade-off justified? These questions fit pretty well within the initiatives at Data & Society, so I wanted to talk a little bit about Data & Society.
Data & Society is a think-slash-do tank in New York City, which examines the social, cultural, and ethical impacts of technological development, including living in a data-centric society. Currently, our main initiatives are these six: data and fairness; the future of labor; education, as in connected learning; intelligence and autonomy; data, humanity, and human security; and ethics. Under ethics, Data & Society hosts the Council for Big Data, Ethics, and Society, which seeks to create a framework for thinking about the complex nature of big data ethics. The Council is made up of an interdisciplinary set of scholars and ethicists who are thinking about these issues from a variety of perspectives. This project originated with a conversation between me and danah boyd, the founder of Data & Society. She was talking about this Council and how they were thinking about these issues at a very high level, and I wanted to think about what these issues look like on the ground, with researchers and with the support systems that help them. I told her about the work librarians do in supporting data services, the new areas libraries are exploring, and the data management planning work libraries are involved in. So we decided to bring together her interest in learning about what's happening with computer scientists and my interest in thinking about the role of libraries, both our current role and potential emerging roles, and we proposed this project. I thought I'd start by talking about some categories of ethical concerns. We'll talk a little bit about our methodology later, but to give you more context as to what we're talking about, I wanted to focus on a couple of areas and give you some examples. Under data collection, web crawling and scraping information for research studies seems to be causing a lot of concern.
Researchers often violate terms of service, especially when mining or scraping, sometimes by accident and sometimes on purpose. One researcher had his access to the network cut off for web scraping, after which the library helped him negotiate access. Another researcher intentionally violates terms of service on government sites, on principle, because he doesn't think governments should be able to ban researchers from using their data. A handful of researchers said they violate Twitter's terms of service, which require you to expunge any tweets that have been deleted by users, a logistical challenge for researchers. Ignoring this is a legal violation, and at least one professional association said they will not publish anything that violates terms of service. However, conferences, associations, and publications vary in whether they enforce that requirement. So who is teaching computer science students about copyright and legal issues when they scrape? Another, related issue is secondary data use. To what extent should voluntary participants who self-report about their medical conditions on a public online listserv, one that is private by obscurity, be protected when researchers collect and publish their data? While the data is publicly available on the web, there is this idea of privacy by obscurity: only a very specific community would have an interest in it. What does it mean when that data gets amplified through research or through other means? Regarding data storage, we heard concerns about third-party storage. Dropbox might promise security, but what if they change their policy? What if personally identifiable information accidentally leaks or gets shared because of this change? Is the researcher responsible for keeping on top of policies and changes at every company they've interacted with throughout their entire research process? And do they have to do that forever?
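As an aside on the deleted-tweet requirement mentioned a moment ago: honoring it means periodically reconciling a local research archive against the platform. Here is a minimal sketch; `check_deleted` is a hypothetical stand-in for whatever API call or compliance feed reports that a tweet no longer exists, since the real mechanics vary.

```python
def purge_deleted(archive, check_deleted):
    """Remove tweets the platform reports as deleted.

    archive: dict mapping tweet_id -> tweet record
    check_deleted: callable(tweet_id) -> True if the tweet is gone
    Returns the list of purged tweet ids.
    """
    deleted_ids = [tid for tid in archive if check_deleted(tid)]
    for tid in deleted_ids:
        del archive[tid]
    return deleted_ids

# A periodic job would call purge_deleted against the live platform;
# here a stub marks one tweet as deleted, for illustration only.
archive = {101: {"text": "hello"}, 102: {"text": "world"}}
removed = purge_deleted(archive, check_deleted=lambda tid: tid == 102)
```

The logistical burden the researchers describe is exactly this: the check has to keep running for as long as the archive is kept.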
This issue is beyond university control, but the university can help provide some guidance on how to navigate it. Worse, we've seen improper storage. One researcher talked about what he has seen when researchers work with corporations and their data. He said, I feel like some of the biggest ethical violations are when I talk to computer scientists who did some internship, and maybe they have their hard drive with a terabyte of super sensitive data on it that they just brought back with them, and didn't really realize that this is not a good thing to do. Regarding data sharing, reuse, and replicability: for sharing, what counts as public data? When is it okay to republish data that is public but maybe doesn't have a lot of visibility, like I mentioned before? Is consent required to reuse publicly available data? Unobtrusive research raises consent issues, as in the Facebook emotional contagion story. There aren't clear guidelines, and people don't want another media nightmare like the one the researchers working on the Facebook study faced. One person told us that there's limited voluntary usage of repositories, in part because preparing one's data for replication is time consuming and is not tied to career advancement or reputation within academic research culture. One librarian said the notion of reuse is new, and researchers and faculty generally focus on designing their data to showcase their own research and the integrity of their work, rather than to generate reusable data for others to apply new research questions to. Another librarian said that while attempts at reuse are low overall, some faculty members have encountered difficulties and frustration when trying to reuse data for a project due to the way it was preserved or documented. Regarding re-identification and consent, one researcher was interviewed by a journalist about her research, which included publishing anonymized tweets.
The journalist Googled these tweets and then interviewed a participant without telling the researcher. She's since become more cautious; she has actually changed the way she publishes and conducts her research because of this incident of someone de-anonymizing one of her participants, and she no longer quotes tweets or any content over five words. However, another interviewee said you shouldn't assume that any data are permanently de-identified, because, we're finding, it has become a parlor game for computer scientists to prove that they can re-identify almost any data set given enough networked information about it. And then we have these unknown and emerging issues. New ethical issues are unfolding all the time. One of our interviewees said, one of the things that worries me more is that as we develop new methods and new technologies and new techniques and new types of research, new potentials for harm and new ethical questions are coming up for which we don't have many examples yet. So I'm going to turn it over to Emily, who's going to talk a little bit about the first part of our research project. Thank you. It's gotten more complicated in recent years. Computer scientists are not trained like social scientists to think about where their data comes from. Human subjects review processes and IRBs are designed to address one part of the puzzle, providing ethical oversight to research projects with risks to individuals. Yet many of the complex ethics questions and trade-offs that are emerging are not best addressed by existing IRB protocols, sometimes because of the use of third-party solutions, or because issues emerged late in the project. We suspected that computer scientists would benefit from having a trusted partner to work through ethical issues as they arise, and that research libraries may be well positioned to provide this. We talked briefly about our project before, and now I'd like to dig into some of the nuts and bolts of our work.
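The re-identification "parlor game" quoted above often amounts to a simple linkage attack: joining a nominally anonymized data set against a public one on shared quasi-identifiers such as ZIP code, birth year, and gender. A toy sketch, with entirely made-up data:

```python
# "Anonymized" records: direct identifiers stripped, quasi-identifiers kept.
anonymized = [
    {"zip": "10003", "birth_year": 1985, "gender": "F", "diagnosis": "asthma"},
    {"zip": "94110", "birth_year": 1990, "gender": "M", "diagnosis": "flu"},
]

# A public roster (e.g. a voter roll): names plus the same quasi-identifiers.
public_roster = [
    {"name": "Alice", "zip": "10003", "birth_year": 1985, "gender": "F"},
    {"name": "Bob", "zip": "94110", "birth_year": 1990, "gender": "M"},
]

def link(anon_rows, roster, keys=("zip", "birth_year", "gender")):
    """Join on quasi-identifiers; a unique match recovers an identity."""
    index = {}
    for person in roster:
        index.setdefault(tuple(person[k] for k in keys), []).append(person["name"])
    matches = {}
    for row in anon_rows:
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:  # unique match -> re-identification
            matches[names[0]] = row["diagnosis"]
    return matches

reidentified = link(anonymized, public_roster)
```

Stripping names is clearly not enough; the more external, networked data exists, the more combinations of ordinary attributes become uniquely identifying.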
Again, our question was: is there a role for libraries, as part of a campus support network, in helping technical researchers navigate emerging ethical issues in big data? To help us answer it, we set out to interview computer scientists and librarians and to map ethical protocols, funder requirements, and the different levers for integrating ethics into computer science education and research. We targeted publicly funded universities and top-tier and mid-tier computer science programs. We also spoke to one private institution. All our institutions had strong research libraries. Our project had two parts. One was phone and Skype interviews, with librarians interviewed by Bonnie and me, and CS or iSchool researchers interviewed by danah boyd and me. For librarians, we asked about their role with data management and DMPs. We asked if they've seen examples of research data reuse, and about data retention policies. We also wanted to get a sense of how librarians think about privacy and ethics, so we asked them what those words mean for their work and how they think librarians or researchers should engage with these issues. We asked what role they can imagine librarians playing in helping CS researchers with data-related privacy and ethics concerns, and what barriers exist, whether organizational, technical, funding, or procedural. Lastly, we had them imagine that they were invited to design a program, with no big barriers, in which librarians helped computer science researchers. What kind of training should librarians have? What would the relationship with researchers be, and what would the interactions be like over time? For computer science researchers, we dug more into the ethics questions. We asked what it means to be an ethical researcher, what major privacy and ethical challenges they've faced, and what relevant training they've received.
We asked whether their experiences with IRBs had been helpful, and about a current data-related project they're working on and how their funders affect what they do with their data. We asked about DMPs, data sharing, and reproducibility. We asked about their relationship with the library, whether they've talked with librarians about privacy or ethics, what role they could envision for librarians in helping them manage and share data, and, if they could get help with privacy and ethics issues, what they would ideally want. In the end we interviewed 23 people at 9 institutions for this part of the project. We'll talk a little more about part 2 of our project, but first we want to tell you what we learned that helped us with the second part. What we found from the interviews was that current formal structures to address these issues are not sufficient. For IRBs, we heard that they provoke some questions but are not the best structure for thoroughly addressing ethics. They are too focused on compliance, approval, and alignment with grants, versus promoting discussion or providing an advisory function. Researchers hesitate to share ethical concerns in their IRB applications, and IRBs are slow, which discourages researchers from sharing potential problems. They may be too lenient, especially on secondary data use, as they don't ask how the data was collected. Researchers describe their relationship with the IRB as transactional and sometimes adversarial. Several researchers said ethical responsibility ultimately rests with them, not the IRB; they must decide what they are willing to do, and the IRB is more effectively utilized as an advisor than as an ethical proxy. For DMPs, we found that researchers often circulate and copy or repurpose their plans. Libraries often review only a handful of DMPs per year and offer the DMP tool to many researchers; pre-filled templates are also common. Researchers critique the DMP for lacking a follow-up method to ensure the data was shared.
Many researchers view DMPs as obstacles, checkboxes, or impediments to getting a grant rather than opportunities for contemplating their work. We saw that funder requirements lead more researchers to the library. They also create tension between privacy and sharing mandates, both of which lack detailed guidance. What counts as anonymous or confidential with sensitive information is complex. Data sharing often happens through trusted personal relationships rather than publicly, and some researchers avoid sharing their data due to fear of misuse. One researcher in particular told us she often tells the NSF upfront that she has no intention of sharing her data due to its sensitive nature, and she said that explanation is accepted. One finding was that people defined ethics differently. Asked for one word that ethics makes them think of, participants said: violations, trustworthy, morality, conscience, responsibility, protection, integrity, teaching, mistakes, transparency, and objectivity. Ethics was often associated with providing anonymity, consent, confidentiality, security, and privacy to subjects. For a researcher, it means accountability, responsibility, participatory research, doing no harm, sensitivity to participants' goals and outcomes, awareness of research implications, balancing inquisitiveness with risk, respecting tacit agreements or user expectations for social media data, setting personal rules about acceptable funders, and warning subjects about information that may be inferred from their data in combination with external sources. Researchers vary in whether they evaluate the ethics of a project based on intent versus research protocols or outcomes. The puzzle becomes even more complex when you factor in potential data breaches and repository infrastructure and storage issues outside of the researcher's control. We heard these more complex definitions as well.
One person said: a hybrid, formal and informal system of guidelines for how to engage in good behavior in a community. Another said: doing things in an appropriate way, taking reasonable measures not to hurt anyone, and ensuring that the data is not compromised, the data is secure, and the data is presented in a way that doesn't show the person in an inappropriate light, reveal their identity, or create avenues of discrimination. One researcher told us it is unclear these days who ultimately has the ethical responsibility: corporations, where data may be coming from; academia, where data is analyzed, questions are formed, and results are derived; professional associations, who set ethical guidelines; publications and peers, who decide what gets published; or funders, who create mandates. We also found that ethics seems to be the wrong word. Librarians as well as computer science researchers do not necessarily think of their work as ethical or unethical. Compliance mechanisms rely on the ethics frame, but this may be perceived as a judgment, criticism, or implication that the researcher did something wrong, rather than an attempt to understand their processes and challenges. The Council for Big Data, Ethics, and Society has noted that computer scientists sometimes resist, quote, ethics experts from outside their field, and that ethical standards are more practical when they come from within. We have found that nuanced discussions of ethics in computer science often take place within the context of one's work with advisors and collaborators, rather than in spaces designated for ethical oversight and training. Ethical decision making is embedded throughout the research process. On the flip side, librarians are often engaged in work that involves ethical decision making, such as long-term preservation, legal or contractual use of resources, and copyright issues, without necessarily using the word ethics to describe their work.
Though IRBs are more likely to be credited with overseeing ethics, librarians may be the ones helping researchers to make their nuanced ethical decisions. All the schools we came across are exploring various ways to integrate ethics into computer science education. Apprentice relationships with advisors are common. Teaching ethics may include stand-alone classes or specific modules. Required trainings or click-through courses, such as responsible conduct of research or IRB training, are focused on compliance with academic integrity standards. Another approach is to embed ethical questioning into existing courses and topics, such as discussing accessibility in a web development class or addressing biases and limitations in a communications methods class. One creative approach we found was having students design, but not implement, a project that causes harm, to enable them to gain a better understanding of the sometimes blurry line between ethical and unethical decision making. Lecture series with outside experts are popular, though events that are not integrated with course requirements bring the risk of having students opt out. The Council for Big Data, Ethics, and Society published a report examining various pedagogical approaches to teaching data ethics and is currently commissioning case studies to exemplify challenges associated with data ethics. Our research showed a range of undergraduate and graduate for-credit courses focused on topics such as privacy and data mining; political and social aspects of facial recognition software and GPS tracking; the transparency of algorithms in news reporting; predictive analytics and social good; intellectual property; biomedical big data; and data ownership and access. Promoting the discussion of ethics is challenging, since it sometimes makes researchers feel defensive or may come across as limiting.
Researchers noted that ethical violations are often accidental, or the result of a lack of knowledge or best practices rather than malicious intent. However, the topic is often most engaging in the discussion of scandals, which generate widespread discussion and can be used as a lens to raise more nuanced and routine ethical issues. Overall, we found that all of the schools we talked to are struggling with how best to approach this, and that the collective development of curricula, problem sets, and relevant data sets would be helpful. Teaching ethics in computer science is not only for students; the people we interviewed also suggested training professors, librarians, and even the IRB to develop more comprehensive knowledge of the issues involved. When it comes to training and qualifications for librarians, some suggested that experience with data or technical subject matter would provide additional expertise beyond traditional backgrounds or an MLS. This may be pertinent to the emerging role of librarians, particularly in gaining trust with researchers. We see informal, and now more formalized, courses for librarians in data science. One way we could see this informal learning taking place is via data clinics, which are modeled after statistics clinics. When computer scientists are positioned to teach others how to use technical tools, they are often better positioned to reflect on the norms and values embedded in what they are doing. Creating structures for them to teach, and for people from other fields to ask questions, might be a helpful way to get increased engagement on ethics issues. Programs that utilize the clinic model for data management provide drop-in hours and consultations through data science institutes. This model helps to raise ethical issues for discussion outside of the potentially punitive compliance approach. As we moved into the second part of our project, we considered how to reframe ethics.
We looked at how ethics is taught and who researchers might be reaching out to during their research projects. We especially wanted to explore this idea of data clinics as a drop-in place for graduate students and researchers to seek help, or to help others. This part of our project involved visiting a couple of campuses and having in-person workshops, or brainstorming sessions. On each campus the session was two hours long. We brought together technical researchers and graduate students, librarians, IT administrators, data repository managers, and faculty from the iSchools, as well as others. These workshops had 14 to 18 people. We came with big sticky paper, small post-it notes, and two exercises for them to work on. Our first group exercise centered on Alice, a fictional computer science PhD student. She's interested in working at the intersection of computer vision and machine learning. She plans to scrape an open data set and get images from a company that we called Image Company. As she goes about constructing her research project and seeking funding, we see her trying to navigate formal structures such as the IRB and funder mandates, as well as technical and social issues. At one point the company goes bankrupt, and she's unsure if she can still use the data she has. She also realizes late in her research that the use of the data she has collected violates terms of service. Where does Alice go for help during the many twists and turns of her research, especially when her advisor is busy, has never gone through IRB approval, and hasn't scraped data before? The goal of this exercise was to identify what resources exist and what should exist on campus, what processes researchers should follow, and what would be the ideal way to support technical researchers who are doing work with large amounts of potentially sensitive data.
Some of our conclusions were that there were resources available that not everyone was aware of; participants may have received emails about services but hadn't necessarily used them if their personal sources hadn't led them there or couldn't vouch for their quality. These resources are spread widely across campus, and there's some overlap between departments. There is confusion about what is offered by each one, and the way those puzzle pieces fit together was not always coordinated. Some participants learned about services offered at their schools through the session that we held. As we mentioned earlier, we felt the data clinic model might be a good area to explore, so we asked our stakeholders to design a data clinic that would work at their university. Many universities have statistics clinics; while the structures vary, they tend to allow students and faculty to talk with someone with advanced statistics knowledge who can help them through problems. Often faculty and students play a role in the clinics, but there are also models where full-time statisticians, at the master's or PhD level, are hired. The funding model for these clinics varies: some are paid for centrally, others charge a fee to departments. In most cases using the clinic is free, though there may be charges for extensive use. Data clinics can be an extension of stats clinics, but there are other issues on which people could benefit from a data-centric perspective. For example: how to store, secure, encrypt, de-identify, and otherwise protect data. The trade-offs of using different private vendors to store or process data. How to clean data and assess its limitations and biases. And how to prepare data to make it shareable without violating privacy. In designing an imaginary data clinic, participants shared the following priorities. Discussing security systems in terms of cost, convenience, and effectiveness. Learning about software carpentry, Amazon Web Services, Hadoop, Python, and R.
Discussing ethics questions related to using Mechanical Turk, where laborers may receive a very low wage. Building infrastructure to deposit data. Having a triage system, with faculty and professional staff as front-line consultants and students for follow-up work. Having the clinic visit different departments. A speaker series about the challenges of big data and ethics. And documented institutional memory, such as guidelines, a wiki, or a question bank. People also noted that data science changes faster than stats, and faculty would have to stay on top of up-to-date methods. They also said the library could have a role in finding and accessing data, negotiating data access, ensuring future preservation, and providing intellectual property support. Balancing the topics covered could be political. They said a data clinic is more amorphous and open to interpretation than the stats clinic, and an effort should be made to prevent overlap between the two. One student said he would like a service that provided information and a place to discuss and advance his research, but noted that if the visit led to a constraint being placed on his work, he would be unlikely to return. So as you can see from the exercises that we did in the in-person meeting, we didn't use the word ethics at all. Because, like Emily was saying before, a lot of that work is just not seen as something like, I need to stop and think about the ethics; it's more of this questioning throughout the entire process. So we wanted to get away from the use of the word ethics. Coming on campus and playing this neutral third-party role was very beneficial in getting a picture of what's happening, what various stakeholders are offering, and what services researchers know are available. And what we found from our experience that was most surprising is that these campuses seem to need and want this sort of interlocutor to help them think through these issues.
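As one illustration of the de-identification topic on that data clinic list: a clinic might walk a researcher through replacing direct identifiers with keyed pseudonyms before sharing. A minimal sketch, with an illustrative key and field names of my own choosing; using an HMAC with a secret key (kept off the shared copy) avoids the dictionary-attack weakness of a bare hash, though it does nothing about the quasi-identifier re-identification problem mentioned earlier in the talk.

```python
import hashlib
import hmac

# Illustrative only: in practice the key lives in a secrets store,
# never alongside the shared data set.
SECRET_KEY = b"keep-this-out-of-the-shared-dataset"

def pseudonymize(record, id_fields=("name", "email")):
    """Return a copy of record with direct identifiers replaced by
    short, deterministic keyed pseudonyms (same input -> same token)."""
    out = dict(record)
    for field in id_fields:
        if field in out:
            digest = hmac.new(SECRET_KEY, str(out[field]).encode(),
                              hashlib.sha256).hexdigest()
            out[field] = digest[:12]  # truncated for readability
    return out

shared = pseudonymize({"name": "Alice", "email": "a@example.edu", "score": 7})
```

Determinism is a deliberate trade-off here: it lets the same person be tracked across records in the shared set, which is often needed for analysis but is itself a linkage risk worth discussing in a clinic setting.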
So there was this enthusiasm for a connector or consultant to gather stakeholders from separate spheres to address data ethics. It could be because we came in from outside the campus politics and everything else happening on campus that we were able to bring these people together. I recently presented this project informally in Europe, and two universities want to try to host similar workshops on their campuses, so as to better understand researchers' needs and to map available services around campus. As I was pulling this presentation and its findings together, I thought: if there's any organization on a campus that could take on this role of connector or interlocutor, it's the library. While I'm not sure that's one of our official findings, I think it's something for us to consider. Could your library, as the central hub on campus that networks across campus, be a good place to bring together these stakeholders and surface these types of issues? Whether through something formal like the workshop we had, or just an educational series, the library could essentially be that neutral place to have these potentially difficult conversations. We also felt that the exercises worked better than the phone interviews, where we were asking things like, what do you think about ethics, and tell us about a project where you thought about ethics. Approaching it instead through a fictional but realistic case, this is this person and these are the real issues she's running into, seemed to really help participants think through the types of questions they would ask and who on campus would be the resources they would go to. So, as many of you know, research data management services are available at numerous libraries.
Some libraries already have a research data management triage model in which librarians serve as a point of contact for data-related needs, referring researchers to specific people or departments across campus for tasks beyond their expertise. In this role, librarians offer a network of support and provide background legwork to save researchers time. Being seen as a strong central resource seems like the perfect role for the university library. The library at one of the schools we spoke to provides consultations on web scraping and statistical software use, primarily for social science undergraduates but also for graduate students and faculty. At one school, the library played a strong role in a research data management service involving various institutes and departments with heavy data use. The librarian who coordinates the service provides support on data-related tasks and skills for various disciplines, answering questions about storage, data sharing, and the cloud, and helping researchers prepare metadata to deposit their research in repositories. One school has library liaisons with data experience for specific disciplines, plus a data librarian and programmer for its social science institute. So beyond the interlocutor role I just talked about and this triage role, I want to end this presentation with some areas where libraries could expand their roles to take on more in support of researchers' emerging ethical issues. I thought through what libraries already do, the categories of work we do, and how, in light of the results of this project, we could expand some of those roles or think about them a little differently. So first, in terms of our work around data sharing: obviously, creating technically sound and long-lasting repositories for data sharing, considering preservation and access. 
Libraries need to build repositories robust enough to hold large data sets, or partner with on-campus or discipline-specific organizations, to ensure data is held for long-term preservation when necessary. I think libraries could create better best practices around when data should be held for a set number of years versus in perpetuity. This is a conversation that I don't really think is happening in libraries, but it's happening outside of libraries: thinking about data centers and the environmental impact of keeping multiple copies of, let's say, junk email. What are the long-term effects of keeping everything just because it seems so easy to keep everything? So I think it would be interesting in the future for libraries to think about what role they have in decision-making about what to keep in perpetuity versus shorter term. As organizers of knowledge, libraries need to be creating and thinking through the metadata needed to ensure the security and privacy of sensitive or potentially sensitive data sets, as well as to ensure proper cataloging for future use. This metadata can help ensure any sensitive data is wrapped with the right descriptive information for future sharing. Regarding proper use of information, libraries often play an important education and advocacy role on campus around the proper use of copyrighted materials, and they work with creators to think about when to use Creative Commons licenses. Libraries host workshops to get out information on how to share and make accessible copyrightable works. They can take on a role helping researchers think through the legal and gray areas of data use, such as scraping data off the web and terms-of-service issues and violations. Libraries license e-content, so more and more researchers have come to libraries in hopes of getting access to data mine and text mine scholarly resources. 
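To make the idea of "wrapping" sensitive data with descriptive information concrete, here is a minimal sketch in Python. The field names are assumptions loosely modeled on Dublin Core, not an actual repository schema, and the sensitivity levels and retention logic are purely illustrative.

```python
# Hypothetical sketch: a minimal metadata "wrapper" for a data set.
# Field names are illustrative (loosely Dublin Core-ish), not a real schema.

SENSITIVITY_LEVELS = ("public", "restricted", "confidential")

def make_record(title, creator, sensitivity, retention_years=None):
    """Build a metadata record; retention_years=None means keep in perpetuity."""
    if sensitivity not in SENSITIVITY_LEVELS:
        raise ValueError(f"unknown sensitivity level: {sensitivity}")
    return {
        "dc:title": title,
        "dc:creator": creator,
        "sensitivity": sensitivity,
        "retention_years": retention_years,  # None = perpetual preservation
    }

def may_share_openly(record):
    """Only records explicitly marked public should be shared without review."""
    return record["sensitivity"] == "public"

# A data set flagged at deposit time as confidential, kept for 10 years:
traces = make_record("Campus wifi mobility traces", "J. Doe",
                     sensitivity="confidential", retention_years=10)
print(may_share_openly(traces))  # False: requires review before sharing
```

The point of the sketch is simply that a sensitivity flag and a retention period, recorded once at deposit time, let later sharing and preservation decisions be made mechanically rather than rediscovered from scratch.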
One of our grad students, in an in-person interview, described how this became their gateway to the library and its services: access to a scholarly database was shut down because they were crawling it, the library contacted them to say this was a violation, and that started a conversation. That grad student has since gone to the library a few times for help negotiating similar data mining projects. Libraries are playing a role in negotiating the use of digital content, and they could see a growing service in that area. When licensing traditional content, we have a set of ideal terms for the user community, an ideal license. What are the ideal terms of use for data and text mining projects? I think that's something interesting we could explore, and libraries could play a bigger role in supporting data mining projects. I'm sure some of your libraries do that, but in my experience managing an e-resources team, our problem was often that once we had licensed a data set or provided access, we had no place for it to live with the appropriate level of access for the people who were supposed to use it, while keeping it closed off to everyone else. That was always a challenge. In terms of education and advocacy, as a profession concerned about privacy, intellectual freedom, and the public good, there's a role for libraries to play as we all figure out new norms for how data collected about us is handled, how we think about its future uses, and where we go from here. When libraries advocate for open access, open science, and open data, they must take the next step and help support what it means to make data open and shareable. That means having the difficult conversations about ensuring privacy and confidentiality, and about potential unintended future uses of this data. Libraries can be campus leaders for conversations about privacy in the digital world. 
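Part of the scraping trouble the student ran into can be caught up front. As a hedged sketch, Python's standard-library robots.txt parser can tell a crawler whether a site even permits fetching a given path; the rules below are invented for illustration. Note that robots.txt is advisory and does not replace reading the database's license or terms of service, which is exactly where the library's negotiating role comes in.

```python
# Hypothetical sketch: check robots.txt rules before crawling a site.
# The robots.txt content below is invented for illustration; in practice
# it would be fetched from the target site, and the license/terms of
# service would still need separate review.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def allowed_to_fetch(path, user_agent="*", robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt rules permit fetching path."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(user_agent, path)

print(allowed_to_fetch("/articles/123"))   # True
print(allowed_to_fetch("/private/data"))  # False
```

A consultation service like the one described above could pair a mechanical check like this with the human work of reviewing the vendor license, so researchers learn where the legal gray areas actually are.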
They can facilitate those hard conversations about what the world will look like in the future and how to support researchers in thinking through those hard questions. So, to conclude: we're wrapping up our final findings and hope to release a paper in January of 2016. The paper will be announced on the Data and Society site, in our newsletter, and via Twitter. Because we've gotten a few requests for the exercises we talked about, we're also thinking about packaging them up so that another organization or entity could run similar exercises on its campus. We're still brainstorming ideas for next steps: now that we've learned all this and put it out there, we're thinking about what we can do next. So we're happy to talk with you all here at CNI or at another time. Our contact information is there; we'd love to get your reactions and thoughts. Thank you.