As the recent Facebook data disclosure suggests, it's very easy nowadays to capture personal content without individuals' consent or knowledge. Therefore, we need to think carefully through our policies and procedures to make sure that we are being good stewards of digital content. Ethical issues are of course not new to the profession. In addition, we have a longstanding code of ethics which covers several aspects, including professional judgment, carrying out basic archival tasks, protecting record authenticity, access to and use of records, professional relationships with donors, privacy issues, ensuring security against theft, and questions of trust and archivist conduct. So let's examine a few of these, and please forgive me from the outset, as these are in some sense intertwined and/or inseparable and therefore hard to unpack in a linear fashion, but let me at least set out some core concepts. So, appraisal: let's start with the question of what we should collect and why. The digital revolution has introduced several new concerns for archivists when it comes to appraisal. I talked last time about the digital deluge governing our work in every aspect. To cope with it, we need to effectively narrow our focus and determine how to easily identify and capture digital collections selectively rather than in their entirety, especially when context is missing or hard to determine. In other words, given the considerable time associated with processing and providing access to digital content, we need to make sure that we're only taking in content that warrants such a commitment. So how do you appraise a mountain of electronic files? Let's take two case studies. As we take in more and more born-digital content from creators, including disks, flash drives, hard drives, or several computers, we've had to redouble our efforts to address creators' intentions and desires with respect to content we can retrieve or copy from media.
That is, with our ability to utilize powerful tools in use by law enforcement and the FBI, we must ask ourselves what materials are in scope for capture. We can easily capture everything, but should we? For example, given said tools, we can create a bit-for-bit image of a hard drive that will capture everything, including deleted files. Or, on the other hand, do we just capture what they specify or what we can easily determine from their working files? What if the creators are no longer alive to address these concerns? These types of appraisal questions are obviously not new. There are well-known cases of individuals who have either complied with or gone against creators' intentions. In 2016, you might remember that Malcolm McLaren's son destroyed his father's multi-million-dollar punk archive. Very punk rock. Others have saved materials despite their creators' specific intentions, including Kafka's literary agent. Sure, we'd all be tempted to do otherwise, but fundamentally, we as professionals have to comply with donors' intent. So let's get back to the challenge of appraising electronic records. What's new with digital files is that it's so easy to collect everything. It's no longer a matter of space or cost, for the most part. But then again, as with analog files, we don't want everything. So we have to somehow determine, in keeping with donors' wishes, what we want and why. As a result, our default policy is to create disk images in a logical capture, that is, specific files and folders, and not what we call a forensic capture, or everything, including deleted files. Now, this may not always be possible given issues with various media, or when we are under certain time constraints or not able to easily navigate file organization, file structure, languages, etc. As a result, when we do have to create a forensic capture, we treat raw, unprocessed disk images as restricted, not open to researchers.
Only processed portions will then be made available to the public, and I'll explain what I mean by processing in a little bit. For the moment, though, why does this matter? What are the ethical implications of creating disk images? Well, one, beyond complying with donor intentions, is that of provenance. That is, we need to accurately and securely transfer authentic records in a well-documented process to guarantee consistent, reliable, trustworthy records. As more and more records are born digital, we need to make sure our procedures adhere to principles guaranteeing third-party review in cases of open records requests, FOIA requests, and/or e-discovery or litigation. A second reason concerns records management and compliance. In the case of a university, for example, we deal with student records, medical records, personnel records, financial records, all the sexy, scary things that make headlines when they are improperly accessed, shared, or used. We need to guarantee that the tools and procedures utilized conform with principles of secure handling and/or destruction of sensitive records. Utilizing forensic or raw captures of media containing this type of information puts us at risk of taking in content we should not, which brings me to a second use case: email. Email is particularly troublesome for a variety of reasons. First is the sheer volume of messages. We've recently taken in collections in excess of 600,000 email messages. Second is the fact that email is often used for both professional and personal purposes, hence there is a lot of potentially sensitive content to be found in otherwise routine business records. Third, it's rarely organized, certainly not well over an extended period of time. Several years ago we did an informal survey of faculty email, and we found that 80% of faculty did not create folders. Gmail and search are primarily to blame for that.
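To make the sensitive-content problem in email a little more concrete, here is a minimal sketch of pattern matching combined with a word-list (lexicon) search over message text. It is an illustration only and not how any particular archiving tool works: the two patterns, the lexicon terms, and the function name `flag_sensitive` are all hypothetical, and real screening would need far more careful rules plus human review.

```python
import re

# Hypothetical patterns, for illustration only.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),          # e.g. 123-45-6789
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),  # e.g. 650-555-0100
}

# Hypothetical lexicon of terms that may signal sensitive content.
LEXICON = {"diagnosis", "grievance", "transcript"}


def flag_sensitive(text):
    """Return a sorted list of labels for each pattern or lexicon term found."""
    hits = set()
    for label, pattern in PATTERNS.items():
        if pattern.search(text):
            hits.add(label)
    # Compare lowercase words in the message against the lexicon.
    words = set(re.findall(r"[a-z']+", text.lower()))
    for term in LEXICON & words:
        hits.add("lexicon:" + term)
    return sorted(hits)


if __name__ == "__main__":
    for message in [
        "My SSN is 123-45-6789 and the diagnosis came back.",
        "Lunch at noon?",
    ]:
        print(message, "->", flag_sensitive(message))
```

A message that returns any labels would be set aside for review rather than released automatically. Hard-coded rules like these will miss things and flag harmless messages, so they narrow the review work without replacing it.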
To address these varied issues, we've worked with developers to create a sophisticated email archiving tool that I spoke about last time, called ePADD. So ePADD includes functionality that allows us to run regular expressions, pattern-matching searches, and custom lexicon searches to look for and filter out sensitive content. Further, it allows us to select individual accounts and/or correspondence to include or not. It even searches attachments and has an image browser for viewing photographs, which might come in handy when dealing with certain politicians. We therefore utilize every one of these functions, often with input from the creator, to winnow down material to a subset available for public access. Now, since we don't provide full, unfettered access to individuals' entire email accounts, ePADD utilizes natural language processing to index names, places, and subjects, which themselves can be shared online so that users can determine whether or not an email corpus is relevant enough to their research to warrant a trip to Stanford. So up to now I've talked solely about reactive collecting. Let's switch gears and talk about proactive collecting and its ethical challenges. As tools make it easier to capture more and more of our online presence, we need to make sure that we are applying similar ethical considerations in terms of notification and consent, particularly for individuals and groups who have no voice or role in a collection's creation, retention, or public use. It's fairly straightforward to negotiate capture of institutional and personal sites using standard agreements or deeds of gift. But what about trends or spontaneous events, such as those highlighted in Twitter hashtags or captured by web archiving? Given the ephemeral nature of websites and social media, tools have been primarily designed up until this point to capture first and even ignore robots.txt files.
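For those unfamiliar with robots.txt, it is a plain-text file in which a site tells crawlers which paths they are asked not to fetch. As a rough sketch of what honoring it looks like, the snippet below uses Python's standard `urllib.robotparser` module on a made-up robots.txt. The sample file, the crawler name `example-archive-bot`, the example.org URLs, and the helper `may_capture` are all hypothetical, and no actual web archiving tool is being described here.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content for illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_capture(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from text we already hold, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in [
        "https://example.org/public/index.html",
        "https://example.org/private/notes.html",
    ]:
        allowed = may_capture(SAMPLE_ROBOTS_TXT, "example-archive-bot", url)
        print(url, "->", "capture" if allowed else "skip")
```

A tool built to capture first simply skips a check like this, which is precisely the policy decision under discussion.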
This has sometimes been necessary, say during the Arab Spring or other crises in totalitarian regimes, where content that is not immediately captured to document what's going on will later be deleted by authorities and hence disappear from the historical record. Hence the trend has been to capture now and figure out access later. However, now that our systems have matured and researchers are interested in actually using these materials, we are being forced to figure out how to provide access and, as a result, to address requirements for notification and the opportunity for those who wish to opt out. Thankfully, our profession has responded to these concerns arising from the resurgence of activism sweeping our country. As more and more communities utilize Twitter for communication and documentation, we've created tools to allow ethical capture, notification, and inclusion in community archives. Just this last week, a conference was held in New York on the ethics of web archiving. Several great sessions examined these issues and recommended real solutions that we at Stanford and the larger archival community will follow. Let's switch gears again and talk about providing access to all of this content. In our previous session, I discussed several tools that we utilize for providing access. What I didn't discuss are the decisions that need to be made about providing access: to whom, how, when, and to what extent. Several years ago, our profession looked in the mirror and realized that we were not being very good stewards of our collections, letting them sit unprocessed in dusty old rooms. An article was published challenging archivists to take a new approach called More Product, Less Process, or MPLP. In other words, spend less time processing, reorganizing, refoldering, and taking out staples, and focus on providing faster public access. Repositories responded, and thankfully we're all beneficiaries of this increased focus on access.
When it comes to digital records, however, what an MPLP approach entails is still being worked out. At a minimum, we check for viruses and do our best to search for sensitive content using a combination of tools including BitCurator, Forensic Toolkit, and ePADD. As good as these tools are, however, they still rely on hard-coded instructions, search terms, patterns, and regular expressions. In other words, they are not 100% reliable, but then neither is analog processing. So how do we guarantee that we are only providing access to the material we are supposed to? Well, we rely on several levels of protection, including embargo periods, audience limits, links to takedown requests and copyright complaints on all our webpages, and policies informing patrons of their specific responsibility for flagging sensitive content and asserting that they will not disclose it if they do find it. When it comes to digital records and the ease of immediate worldwide exposure, you thus have to define your risk aversion, which can vary from collection to collection and from item to item. For those gray areas, sometimes you proceed with an opt-out strategy and can live with takedown requests. Sometimes you have to proceed with an opt-in strategy to cover all of your bases. How do you make the call? Finally, let's explore the challenges of privacy, confidentiality, and sensitive content through some specific use cases. The first concerns an oral history project with LGBT alums carried out in connection with a Stanford faculty member. Since we have a long-standing oral history program at Stanford, we provided the team with training on informed consent and created releases for all participants. Given the nature of the project, we wanted to make sure that everyone was on the same page and well aware that these materials would be made available online for teaching, learning, and research. Everyone understood and consented to provide online access to their transcripts and to their recordings.
Later, though, when some started to search for new jobs or change careers, they googled themselves, and lo and behold, their oral histories were at the top of their search results. Now having second thoughts about the project in light of their career prospects, a few contacted us to restrict access to just the Stanford community. We, of course, complied. A few more requests came in, and we decided to pull down the entire collection and only provide access in the reading room. This case shows that no matter how well planned a project is and how well informed its participants are, systems and procedures must be in place to respond quickly to questions and concerns about access, even in cases of clear-cut and informed consent and permission. Now, while I'm talking about oral history, let me take a brief digression about confidentiality. Some of you might be familiar with an oral history collection held at Boston College on the Troubles in Ireland, carried out with members of the IRA and the loyalist UVF and Red Hand Commando. Their agreement stipulated that confidentiality of the interviews was guaranteed, quote, to the extent American law allows and the conditions of the interview and the conditions of its deposit, including terms of an embargo period, end quote. Additional agreements between the scholars and the interviewees promised self-imposed institutional limitations and assurances of confidentiality. Interviews were to be made public either at the time of the interviewee's death or at some time in the future specified by an individual agreement. Unfortunately, it appears that the interviewees were not made aware of the fact that confidentiality was limited to the extent of American law. A commission had been in place for several years to investigate several murders in Ireland dating back to 1972. Given the public revelation of new evidence in the case, the Irish authorities requested access to the collection.
To act upon this request, a subpoena was duly served to the library in 2011. Boston College initially resisted the subpoena in court, but eventually surrendered the oral histories when they lost their case in federal court. The takeaway from this case, which sent a chill throughout the entire community, was that access restrictions and a deed of gift were not sufficient to withstand a court subpoena. In other words, there is no such thing as archival privilege. This ruling may very well mean that people will be less likely to take part in such activities if they cannot be guaranteed confidentiality. It is also likely that there will be an even more chilling effect if archivists and others involved in obtaining historical documentation are perceived as giving legal advice that has no basis in law. The archivist's fundamental responsibility would therefore seem to be to represent the current legal situation to potential donors as well as possible rather than engage in speculation about the law that could easily be seen as self-serving. Potential donors should therefore make informed decisions based on the legal ramifications of their actions and not based on legal speculation. A second case study involves human sexuality, specifically a popular and long-standing class in human biology at Stanford. Professor Herant Katchadourian donated 4,000-plus digital images he made of his lecture slides given over the course of 40 years. Given the subject matter, as well as the fact that many images are copyrighted and from other sources, we're still working out what exactly to do with this collection. Do we need a warning? Do we have to restrict certain lectures, for example, on development and/or images of children? Is there a difference between educational use in the classroom versus online exposure? Pardon the pun.
Well, we certainly will not provide worldwide access to the collection as a whole, but that is the only answer we have at this point. It certainly seems a waste to require patrons to come all the way to the reading room to use materials, but that may be the only solution. Other institutions, such as the Kinsey Institute or Cornell, which has a similar human sexuality collection, are in the same boat. Many other institutions have similar content, yet no consensus has emerged thus far in terms of what to do with this type of material. Then again, maybe we're all just being naive, as I have heard, secondhand, that there is a nude picture or two on the internet. A third case concerns a Duke University collection of extremist materials donated by the Southern Poverty Law Center. Rather than expose the content online to misuse or exploitation, Duke recently chose not to digitize it. It should be noted, too, that these materials are copyrighted, and it's not simply a matter of an aversion to providing access to materials that can be used to spread hate. This is an interesting case because it stands in stark contrast to another case where several equally reputable institutions are collaborating on a research project to digitize KKK newspapers from the 1920s and 30s. With the events of the last year, however, many are asking why such a project exists, even despite research interest in the rise of the Klan and several faculty backing the project. In other words, this case reveals a much deeper question we now have to face in the profession, beyond simple research value or interest: what do those actions say about us, our collections, our decisions, our priorities? Many archives and archivists think of themselves as progressive, wanting to capture and preserve voices and collections that have been hitherto neglected or overlooked, engaging in community history projects and other collecting initiatives. As a result, we can't overlook the impact of our priorities.
What they say about us and our mission matters, particularly to those communities we wish to work with to expand a diverse, inclusive collection and narrative. These are just a few of the myriad ethical questions we deal with on a day-to-day basis. I'd be happy to explore anything I've discussed thus far later. Thank you.