I know I've very much been like that. Good morning, everyone. Welcome to our small, intimate session on digital preservation this morning. Thanks so much for being here. Oya Rieger and I are really thrilled to have a chance to share with you some of what we've been exploring and thinking about at Ithaka S+R on the topic of digital preservation. Just by way of introduction, I should probably share that, as many of you know, Oya is a familiar figure in the digital preservation community. We were fortunate that, although she is continuing to serve as program director of arXiv at Cornell, she has joined Ithaka S+R as a senior advisor and is helping us to build out our agenda and work in preservation, scholarly communications, and some related areas. So we're really thrilled to have someone as talented as Oya join our team. What we'd like to share with you is, in the first instance, just a little bit of background about where Ithaka S+R comes to these questions from. And then Oya will share the findings of a first project looking at the state of digital preservation today. Looking back over the past 10 years or so, my colleagues and I have had a chance to think about a variety of different frameworks for preservation, really examining some of the ways that we've moved beyond, and are continuing to move beyond, the print environment model, where preservation was assured without any central planning through a proliferation model. What we've done in that space includes things like the What to Withdraw project, where we commissioned an operations researcher to look at the number of copies of print materials that are needed, when they've been well digitized, to assure their preservation.
We did a number of projects looking at the Federal Depository Library Program, looking again at how, in a transition from a proliferated print model towards new kinds of digital access, preservation and access together can be assured. And then, perhaps most recently, we looked at some of the dynamics around how our efforts to move to shared print models for print collections have effectively helped us to manage down print, but at the same time are maybe something slightly short of long-term preservation. As we explore a variety of topics and think about some of the new digital models that have emerged in recent years, I think we need to be thinking together about how the frameworks that guide us, and inform the various parties that are participating in preservation, develop and endure. Perhaps one of the foundational documents in our space, which really helped to organize our thinking about digital preservation more than 20 years ago now, 22 years ago, was the Don Waters and John Garrett report Preserving Digital Information, from the Task Force on Archiving. And as we look back now over the arc of more than two decades of work that has taken place since then, including the Urgent Action statement that Mellon issued in 2005, there are some really rapidly changing questions about what constitutes the scholarly and cultural record and what preservation means in terms of how we think about our digital culture. We're really at a point where there's been so much change that even thinking about the user experience, as Clifford has written about quite a bit, how different user experiences differ, and what it is that we are, in fact, preserving become really, really interesting questions.
So with that in mind, looking at some of the ways that we've perhaps made substantial progress in some areas but also see some as yet unfulfilled directions, one of the first projects that Oya took on when she joined us was a look at the state of digital preservation, which she did over the summer and the early fall. And so we really want to have a chance to share with you today some of the findings of what we've begun to learn, the snapshot that Oya has taken, and have a chance, we hope, for some discussion, even in such a small and intimate space, about what directions and implications you see coming out of the state of digital preservation today. So Oya, let me turn things over to you.
Good morning, thank you for coming. This room definitely looks much larger when you're at a podium, and at my size, so thanks for filling the rows. It's a great pleasure to share with you some of the findings from a study that I conducted a few months ago. It's a qualitative study, in that it's based on 21 interviews, just to put it out there. I don't mean to generalize. That's why we are calling it a snapshot. I'm going to try to keep my remarks to 20 minutes or less so that we have time for your questions, and we look forward to hearing your thoughts on this issue. The other thing I want to tell you about the study before I start is about the individuals that I interviewed. It was not necessarily a representative sample. I did not necessarily try to balance individuals who are working on the ground and those who are in leadership positions. So let me start with the list of questions that I used. I had a number of questions, but it was open-ended. Basically, my conversations were framed around what's working well and what the gaps are, and the context was the changing nature of the scholarly record and scholarly practices.
First of all, I want to go over some of the advances, and I must note that doing them due justice would be a presentation in and of itself. So by no means should my quick three-minute visit of strengths be seen as sufficient to describe how much advance we have registered during the last two decades. But we thought that it would be more interesting to talk about the gaps and research questions. So with that caveat in mind, let me just take us through a couple of slides to emphasize, and to give a shout-out to, the amazing preservation community. There is a very strong community of practice, with meetings and conferences and workshops. As you see at the center, there is now even a Digital Preservation Day to celebrate, and this is a global celebration. I don't know if you have looked recently, but we have a wonderful collection of books on various topics, and this may give you ideas for the holiday season as gifts, I hope. These books represent not only the expertise, but also standards and best practices. We have also witnessed deep specialization during the last, I would say, 10 years. Again, just very quickly, I wanted to cover some examples, from looking deeply at 3D and the preservation issues associated with 3D, all the way to looking at digital preservation as a socio-technical construct and looking at preservation from the angle of social justice, similar to what we are seeing the Documenting the Now project do in foregrounding ethical issues. So it's definitely a very impressive framework there. On tools and services, we managed to move what we were referring to as migration and emulation from theory into practice. We now have so many open source or proprietary tools, from the ingest process to file format checks. And again, by no means is this representative, but we have more than two dozen services and frameworks.
This slide is, in a way, a kitchen sink, but what I would really like to emphasize is that we do have several service points and also repository frameworks. And within this rich technical, organizational, and policy framework, as I quickly demonstrated, we have been seeing the emergence of very specialized positions, such as digital forensics specialist or data ingest technician. As I said, we are hoping that there will be a discussion after my brief overview of the remaining gaps and open issues. So what I did is, from the paper, from the brief, I selected four categories, and I just want to share them with you; please see them as food for thought. We very much welcome additions and criticism and new ways of looking at these problem areas. I would like to start with an organizational perspective, and this is also a very good point for me to note that, because national libraries, research libraries, and archives have been seen as the stewards of cultural heritage, inevitably, I would say, my bias with these interviews was to look more at this community. Maybe it was not a bias; it was a design principle of these 21 interviews. By the way, again, I want to acknowledge the generosity of my colleagues who shared their thoughts with me. From an organizational perspective, if you look at research libraries or many different libraries, what we are seeing is that the priorities are evolving. As the service scope now covers more of the life cycle, all the way from investigation to dissemination stages, some libraries are spreading thin, and as they are spreading thin, one can question, and many interviewees asked, where is digital preservation in the priorities of senior leaders in the libraries? Is it a priority? Is it a product?
Is it an outcome that appeals to provosts or presidents as we are trying to, not necessarily justify, but illustrate the library's role within the campus environment? And related to this role are the changing roles and responsibilities within libraries. The good thing is that we now have so many rich programs that allow us to contribute to scholarship, from digital scholarship to digital humanities to research data services to maker spaces. But this specialization is, in a way, also creating some fragmentation. And the concern with the fragmentation, what I heard from the individuals I interviewed, was that a common preservation mandate is not very easy when you have fragmented services and when you have specialization. You know, we are a community of innovators and we get very excited about new things, but we still have heritage that we haven't fully taken care of. I was involved in an e-journal preservation study three years ago, and what we found out was that only perhaps 36 or 37% of e-journals had third-party preservation solutions, meaning publishers working with groups such as CLOCKSS or LOCKSS or Portico or similar agencies. And I had the pleasure of attending a shared print meeting last week. Again, what we are seeing is our community's energy, but also challenges in addressing even our print challenge. So the new file formats are adding on to some of the legacy problems that we are working hard on but have not fully addressed yet. The fourth bullet is an interesting one because, as a community, we cherish collaboration. However, collaborations, as we all know, are very tricky; collaborations need leadership, and the leadership needs to be agile and entrepreneurial. So there were questions and comments about collaboration within the space, and maybe some slow progress due to groupthink or too much time spent trying to reach consensus.
Also, another issue I want to highlight with collaboration: we always talk about interdisciplinary trends in science. By the same token, perhaps digital preservation should be more interdisciplinary, meaning including librarians, archivists, information scientists, scholars of science and technology studies, and sociologists, so looking at preservation through a collaboration that represents different skills would be useful. Now what is, I hope, implicit but not out there on the slide is funding, resources. What happens to resources? The Association of Research Libraries used to collect data from research libraries until 2008 to document how much funding was being spent on preservation, and back then it was microfilming and microfiche; digitization was just entering the realm. And if you look at the last statistics, which are from 2007, 113 libraries together were spending, in today's dollars, $130 million, and then we stopped collecting. And then I'm going to take us to 2018: Canadian Research Libraries just completed a study on the state of preservation in Canada, and one of the findings that really resonated with me, and reminded me of the interviews I had, was that 78% of the respondents, and there were 50 libraries involved, were not able to tell how much money is being spent or who makes the decisions. I know it's difficult, especially as I described that there are changing roles and there's a bit of fragmentation, but my point here is that perhaps there needs to be more emphasis on digital preservation as a business, and then maybe it requires some business planning and business models, because how can we allocate resources if we cannot put a general value on processes like this?
I mentioned that there is such a fantastic range of preservation tools, and one issue that came up was some confusion about how these tools work with each other: are there gaps, are there redundancies? The common ones that we hear about often, just to throw out a couple of names, are Internet Archive, APTrust, Chronopolis, Portico, LOCKSS, and CLOCKSS. I think there's pride in seeing this vibrant marketplace, but there are also questions about how these work together. More importantly, and we really witnessed this with the recent decision about the Digital Preservation Network, there is the question of how these services tie into local digital asset management systems, because even if we have these wonderful grids of electricity and water, you still need to have the pipelines in your house to be able to take advantage of these services. Inevitably, again, there were questions about the sustainability and succession planning of these services. Also, something that permeated my conversations was that digital preservation is research and development, and it's also practice, and sometimes, before we know it, a research and development service creeps into production but may not necessarily have a solid business model or sustainability track. Again, as you know, our community has been quite successful in creating some self-audit and assessment tools, so that we can assess the repositories that we are using and understand how reliable they are for serving our needs. Some questions emerged, not about their value but about how they are being used. Is it a one-time process? How often should we be certifying them? There were also questions about looking at the current infrastructure, the current information ecology, and placing certification within this new landscape. Let me give you a very quick example. As you know, information is so distributed; libraries less and less are owning or controlling information.
It's inevitable that there is a need to collaborate with commercial entities in managing some of this information. Taking this example to certification, ProQuest announced last year that they will use a safe, proprietary system, using of course Amazon as the preservation framework, to preserve ProQuest digital assets. They opted not to use one of the third-party systems. And there is no question in my mind that ProQuest digital assets are very important. They will probably take good care of them, but for the sake of accountability and transparency, wouldn't it be great if we used some of these audit tools in interacting with our commercial partners, to encourage some transparency so that we know a little more about how they are taking care of their assets? The last bullet is an interesting one. It's about preservation and text and data mining and artificial intelligence, and there were two issues. One was the cost effectiveness of just-in-case preservation: could we start thinking about also supporting text mining and so forth, so that the return on investment is broader, especially as we are making a value statement to our deans and provosts or entering into partnerships? And the other aspect of it was: can we use artificial intelligence or these computational techniques to bring efficiencies to digital preservation? I'm actually very pleased to see that Internet Archive is experimenting with machine learning and data mining to identify what we call the long tail of open access journals, where there are small publishers that don't have the means to preserve them. Going to my third theme, which is enduring access: under the rubric of digital scholarship, we have been witnessing scholars' engagement in this digital space through the use of digital content and digital methodologies. They are becoming collectors, they are becoming digital scholars, and they have these first-hand interactions with digital content.
So if I look at digital preservation from this angle, one of the questions is the usability of the systems and archives that we create, such as web archives and research data. Again, I want to emphasize that more and more user experience is mediated through software and contextual information. So imagine research data preserved in an archive: how would a scientist discover it? And when it's discovered, what software and documentation would be needed to make sense of that data? So that's one point from the enduring access perspective. Digital humanities, again, is a very vibrant area; we are seeing much experimentation, network analysis, text mining, and some of these methodologies are manifesting themselves through research and development or experimental tools. So they are creating evidence that is sometimes stuck in these research-and-development-mode tools, and how do you regenerate evidence if they are using an experimental tool that may not be sustainable or valid? I already mentioned the open access publications and their vulnerability. So let me take us to the next theme. I really enjoyed reading this article. It's in last month's issue of Harper's Magazine. What they do is look at Documenting the Now, but they also look at the Library of Congress's Twitter archive. Some of you may remember that, I think it was in 2006, the Library of Congress announced that they would be archiving Twitter, and then in 2017, due to many issues, ethical, legal, policy, resources, discovery, access, they said this is not possible, we can't do wholesale preservation of tweets, and they are doing it selectively; there are half a trillion tweets, I believe. So I love this perspective of the journalist. It's not the quantity.
Although we are dealing with quantity, when it comes to digital preservation it is not the quantity; it is really about allowing a valuable experience. With WordPerfect, we might use emulators to have a scholar 50 years from now feel how WordPerfect felt, but how do you have them experience how Facebook once was? Value-driven preservation was definitely woven into my conversations with my colleagues, the 21 interviewees who generously shared their perspectives with us, and let me just highlight four issues. One of them is that there were concerns about whether we are focusing too much on treasures for preserving. How do we represent underrepresented communities or marginalized voices? Should this value be more integral in our preservation activities? On the social justice angle, I was pleased to hear many references to it yesterday, actually: what happens to websites taken down because of political messages that do not resonate with the current regime or the regimes in the future, or to hate speech? As I mentioned, I think Documenting the Now is a good example of seeing preservation as a rich socio-technical construct and looking at it not only from a technical and organizational perspective but from a values perspective. Repair in the age of innovation: I was introduced to this literature on repair, in the history of technology and science and technology studies, through Steve Jackson, who is a faculty member at Cornell in Computing and Information Science. So, you know, we are a community, and not necessarily your community, but if you look at us as a nation or in the world, there's so much infatuation with innovation, with new things. What happens to the things that we have created? Who maintains them? But it's not only who maintains them, but who positions maintainers as playing this valuable role.
I'm actually very pleased to see a recent initiative on maintainers coming from the science and technology studies discipline, again looking at preservation from the angle that it's not always innovation. Science and technology studies scholars actually used to consider librarians as maintainers, but I'm not sure if we are maintainers or if we are a little biased towards innovation; I'm not sure how we are maintaining that boundary there. Cliff had some great examples about environmental aspects of digital consumption. We can look at it from two angles. One of them is that I recently read a study anticipating that by 2025, 20% of the world's electricity consumption would be through the internet and devices connected to the internet, so that's huge. So what does it mean for digital preservation? What does it mean for us, who have the stewardship role, working with digital information and digital tools? And on the other angle of environmental aspects, Cliff again provided excellent examples. Even digital has physicality: we have servers, we have server rooms, there's infrastructure. With climate change, what's happening to archives, not only print archives but even the digital content in these archives? So this was a rather quick review, just to highlight four themes that came through the interviews. One nice thing about a qualitative study like the one I conducted is that you get really rich data. I have 22 hours of data, and every time I read it, something new emerges. But of course the limitation is that this is qualitative. I'm not generalizing; I'm just sharing with you what I heard, and our goal here, as I said, is to encourage, through our findings, expanding the research agenda, or expanding the understanding of the status of what's happening in this space.
So I wanted to show you again the interview questions that I used, but Roger and I are really happy to turn it over to you now for your questions and your suggestions. It's a huge space, and nobody's going to solve this problem alone; we are just going to be, you know, the tip of the iceberg. I think I'm going to sit next to Roger and invite your questions.