Okay. Welcome to this session on, really, I guess the best way to summarize it is preservation of born digital material in the large. There's a tremendous amount of work, as you know, going on on the sort of mechanics of how you preserve particular classes of digital objects, be they websites, be they electronic books, you name it, digital art. And that's not what we're going to talk about here. What we're going to talk about here is how we think about a whole sort of emerging ecology and economy of digital artifacts of various kinds, and how we think about those as comprising an important part of the cultural record and the scholarly record. We see those, or at least I see those, as a really important record not just of scholarship but of the raw material we're going to need to do future scholarship. So those are the kinds of things we're going to talk about here. I should have introduced myself. I'm Cliff Lynch, but you can see I'm really focused on this topic. This is my colleague and friend Carol Mandel, Dean Emerita of New York University and now a CLIR Presidential Fellow. How did we get here? So I have been looking at the kind of dual problems of the changing economy as material moves digital and the emergence of a whole pile of new genres of digital objects, from web pages to ebooks to you name it. And I've been tracking this since about the turn of this last century. It's really been that long now. I've seen a lot of things happen. I've seen, for example, how we actually in some ways did pretty good with journals as they made the transition from print to electronic, building systems like LOCKSS and Portico to ensure their persistence. There are interesting stories to be extracted there. 
One of which is that that is an area where the authors, the publishers and transmitters, and the marketplace all shared a common value that lasting preservation was important and worked together to figure out how to make it happen, which does not characterize many other domains here. I saw the development of things like the Keepers Registry, which Peter Burnhill and his colleagues updated us on earlier, which actually is sort of a first heroic attempt to answer the question, at least for journals: Well, how well are we doing? Are we doing any better than we did last year? Is this a disaster, or should we feel relatively good, although there's room to do better? One of the things I've learned in looking at other areas is that even for what I'd characterize as pretty mainstream, you know, commercial cultural content, think of something like ebooks, we have no idea. I can guarantee you with high certainty we're doing extremely badly, but if you ask me to put a quantitative number on it year over year, I can't. We can't even begin to. We have some really, really bad approximations, which we can talk about, but these are the kinds of large-scale things that have been keeping me up at night for the last 20 years on and off. I've actually got sort of the very beginnings of a manuscript that has been languishing since around 2017, because I can't seem to find any time to work on it. And then I got this email from Carol Mandel, who's sort of in the mode of, well, I just finished being dean. I now have the luxury of time to really think deeply about these problems I'm really interested in, and I'd really like to chat with you about it. 
And so we had several of these, like, amazing lengthy chats and realized that we were very much on the same wavelength, except that she's actually produced a concise and coherent account of at least a sizable subset of the problems I've been agonizing about, put it out in a CLIR report that came out around, what, September or thereabouts, which is available on the web. And I think I put the coordinates for that out on CNI-ANNOUNCE, and then she boiled it down to a very concise slide presentation of some of the highlights and questions. What we're going to do today, now that I've finished these kind of framing remarks, is we're going to run through that and conclude with a set of questions that I hope we can talk about. I'll make a few additional remarks after she goes through the presentation and frames the questions, and then we're going to open it up to discussion. And I am very hopeful that we'll have at least like 20, 25 minutes for conversation. And I'm really very interested, and I know Carol is really very interested, in hearing what some of you are thinking about these issues. Over to you, Carol.
Thanks so much, Cliff. And for those of you who've seen this presentation at DLF, it is mostly the same one, but the point here is the reason I've done this framing and this article, which is really intended to be just the first chapter of what goes on to really look at how can we dig into different types of material and address this, is to get conversation going. And you'll see why that's so important as we talk about this. And so what started out, I thought this was just going to be a very short introduction to work that was going to continue. And the more I dug into it, the more I saw the depth of the situation and the issues, which is, you know, that's the work that Cliff has been doing and trying to do. It's just, you realize how huge this is. And so it turned into the piece that's now published as a CLIR report, but that is still just a frame. 
And we have a lot more work to do. So I'm really honored to be here today with Clifford, and I'm excited to be here with so many creative, talented professionals and to help us to dig into this. You know, Clifford's talking about the fact that we've been working on this since the turn of the century, essentially. And I was privileged to serve on the task force that Don Waters so brilliantly led in the 90s, yes, the one on archiving digital information, where we were just beginning to imagine what that meant and what this world was going to be. And quite a landmark effort. And since then, we've really seen what's the equivalent of a generation of accomplishment in this arena. And I really saw it at the presentation just a little earlier this afternoon of folks from different digital preservation projects. You know, 20 years ago the field didn't even exist. And then here were all these wonderful professionals developing the standards and the tools and the activities that we need, and they keep doing it. So just a lot has been accomplished. And Cliff alluded to, you know, Portico and LOCKSS and the new kinds of constructs that we needed to go forward. There's continuing research and development, and there's some pretty significant born digital collections in national libraries, and, you know, there's the Internet Archive, and so there's all this just terrific work going on. So how come, you know, I'm up, Cliff is up in the middle of the night worrying about this? And that is because for all of this, when you look at the nature and scale of how our world is now documented and expressed, how knowledge is conveyed, how cultural heritage is conveyed, the scale and nature of born digital content is quite extraordinary. And there's still a lot of holes in what the future is going to know about this. So that nightmare, the picture on this screen, is what kind of led me to think, well, you know, now I've got a little time to think about this. 
What might we start to do about it? And of course, you know, we can't save everything. We can talk later about how many cat videos are enough. There's, you know, I mean, there, a case could be made for saving everything in that, then you have the picture of everything and you have technologies to mine it. But I'm not up here to advocate for saving everything. So if you want to poo poo what I'm saying for that reason, no, that's not what I'm about. But the holes of what we have are just huge. Just think about from streaming, from streaming broadcast news to, I mean, news, what could be more important and basic for, you know, future knowledge of what happened. To creative production, to cultural heritage, there are just huge gaps right now if we stay as we are about what the future will know about us. And so I've been trying to figure out why are we in this state? We have such great memory institutions. Why is born digital collecting? And that's what I'm talking about. We've done all this great work in digitizing collections, making things available. We've all this great collecting. I remember when it was a big deal to look at audio and video and not just print. Remember Howard, when that like, you know, but so we've done some great things. But born digital has stumped us in very different ways. And I want to talk about why I think that is. And, you know, the bottom line here is if you're not collecting it, if you're not taking stewardship responsibility for it, then it's not going to be preserved no matter how many wonderful technologies you have. And so I think it really gets down to that collecting and stewardship answer. And so that sounds like, oh, Eureka, simple answer to this question, but it's really complex. We make assumptions about collecting and stewardship, but you look around and our whole like last several hundred years of collecting and stewardship do not naturally translate into the digital age. That is why we're where we are. 
But that task force that I spoke about, under Don's leadership, I want to read this quote because, you know, that was back in 1996. And that report said this when we barely had technologies for understanding how to preserve digital content. But the group said the problem of preserving digital information for the future is not only, or even primarily, a problem of fine tuning a set of technical variables, although goodness knows it's hard to fine-tune that set of technical variables. Rather, it is a grander, I love that word, problem of organizing ourselves over time and as a society, as a society, to maneuver effectively in a digital landscape. And that is what I started to look at and think about in the framing. And I realized that the problem was bigger and scarier than I started out thinking about. So I spent some time, if you've read the paper or if you have some time with nothing else to do after this meeting and you want to take a look at it, I spent a lot of time looking at what I call the mosaic of memory. The memory institutions, the mosaic of different kinds of memory institutions that we have that have grown up in the past several hundred years, some have an even longer background, to create the fact that we have memory. And different definitions of different kinds of documents. You'll notice in these mosaic tiles, they're not all big research libraries. There might be a picture of the Library of Congress here, but it's a combination of all these kinds of specialized collections and small collections and private collections that have grown up and come together. And they haven't come together in a particularly coordinated way. But there's a set of kind of societal understandings and expectations that puts this together. And so even though this mosaic doesn't have a premeditated uber picture to it, and you can't use that word anymore in quite the same way, I wasn't talking about them, there has been a cohesive expectation of memory. 
But I started comparing this picture to what we're looking at in the digital age and where we really are in almost the end of the second decade of the 21st century here. And looking at the realities of our 21st century memory institutions. For one thing, even the libraries who have been charged with collecting are moving their emphasis from collecting to services. This is not a criticism. I'm not up here to say I wish it were different. I'm just up here to say this is how it is, so now what do we do? And also, of course, libraries have focused largely on collecting published material. What is the born digital content that we're looking at? It's not the kind of stuff that was ever in their scope anyway. So they're not focused on, well, how do we then change our scope and expand it? They're focused on all the services that they are under pressure from their institutions to deliver, because that's what their parent institution wants. That's the towns that support public libraries, the universities that support university research libraries. They want to see those community services. That's what they're paying for. We have a really complicated picture. The copyright deposit is a great way to catch, you know, at least national heritage imprints, but there's lots of holes in that net when it comes to e-deposit. It hasn't matched up. We don't really, the small specialized institutions, the local historical societies, the public libraries that used to at least make sure they got the high school yearbooks and the local hometown newspapers. We haven't really created an infrastructure for them to turn that kind of hyper-local collecting into born digital collecting. And then, as I mentioned, the new forms of born digital content such as social media just, you know, don't match anyone's responsibility. Not published, not archival. What is it? Whose is it? 
And so, as institutions that are traditional memory institutions turn towards what they're working on, that's just kind of going by them. I do want to note that one place where we have a kind of clear line of responsibility, even though material is born digital, is in organizational archives. I think that's one reason archivists have been at the forefront of dealing with things like e-mail and new forms of digital content, because they know that they need to collect their institutions' archives. There are, however, lots of problems. First off, they're struggling. They're doing a great job in many cases, but you also have situations where smaller organizations or professional architectural firms, you know, publishers whose records we always want, they are very interesting. They don't hire full-time archivists, and at least they used to be able to hand over boxes of stuff to people, but now they don't have archivists on board. So, even though the responsibility may be clear, the match of technology and expectation is just not working there. So, as I said, you know, we're making assumptions about collecting, and it's not matching these organizations anymore. And as I said, I'm not out to criticize them. It's that they are up against challenges and a mismatch that doesn't work. We have really daunting elements of what I'm calling here digital disruption. There is the vast scale of documentary content. Whose responsibility is that? It's so much. How do you do it? I guess you can't talk with your hands when you're knocking microphones over. Speaking of unwieldy, the unwieldy nature of these networked forms, you know, streaming, interactive, computer, they're really hard and problematic. The dispersed creation of content, you know, even traditional kinds of published material isn't going through publishers. 
It's just going right to Amazon, and we can talk about stories of, I know, one national library tried to get Amazon to encourage authors to deposit things, self-published work, and, you know, Amazon's real interested in that, right? That's just what it's out to do. So let's, you know, let's talk about personal records. I mean, the folks that are interested in supporting people in their personal digital archiving are heroic. It's wonderful. We know that human nature is such that if this is hard to do, it's just not going to happen. You know, how narcissistic do you have to be to spend all your time dealing with your own photographs of yourself? You know, people just don't do that. And then, again, no boxes of things to hand over later to your grandchildren. You know, there are all kinds of intellectual property barriers. Those have been talked about a lot. They don't go away. You've got to deal with the platform owners and the content owners. Privacy scares everyone to death. It's a valid concern. We're all usually on the side of, oh, my God, you know, they're surveilling us all the time. They know everything. But the other thing is that big tech is worried about letting anything out because they're under the gun on privacy. So then they don't share it for us to be able to do research with it. Very tricky. And then there's, you know, the ephemerality of the problem. Again, the boxes don't come down later if you're not thinking about preserving it upfront. Where is it going to be? So these are just huge disruptive changes in the nature of what and how we've been collecting. So what that really turns into is that it is, I think, really by the definition of systems designers and urban planners and, you know, social problem solvers, a wicked societal problem. It's not just wicked because it's hard. I won't go through all the characteristics of wicked problems, but it's a societal problem. 
It's not just for the traditional memory institutions to solve by themselves. We can't. They can't. We're not the same. We don't match it. It's a wicked societal problem. And we have to recognize it for that if we're going to begin talking about what should we do, as opposed to just waiting for some heroic memory institution to solve it all for us. We really need new strategies, new roles, new partnerships, new initiatives, and new collaborations. And we have to think about how to construct those. What should those be? What are the most promising things that we can do? I don't have a bunch of answers. I hope to do more work where, and you've got to take it format by format and problem by problem and try to figure it out. But the fact is we have to strategize these things in new ways. So that's why Cliff and I are here picking your brains today, because we need diverse creative problem solving. Memory institutions need to take the lead in this because we're the ones that wake up caring about it, but we can't do it alone. And we have to figure out how to get help and what that help should be and how to attract help. And so we have to take a quite different approach. And that's why earlier on I highlighted some of the wholly new approaches like Portico and LOCKSS. I think the Software Preservation Network and the work they're doing fits into that. And of course, the Internet Archive. Just completely new bottles for new wine. That's what we need. But we're not done here. We just need so many more of these ideas. So there are so many, and this is one of the areas we want to hear from you, kind of priorities, because every one of these things that you see on these bullets, and this is only the beginning of a list. And I realize I should have added, you know, the newest kind of hot one is open source intelligence sources. You've got, you know, all this open content that's out there that's being used right now to do investigative journalism. 
Isn't that the same stuff you should be writing history with later, or figuring out what society looked at later? So each of these things needs different kinds of strategies, different kinds of partnerships, different ways of looking at it. And so what are your priorities? Where do you want to go? Who, you know, how can we address this? We need to get a lot of energy around each of these things. I just, I couldn't help. And I think it's also kind of a moment, you know, on the one hand I don't expect much from Amazon, but it is true that big tech is under a certain amount of pressure right now for trying to look better at least. Maybe it's a moment of getting their attention to find ways to do a little good in the world. I've also noticed, I put this, I just found this cartoon, can you read that from here? Or I can read it to you. "And here are those family photos you thought you had lost in the cloud" is the caption here. I was so happy to see that in the New Yorker a couple of weeks ago, because at least this problem is maybe coming to, you know, the consciousness of the general public. So, but it needs more than just consciousness. It needs proposed solutions that we can get people and there won't be one solution. Wicked problems don't have solutions. They have mitigating strategies, and we need mitigating strategies that we can get around. So, I've put up a set of discussion questions here, and I'm going to turn the hard part over to Cliff.
While you contemplate these discussion questions for a minute, let me just say a few things. I think that there is an unfortunate tendency in this area to start with things that we think we understand and are easy. And sometimes that's an appropriate thing to do. 
For example, I think it really was appropriate to do scholarly journals because for many of our institutions that really is sort of a core piece of record and we understood at least initially the properties that those maintained as they did a fairly literal translation from paper to digital paper. We understood the properties there. I think we want to question that strategy in other areas where just because we think we know what we're doing and maybe have a prayer of going forward doesn't mean that's the place to park huge resources. Priorities are really hard to figure out because you've got conflicting things that you may want to optimize. There's one argument that says pick the things that are going to be most important in the future, the classes of things. There's another argument that says pick the things that are most at risk. Those are the priorities. The things that are most at risk aren't necessarily going to be the most important. Another aspect of this which I think makes us deeply uncomfortable and we've been burned on in the past but at the same time we have no choice pragmatically but to consider is in what cases is content of various sorts that is owned by commercial interests of enough lasting value and the commercial interests seem to be stable enough that at least for the short to middle term future we can say they will be well motivated and sufficiently resourced to take care of this material and in what cases isn't that going to happen. We need to be very careful making those assumptions of course because one of the things we've learned in the transformation to the digital world is that whole sectors of the economy can kind of collapse in very, very short order. 
A case in point being local journalism which really was pretty healthy and then it just suddenly collapsed and local journalism had at least back in the print days a relatively good record of maintaining archives and morgues and working with local memory institutions to make sure that some record of that stuff was preserved. It wasn't perfect but you know you would have hoped that that would have turned out okay and then it really really really didn't turn out okay. At the same time I'd like to believe at least this morning that you know the big national newspapers which are clearly very important you know sources for future scholars in any number of fields right now seem to be sufficiently well funded and sufficiently motivated that they'll probably take reasonably good care of that collection of material. At least I'd like to believe that. So we're going to need to wrestle with that sort of thing. I think that one of the things that we don't consider enough is we seem to feel like any area we need to go into to preserve we need to do it comprehensively and I'm not sure that's true. I think there are areas where we might wish we could do it comprehensively but we can't for various reasons lack of resource lack of priority lack of access but we can do enough to preserve a sense of that material into the future. We can we can project into the future some idea of what it was like to see that or participate in that or do that and even that has very genuine value I think that we may need to be satisfied with in some cases. So here are some questions that are framed. I think I'll just close my comments with one last reflection. When we start thinking about commercial content well actually this goes beyond commercial content but it's most obvious with commercial content. 
We are in a world, or about to enter a world, depending on the specifics of the marketplace, where basically we cannot preserve, and indeed cannot share, also in a library sense of sharing, content without the explicit consent of the rights holder. That is an absolutely radical departure from the world of physical objects, and it's very easy for the general public, for example, to not understand how profound that is. It feels to me like we need some kind of a strategy for a very large scale public understanding campaign about just how serious that shift is, and about the need for not just, you know, sort of inside baseball legislative tinkering, but a genuine sort of public opinion and policy commitment that that's not where we want to be and that we need to get somewhere else. And I think that ties in also to what Carol was saying about perhaps we are entering a period of opportunity to get some players to do the right thing, not because they have to but because they can win in the court of public opinion by doing the right thing, and we should be there to help them. That's going to require longer term strategies and activities that I think are pretty unfamiliar to our community but that, at least to me, are feeling increasingly important and, I firmly believe, can be made reasonable and comprehensible to the broad public by drawing on their own experiences and their experiences rolled around. So with those reflections, let's talk about this list and let's hear from some of you. We do have a microphone