Hi everybody. Good morning. Thank you for being here. I'm Katrina Fenlon. I'm an assistant professor at the College of Information Studies at the University of Maryland, College Park. I'm also an affiliate faculty member with the Maryland Institute for Technology in the Humanities, and some of the work I'll talk about today is a collaboration with that research center. So today I'm talking about an ongoing research project on digital scholarship in the humanities, and especially the burgeoning challenges confronting libraries in reconciling with the mass of digital scholarship and its sustainment. In this year's Debates in the Digital Humanities, Ted Underwood proclaimed that digital humanities is a semi-normal thing now. Another survey found that at least 50% of faculty in the humanities report not only using digital tools and collections but also creating them, and of those a majority are intended for ongoing use and research by other people. So we're here after decades of investment in the reconstruction of the cultural record on the web: in datafying, modeling, encoding, annotating, enriching, digital scholarly editing, mapping, and software development. At this point, digital collections and projects created by humanities scholars are an extensive, scattered, heterogeneous, and completely independent cultural record. So DH is a semi-normal thing, but sustaining DH is not. Maron and Pickle, in their survey, observed that from conception to ultimately preservation or some other form of second life, most DH projects are unsupported in any systematic way, even on campuses that have a dedicated DH center. I think if you're here, you're acquainted with, maybe even plagued by, the scale of this problem: the sheer number of projects that are difficult to keep alive, and their ongoing accumulation on every campus.
So despite the fact that most DH projects fall well within the preservation mission of institutions as stewards of institutional research and scholarship, and despite the fact that lots of these projects are unique cultural records, often representing underrepresented communities or filling gaps in mainstream preservation institutions, DH has struggled to gain systematic support in libraries, and we're going to talk about why. The reason is that it's really hard to do. The most immediate problem, I think, is conceptual. We're not clear on what sustainability and preservation really mean for different kinds of projects, but we know that it's very context-dependent. We're all familiar with the interwoven socio-technical challenges of sustaining DH, things like the short-term funding model and the dependence on the initial creator or originary community. I can talk more about technical challenges if anyone's curious about those, but there's also a lack of clarity around the value of these projects from an institutional perspective: what an institution owes to digital scholarship, and how it should understand ownership for projects that are distributed and collaborative across institutional boundaries. Beyond that, most institutions simply don't have the capacity to take in digital scholarship at scale. Finally, one point I really want to highlight for the purposes of this presentation is that even beyond the pragmatic problem that libraries can't take in digital scholarship at scale, there's the problem that libraries shouldn't always do that. Many collections, of course, are seeking new models of institutional partnership that keep varying levels of power and control over the collection in the hands of the originary community. So, if libraries can't, or in some cases shouldn't, take everything in, what are the alternatives? Hopefully not this, but this is a page that I stumble on all the time in my research.
In fact, I took the screenshot this morning when I went to check on one of my example collections for this talk. Today, what I'm going to talk about is an ongoing research project on scholar-generated research collections, one that I started as part of my dissertation work at the University of Illinois, and what those collections teach us about the pragmatic challenges confronting sustainability for DH, but also about how to rethink sustainability itself. I'll talk a little about existing approaches in libraries, and then I'll talk about a new line of research that I've started work on with Trevor Muñoz and colleagues at the Maryland Institute for Technology in the Humanities and at the Maryland iSchool, on community-centered and collaborative models for advancing our capacity to sustain digital scholarship. So, I have three main goals for today. It's kind of ambitious. We'll see where we land in terms of timing. First, I'd like to give us some kind of handle on what projects' contributions are and what that means for sustainability. Second, I'd like to give us some conceptual clarity around the varieties of sustainability. And last, I'd like to persuade you that the collaborative models that we're starting to research are feasible, valuable, and sometimes the only ethical solution to sustaining digital scholarship. Okay. So, the new research on community-centered sustainability has its origins in research on collections generally that I started in my dissertation. I don't want to belabor that study here, but the main questions were: What are these digital collections that scholars are making? They're really cool. How do they contribute to scholarship more broadly? And what are the challenges for libraries and other cultural institutions in supporting them as a mode of scholarship and production?
This project was done in a few phases, starting with a large-scale review and typology of about 150 different DH collections, then a really close examination of three of them, the three that are shown on the screen, and then a series of interviews with experts, most of them in well-established DH centers, on how they handle this kind of scholarship over time. I found out many things, but the two main findings that are relevant to this new strand of work have to do with the varieties of contribution that digital scholarship makes and with the difficulties, of course, of sustaining and preserving that scholarship, which I'll dig into. When we're talking about sustainability, what we're talking about is sustaining, or keeping alive, the contributions of a scholarly effort. Different DH projects are making different kinds of contributions, and we have trouble talking about those. That's one of our challenges, and it means that we have trouble making decisions about how to sustain projects over time. So I want to give just three examples of the different ways in which DH projects make contributions. Of course, there are many other ways that they contribute, but I want to illustrate how one concept, the concept of completeness, can help us get a handle on what sustainability might mean for different kinds of projects. So the question here is: what does it mean for different kinds of DH projects to be complete? And by complete, I don't only mean done or finished; I mean complete in the sense of whole. In a really common and familiar case, a collection is complete when it provides a complete and definitive set of sources. This is a really common model for a project. It's often called a digital archive. The example case I'll point to is one at MITH, the Shelley-Godwin Archive. The goal of this project is to produce a complete set of digitized manuscripts from the Shelley-Godwin family of writers.
High-quality images mapped to transcriptions, richly encoded annotations. This is like an archive of definitive primary sources. Okay, so that's one mode of contribution. Here's another, contrasting example: a mode of scholarship that's not about providing the original sources, but instead about providing contextual information and relationships around some exemplary set of sources. This kind of project is not trying to have everything in its most authoritative form. Instead, it's trying to show you enough and then provide valuable layers of context and interpretation on top of those sources. So, for example, something like The Vault at Pfaff's at Lehigh. This is an aggregation. It starts with resources that already exist elsewhere on the web. It pulls them in, but what they really built here is an incredibly intricate and browsable network of relationships among bibliographic entities and people. So you can navigate from a list of works to people represented in the works to people who have relationships with those people, and so on. Okay, so clearly the contribution here is really different even though the modes of access are kind of similar. We're looking at a discovery and access mechanism across primary sources, but the basic contribution of the project is very different, with different implications for sustainability. To sustain The Vault at Pfaff's, I don't need to take in the digitized objects. They exist somewhere else. I'm taking in a fundamentally different kind of thing. Okay, a third example, the last example, even though we could go on. In some cases what it means for a collection to be complete is to provide a platform with new kinds of evidence and new kinds of uses. Collections or projects like these are remodeling original sources for new purposes.
So doing things like what O Say Can You See is doing: pulling documents from a bunch of different legal archives and digitizing them, but maybe not with a focus on digitizing for archival quality, instead with a focus on digitizing to support transcription, datafication, and encoding. The main point was to pull the data out of the encoded documents and reuse it to do social network analysis and to tell new kinds of stories. So this kind of platform is doing something different from what the other two collections are trying to do: deconstructing and remodeling existing sources for new kinds of use. So what would you be preserving in this case, and how would it be related to the preservation or sustainment of the original collections? Okay, clearly, if these kinds of projects are all doing different kinds of things, and if we're trying to sustain their main scholarly contribution, then one of our greatest difficulties is that there is even more context dependency in how we sustain collections than we usually tend to think about. Preservation-minded people might be inclined to ask: okay, this sounds complicated, these are complicated resources, but isn't this a problem we're solving with advanced approaches to software preservation or web archiving? Can we just save these tools and these sites and encapsulate them somehow? Isn't this something we're coming to the edge of understanding how to do? The answer that's come out of the interviews that I've done with scholars is that what's useful to keep around, what they need for scholarship, what the scholarly record demands, is not always the collection in its current form. We're not driving at fixity or at preserving discrete publishable outcomes of projects.
Instead, we're thinking about sustainability as opposed to preservation, and when we do that, we're thinking less about discrete outcomes and more about these projects, things like O Say Can You See, as having their value because they are alive and functional as resources for ongoing and active work and collaboration. So in our interviews, sustainability and preservation were distinguished, and scholars were only interested in sustainability: collections as active hubs for ongoing work. At this conference I've noticed a bent in the discourse about sustainability, one that mirrors the literature on research and practice in this area as a whole, towards one kind of sustainability. I don't think there's a strong shared definition in the literature or in the discourse in these rooms, despite the fact that so many great minds are working on this problem in different institutional contexts. There are widely varying definitions, but they're all oriented towards things like organizational resilience and economic viability of programs. Some recent NEH-funded work by Langmead and others has highlighted socio-technical aspects of sustainability and made strides towards helping us recognize and anticipate those. I'll talk for a moment about the way that these conceptions of sustainability are panning out in practice and then come back to how this motivates our new work on collaborative models for sustainability. So the way it pans out is in a few different models, and I think these will be quite familiar. In many cases digital humanities centers, where they even exist on campus, end up as sometimes reluctant memory institutions. Many have partnerships with libraries, or some kind of affiliation, but most are taking sole responsibility for the ongoing care of digital collections, and that in itself may not be sustainable. Many institutions are taking a service-level approach.
I should also note that these strategies overlap; each is usually used in combination with others. So many institutions are taking a service-level approach, which is to say there are different frameworks emerging for identifying to what extent an institution is willing to make a commitment to a digital project in terms of specific components of the project. That means identifying levels or layers of commitment to preservation for certain artifacts or pieces of artifacts. A third broad trend I see in how sustainability is being approached is in advancing infrastructures: creating shared infrastructures across institutions or domains, scaling up or trying to aggregate digital content towards some kind of critical mass that may lead to sustainability through, maybe, a bigger user base, and advancing repositories and publishing platforms to accommodate more complex interlinked digital collections, networked digital scholarship, and linked data. These are all very exciting, and while all of these are really important contributions, we hope we can add to them with a shift in how we're thinking about sustainability. This shift is motivated by the promising movement towards shared stewardship and post-custodial approaches to archiving, but also by the research I've been talking about, my own prior work, which found that across all the amazing DH centers where I was conducting interviews, sustaining collections was heavily dependent on sustaining workflows: distributed, collaborative workflows of development and maintenance. When those broke down, collections broke down. So it was about supporting collaborative work. I'm going to go back to the main points of the definition of sustainability and build on it here. So yes, organizational resilience, economic aspects, socio-technical aspects, and also we are distinguishing sustainability from preservation for resources that are essentially, or by their nature, interactive.
So things that are not inclined or predisposed to be shelved. We're thinking about digital objects and digital collections as metaphorical, as computed, often not self-contained, usually heavily interlinked with resources elsewhere on the web or with services or utilities located elsewhere, and as making their contributions through the act of being used, as living things actually serving communities. When we start to think about digital collections and objects this way, we start to turn away from an artifact-oriented preservation paradigm and to think more about what it might look like to turn our attention towards sustaining the communities that build and make these things, instead of sustaining the artifacts that come out of them. So what might that look like? Trevor Muñoz and I have been working on rethinking sustainability, and here's one of our proposed definitions. A digital humanities collection is sustained as long as it responsibly supports the endurance of the communities that create it, as a locus of memory, communication, and knowledge production, for as long as useful and in whatever forms are useful. This definition of sustainability is guiding our research, and the implications are significant. The first, which I've mentioned, is that if we want to sustain communities and their collections, we'll have to understand more about the workflows that create and maintain them. Those workflows are idiosyncratic, they're distributed, they're highly collaborative, and they're difficult to understand, which means that we have a need for new research: research on which critical roles libraries can play if we're not talking about the handoff of collections or objects; research on the communities that are creating collections, on their collections, and on their workflows; and research on new models of institutional partnership. To that end, we're starting this new phase of work. We're starting the Sustaining Digital Community Collections project.
Its goal is to develop context-driven sustainability models which share responsibility for the long-term care of projects between libraries and research communities. Our goal is to build out sustainable infrastructures or patterns of work in a growing diversity of communities and institutions. So this project looks across digital scholarship, but also into community archives and other kinds of collections outside of academia. Right now, as part of our pilot work, we're developing three case studies of three different digital humanities projects, which are shown here on the screen. Our first case is a large-scale linked data hub of data about enslaved people, called the Enslaved project. Our second case is a corpus of Islamicate texts and OCR tools. And our third case is a digital community archive local to this area. All these projects are in active development, which I'm really excited about, because often we study sustainability towards the end stage of the life of a digital collection, or in retrospect. In this case we're going to be looking for vulnerabilities in the development processes. Eventually we hope to expand this out to looking at success stories, or collections that have transitioned in different kinds of creative ways. We're in our early phases, just starting our fieldwork and our interviewing this semester, but to give you some sense of what these kinds of models might look like: I've been really excited by the developments described in other sessions at this conference, including things like toolkits, best practices, subgranting, consultation, workshops, minimal computing investments, and so on. So we'd like to look at the emergent models that are already coming forward in libraries for reconciling with this problem and with community-oriented research, while also integrating research from other domains.
We're looking far afield: at broad-ranging library practices outside of academic or research libraries, at community and critical archives theory and practice, but also at computer-supported cooperative work, at studies of how people do research in collaboration, and at what it would take to make a research community resilient. That's our goal. So I'll look forward to reporting back on our effort to realize community-determined, community-led strategies for sustaining digital collections with new models of support from cultural institutions. Thanks very much. I'll be happy to take questions.