So, good morning everybody, if we could switch the presentation over. So welcome to this morning's session on Collections as Data: Part to Whole. I am representing the grant team of Thomas Padilla, Hannah Skates-Kettler, and Stuart Varner. My name is Yasmin Suresh. Some of you might be surprised and thinking, I swear I saw Thomas earlier in the conference, and in fact you did. But he has been taken ill, and I strongly encouraged him to not stand here with a mask off talking to you all. He wasn't feeling great. And so as a result, I am doing this presentation, and I invite you all to engage with me in this and see if you can tell at what point in the presentation the slides change hands: which ones Thomas was responsible for, and which ones I am. And then make a note, and then we can compare at the end and see who is the closest to that target. So, if the past two years have taught us anything, it's like, you just roll with this, so here we go. All right, first we are going to pay our respects to our funding agency, the Mellon Foundation, sporting their brand new logo as of this week. Everyone, I encourage you to read the press release and understand the thinking behind the innovation here. So thanks to them for their generous support, and not just financially, but their support throughout these sort of ongoing no-cost extensions that we've had to put forth, again due to the past two years. Essentially the crux of the project is looking at how we can responsibly support computational engagement with cultural heritage collections and communities, right? So we're talking about doing this work in concert, not piecemeal, but thinking about these things comprehensively and holistically. Underscoring all this is that we want to foster the development, and you'll see this several times throughout the presentation because we really want it to stick, of broadly viable models that support responsible implementation and use of collections as data. 
And so this is the moment where, yeah, I teach a graduate course in data management, so I get to say, oh, for everyone just joining us, we're just getting started, and Thomas Padilla has been taken ill, so you'll just have me speaking to you today. Get a friend to fill you in on the game part of this presentation. So we're really talking about going from this isolated, individualized model of doing things to something that can proliferate and have much more of a systems thinking sort of approach behind it. These are the model characteristics that we're not going to read. We're going to go through these later in the presentation, all six of them, but these are the characteristics that we want the model building to incorporate and really live through its design work. So this is coming later, just a sneak peek. Here's how this presentation is going to be structured. So I'm going to give a bit of an overview of the project work of the Part to Whole grant itself, give some examples from a selection of our projects, what lessons we learned as a grant team, and then what the needs and next steps are for this kind of collections as data work in the future. So for those of you who are unfamiliar with collections as data, the series of grants, it's been facilitated by two grants. The first was an IMLS-funded project in 2017-19 that sought to establish the foundations for collections as data, including shared values, robust documentation, and a fundamental ethical core. The second, Mellon-funded grant built off that foundation to provide regrantees an opportunity to put those ideas and concepts into action through a holistic model implementation. You're going to hear the same words said over and over again because marketing shows you need to hear something seven times before it actually sticks. This is the grant, the second grant, that we'll be speaking about today, which provided us with an opportunity to develop these models through regranting to a variety of institutions. 
We had two cohorts of regrantees. The first cohort completed their projects, not their final reporting, but the project work itself, prior to the emergence of the pandemic, while the second cohort began their work as the truth of the severity of the pandemic became apparent. As a result, our cohort two participants became ever more adept at pivoting and flexibility, and we are giving this presentation in 2022 instead of 2021. So the crux of the work to elucidate broadly viable models was that we would intentionally center projects that would not be bespoke to any singular institution, but that had broad applicability. In the two cohorts that I spoke of, each team consists of these roles. That doesn't mean that the project teams were only three people or like just limited to three people in these roles. Many of them were much larger, but they had to have people who filled these roles on the team. We chose these roles because the purpose is to create models, not projects. And models need institutional buy-in, and administrators are needed partners in helping to create that organizational operating environment. The disciplinary scholar is another aspect of that environment, and it's one that extends beyond the organizational environment and brings that user perspective to the model design work so that our considerations don't end up cloistered within the libraries and archives. So as mentioned, the models we sought were both organizational and implementation, to come to the work really from both directions. The implementation model focuses more on workflows, technical parameters, digitization details, and so on. The organizational model gets at the people and the resources and how they need to work together to do so in a sustainable way. The other critical component of the project proposals was that they needed to work with collections that have been underrepresented in local or institutional histories or priorities. 
We also wanted these cohorts to be able to community build within the cohorts themselves, with one another. So they attended a formative institute to hear from some digital scholarship experts about project design, ethical practices, and process considerations, looking at Harriet, because she was one of those experts, before breaking up into groups to reflect on their original proposal and consider how they may need to refine that based off what they just spent all this time hearing and being in conversation about. Each cohort also participated in a summative experience where each institution could report out on how the project went, what they learned, and then also come back together in small groups and reflect and provide feedback to us, the grant team. Feedback from Cohort One had significant bearing on how we proceeded with Cohort Two. We also took advantage of the new acceptance of Zoom reality to have a summative event with both cohorts together so they could share in that way, and that spanned quite a bit of time. And I was actually really concerned people would be Zoom fatigued out because this was in the fall. But they were excited to get back to talking about their projects and sharing with one another across these two cohorts because they hadn't been brought together in that way before. So these are the 12 institutions that participated across the two cohorts. And it was important to the project team, the grant team, that the cohorts have a diverse representation of institution and project types. It would be problematic to say that we're trying to create models that are broadly viable and then have very homogenous institutions or homogenous project types. So they couldn't all be, like, newspaper digitization. So you'll see a mix of private and public institutions, including universities and cultural heritage institutions. Try to slow down, sorry, I'm a fast talker. So, you guys are going to be out of here really fast. 
So at this point, I'm going to just touch on three examples to help demonstrate how differently our colleagues were able to design models, which will provide a lot of different avenues for engagement and different on-ramps for this work depending on where you're coming from. All 12 of the projects produced very different materials and managed different challenges. And we'll share later in the presentation, it's actually in the program too, where you can find all these cohort materials. So please do check those out. So these three institutions found ways to create a collections as data model and re-imagined what that model would look like for their own context. I'm going to start with University of North Carolina at Chapel Hill, which really dove into this concept of interprofessionalism within the libraries and archives. I'm going to do a very brief institutional profile so that you can understand a little bit of the context in which they approached this work. It's a very large public institution with a large library staff as well. And this is a slide from actually the summative forum presentation that UNC made. They had a large project team, but what they identified as being almost unexpectedly critical to the success of the project was the thoughtful incorporation of expertise from across domains, and thinking about the interprofessionalism of that collaboration as expressed through the model development. So it required the coordination from curators, data specialists, graduate students, developers, soft repository services, disciplinary experts, and folks doing outreach and communications. So oftentimes in many institutions, especially larger institutions, those are fairly containerized activities of work. And when they come together, it might come together as a very special project for a limited amount of time. And because those things happen, there's often that friction or that tension of how do we work together? What does this mean? 
Because they were doing this project around developing a model, an organizational model and an implementation model, it really forced, and I don't mean that in a negative way, but forced them to think about how they come together across these different domains within the libraries and archives to recognize the expertise and perspectives of one another, to develop something that was a lot more holistic and comprehensive and true to the purpose of the grant. The project created a range of outputs, and by making this work so accessible, they were also able to gauge interest and consider how to continue with this kind of work if there was a lot of uptake, and there was. So as a result, UNC used what they learned in Cohort One to expand the project and apply the model beyond North Carolina. So this Mellon-funded project is currently underway, but speaks to the extensibility of even large-scale projects when you have a tested model to build from. I'm sorry, that's like much smaller font than I thought it would be, but you can Google On the Books and get to it and see their timeline. The University of Denver saw the collections as data model as an opportunity to examine and rebuild departmental processes in a way that wove the work throughout and created a cohesive socio-technical infrastructure. So it's a very different kind of institution. It's smaller, it's private, and it has a smaller library staff as well. This is from their presentation, and their goal was to explore handwritten text recognition technologies to produce these collections as data. But because their corpus involved medical data from the Jewish Consumptives' Relief Society, they recognized that they would need to go beyond the minimal Office of Sponsored Programs and IRB requirements to establish an ethics advisory board. And that was really informed by that formative summit, where they were able to reflect on what they had heard and consider what else they could do to go beyond the minimal. 
And this ethics advisory board was comprised of members from the university, the Rocky Mountain Jewish Society, a medical ethicist, a lawyer, and a retired physician. So this thoughtful and invested group helped the Denver team develop Terms of Use, which is one of the squares on the right hand side, that was passed by the ethics board and integrated into the API user interface. And Denver spent a lot of time making sure that the APIs were really accessible to people who might have had like less familiarity with using APIs, and they also did a large amount of technical testing and development as well as skill building throughout the libraries. They recognized that their limited personnel resources meant that they could not treat the grant project as an add-on to their existing work, but had to integrate that work really meaningfully and thoughtfully. This meant they could assess the workflows through the collections as data model lens and design the work with clarity of scope and purpose, with the expectation that other workflows will build upon that foundation in the future. The team observed that had they not reflected on the existing infrastructure and determined how to integrate this work as part of their digital collections and not a special project, it would have been much more difficult to manage the personnel disruptions that they experienced during the course of the project, which I think all of us can relate to. Louisiana State took a different approach to modeling collections as data work, recognizing the state's consortial digital library as presenting a unique opportunity to build a shared model across distributed institutions. I was very excited by this because it was an extremely ambitious project. 
LSU serves a large student population, it's a large university, but has relatively lean library staffing. But this institutional profile is sort of incomplete because it's only talking about LSU, and LSU's project was about leveraging this work across the entire Louisiana Digital Library. And that's 25 institutions that are very different from one another. I'll give you a second just to look at that so you can appreciate the diversity of their mission, their materials, staffing, et cetera. So this was their project. And while the project team in this case was very small, their goal was to facilitate this work across the consortium as a community effort. And this involves so many stakeholders with a range of awareness levels and prior engagement with even thinking about collections as data. So a lot of this team's effort was about building that organizational model structure and building a really strong foundation. And from there, they could then provide on-ramps to the technical efforts to build the implementation model. So this project had a really strong focus on professional development from both philosophical and technical perspectives and really tried to show how we could think about collections as data at a consortial scale. I think that oftentimes this community building around an unfamiliar but resonant topic takes longer and requires more care than technical skill development, which often occurs at an individual level. LSU's project plan called for some really exciting and concentrated institutes with the LDL members, with a lot of enthusiasm to get the chance to get everybody together in a room and cloistered at tables together and put their heads together and think real hard about things. Unfortunately, the pandemic upended all of that. And our colleagues in Louisiana had a spate of significant weather emergencies during this time. 
And I also just wanna take a moment here and acknowledge how dedicated and care centered all of the project teams were during this time even as some suffered significant loss. This slide represents how LSU leveraged everyone's expectations of virtual sessions to continue the work, albeit in a slightly more distanced way than originally conceived. So even though things didn't go as planned and it didn't materialize in this way that they had really had so much excitement about being in the room together and paying for the travel for everyone to come around together, they found ways through the virtual environment and through the speaker series to facilitate the conversations across the member institutions in a way that I think was still really successful and has led to this sense of community and sense of growing and learning and sharing of experiences together. Just like a recap slide of some of the key considerations. These 12 participants created different projects and deliverables, but there are shared commonalities in their experiences, which we're gonna address more. This is the link that's in the program as well, but here's where you can find the deliverables from all the projects. And I really hope you check them all out. There's something there for everyone. And the model building approach means that there's a greater likelihood that you will be able to engage and translate this work into your context. So what were our lessons learned? Again, this is what we're trying to do. Broadly viable models, responsible implementation and use. Not the single building, but the scalable, the community, transferable, all these things that we always want to happen with grants and they're always like... But this is our goal. We're gonna talk about each of these characteristics that were really fundamental to the success of being able to say, yes, this was a model that emerged, right? And not a one-off or not some thing that relied on institutional idiosyncrasies. 
We really set out to encourage the mindset of building upon, right? That we're looking at library operations, and we're not trying to tack this stuff on from the side. But we're saying this is core library operations. These are collections. So how do we treat them as fundamental processes of our collections and build upon traditions and workflows that we have and iterate where we need to? But don't have a perspective of, well, here's this brand new, completely alien kind of thing that's coming in that we have to somehow rejigger all this other stuff to make it work. Think of it as building upon the cores that we already have in our institutions, archives, museums, libraries. Keeping holistic objectives centers the considerations of all the communities involved. And these are the folks doing the implementation, doing the work as practitioners. These are the populations whom the collections might be centered around or about. Scholars, instructors, the public, students. If you can have an ethos of thinking about these populations as working in concert with one another as a holistic section of humanity, your objectives will be truer to the project and more tangible for you to achieve. Having values-aligned characteristics, I mean, these things do sort of follow one another, right? If you're supporting responsible computational engagement with cultural heritage collections, that means that you're thinking about those holistic partnerships, those holistic considerations, and you're thinking about labor, right? Making the labor transparent, helping people see the value of the labor and the time that goes into doing this work. You're doing this in concert and in conversation with communities who are directly affected by the work. And you're remediating, reconsidering, reprioritizing efforts to strengthen communities who have been minoritized throughout America. So this is all really taking a values-centered, values-aligned approach to the characteristics here. 
So it's a lot of work, right? Because these communities are different from one another. And so you have to go in with your eyes. It's like a Thomas pun. I didn't even mean it because of this GIF, but like go in with your eyes open. And those of you who don't know, like Thomas doesn't understand puns, but like, anyway, enter this work with your eyes open and recognize that it's gonna take a lot of time to engage in a way that's authentic to each community that you're working with, right? Because if you try to generalize too much, and it's almost like, I'm going a bit on a tangent here, it's almost like metadata, right? Like how much time do you spend on a thing? It's always that balance. It's the same here. Like you can try to generalize some foundational things, but then when you're working with specific communities, it does need to be specific to them, to their needs, to their goals, to their values and their priorities, and making sure that you're coming together to a common place. Otherwise it's not a model, right? Just a one-off. It's constantly changing. Even without horrible things, it's constantly changing, sometimes for great things. And these changes need to be rooted in our interprofessional, our interdisciplinary methods and particularities and traditions even, right? But expect the change, embrace the change, recognize what can grow, as per these little pea shoots or whatever, what can grow from the change. It is stressful. It can be stressful. This past two years it's been stressful, so maybe you're just getting me through that particular lens, but that change is important. And it is a people-centered project, right? We're talking about collections as data. Well, they are collections about people sometimes. Most times they're about people. Amanda, not always about people, as we heard yesterday, but people are doing the things. So people have primacy here. 
And considering care for people, their experiences, their expertise, if you can lead from that with your model design, then you are giving consideration to all those other characteristics that I spoke about earlier. So here's, I know, I'm sorry, I didn't say not CPUs or GPUs, but there's no other thing like people. But what are the needs and next steps? So what have we seen as needs in this domain of collections as data, not specifically the work of the grant, but when we think, okay, work that's happening in the field, what are the needs? We do need more responsible computational engagement with contemporary collections, right? With these things from today. What about Twitter feeds and so on? Like, what are we prepared to do to engage with these things, with the systems that we have? Are they really set up to do that? Or are we constantly wedging things in and not talking across domains, right? Across archival description domains and the repository domains, for example. For things that are happening contemporaneously, we have to think about both of those things because they will become archival works later, but we have the ability to extend that thinking to them now. We haven't had to think about that, but if we're going to lead from this ethically sound, considerate place of model building, we have to be thinking about the contemporaneous materials as well. Wherever you may land on the thoughts of AI and machine learning and libraries, we do need more engagement with thinking about how collections as data approaches could improve core library functions like discovery or description and access. 
I'm just gonna put it out there, like it's just a thing we need to do more about. I'm a skeptical person on the bounds that we heard from the privacy folks about doing a bunch of machine learning with vendors that will end up serving them, and I'm like, okay, but we gotta be part of that conversation, and we have to help guide what the values, what the people-centered approaches are, in order to make sure that we are not falling into a limited, vendor-monopolistic overlay of describing our collections through AI, but we are actually having a hand in design and implementation of that kind of work too. And really, for something that has been said, I'm sure, for many years, we need to focus on more cross-functional teams across units or divisions of the libraries. It was something that every team across the 12 groups was able to highlight as a benefit of the work. It had friction, of course, at various moments, but because they were working towards this common purpose of developing a project using a models lens, it helped facilitate those conversations at their institution in ways that maybe had historically been difficult. But it's ever more important as we think about structures: the organizational models that came out of this work are applicable beyond collections as data work. So I think that that's something that can happen, that we can do more of as a profession. And something that we've experienced is, this work is happening globally, and it's happening globally in different contexts, in different ways, across different professions within that work. It's not all in the libraries. 
And having that sense of awareness and openness to reconceptualize ways of doing things, reconceptualize what is the best or what is the most effective practice when you're working across culture, when you're working across professions, I mean culture in national culture, profession culture, this will help us get to a more representative place with collections as data work globally, on the whole, faster than if we nationalistically silo ourselves or even regionally silo ourselves off and then try to come together and see how things fit. So these are all needs. And so here's some examples of how we see collections as data work happening globally. And as a result, the grant team is going to be having an international collections as data summit in Chicago, like July or August, we're figuring it out, Patricia, but it'll happen in the summer, where we want to be able to take all of these lessons learned, all these experiences, have folks who are doing this work in other contexts come together and surface, how does this models approach work globally, does it? How could we potentially think about sort of stratifying the models based off different contexts, right? Centering the care, centering the people in the communities that we're trying to be authentic to and be in conversation with, and come up with frameworks potentially that could be adopted by the community and really help bring this work into the core library functions, and not something that is just for the most well-resourced, the most technologically resourced, the most risk-taking institutions, but really help create a new culture around thinking of collections as data. Literally have no idea what the slide is. It's a fern. So that is the presentation. I will very happily take questions, and thank you for your time this morning. I'm Chris Rashman, Drexel University. 
I was wondering, you talked about this developing framework for other people to follow, but is there a way that at a specific institution, when you present them with a model and they implement it, that they can carry it on to future work? Is there some way to operationalize that? Yeah, I mean, definitely, thank you, great question. The implementation model, so there were two models, the organizational model and the implementation model, and so the idea with the implementation model really is how do you take this work and extend it beyond this collection, beyond this project, potentially in your own institution? It doesn't necessarily have to be an implementation model for all institutions, but the idea was could they document the things in such a way that it can be transferred to other works? Is that what you were? Danielle. Hi, I'm Danielle Cooper at Ithaka S+R, and I'm really fascinated by the idea that you've developed a cohort model to really activate the potential there to show how these projects can work, and then there is this longer term goal where these practices will be more holistically integrated into libraries, and I'm kind of curious like what other things you think need to happen in the middle to get there? Recognizing, of course, that this is speculative, you haven't had a chance to try it, but I could see anything from fellowships to better educate, or revising the kinds of education people get in their LIS programs. I mean, the sky's the limit, but especially since you've been working with people on the ground, I'm curious what you think needs to happen more in the middle. Yeah, I mean, and that's a great question because across the cohorts, across the participants, they were very diverse institutions, right? So we have art museums and cultural history museums like Weeksville, and thinking about what that middle area is, it's the area of more specificity for those contexts, right? 
So that's where we kind of see the stratification potentially occur between like if you're an archive or if you're a library and if you're an art museum. I think some of the things that we have talked about is this, how to inculcate a philosophical shift in mindset, right? How do we get the thinking of data as collections? All right, we've been saying collections as data, but also like data as collections and have that be a more connected thoroughfare in people's minds, right? And how they approach the work. I think your example of what kind of education reform might need to happen in this area, I think we're seeing steps in that direction with the Scholarly Communication Open Education Initiative that is helping to put more of a focus on Scholarly Communication within the Masters of Library and Information Science programs. This feeds into that same ecosystem, right? I think we're really hoping that the summer summit helps us find what those exact takeaways are because we won't be able to get to that framework if we're not considering that middle layer and we want more people in the room with us helping us figure out the middle layer. It's an unsatisfactory answer, but it's an honest one. Yeah, just because there's nobody else in line, I'll just add one more thought, which is that I think it's an interesting moment from a labor perspective within the library because maybe 10 or 15 years ago, I would have argued to you that you could probably activate the liaison model to do a lot of this work. What would it mean if a liaison as part of their core responsibility saw thinking about the collection in their purview in this way as a core part of their work and there'd be so many of them throughout the library, but given that so many libraries have moved away from that model, I won't argue that. But anyways, thank you for your presentation. Sure, thanks. I'll respond to that. Yeah, yeah, yeah. But I'll sit down because there's more people. Sure, that could work. 
I think we struggle as a profession to get liaisons to engage with things that are, and I am a former liaison, so. Outside of their instruction, collection building, and research support, because going back to the previous question, things like scholarly communication haven't been inculcated into the educational paradigm of LIS education. I'm just gonna narrow it to LIS for the sake of this question. And so getting robust buy-in through a liaison model to engage with things even within a scholarly communication domain is oftentimes challenging. And so then data management is oftentimes challenging, and these are all connected. Sorry, it's like. Getting a little meta here. But it is connected when we think about how, where does this go from a labor perspective? Because we're, I'm up here, so I'm just gonna say, as a profession, we're very bad at letting things go. We're very bad at recognizing when we don't need to do a thing anymore and can do something different instead. And data management, data services, data ethics, privacy, scholarly communication, rethinking, publishing, conceptualizing data, collections as data and the workflows that are needed across many domains to do that work. I don't think using our current context of the educational background and training that people are receiving that can happen. And for folks who are in the jobs and just looking for like, I don't know, I'm eager, let's professionally develop our way into this. We don't need to do it in a classroom. That's great. The time that people need to do that is scarce, and it's very hard to get people that time to do that deep thinking and reconsideration of their patterns of work. And so if you all are in a position to think about as your organization, I have to remember I'm at CNI, sorry, I'm on my first CNI, as library leaders, how can you make the space if this is an area that you want to engage in? 
How do you make the space for the folks in your organization to go into something unfamiliar and learn without fear and be supported in that journey? Yeah, in many ways you've begun to answer the question I was thinking about, which I think many of these questions are sort of theme and variation, which is the shifting from project-based work to built-in workflows. Because often, grant funding allows somebody to buy out someone's time, so they create that time, but then that can become its own isolated team that doesn't get integrated. So I was just wondering if you could talk through some of the ways in which you saw from the cohort how some of those kinds of challenges of going from, because I would imagine that project-based way of thinking was pretty baked into institutional DNA. So how do people unlearn that to embrace what you're talking about? Yeah, it was very disparate across the institutions and that was part of why we had those requirements for the project team, that those roles were in there and that those folks in those roles went to the institutes, were in conversation with their counterparts at other institutions to talk about what they were engaging in when we broke them into small groups. Some of those small groups were built around the roles, so they weren't just at the institution level. And I think some organizations were able to practice introspection and reflection easier than others. But that's what it's gonna take, right? It isn't like you're doing something wrong, that's not the approach here. The approach is how can you do something differently to get to this outcome? And that framing is not often the framing that we employ as a profession. It's a very deficit-minded approach to things. And so I think if we can have, you have to have the authenticity of actually wanting to do this work in this way, of actually wanting to center these populations, of actually wanting to prioritize these histories. 
If you're not coming at it from that place, you're just not ready to do the work. There is an organizational readiness component to all this. You know, that's okay. You can sit down and read some more and get to a place where you're ready to do it. And that's okay, too. That's what we're all like. We're all lifelong learner advocates. Like, any other questions? Okay, so my question to you all before I let you go is at which slide was Thomas supposed to take over? The GIFs? That's because you know Thomas and he's like a GIF maniac, correct? You win first break option. Well thank you all very much and you can reach out to any member of the project team if you have any follow-up questions. I really thank you for your time.