So, hello. Thank you, Andrew. That was awesome. We're going to move to a different type of story that has a lot of the same similarities, but not quite the same outcome. I will let you all know up front, just in case: this story does have a happy ending, and no data were harmed in the making of this story. So if you had some qualms, let them rest. It's all good. I'm Bradley Daigle. I'm Executive Director of AP Trust. And I'm Jill Sexton. I'm Associate Director for Digital and Organizational Strategy at North Carolina State University Libraries. And so we're going to tag team a story about digital preservation. It's incumbent upon me to remind everyone that digital preservation has many components, and these are some of the key pieces. They don't play a singular role; they play a collective role in the outcome of the story. But they are elements that are critical to not having an unhappy ending, not if data loss happens, but when data loss happens at our individual organizations. These are things for an organization to always make sure they have in place: a strategic framework, and an understanding that digital preservation is active. As many of you have heard me say before, I strongly believe that digital preservation is an asymptote. It's something that you approach but never really reach. And that's okay. It's about the journey. If you feel like you've solved digital preservation, please stand up and let's all exit the room accordingly and follow you to your organizational home. The second part is that digital preservation is about assurance. And this is where this key component comes into the conversation. Assurance comes in many ways, and I can tell you, based on the story we heard earlier from my colleagues Karen and Rosalyn, assurance is not an MOU. It does not come from a board. It does not come from any kind of documentation like an NDA.
Assurance is a dialogue and relationship between and among players in digital preservation, both in your organization and the services with whom you engage. Specifically, from an AP Trust perspective, there is this concept of fire drills. I'll talk a little bit about AP Trust on the next slide very briefly, but on the idea of fire drills: there's a blog post that I put on the DPC site, but essentially a fire drill is something that I developed with AP Trust, which is the means by which we can test an organization's resilience. So for example, AP Trust as a service has a methodology by which we will randomly restore content that you've deposited in our service. And we do that for two reasons. One reason is that we want to make sure that you, in the case of a catastrophe, are able to make sense of that which you put in originally. So it's testing our service's ability to deliver what you expect. The second part, and the equally important part, is that assurance also tests your organization's ability to understand what it gets back internally. So not only are you assured in these instances that the service is doing what you expect, you're also assured that your organization is able to understand how it's using digital preservation and where it fits. So that's a key piece of digital preservation that constantly needs to be assessed and tested. Not all of our members love that idea. Sometimes it's inconvenient, but as our story will tell you, disasters don't always fit with our plans. So, a quick take on AP Trust. AP Trust is a small consortium of what I would consider fiery and deeply engaged individuals who are passionate about digital preservation, are very, I would say, self-deprecating about their ability to do digital preservation, but also have a very strong desire to figure out our common problems around digital preservation. We've been around for a while.
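As a rough illustration (not AP Trust's actual tooling; the function and manifest names here are hypothetical), a fire drill of the kind described above boils down to: pick a random sample of deposited objects, restore them through the service, and verify each restored file against the checksum recorded at ingest.

```python
import hashlib
import random

def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def run_fire_drill(manifest: dict, restore_fn, sample_size: int = 3, seed=None):
    """Randomly restore a sample of deposited objects and verify fixity.

    manifest   -- {object_id: sha256 recorded at ingest}
    restore_fn -- callable(object_id) -> bytes; stands in for the service's
                  restore API (hypothetical, not AP Trust's real interface)
    Returns (passed_ids, failed_ids).
    """
    rng = random.Random(seed)
    sample = rng.sample(sorted(manifest), min(sample_size, len(manifest)))
    passed, failed = [], []
    for obj_id in sample:
        restored = restore_fn(obj_id)
        # Fixity check: restored bytes must hash to the value recorded at ingest.
        (passed if sha256_of(restored) == manifest[obj_id] else failed).append(obj_id)
    return passed, failed
```

The point of the drill is the two-way test Bradley describes: a failure here tells you either the service cannot return what you deposited, or your organization can no longer make sense of what comes back.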
I don't have a numbers slide like Andrew's impressive numbers, but I can tell you off the top of my head, we have roughly 10.2 million items. We have conducted about 192 million PREMIS events that we've logged with our depositors' content. So we have a fair number of things going on. And of course we're dedicated to transparency. Thanks for the shout-out from Rosalyn earlier. All of our documentation is publicly available on our wiki and website. So you can see it all there. That's AP Trust. Okay. So I'm going to start talking a little bit about NC State and our context. And I thought it would be interesting to give a little bit of my perspective. Listening to Andrew's talk earlier, Andrew spoke from the perspective of Harvard, a very well-resourced institution. And it's a really impressive infrastructure you've got; you've got an order of magnitude more content than we've got at NC State. Nevertheless, NC State, I think, is fairly fortunate and well resourced for an institution of our size, but not quite at the same level as Harvard. We do still feel that we've got a strong need for robust digital preservation infrastructure. Our special collections program is relatively young on the scale of special collections programs, but it's a growing program, and many of our collecting areas have a healthy born-digital component to them. So it's really important to us that we keep those precious born-digital materials especially safe. And going on to talk about why we chose AP Trust as our digital preservation infrastructure, I think it's important to understand what made us make that decision. In 2019, our libraries did all of our digital preservation work in-house. Not only did we create workflow applications, we hosted our storage infrastructure ourselves, and we maintained an offsite tape backup ourselves as well.
But we thought it was time for us to take a look at what we were doing and make a decision about whether that was the best way for us to go forward in the future. Some of the factors at play: there were some changes in the digital preservation environment at that time. NC State was an early investor in both AP Trust and DPN. This was around the time that DPN folded, and it made us step back and think, okay, so what are our current investments? What are we doing? How are we approaching this problem? We were buying into AP Trust, but at the time we hadn't really committed to using it as our digital preservation platform. We were like, oh, well, we're doing this in-house. We're just taking a look at AP Trust. We're going to see how it does. We're still dating. Yeah, we're just casual. But, you know, we also decided it was time for us to take a look at our investment portfolio. We're paying dues into this organization. We're spending time going to the meetings. But we're not really buying into it. Do we believe in it or not? Do we think it's important to invest in shared infrastructure? If so, and if we think it's a good and reliable platform, why aren't we using it? So we undertook this exploration with a mind of saying, well, either we're going to go all in and use this as our digital preservation platform, or we're going to disinvest. If we don't think that it's worth it, why are we paying into it? Right? There were a variety of other factors at play too. You know, cloud storage was getting cheaper and cheaper all the time.
And also, we were in the midst of doing some long-range vision thinking, doing a really frank assessment of our own staffing levels, and recognizing that our institution might not be as well served by trying to maintain and develop our own infrastructure for this really critical component when we were well resourced, but not that well resourced. We didn't have real depth of expertise in our pool. We had people who were really strong in their individual work functions, but really no redundancy. That creates a single point of failure, which is really risky when you are thinking about long-term digital preservation. We knew we weren't going to get any more money to hire more staff to provide that redundancy and extra depth of expertise. And just as an institution, historically, NC State puts more of an emphasis on using those resources to develop applications, workflow tools, et cetera, that can really bring an advantage we couldn't purchase elsewhere, and on paying for services, like infrastructure, that we can buy but can't do better ourselves. So you can pretty much guess where this is going. I'm skipping a little bit, but to give you a sense of the scale of our program: in 2019, we had about 60 terabytes of what we would consider irreplaceable data, with an anticipated growth rate of about 10 terabytes per year. So you can see, an order of magnitude different from Harvard, but still very important to us. In looking at a variety of options, we had a really great team, who are listed on one of my next slides. I don't want to take credit for their work. They did a great job creating a really exhaustive review of the digital preservation environment, calculating long-term costs of doing it in-house versus various different platforms.
And they looked at a matrix of factors that they considered and tallied up to come up with a score: these are the platforms that we think would be acceptable, and here's how we should move ahead. You can see typical risk mitigation points here on the slide that they considered in making their decision. When it all came out, we decided that AP Trust was our best bet, for a variety of factors. We also had a lot of confidence in the organization and its financial model, and in the leadership of AP Trust, and so we thought it was the best way for us to go moving ahead. And so, maybe this is what you all came to hear about. We talk a lot as institutions about our shining moments and our big, really exciting achievements and accomplishments: look at this great thing we did. We very rarely talk about, wow, we really messed up. Well, we made a big mistake. In June 2021, one of those risk factors that we deliberately planned for, an accidental staff error, happened. The storage volume that held all of our digital preservation masters for special collections was accidentally deleted, and its backup was immediately deleted as well. Let that sink in; imagine if that happened to you in your library. That was a bad day. It was a bad day. But a year before, a team of dedicated folks had worked to start ingesting all of our special collections content into AP Trust, and it was all there. It was all there. Wow, what a great thing. So the first thing we did was call Bradley. Probably not the literal first thing, but one of the first things that we did. And through the course of investigating the incident and seeing what we had on other backup sources, we were able to determine that all but 16 terabytes of that content could be recovered from local data stores.
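The triage step Jill describes, figuring out what survives locally and what must come back from the preservation service, amounts to a manifest comparison. This is an illustrative sketch, not NC State's actual tooling; the names are hypothetical.

```python
def plan_recovery(deposit_manifest: dict, local_checksums: dict):
    """Compare the deposit manifest (object_id -> checksum recorded at ingest)
    against what survives locally after the incident.

    Anything missing locally, or present with a mismatched checksum, must be
    restored from the preservation service; everything else can be recovered
    from local data stores.
    """
    restore_from_service = [
        obj_id for obj_id, digest in deposit_manifest.items()
        if local_checksums.get(obj_id) != digest
    ]
    recoverable_locally = [
        obj_id for obj_id in deposit_manifest
        if obj_id not in restore_from_service
    ]
    return recoverable_locally, restore_from_service
```

Good ingest-time records are what make this comparison possible at all, which is why the quality of NC State's deposit documentation mattered so much in containing the restore to 16 terabytes.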
And we ended up having about 16 terabytes' worth of irreplaceable digital special collections content that we needed to recover from AP Trust. I'm not going to go into a great level of detail about the recovery process, but if you have specific questions you can ask, and it probably would be more instructive for you to be in touch directly with the technical folks who worked on this recovery effort if you have more specific questions. It took us about six weeks to complete the restore process. And I'll talk a little bit more about what we did to mitigate the root cause. But, you know, you think that your practices are good. You do your best. You make a good effort to create thorough documentation, to create strong change management protocols, to, you know, not delete storage volumes that contain your precious digital masters. But we're all vulnerable. I think the point of this whole talk is: no matter how well prepared you think you are, accidents can happen, disasters can happen, and it's good to be prepared. So, the things that went well. We were able to recover everything. We didn't lose anything, which I think is really a credit to the strong work of Bradley and his team at AP Trust, as well as the dedicated work of numerous staff in the libraries throughout essentially an entire summer to extensively document what had been present and what we needed to recover, and to ensure that it was all recovered properly. Costs were contained, and I think Bradley is going to talk a little bit more about that. His team worked really closely with Amazon to determine the rate at which we could download content without incurring egress charges, and also NC State is an Internet2 member, and our deal with AWS meant that we were shielded from some AWS egress fees for recovering our data.
I think a great strength throughout the entire process was, you know, this was a really high-stress incident for us at NC State, but I have to give credit to the team. No one lost their temper. No one pointed fingers. This was an accident, and it's unfortunate, but accidents happen, and it doesn't do any good to blame or shame people. We had very clear, level-headed communication throughout. I think, you know, Mike Castellic in the libraries at NC State, Trevor Thornton, Brian Dietz, Kevin Beswick, Jason Ronallo, Jamie Bradway, listed on the slide here, just acted extremely professionally. I think that is such a critical part of why we were able to get through this disaster still talking to each other, still friends, still trusting each other and trusting in the capabilities of every group to do what's required to maintain our collections safely. And with AP Trust's help, we completed a really thorough after-action review of the incident to document exactly what happened, and, to the best of our ability, to create action plans and follow up on them to do what needs to be done to prevent future incidents from happening. On to the things that could have gone better. Just as Bradley mentioned, I guess you always think, well, I'll get to that fire drill, I'll test this out later. We'll get there. We'll test it out one day. We hadn't really done any fire drills testing massive data recovery efforts, and that slowed us down initially. Everything we knew about how we might recover things was theoretical. It wasn't practiced. And we hadn't documented any of the methods, policies, or roles that were going to be required to make this a success.
So I really can't emphasize enough that it's really important for you to do that testing, to practice, to document and define roles before you need to, because in the midst of a crisis you're not thinking as clearly as you could be. And it's just good to have a starting point. Bradley. So I'm thinking we should have called this talk something more along the lines of Jill and the Terrible, Horrible, No Good Day, but we went with data loss again. From a service perspective, I think you're hearing a lot of things that could have been done better and things that were done well, and I really want to focus on the learning strategies that have evolved from this. For example, because AP Trust had been doing those fire drills, we had a pretty good sense of how to pull data out, even from Glacier Deep Archive. For those of you who know your cloud storage tiers, this was like the worst-case cost scenario. It was across multiple data centers in a far region, using Amazon's Glacier service, which basically means the data are just sitting there and Amazon has got that meter running on, you know, speed. But because we are constantly interacting with the data and pulling things down for fixity, we had microservices in place to know exactly the threshold at which we could withdraw data and not incur costs. And in fact, at the end of the day, the threshold wound up being human, not technological, because giving NC State 16 terabytes of data and saying, hey, is it all good? Let us know, because we need to delete it off this other mid-tier storage, was not a good solution. It was about what's the right balance. How much can an organization physically evaluate of their materials, knowing that the stakes are very high? So we had to look at the cost factors.
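The cost-threshold idea Bradley describes can be illustrated with a simple scheduler: given object sizes and a per-day retrieval budget (for example, a negotiated free-egress allowance), group restore requests into daily batches that stay under the budget. This is a hypothetical sketch, not AP Trust's actual microservice:

```python
def schedule_retrievals(objects, daily_budget_bytes: int):
    """Group objects into per-day batches so each day's retrieval stays
    within a byte budget (e.g. a negotiated free-egress threshold).

    objects -- iterable of (object_id, size_bytes) pairs
    Returns a list of days, each a list of object_ids.
    An object larger than the budget still gets scheduled, on its own day.
    """
    days, current, used = [], [], 0
    for obj_id, size in objects:
        # Start a new day when adding this object would blow the budget.
        if current and used + size > daily_budget_bytes:
            days.append(current)
            current, used = [], 0
        current.append(obj_id)
        used += size
    if current:
        days.append(current)
    return days
```

As the talk notes, the real bottleneck turned out to be human verification capacity, so in practice the budget you schedule against may be how much staff can check per day, not how many bytes the network will move.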
And of course, Brian Dietz, a special collections individual, was very concerned that, you know, of course this would be the one time that faculty want all these high-resolution images that they requested years ago. Where's my scans? So we had to find all the right balance points. The other piece was, as I told my staff and Michael, oh gosh, if this doesn't go well, then we might as well all just go home, because that's the story that people would tell. And it's the story that people kind of want to hear, instead of the happy-ending story, just for a variety of reasons. Speak for yourself. I know. Yeah. But in terms of, like, well, thank God that wasn't us. One of the other pieces is that AP Trust does happen to have a staff member who has certification in risk mitigation. And so she did sit down with the NC State folks and say, okay, let's walk through this post-mortem. Here's how you do incident reporting. And here's how you, more importantly, turn that into organizational change and advocacy strategies for where you're seeing those gaps. I think that was a critical piece for how this kind of relationship moves forward, learning from this as we have from so many things, and then sharing the story. The last piece I'll tell you is that I have to give Jill and Greg Raschke, the dean there, so much credit for being willing and open to tell this story. Because I feel like the digital preservation community is much like the special collections community was, you know, 25 or 50 years ago, where if you had a theft, no one said anything. You don't say anything. If the donors hear that someone stole something, then, you know, it's going to look bad. We'll never get any more funding. Just don't say anything. Whereas at some point, they realized, well, if we don't say anything, then we're never going to learn and we're never going to get the stuff back.
I think digital preservation is very similarly disposed: if we don't tell as many stories about where data loss has happened and what we've done to mitigate and solve for that in the future, other organizations won't learn until it's their turn. So it's really incumbent upon us to be mindful, and I would say objective, about these stories, to say, yeah, it's going to happen to us, or it has happened to us, and how do we learn together as a community and move forward. So that's what I'm looking at as the outcomes of this story. Yes, I'm glad that it went very well. We had a 100 percent successful restore. But more importantly, I'm glad we're able to tell this story so that you all can hear it. Yeah, so to sum up: in response to the list of mitigation factors that we identified during the post-mortem, we've put the following practices into place. We've really made sure that all of our documentation is accurate. We have worked to script real-time dependency identification, so that if something a system depends on goes offline, we hear about it, with effective monitoring and alerting that wasn't in place previously. We have greatly enhanced our change management procedures, instituting a change management process that is mandatory across our development groups, so that there are templates of questions that must be answered, reviewed by the relevant development team, and approved before any change of any significant magnitude can be enacted. We have instituted more intentional knowledge sharing to try to mitigate that effect of, well, this person is the expert on this system, and if they're not here, then we really just have to wait until they get back from vacation or back from sick leave before we can do anything about that.
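A mandatory change-management template like the one described above can be enforced with a very small gate: every required question must be answered, and the change must be signed off by a reviewer other than the requester. The field names here are hypothetical, not NC State's actual template.

```python
# Questions every change request must answer before a significant change
# can be enacted (hypothetical template fields).
REQUIRED_FIELDS = ("description", "systems_affected", "rollback_plan", "reviewer")

def change_approved(request: dict) -> bool:
    """Return True only when the change request is complete and independently
    reviewed.

    - Every required field must be present and non-empty.
    - The reviewer must be someone other than the requester, so no one can
      approve their own destructive change.
    """
    if any(not request.get(field) for field in REQUIRED_FIELDS):
        return False
    return request.get("reviewer") != request.get("requester")
```

The independent-reviewer rule is the part that most directly addresses the incident in the talk: a second set of eyes on any operation that touches a storage volume holding preservation masters.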
We've really tried to make an intentional effort to cross-train. And I think maybe most significantly, it really kick-started a conversation that had already begun, and that I alluded to previously: knowing that we have staff retirements coming up, and that we're all subject to staff turnover, especially now, really assessing our strengths as an organization, our capacity, our staffing expertise, and our staffing levels, and making an informed decision about how our infrastructure should evolve in the future to really provide the most stable and sustainable solution and platform for NC State. It just underscored to us the importance of outsourcing in areas where we can't do as well as an organization that is devoted to that purpose. So physical hardware maintenance, you know, storage infrastructure, is one. We are disinvesting, I think, in self-maintained hardware in favor of outsourcing to our central IT office, for example, or relevant cloud services when available. And another example is AP Trust for digital preservation infrastructure. I think Simeon mentioned in his lightning talk yesterday: shared infrastructure equals shared distribution of effort, a better product, and a product that's less susceptible to oversight. I think that's roughly what he said. I wrote it down because I was like, yes, I agree. And so, that's the upshot. I think it was a learning experience for us, thankfully one that we were able to recover from. We did lose a summer's worth of effort from some of our really talented, skilled staff. They could have been working on something else; instead they were working on recovering work that had already been done. But, you know, it was a valuable experience for us. And I think I'll leave it open to any questions you might have. And I don't know, did you have more? Oh, here's the team. Yes. Thank you so much, team. That's it. So.
I don't have a question, but I do have a comment. I wanted to thank you, Jill, because it's not easy to stand up in front of a large room of people and say, we made a mistake. That's why I'm sitting. But I think it's, yeah. But it's really helpful, I think, for all of us to know that when we do make a mistake, we're not alone, and that we can all learn from our mistakes. So, thank you. Sharing on problems that happened: we had a journaled file system fail, and we'd had someone make the decision that they didn't need to move that information to AP Trust as quickly. And so we had to scramble locally and actually re-image a small amount of stuff. So that leads me, knowing how hard it was to sort through what we could find in different places and what we had to re-digitize, to wonder: did you ever consider just pulling the whole 35 terabytes from AP Trust rather than trying to recover out of local resources? There was conversation around it, but the level of confidence in what they were able to resurrect from local storage was high enough that they felt that was a safe bet. I should mention that we have made significant investment into digital preservation and workflow tools at North Carolina State. Trevor Thornton and Brian Dietz, I think, worked really closely on a tool. I think it's called Scoops, the Special Collections Preservation System, SCPS. And so we had really great records of what we had deposited and what it was. So we did have a high degree of confidence and didn't feel it was necessary to just pull everything. Plus it would have taken much longer and a lot more effort. A lot of staff time went into this. I don't want to minimize that. It was a significant unexpected commitment of staff time, and to have more than doubled it would have been... We didn't need to. I think to Andrew's point earlier as well, which I, of course, always enjoy when Andrew speaks, he's right. We're a smart bunch.
There are a lot of really smart people here. There's no reason why we shouldn't be solving these problems together. And even if you are a well-resourced organization, or you think you're well resourced, or you think you could be better resourced, there's always room for collaboration. There's always room for information sharing. And I think it's more the behavioral and cultural limitations that minimize the amount of preservation that could happen. If we could think more about how we meet each other where we're at from a digital preservation perspective, I think we'll find that more digital preservation will happen as a result. And that's why I think coming to these meetings and listening to different perspectives is always a refreshing way to do that. So I certainly don't think our service is the be-all and end-all, and we're very clear about what we do. That's why we focus more on the community that we develop. So all the pieces that Jill talked about, we've learned just as much from them as they may have from what AP Trust is working on. And that's really the key point, whether you're an AP Trust member or not: what do you bring to the table? So I like to just remind us that that's kind of why we're all here, right? Wholeheartedly agree. I'll also mention that Mike Castellic, one of the leads on this project at NC State, will be giving a presentation, I think probably with a more technical bent, on the recovery process at Code4Lib. Coming up, I can't even remember. Is it in Philly? I can't remember where it is. OK, so he'll be giving a talk on this. If any of you want to send your folks to hear him talk at Code4Lib, he'll be addressing it there and you can speak with him directly as well. No way, we're not done. We're all going to sit here quietly until two o'clock. Josh, did you have a question? I had a question. Oh, sorry, see? You can come up and ask us. A question for Andrew.
OK, Andrew, you had a blue biscuit on one of your slides. It was a DRM management tool. Could you just briefly describe what that is? The blue biscuit? So I think the biscuit to which you are referring was actually DRS management, which is, basically, just generally speaking, like the DRS software. Oh, there's a database. Yeah, there's a database that's in there for sure. This is the biscuit. Yeah. So what's there right now is a bespoke system that includes caching and a database, and a lot of custom software. All right, now you can go.