Welcome and thanks for coming. In the University of California we have started working on a multi-phase suite of digital preservation activities in order to understand the needs of a system like ours for digital preservation. And while this is phase one of a multi-phase series, I thought it was important to come to a meeting like this and let folks know what we're doing. The University of California is a pretty large space. We have lots of different types of campuses and, as a result, many different kinds of data that need to be preserved. So I'm going to talk for a few minutes about how we got to where we are, and then Mary and I are going to talk about our process. I'll put my administrator hat on. I often get questions about how we govern the University of California libraries, and there are really a couple of groups that are important to this discussion. Across the university we have an organization called UCLAS, which is the governance structure. At the top of that organization is the Council of University Librarians or, as we call them, CoUL. And supporting CoUL is another group called DOC, the Direction and Oversight Committee. I sit on DOC. DOC is a group of high-level administrators reporting to the University Librarians, and our job is really to take projects, plans, and goals from CoUL and make those things happen. CoUL states these plans and priorities annually, and one of those stated goals is to maximize long-term access to digital content. It's been a priority for the University of California library system for about a decade. We've done lots of work in many ways prior to this on different aspects of digital content access. And now we're trying to get quite serious about digital preservation.
And I wanted to mention here that during the discussion at DOC about the working group and how we might best position it to do the work, we made a point of recognizing the importance of marrying information technology with library preservation activities. So you'll see the working group at the end, and there are IT professionals working alongside the library. And we put this working group together with a charge to do really a few things. Investigate internally: what are we doing as a system right now with regard to digital preservation? We wanted to develop a baseline to measure against, so we used OAIS as a baseline to develop a high-level overview and find out where our gaps were between what we were doing and what we should be doing. And then we took a look externally. We looked at a lot of external digital preservation providers and did a series of interviews with them, asking them a bunch of questions, and that's the genesis of our report. And so this is really the meat and potatoes of the report. This is the work that people did. The idea with the phase one report was really just to get a picture of the steady state in the system: what people were using, what activities they had, and what they didn't have. And I'll talk a little bit at the end about what we discovered, as will Mary and Edson as they talk through the process. I think we found lots of preservation-style activity, but not necessarily structured and organized in the way that we felt the system will ultimately need in order to preserve this content. So with that setup, let's start talking about phase one focus.
So as Todd laid out, we were going to focus on a couple of tasks as a working group. First, looking outward, we were going to talk to external digital preservation service providers.
We were then going to look inward, talking to the UCs, talking to our fellows and colleagues about their activities, either current or planned for the future. Then we were going to look across those two groups and see where we had some overlap and where we had gaps, try to come up with some best practices, and look at the areas we needed to develop. And then ultimately the goal was to develop a phase two charge that we would pass on to the next group. So we were really the starting point. We were laying the groundwork that we would then build on in subsequent groups. Our timeline was fairly aggressive. We kicked off in October 2018 and we were to deliver our report in April 2019. So we had about six months, which given the scope and everything was a bit aggressive, but we stuck to our guns and wanted to be fairly targeted. We held weekly meetings, we gathered data, we then analyzed our data, and then we wrote our report collaboratively online. Edson was our chair, so he kept us on task, which was great. And we shared data through a wiki and also a Slack channel. The scope was defined for us as digital assets that were owned or created by the University of California: digitized content, born-digital content, research data, publication data sets, and scholarly output. As for our methodology, we considered doing a survey. We thought that might be one way to gather the data, but felt that it was overly broad and that the answers are often too variable. We really wanted to target the groups that we were going to be talking to, which were a fairly small community. So we decided to do interviews: two members of the working group would sit down with each interviewee, and we did targeted interviews to try to get the really specific information that we wanted in this particular area.
So the people we interviewed you'll see here on the left. We ended up calling them the exemplars. Some of them were identified by the charge, but some of them we chose because they represented a nice cross-section of digital preservation service providers. We had vendors, we had academic institutions, for-profit, non-profit, independent. So we felt we had a nice cross-section of exemplars. And then of course we had all the UCs, the 11 groups; the California Digital Library, CDL, we consider one of the UCs. We later broke them out as a vendor because it seemed like they fit better there, and you'll see that in the report with the vendors. Unfortunately, when we were doing our interviewing process, DPN was actually starting to wind down, and so it was unfortunate to lose them as an exemplar, but we were able to interview them, and hearing their story and their process was very informative for our group. The group collaboratively designed a series of questions to pose to each of our interviewees, and the questionnaire covered the 14 different topics that you see here: organization, mission, business model, succession, etc. From those 14 topics we developed about 80 questions. So these were fairly in-depth and detailed interviews. The questions sought information about systems, requirements, compliance, quality assurance, methods, best practices, the reasoning behind decisions that people made, policies, both roadblocks and successes, and future plans. The first interview was held in November 2018 and they continued until February 2019. So we had four months of that process, for both the external and internal interviews. We held 22 targeted interviews. The interviews again were done by two members of the working group, one recording and one asking questions. We actually taped some of the interviews, which was really great. And then again the UCs were sort of self-identified.
I interviewed UC Berkeley, as I'm the representative for UC Berkeley, and then we had other teams who interviewed the other UCs. The interviews ran between 60 and 80 minutes. So again they were very detailed, very thorough. We used the same questionnaire for both. The questions were designed to be very detailed and specific, but we also left a lot of room for clarifying questions and flexibility in terms of going in different directions. But we tried to keep to a questionnaire script. So this is pretty, and this is not meant for you to read, because this is our working document. But one of my former employees, who now works at UCSF, created this color-coded questionnaire, and these are the answers, color coded by interviewee and by topic. What we did is we created one of these for every single one of the interviewees and then we used it to combine and analyze the data. And so while it was very pretty, it was also very useful in terms of color coding, because you could jump from one interview to another, see the answers quickly by color, and jump to where you wanted to be. So we color coded the 23 interviews into the sheets and we did some further analysis to distill the information down and again to observe the gaps, overlaps, and other issues that we wanted to raise up from the interview responses. And then we used these sheets to create what we call our matrices, which are the boiled-down version. And again these are not things that I need you to read here, because you can see them in the report, but this is really where we distilled those 80 questions and all the 22 interviews into these simple exemplar matrices. We did one for the UCs as well. So the response data is really simplified here, but again it tells the story of what we were hearing and lets us compare and analyze what we saw across the exemplars, and then we did the same for the UCs to look at where the UCs were in terms of all this.
Again, CDL ended up in the other matrix, so you'll see them positioned there. Through these matrices we identified the trends, we observed the gaps, and we also confirmed some of our assumptions about where we are in terms of digital preservation as a community, and those are detailed in the report. So we relied on the interview data to surface the key requirements from our charge: again, looking outward, looking inward, coming up with best practices, and looking at gaps and overlap. And those are detailed in the final report, and here is the pretty cover of that. You can see all the names; you'll see the names later. The final report was drafted collaboratively online by the working group, and it provides an analysis and detailed representation of what we found in our matrices. Edson is going to be discussing the report in a little bit more detail, but it really provides a high-level overview of current best practices for digital preservation, as well as outlining the key issues, building blocks, and lessons learned to be considered in developing a shared vision for digital preservation in the libraries. And with that I'm going to pass it off to Edson.
Alright, thank you Mary. So for those of you who aren't familiar with the University of California system, there are ten campuses spread throughout the state. Five are in the north and five are in the south, and the one thing that we have in common is that we're linked by the San Andreas Fault. At last check UC had over 500,000 faculty, staff, and students, so with so much distance between campuses and so many people involved, it's not surprising that individual campuses don't always know what the others are doing. Based on our interviews, we found that UC campuses fell into three broad tiers of digital preservation activity. There are two organizations at the top tier. The California Digital Library operates at the system-wide level and offers centralized services to all UC campuses.
With regard to digital preservation, CDL offers a full suite of technical services to the UC campuses. The CDL preservation component is Merritt, which is a robust, geo-diverse, well-architected storage system, and it is CoreTrustSeal certified. The other top tier digital preservation program is Chronopolis, which is run out of UC San Diego. Chronopolis is a dark archive offering bit-level preservation. It's TRAC certified and partners with several large institutions, including national labs and statewide digital libraries. Both of these can be considered exemplars within the UC system. At the other end of the spectrum are campuses that are consumers of services provided by the top tier. In the south both the Riverside and Irvine campuses fall into this category, and in the north the newest UC campus at Merced does as well. As a practical matter, these schools receive the bulk of their digital preservation services directly from CDL. They'll typically use CDL's DAMS and, as resources permit, use CDL's Merritt for digital preservation. What was really eye-opening for the committee, though, were the findings regarding what the schools in the middle tier were doing. These are some of UC's largest campuses, including the flagship schools in Los Angeles and Berkeley. In our interviews with ourselves, a number of common themes emerged from the middle tier campuses. First of all, digital preservation is largely still aspirational. No one's close to implementing practices that can be certified, and where preservation is practiced there are large gaps between current practices and those of the exemplars that we interviewed. Also, no one is working together. Each of the middle tier campuses has pretty much decided to pursue digital preservation on its own. The campuses have not been working in partnership with each other, although recently we've started to see some collaborative efforts.
Additionally, responsibilities for digital preservation are anything but uniform. Depending on where you are, the cognizant folks might be archivists, IT staff, research data librarians, or born-digital specialists. There are also lots of legacy systems that are going to need to be replaced and workflows that need to be standardized. This is expensive and difficult work; data migration is hard and it's not always a priority. Where efforts are in place, we are seeing that staff and economic resources are very limited. Digital preservation is almost always a side job for someone who has another role, and everyone's deeply concerned about the cost, especially for long-term preservation-grade storage. Digital preservation, and I'm going to say this, is a forever project, and that means forever cost. So it's only natural that this is going to meet resistance from those who are cost averse. Finally, at many campuses the de facto preservation system is in fact a DAMS with a RAID storage array augmented by a tape backup or some cloud storage. This is really an IT-first approach and doesn't even come close to the recommended standards for replication, fixity, and geographic diversity. In short, this isn't digital preservation. Some additional findings, in the area of technology: while there are still some unresolved technical problems in digital preservation to be addressed, for the UC system our challenges aren't really in technology. There's really nothing magical about having a preservation repository. All the salient issues have been understood and addressed for a generation. We have lots of tools to do the job and we have reference architectures to follow. Standards and policies are in place and are continuing to evolve as technology moves forward. Data integrity at rest is no longer a significant problem. We used to worry about spontaneous corruption of data at rest, but in practice, once content makes it to bit-level storage, we're not seeing problems.
Something interesting that we found is that most of the storage being used for digital preservation is local. Campuses are storing it on site. We're not sure why this is the case. Cloud storage is frequently less expensive and potentially more robust. We think that this could be inertia from a previous era. We are inherently conservative organizations, but in any case the lack of geodiversity here is a gap that needs to be addressed. And finally, the cost of storage: ongoing annual storage costs for digital preservation are high. Many folks see the list prices for storage and they just throw their hands up in the air. The good news, though, is that prices for cold storage in the cloud have gotten much more reasonable in the last few years, and this may change some perceptions of cost. Some other findings: we have determined that the OAIS model is sound. Everyone we spoke to was comfortable with it, both the reference and the functional models. No one told us we were on the wrong path, and it's pretty clear to us that the community is getting this right. Similarly, there's not a lot of debate on what constitutes best practices for digital preservation. As far as certification goes, there doesn't seem to be any doubt that a preservation repository needs to be certified. If it's worth building, it's definitely worth building to the community standard. On the flip side, though, certification is very time consuming and by extension very expensive. With regard to staff resources, in many cases staffs are small or composed of people who have other jobs. Outside of the two certified repositories, no one has digital preservation listed as their primary job. Staff limitations are in turn hurdles to adopting and following best practices. Many of the skills required to do digital preservation exist within the system, but they're spread out among the campuses and, as I said before, we're not really collaborating.
Finally, prioritization: many campuses frankly aren't interested in building their own preservation system, and that's okay. They're expensive and the ROI is really difficult to quantify. Instead we're seeing resources being directed to DAMS development. Now, our working group was supposed to do a survey and identify gaps, and not come to any conclusions. But we couldn't help but come to some. So for us the bottom line here is that technology should not be our focus. The gaps we need to address are in procedures, policies, and workflows. Also, system-wide resources are limited and they are not coordinated. And we agree it's fair to say that UC's best path is not going to lie with 11 separate certified repositories. Our challenge instead is to develop a well-articulated governance model for system-wide digital preservation services. And my final point here is that the path to success at the UC level is not to centralize but rather to collaborate. And Todd will wrap it all up for us.
So I'm going to just offer a little bit of thought on phase one and give you an overview of what we're doing in phase two. Now when we think about digital preservation as the collective "we", we think about it in terms of technology, personnel, and policy. And to do digital preservation well you need all three of those things to run together. One thing that was interesting is that we found nine of the ten campuses to some degree using the Merritt system from CDL, which is a trusted digital repository. It has all three of those elements. That was pretty good news. I think the challenge with the system is that, as Edson and Mary pointed out, there's not a lot of coordination. When we talk about digital preservation, sometimes in DOC, we often approach the same problem thinking about it in different ways, because it is a very complex and complicated thing, and we were just scratching the surface.
And I think without staff whose job it is to do digital preservation, it's also one of those things where it's really hard to get your hands on exactly what's going on and who's doing what. People are doing it, but maybe we don't know who to ask. So, phase two: here are the meat and potatoes of the charge. It's a way for us to start talking collaboratively about what's in the system and what needs to be taken care of, and about what sort of workflows we need in order to develop preservation programs that are scalable across the system. It's also a way to provide services, resources, and activities for people within the system who want to learn about digital preservation, so they can get some training on that activity. If we want to have a workforce that's trained for these future activities, it's important that we provide them with those services. So that's really part of what we're doing. We wanted to just recognize what an excellent working group we had, patting ourselves on the back. It was really a very good group. We did a lot of work very quickly, and I thought it was really high-quality work. The phase one report is linked here. For a kind of technical report it's actually a pretty good read. So if you're interested, it's not a bad place to start. And we will be carrying out other phases of this work and producing further reports as well. With that I think we'll take some questions.