Good morning. We're doing things in reverse. Jay was talking about doing; I'm talking about planning to do. So we're reverse engineering the way we should actually be doing this, probably. I'm going to talk to you a little bit about our journey to having a digital preservation framework. At Victoria University of Wellington Library, we've recently adopted a digital preservation framework that I wrote. This framework is what I consider to be a vital part of our policy environment, and it will underpin our digitisation programme, our digital archiving programme and the management and delivery of our digital assets.
We knew we wanted a digital preservation framework. We had digital objects, and I needed to figure out what to write in a policy document about what we wanted to do. I knew that what I wanted was to create a policy framework (I couldn't always call it a policy framework; a strategy, something like that) that we could use immediately and start from, but that wouldn't answer all the questions. A lot of the questions Jay was asking I didn't want to answer in the framework itself, but I wanted a framework that would give us steerage when we actually came to answering those questions.
So where did I have to start? My presentation is going to wobble a little bit into me discovering what digital preservation is, and then into what I ended up writing into the framework. I had to figure out what digital preservation was, and I was going to have to sell it to the wider environment. So here's an example of some of the digital objects we have in the library that we need to preserve. So where did I start? For me, digital preservation actually begins way back in the past, back in the long days of antiquity, as it were.
Because digital preservation, while it's a new thing, is part of an old story, and that story is about sharing stories and content. Back then people wanted to share things, and how did they share them? They shared by talking; this is before writing. They put pictures onto walls, as it were. Access is going to be of very big importance in digital preservation: it's not just preserving for preservation's sake, it's preserving so that you can access the content. Back then, access to the stories was limited to being able to listen to someone who knew the story, or visiting the cave and being able to interpret the picture. And so many stories were passed along, and many were lost.
Later on writing was invented, and images moved from paintings on walls to carved stone, to clay, to papyrus and then later to paper. The story was preserved as early people collected and stored these stories in early libraries. Digital preservation actually sort of starts here, because it takes from here not only the idea of preserving and accessing the story but also the beginnings of many of the concepts that are central to libraries and archives: description and discovery, which are important for digital preservation as well.
As we go on in this old, ancient narrative, there was... I've just lost my thought. Sorry. As we move on, we come to another very important part of the narrative. In the Western world in 1450 Gutenberg invented the printing press, while in the East Bi Sheng invented printing with movable type around 400 years earlier, in 1041. In the West the printing press had a profound impact on the preservation of and access to the story, so much so that Gutenberg is credited as the father of the printing press despite Bi Sheng's invention 400 years prior. Many books could then be printed and accessed.
Our stories were spread more widely, but our ideas of how we accessed and preserved those stories were also being developed. Those works were stored and accessed in many more libraries and archives. Many can still be accessed today, although you might have to learn a new language to read them. This goes back to what Jay was doing. Even if it was originally written in English, you might not actually be able to read it, because Old English is different from modern English. Which is where we link back to digital preservation, and a key question is raised: can we access the content even if we can access the artefact? With the written word, some stories have journeyed from the earliest days and been transmitted and translated to allow access for a new generation. In the digital world, we need to start asking: will technology be able to read the contents of the file? How can we make sure of that?
The narrative continues and we come to the 20th century, in which technology allows transmission of ideas further and more easily than before. Here we've got a screenshot from the NZETC, where we have the official war history of the Wellington Mounted Rifles Regiment, 1914-1919. Originally it was printed on paper. It's now been preserved digitally, as it were, and made into a digital artefact which we give to you all on the internet through the NZETC. But we then have to figure out how we preserve that artefact and how we ensure that its content carries on into the future.
So that's when we came to developing the digital preservation framework. This is what we said: we have artefacts like this. We want to ensure that in 100 years (I always talk about 2114, 100 years from now) the person in my job at the university, and hopefully the job's going to be there, hopefully somebody's going to be doing my job and thinking that it's important, how are they going to ensure that the things we've created now are still accessible?
So that's what we wanted to do in the digital preservation area: make sure that these objects continue to be accessible, and also make sure that people know what has been going on. The core part of the digital preservation framework that I wrote contains 14 principles, which I'm going to work through now for you. Some of them may be relevant to you, some are a little bit specific to us, but in general they give you a flavour of what we're trying to achieve.
The first principle, of course, is that digital preservation is a critical library process. I would put that in there. I think it's pretty essential, and that's probably one of the big sells. Hopefully I've sold it well enough to the rest of the library, and I think there's a general acceptance now that it is, because if we're going to continue to provide access to our unique digital objects we need to preserve them. It's chicken and egg; self-explanatory in some ways, but also a little bit not so.
The second principle is that the library adopts a three-level view of digital preservation: primary preservation, secondary preservation and descriptive preservation. This was my little way of encapsulating what I thought it was all about. Some of you may disagree, but this is what I've sold to the library. The first level, primary preservation, is the most basic and easily understood. This is where you back your files up. Simple as that: you're ensuring you've got multiple copies.
The second, however, is a lot more difficult to come to grips with, and again it's a bit of what Jay was doing. This is where we look at a digital object and start to ask about having systems to check that the files stay uncorrupted. Just because it's sitting on a server doesn't mean that random little bits don't fall off. We look at systems to check file obsolescence. Is it in Word 1?
When we've got Word 10 now, can Word 10 read Word 1? If it can't, what do we do? So this is all about putting systems in place and starting the conversations, so that we can identify objects and say: we have this collection of objects, how do we save it, how do we make sure that it continues to be accessible? When we've put things up on the internet, how do we ensure that they stay uncorrupted? How do we preserve the integrity of those objects?
The last level is the bit I get quite passionate about, which is descriptive preservation. This is where we make sure we can tell the story of the object. It's all very well having an object there and knowing that you can access it. But in a hundred years, the researcher and the person who's curating it also need to know why that object is there, what rights were around that object originally, what rights have been transformed, and what decisions have been made to make it more accessible. So if it was originally in one sort of proprietary format, and you had to transform it once that became obsolete, and then you transformed it again, you have the original and then the later access copy, and you need the path that led from one to the other. To me, telling the story of the object is as integral as the technical bits of preserving the actual thing so that you can open it. So we not only want to collect descriptive metadata for the object, but also the technical metadata, and then the metadata around the metadata. That's the bit that I get really passionate about.
The third principle is probably a little bit library-centric for us. We want to minimise the number of systems and environments we maintain. We're a small team. We've only got so many resources. So we want to minimise our application creep to maximise our development potential. Also, the more systems we have in place, the more complex preservation is going to be.
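As a rough illustration of the secondary-preservation idea described above, checking that files stay uncorrupted, here is a minimal Python sketch of a checksum (fixity) check. It is not taken from the framework, which deliberately leaves this level of detail out; the function names, the manifest layout and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Record a checksum for every file under root (hypothetical layout)."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict) -> dict:
    """Compare the files on disk against a previously stored manifest."""
    report = {"ok": [], "changed": [], "missing": []}
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file():
            report["missing"].append(rel)
        elif sha256_of(target) != expected:
            report["changed"].append(rel)
        else:
            report["ok"].append(rel)
    return report


if __name__ == "__main__":
    # Demonstration on throwaway files; the file names are made up.
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp)
        (store / "thesis.pdf").write_bytes(b"original bytes")
        manifest = build_manifest(store)
        # Simulate silent corruption of the stored file.
        (store / "thesis.pdf").write_bytes(b"original bytez")
        print(verify_manifest(store, manifest))
```

In practice the manifest would be kept somewhere separate from the files it describes and the check run on a schedule, so that a changed or missing file can be restored from one of the backup copies that primary preservation provides.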
Our fourth one was that each new proposal for preservation should be considered against the existing framework solutions, and new options should be explored. So while we want to restrict the number of solutions and the complexity of our environment, we still need to be flexible in our approach and look at new technologies. With anything digital, really, not just digital preservation, one size doesn't fit all, and we need to keep that in mind.
The next principle: the preservation of metadata around the digital objects is as important as the preservation of the digital objects themselves. As you can tell, that's pretty important to me, so I managed to include it in there. This is really trying to emphasise that the narrative around the stories is just as important as the technical ability to do the preservation.
The sixth principle is again very much an internally focused one, which is internal digital preservation as the preferred option. This doesn't preclude using external vendors for digital preservation. Indeed, having a tertiary, external digital preservation service can often be advantageous. The more places things are stored, the better, as it were. Rather, this is saying that in the first instance we want our digital preservation done in-house, properly, and that needs to be done first, so that we have surety that when we've got something in-house, we're dealing with it the way we want to deal with it.
The seventh is that, where possible, current digital objects can be considered archival copies and access copies. This is again an internal one, where we're saying: we've done things in the past, we can't second-guess that, we need to move forward. So we're going to draw a line around what we've got and ask, is this archival or access, can it be? Yes, probably it can. We'll just say that this is now it, and we can move on from that.
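To make the point about metadata being as important as the object a little more concrete, here is a small Python sketch of how the story of an object (what was done to it, when and why) might be recorded alongside it. This is an illustration only: the class names, field names and sample values are hypothetical and are not drawn from the framework or from any metadata standard.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Event:
    """One step in an object's history (all field names are hypothetical)."""
    date: str    # ISO 8601 date, so plain string sorting is chronological
    action: str  # e.g. "digitised", "migrated"
    fmt: str     # file format that resulted from the action
    reason: str  # why the decision was made


@dataclass
class ObjectRecord:
    """Descriptive details plus the running history of one digital object."""
    identifier: str
    title: str
    rights: str
    events: list = field(default_factory=list)

    def record(self, date: str, action: str, fmt: str, reason: str) -> None:
        self.events.append(Event(date, action, fmt, reason))

    def history(self) -> list:
        """Events in date order, regardless of the order they were entered."""
        return sorted(self.events, key=lambda e: e.date)

    def current_format(self):
        ordered = self.history()
        return ordered[-1].fmt if ordered else None

    def story(self) -> list:
        return [f"{e.date}: {e.action} -> {e.fmt} ({e.reason})" for e in self.history()]

    def to_json(self) -> str:
        """Serialise the record so it can be stored next to the object."""
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    rec = ObjectRecord("demo-001", "Example regimental history", "Out of copyright")
    rec.record("2009-02-10", "digitised", "TIFF", "archival master from the print copy")
    rec.record("2014-06-01", "derived", "JPEG", "access copy for web delivery")
    print("\n".join(rec.story()))
```

Keeping the record as plain JSON next to the object is just one possible design; the point of the sketch is that each transformation carries its date and its reason, so a later custodian can reconstruct the path from the original to the current access copy.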
One of the unforeseen things that fall out of doing digital preservation is that we've actually started to look at the way we digitise things. When we start to ask what's an archive copy, what's an access copy, and how are we going to preserve this for the future needs of our researchers, we have to ensure that when we digitise something, we're digitising it so that it meets those needs. We said that the library will implement best practice standards for archival and access copies for all digital objects.
This is one we've actually already implemented. Previously we may have thought, well, we can just digitise to JPEG at a certain level, because that will create easy access. We're now changing all of our digitisation to best practice, like TIFFs for images at the highest resolution, and we're preserving those as archival copies. We keep the archival copy, and then we can make access copies from it and do all sorts of different things with them. Where we haven't done that previously, we may not be able to meet the future needs of our researchers, because we may not have a best-quality archival copy of an image we digitised earlier. So this has had an immediate impact on how we actually act in our digitisation.
They're sort of both related. We wanted one, and the library will implement it, and we will do it. So there we go. We're going to have the best practice archival copy and the best practice access copy. And that flowed on to principle 10, where we will endeavour to capture the best quality standard metadata as well, both technical and descriptive, when we're creating new digital objects and accepting the deposit of digital objects.
So we're going to start asking our depositors, for our theses for example, to ensure that they put the best quality metadata they can into their PDFs and so on, so that we can then bring that out. And we're putting in the best quality metadata when we capture things internally, so that we can tell that story, we know what happened when and where, and later on we can come back and see what happened when and where.
The 11th. This is where we start to get to the nitty-gritty of secondary preservation, and this basically allows us to do what Jay was doing. I said we wanted something that we could implement immediately without getting bogged down in the details. Eleven is where we say we are going to get bogged down in the details, and we refer back to 11 to do it. So one of our big focuses in the future is going to be establishing procedures around quality control of existing digital objects and looking at what we do. That's 11.
Principle 12 is that emulation should be considered on a case-by-case basis. This relates back to the ability to access access copies. How do you access programs? How do you access things that were created in a specific environment? It's very...
Thirteen: the library maintains a registry of collections and a related knowledge base. This one is, of course, really close to my little narrative heart, as you can tell from how I bang on about the story of the object. This is about creating the ability for researchers or administrators to go in, look at an object and say: it was digitised then, this is the story of how it was digitised, why it was digitised, what rights were there, what programs were used, what transformations it may have had. It's very, very close to what I want to do.
Lastly, and certainly not least: future planning is central to the process of digital preservation.
I've heard some people quote somebody, and I won't say who it was, who thought that digital preservation had been done. It was all good. I cannot agree with that. Digital preservation is never done. It will always be a problem as new technologies come about, as new things happen, as things become obsolete, as new ways of storing come about. So one of the central things in digital preservation is future planning. You're trying to second-guess the future, and you can't. So you try to create an environment that is flexible and adaptive and as future-focused as possible, to try to meet those needs.
So all these principles come together to frame how we in the library will approach digital preservation. We're starting to ensure that our digitisation programme meets best archival practice, and that as we digitise we are digitising not only for access now but for access in the future. We are looking at how best to develop systems and processes that will ensure the integrity of the work, so that future custodians will have the surety of knowing not only that the resources are accessible but also the story of how and why those resources were kept. Those future custodians will see that we've sought to preserve not just for preservation's sake but for future access, and that we've done the best we can to ensure that those unknowable future research needs can be met.
In conclusion, I hope you can see that we've sought to balance strategy with pragmatism in establishing these principles, and immediate action with governance, to create a framework that enables us to start work on digital preservation immediately and gives us steerage for the future. And that's my presentation. Thank you very much.
I've had some questions, but I've been asked to use the microphone, so pass it around.
Hi, Michael at Sherry from AUT.
I'm interested to know if you looked at other frameworks or other principles, or did you come up with this yourselves?
I spent six months reading. So I looked at other principles, I looked at other strategic frameworks, I borrowed, I looked at the National Library's one, I looked at various ones from overseas, and then I drafted and redrafted and went through a long process of negotiating with other staff, as they wanted to put in what they wanted, and of framing it so that the whole library could accept that what I was saying was meaningful and was what we as a library wanted to progress with. So yes, I did look at other ones.
Joanna, Canterbury Museum. Do you also have a policy that gives you guidelines on what types of digital material to collect, or are you only concerned with the digital material you already have in your possession?
There's a collection policy which looks at what we collect, which basically says we collect anything that's relevant to the subject area. For the IR, we try to collect in PDF but then collect all the associated materials in their original form. For the purposes of collection, you collect what you collect, what people want to donate and what's coming in. We're looking at how we can ask people, when they deposit their thesis, to give us PDF/A and then PDFs, and whether we can get an archival and an access copy from them. So we're looking at that sort of thing. But basically there's a collection policy which governs what is collected; this is about what we preserve and how we preserve it.
Hi, Kirstie at Turnbull Library.
I was just interested, and it's a bit of a follow-on from what Joanna just talked about, in accepting digital deposits. For us, we don't really have any control over what comes in until it lands on our desk. You were talking about PDFs and so forth, but that's just a piece; I imagine you collect a lot more than that, or would like to collect a lot more than that, in the digital sphere, and it's very hard. We decided not to put those kinds of limitations on the types of digital objects we receive, so as a result we do receive quite a fair array. But it's something that, once you start doing it, you really realise the whole game you're in.
Yes, no, that's a fair point. We know that we can't tell people, especially academics and students, what to give us. We can give them guidelines and best practice, but in the end we have to accept what they give us. We are developing, for the IR itself, a self-deposit system where they can deposit digitally online, and part of how we're doing that is that for the thesis itself they can only deposit a PDF, but all the supplementary material can be in any format. Through that we can exercise a little bit of control over what we get in the form of the thesis itself, but we can't really control anything else. The best we can do is give really clear guidelines and say: look, this is what we want to do, this is why we're doing it, please listen.
I was interested in where you're saving the preservation copies. Do you state in your policy that they're stored here or off-site, or...
No, it's not actually stated in there, because I didn't want to get down to that particular level. That's the nitty-gritty that we need to do. But part of the on-flow from creating this is that we're now looking at a DOMS or DAMS, however you want to put it, a digital object management system, which will be the centrepiece of us actually implementing digital preservation in action. So that flows from it, but it's not articulated in it. Does that make sense?
Well, just to clarify, it's stored on your servers at Vic?
At the moment, yes.
And no intention of having it somewhere else, protected?
That is part of the future planning from it. In the first instance we want to make sure that we're preserving internally, and that's part of creating a DOMS or DAMS. After we've done that, we will then also look at what we can do about tertiary storage, going out and storing it externally as well. It's a long, convoluted path to start on, so for us, and for what we could achieve, it's about starting internally first and then moving outwards. Because of course if you try to store it externally there are costs involved, and then you have to talk about budgetary things, and that gets complex quickly.
Thanks.
Would it be correct to assume that all members of CONZUL are applying the same principles?
I could not possibly comment.
Just wondered.
I would like to be able to say yes.
Hi, Sam from National Library. I'm just following on from that, because one of the principles was that you want to do it yourselves, and I was wondering what work you're doing with CONZUL or with National Library, because it seems to me that this is something you should be trying to do together rather than by yourselves. I'd like your thoughts on that.
In a perfect world it would be really nice to be able to do everything externally, in partnership with other institutions. How can I say that without getting in trouble?
I don't want to get in trouble. So basically, every institution has its own drivers and its own people in front and on top, and they all have their own priorities. So the danger with trying to do a big joint one in the first instance is that other institutions might have different drivers and different pressures on them, which may mean that it becomes a much longer project than it needs to be, or you may not get what you actually need. What I was trying to do with saying internal first is that if we have surety and knowledge of how we're handling it internally, then later on, when something comes up and we want to look at external things, we know we've got a fallback position. If we don't have a fallback position and we rely on external factors, we could go along a path and spend 24 months on a project that doesn't go anywhere, and we're back to square one. So if we start ourselves, internally, and get that right first, then we can move forward. I'm all about sorting ourselves out first and making sure we're doing properly what we want to do, so that afterwards we can say, let's go and do other things, explore other options and see what exciting things we can do. We have talked to the National Library about doing some pilots for their digitisation for some of the material on the NZETC, so we have had talks about that.
Never had to use one before in my life. What I find interesting is that when you're talking about your IR, is it not purely for your students to submit theses and for your lecturers to submit the articles and so on that they're publishing? Because that's how our IR works at Waikato. It's not for having other things submitted that we may feel we would like to preserve.
Well, we want to preserve everything in the IR, because it's integral, it's archival, and it's an essential part of what we do. The IR isn't just a holding place for theses. For some people it is, but what we want to come into the IR is every research output from the university; ideally that's what we want in there. And I talk about the IR in all of this because to me it's a digital object. We want to make sure that we're treating all of our digital objects that are archival, and that we curate, the same way. So when I talk about some of the other stuff as well, I'm talking about the whole lot holistically. Does that make sense?
I just wonder, because one of the things we're in discussions about at the moment is that we have a lot of older, more archival content that we want to digitise, predominantly audio at this stage, but we want to keep it completely separate from our IR because of the policies that surround it. The IR is not actually the appropriate place for it to go, because it's not hooked into research or into the PBRF. So we are looking at setting up something separate where you don't have those restrictions, and that's why I find it interesting when you're talking about the IR, because I thought they had similar restrictions in place for theirs.
Well, when I talk about a DOMS or DAMS for our digital objects, the way I see it is that I want to create one place where we store all of our digital objects and treat them with the same digital preservation systems. If we've done that properly, we can then create an IR instance outside of it. So if we have everything, our audio, other research outputs, our NZETC, the IR content, in a single digital object management place, we can apply a consistent level of metadata across it and a consistent level of digital preservation practice across it. And then we can create a separate thing to come in, look at that, say these are the objects in there that make up our IR, and expose them. That's also trying to minimise the number of systems we have in place, because at the moment we have three or four, or five or six, different systems doing the same thing differently. So that's part of the framework, to try to do that.