Kia ora, my name is Tim Barnett. I'm in charge of digitisation and digitisation heritage and strategy at Auckland Libraries, and I'm going to talk to you today about Kura Heritage Collections Online: where we're up to and the journey of how we got there.
I want to start with a little bit of context. In 2010, the seven different councils in the greater Auckland region merged to create a single library system, so job number one for the libraries was to merge all of the online catalogues into a single catalogue, which they did. A huge job. Then in 2012, the libraries produced a strategic document that set out future directions. What they said was, over the course of the next ten years, the digital library is going to become the most important thing, and we want to see the library in every pocket by 2023. So that was our challenge.
We had a single catalogue but we did not have a single search across our heritage collections. So our task was to merge all of the heritage and research databases into a single collection, have a single search across all those resources and make the content discoverable on the web. One of the things you hear about Auckland Libraries' heritage collections all the time is that when people discover them, they always say, I had no idea that you had such great stuff. Our collections have been largely invisible unless you know about them already.
So the first thing was a discovery process. What do we have? We did an inventory of everything on all the hard drives. We interviewed librarians and we discovered we had 70 different data sets. We had almost two million records, which is about the same size as our catalogue, actually. The top five are there, and these are really the top five the librarians went to to help with their estimates. So to get the digital library in every pocket by 2023, we have the pillar that is the catalogue.
The second pillar is all of our research and heritage content. What about the online resources? We did a big inventory of what we had online and it looked like this. Actually there's more than this, but 22 databases at the central library, 10 West Auckland databases, that whole list there. The important ones are the last two. A lot of the manuscripts are in the catalogue, and it's a good place for them to be, but they're not particularly discoverable when you're doing a Google search. And there were a huge number of resources on the library website. When the library website was redesigned, a lot of those resources disappeared.
This is what it looks like if you're trying to find something. Really what you needed was a librarian to help you, and in many instances the librarians had their go-to resources and didn't necessarily know everything that was up there. So it was really confusing for our users, both behind the desk and in front of the desk. That's a little dizzying, and that's a little bit what it was like trying to figure out even what we had.
So what does the solution look like? First off, we made the declaration that we have one collection, so we wanted a search across that one collection. We weren't going to do an aggregator that did a federated search pulling results from all the different datasets. We needed to actually move everything into a new repository. We wanted a single system for managing all of those resources and for publishing them.
We also considered digital preservation and came to the conclusion that we would do the digital preservation piece of it separately. There are packages that provide collections management, publishing and digital preservation all in one, but typically they don't do all three of those things well. So we determined that we needed to do the DAMS, the digital asset management system, separately. We set out this roadmap and in retrospect we pretty much followed it.
There was a bunch of work to do up front to establish our requirements, and then quite a big bubble for procurement. A lot of work to do with design, metadata cleanup and so on, and then we were going to do this in a phased way. We weren't going to try and move everything all at once because of the scale of the undertaking, and that's pretty much what we ended up doing.
So the first thing was the selection process. We had some fundamental criteria. It needed to be scalable, something that could handle two million records, and that's growing all the time. We looked at what everyone else in the country was using, because we would love to have used what the other libraries around New Zealand are using. There's so much movement of staff between institutions and a lot of learning that goes on, and we wanted to leverage that. What it turned out to be is that the field is hugely fragmented. There really isn't one collection management system that everyone is using. So we had to give up on the hope that we could use what everyone else is using.
We also had to meet the needs of our stakeholders. Collection specialists in particular are charged with a huge responsibility of looking after these collections, which are enormously valuable. We had to make sure that they could do their job, that whatever we gave them as a collection management system would work for them. For the metadata specialists, we were looking at interoperability between what's in the catalogue and what's in the heritage resources. The research librarians are helping the customers, so it had to meet their needs. The web team needed it for publishing: all those 22, or however many, databases that we have online did not have really good publishing capability. And IT had to be happy with it.
So we pulled together all the stakeholders and we received presentations from our existing vendors and a couple of big players in the field. And this is just to give you an idea of what we did.
These were our selection criteria. The top one in yellow there is looking at the platform itself. The middle one is looking at what the collection specialists need to manage the collections. And the bottom one is the publishing one. Each of the criteria was weighted. So if you look at the bottom one, search is top. Shopping cart, don't really care.
As we received each of the presentations from the vendors, we rated them, got together afterwards and had a debate about what we thought. None of this of course is empirical. It kind of looks like it is, but it's not. People are going on their gut feeling, but what it does help with is framing the conversation, so as much as possible you're comparing apples to apples. What we found when we went through this process was that there was pretty much one clear leader. There was quite a bit of consensus around which one came out on top. And for us that was OCLC CONTENTdm. It's very library friendly. It's very scalable. It's customizable. We can do what we need to do with it.
So now we'd made our decision, and we switched to implementation. We had to create a new schema. We had to create new metadata standards. We couldn't rely entirely on metadata standards that are, for example, reliant on Library of Congress. It had to be Google friendly, which is a little different from finding out where on the tree of knowledge something sits, where it sits in the Library of Congress arrangement of the information in the world. It really needs to be Google friendly. So that meant a bit of a change of thinking for some of our collection specialists.
We revised our rights management approach. Rather than making a determination on what the legal position is for each one of those records, we're trying to tell the users what they can do with this stuff. So we ended up with three rights management categories, basically.
Either there's no known copyright, or, if it's created by the libraries, it's Creative Commons. About half of our stuff is created in-house, so that's great. Or there is some rights restriction, and for those ones you need to check with us.
We decided we would do minimal cleanup, rather than trying to standardize the metadata across all 1.7 million records. Too big of a job. We would just do the piece that CONTENTdm needed, which was essentially the title and the dates. Except that as soon as you start looking at data, you can't help yourself, and you end up in a deep, dark trench before too long. It's hard to pull out of that and live with the fact that close enough is good enough for this point in the process, which really had to do with being okay with interim records. Some of our collection specialists want really polished stuff that's really perfect online. They don't want to see an interim record up there because it doesn't look good. But with a project of this scale, everybody had to get comfortable with having interim records up there. And it works really well. There's really no problem with it. We're enriching the metadata all the time.
Another issue was finding all of the image files. Over 20 to 25 years of digitization, the image files we were dealing with were all over the place, particularly given that we're coming from what were four main collecting archives that had completely different protocols and different ways of doing things, and merging all of that. Just getting the files sorted out was a big job.
This is just one example of that. All of these images were originally digitized using the Kodak Photo CD system, so they were on Photo CD disks. The way that worked is there's a unique barcode on every disk, and then all the files are named one through 25, one through 30, whatever. So it meant we had hundreds and hundreds of files with identical file names.
Because when all that data was moved off those disks onto hard drives, the files lost the barcode that made them unique. So the way it works now is, if you're in Heritage Images, you've got to hover over the image and look at the URL that pops up, and that tells you what folder that thing is in. It's the only way to find it. We renamed all of these using Visual Basic, naming the files to match essentially the accession number. So that's an example of how this kind of cleanup is helping us move further down the track with our digital preservation work, because we need to get the file names straight as job number one for digital preservation. So although we've put that off to one side, we're working along with it. That was a really good problem to solve.
We needed a really good name, so we gave this task to the Māori specialists from the libraries. They came up with some suggestions and it got narrowed down to Kura Heritage Collections Online. Kura means treasures, valued possessions. It has the connotation of intrinsic value, the worth we associate with an heirloom. I know for most of us, probably, we hear kura and we think school. So a place of learning is not a bad connotation for our online repository as well. And it's worked really well. Pretty much now it's become Kura, just like that. And it's pretty Google friendly. There aren't that many things that come up when you search for Kura online. It's in the top three, I think.
There was quite a big design process. Our web team led the design. This is an example of a record, sort of before and after, where you can see how this record looked in the catalogue on the left. By the time we move it over to CONTENTdm, of course we've got all the images. We've got all the text in there. This one happens to have over 13,000 signatures in it. We transcribed all of those. All searchable. The difference couldn't be more stark between two records like this.
Another great thing for us with this platform, and actually with a lot of platforms because it's web based, but one thing we didn't anticipate that we can now do, is that we've got these language facets. This is simply a field in the database where we put in what language it is. It's English by default, so we haven't included that. We decided to use the original script for our Arabic and Persian and for Chinese. If you search using that script, these records will surface. So someone on the other side of the world can be using their own keyboard searching for this content, and our stuff will surface for them. And in many instances these are unique. No one else has a copy of this. So that was a big plus. Similarly with Chinese characters, if someone's doing a search somewhere on the planet using Chinese characters, this material will surface. That was not the case in our previous platform.
Going live. In terms of how we migrated data, we wanted to shut down a lot of the platforms that were hosted elsewhere, so we did those first, and we determined what our minimum viable product was. We ended up with 650,000 records that we went live with in January. It's about 40% of the total. Since then we've added about another 100,000. And that work is carrying on.
In terms of the response, there's been a really good shift in the way we work internally. It really started with the selection process. Having everybody around the table involved in the selection process, people were invited to consider the whole instead of just their piece of it. And now our teams are working across collections, with metadata schemas that are shared and so on. We've got a much better way of working. We also have a governance structure that reflects that. So it's no longer owned by one unit in Auckland Libraries.
The governance structure runs across units, because the collection specialists are involved, the metadata guys are involved, content and access and the web team are involved.
In terms of the users, there's been a huge amount of really positive feedback. We've got feedback at the record level. When we went live, a lot of that material had of course been in our pre-existing databases, but it was being responded to as if it had never been online before. Which told us that our original analysis was correct. Our original databases weren't surfacing the content, and now it is surfacing. So we're really shifting from discoverability to engagement, is what we're seeing. There's a lot of dialogue going on, which is really great.
So where do we go from here? We're going to continue migrating our records. We've probably got another 18 months, I would say, fingers crossed, to get the rest of the data migrated. At the same time, we're digitizing new content all the time. So we've got those two work streams. We're refining the web interface as we learn more about how it's working, and we're going to continue to refine that.
The one thing that it's giving us is really good data. We're seeing what people are searching for, and that is going to actually influence what we collect. A simple example recently is that we noticed that there were a lot of searches for Ihumātao. Now, we've got resources in our collections that show that part of Māngere, but they weren't tagged. So when people were searching for Ihumātao, they weren't getting any hits. We've gone and tagged those, and now they will. So that's an example of how the analytics are helping us with our metadata. But we'll also see what is getting a lot of usage, and that will direct our collection strategies too. Which was somewhat unexpected.
So what we really have now is a single collection that is accessible from anywhere. To put a little bit of context in there:
Only about 10% of the materials in our heritage collections are described at the record level. And we've digitized only about 5% of our photographic collections. That linear metres measure of about 2,000 is probably conservative, I would say. So there's plenty more where this came from. There's a lot more to go. But that is our story so far. So jump on, have a look, play around. I've put some bookmarks up the back that have got the URL on them if you want to grab one of those. And I'm happy to take questions. This has been a lightning visit across the landscape.
Just wondering whether you had a crowdsourcing function on that?
Yeah, if you jump down to the record level you've got a box that says tell us more about this record. And we can use that for a lot of things. One of the things we'd like to be able to do is to set up a project where people can tag their iwi. We've got a lot of Māori resources on there, but they're not necessarily unified by their iwi and their hapū affiliation. People can put that in there and let us know. We can tag it so that when someone's doing whakapapa research, that stuff will come up. We're actually getting quite a lot of feedback, not unlike the feedback that we get on our stuff in Digital New Zealand. This is my uncle Bob, not my uncle John. So yeah, we're getting that stuff. A lot of it.
So you work with that, because sometimes some of the records have missed out on the names of people. There could be multiple versions of the same name.
That's right. This is why one of our criteria was something that was library friendly. It comes with Māori subject terms built in and all kinds of authority lists you can select. You can make your own. We didn't have the luxury of cleaning everything up at the record level, so it went in as it is and we're cleaning up as we go. It's interesting that when we set up a new platform, mistakes that we didn't see in the old one jump out, because it's much more visible.
And people are telling us. So that cleanup is ongoing, basically all the time.
Kia ora, I used to work at Auckland Libraries and I used to manage a lot of those DVTs and wrangle the volunteers from HG who used to do a lot of the indexing. Yeah, kudos, because there were so many of them, and I can imagine the disparity of data. On what my colleague Fiona was just alluding to, they transcribed them directly, sort of thing, in many cases. So to alter them based on what we know their spelling was is contentious sometimes as well. But to get to the manuscripts and things like that, because I know that's what you're probably alluding to as far as not described to item level: when I was there we used to just describe the collection and then create a PDF, which was the finding aid, and attach that. And that had a huge amount of data. The manuscript collection is phenomenal there, and that is my concern, because a lot of the other ones are just indexes and passenger lists and things like that. They're individual, but the manuscripts are where the true richness of the heritage collections is, I think anyway. And it's a pity it isn't described to that level. I'm interested that in the example you showed, you didn't show hierarchical description at all, and I just wanted to ask a quick question about whether the CMS can handle that, because that is probably how you would surface those manuscripts.
Yeah, you can do a sort of parent-child relationship between a record and its components, to two levels. So it's not really that well suited to a parent-child relationship thing, but you can do it. For those collections and manuscripts that are not described at the item level, we put in a collection record, and all that great work that was done creating those finding aids goes into the collection record.
So that stuff will show up if you're doing a search, and eventually we'll get all of the elements of that in, or we can digitise and get that stuff added.
So you mentioned how you made the decision to just transfer certain fields, and that it's difficult to avoid the temptation of making things as perfect as you can. Do you have any tips for managing people who would like things to be perfect and struggle with that?
You know, that was the great thing about our process of having everybody around the table, everybody involved in the decisions, and not just the selection of the platform but the decisions all along the way. So changing the way we worked was actually one of the great outcomes of this project. We were really working differently. And a lot of the decisions we make of course are pragmatic. There are some people for whom perfection is their thing, and they actually have the licence to go in and edit records to their heart's content. So for some of our librarians, the way it works is they have access to edit them, and then I look in the evening and see what needs to be indexed. And there are some of our very knowledgeable librarians who have edited this many records in a day because the full stop wasn't right or the comma wasn't right. And that's totally fine. They can knock themselves out. And it's great to see them doing that. In a way, the platform is allowing people to do the job that they feel it's their duty to do. They've got this drive to get it right, and fantastic, it lets them do that. So yeah, it's working really well.
Tim, that was great. Speaking as an old collection manager, you've made so many decisions there to get to this point. I can just imagine the discussion about some of those things you've done, my goodness. And merging so many databases together is always a challenge. So brilliant. Well done. Thank you.