 Right, well, good afternoon. Kia ora tatou. Yes, Adam should have been here, and about three weeks ago I got the last minute opportunity to go on the museum leadership programme at Macquarie University in Sydney, and I went to him and said, look, it's looking like I might not get to NDF. And he goes, hmm, I think I've got a better excuse. So, I had a text an hour ago and baby is nearly here. The main reason I wish Adam was here is that this is the project that he has been putting his heart and soul into for the last two and a half years, and I know he was really looking forward to being here to share it this afternoon. So I'm going to do my best to present what we were going to do together. He did also say to me on Friday, I wish I'd thought about this 40 weeks ago. So Auckland Museum, just by way of the piece of introduction I need to give you: this is more about the variety of the collections. It's a reasonably encyclopedic museum. We sometimes say that we're an A to Z museum with a few letters missing. Over the last 20 years at least, the collections have been increasingly made digital, put into databases, with digital images added, but even after that 20 years' work we still have only one-third of the collection catalogued in an electronic form, and only one-tenth of that third has a digital image. So we've got quite a long way to go. But the point about this slide is the variety of types of data that we're dealing with. In 2012 we published this document which is our high level strategic plan to guide the processes that we're going through to renew our institution. It sets out some values around how we place the collection at the very centre of everything we do and treat audience engagement as the primary thing that comes from that. 
It also expresses that the museum building is just one of the ways that we're going to achieve our aims, and the mantra of everything we do having a component of being on-site and off-site and online is something that now just rolls off the tongue for most museum staff. Every project that we begin, what's the off-site component? What's the online component? What's the on-site component? In terms of online collection access, what we've had to date was the system we knew in-house as MUSE. It was launched in 2007. These were really just curated subsets of the collection. Each department had its own small search page. In some cases there were only perhaps 100 collection items per search. Data was not updated regularly. The only database that was really kept up-to-date was the library catalogue. The system suffered from a bit of a build-and-forget phenomenon. A whole lot of money was spent in that period, but then the focus went elsewhere and there wasn't effort and money put into keeping it up-to-date and keeping it developed. We needed to do something about this as guided by our future museum aspirations. We believe, like I presume most people in this room, that museum and library business is in a state of continual change and that in order to both look after the heritage that we have inherited and make sure that it continues to be relevant for new audiences, we had to do something quite different. So in redeveloping the Collections Online project, we had two guiding principles that we started with. That is that we are open as a rule and closed by exception and that we are one collection, not many. Now, Courtney, you talked about digital brand and I was actually thinking that these two phrases are probably our digital brand, and I'm going to take that away and think about how we apply it in fact to the onsite experience, so I think we can use that usefully. 
Now, this might not be too revolutionary, but for an organisation that is 163 years old that has certain ways of doing things and certain inheritances through staff passing on practice from one to the next, it has required quite a bit of rethinking and change. One of the things that we looked at was to start using linked open data to make true those things about being open and being a single collection, not discrete units. We wanted an environment that would encourage new forms of searching and new connections between collections. The museum, as the first slide showed, has 12 different collecting areas and 12 different data standards, and what I came to realise was that in each pocket those data standards are right. The botanists describe things in certain ways because they need to conform to international standards and maintain connectivity, but that doesn't connect sideways across other collections in the museum. So, perhaps like some in the room, I remember sitting in meetings when I worked at the National Library where there was much tutting about multiple data standards, but in fact they come from a good place and what we have to get cleverer at is joining them up, not destroying the reason they existed in the first place, but building the bridges between them. That's what the linked open data schema allowed us to have a look at. What it's based on is breaking down every single piece of information into its smallest bits, these things called triples. So it's a subject, a predicate and an object, so a thing, a relationship and another thing, a concept, a relationship and another concept. What that allows us to do is free things from the original context and recombine them in new contexts but not forget where they've come from. So perhaps an example of the kind of things that we're trying to say. As well as breaking things up, it also allows the system to start building inferences between pieces of information. For example, I was born in 1967. 
Decimal currency came into existence in New Zealand in 1967. Those are two separate pieces of information, but they've got something in common and the system can infer then, without having to write it down, that I was born in the year that decimal currency came into being. One of the things that we were hoping would come from this was that people would not have to know how we organised the museum. So with the previous MUSE databases, you had to know what we had put in the Applied Arts database as distinct from the History database. You had to know what entomology was and to know that's where you'd find the butterflies. So what we're wanting to do with the one collection philosophy is to start with a front door that can look at everything. With one search, you could find the skeleton of a tui, a bird. You could find the cloak that contained tui feathers. You could find the sixpence that has the tui on it. You could find the book in the library or the manuscript that refers to tui. And we're then removing the need for our audience to have to imagine how it is that we've organised things. And I remember one day, as we were working with Adam and the team through this, that we were just coming back from a meeting with our technical partners and I said, what it's actually like is that things exist out in the world in all sorts of complex ways. We bring them into a museum and then we compartmentalise them. We put them into this department or that department. We describe them in certain ways. What we needed to do was free them of those strictures that we'd put on them and return them to the world in their complexity and their joined-upness. So that's what the philosophy of the system was that we've built. Quite a lot of work though. This is our data map, our concept diagram. Not expecting you to read the words. The point is just to say it's a complex picture. 
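The triple idea and the 1967 example described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the subject, predicate and object names are invented for the example, and the grouping function is a toy stand-in for inference, not the reasoning that the museum's linked open data platform actually performs.

```python
from collections import defaultdict

# Each fact is a (subject, predicate, object) triple.
# All identifiers below are hypothetical, chosen only to mirror the talk's examples.
triples = [
    ("speaker", "born_in_year", "1967"),
    ("nz_decimal_currency", "introduced_in_year", "1967"),
    ("tui_skeleton", "relates_to", "tui"),
    ("cloak", "relates_to", "tui"),
    ("sixpence", "relates_to", "tui"),
    ("library_book", "relates_to", "tui"),
]

def shared_objects(facts):
    """Return each object that more than one subject points to,
    mapped to the sorted list of those subjects."""
    by_object = defaultdict(set)
    for subject, _predicate, obj in facts:
        by_object[obj].add(subject)
    return {obj: sorted(subs) for obj, subs in by_object.items() if len(subs) > 1}

if __name__ == "__main__":
    for obj, subjects in shared_objects(triples).items():
        print(obj, "links", subjects)
```

Nothing in the data says that the speaker and decimal currency are connected; the link falls out because both triples share the object 1967. In the same way, one search on "tui" reaches the skeleton, the cloak, the sixpence and the book without the searcher knowing which department holds each.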
So this is the diagram of all the different kinds of fields that we can have in our databases. The three systems are our library and documentary heritage system, our core objects and specimens database, which some of you will have heard about yesterday. All of these are being mushed into a new digital content platform. The source databases still perform their core functions as collection management systems and so forth behind the scenes, but for the public use, we needed to bring all of that data together and make sense and draw the connections between the different kinds of data. Adam's written me a whole lot of stuff about this, but I'm going to skip on. It is to an international standard, CIDOC. It's semantically rich, and the richness of all of these relationships produces a set of rules which lets us understand that birth date is different from establishment date, and that it's different from the date something was made. It's a work in progress. It'll never be finished. What happens when you... So this is the data that can exist for just one object, all of these possible little packets of information. When we put our nearly one million records that were ready into the system and set it to go, it created 67 million triples, different pieces of information that could be inferred about the network of our collection. Graphically, it looks a bit like this. So the diagram on the left is a picture of just 1,000 objects and all the different connections that they can have. The one nearest me is the connections between artists. So that's taking all of our Crown Lynn collections and making the links to all the artists who worked at Crown Lynn over their careers or over the life of that company and just linking up who worked with whom in a graphical form. But that's the technology. We've also got a building full of people who need to be populating this system and working with it and thinking about it. 
It was very important for us to bring the entire organisation along with us, particularly those who work closely with the collections. So we had various workshops, we had all-staff meetings along the way through the project to test our ideas, and also to get people familiar with the fact that we weren't going to be working in the way that we always had been. For a brief moment, we had a dedicated room, so when we had the prototype system ready, it was a drop-in centre for anybody in the organisation to come in and try it and play with it, break it, tell us their disappointments about what we hadn't achieved. And also workshops to think about some of the really gnarly issues. In order to be a truly open collection, we had to think through some of the things which are more sensitive about museum collections and sensitive about the way people have written about them and cared for them up till now. We had to ensure that the level of trust that the museum enjoys wouldn't be wrecked by us just blasting out a whole lot of rubbish information and think through what some of those stakeholders would consider to be important. The shift is quite substantial. If you think of curators, collection managers and librarians, they have treated this system as an in-house system in which they were managing their collections for over 20 years; that's all it had been. And suddenly, it's like the real estate agent's come along and we're doing an open home on Saturday. Suddenly it's like, well actually no, I don't want everybody looking through all my cupboards and drawers, and the kitchen's not clean, and there's quite a journey to go on to suddenly be forced to make stuff public. The other aspect of this was that there wasn't time to tidy the house before the crowds came in. Some of the concerns were: the records are not ready, they are not complete, they look messy, I'm going to be embarrassed amongst my colleagues if we release these. 
In some cases we know we've got inaccurate information in some fields, and there was concern that that might mislead people, particularly in the scientific sphere where information gets exported and then aggregated with other data sets. Of course there is some confidential data. Where's our tolerance around that? Issues of privacy. So these things all came up and we debated them in working groups. Because there's no right answer to quite a lot of these things, we tried to introduce the idea of a benefit harm continuum. So for each instance we tried to think what's the benefit of releasing the information? What's the potential harm of it? What's the benefit of keeping it closed? What's the potential harm of keeping it closed? So asking the two ends of the spectrum in both directions. We referred back to things like NZGOAL, which gives really good guidance about principles of openness, and also the principles of our future museum document, which is about engaging audiences with collections. Just as one quick example, the matter of acquisition price. These three items are currently on display at the museum as part of a small display marking the bicentenary of Marsden's first sermon. The little book, A Korao no New Zealand, is the first book printed in Māori, in 1815. There was only one copy left in existence. It was purchased by the museum in 1894 for £1 15 shillings. That is quite an interesting piece of information because that's about the same as a fortnight's rent on a house. So it tells you a little bit about how much we valued those things at that time. In the middle is the Hongi Hika mere, which was gifted to Mary Marsden, Samuel Marsden's daughter. The museum purchased that from the family, and it had come down through descent for all those years, but it was purchased for a private sum and it was funded by the museum's patron circle. The understanding is that that information is confidential. 
On this side, the Te Pahi Medal which was purchased last year in a joint bid between Te Papa and the Auckland Museum. That one is also a recent purchase so we were thinking maybe old acquisitions we can release the information and new ones we shouldn't but this one was bought at public auction. The information is already out there so why would we hide that and also public money was spent on it? Shouldn't we be accountable for that? So you can't have one single rule for all of these collection items. Then we got clever with inventing a filter so we had to think through because we can't possibly categorise and go through every field for three million objects to try and work out which one's open, which one's closed. We tried to think of a way to categorise for each of our 12 collecting areas just four categories of openness and what those four categories would include. So which fields would be included and which records would be included. You can see there, predominantly open means that we perceive that there's no problem if we're pretty okay with the information and then go down to things which are closed which we know are really problematic. What we tried to do was get most things in the predominantly open, predominantly closed category and then we of course discovered a fifth category which was restricted which is actually records that we never want out there. So for instance our records of human remains they are not for the public database and we don't believe they will ever be. So what we tried to do was get around the problem of every single record being just a black and white decision realising that there was actually grades of openness that we could have. So instead of one record, well is it on the database or is it not, we can say well most of them are but within that there's gradations of how open they are, how much information is released. And this actually helped hugely getting staff comfortable with the idea of releasing information. 
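The graded openness filter described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions: the talk only says that each collecting area was sorted into four categories (applied as A, B, C and D flags in the source databases) plus a fifth, restricted, category that is never published. The field names and the fields listed under each category here are invented for illustration.

```python
# Hypothetical field lists per openness flag; the real lists are not given in the talk.
VISIBLE_FIELDS = {
    "A": {"title", "description", "maker", "locality", "acquisition_price"},
    "B": {"title", "description", "maker", "locality"},
    "C": {"title", "description"},
    "D": {"title"},
}

RESTRICTED = "restricted"  # fifth category: never shown on the public site

def public_view(record):
    """Return only the fields a record's openness flag allows,
    or None if the record must never be published."""
    flag = record["openness"]
    if flag == RESTRICTED:
        return None
    allowed = VISIBLE_FIELDS[flag]
    return {name: value for name, value in record.items() if name in allowed}

if __name__ == "__main__":
    specimen = {
        "openness": "C",
        "title": "Endangered plant specimen",
        "description": "Pressed specimen",
        "locality": "withheld from the public view under flag C",
    }
    print(public_view(specimen))
```

The design point is the one made in the talk: the decision is not whether a whole record is on the database or not, but how many of its fields are released, so a single flag per group of records is enough to drive the public view.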
So we went through and applied these ABCDs in eight weeks to every single collection item in our collection and doing it by grouping things and then running programmes to just add a flag in the source databases. What we've ended up with is the red line where we've got a small number of records with quite a few fields visible and a larger number of records with a smaller number of fields available. What we want to do over time is move towards the green line where most records have most of their fields exposed but recognising of course there'll always be some which are kept confidential. Things like for botanical specimens of an endangered species, we don't want to display the location of those last few plants because that could cause difficulty if people suddenly decide they're going to go and souvenir them. And I've already mentioned things like acquisition price. So what does it end up with in terms of being able to be used? We built on top of this sort of the black box in the middle, our digital content platform, the linked open data graph that holds all of the 67 million triples. We built the new Collections Online website and this is both in the top section there. It's both a search interface and we tested varying levels of advanced search or not and different ways of faceting a layer below that. What's been pleasingly successful is our topic pages and we decided to put in 40 samples of different areas of the collection, utilising images, sort of like a mini encyclopedia page, I suppose, and initially, and I don't know if Gareth's in the room, he is. So Gareth helped manage this for us and to start with we were looking at 10 topic pages and I said, no, not enough. Cos what I thought would happen is we just did 10, it would be like the top 10 museum hits. What are the top 10 things that we're known for? And then I could see months of meetings arguing about which was in the 10 and which wasn't. 
So by making it a larger number, it actually got rid of that and it means that we've now had staff from all over the organisation submitting their little, I suppose like blog posts or their little pieces of research into all sorts of curious corners of the collection. So in fact there aren't many of our 40 topic pages which you would think of as the top 10. We've got the bottom 40 or something, not the bottom 40. But it's given a really rich, curious insight into the different kinds of things that the museum holds that people might not expect and it gives a little bit of a clue for people who don't, if they're just faced with a blank search box, well, I don't know, what should I search on? The other part of this digital content platform is that it is the reservoir and we have opened it up with an API and used it ourselves for the collect and connect table which some of you may have seen the presentation on earlier. And it's open for anybody else to come in and interrogate. That's all very well, but what next? What we've realised now is that we've built the vessel but we have to get the other two thirds of the collection catalogued. Just in the last couple of months, we have closed one of our or two of our conjoined gallery spaces, the Oceans and Coastal Galleries which between them had about 300 collection items in them and that's going to be turned into a collections hub for the next three years to employ additional staff, particularly to look at our Pacific collections but also to run a cataloging programme across all of the other backlog areas of the collection and a photographic studio staffed to really ramp up the quantity of images that we have available. So we're converting a part of the building which used to have a certain number of visitors through and a fairly small number of collection items into something that's going to feed the offsite and the online and my argument is that this is a much richer use of that piece of real estate. 
And with good grace, the curators in the marine area were really happy about that because when we did our calculations of the size of the backlog, the marine area has a 47-year backlog of uncatalogued material so they could see themselves as the beneficiaries of this project. I think I'll leave it there and I've got 57 seconds left, how's that? Yeah, happy to take any questions and if I don't know them, I'll just flick them on to Adam.
Hey David, so I was picking up on your point about my point about digital brands, and acknowledging your position as an encyclopedic museum, do you think this commitment to openness will flow through into your future collecting policies, that you will be acquiring objects on the basis of their ability to be open and that that may tilt your collecting decisions away from things that can't be?
I think it certainly will because already our acquisition agreements assume that it's going to go online, and we actually changed the wording of the acquisition agreement so that people opt out of having their name published, for example. We assume that the donor will have their name published and they can ask for it not to be. Previously, our form said please tick here if you would like your name published. It's just that subtle change, and so that's an example of yes, I think that philosophy will flow through.
So now that you've opened up the data, I was just wondering if anyone's using it. Have you had any good examples yet?
We haven't got lots of examples yet. It was only the end of June that this was launched. Adam did tell me how many people he's given the API keys to and I can't recall how many it is. It's a handful. So yep, there's one there. So I think it's early days and we're really looking forward to what people will do with the data. Ah, it's the third most popular section on the web page. After centre half perhaps. 
So it was really good to see how the decision-making process you went through about making it open and not making it open kind of touched on stuff I was talking about yesterday about being well managed in that. Are you thinking of maybe publishing your policies and stuff under a Creative Commons license as open policies so that other organisations can learn from you? There's a lot that Te Papa and Auckland Museum and National Library and Archives can do to help the smaller museums to make those decisions themselves.
Yep, I think we can certainly do that. What we've got at the moment are those policies written in a quite internally focused way. But I think our focus has been on getting this thing built and getting it out the door. Now we can reflect a little bit, and in fact even just for Adam and me, this presentation has helped us think through what are the key things that we feel we've achieved, and you're absolutely right, we should publish those and make them shareable.
In some ways this is a bit of a rich question from me. It can be quite hard for research purposes, for example, to work with an API, so those visualisations that you displayed before require almost a huge chunk of the data set, which is very difficult to get at through a series of API calls. Would there be any consideration for certain purposes to release the data as a bulk download?
So we have got a download button on the site, so both individual records can be downloaded and also search results can be downloaded, and we initially put a limit of around 200 items on that, but we've just opened that right up, because initially we were worried about what server traffic that might cause. In fact we've decided to go the other way. We'll apply our own policy: be open, see what happens and then chunk it back if we have to. So in fact now you can search for the entire database and download it as a API but only certain fields. We've chunked it that way but not that way. 
We'll know who's done it. Big round of applause for David. I think a lot of people are watching you.